PredictionErrorDisplay#
- class skore.PredictionErrorDisplay(*, prediction_error, range_y_true, range_y_pred, range_residuals, data_source, ml_task, report_type)[source]#
Visualization of the prediction error of a regression model.
This tool can display “residuals vs predicted” or “actual vs predicted” using scatter plots to qualitatively assess the behavior of a regressor, preferably on held-out data points.
An instance of this class should be created by
EstimatorReport.metrics.prediction_error(). You should not create an instance of this class directly.
- Parameters:
- prediction_error : DataFrame
The prediction error data to display. The columns are
estimator, split (may be null), y_true, y_pred and residuals.
- range_y_true : RangeData
Global range of the true values.
- range_y_pred : RangeData
Global range of the predicted values.
- range_residuals : RangeData
Global range of the residuals.
- data_source : {“train”, “test”, “X_y”, “both”}
The data source used to display the prediction error.
- ml_task : {“regression”, “multioutput-regression”}
The machine learning task.
- report_type : {“comparison-cross-validation”, “comparison-estimator”, “cross-validation”, “estimator”}
The type of report.
- Attributes:
- facet_ : seaborn FacetGrid
FacetGrid containing the prediction error.
- figure_ : matplotlib Figure
The figure on which the prediction error is plotted.
- ax_ : matplotlib Axes
The axes on which the prediction error is plotted.
Examples
>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import Ridge
>>> from skore import train_test_split
>>> from skore import EstimatorReport
>>> X, y = load_diabetes(return_X_y=True)
>>> split_data = train_test_split(X=X, y=y, random_state=0, as_dict=True)
>>> regressor = Ridge()
>>> report = EstimatorReport(regressor, **split_data)
>>> display = report.metrics.prediction_error()
>>> display.plot(kind="actual_vs_predicted")
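The attributes listed above expose the underlying seaborn and matplotlib objects, so the rendered plot can be tweaked or saved after plotting. A minimal sketch (the file name is illustrative):

>>> # figure_ is a plain matplotlib Figure, so any Figure method applies,
>>> # e.g. persisting the plot to disk (hypothetical path):
>>> display.figure_.savefig("prediction_error.png")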
- frame()[source]#
Get the data used to create the prediction error plot.
- Returns:
- DataFrame
A DataFrame containing the prediction error data with columns depending on the report type:
- estimator: Name of the estimator (when comparing estimators)
- split: Cross-validation split ID (when doing cross-validation)
- y_true: True target values
- y_pred: Predicted target values
- residuals: Difference between true and predicted values (y_true - y_pred)
Examples
>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import Ridge
>>> from skore import train_test_split, EstimatorReport
>>> X, y = load_diabetes(return_X_y=True)
>>> split_data = train_test_split(X=X, y=y, random_state=0, as_dict=True)
>>> reg = Ridge()
>>> report = EstimatorReport(reg, **split_data)
>>> display = report.metrics.prediction_error()
>>> df = display.frame()
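Since frame() returns a plain pandas DataFrame, standard pandas operations apply for ad-hoc inspection. A minimal sketch using the documented residuals column:

>>> # Largest absolute residual on the test set; no output expected here.
>>> worst_error = df["residuals"].abs().max()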
- plot(*, subplot_by='auto', kind='residual_vs_predicted', despine=True)[source]#
Plot visualization.
Extra keyword arguments will be passed to matplotlib’s
plot.
- Parameters:
- subplot_by : {“auto”, “data_source”, “split”, “estimator”, None}, default=”auto”
The variable to use for creating subplots:
- “auto” creates subplots by estimator for comparison reports, otherwise uses a single plot.
- “data_source” creates subplots by data source (train/test).
- “split” creates subplots by cross-validation split.
- “estimator” creates subplots by estimator.
- None creates a single plot.
- kind : {“actual_vs_predicted”, “residual_vs_predicted”}, default=”residual_vs_predicted”
The type of plot to draw:
- “actual_vs_predicted” draws the observed values (y-axis) vs. the predicted values (x-axis).
- “residual_vs_predicted” draws the residuals, i.e. the difference between observed and predicted values (y-axis), vs. the predicted values (x-axis).
- despine : bool, default=True
Whether to remove the top and right spines from the plot.
Examples
>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import Ridge
>>> from skore import train_test_split
>>> from skore import EstimatorReport
>>> X, y = load_diabetes(return_X_y=True)
>>> split_data = train_test_split(X=X, y=y, random_state=0, as_dict=True)
>>> regressor = Ridge()
>>> report = EstimatorReport(regressor, **split_data)
>>> display = report.metrics.prediction_error()
>>> display.plot(kind="actual_vs_predicted")
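Building on the report above, a minimal sketch of the remaining options, restricted to the documented values:

>>> # Force a single panel and keep all four spines;
>>> # residual_vs_predicted is the default kind.
>>> display.plot(kind="residual_vs_predicted", subplot_by=None, despine=False)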
- set_style(*, policy='update', relplot_kwargs=None, perfect_model_kwargs=None)[source]#
Set the style parameters for the display.
- Parameters:
- policy : {“override”, “update”}, default=”update”
Policy to use when setting the style parameters. If “override”, existing settings are set to the provided values. If “update”, existing settings are not changed; only settings that were previously unset are changed.
- relplot_kwargs : dict, default=None
Additional keyword arguments to be passed to
seaborn.relplot() for rendering the scatter plot(s). Common options include palette, alpha, s, marker, etc.
- perfect_model_kwargs : dict, default=None
Additional keyword arguments to be passed to
matplotlib.pyplot.plot() for drawing the perfect prediction line. Common options include color, alpha, linestyle, etc.
- Returns:
- self : object
The instance with a modified style.
- Raises:
- ValueError
If a style parameter is unknown.
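A minimal sketch of adjusting the style before plotting, using only keyword names documented above; since set_style() returns the display itself, the call can be chained:

>>> # Tweak the scatter rendering and the perfect-prediction line, then re-plot.
>>> display.set_style(
...     relplot_kwargs={"palette": "muted", "alpha": 0.5},
...     perfect_model_kwargs={"color": "black", "linestyle": "--"},
... ).plot()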
- static style_plot(plot_func)[source]#
Apply consistent style to skore displays.
This decorator:

1. Applies default style settings
2. Executes plot_func
3. Calls plt.tight_layout() to make sure axes do not overlap
4. Restores the original style settings

- Parameters:
- plot_func : callable
The plot function to be decorated.
- Returns:
- callable
The decorated plot function.
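style_plot is applied internally to the display's plotting methods; the sketch below is purely illustrative of the mechanics, and the wrapped helper is hypothetical, not part of the API:

>>> # Hypothetical helper: the decorator applies skore's default style,
>>> # runs the function, calls plt.tight_layout(), then restores the
>>> # original settings.
>>> @PredictionErrorDisplay.style_plot
... def custom_plot(display):
...     display.plot(kind="actual_vs_predicted")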