Error

class mastml.plots.Error

Bases: object

Class to make plots related to model error assessment and uncertainty quantification

Args: None

Methods:

plot_cdf: Method for plotting the cumulative distribution function of the r-statistic (also called Z-score)

Args:
savepath: (str), string denoting the save path to save the figure to

data_type: (str), string denoting the data type, e.g. train, test, leftout

residuals: (pd.Series), series containing the true errors (model residuals)

model_errors: (pd.Series), series containing the predicted model errors
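As a rough illustration of the quantity behind this plot, the sketch below computes the r-statistic (residual divided by predicted model error) with NumPy and builds its empirical CDF. It does not call `plot_cdf`. The helper names (`r_statistic`, `empirical_cdf`, `standard_normal_cdf`) and the synthetic data are assumptions made for this example. Comparing against the standard normal CDF is a common calibration check, not a confirmed detail of what this method draws.

```python
import math
import numpy as np

def r_statistic(residuals, model_errors):
    """Return residuals divided element-wise by predicted model errors."""
    residuals = np.asarray(residuals, dtype=float)
    model_errors = np.asarray(model_errors, dtype=float)
    return residuals / model_errors

def empirical_cdf(values):
    """Return sorted values and the fraction of points at or below each one."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y

def standard_normal_cdf(x):
    """Standard normal CDF evaluated with math.erf (hypothetical helper)."""
    return np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in x])

# Synthetic, illustrative data: each residual is drawn with a spread equal
# to its predicted error, so the r-statistic should be close to N(0, 1).
rng = np.random.default_rng(0)
model_errors = rng.uniform(0.5, 2.0, size=1000)
residuals = rng.normal(0.0, model_errors)

x, y = empirical_cdf(r_statistic(residuals, model_errors))
# Largest vertical gap between the empirical and standard normal CDFs
max_gap = np.max(np.abs(y - standard_normal_cdf(x)))
print(f"max CDF gap: {max_gap:.3f}")
```

A small gap suggests the predicted errors are consistent with the observed residuals; a large gap suggests they are over- or underestimated.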

plot_rstat: Method for plotting the r-statistic distribution (the true error divided by the predicted error)

Args:
savepath: (str), string denoting the save path to save the figure to

data_type: (str), string denoting the data type, e.g. train, test, leftout

residuals: (pd.Series), series containing the true errors (model residuals)

model_errors: (pd.Series), series containing the predicted model errors

show_figure: (bool), whether or not the generated figure is output to the notebook screen (default False)

is_calibrated: (bool), whether or not the model errors have been recalibrated (default False)

name_str: (str), extra string to append to saved figure name. Useful for distinguishing figures if running as post-processing.

Returns:
None

plot_rstat_uncal_cal_overlay: Method for plotting the r-statistic distribution for two cases together: the as-obtained uncalibrated model errors and calibrated errors

Args:
savepath: (str), string denoting the save path to save the figure to

data_type: (str), string denoting the data type, e.g. train, test, leftout

residuals: (pd.Series), series containing the true errors (model residuals)

model_errors: (pd.Series), series containing the predicted model errors

model_errors_cal: (pd.Series), series containing the calibrated predicted model errors

show_figure: (bool), whether or not the generated figure is output to the notebook screen (default False)

name_str: (str), extra string to append to saved figure name. Useful for distinguishing figures if running as post-processing.

Returns:
None
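To show why overlaying the two cases is informative, the following sketch compares the r-statistic computed from uncalibrated and calibrated error estimates. It does not call `plot_rstat_uncal_cal_overlay`. The data are synthetic, and the factor-of-two rescaling used as a "calibration" is a hypothetical stand-in chosen for this example, not the calibration procedure used by the library.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic ground truth: each residual has its own true spread.
true_sigma = rng.uniform(0.5, 2.0, size=2000)
residuals = rng.normal(0.0, true_sigma)

# Hypothetical uncalibrated estimates that are too small by a factor of 2.
model_errors = 0.5 * true_sigma
# Hypothetical calibration: a plain rescaling (illustrative assumption only).
model_errors_cal = 2.0 * model_errors

r_uncal = residuals / model_errors
r_cal = residuals / model_errors_cal

# Histogram both distributions on shared bins, as an overlay plot would.
bins = np.linspace(-6.0, 6.0, 41)
hist_uncal, _ = np.histogram(r_uncal, bins=bins, density=True)
hist_cal, _ = np.histogram(r_cal, bins=bins, density=True)

std_uncal = np.std(r_uncal)
std_cal = np.std(r_cal)
print(f"std uncalibrated: {std_uncal:.2f}, std calibrated: {std_cal:.2f}")
```

In this setup the uncalibrated r-statistic is roughly twice as wide as the calibrated one, whose standard deviation sits near 1.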

plot_real_vs_predicted_error: Sometimes called the RvE plot, or residual vs. error plot, this method plots the binned RMS residuals as a function of the binned model errors

Args:
savepath: (str), string denoting the save path to save the figure to

data_type: (str), string denoting the data type, e.g. train, test, leftout

model_errors: (pd.Series), series containing the predicted model errors

residuals: (pd.Series), series containing the true errors (model residuals)

dataset_stdev: (float), the standard deviation of the training dataset

show_figure: (bool), whether or not the generated figure is output to the notebook screen (default False)

is_calibrated: (bool), whether or not the model errors have been recalibrated (default False)

well_sampled_number: (int), threshold number of points a bin must contain to qualify as well-sampled (default 30). Only affects visuals, not fitting

number_of_bins: (int), number of bins to use in plotting (default 15)

equal_sized_bins: (bool), whether to bin values such that an equal number of points reside in each bin (default False)

name_str: (str), extra string to append to saved figure name. Useful for distinguishing figures if running as post-processing.

Returns:
None
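The binning behind an RvE plot can be sketched as follows: group points by predicted model error, then take the RMS of the residuals in each group. This is a minimal NumPy illustration with equal-width bins. The function name `binned_rve` and the synthetic data are assumptions for this example, and the library's internal binning and fitting may differ.

```python
import numpy as np

def binned_rve(model_errors, residuals, number_of_bins=15):
    """Bin points by predicted error; return per-bin mean predicted error,
    RMS residual, and point count. Empty bins are skipped."""
    model_errors = np.asarray(model_errors, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    edges = np.linspace(model_errors.min(), model_errors.max(), number_of_bins + 1)
    # digitize returns 1-based bin indices; shift to 0-based and clip so the
    # maximum value falls into the last bin instead of an overflow bin.
    idx = np.clip(np.digitize(model_errors, edges) - 1, 0, number_of_bins - 1)
    centers, rms, counts = [], [], []
    for b in range(number_of_bins):
        mask = idx == b
        if not mask.any():
            continue
        centers.append(model_errors[mask].mean())
        rms.append(np.sqrt(np.mean(residuals[mask] ** 2)))
        counts.append(int(mask.sum()))
    return np.array(centers), np.array(rms), np.array(counts)

# Synthetic, illustrative data with well-calibrated predicted errors.
rng = np.random.default_rng(2)
model_errors = rng.uniform(0.1, 1.0, size=5000)
residuals = rng.normal(0.0, model_errors)

centers, rms, counts = binned_rve(model_errors, residuals, number_of_bins=15)
# Slope and intercept of a straight-line fit through the binned points
slope, intercept = np.polyfit(centers, rms, 1)
print(f"slope: {slope:.2f}, intercept: {intercept:.2f}")
```

For well-calibrated predicted errors the binned points lie close to the identity line, so the fitted slope is near 1 and the intercept near 0.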

plot_real_vs_predicted_error_uncal_cal_overlay: Method for making the residual vs. error plot for two cases together: using the as-obtained uncalibrated model errors and calibrated errors

Args:
savepath: (str), string denoting the save path to save the figure to

data_type: (str), string denoting the data type, e.g. train, test, leftout

model_errors: (pd.Series), series containing the predicted model errors

model_errors_cal: (pd.Series), series containing the calibrated predicted model errors

residuals: (pd.Series), series containing the true errors (model residuals)

dataset_stdev: (float), the standard deviation of the training dataset

show_figure: (bool), whether or not the generated figure is output to the notebook screen (default False)

well_sampled_number: (int), threshold number of points a bin must contain to qualify as well-sampled (default 30). Only affects visuals, not fitting

number_of_bins: (int), number of bins to use in plotting (default 15)

equal_sized_bins: (bool), whether to bin values such that an equal number of points reside in each bin (default False)

name_str: (str), extra string to append to saved figure name. Useful for distinguishing figures if running as post-processing.

Returns:
None
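The effect of `equal_sized_bins` can be illustrated by comparing equal-width bin edges with quantile-based edges that place roughly the same number of points in each bin. The helper `bin_edges` below is hypothetical and written only for this example. It assumes quantile edges as one way to obtain equal-count bins, which may not match the library's implementation.

```python
import numpy as np

def bin_edges(values, number_of_bins=15, equal_sized_bins=False):
    """Return bin edges: quantile-based (equal counts) or equal-width."""
    values = np.asarray(values, dtype=float)
    if equal_sized_bins:
        # Edges at evenly spaced quantiles give about the same count per bin.
        return np.quantile(values, np.linspace(0.0, 1.0, number_of_bins + 1))
    return np.linspace(values.min(), values.max(), number_of_bins + 1)

# Synthetic, skewed predicted errors (illustrative only).
rng = np.random.default_rng(3)
model_errors = rng.lognormal(mean=0.0, sigma=0.75, size=3000)

width_counts, _ = np.histogram(model_errors, bins=bin_edges(model_errors))
count_counts, _ = np.histogram(
    model_errors, bins=bin_edges(model_errors, equal_sized_bins=True)
)
print("equal-width counts:", width_counts)
print("equal-count counts:", count_counts)
```

With skewed error estimates, equal-width bins leave the high-error bins sparsely populated, while equal-count bins keep every bin similarly sampled at the cost of uneven bin widths.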

Methods Summary

`plot_real_vs_predicted_error`(savepath, ...)
`plot_real_vs_predicted_error_uncal_cal_overlay`(...)
`plot_rstat`(savepath, data_type, residuals, ...)
`plot_rstat_uncal_cal_overlay`(savepath, ...)

Methods Documentation

classmethod plot_real_vs_predicted_error(savepath, data_type, model_errors, residuals, dataset_stdev, show_figure=False, is_calibrated=False, well_sampled_number=30, image_dpi=250, number_of_bins=15, equal_sized_bins=False, name_str=None)

classmethod plot_real_vs_predicted_error_uncal_cal_overlay(savepath, data_type, model_errors, model_errors_cal, residuals, dataset_stdev, show_figure=False, well_sampled_number=30, image_dpi=250, number_of_bins=15, equal_sized_bins=False, name_str=None)

classmethod plot_rstat(savepath, data_type, residuals, model_errors, show_figure=False, is_calibrated=False, image_dpi=250, name_str=None)

classmethod plot_rstat_uncal_cal_overlay(savepath, data_type, residuals, model_errors, model_errors_cal, show_figure=False, image_dpi=250, name_str=None)