LearningCurve
- class mastml.learning_curve.LearningCurve
Bases: object
This class constructs two kinds of learning curves: model performance vs. amount of training data, and model performance vs. number of features used in the fit.
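To make the first kind of curve concrete, here is a dependency-free sketch of what a data learning curve measures: the cross-validated score of a model as it is trained on growing fractions of the training folds. This is not MAST-ML's implementation. The function names (`fit_line`, `rmse`, `data_learning_curve_sketch`), the one-feature least-squares model, and the synthetic data are all assumptions chosen to keep the example short.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on one feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx


def rmse(ys, preds):
    """Root mean squared error between targets and predictions."""
    return (sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)) ** 0.5


def data_learning_curve_sketch(xs, ys, train_sizes, n_splits=5, seed=0):
    """Return {train fraction: validation RMSE averaged over the folds}."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_splits] for i in range(n_splits)]
    curve = {}
    for frac in train_sizes:
        scores = []
        for k, val in enumerate(folds):
            train = [i for j, f in enumerate(folds) if j != k for i in f]
            # The indices are already shuffled, so a prefix is a random subset.
            used = train[: max(2, int(frac * len(train)))]
            a, b = fit_line([xs[i] for i in used], [ys[i] for i in used])
            preds = [a * xs[i] + b for i in val]
            scores.append(rmse([ys[i] for i in val], preds))
        curve[frac] = mean(scores)
    return curve


# Synthetic data (hypothetical): a noisy straight line.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3.0 * x + 1.0 + rng.gauss(0, 1.0) for x in xs]

data_curve = data_learning_curve_sketch(xs, ys, train_sizes=[0.1, 0.5, 1.0])
for frac, score in data_curve.items():
    print(f"train fraction {frac:.1f}: mean validation RMSE {score:.3f}")
```

The keys of the returned dictionary play the role of the `train_sizes` fractions described in the arguments below, and the fold loop stands in for the `cv` object.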
- Args:
None
- Methods:
- evaluate: Sets up a save directory and runs both the data-based and feature-based learning curves
- Args:
model: (SklearnModel or EnsembleModel), a model made in MAST-ML
X: (pd.DataFrame), dataframe containing the X feature matrix
y: (pd.Series), series containing the target y data
savepath: (str), string denoting the savepath to save the learning curve output
groups: (pd.Series), series of group designation
train_sizes: (list or np.array), list or array of floats denoting fractions of training data to evaluate for data learning curve
cv: (scikit-learn cross-validation object), a scikit-learn cross-validation object
scoring: (str), string denoting name of regression metric to evaluate learning curves. See mastml.metrics.Metrics._metric_zoo for full list
selector: (mastml.feature_selectors), a mastml.feature_selectors instance
make_plot: (bool), whether or not to make the learning curve plots
- data_learning_curve: Method that calculates the model CV score as a function of amount of training data used
- Args:
model: (SklearnModel or EnsembleModel), a model made in MAST-ML
X: (pd.DataFrame), dataframe containing the X feature matrix
y: (pd.Series), series containing the target y data
savepath: (str), string denoting the savepath to save the learning curve output
groups: (pd.Series), series of group designation
train_sizes: (list or np.array), list or array of floats denoting fractions of training data to evaluate for data learning curve
cv: (scikit-learn cross-validation object), a scikit-learn cross-validation object
scoring: (str), string denoting name of regression metric to evaluate learning curves. See mastml.metrics.Metrics._metric_zoo for full list
make_plot: (bool), whether or not to make the learning curve plots
- Returns:
None
- feature_learning_curve: Method that calculates the model CV score as a function of the number of features used
- Args:
model: (SklearnModel or EnsembleModel), a model made in MAST-ML
X: (pd.DataFrame), dataframe containing the X feature matrix
y: (pd.Series), series containing the target y data
savepath: (str), string denoting the savepath to save the learning curve output
groups: (pd.Series), series of group designation
cv: (scikit-learn cross-validation object), a scikit-learn cross-validation object
scoring: (str), string denoting name of regression metric to evaluate learning curves. See mastml.metrics.Metrics._metric_zoo for full list
selector: (mastml.feature_selectors), a mastml.feature_selectors instance
make_plot: (bool), whether or not to make the learning curve plots
- Returns:
None
- _setup_savedir: Method to create the output save directory for learning curve data
- Args:
savepath: (str), string denoting the base path to save the output to
- Returns:
splitdir: (str), path where the learning curve data will be saved
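The feature-based curve described under feature_learning_curve can be pictured with a similar dependency-free sketch: rank the features, then record the cross-validated score as the top 1, 2, ..., n features are used. This is an illustration under stated assumptions, not MAST-ML's code. The correlation-based `rank_features` is a stand-in for a `mastml.feature_selectors` instance, the 3-nearest-neighbor predictor is a stand-in for the model, and every function name and the synthetic data are hypothetical.

```python
import random
from statistics import mean


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va and vb else 0.0


def rank_features(X, y):
    """Stand-in selector: column indices ordered by |correlation| with y."""
    n_feat = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_feat)]
    return sorted(range(n_feat), key=lambda j: scores[j], reverse=True)


def knn_predict(X_train, y_train, row, cols, k=3):
    """Mean target of the k nearest training rows, using only columns `cols`."""
    dists = sorted(
        (sum((tr[j] - row[j]) ** 2 for j in cols), yt)
        for tr, yt in zip(X_train, y_train)
    )
    return mean([yt for _, yt in dists[:k]])


def feature_learning_curve_sketch(X, y, n_splits=5, seed=0):
    """Return {number of features: validation RMSE averaged over the folds}."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_splits] for i in range(n_splits)]
    n_feat = len(X[0])
    fold_scores = {n: [] for n in range(1, n_feat + 1)}
    for k, val in enumerate(folds):
        train = [i for j, f in enumerate(folds) if j != k for i in f]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        # Rank on the training fold only, so validation rows do not leak in.
        order = rank_features(X_tr, y_tr)
        for n in range(1, n_feat + 1):
            cols = order[:n]
            preds = [knn_predict(X_tr, y_tr, X[i], cols) for i in val]
            err = (sum((y[i] - p) ** 2 for i, p in zip(val, preds)) / len(val)) ** 0.5
            fold_scores[n].append(err)
    return {n: mean(s) for n, s in fold_scores.items()}


# Synthetic data (hypothetical): only columns 0 and 2 carry signal.
rng = random.Random(7)
X = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
y = [2.0 * r[0] + r[2] + rng.gauss(0, 0.1) for r in X]

feature_curve = feature_learning_curve_sketch(X, y)
for n, score in feature_curve.items():
    print(f"{n} feature(s): mean validation RMSE {score:.3f}")
```

Ranking inside each training fold is a deliberate choice in this sketch: ranking once on the full dataset would let validation rows influence which features are kept.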
Methods Summary
- data_learning_curve(model, X, y[, savepath, ...])
- evaluate(model, X, y[, savepath, groups, ...])
- feature_learning_curve(model, X, y[, ...])
Methods Documentation
- data_learning_curve(model, X, y, savepath=None, groups=None, train_sizes=None, cv=None, scoring=None, make_plot=True)