feature_learning_curve
mastml.learning_curve.feature_learning_curve(X, y, estimator, cv, scoring, selector_name, savepath, n_features_to_select=None, Xgroups=None)
Method that calculates the data used to plot a feature learning curve, e.g. the RMSE of a cross-validation routine using a specified model and a given number of features.
- Args:
X: (numpy array), array of X data values
y: (numpy array), array of y data values
estimator: (scikit-learn model object), a scikit-learn model used for fitting
cv: (scikit-learn cross validation object), a scikit-learn cross validation object to construct train/test splits
scoring: (scikit-learn metric object), a scikit-learn metric to use as a scorer
selector_name: (str), name of a scikit-learn or MAST-ML feature selection routine
n_features_to_select: (int), total number of features to select, i.e. stopping criterion for number of features
Xgroups: (list), list of row indices corresponding to each group
- Returns:
train_sizes: (numpy array), array of fractions of training data used in learning curve
train_mean: (numpy array), array of means of training data scores for each number of features
test_mean: (numpy array), array of means of testing data scores for each number of features
train_stdev: (numpy array), array of standard deviations of training data scores for each number of features
test_stdev: (numpy array), array of standard deviations of testing data scores for each number of features
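
To illustrate what a feature learning curve measures, the following is a minimal sketch built only on scikit-learn. It does not call MAST-ML and may differ from the library's internals. The helper `sketch_feature_learning_curve` and the `rmse` function are hypothetical names introduced here, `SelectKBest` with `f_regression` is an assumed stand-in for whatever routine `selector_name` would pick, and the data are synthetic. As a further assumption, the first returned array holds the number of features evaluated at each step.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, mean_squared_error
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline


def rmse(y_true, y_pred):
    """Root mean squared error (lower is better)."""
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))


def sketch_feature_learning_curve(X, y, estimator, cv, scorer, n_features_to_select):
    """Hypothetical helper, not part of MAST-ML: score the model for 1..n features."""
    n_features = np.arange(1, n_features_to_select + 1)
    train_mean, test_mean, train_stdev, test_stdev = [], [], [], []
    for k in n_features:
        # Select the k best features inside each training fold, then fit the model.
        model = make_pipeline(SelectKBest(f_regression, k=int(k)), estimator)
        scores = cross_validate(model, X, y, cv=cv, scoring=scorer,
                                return_train_score=True)
        train_mean.append(scores["train_score"].mean())
        test_mean.append(scores["test_score"].mean())
        train_stdev.append(scores["train_score"].std())
        test_stdev.append(scores["test_score"].std())
    return (n_features, np.array(train_mean), np.array(test_mean),
            np.array(train_stdev), np.array(test_stdev))


# Synthetic regression data, used only for illustration.
X, y = make_regression(n_samples=120, n_features=8, n_informative=4,
                       noise=5.0, random_state=0)
cv = KFold(n_splits=5, shuffle=True, random_state=0)
# greater_is_better defaults to True, so the scores are plain RMSE values.
scorer = make_scorer(rmse)

n_features, train_mean, test_mean, train_stdev, test_stdev = (
    sketch_feature_learning_curve(X, y, LinearRegression(), cv, scorer,
                                  n_features_to_select=5)
)
for k, tr, te in zip(n_features, train_mean, test_mean):
    print(f"{k} features: train RMSE {tr:.2f}, test RMSE {te:.2f}")
```

Placing the selector inside the pipeline means the features are re-selected within each training fold, which keeps the test-fold scores free of selection leakage. Plotting `test_mean` against `n_features`, with `test_stdev` as error bars, gives the curve itself.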