EnsembleModelFeatureSelector

class mastml.feature_selectors.EnsembleModelFeatureSelector(model, n_features_to_select, n_random_dummy=0, n_permuted_dummy=0)[source]

Bases: mastml.feature_selectors.BaseSelector

Custom MAST-ML class that selects features using the feature importances of an ensemble model

Args:

model: (mastml.models object), a MAST-ML compatible model

n_features_to_select: (int), the number of features to select

n_random_dummy: (int), the number of random dummy variables to use. Defaults to 0 (no random dummy variables)

n_permuted_dummy: (int), the number of permuted dummy variables to use. Defaults to 0 (no permuted dummy variables)

Methods:

    fit: performs feature selection
        Args:
            X: (dataframe), dataframe of X features
            y: (dataframe), dataframe of y data
        Returns:
            None

    transform: performs the transform to generate output of only selected features
        Args:
            X: (dataframe), dataframe of X features
        Returns:
            dataframe: (dataframe), dataframe of selected X features

    create_dummy_variable: inserts n_dummy_variable dummy variables with the same mean and standard deviation as the whole dataframe
        Args:
            X: (dataframe), dataframe of X features
        Returns:
            X: (dataframe), dataframe that includes the dummy variables and is scaled with a standard scaler

    check_dummy_ranking: if dummy variables are used, prints a warning when the number of features selected is not optimal (that is, some of the selected features rank below a dummy variable)
        Args:
            feature_importances_sorted: list of features sorted based on their importances
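
The dummy-ranking check described above can be sketched in plain Python. This is only an illustration of the idea, not the MAST-ML implementation: the standalone function, its n_features_to_select and dummy_prefix arguments, the (name, importance) pair layout, and the "dummy" naming convention are all assumptions made for this example.

```python
import warnings


def check_dummy_ranking(feature_importances_sorted, n_features_to_select,
                        dummy_prefix="dummy"):
    """Warn if a dummy feature lands inside the top n_features_to_select.

    Assumptions (not taken from MAST-ML): the input is a list of
    (feature_name, importance) pairs sorted from most to least important,
    and dummy features are recognizable by a name prefix.
    """
    top = feature_importances_sorted[:n_features_to_select]
    dummies_in_top = [name for name, _ in top if name.startswith(dummy_prefix)]
    if dummies_in_top:
        warnings.warn(
            f"{len(dummies_in_top)} dummy feature(s) rank within the top "
            f"{n_features_to_select}: {dummies_in_top}. Consider selecting "
            "fewer features."
        )
    return dummies_in_top


# Hypothetical ranking, already sorted by descending importance.
ranked = [
    ("atomic_radius", 0.41),
    ("dummy_random_0", 0.22),
    ("electronegativity", 0.19),
    ("melting_point", 0.11),
    ("dummy_permuted_0", 0.07),
]

flagged_top3 = check_dummy_ranking(ranked, 3)  # a dummy sits at rank 2, so this warns
flagged_top1 = check_dummy_ranking(ranked, 1)  # only a real feature is selected
```

A dummy variable carries no real signal, so any feature ranked at or below it is unlikely to be informative. That is why a dummy inside the selected set suggests n_features_to_select is too large.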

Methods Summary

check_dummy_ranking(feature_importances_sorted)
create_dummy_variable(X)
fit(X, y)
transform(X)

Methods Documentation

check_dummy_ranking(feature_importances_sorted)[source]
create_dummy_variable(X)[source]
fit(X, y)[source]
transform(X)[source]
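
As a rough illustration of what create_dummy_variable is described as doing, the sketch below adds random and permuted dummy columns to a small table using only the standard library. It is a sketch under stated assumptions, not the MAST-ML code: the table is a plain dict of columns rather than a dataframe, the helper name make_dummy_columns and the seed argument are hypothetical, a permuted dummy is assumed to be a shuffled copy of an existing column, and the standard-scaler step is omitted.

```python
import random
import statistics


def make_dummy_columns(columns, n_random_dummy=0, n_permuted_dummy=0, seed=0):
    """Return a copy of `columns` with dummy columns appended.

    columns: dict mapping feature name -> list of floats (equal lengths).
    Random dummies are drawn from a normal distribution with the mean and
    standard deviation of all values in the table. Permuted dummies are
    shuffled copies of existing columns (an assumption for this sketch).
    """
    rng = random.Random(seed)
    all_values = [v for col in columns.values() for v in col]
    mean = statistics.mean(all_values)
    std = statistics.pstdev(all_values)
    n_rows = len(next(iter(columns.values())))

    out = {name: list(col) for name, col in columns.items()}
    for i in range(n_random_dummy):
        out[f"dummy_random_{i}"] = [rng.gauss(mean, std) for _ in range(n_rows)]

    names = list(columns)
    for i in range(n_permuted_dummy):
        shuffled = list(columns[names[i % len(names)]])
        rng.shuffle(shuffled)  # breaks any relationship between rows and y
        out[f"dummy_permuted_{i}"] = shuffled
    return out


# Hypothetical two-feature table with four rows.
X = {"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]}
X_with_dummies = make_dummy_columns(X, n_random_dummy=1, n_permuted_dummy=1)
```

The original columns are copied rather than modified, so X can still be passed unchanged to other steps.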