EnsembleModelFeatureSelector

class mastml.feature_selectors.EnsembleModelFeatureSelector(model, n_features_to_select, n_random_dummy=0, n_permuted_dummy=0)

Bases: BaseSelector

Custom MAST-ML class that selects features based on the feature importances of an ensemble model

Args:

model: (mastml.models object), a MAST-ML compatible model

n_features_to_select: (int), the number of features to select

n_random_dummy: (int), the number of random dummy variables to use. Default is 0 (no random dummy variables)

n_permuted_dummy: (int), the number of permuted dummy variables to use. Default is 0 (no permuted dummy variables)

Methods:
fit: performs feature selection
Args:

X: (dataframe), dataframe of X features

y: (dataframe), dataframe of y data

Returns:

None

transform: performs the transform to generate output of only selected features
Args:

X: (dataframe), dataframe of X features

Returns:

dataframe: (dataframe), dataframe of selected X features
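
As a rough illustration of the fit and transform steps described above, the sketch below ranks columns by the feature importances of a fitted ensemble model and keeps the top n. It is not the MAST-ML implementation: the class name TopImportanceSelector, the scikit-learn RandomForestRegressor used as the model, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


class TopImportanceSelector:
    """Hypothetical stand-in, not the MAST-ML class: keeps the n most important features."""

    def __init__(self, model, n_features_to_select):
        self.model = model
        self.n_features_to_select = n_features_to_select
        self.selected_features = []

    def fit(self, X, y):
        # Fit the ensemble model and rank columns by its feature importances.
        self.model.fit(X, np.ravel(y))
        importances = pd.Series(self.model.feature_importances_, index=X.columns)
        ranked = importances.sort_values(ascending=False)
        self.selected_features = list(ranked.index[: self.n_features_to_select])
        return None  # mirrors the documented "Returns: None"

    def transform(self, X):
        # Return only the columns chosen during fit.
        return X[self.selected_features]


# Synthetic data (assumption): only f0 and f1 carry signal in the target.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 5)), columns=[f"f{i}" for i in range(5)])
y = pd.DataFrame({"y": 3.0 * X["f0"] + 2.0 * X["f1"] + rng.normal(scale=0.1, size=200)})

selector = TopImportanceSelector(RandomForestRegressor(n_estimators=50, random_state=0), 2)
selector.fit(X, y)
X_selected = selector.transform(X)
print(list(X_selected.columns))
```

In this sketch fit only records which columns to keep, so transform can be applied later to any dataframe that has the same column names.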

create_dummy_variable: Inserts n_dummy_variable dummy variables with the same mean and standard deviation as the whole dataframe

Args:

X: (dataframe), dataframe of X features

Returns:

X: (dataframe), dataframe that includes the dummy variables and is scaled with a standard scaler
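
One way to picture the dummy-variable step is sketched below. The helper name add_dummy_columns, the column names, and the seed argument are hypothetical. The way permuted dummies are built (shuffling an existing column) is an assumption, since the description above does not specify it. Standard scaling is done by hand here (zero mean, unit variance per column) rather than with a library scaler.

```python
import numpy as np
import pandas as pd


def add_dummy_columns(X, n_random_dummy=0, n_permuted_dummy=0, seed=0):
    """Hypothetical helper: append dummy columns, then standardize every column."""
    rng = np.random.default_rng(seed)
    out = X.copy()
    # Random dummies: Gaussian noise matching the mean and standard deviation
    # of all values in the dataframe taken together.
    mean, std = X.to_numpy().mean(), X.to_numpy().std()
    for i in range(n_random_dummy):
        out[f"random_dummy_{i}"] = rng.normal(mean, std, size=len(X))
    # Permuted dummies (assumption): a shuffled copy of an existing column,
    # which keeps its distribution but breaks any link to the target.
    for i in range(n_permuted_dummy):
        source = X.columns[i % X.shape[1]]
        out[f"permuted_dummy_{i}"] = rng.permutation(X[source].to_numpy())
    # Standard scaling: zero mean and unit variance per column.
    return (out - out.mean()) / out.std(ddof=0)


X = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]})
X_dummy = add_dummy_columns(X, n_random_dummy=1, n_permuted_dummy=1)
print(list(X_dummy.columns))  # ['a', 'b', 'random_dummy_0', 'permuted_dummy_0']
```

Because the dummy columns carry no information about the target, their importance gives a noise baseline against which real features can be compared.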

check_dummy_ranking: If dummy variables are used, prints a warning when the number of features selected is not optimal (that is, one or more of the selected features rank below a dummy variable)

Args:

feature_importances_sorted: list of features sorted based on their importances
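
The ranking check described above could look roughly like the following sketch. It assumes that feature_importances_sorted is a list of feature names ordered from most to least important and that dummy columns can be recognized by a substring in their name. The function name warn_if_dummy_selected and the dummy_tag argument are hypothetical.

```python
import warnings


def warn_if_dummy_selected(feature_importances_sorted, n_features_to_select, dummy_tag="dummy"):
    """Hypothetical check: warn when a dummy ranks inside the selected top n."""
    top = feature_importances_sorted[:n_features_to_select]
    flagged = [name for name in top if dummy_tag in name]
    if flagged:
        # A dummy in the top n means every feature ranked after it is no more
        # informative than noise, so fewer features should be selected.
        warnings.warn(
            f"{len(flagged)} dummy variable(s) rank within the top "
            f"{n_features_to_select} features; consider selecting fewer features."
        )
    return flagged


ranking = ["f0", "f1", "random_dummy_0", "f2"]  # most important first
print(warn_if_dummy_selected(ranking, 2))  # no dummy in the top 2, no warning
print(warn_if_dummy_selected(ranking, 3))  # dummy at rank 3 triggers the warning
```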

Methods Summary

check_dummy_ranking(feature_importances_sorted)

create_dummy_variable(X)

fit(X, y)

transform(X)

Methods Documentation

check_dummy_ranking(feature_importances_sorted)
create_dummy_variable(X)
fit(X, y)
transform(X)