OneHotGroupGenerator
- class mastml.feature_generators.OneHotGroupGenerator(featurize_df, remove_constant_columns=False)
Bases: BaseGenerator
Class to generate one-hot encoded values from a list of categories using scikit-learn's OneHotEncoder. More info at: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html
- Args:
featurize_df: (pd.DataFrame), pandas dataframe of group (category) names to make one-hot features from
- remove_constant_columns: (bool), whether to remove constant columns from the generated feature set. Setting this to False is recommended: it preserves as many features as possible and avoids potential issues at inference time, when features for new test points need to be generated.
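To illustrate what removing constant columns means, here is a minimal pandas sketch. It is not MAST-ML's own implementation; the dataframe and its column names ("a", "b", "c") are hypothetical, and the drop logic is an assumption about what such a step generally does.

```python
import pandas as pd

# Hypothetical generated feature set; column "b" has the same value in every row.
features = pd.DataFrame({"a": [1, 0, 1], "b": [1, 1, 1], "c": [0, 1, 0]})

# A column is constant when it contains exactly one distinct value.
constant = [col for col in features.columns if features[col].nunique() == 1]

# Dropping constant columns shrinks the feature set.
reduced = features.drop(columns=constant)
print(list(reduced.columns))
```

If a column is dropped during training but would vary for new data, the training and inference feature sets no longer line up, which is one way to read the recommendation to keep the flag set to False.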
- Methods:
- fit: pass through, copies input columns as pre-generated features
- Args:
X: (pd.DataFrame), input dataframe containing X data
y: (pd.Series), series containing y data
- transform: generate the one-hot encoded features. There will be n columns made, where n = number of unique categories in groups
- Args:
None.
- Returns:
df: (dataframe), output dataframe containing generated features
y: (series), output y data as series
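The encoding described for transform can be sketched directly with scikit-learn's OneHotEncoder. This is an illustration of the underlying technique only, not the generator's own code; the column name "group" and the category labels are made up for the example.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical group labels; "group" is an illustrative column name.
groups = pd.DataFrame({"group": ["oxide", "nitride", "oxide", "carbide"]})

encoder = OneHotEncoder()
# The default output is a sparse matrix; convert it to a dense array.
encoded = encoder.fit_transform(groups).toarray()

# One column per unique category, named after the category it indicates.
one_hot_df = pd.DataFrame(encoded, columns=encoder.categories_[0], index=groups.index)
print(one_hot_df)
```

With three unique categories among the four rows, the result has three columns, and each row contains a single 1 marking its group.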
Methods Summary
fit(X[, y])
transform([X])
Methods Documentation