BasePreprocessor
class mastml.preprocessing.BasePreprocessor(preprocessor, as_frame=False)
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
Base class that adds functionality beyond sklearn's fit_transform, such as dataframe support and directory management.
- Args:
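- preprocessor : a sklearn.preprocessing object, e.g. StandardScaler, or a mastml.preprocessing object
- as_frame : (bool), whether transformed data is returned as a pd.DataFrame (True) or a numpy array (False); defaults to False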
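The wrapper pattern described by these arguments can be sketched as follows. This is a minimal illustration, not MAST-ML's implementation: the class name `FrameAwarePreprocessor` is hypothetical, and restoring the input column labels is an assumption that only holds when the wrapped preprocessor keeps the number of features unchanged.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import StandardScaler


class FrameAwarePreprocessor(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in that wraps a preprocessor and honors as_frame."""

    def __init__(self, preprocessor, as_frame=False):
        self.preprocessor = preprocessor
        self.as_frame = as_frame

    def fit(self, X, y=None):
        # Delegate fitting to the wrapped preprocessor.
        self.preprocessor.fit(X, y)
        return self

    def transform(self, X):
        Xnew = self.preprocessor.transform(X)
        if self.as_frame:
            # Assumption: the feature count is unchanged, so the input
            # column labels and index can be reused.
            return pd.DataFrame(Xnew, columns=X.columns, index=X.index)
        return np.asarray(Xnew)


X = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

# fit_transform is inherited from TransformerMixin.
as_array = FrameAwarePreprocessor(StandardScaler()).fit_transform(X)
as_frame = FrameAwarePreprocessor(StandardScaler(), as_frame=True).fit_transform(X)
```

With `as_frame=False` the result is a plain numpy array; with `as_frame=True` the same values come back as a dataframe carrying the original column names.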
- Methods:
- fit_transform: method that fits the preprocessor to the data, then returns the transformed (preprocessed) data
- Args:
X: (pd.DataFrame), dataframe of X features
y: (pd.Series), series of y target data
- Returns:
- Transformed data (pd.DataFrame or numpy array based on self.as_frame)
- evaluate: main method to evaluate a preprocessor, build directory and save data output
- Args:
X: (pd.DataFrame), dataframe of X features
y: (pd.Series), series of y target data
savepath: (str), string containing main savepath to construct splits for saving output
- Returns:
- Xnew (pd.DataFrame or numpy array), dataframe or array of the preprocessed X features
- help: method to output key information on class use, e.g. methods and parameters
- Args:
- None
- Returns:
- None, but outputs help to screen
- _setup_savedir: method to create a savedir based on the provided model, splitter, selector names and datetime
- Args:
model: (mastml.models.SklearnModel or other estimator object), an estimator, e.g. KernelRidge
selector: (mastml.feature_selectors or other selector object), a selector, e.g. EnsembleModelFeatureSelector
savepath: (str), string designating the savepath
- Returns:
- splitdir: (str), string containing the new subdirectory to save results to
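The idea behind a timestamped save directory, as described for `_setup_savedir`, can be sketched with the standard library alone. The helper name `make_savedir`, the underscore-joined naming scheme, and the timestamp format are all assumptions made for illustration; the directory names MAST-ML actually produces may differ.

```python
import os
import tempfile
from datetime import datetime


def make_savedir(savepath, *names):
    """Hypothetical helper: create a subdirectory named after the given
    component names plus the current date and time."""
    timestamp = datetime.now().strftime("%Y_%m_%d_%H_%M_%S")
    dirname = "_".join(list(names) + [timestamp])
    splitdir = os.path.join(savepath, dirname)
    os.makedirs(splitdir, exist_ok=True)
    return splitdir


# Use a temporary directory as the main savepath for this example.
root = tempfile.mkdtemp()
splitdir = make_savedir(root, "KernelRidge", "EnsembleModelFeatureSelector")
```

Including the timestamp in the directory name keeps repeated runs with the same components from overwriting each other's output.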
Methods Summary
- evaluate(X[, y, savepath, file_name, …])
- fit(X)
- fit_transform(X[, y]): Fit to data, then transform it.
- help()
- inverse_transform(X)
- transform(X)
Methods Documentation
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
X : array-like of shape (n_samples, n_features). Input samples.
y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None. Target values (None for unsupervised transformations).
**fit_params : dict. Additional fit parameters.
- Returns:
X_new : ndarray of shape (n_samples, n_features_new). Transformed array.
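The fit_transform contract described above can be illustrated with a plain sklearn transformer. This sketch uses `StandardScaler` directly rather than `BasePreprocessor`, and the input values are made up for the example; it only shows that a single `fit_transform` call gives the same result as calling `fit` and then `transform` on the same data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Three samples, two features (illustrative values).
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# One call: fit the scaler and return the transformed data.
one_step = StandardScaler().fit_transform(X)

# Equivalent two-step form on a fresh scaler.
two_step = StandardScaler().fit(X).transform(X)
```

For an unsupervised transformer such as this one, `y` can be left at its default of `None`.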