BasePreprocessor

class mastml.preprocessing.BasePreprocessor(preprocessor, as_frame=False)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Base class that extends the sklearn fit_transform interface with additional functionality, such as dataframe support and directory management.

Args:
preprocessor : a sklearn.preprocessing object, e.g. StandardScaler, or a mastml.preprocessing object
as_frame : (bool), controls whether transformed data is returned as a pd.DataFrame or as a numpy array (default False)
Methods:
fit_transform: method that fits the preprocessor to the data, then transforms the data into its preprocessed form
Args:

X: (pd.DataFrame), dataframe of X features

y: (pd.Series), series of y target data

Returns:
Transformed data (pd.DataFrame or numpy array based on self.as_frame)
evaluate: main method to evaluate a preprocessor, build the output directory, and save the preprocessed data
Args:

X: (pd.DataFrame), dataframe of X features

y: (pd.Series), series of y target data

savepath: (str), string containing main savepath to construct splits for saving output

Returns:
Xnew (pd.DataFrame or numpy array), dataframe or array of the preprocessed X features
help: method to output key information on class use, e.g. methods and parameters
Args:
None
Returns:
None, but outputs help to screen
_setup_savedir: method to create a savedir based on the provided model, splitter, selector names and datetime
Args:

model: (mastml.models.SklearnModel or other estimator object), an estimator, e.g. KernelRidge

selector: (mastml.feature_selectors or other selector object), a selector, e.g. EnsembleModelFeatureSelector

savepath: (str), string designating the savepath

Returns:
splitdir: (str), string containing the new subdirectory to save results to
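
As an illustration of the directory management described for _setup_savedir, the following is a minimal sketch using only the Python standard library. It is not the mastml implementation: the function name setup_savedir, the DummyModel and DummySelector placeholder classes, the optional now argument, and the "ModelName_SelectorName_timestamp" naming scheme are all assumptions made for this example.

```python
import os
import tempfile
from datetime import datetime


def setup_savedir(model, selector, savepath, now=None):
    """Hypothetical stand-in for _setup_savedir; the naming scheme is assumed."""
    now = now or datetime.now()
    # Combine the class names of the supplied objects with a timestamp.
    parts = [
        type(model).__name__,
        type(selector).__name__,
        now.strftime("%Y_%m_%d_%H_%M_%S"),
    ]
    splitdir = os.path.join(savepath, "_".join(parts))
    # Create the subdirectory (and any missing parents) for saving results.
    os.makedirs(splitdir, exist_ok=True)
    return splitdir


class DummyModel:
    """Placeholder for an estimator such as KernelRidge."""


class DummySelector:
    """Placeholder for a selector such as EnsembleModelFeatureSelector."""


with tempfile.TemporaryDirectory() as root:
    splitdir = setup_savedir(
        DummyModel(), DummySelector(), root, now=datetime(2021, 1, 2, 3, 4, 5)
    )
    print(os.path.basename(splitdir))  # DummyModel_DummySelector_2021_01_02_03_04_05
```

Including the datetime in the directory name keeps repeated runs with the same model and selector from overwriting each other's output.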

Methods Summary

evaluate(X[, y, savepath, file_name, …])
fit(X)
fit_transform(X[, y]) Fit to data, then transform it.
help()
inverse_transform(X)
transform(X)

Methods Documentation

evaluate(X, y=None, savepath=None, file_name='', make_new_dir=False)[source]
fit(X)[source]
fit_transform(X, y=None, **fit_params)[source]

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
X : array-like of shape (n_samples, n_features)
Input samples.
y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None
Target values (None for unsupervised transformations).
**fit_params : dict
Additional fit parameters.

Returns:
X_new : ndarray of shape (n_samples, n_features_new)
Transformed array.
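
The "fit, then transform" behaviour described above can be sketched in plain Python. The classes below are simplified stand-ins written for this example, not sklearn or mastml code: TransformerMixinSketch mimics the documented fit_transform contract, and MeanCenterer is a hypothetical toy transformer operating on lists of rows.

```python
class TransformerMixinSketch:
    """Simplified stand-in for the documented fit_transform contract."""

    def fit_transform(self, X, y=None, **fit_params):
        # Fit on X (and y, if given), then transform the same X.
        if y is None:
            return self.fit(X, **fit_params).transform(X)
        return self.fit(X, y, **fit_params).transform(X)


class MeanCenterer(TransformerMixinSketch):
    """Toy transformer (hypothetical) that subtracts each column's mean."""

    def fit(self, X, y=None):
        n_samples = len(X)
        # zip(*X) iterates over columns of the row-major input.
        self.means_ = [sum(column) / n_samples for column in zip(*X)]
        return self

    def transform(self, X):
        return [[value - mean for value, mean in zip(row, self.means_)] for row in X]


X = [[1.0, 10.0], [3.0, 30.0]]
one_step = MeanCenterer().fit_transform(X)
two_step = MeanCenterer().fit(X).transform(X)

print(one_step)              # [[-1.0, -10.0], [1.0, 10.0]]
print(one_step == two_step)  # True
```

The one-call form is a convenience; it produces the same result as calling fit and transform separately on the same data.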
help()[source]
inverse_transform(X)[source]
transform(X)[source]
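
The fit, transform and inverse_transform methods listed above have no descriptions on this page. The sketch below shows one way a wrapper class could forward these calls to the wrapped preprocessor, and that inverse_transform undoes transform. It assumes simple delegation, which the source does not confirm; ToyStandardScaler and WrapperSketch are hypothetical names, and the scaler is a pure-Python stand-in so the example runs without sklearn.

```python
class ToyStandardScaler:
    """Pure-Python stand-in for a scaler such as StandardScaler."""

    def fit(self, X):
        n_samples = len(X)
        columns = list(zip(*X))
        self.mean_ = [sum(col) / n_samples for col in columns]
        # Population standard deviation per column; fall back to 1.0 if zero.
        self.scale_ = []
        for col, mean in zip(columns, self.mean_):
            std = (sum((v - mean) ** 2 for v in col) / n_samples) ** 0.5
            self.scale_.append(std if std > 0 else 1.0)
        return self

    def transform(self, X):
        return [
            [(v - m) / s for v, m, s in zip(row, self.mean_, self.scale_)]
            for row in X
        ]

    def inverse_transform(self, X):
        return [
            [v * s + m for v, m, s in zip(row, self.mean_, self.scale_)]
            for row in X
        ]


class WrapperSketch:
    """Hypothetical wrapper that forwards each call to the wrapped preprocessor."""

    def __init__(self, preprocessor):
        self.preprocessor = preprocessor

    def fit(self, X):
        self.preprocessor.fit(X)
        return self

    def transform(self, X):
        return self.preprocessor.transform(X)

    def inverse_transform(self, X):
        return self.preprocessor.inverse_transform(X)


X = [[1.0, 2.0], [3.0, 6.0]]
wrapper = WrapperSketch(ToyStandardScaler()).fit(X)
scaled = wrapper.transform(X)
restored = wrapper.inverse_transform(scaled)

print(scaled)    # [[-1.0, -1.0], [1.0, 1.0]]
print(restored)  # [[1.0, 2.0], [3.0, 6.0]]
```

Keeping the wrapped object as an attribute means any preprocessor exposing the same three methods can be swapped in without changing the wrapper.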