BasePreprocessor

class mastml.preprocessing.BasePreprocessor(preprocessor, as_frame=False)[source]

Bases: BaseEstimator, TransformerMixin

Base class that adds functionality beyond sklearn's fit_transform, such as dataframe support and directory management.

Args:

preprocessor : a sklearn.preprocessing object, e.g. StandardScaler, or a mastml.preprocessing object

as_frame : (bool), whether the transformed data is returned as a pd.DataFrame (True) or a numpy array (False); defaults to False

Methods:
fit_transform: method that fits the preprocessor to the data, then transforms the data with the fitted preprocessor
Args:

X: (pd.DataFrame), dataframe of X features

y: (pd.Series), series of y target data

Returns:

Transformed data (pd.DataFrame or numpy array based on self.as_frame)

evaluate: main method to evaluate a preprocessor, build directory and save data output
Args:

X: (pd.DataFrame), dataframe of X features

y: (pd.Series), series of y target data

savepath: (str), string containing main savepath to construct splits for saving output

file_extension: (str), must be either '.xlsx' or '.csv', determines data file type for saving

Returns:

Xnew (pd.DataFrame or numpy array), dataframe or array of the preprocessed X features

help: method to output key information on class use, e.g. methods and parameters
Args:

None

Returns:

None, but outputs help to screen

_setup_savedir: method to create a savedir based on the provided model, splitter, selector names and datetime
Args:

model: (mastml.models.SklearnModel or other estimator object), an estimator, e.g. KernelRidge

selector: (mastml.feature_selectors or other selector object), a selector, e.g. EnsembleModelFeatureSelector

savepath: (str), string designating the savepath

Returns:

splitdir: (str), string containing the new subdirectory to save results to
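
The wrapper pattern described above can be sketched as follows. This is a minimal stand-in, not mastml's actual implementation: the class name FramePreprocessorSketch is hypothetical, and it assumes that as_frame=True returns a pd.DataFrame with the original column names and as_frame=False returns a numpy array.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import StandardScaler


class FramePreprocessorSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for the wrapper pattern; not mastml's code."""

    def __init__(self, preprocessor, as_frame=False):
        self.preprocessor = preprocessor
        self.as_frame = as_frame

    def fit(self, X, y=None):
        # Delegate fitting to the wrapped sklearn preprocessor.
        self.preprocessor.fit(X)
        return self

    def transform(self, X):
        Xnew = self.preprocessor.transform(X)
        if self.as_frame:
            # Assumes the wrapped preprocessor keeps the number and order of columns.
            return pd.DataFrame(Xnew, columns=X.columns, index=X.index)
        return np.asarray(Xnew)


X = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

# fit_transform is inherited from TransformerMixin and calls fit, then transform.
as_array = FramePreprocessorSketch(StandardScaler(), as_frame=False).fit_transform(X)
as_frame = FramePreprocessorSketch(StandardScaler(), as_frame=True).fit_transform(X)

print(type(as_array).__name__)
print(type(as_frame).__name__)
```

Keeping the wrapped preprocessor as a constructor argument means any object with fit and transform methods can be swapped in without changing the wrapper.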

Methods Summary

evaluate(X[, y, savepath, file_name, ...])

fit(X)

fit_transform(X[, y])

Fit to data, then transform it.

help()

inverse_transform(X)

transform(X)

Methods Documentation

evaluate(X, y=None, savepath=None, file_name='', make_new_dir=False, file_extension='.csv')[source]
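
The source states only that evaluate builds a directory and saves the preprocessed data as '.csv' or '.xlsx'. As a rough illustration of that idea, the sketch below uses only the standard library. The helper save_preprocessed, the "Preprocessor_" folder prefix, the timestamp format, and the output file name are all assumptions made for this example, not mastml's actual behavior.

```python
import csv
import os
import tempfile
from datetime import datetime


def save_preprocessed(rows, header, savepath, file_name="", file_extension=".csv"):
    """Hypothetical helper: create a timestamped folder and write the data into it."""
    if file_extension not in (".csv", ".xlsx"):
        raise ValueError("file_extension must be either '.xlsx' or '.csv'")
    if file_extension == ".xlsx":
        # Writing Excel files needs a third-party package, so this sketch stops here.
        raise NotImplementedError("this sketch only writes '.csv' files")

    # Assumed naming scheme: a fixed label plus the current date and time.
    stamp = datetime.now().strftime("%Y_%m_%d_%H_%M_%S")
    splitdir = os.path.join(savepath, "Preprocessor_" + stamp)
    os.makedirs(splitdir, exist_ok=True)

    out_file = os.path.join(splitdir, "data_preprocessed_" + file_name + file_extension)
    with open(out_file, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return out_file


# Write into a temporary directory and read the file back before it is cleaned up.
with tempfile.TemporaryDirectory() as tmp:
    path = save_preprocessed([[0.0, 1.0], [1.0, 0.0]], ["a", "b"], tmp, file_name="demo")
    with open(path, newline="") as handle:
        saved = list(csv.reader(handle))

print(saved)
```
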

fit(X)[source]

fit_transform(X, y=None, **fit_params)[source]

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

X : array-like of shape (n_samples, n_features)

Input samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_params : dict

Additional fit parameters.

Returns

X_new : ndarray of shape (n_samples, n_features_new)

Transformed array.
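
The "fit, then transform" contract described above can be checked directly on a plain sklearn transformer. The example uses StandardScaler on its own rather than through BasePreprocessor, and the input array is made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# One call that fits and transforms.
one_step = StandardScaler().fit_transform(X)

# The same result obtained as two separate steps.
two_step = StandardScaler().fit(X).transform(X)

print(np.allclose(one_step, two_step))
print(one_step.shape)
```

The two-step form is the one to use when the fitted transformer must later be applied to new data, such as a held-out test set.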

help()[source]

inverse_transform(X)[source]

transform(X)[source]
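
The source gives no description for transform and inverse_transform. Assuming they delegate to the wrapped sklearn preprocessor (which the source does not state), the round trip they imply looks like the sketch below, shown with MinMaxScaler used directly and made-up input data.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1.0, 10.0], [2.0, 20.0], [4.0, 40.0]])

scaler = MinMaxScaler().fit(X)

# Scale each column to the [0, 1] range.
X_scaled = scaler.transform(X)

# Map the scaled values back to the original units.
X_back = scaler.inverse_transform(X_scaled)

print(np.allclose(X_back, X))
```
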