flag_outliers

mastml.data_cleaner.flag_outliers(df, conf_not_input_features, savepath, n_stdevs=3)

Scans each column of the X feature matrix and flags values that lie more than n_stdevs standard deviations (3 by default) from that column's mean. The index and column values of potentially problematic points are listed and written to an output file.

Args:
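df: (dataframe), pandas dataframe containing data
conf_not_input_features: (list), presumably the names of columns that are not input features; judging by the name, these are not scanned
savepath: (str), presumably the location where the output file is written
n_stdevs: (int or float), number of standard deviations from the column mean beyond which a value is flagged; default 3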
Returns:
None, just writes results to file
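
The flagging rule described above can be illustrated with a short pandas sketch. This is not MAST-ML's implementation: the function name sketch_flag_outliers, the not_input_features argument, the use of the population standard deviation, and the choice to return a dataframe rather than write a file are all assumptions made for illustration.

```python
import pandas as pd


def sketch_flag_outliers(df, not_input_features, n_stdevs=3):
    """Illustrative stand-in, not MAST-ML's code.

    Lists (index, column, value) for every entry that lies more than
    n_stdevs standard deviations from its column mean.
    """
    flagged = []
    for col in df.columns:
        if col in not_input_features:
            continue  # assumption: non-feature columns are skipped
        mean = df[col].mean()
        std = df[col].std(ddof=0)  # assumption: population standard deviation
        mask = (df[col] - mean).abs() > n_stdevs * std
        for idx, val in df.loc[mask, col].items():
            flagged.append({"index": idx, "column": col, "value": val})
    return pd.DataFrame(flagged, columns=["index", "column", "value"])


# Toy data: twenty ordinary values plus one extreme entry in column "x1".
data = pd.DataFrame({
    "x1": [1.0] * 20 + [100.0],
    "target": list(range(21)),
})
outliers = sketch_flag_outliers(data, not_input_features=["target"])
print(outliers)
```

In this toy data only the final row of x1 is flagged. Calling outliers.to_csv(...) on the result would be one way to mimic the file output, though the actual file name and format used by flag_outliers are not stated here.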