DataframeUtilities
- class mastml.feature_generators.DataframeUtilities
Bases: object
Class of basic utilities for manipulating dataframes and for converting between dataframes and numpy arrays
- Args:
None
- Methods:
- clean_dataframe: Clean a dataframe after feature generation has occurred by removing columns that contain even a single missing or NaN value and removing rows that are fully empty
- Args:
df: (dataframe), a post feature generation dataframe that needs cleaning
- Returns:
df: (dataframe), the cleaned dataframe
- merge_dataframe_columns: Merge two dataframes by concatenating their columns (duplicate columns omitted)
- Args:
dataframe1: (dataframe), a pandas dataframe object
dataframe2: (dataframe), a pandas dataframe object
- Returns:
dataframe: (dataframe), merged dataframe
- merge_dataframe_rows: Merge two dataframes by concatenating their rows (duplicate rows omitted)
- Args:
dataframe1: (dataframe), a pandas dataframe object
dataframe2: (dataframe), a pandas dataframe object
- Returns:
dataframe: (dataframe), merged dataframe
- get_dataframe_statistics: Obtain basic statistics about the data contained in the dataframe
- Args:
dataframe: (dataframe), a pandas dataframe object
- Returns:
dataframe_stats: (dataframe), dataframe containing input dataframe statistics
- dataframe_to_array: Transform a pandas dataframe to a numpy array
- Args:
dataframe: (dataframe), a pandas dataframe object
- Returns:
array: (numpy array), a numpy array representation of the input dataframe
- array_to_dataframe: Transform a numpy array to a pandas dataframe
- Args:
array: (numpy array), a numpy array
- Returns:
dataframe: (dataframe), a pandas dataframe representation of the input numpy array
- concatenate_arrays: Merge two numpy arrays by concatenating along the columns
- Args:
X_array: (numpy array), a numpy array object
y_array: (numpy array), a numpy array object
- Returns:
array: (numpy array), a numpy array merging the two input arrays
- assign_columns_as_features: Add column names to a dataframe based on the x and y feature names
- Args:
dataframe: (dataframe), a pandas dataframe object
x_features: (list), list containing x feature names
y_feature: (str), target feature name
- Returns:
dataframe: (dataframe), dataframe containing the same data as the input, with columns labeled by feature name
- save_all_dataframe_statistics: Obtain dataframe statistics and save them to a csv file
- Args:
dataframe: (dataframe), a pandas dataframe object
data_path: (str), file path to save dataframe statistics to
- Returns:
fname: (str), name of the file the dataframe statistics were saved to
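Example (illustrative): the behaviour described for several of the methods above can be approximated with plain pandas and numpy calls. The sketch below is not MAST-ML's implementation and does not call DataframeUtilities. The `sketch_*` function names and the sample data are hypothetical, and reading "duplicate columns omitted" as "keep the first column with a given name" is an assumption.

```python
import numpy as np
import pandas as pd


def sketch_clean_dataframe(df):
    """Drop fully empty rows, then drop any column that still has a missing value."""
    df = df.dropna(axis=0, how="all")    # rows in which every cell is NaN
    return df.dropna(axis=1, how="any")  # columns with at least one NaN


def sketch_merge_columns(df1, df2):
    """Concatenate column-wise, keeping the first copy of a repeated column name (assumed)."""
    merged = pd.concat([df1, df2], axis=1)
    return merged.loc[:, ~merged.columns.duplicated()]


def sketch_merge_rows(df1, df2):
    """Concatenate row-wise and drop rows that are exact duplicates."""
    return pd.concat([df1, df2], axis=0, ignore_index=True).drop_duplicates()


def sketch_arrays_to_labeled_dataframe(X, y, x_features, y_feature):
    """Join X and y along the columns, then label the columns with feature names."""
    y = np.asarray(y).reshape(len(X), -1)  # make y two-dimensional
    array = np.concatenate([X, y], axis=1)
    return pd.DataFrame(array, columns=list(x_features) + [y_feature])


# Hypothetical sample data: the last row is fully empty, column "b" has a gap.
raw = pd.DataFrame({
    "a": [1.0, 2.0, np.nan],
    "b": [4.0, np.nan, np.nan],
    "c": [7.0, 8.0, np.nan],
})

cleaned = sketch_clean_dataframe(raw)
print(list(cleaned.columns))  # ['a', 'c']

extra = pd.DataFrame({"c": [7.0, 8.0], "d": [0.1, 0.2]})
merged_cols = sketch_merge_columns(cleaned, extra)
print(list(merged_cols.columns))  # ['a', 'c', 'd']

merged_rows = sketch_merge_rows(cleaned, cleaned)
print(len(merged_rows))  # 2

# Dataframe -> array -> labeled dataframe round trip.
X = cleaned.to_numpy()
labeled = sketch_arrays_to_labeled_dataframe(X, [10.0, 20.0], ["a", "c"], "target")

# Basic per-column statistics, in the spirit of get_dataframe_statistics.
stats = labeled.describe()
```

The order of operations in `sketch_clean_dataframe` matters: dropping fully empty rows first prevents one blank row from causing every column to be discarded.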
Methods Summary
array_to_dataframe(array)
assign_columns_as_features(dataframe, ...[, ...])
clean_dataframe(df)
concatenate_arrays(X_array, y_array)
dataframe_to_array(dataframe)
get_dataframe_statistics(dataframe)
merge_dataframe_columns(dataframe1, dataframe2)
merge_dataframe_rows(dataframe1, dataframe2)
remove_constant_columns(dataframe)
save_all_dataframe_statistics(dataframe, ...)
Methods Documentation