Baseline_tests
- class mastml.baseline_tests.Baseline_tests
Bases: object
- Methods:
- test_mean: Compares the score of the model with the score obtained from a constant test value (the mean)
- Args:
X: (dataframe), dataframe of X features
y: (dataframe), dataframe of y data
metrics: (list), list of metric names to evaluate true vs. pred data in each split
- Returns:
A dataframe of the results of the model for the selected metrics
- test_permuted: Compares the score of the model with the score obtained from permuted (shuffled) test values
- Args:
X: (dataframe), dataframe of X features
y: (dataframe), dataframe of y data
metrics: (list), list of metric names to evaluate true vs. pred data in each split
- Returns:
A dataframe of the results of the model for the selected metrics
- test_nearest_neighbour_kdtree: Compares the score of the model with the score obtained from the test value of the nearest neighbour, found using a k-d tree
- Args:
X: (dataframe), dataframe of X features
y: (dataframe), dataframe of y data
metrics: (list), list of metric names to evaluate true vs. pred data in each split
- Returns:
A dataframe of the results of the model for the selected metrics
- test_nearest_neighbour_cdist: Compares the score of the model with the test value of the nearest neighbour found using cdist
- Args:
X: (dataframe), dataframe of X features
y: (dataframe), dataframe of y data
metrics: (list), list of metric names to evaluate true vs. pred data in each split
d_metric: (str), metric used to calculate the distance; default is 'euclidean'
- Returns:
A dataframe of the results of the model for the selected metrics
- test_classifier_random: Compares the score of the model with the score obtained from a randomly chosen class
- Args:
X: (dataframe), dataframe of X features
y: (dataframe), dataframe of y data
metrics: (list), list of metric names to evaluate true vs. pred data in each split
- Returns:
A dataframe of the results of the model for the selected metrics
- test_classifier_dominant: Compares the score of the model with the score obtained from the dominant class (i.e., the class with the highest count)
- Args:
X: (dataframe), dataframe of X features
y: (dataframe), dataframe of y data
metrics: (list), list of metric names to evaluate true vs. pred data in each split
- Returns:
A dataframe of the results of the model for the selected metrics
- print_results: Prints the comparison between the naive score and the real score
- Args:
real_score: The actual score of the model
naive_score: The naive (baseline) score, obtained by testing with the fake test values (fake_test)
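The naive baselines described above can be sketched in plain Python. This is only an illustration of the general idea, not MAST-ML's implementation: the helper names (`naive_mean`, `naive_permuted`, `naive_dominant`, `naive_random_class`), the sample data, and the hand-written metrics are all hypothetical, and it assumes that "permuted" means a shuffled copy of the test targets and that the constant value is the mean of the training targets.

```python
import random
from collections import Counter
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def accuracy(y_true, y_pred):
    """Fraction of labels predicted exactly."""
    return mean(1.0 if t == p else 0.0 for t, p in zip(y_true, y_pred))


def naive_mean(y_train, n_test):
    """Constant baseline: predict the training mean for every test point."""
    return [mean(y_train)] * n_test


def naive_permuted(y_test, seed=0):
    """Permutation baseline (assumed): a shuffled copy of the test targets."""
    shuffled = list(y_test)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def naive_dominant(y_train, n_test):
    """Dominant-class baseline: predict the most frequent training label."""
    dominant = Counter(y_train).most_common(1)[0][0]
    return [dominant] * n_test


def naive_random_class(y_train, n_test, seed=0):
    """Random-class baseline: draw uniformly from the observed classes."""
    classes = sorted(set(y_train))
    rng = random.Random(seed)
    return [rng.choice(classes) for _ in range(n_test)]


# Hypothetical regression targets and model predictions.
y_train = [1.0, 2.0, 3.0, 4.0]
y_test = [2.0, 3.0, 5.0]
y_pred = [2.1, 2.8, 4.6]

real_score = mean_absolute_error(y_test, y_pred)
naive_score = mean_absolute_error(y_test, naive_mean(y_train, len(y_test)))
print(f"real MAE = {real_score:.3f}, naive (mean) MAE = {naive_score:.3f}")

# Hypothetical classification labels.
labels_train = ["a", "a", "b", "a", "c"]
labels_test = ["a", "b", "a", "a"]
dominant_acc = accuracy(labels_test, naive_dominant(labels_train, len(labels_test)))
print(f"dominant-class accuracy = {dominant_acc:.2f}")
```

A model is only doing useful work if its real score clearly beats these naive scores; for an error metric such as mean absolute error, that means the real score should be lower.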
Methods Summary
test_classifier_dominant(X_train, X_test, ...)
test_classifier_random(X_train, X_test, ...)
test_mean(X_train, X_test, y_train, y_test, ...)
test_nearest_neighbour_cdist(X_train, ...[, ...])
test_nearest_neighbour_kdtree(X_train, ...)
test_permuted(X_train, X_test, y_train, ...)
to_excel(real_score, naive_score)
Methods Documentation
- test_classifier_dominant(X_train, X_test, y_train, y_test, model, metrics=['mean_absolute_error'])
- test_classifier_random(X_train, X_test, y_train, y_test, model, metrics=['mean_absolute_error'])
- test_nearest_neighbour_cdist(X_train, X_test, y_train, y_test, model, metrics=['mean_absolute_error'], d_metric='euclidean')
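The nearest-neighbour baselines can be illustrated with SciPy. The sketch below is an assumption about the general technique, not the library's own code: it takes each test point's baseline prediction to be the target of its closest training point, found either with `scipy.spatial.distance.cdist` (which accepts a distance metric, analogous to `d_metric`) or with `scipy.spatial.cKDTree` (Euclidean distance). The function names and sample data are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import cdist


def nearest_neighbour_cdist(X_train, y_train, X_test, d_metric="euclidean"):
    """Predict each test target as the target of its closest training point."""
    # Pairwise distance matrix with shape (n_test, n_train).
    distances = cdist(X_test, X_train, metric=d_metric)
    nearest = distances.argmin(axis=1)
    return np.asarray(y_train)[nearest]


def nearest_neighbour_kdtree(X_train, y_train, X_test):
    """Same baseline, using a k-d tree (Euclidean distance) for the lookup."""
    tree = cKDTree(X_train)
    _, nearest = tree.query(X_test, k=1)
    return np.asarray(y_train)[nearest]


# Hypothetical features and targets.
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
y_train = np.array([10.0, 20.0, 30.0])
X_test = np.array([[0.2, 0.1], [4.0, 4.5]])

pred_cdist = nearest_neighbour_cdist(X_train, y_train, X_test)
pred_tree = nearest_neighbour_kdtree(X_train, y_train, X_test)
print(pred_cdist, pred_tree)
```

The `cdist` variant builds the full distance matrix, so it supports any metric SciPy offers but its memory use grows with n_test × n_train; the k-d tree variant avoids the full matrix and tends to scale better for low-dimensional data.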