Bootstrap
class mastml.legos.data_splitters.Bootstrap(n, n_bootstraps=3, train_size=0.5, test_size=None, n_train=None, n_test=None, random_state=0)
Bases: object
Random sampling with replacement cross-validation iterator.

Note: Bootstrap was taken directly from the sklearn GitHub repository (https://github.com/scikit-learn/scikit-learn/blob/0.11.X/sklearn/cross_validation.py). This was necessary because it was removed from more recent sklearn releases.

Provides train/test indices to split data into train and test sets while resampling the input n_bootstraps times: each time, a new random split of the data is performed and then samples are drawn (with replacement) on each side of the split to build the training and test sets.

Note: contrary to other cross-validation strategies, bootstrapping allows some samples to occur several times in each split. However, a sample that occurs in the train split will never occur in the test split, and vice versa. If you want each sample to occur at most once, you should probably use ShuffleSplit cross-validation instead.
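The resampling procedure described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the mastml or sklearn source: the function name `bootstrap_indices_sketch`, its default sizes, and the use of the standard library `random` module (rather than a NumPy RandomState) are assumptions made for this example.

```python
import random


def bootstrap_indices_sketch(n, n_bootstraps=3, n_train=None, n_test=None, random_state=0):
    """Yield (train, test) index lists. Illustration only, not the mastml code."""
    rng = random.Random(random_state)
    # Assumed defaults for this sketch: half the data for training, the rest for testing.
    n_train = n // 2 if n_train is None else n_train
    n_test = n - n_train if n_test is None else n_test
    for _ in range(n_bootstraps):
        # A new random split of the data is performed on every iteration.
        shuffled = rng.sample(range(n), n)
        train_pool = shuffled[:n_train]
        test_pool = shuffled[n_train:n_train + n_test]
        # Samples are then drawn with replacement on each side of the split.
        train = rng.choices(train_pool, k=n_train)
        test = rng.choices(test_pool, k=n_test)
        yield train, test


for train, test in bootstrap_indices_sketch(10, n_bootstraps=2, n_train=6, n_test=4):
    # Indices may repeat within a set, but the two sets never overlap.
    print(sorted(train), sorted(test))
```

Because each side is sampled only from its own pool, an index can appear several times in the training set yet never leaks into the test set, which is the property the note above describes.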
- Args:
- n : int
- Total number of elements in the dataset.
- n_bootstraps : int (default is 3)
- Number of bootstrapping iterations.
- train_size : int or float (default is 0.5)
- If int, number of samples to include in the training split (should be smaller than the total number of samples passed in the dataset). If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split.
- test_size : int or float or None (default is None)
- If int, number of samples to include in the test set (should be smaller than the total number of samples passed in the dataset). If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If None, n_test is set as the complement of n_train.
- random_state : int or RandomState
- Pseudo-random number generator state used for random sampling.
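As a rough illustration of how the int, float, and None cases for train_size and test_size could translate into sample counts, consider the sketch below. The helper `resolve_sizes_sketch` is hypothetical and not part of mastml; in particular, rounding fractional sizes up with `ceil` and raising `ValueError` for oversized requests are assumptions of this example rather than documented behavior.

```python
from math import ceil


def resolve_sizes_sketch(n, train_size=0.5, test_size=None):
    """Return (n_train, n_test) for a dataset of n samples. Hypothetical helper."""
    # A float is read as a proportion of the dataset; an int as an absolute count.
    if isinstance(train_size, float):
        n_train = int(ceil(train_size * n))  # rounding up is an assumption
    else:
        n_train = int(train_size)
    if n_train > n:
        raise ValueError("train_size=%r exceeds n=%d" % (train_size, n))

    if test_size is None:
        # With no test size given, the test set is the complement of the training set.
        n_test = n - n_train
    elif isinstance(test_size, float):
        n_test = int(ceil(test_size * n))
    else:
        n_test = int(test_size)
    if n_test > n:
        raise ValueError("test_size=%r exceeds n=%d" % (test_size, n))
    return n_train, n_test


print(resolve_sizes_sketch(10))          # defaults: half for training, rest for testing
print(resolve_sizes_sketch(10, 4, 2))    # absolute counts are used as given
```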
Attributes Summary
indices
Methods Summary
get_n_splits([X, y, groups])
split(X, y[, groups])
Attributes Documentation
indices = True
Methods Documentation