Takes repeated stratified samples from the population, each made of k - 1 of the k stratified folds. See the Details section below for further information.
Arguments
- k
number of folds
- strata
vector of stratification variables. The population size is length(strata).
Details
Each element in the population is randomly assigned to one of the k
folds by the stratifiedKFolds function, so that the proportion of each
stratum in the population is preserved in each fold.
A list of length k is then created from these folds, so that the i-th
item of the list is a vector of indices generated by removing the i-th fold
and merging the remaining k - 1 folds together.
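The construction described above can be sketched in base R. This is an illustrative sketch only, not the package's implementation: the helpers assign_stratified_folds and km1_folds are hypothetical names, and dealing each shuffled stratum round-robin into the folds is an assumption about how stratum proportions could be preserved. The actual stratifiedKFolds function may work differently.

```r
# Hypothetical helper (not part of the package): give every element a fold id
# in 1..k, spreading each stratum as evenly as possible over the folds.
assign_stratified_folds <- function(strata, k) {
  fold_id <- integer(length(strata))
  for (s in unique(strata)) {
    idx <- which(strata == s)
    # Shuffle the positions of this stratum, then deal them round-robin
    idx <- idx[sample.int(length(idx))]
    fold_id[idx] <- rep_len(seq_len(k), length(idx))
  }
  fold_id
}

# Hypothetical helper: the i-th item holds the indices of all elements that
# are NOT in fold i, i.e. the remaining k - 1 folds merged together.
km1_folds <- function(strata, k) {
  fold_id <- assign_stratified_folds(strata, k)
  lapply(seq_len(k), function(i) which(fold_id != i))
}

set.seed(1)
strata <- c(1, 1, 1, 2, 2, 2, 2, 2, 2)
res <- km1_folds(strata, k = 3)

# Each item should contain 6 of the 9 indices, 2 of them from stratum 1
lengths(res)
sapply(res, function(idx) sum(strata[idx] == 1))
```

Because every stratum size in this toy vector is a multiple of k, the proportions are preserved exactly. With stratum sizes that do not divide evenly by k, they can only be preserved approximately.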
Examples
#Set seed for reproducibility
set.seed(seed = 5381L)
#Define strata
strata = c(1,1,1,2,2,2,2,2,2)
#Check ratio
table(strata)/length(strata)
#> strata
#>         1         2
#> 0.3333333 0.6666667
#Assign data to 3 folds
i = repeatedStratifiedKm1Folds(
  strata = strata,
  k = 3
)
#Check indices
i
#> [[1]]
#> [1] 1 3 4 5 7 8
#>
#> [[2]]
#> [1] 2 3 4 6 8 9
#>
#> [[3]]
#> [1] 1 2 5 6 7 9
#>
#Check ratio in the samples made of k-1 folds
table(strata[i[[1]]])/length(strata[i[[1]]])
#>
#>         1         2
#> 0.3333333 0.6666667
table(strata[i[[2]]])/length(strata[i[[2]]])
#>
#>         1         2
#> 0.3333333 0.6666667
table(strata[i[[3]]])/length(strata[i[[3]]])
#>
#>         1         2
#> 0.3333333 0.6666667
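
Since each item of the list is the population minus one fold, the held-out fold can be recovered with setdiff. The sketch below is a supplementary check, not part of the original example. It hard-codes the indices printed above so that it runs without the package, and the name held_out is introduced here for illustration only.

```r
# Same strata and indices as in the example output above
strata <- c(1, 1, 1, 2, 2, 2, 2, 2, 2)
i <- list(
  c(1, 3, 4, 5, 7, 8),
  c(2, 3, 4, 6, 8, 9),
  c(1, 2, 5, 6, 7, 9)
)

# Indices left out of each sample, i.e. the removed fold
held_out <- lapply(i, function(idx) setdiff(seq_along(strata), idx))
held_out

# Every index should appear in exactly k - 1 = 2 of the samples
table(unlist(i))

# Stratum composition of each removed fold
lapply(held_out, function(idx) table(strata[idx]))
```

In this example each removed fold contains one element of stratum 1 and two of stratum 2, so the folds themselves keep the 1:2 ratio of the population.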