
Takes repeated stratified samples from the population. See the Details section below for further information.

Usage

repeatedStratifiedKm1Folds(k, strata)

Arguments

k

number of folds

strata

vector giving the stratum of each element in the population. The population size is length(strata).

Value

A list of length k where each element is a vector containing the indices of the sampled data.

Details

Each element in the population is randomly assigned to one of the k folds by using the stratifiedKFolds function, so that the proportion of each stratum in the population is preserved in each fold. A list of length k is then created from these folds: the i-th item of the list is a vector of indices obtained by removing the i-th fold and merging the remaining k - 1 folds.
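The two steps described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: sketchStratifiedFolds and sketchKm1Folds are hypothetical names, and the fold assignment here is a simplified stand-in for stratifiedKFolds, whose internals are not shown on this page.

```r
# Hypothetical stand-in for the stratified fold assignment (an assumption,
# not the package's stratifiedKFolds): within each stratum, shuffle the
# indices and deal them out to folds 1..k in turn.
sketchStratifiedFolds <- function(k, strata) {
  fold_id <- integer(length(strata))
  for (s in unique(strata)) {
    idx <- which(strata == s)
    # Guard: sample() on a single number would sample from 1:idx
    if (length(idx) > 1L) idx <- sample(idx)
    fold_id[idx] <- rep_len(seq_len(k), length(idx))
  }
  # One vector of indices per fold
  unname(split(seq_along(strata), factor(fold_id, levels = seq_len(k))))
}

# Hypothetical sketch of the k - 1 merge: the i-th sample is obtained by
# dropping fold i and merging the remaining folds.
sketchKm1Folds <- function(k, strata) {
  folds <- sketchStratifiedFolds(k = k, strata = strata)
  lapply(seq_len(k), function(i) sort(unlist(folds[-i], use.names = FALSE)))
}

set.seed(5381L)
strata <- c(1, 1, 1, 2, 2, 2, 2, 2, 2)
samples <- sketchKm1Folds(k = 3, strata = strata)
```

With 3 elements in stratum 1 and 6 in stratum 2, each sketched fold holds one element of stratum 1 and two of stratum 2, so every merged sample has 6 indices in the original 1:2 ratio. When a stratum's size is not a multiple of k, this simplified dealing gives slightly uneven folds.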

See also

Author

Alessandro Barberis

Examples

#Set seed for reproducibility
set.seed(seed = 5381L)

#Define strata
strata = c(1,1,1,2,2,2,2,2,2)

#Check ratio
table(strata)/length(strata)
#> strata
#>         1         2 
#> 0.3333333 0.6666667 

#Assign data to 3 folds
i = repeatedStratifiedKm1Folds(
 strata = strata,
 k = 3
)
#Check indices
i
#> [[1]]
#> [1] 1 3 4 5 7 8
#> 
#> [[2]]
#> [1] 2 3 4 6 8 9
#> 
#> [[3]]
#> [1] 1 2 5 6 7 9
#> 
#Check ratio in the samples made of k-1 folds
table(strata[i[[1]]])/length(strata[i[[1]]])
#> 
#>         1         2 
#> 0.3333333 0.6666667 
table(strata[i[[2]]])/length(strata[i[[2]]])
#> 
#>         1         2 
#> 0.3333333 0.6666667 
table(strata[i[[3]]])/length(strata[i[[3]]])
#> 
#>         1         2 
#> 0.3333333 0.6666667
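
Because each sample is built by leaving one fold out, the held-out fold can be recovered as the complement of the sample. The sketch below assumes the indices printed above and copies them in literally, so it runs without the package; held_out is a name introduced here for illustration.

```r
strata <- c(1, 1, 1, 2, 2, 2, 2, 2, 2)

# Indices copied from the printed output above
i <- list(
  c(1, 3, 4, 5, 7, 8),
  c(2, 3, 4, 6, 8, 9),
  c(1, 2, 5, 6, 7, 9)
)

# The held-out fold is whatever is not in the sample
held_out <- lapply(i, function(s) setdiff(seq_along(strata), s))

# Each held-out fold contains one element of stratum 1 and two of stratum 2
lapply(held_out, function(h) table(strata[h]))
```

The three held-out folds are disjoint and together cover the whole population, which is the usual layout for using each sample as a training set and its complement as the matching test set.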