Assigns the population to k stratified folds and returns a random sample made up of the elements of k-1 of those folds. See the Details section below for further information.

Usage

stratifiedKm1Folds(k, strata, i = NULL)

Arguments

k

number of folds

strata

vector of stratification variables. The population size is length(strata)

i

(optional) integer, index of the fold to be used as holdout data

Value

A vector containing the indices of the sampled data.

Details

Each element in the population is randomly assigned to one of the k folds by the stratifiedKFolds function, so that the proportion of each stratum in the population is preserved in each fold. If provided, i is the index of the fold to be held out. If i is not provided, one fold is selected at random as the holdout data. The random sample is then generated by removing the holdout fold and merging the remaining k - 1 folds.
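
The procedure described above can be sketched in base R. This is only an illustration, not the package's implementation: the helpers sketch_stratified_folds and sketch_km1_sample are hypothetical names, and the fold assignment shown here (balanced labels shuffled within each stratum) is an assumption about how a stratified split might be done.

```r
# Hypothetical helper (not part of the package): assign each element
# to one of k folds, working stratum by stratum.
sketch_stratified_folds <- function(k, strata) {
  fold_id <- integer(length(strata))
  for (s in unique(strata)) {
    idx <- which(strata == s)
    # Balanced fold labels for this stratum (1, 2, ..., k, 1, 2, ...)
    labels <- rep_len(seq_len(k), length(idx))
    # Shuffle the labels so the assignment within the stratum is random
    fold_id[idx] <- labels[sample.int(length(idx))]
  }
  fold_id
}

# Hypothetical helper: drop fold i and return the indices of the
# remaining k - 1 folds. If i is NULL, the holdout fold is drawn at random.
sketch_km1_sample <- function(k, strata, i = NULL) {
  fold_id <- sketch_stratified_folds(k, strata)
  if (is.null(i)) i <- sample.int(k, size = 1L)
  which(fold_id != i)
}

set.seed(1)
strata <- c(1, 1, 1, 2, 2, 2, 2, 2, 2)
idx <- sketch_km1_sample(k = 3, strata = strata, i = 2)

# 6 of the 9 elements remain: 2 from stratum 1 and 4 from stratum 2
length(idx)
#> [1] 6
table(strata[idx]) / length(idx)
```

In this sketch, when a stratum size is not a multiple of k, the extra elements always go to the lowest-numbered folds. That is a simplification and may differ from what stratifiedKFolds does.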

See also

stratifiedKFolds

Author

Alessandro Barberis

Examples

#Set seed for reproducibility
set.seed(seed = 5381L)

#Define strata
strata = c(1,1,1,2,2,2,2,2,2)

#Check ratio
table(strata)/length(strata)
#> strata
#>         1         2 
#> 0.3333333 0.6666667 

#Assign data to 3 folds (the repeated variant is called here: each list element is one sample of k-1 folds)
i = repeatedStratifiedKm1Folds(
 strata = strata,
 k = 3
)
#Check indices
i
#> [[1]]
#> [1] 1 3 4 5 7 8
#> 
#> [[2]]
#> [1] 2 3 4 6 8 9
#> 
#> [[3]]
#> [1] 1 2 5 6 7 9
#> 
#Check ratio in the samples made of k-1 folds
table(strata[i[[1]]])/length(strata[i[[1]]])
#> 
#>         1         2 
#> 0.3333333 0.6666667 
table(strata[i[[2]]])/length(strata[i[[2]]])
#> 
#>         1         2 
#> 0.3333333 0.6666667 
table(strata[i[[3]]])/length(strata[i[[3]]])
#> 
#>         1         2 
#> 0.3333333 0.6666667
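
As a follow-up to the example above, the held-out fold for a given sample can be recovered as the complement of the returned indices. The snippet below is a small sketch that hard-codes the first sample printed above; the names sample_1 and holdout_1 are introduced here for illustration only.

```r
# Strata and first sample copied from the example output above
strata <- c(1, 1, 1, 2, 2, 2, 2, 2, 2)
sample_1 <- c(1, 3, 4, 5, 7, 8)

# Elements that are not in the sample form the held-out fold
holdout_1 <- setdiff(seq_along(strata), sample_1)
holdout_1
#> [1] 2 6 9

# The holdout fold has 1 element of stratum 1 and 2 of stratum 2,
# so it preserves the 1/3 to 2/3 ratio as well
table(strata[holdout_1]) / length(holdout_1)
```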