
Takes a stratified sample without replacement from the population. See the Details section below for further information.

Usage

stratifiedSampleWithoutReplacement(strata, n, prob = NULL)

Arguments

strata

vector giving the stratum of each element in the population. The population size is length(strata)

n

positive integer value, the sample size

prob

(optional) vector of positive numeric values giving the probability weights for sampling the elements of strata. If provided, it must have the same length as strata

Value

A vector of length n containing the indices of the randomly selected observations.

Details

Stratified sampling is a technique for sampling from a population that can be partitioned into 'strata' (or 'subpopulations'), where each element of the population belongs to one and only one stratum. It is used to ensure that the subgroups of the population are represented in the sample. This function implements the so-called "proportionate allocation", in which each stratum makes up the same proportion of the sample as it does of the population.
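To make the idea of proportionate allocation concrete, here is a minimal sketch in base R. It is not the implementation used by stratifiedSampleWithoutReplacement: the helper name proportionateSample is hypothetical, the prob argument is ignored, and the largest-remainder rounding used to make the per-stratum sizes sum to n is an assumption chosen for illustration.

```r
# Hypothetical helper (not part of the package): a sketch of proportionate
# allocation followed by sampling without replacement within each stratum.
proportionateSample <- function(strata, n) {
  N <- length(strata)
  stopifnot(n >= 1, n <= N)

  # Population indices grouped by stratum
  groups <- split(seq_len(N), strata)
  sizes  <- lengths(groups)

  # Ideal (possibly fractional) allocation per stratum: n * N_h / N
  ideal <- n * sizes / N

  # Assumed rounding rule: largest remainders get the leftover units,
  # so that the allocations sum exactly to n
  alloc <- floor(ideal)
  short <- n - sum(alloc)
  if (short > 0) {
    top <- order(ideal - alloc, decreasing = TRUE)[seq_len(short)]
    alloc[top] <- alloc[top] + 1
  }

  # Draw without replacement inside each stratum
  picked <- Map(function(idx, k) idx[sample.int(length(idx), k)], groups, alloc)
  unlist(picked, use.names = FALSE)
}

# Usage: 3 of 9 elements, strata in a 1:2 ratio
strata <- c(rep("a", 3), rep("b", 6))
i <- proportionateSample(strata, n = 3)
table(strata[i])
```

With these inputs the ideal allocations are whole numbers (1 and 2), so the sample always contains one element from stratum "a" and two from stratum "b". Only which indices are drawn depends on the random seed.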

References

https://en.wikipedia.org/wiki/Stratified_sampling

Author

Alessandro Barberis

Examples

#Set seed for reproducibility
set.seed(seed = 5381L)

#Define strata
strata = c(rep("a", 3),rep("b", 6))

#Check ratio
table(strata)/length(strata)
#> strata
#>         a         b 
#> 0.3333333 0.6666667 

#Stratified random sample
i = stratifiedSampleWithoutReplacement(
  strata = strata,
  n = 3
)
#Check indices
i
#> [1] 5 4 1
#Check ratio in the sample
table(strata[i])/length(strata[i])
#> 
#>         a         b 
#> 0.3333333 0.6666667