Balanced Random Sample With Replacement
Source: R/1-sampling-functions.R, balancedSampleWithReplacement.Rd
Takes a balanced sample with replacement from the population. See the Details section below for further information.
Arguments
- strata
vector of stratification variables. The population size is length(strata).
- n
positive integer value, the sample size
- prob
(optional) vector of positive numeric values, the probability weights for obtaining the strata elements. If provided, it must be the same length as strata.
Details
This function works whether the number of elements per stratum (the sample size n divided by the number of groups in strata) is less than or greater than the number of elements in the minority group in strata, because it takes independent samples with replacement from each group.
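The idea of independent per-group sampling can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's actual implementation: the helper name balanced_sample_sketch is hypothetical, the per-stratum quota is assumed to be n divided by the number of groups and rounded down, and the returned indices are grouped by stratum rather than ordered as the package returns them.

```r
# Hypothetical helper for illustration only; not the package's code.
balanced_sample_sketch <- function(strata, n, prob = NULL) {
  groups <- unique(strata)
  # Assumption: every stratum receives the same quota, rounded down
  per_group <- n %/% length(groups)
  idx <- seq_along(strata)
  out <- lapply(groups, function(g) {
    # Positions in the population that belong to stratum g
    pool <- idx[strata == g]
    # Restrict the optional weights to this stratum
    w <- if (is.null(prob)) NULL else prob[pool]
    # Draw positions within the pool, with replacement, so a stratum
    # smaller than its quota can still fill it
    pool[sample.int(length(pool), size = per_group, replace = TRUE, prob = w)]
  })
  unlist(out)
}
```

Because each stratum is sampled with replacement, a group with only 3 elements can still contribute 4 draws, which is why the quota may exceed the size of the minority group.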
References
He and Garcia, Learning from Imbalanced Data, IEEE Transactions on Knowledge and Data Engineering (2009)
Examples
#Set seed for reproducibility
set.seed(seed = 5381L)
#Define strata
strata = c(rep("a", 3), rep("b", 6))
#Check ratio
table(strata)/length(strata)
#> strata
#> a b
#> 0.3333333 0.6666667
#Balanced random sample with replacement
i = balancedSampleWithReplacement(
  strata = strata,
  n = 8
)
#Check indices
i
#> [1] 1 3 4 2 1 6 7 7
#Check ratio in the sample
table(strata[i])/length(strata[i])
#>
#> a b
#> 0.5 0.5