Filter by Low Variability — rowFilterByLowVariability • featscreen

This function filters the rows (features) of the input matrix x by their variability across observations.

See rowVariability for further information.

Usage

rowFilterByLowVariability(
  x,
  g = NULL,
  percentile = 0.25,
  method = c("sd", "iqr", "mad", "rsd", "vmr", "efficiency")
)

Arguments

x

matrix or data.frame, where rows are features and columns are observations.

g

(optional) vector or factor object giving the group of the corresponding columns (observations) of x.

percentile

numerical value in the range \([0, 1]\) indicating the fraction of features to keep (e.g., 0.25 for 25%).
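
As a rough illustration of how a fraction-based cutoff can work, the sketch below keeps the features whose variability score lies strictly above the (1 - percentile) quantile. The cutoff rule, the helper name keep_top_fraction_sketch, and the scores are assumptions made for illustration; the exact rule used by rowFilterByLowVariability is not stated on this page.

```r
# Hypothetical helper, not part of featscreen: flag the most variable features.
# Assumption: a feature is kept when its score is strictly above the
# (1 - percentile) quantile of all scores.
keep_top_fraction_sketch <- function(variability, percentile = 0.25) {
  stopifnot(is.numeric(variability), percentile >= 0, percentile <= 1)
  cutoff <- unname(stats::quantile(variability, probs = 1 - percentile))
  variability > cutoff
}

# Made-up variability scores for five features
v <- c(f1 = 1, f2 = 5, f3 = 2, f4 = 3, f5 = 4)
keep <- keep_top_fraction_sketch(v, percentile = 0.25)
print(keep)
```

Under this assumed rule, ties at the cutoff are dropped, so the number of kept features can be smaller than percentile * length(variability).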

method

character string indicating the measure of variability. Available options are:

"sd": the standard deviation
"iqr": the interquartile range
"mad": the median absolute deviation
"rsd": the relative standard deviation (i.e., coefficient of variation)
"efficiency": the coefficient of variation squared
"vmr": the variance-to-mean ratio

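The measures listed above can be approximated per row with base R, as in the minimal sketch below. The helper row_variability_sketch is hypothetical and is not the featscreen implementation; in particular, stats::mad applies a scaling constant of 1.4826 by default, and whether rowVariability does the same is an assumption.

```r
# Hypothetical helper for illustration; not the featscreen implementation.
row_variability_sketch <- function(
    x,
    method = c("sd", "iqr", "mad", "rsd", "vmr", "efficiency")
) {
  method <- match.arg(method)
  x <- as.matrix(x)
  # Pick the per-row measure matching the requested method
  measure <- switch(
    method,
    sd         = function(v) stats::sd(v),
    iqr        = function(v) stats::IQR(v),
    mad        = function(v) stats::mad(v),            # scaled by 1.4826 by default
    rsd        = function(v) stats::sd(v) / mean(v),   # coefficient of variation
    vmr        = function(v) stats::var(v) / mean(v),
    efficiency = function(v) (stats::sd(v) / mean(v))^2
  )
  # Rows are features, so apply the measure over margin 1
  apply(x, 1, measure)
}

# Two features, four observations; f2 is constant
x_small <- matrix(
  c(1, 2, 3, 4,
    2, 2, 2, 2),
  nrow = 2,
  byrow = TRUE,
  dimnames = list(c("f1", "f2"), paste0("S", 1:4))
)
sd_scores <- row_variability_sketch(x_small, method = "sd")
print(sd_scores)
```

The ratio-based measures (rsd, vmr, efficiency) divide by the row mean, so they are undefined or unstable for rows whose mean is zero or close to zero.
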
Value

A logical vector of length nrow(x) indicating which rows of x passed the filter.
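
The returned vector is typically used to subset the rows of x. The sketch below uses a hand-written logical vector in place of a real call, so the matrix and the values of keep are illustrative only.

```r
# Small example matrix: 5 features (rows) by 3 observations (columns)
x_demo <- matrix(
  1:15,
  nrow = 5,
  dimnames = list(paste0("f", 1:5), paste0("S", 1:3))
)

# Stand-in for the output of rowFilterByLowVariability(x_demo)
keep <- c(f1 = FALSE, f2 = TRUE, f3 = FALSE, f4 = FALSE, f5 = FALSE)

# drop = FALSE keeps the result a matrix even when a single row passes
x_filtered <- x_demo[keep, , drop = FALSE]
print(dim(x_filtered))
```

Using drop = FALSE avoids the silent conversion to a plain vector that R performs when only one feature passes the filter.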

Author

Alessandro Barberis

Examples

#Seed
set.seed(1010)

#Define row/col size
nr = 5
nc = 10

#Data
x = matrix(
 data = sample.int(n = 100, size = nr*nc, replace = TRUE),
 nrow = nr,
 ncol = nc,
 dimnames = list(
   paste0("f",seq(nr)),
   paste0("S",seq(nc))
 )
)

#Grouping variable
g = c(rep("a", nc/2), rep("b", nc/2))

#Filter
rowFilterByLowVariability(x)
#>    f1    f2    f3    f4    f5 
#> FALSE  TRUE FALSE FALSE FALSE 

#Filter by group
rowFilterByLowVariability(x = x, g = g)
#>    f1    f2    f3    f4    f5 
#> FALSE  TRUE FALSE FALSE FALSE