This function flags the features (rows) of the input matrix x according to their intensity. A feature is marked for removal if its median value is lower than a provided minimum.

See the Details section below for further information.

Usage

rowFilterByMedianAboveMinExpr(x, g = NULL, min.expr = 0)

Arguments

x

matrix or data.frame, where rows are features and columns are observations.

g

(optional) vector or factor giving the group of each observation (column) of x.

min.expr

numeric value giving the minimum median expression a feature must reach to be kept.

Value

A logical vector of length nrow(x) indicating which rows of x passed the filter.
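
As a brief illustration of how such a logical vector is typically used, the sketch below subsets a small matrix with it. The vector keep is written by hand here as a stand-in for the function's return value, and the toy data are assumptions made for the example.

```r
# Toy data: 3 features (rows) by 3 observations (columns)
x <- matrix(
  data = c(0, 0, 1,
           5, 6, 7,
           8, 9, 10),
  nrow = 3,
  byrow = TRUE,
  dimnames = list(paste0("f", 1:3), paste0("S", 1:3))
)

# Hand-written stand-in for the logical vector returned by the filter
keep <- c(f1 = FALSE, f2 = TRUE, f3 = TRUE)

# Keep only the rows that passed; drop = FALSE preserves the matrix shape
x.filtered <- x[keep, , drop = FALSE]
```

Using drop = FALSE keeps the result a matrix even when a single feature passes the filter.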

Details

If g = NULL, the median of each feature is computed across all observations via rowMedians. The i-th feature is then kept if \(median_{i} \geq min.expr\).

If g is provided, the median of each feature is computed within each group via median.

The i-th feature is then kept if \(median_{ig} \geq min.expr\) in at least one group g.
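
The two rules above can be sketched in base R as follows. This is a minimal illustration of the described logic, not the package's implementation: the helper name sketchMedianFilter is hypothetical, it uses apply with median rather than rowMedians, and it assumes g has one entry per column of x and that x contains no missing values.

```r
# Hypothetical helper illustrating the filter logic described in Details
sketchMedianFilter <- function(x, g = NULL, min.expr = 0) {
  x <- as.matrix(x)
  if (is.null(g)) {
    # One median per feature, computed across all observations
    med <- apply(x, 1, median)
    keep <- med >= min.expr
  } else {
    # Column indices of the observations belonging to each group
    groups <- split(seq_len(ncol(x)), g)
    # One median per feature and group
    med <- vapply(
      groups,
      function(idx) apply(x[, idx, drop = FALSE], 1, median),
      numeric(nrow(x))
    )
    # Force a features-by-groups matrix, also when x has a single row
    med <- matrix(med, nrow = nrow(x))
    # Keep a feature if its median reaches min.expr in at least one group
    keep <- rowSums(med >= min.expr) > 0
  }
  names(keep) <- rownames(x)
  keep
}

# Toy data: 3 features, 6 observations, 2 groups of 3
x.toy <- rbind(
  f1 = c(0, 0, 0, 0, 9, 9),
  f2 = c(0, 0, 7, 0, 0, 8),
  f3 = c(3, 4, 5, 6, 7, 8)
)
g.toy <- rep(c("a", "b"), each = 3)

res.all   <- sketchMedianFilter(x.toy, min.expr = 1)
res.group <- sketchMedianFilter(x.toy, g = g.toy, min.expr = 1)
```

In this toy case f1 fails the ungrouped filter (overall median 0) but passes the grouped one, because its median in group "b" is 9; f2 fails both, since its median is 0 overall and within each group.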

Author

Alessandro Barberis

Examples

#Seed
set.seed(1010)

#Define row/col size
nr = 5
nc = 10

#Data
x = matrix(
 data = sample.int(n = 100, size = nr*nc, replace = TRUE),
 nrow = nr,
 ncol = nc,
 dimnames = list(
   paste0("f",seq(nr)),
   paste0("S",seq(nc))
 )
)

#Grouping variable
g = c(rep("a", nc/2), rep("b", nc/2))

#Filter
rowFilterByMedianAboveMinExpr(x)
#>   f1   f2   f3   f4   f5 
#> TRUE TRUE TRUE TRUE TRUE 

#Filter by group
rowFilterByMedianAboveMinExpr(x = x, g = g)
#>   f1   f2   f3   f4   f5 
#> TRUE TRUE TRUE TRUE TRUE 

#Set 1st feature to 0 in the first 6 of 10 observations (roughly 2/3)
x[1,seq(2*nc/3)] = 0

#Set 2nd feature to 0 in the first 3 of 5 observations of class "a"
x[2,seq(2*nc/6)] = 0

#Set 3rd feature to 0 in the first 3 of 5 observations of both class "a" and class "b"
x[3,seq(2*nc/6)] = 0
x[3,(seq(2*nc/6)+nc/2)] = 0

#Filter (1st and 3rd features should be flagged for removal)
rowFilterByMedianAboveMinExpr(x = x, min.expr = 1)
#>    f1    f2    f3    f4    f5 
#> FALSE  TRUE FALSE  TRUE  TRUE 


#Filter by group (only the 3rd feature should be flagged for removal)
rowFilterByMedianAboveMinExpr(x = x, g = g, min.expr = 1)
#>    f1    f2    f3    f4    f5 
#>  TRUE  TRUE FALSE  TRUE  TRUE