Compute the significance of the filtering test — permutation

This function computes the significance of the screening test using a permutation approach (Significance Analysis of Microarrays, SAM). The score stored in the returned Screened object is the computed q-value.

Usage

permutation_screener(
  x,
  y,
  weights = NULL,
  resp.type = c("gaussian", "mgaussian", "binomial", "multinomial", "poisson", "cox"),
  observations = NULL,
  coef = NULL,
  sam.resp.type = c("Quantitative", "Two class unpaired", "Survival", "Multiclass",
    "One class", "Two class paired", "Two class unpaired timecourse",
    "One class timecourse", "Two class paired timecourse", "Pattern discovery"),
  geneid = NULL,
  genenames = NULL,
  assay.type = c("array", "seq"),
  s0 = 1e-04,
  s0.perc = NULL,
  nperms = 100,
  center.arrays = FALSE,
  testStatistic = c("standard", "wilcoxon"),
  time.summary.type = c("slope", "signed.area"),
  regression.method = c("standard", "ranks"),
  return.x = FALSE,
  knn.neighbors = 10,
  random.seed = NULL,
  logged2 = TRUE,
  fdr.output = 1,
  eigengene.number = 1,
  nresamp = 20,
  all.genes = TRUE,
  return.type = c("pvalue", "object"),
  logger = Logger(verbose = F),
  multi = c("max", "average", "sum", "raw")
)

Arguments

x

the input matrix, where rows are observations and columns are variables.

y

the response variable. Its number of rows must match the number of rows of x.

weights

priors of the observations

resp.type

the response type

observations

(optional) indices of observations to keep

coef

(optional) an integer indicating the response variable to consider in multi-response data when multi = "raw"

sam.resp.type

Problem type:

"Quantitative": for a continuous parameter (for both array and sequencing data)
"Two class unpaired": for both array and sequencing data
"Survival": for censored survival outcome (for both array and sequencing data)
"Multiclass": more than 2 groups (for both array and sequencing data)
"One class": for a single group (only for array data)
"Two class paired": for two classes with paired observations (for both array and sequencing data)
"Two class unpaired timecourse": only for array data
"One class timecourse": only for array data
"Two class paired timecourse": only for array data
"Pattern discovery": only for array data

geneid

Optional character vector of geneids for output.

genenames

Optional character vector of genenames for output.

s0

Exchangeability factor for the denominator of the test statistic. In samr the default is an automatic choice; here the default is 1e-04 (see Usage). Only used for array data.

s0.perc

Percentile of standard deviation values to use for s0; default is automatic choice. A value of -1 means s0 = 0, which differs from s0.perc = 0, meaning s0 is the zeroth percentile of the standard deviation values, i.e. their minimum. Only used for array data.
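As a rough illustration of how a percentile maps to s0, the sketch below takes a percentile of per-feature standard deviations in base R. The helper name s0_from_percentile and the simulated data are assumptions made for illustration, s0.perc is assumed to be on a 0 to 100 scale, and the automatic choice used when s0.perc is NULL is not reproduced.

```r
# Hypothetical helper (not the samr implementation): derive s0 from a
# percentile of the per-feature standard deviations.
s0_from_percentile <- function(x, s0.perc) {
  if (s0.perc == -1) return(0)          # special value: s0 is exactly 0
  sds <- apply(x, 2, stats::sd)         # one standard deviation per column (feature)
  unname(stats::quantile(sds, probs = s0.perc / 100))
}

set.seed(1)
x <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)  # 20 observations, 5 features

s0_min  <- s0_from_percentile(x, 0)    # zeroth percentile: the smallest sd
s0_med  <- s0_from_percentile(x, 50)   # median sd
s0_zero <- s0_from_percentile(x, -1)   # 0, regardless of the data
```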

nperms

Number of permutations used to estimate false discovery rates
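The permutation idea can be sketched in base R: recompute a statistic after shuffling the response and compare the observed value with the resulting permutation distribution. This is a simplified single-feature illustration with a hypothetical helper perm_pvalue; it is not the SAM procedure, which pools statistics across features to estimate false discovery rates.

```r
# Hypothetical helper: permutation p-value for the association between
# one feature and a quantitative response, using |correlation| as statistic.
perm_pvalue <- function(feature, y, nperms = 100) {
  observed <- abs(stats::cor(feature, y))
  permuted <- replicate(nperms, abs(stats::cor(feature, sample(y))))
  # Add 1 to numerator and denominator so the p-value is never exactly 0
  (sum(permuted >= observed) + 1) / (nperms + 1)
}

set.seed(42)
y      <- rnorm(50)
signal <- y + rnorm(50, sd = 0.1)   # feature strongly associated with y

p_signal <- perm_pvalue(signal, y, nperms = 100)
```

With more permutations the smallest attainable p-value, 1 / (nperms + 1), becomes smaller, at the cost of longer run time.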

center.arrays

Should the data for each sample (array) be median centered at the outset? Default is FALSE. Only used for array data.
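Median centering per sample can be illustrated in base R as follows. This is a minimal sketch on simulated data, assuming rows are samples and columns are variables as described for x; it is not the code run internally.

```r
set.seed(1)
# 4 samples (rows) by 6 variables (columns)
x <- matrix(rnorm(4 * 6, mean = 5), nrow = 4, ncol = 6)

# Subtract each sample's median from all of its values
sample_medians <- apply(x, 1, stats::median)
x_centered <- sweep(x, 1, sample_medians, FUN = "-")
```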

testStatistic

Test statistic to use in the two class unpaired case. Either "standard" (t-statistic) or "wilcoxon" (two-sample Wilcoxon or Mann-Whitney test). Only used for array data.
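To make the two options concrete, the sketch below computes both kinds of statistic for a single feature with base R functions. The data are simulated, and the moderated SAM statistic (which adds s0 to the denominator) is not reproduced; this only contrasts a t-statistic with a rank-based one.

```r
set.seed(7)
group_a <- rnorm(15, mean = 0)
group_b <- rnorm(15, mean = 2)

# "standard": two-sample t-statistic (equal variances assumed here)
t_res <- stats::t.test(group_a, group_b, var.equal = TRUE)
# "wilcoxon": rank-sum (Mann-Whitney) statistic
w_res <- stats::wilcox.test(group_a, group_b)

t_stat <- unname(t_res$statistic)
w_stat <- unname(w_res$statistic)
```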

time.summary.type

Summary measure for each time course: "slope" or "signed.area". Only used for array data.

regression.method

Regression method for the quantitative case: "standard" (linear least squares) or "ranks" (linear least squares on ranked data). Only used for array data.
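The difference between the two methods can be sketched with lm in base R. The simulated data and the choice of which variable is regressed on which are assumptions made for illustration, not a description of the internal computation.

```r
set.seed(3)
y       <- rnorm(30)
feature <- exp(y + rnorm(30, sd = 0.2))   # monotone but non-linear relation to y

fit_standard <- stats::lm(feature ~ y)               # least squares on raw values
fit_ranks    <- stats::lm(rank(feature) ~ rank(y))   # least squares on ranked data

slope_standard <- unname(stats::coef(fit_standard)[2])
slope_ranks    <- unname(stats::coef(fit_ranks)[2])
```

Without ties, the slope of the rank regression equals the Spearman correlation, which makes it less sensitive to outliers and to monotone transformations of the data.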

return.x

Should the matrix of feature values be returned? Only useful for time course data, where x contains summaries of the features over time. Otherwise x is the same as the input data$x.

knn.neighbors

Number of nearest neighbors to use for imputation of missing feature values. Only used for array data.

random.seed

Optional initial seed for random number generator (integer)

logged2

Has the data been transformed by log (base 2)? This information is used only for computing fold changes
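Why the log scale matters for fold changes can be shown with a small base R sketch. The helper fold_change is hypothetical and uses group means; the exact definition used by samr is not reproduced here.

```r
# Hypothetical helper: fold change of group b relative to group a.
# On the log2 scale a difference of means is back-transformed with 2^;
# on the raw scale a ratio of means is used.
fold_change <- function(a, b, logged2 = TRUE) {
  if (logged2) 2^(mean(b) - mean(a)) else mean(b) / mean(a)
}

a_log <- c(1, 1, 1)   # log2 values in group a
b_log <- c(3, 3, 3)   # log2 values in group b

fc_from_log <- fold_change(a_log, b_log, logged2 = TRUE)        # 2^(3 - 1)
fc_from_raw <- fold_change(2^a_log, 2^b_log, logged2 = FALSE)   # 8 / 2
```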

fdr.output

(Approximate) False Discovery Rate cutoff for output in significant genes table

eigengene.number

Eigengene to be used (only for sam.resp.type = "Pattern discovery")

nresamp

Number of resamples used to construct test statistic. Default 20.

return.type

Debug argument: if "object", the output from samr is returned.

logger

a Logger

multi

what to do when response has multiple output values

max: the max value of scores across multiple outputs is selected to get a single value for each observation
average: scores of multiple outputs are averaged to get a single value for each observation
sum: scores of multiple outputs are summed up to get a single value for each observation
raw: returns the scores for the multiple outputs
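The four options can be illustrated on a small score matrix in base R. The helper combine_scores and the example values are hypothetical; the layout (one row per scored item, one column per response) is an assumption, not a description of the internal data structure.

```r
# Hypothetical score matrix: one row per scored item, one column per response
scores <- matrix(c(0.01, 0.20,
                   0.50, 0.03,
                   0.10, 0.10),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("v1", "v2", "v3"), c("resp1", "resp2")))

# Hypothetical helper mirroring the choices of the multi argument
combine_scores <- function(scores, multi = c("max", "average", "sum", "raw")) {
  multi <- match.arg(multi)
  switch(multi,
         max     = apply(scores, 1, max),   # largest score across responses
         average = rowMeans(scores),        # mean score across responses
         sum     = rowSums(scores),         # total score across responses
         raw     = scores)                  # keep one score per response
}

by_max <- combine_scores(scores, "max")
by_avg <- combine_scores(scores, "average")
by_sum <- combine_scores(scores, "sum")
by_raw <- combine_scores(scores, "raw")
```

With "raw" the result keeps one column per response, which is the case where coef selects the response to consider.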

...

further arguments

Value

a Screened object

Author

Alessandro Barberis