Compute the Summary Scores

This function computes one or more summary scores for the signature i, separately for each column (sample) of the input matrix x.

To speed up the computation on a multi-core machine, set the argument cores to a number greater than 1 to enable parallel execution.

See the Details section below for further information.

Usage

computeSigScores(
  x,
  i = NULL,
  na.rm = TRUE,
  scores = c("sum", "weightedSum", "mean", "trimmedMean", "weightedMean", "median",
    "mode", "midrange", "midhinge", "trimean", "iqr", "iqm", "mad", "aad", "ssgsea",
    "gsva", "plage", "zscore"),
  scorers = NULL,
  args = NULL,
  sampling = c("none", "permutation", "bootstrap", "rndsig", "rndsigsub"),
  n.repeat = 10L,
  cores = 1L,
  logger = NULL,
  outdir = NULL,
  filename = "sigscores"
)

Arguments

x

features-by-samples matrix

i

(optional) numerical vector giving the rows in x, or character vector matching the row names in x. If missing or i = NULL, all the rows in x are considered for the computation of the scores

na.rm

logical, whether to remove NA values before computation

scores

(optional) character vector, indicating the summary score(s) to compute

scorers

named list of scoring functions. If provided, scores is not considered. Each function must accept some specific arguments, i.e. x, i, na.rm, ... and is expected to compute a score for each column in x

args

named list, where the names must match the scores or the names of scorers. Each element in the list is another list containing the arguments to pass to the function used for computing the named score. For example, args = list(trimmedMean = list(trim = 0.4)) indicates to use trim = 0.4 when computing the trimmed mean scores (scores = "trimmedMean" or scorers = list(trimmedMean = getScorer("trimmedMean")))

sampling

character string, indicating whether to compute the scores using the provided data (sampling = "none", default), to resample the data (sampling = "permutation" and sampling = "bootstrap"), or to generate random signatures (sampling = "rndsig" and sampling = "rndsigsub"), i.e. vectors of the same size as i with values randomly drawn from the possible values in x.

Five options are available:

none: use x as it is
permutation: random sampling without replacement from row elements of x
bootstrap: random sampling with replacement from row elements of x
rndsig: random signatures of the same length as i, generated from all possible values in x
rndsigsub: random signatures of the same length as i, generated from all possible values in x after removing the values in i

See sampleData and randomSignatures for further details.
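As a rough illustration of the four resampling options listed above, the following base R sketch draws row indices and signatures with sample(). It does not call sampleData or randomSignatures, whose exact behaviour is not reproduced here; treating "row elements" as the rows of x, and the variable names used, are assumptions made for illustration.

```r
set.seed(5381L)

# Small features-by-samples matrix and a signature of two features
x <- matrix(
  data = stats::runif(n = 6 * 3),
  nrow = 6,
  dimnames = list(paste0("g", 1:6), paste0("S", 1:3))
)
i <- c("g1", "g2")

# permutation (assumed reading): rows drawn without replacement
perm.idx <- sample(nrow(x), size = nrow(x), replace = FALSE)

# bootstrap (assumed reading): rows drawn with replacement
boot.idx <- sample(nrow(x), size = nrow(x), replace = TRUE)

# rndsig: random signature of the same length as i,
# drawn from all the row names of x (elements of i may appear)
rnd.sig <- sample(rownames(x), size = length(i))

# rndsigsub: as above, but the elements of i are removed first
rnd.sig.sub <- sample(setdiff(rownames(x), i), size = length(i))
```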

n.repeat

integer, number of repeated samples to generate

cores

number of cores to use for parallel execution.

logger

(optional) a Logger object. If provided, it will be used to report extra information on progress. To create a Logger use createLogger

outdir

(optional) character string, path to the output directory. If provided, the returned data will be stored in it

filename

(optional) character string, a name without extension for the output file

Value

A data frame containing the computed score(s) for each sample. Each row corresponds to a different sample.

If sampling = "permutation", sampling = "bootstrap", sampling = "rndsig" or sampling = "rndsigsub", the data frame also contains a column with the run information.

The two columns containing the run/sample information are:

sampleID: the name of the sample
run: integer indicating the run, out of n.repeat, in which the score was computed
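To make this layout concrete, the base R sketch below builds a data frame of the shape described above for a single mean score over two runs with random signatures. It does not call computeSigScores; the helper one.run, the score column name mean, and the way signatures are drawn are illustrative assumptions.

```r
set.seed(5381L)

x <- matrix(
  data = stats::runif(n = 20 * 3),
  nrow = 20,
  dimnames = list(paste0("g", 1:20), paste0("S", 1:3))
)
i <- rownames(x)[1:10]
n.repeat <- 2L

# Hypothetical helper: scores for one run, one row per sample
one.run <- function(run) {
  # random signature of the same length as i (assumption)
  sig <- sample(rownames(x), size = length(i))
  data.frame(
    sampleID = colnames(x),
    run      = run,
    mean     = unname(colMeans(x[sig, , drop = FALSE])),
    stringsAsFactors = FALSE
  )
}

# Stack the runs: n.repeat * ncol(x) rows in total
out <- do.call(rbind, lapply(seq_len(n.repeat), one.run))
```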

Details

computeSigScores internally uses computeScores to handle the computation of the scores.

The available scoring functions are:

"sum": sumScorer
"weightedSum": weightedSumScorer
"mean": meanScorer
"trimmedMean": trimmedMeanScorer
"weightedMean": weightedMeanScorer
"median": medianScorer
"mode": modeScorer
"midrange": midrangeScorer
"midhinge": midhingeScorer
"trimean": trimeanScorer
"iqr": iqrScorer
"iqm": iqmScorer
"mad": madScorer
"aad": aadScorer
"ssgsea": ssgseaScorer
"gsva": gsvaScorer
"plage": plageScorer
"zscore": zscoreScorer

See the documentation of each function for the specific arguments it accepts (arguments can be passed via the args parameter).
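For readers unfamiliar with some of the less common statistics in the list, the sketch below computes the textbook definitions of midrange, midhinge, trimean and interquartile range per sample in base R. It is not the package's implementation: the helper summaries is hypothetical, and the quartiles use the default method of stats::quantile, which may differ from what the scorers use.

```r
set.seed(5381L)

x <- matrix(
  data = stats::runif(n = 20 * 3, min = 0, max = 1000),
  nrow = 20,
  dimnames = list(paste0("g", 1:20), paste0("S", 1:3))
)
i <- rownames(x)[1:10]

# Hypothetical helper: textbook summary statistics of a numeric vector
summaries <- function(v, na.rm = TRUE) {
  q <- stats::quantile(v, probs = c(0.25, 0.5, 0.75),
                       na.rm = na.rm, names = FALSE)
  c(
    midrange = (min(v, na.rm = na.rm) + max(v, na.rm = na.rm)) / 2,
    midhinge = (q[1] + q[3]) / 2,             # mean of the quartiles
    trimean  = (q[1] + 2 * q[2] + q[3]) / 4,  # quartiles plus twice the median
    iqr      = q[3] - q[1]
  )
}

# One row per sample, one column per statistic
res <- t(apply(x[i, , drop = FALSE], 2, summaries))
```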

Scorers also accept a transformation function via the transform.fun argument, which is used to transform the data before the computation of the scores so that: x = transform.fun(x = x, transform.args), where transform.args is a list of parameters passed to the transformation function. See the documentation of any scorer for further details. A transformation function and related arguments can be passed via the args parameter (see Examples).
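The transform-then-score pattern can be sketched with a hypothetical scorer written in base R. The names myMeanScorer and log2Transformer are invented for illustration, and forwarding transform.args through do.call is an assumption about how the list is passed on, not the package's code.

```r
# Hypothetical transformer: log2 with a pseudo-count
log2Transformer <- function(x, pseudo = 1) {
  log2(x + pseudo)
}

# Hypothetical scorer following the described interface:
# accepts x, i, na.rm, ... and returns one score per column of x
myMeanScorer <- function(x, i = NULL, na.rm = TRUE,
                         transform.fun = NULL,
                         transform.args = list(), ...) {
  # Transform the data before computing the scores (assumed mechanism)
  if (is.function(transform.fun)) {
    x <- do.call(transform.fun, c(list(x = x), transform.args))
  }
  # Keep only the signature rows, if a signature is given
  if (!is.null(i)) {
    x <- x[i, , drop = FALSE]
  }
  colMeans(x, na.rm = na.rm)
}

x <- matrix(
  data = c(1, 3, 7, 15, 0, 1),
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("S1", "S2"))
)

scores <- myMeanScorer(
  x = x,
  i = c("g1", "g2"),
  transform.fun  = log2Transformer,
  transform.args = list(pseudo = 1)
)
```

Under the interface described for the scorers argument, a function of this shape could presumably be supplied as scorers = list(myScore = myMeanScorer), with its extra arguments given in args = list(myScore = list(...)); this usage has not been verified against the package.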

The functions used for random sampling are:

"permutation": sampleData
"bootstrap": sampleData
"rndsig": randomSignatures
"rndsigsub": randomSignatures

Author

Alessandro Barberis

Examples

if (FALSE) { # \dontrun{
#set seed for reproducibility
set.seed(seed = 5381L)

#Define row/col size
nr = 20
nc = 10

#Create input matrix
x = matrix(
 data = stats::runif(n = nr*nc, min = 0, max = 1000),
 nrow = nr,
 ncol = nc,
 dimnames = list(
   paste0("g",seq(nr)),
   paste0("S",seq(nc))
 )
)

#Compute all scores
computeSigScores(
 x = x,
 i = rownames(x)[1:10]
)

#Compute all scores and log
computeSigScores(
 x = x,
 i = rownames(x)[1:10],
 logger = createLogger(
   verbose = TRUE,
   level = "DEBUG")
)

#Compute one score
computeSigScores(
 x = x,
 i = rownames(x)[1:10],
 scores = 'mean'
)

#Compute one score passing an argument
computeSigScores(
 x = x,
 i = rownames(x)[1:10],
 scores = 'trimmedMean',
 args = list(trimmedMean = list(trim = 0.2))
)

#Transform data and compute the scores
computeSigScores(
 x = x,
 i = rownames(x)[1:10],
 scorers = list(
  'score1' = getScorer('weightedSum'),
  'score2' = getScorer('trimmedMean')
 ),
 args = list(
  'score1' = list(transform.fun = getDataTransformer('quantile')),
  'score2' = list(
     trim = 0.2,
     transform.fun = getDataTransformer('stepFunction'),
     transform.args = list(
       method = 'median',
       by = 'rows'
     )
   )
 )
)

#Compute scores with permutation
computeSigScores(
 x        = x,
 i        = rownames(x)[1:10],
 sampling = "permutation",
 n.repeat = 10
)

#Compute scores with permutation;
#save log file and the results
computeSigScores(
 x        = x,
 i        = rownames(x)[1:10],
 sampling = "permutation",
 n.repeat = 10,
 logger = createLogger(
   verbose = TRUE,
   level = "DEBUG",
   path = file.path("mydir/test/log.txt")
   ),
 outdir = "mydir/test",
 filename = "sigscores"
)

#Compute scores with bootstrap
computeSigScores(
 x        = x,
 i        = rownames(x)[1:10],
 sampling = "bootstrap",
 n.repeat = 10
)

#Compute scores with random signatures
#(elements of i are possible)
computeSigScores(
 x        = x,
 i        = rownames(x)[1:10],
 sampling = "rndsig",
 n.repeat = 10
)

#Compute scores with random signatures
#(elements of i are excluded)
computeSigScores(
 x        = x,
 i        = rownames(x)[1:10],
 sampling = "rndsigsub",
 n.repeat = 10
)

} # }