This function computes summary score(s) of the input signature i, considering each column vector in the input matrix x.
Parallel execution, to speed up the computation on a multi-core machine, can be enabled by setting the argument cores to a number greater than 1.
See the **Details** section below for further information.
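As a rough illustration of what a per-column summary score is, the sketch below uses base R only and does not call the package. It subsets a small matrix to the signature rows and summarises each column. The toy matrix, the signature and the choice of mean and median are assumptions made for illustration, not the package's internal code.

```r
# Toy features-by-samples matrix (values chosen for illustration only)
x <- matrix(
  data = c(1, 2, 3, 4,
           5, 6, 7, 8,
           9, 10, 11, 12),
  nrow = 4,
  dimnames = list(paste0("g", 1:4), paste0("S", 1:3))
)

# Signature: a subset of the row names of x
i <- c("g1", "g3")

# Keep only the signature rows, then summarise each column (sample)
sig <- x[i, , drop = FALSE]
mean.score   <- colMeans(sig, na.rm = TRUE)
median.score <- apply(X = sig, MARGIN = 2, FUN = stats::median, na.rm = TRUE)

print(mean.score)
print(median.score)
```

Each resulting vector has one element per sample, named after the columns of x.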
Usage
computeSigScores(
  x,
  i = NULL,
  na.rm = TRUE,
  scores = c("sum", "weightedSum", "mean", "trimmedMean", "weightedMean", "median",
    "mode", "midrange", "midhinge", "trimean", "iqr", "iqm", "mad", "aad", "ssgsea",
    "gsva", "plage", "zscore"),
  scorers = NULL,
  args = NULL,
  sampling = c("none", "permutation", "bootstrap", "rndsig", "rndsigsub"),
  n.repeat = 10L,
  cores = 1L,
  logger = NULL,
  outdir = NULL,
  filename = "sigscores"
)
Arguments
- x
features-by-samples matrix
- i
(optional) numerical vector giving the rows in x, or character vector matching the row names in x. If missing or i = NULL, all the rows in x are considered for the computation of the scores
- na.rm
logical, whether to remove NA values before computation
- scores
(optional) character vector, indicating the summary score(s) to compute
- scorers
named list of scoring functions. If provided, scores is not considered. Each function must accept some specific arguments, i.e. x, i, na.rm, ..., and is expected to compute a score for each column in x
- args
named list, where the names must match the scores or the names of scorers. Each element in the list is another list containing the arguments to pass to the function used for computing the named score. For example, args = list(trimmedMean = list(trim = 0.4)) indicates to use trim = 0.4 when computing the trimmed mean scores (scores = "trimmedMean" or scorers = list(trimmedMean = getScorer("trimmedMean")))
- sampling
character string, indicating whether to compute the scores using the provided data (sampling = "none", default), whether to sample the data (sampling = "permutation" and sampling = "bootstrap"), or whether to generate random signatures, i.e. vectors of the same size as i with values randomly assigned from the possible values in x. Five options are available:
  - none: use x as it is
  - permutation: random sampling without replacement from row elements of x
  - bootstrap: random sampling with replacement from row elements of x
  - rndsig: random signatures of the same length as i, generated from all possible values in x
  - rndsigsub: random signatures of the same length as i, generated from all possible values in x after removing the i values
See sampleData and randomSignatures for further details
- n.repeat
integer, number of repeated samples to generate
- cores
number of cores to use for parallel execution.
- logger
(optional) a Logger object. If provided, it will be used to report extra information on progress. To create a Logger use createLogger
- outdir
(optional) character string, path to the output directory. If provided, the returned data will be stored there
- filename
(optional) character string, a name without extension for the output file
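To make the sampling options listed above more concrete, the following base R sketch draws row indices and random signatures with sample(). It is one possible reading of the option descriptions, written under the assumption that sampling acts on the rows of x. How sampleData and randomSignatures actually draw values is not shown on this page, and the object names are illustrative.

```r
set.seed(5381L)

rows <- paste0("g", 1:20)   # row names of a hypothetical matrix x
i    <- rows[1:5]           # signature

# permutation: sample row indices without replacement
perm <- sample(x = seq_along(rows), size = length(rows), replace = FALSE)

# bootstrap: sample row indices with replacement
boot <- sample(x = seq_along(rows), size = length(rows), replace = TRUE)

# rndsig: random signature of the same length as i, drawn from all rows
rndsig <- sample(x = rows, size = length(i), replace = FALSE)

# rndsigsub: as above, but drawn only from rows that are not in i
rndsigsub <- sample(x = setdiff(rows, i), size = length(i), replace = FALSE)
```

With n.repeat greater than 1, a draw of this kind would be repeated once per run.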
Value
A data frame containing the computed score(s) for each sample. Each row corresponds to a different sample.
If sampling = "permutation", sampling = "bootstrap", sampling = "rndsig" or sampling = "rndsigsub", the data frame contains a column with the run information.
The two columns containing the run/sample information are:
- sampleID
the name of the sample
- run
integer indicating in which run, out of the n.repeat runs, the score was computed
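The layout described above, with one row per sample and run, lends itself to simple post-processing. The sketch below builds a mock data frame with the sampleID and run columns and averages a score over the runs. The score column name and all values are assumptions made for illustration and are not actual output of computeSigScores.

```r
# Mock result mimicking the described layout: one row per sample and run.
# The column name "score" and the values are illustrative assumptions.
res <- data.frame(
  sampleID = rep(c("S1", "S2"), times = 3),
  run      = rep(1:3, each = 2),
  score    = c(10, 20, 12, 22, 14, 24),
  stringsAsFactors = FALSE
)

# Average the score of each sample over the runs
avg <- stats::aggregate(score ~ sampleID, data = res, FUN = mean)
print(avg)
```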
Details
computeSigScores uses computeScores internally to handle the computation of the scores.
The available scoring functions are:
"sum"
"weightedSum"
"mean"
"trimmedMean"
"weightedMean"
"median"
"mode"
"midrange"
"midhinge"
"trimean"
"iqr"
"iqm"
"mad"
"aad"
"ssgsea"
"gsva"
"plage"
"zscore"
See the documentation of the individual scoring functions for the specific arguments they accept (arguments can be passed via the args parameter).
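One way to picture how a named args list could be forwarded to the matching scoring function is with do.call. The sketch below is base R only. The two functions in scorers are stand-ins that follow the x, i, na.rm, ... contract and are not the package's built-in scorers. The forwarding logic is likewise an assumption about the mechanism, not the package's code.

```r
# Stand-in scorers following the x, i, na.rm, ... contract
scorers <- list(
  mean = function(x, i, na.rm = TRUE, ...) {
    colMeans(x[i, , drop = FALSE], na.rm = na.rm)
  },
  trimmedMean = function(x, i, na.rm = TRUE, trim = 0, ...) {
    apply(X = x[i, , drop = FALSE], MARGIN = 2, FUN = base::mean,
          trim = trim, na.rm = na.rm)
  }
)

# Per-score arguments, keyed by the scorer name
args <- list(trimmedMean = list(trim = 0.2))

x <- matrix(
  data = c(1, 2, 3, 4, 100,
           2, 4, 6, 8, 10),
  nrow = 5,
  dimnames = list(paste0("g", 1:5), c("S1", "S2"))
)
i <- rownames(x)

# Call each scorer with the shared inputs plus its own extra arguments;
# scorers without an entry in args receive no extra arguments
out <- sapply(X = names(scorers), FUN = function(nm) {
  do.call(what = scorers[[nm]], args = c(list(x = x, i = i), args[[nm]]))
})
print(out)
```

The result is a samples-by-scores matrix. The outlier in S1 inflates the plain mean but not the trimmed mean.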
Scorers also accept a transformation function via the transform.fun argument, which is used to transform the data before the computation of the scores, so that:
x = transform.fun(x = x, transform.args),
where transform.args is a list of parameters passed to the transformation function.
See the documentation of a scorer for further details.
A transformation function and related arguments can be passed via the args parameter (see **Examples**).
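The transformation step can be sketched in base R as follows. Both log2Transform and meanWithTransform are hypothetical names introduced only for this illustration. The sketch assumes that transform.args is spliced into the call to transform.fun, which is one plausible reading of the expression above.

```r
# Hypothetical transformer: log2 with an offset to avoid log2(0)
log2Transform <- function(x, offset = 1) log2(x + offset)

# Stand-in scorer that transforms x before computing the column means
meanWithTransform <- function(x, i, na.rm = TRUE,
                              transform.fun = NULL, transform.args = list()) {
  if (!is.null(transform.fun)) {
    # In the spirit of x = transform.fun(x = x, transform.args)
    x <- do.call(what = transform.fun, args = c(list(x = x), transform.args))
  }
  colMeans(x[i, , drop = FALSE], na.rm = na.rm)
}

x <- matrix(
  data = c(0, 1, 3, 7, 15, 31),
  nrow = 3,
  dimnames = list(paste0("g", 1:3), c("S1", "S2"))
)

res <- meanWithTransform(
  x = x,
  i = c("g1", "g2"),
  transform.fun = log2Transform,
  transform.args = list(offset = 1)
)
print(res)
```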
The functions used for random sampling are:
"permutation"
"bootstrap"
"rndsig"
"rndsigsub"
See also
Use getAvailableScores to list the available built-in scores.
Use getAvailableDataTransformers to list the available built-in data transformers.
Examples
if (FALSE) {
  # Set seed for reproducibility
  set.seed(seed = 5381L)
  # Define row/col size
  nr <- 20
  nc <- 10
  # Create input matrix
  x <- matrix(
    data = stats::runif(n = nr * nc, min = 0, max = 1000),
    nrow = nr,
    ncol = nc,
    dimnames = list(
      paste0("g", seq(nr)),
      paste0("S", seq(nc))
    )
  )
  # Compute all scores
  computeSigScores(
    x = x,
    i = rownames(x)[1:10]
  )
  # Compute all scores and log
  computeSigScores(
    x = x,
    i = rownames(x)[1:10],
    logger = createLogger(
      verbose = TRUE,
      level = "DEBUG")
  )
  # Compute one score
  computeSigScores(
    x = x,
    i = rownames(x)[1:10],
    scores = "mean"
  )
  # Compute one score passing an argument
  computeSigScores(
    x = x,
    i = rownames(x)[1:10],
    scores = "trimmedMean",
    args = list(trimmedMean = list(trim = 0.2))
  )
  # Transform data and compute the scores
  computeSigScores(
    x = x,
    i = rownames(x)[1:10],
    scorers = list(
      "score1" = getScorer("weightedSum"),
      "score2" = getScorer("trimmedMean")
    ),
    args = list(
      "score1" = list(transform.fun = getDataTransformer("quantile")),
      "score2" = list(
        trim = 0.2,
        transform.fun = getDataTransformer("stepFunction"),
        transform.args = list(
          method = "median",
          by = "rows"
        )
      )
    )
  )
  # Compute scores with permutation
  computeSigScores(
    x = x,
    i = rownames(x)[1:10],
    sampling = "permutation",
    n.repeat = 10
  )
  # Compute scores with permutation;
  # save log file and the results
  computeSigScores(
    x = x,
    i = rownames(x)[1:10],
    sampling = "permutation",
    n.repeat = 10,
    logger = createLogger(
      verbose = TRUE,
      level = "DEBUG",
      path = file.path("mydir/test/log.txt")
    ),
    outdir = "mydir/test",
    filename = "sigscores"
  )
  # Compute scores with bootstrap
  computeSigScores(
    x = x,
    i = rownames(x)[1:10],
    sampling = "bootstrap",
    n.repeat = 10
  )
  # Compute scores with random signatures
  # (elements of i are possible)
  computeSigScores(
    x = x,
    i = rownames(x)[1:10],
    sampling = "rndsig",
    n.repeat = 10
  )
  # Compute scores with random signatures
  # (elements of i are excluded)
  computeSigScores(
    x = x,
    i = rownames(x)[1:10],
    sampling = "rndsigsub",
    n.repeat = 10
  )
}