This function creates a plot representing multiple evaluations of a learning method across different training-set sizes.
Arguments
- data
data.frame
containing the data to plot. The function expects the following columns:
training_set_size
contains the considered training-set sizes
score
contains the performance metric for each model
mean_score
contains the mean performance metric for the specific training-set size
lower_ci
contains the lower bound of the confidence interval for the mean score
upper_ci
contains the upper bound of the confidence interval for the mean score
best_resample
contains the index of the automatically selected optimal training-set size
best_model
contains the index of the best model for the optimal training-set size
name
contains a grouping key, e.g. the learning method
- thr
numerical value; if provided, it is used to draw a horizontal line
- add.uncertainty
logical, whether to include the quantified uncertainty of the performance estimate in the plot
- add.boxplot
logical, whether to include a boxplot in the figure
- add.scores
logical, whether to add the performance metric of individual models as points in the plot
- add.best
logical, whether to add a point indicating the performance of the model reported as best in
data
- shape.best
integer,
shape
aesthetic passed to geom_point
- size.best
integer,
size
aesthetic passed to geom_point
- scale.x
logical, whether to force the scaling of the x-axis
- title
character string, the title of the plot
- subtitle
character string, the subtitle of the plot
- caption
character string, the caption of the plot
- xlab, ylab
character string, axes labels
- ...
further arguments to
ggplot
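As an illustration of the column layout expected in data, the following minimal sketch builds a small data.frame in base R. All values are made up. The normal-approximation confidence interval and the way best_resample and best_model are encoded (as positions within the set of sizes and within the models of the selected size) are assumptions made for this example, not the package's documented behavior.

```r
# Hypothetical example: 3 training-set sizes, 5 resampled models per size.
set.seed(1)
sizes <- c(50, 100, 200)
n_models <- 5

data <- data.frame(
  training_set_size = rep(sizes, each = n_models),
  score = rnorm(length(sizes) * n_models,
                mean = rep(c(0.70, 0.78, 0.82), each = n_models),
                sd = 0.03),
  name = "method_A"
)

# Per-size summaries, repeated on every row of the corresponding size.
data$mean_score <- ave(data$score, data$training_set_size, FUN = mean)
se <- ave(data$score, data$training_set_size,
          FUN = function(x) sd(x) / sqrt(length(x)))

# Assumption: normal-approximation 95% interval around the mean score.
data$lower_ci <- data$mean_score - qnorm(0.975) * se
data$upper_ci <- data$mean_score + qnorm(0.975) * se

# Assumption: best_resample is the position of the selected size in `sizes`,
# here simply the size with the highest mean score.
size_means <- tapply(data$score, data$training_set_size, mean)
data$best_resample <- unname(which.max(size_means))

# Assumption: best_model is the position of the top-scoring model
# among the models trained at the selected size.
best_rows <- which(data$training_set_size == sizes[data$best_resample[1]])
data$best_model <- which.max(data$score[best_rows])

str(data)
```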
Value
A ggplot
object
Details
A plot showing the mean performance and the related
95% confidence interval
across different training-set sizes is produced.
Individual scores and summary metrics in the form of boxplots
can also be added (default) via the add.scores
and
add.boxplot
arguments, respectively.
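The kind of figure described here can be approximated directly with ggplot2. The sketch below is not the function's implementation; it only shows how a mean line, a confidence band, boxplots, individual scores and a horizontal threshold line could be layered. The data values, the threshold of 0.8 (standing in for thr) and the styling choices are all assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical data with the columns described in Arguments (values made up).
set.seed(1)
df <- data.frame(
  training_set_size = rep(c(50, 100, 200), each = 5),
  score = rnorm(15, mean = rep(c(0.70, 0.78, 0.82), each = 5), sd = 0.03),
  name = "method_A"
)
df$mean_score <- ave(df$score, df$training_set_size, FUN = mean)
se <- ave(df$score, df$training_set_size,
          FUN = function(x) sd(x) / sqrt(length(x)))
# Assumption: normal-approximation 95% interval.
df$lower_ci <- df$mean_score - qnorm(0.975) * se
df$upper_ci <- df$mean_score + qnorm(0.975) * se

p <- ggplot(df, aes(x = training_set_size, y = score)) +
  # Uncertainty band around the mean score.
  geom_ribbon(aes(ymin = lower_ci, ymax = upper_ci), alpha = 0.2) +
  # Mean performance per training-set size.
  geom_line(aes(y = mean_score)) +
  # One boxplot per training-set size.
  geom_boxplot(aes(group = training_set_size), width = 10,
               outlier.shape = NA, fill = NA) +
  # Scores of the individual models.
  geom_point(alpha = 0.6) +
  # Horizontal reference line (hypothetical threshold).
  geom_hline(yintercept = 0.8, linetype = "dashed") +
  labs(x = "Training-set size", y = "Score")

print(p)
```

Dropping the geom_point or geom_boxplot layer corresponds roughly to setting add.scores or add.boxplot to FALSE.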