2019 Mar 27;10:1393. doi: 10.1038/s41467-019-09406-4

Table 2.

Summary of benchmarking study design and methods

| Benchmarking study | Application | No. of tools | Model of study | Raw input data type | Gold standard data preparation method | Parameter optimization |
|---|---|---|---|---|---|---|
| Yang et al. 2013 | Error correction | 7 | I | R | SIMUL | N |
| Aghaeepour et al. 2013 | Flow cytometry analysis | 14 | C | R | EXPERT | N |
| Bradnam et al. 2013 | Genome assembly | 21 | C | R | ALTECH | n/a |
| Hunt et al. 2014 | Genome assembly | 10 | I | R, S | SOFTWARE | N |
| Lindgreen et al. 2016 | Microbiome analysis | 14 | I | S | SIMUL | N |
| McIntyre et al. 2017 | Microbiome analysis | 11 | I | R, S | MOCK | N |
| Sczyrba et al. 2017 | Microbiome analysis | 25 | C | S | SIMUL | n/a |
| Altenhoff et al. 2016 | Ortholog prediction | 15 | I | DB | DB | Y |
| Jiang et al. 2016 | Protein function prediction | 121 | C | R | DB | n/a |
| Radivojac et al. 2013 | Protein function prediction | 54 | C | R | DB | n/a |
| Baruzzo et al. 2017 | Read alignment | 14 | I | S | SIMUL | Y |
| Earl et al. 2014 | Read alignment | 12 | C | R, S | SIMUL | n/a |
| Hatem et al. 2013 | Read alignment | 9 | I | R, S | SIMUL | Y |
| Hayer et al. 2015 | RNA-Seq analysis | 7 | I | R, S | ALTECH | N |
| Kanitz et al. 2015 | RNA-Seq analysis | 11 | I | R, S | ALTECH | N |
| Łabaj et al. 2016 | RNA-Seq analysis | 7 | I | R | ALTECH | N |
| Łabaj et al. 2016 | RNA-Seq analysis | 4 | I | R | DB | N |
| Li et al. 2014 | RNA-Seq analysis | 5 | I | R | ALTECH | Y |
| Steijger et al. 2013 | RNA-Seq analysis | 14 | C, I | R | ALTECH | n/a |
| Su et al. 2014 | RNA-Seq analysis | 6 | I | R | ALTECH | Y |
| Zhang et al. 2014 | RNA-Seq analysis | 3 | I | R | ALTECH | Y |
| Thompson et al. 2011 | Sequence alignment | 8 | I | DB | DB | N |
| Bohnert et al. 2017 | Variant analysis | 19 | I | R, S | I&A | Y |
| Ewing et al. 2015 | Variant analysis | 14 | C | S | SIMUL | n/a |
| Pabinger et al. 2014 | Variant analysis | 32 | I | R, S | SIMUL | N |

Surveyed benchmarking studies published from 2011 to 2017 are grouped by area of application (column “Application”). We also recorded the number of tools benchmarked by each study (“No. of tools”). We documented the coordinating model used to conduct each benchmarking study (“Model of study”): independently performed by a single group (“I”), competition-based (“C”), or a hybrid approach combining elements of both (“C, I”). The type of raw omics data (“Raw input data type”) and the way the gold standard data were prepared (“Gold standard data preparation method”) were documented for each benchmarking study. When a benchmarking study used computationally simulated data, we marked it as “S”; when real raw data were experimentally generated in the wet lab, we marked it as “R”; when the study used both simulated and real data, we marked it as “R, S”. Gold standard data types included data that were computationally simulated (marked as “SIMUL”), manually evaluated by experts (marked as “EXPERT”), prepared by alternative technology (marked as “ALTECH”), prepared as curated software input (marked as “SOFTWARE”), prepared as a mock community (marked as “MOCK”), prepared from curated databases (marked as “DB”), and prepared using an integration and arbitration approach (marked as “I&A”). In competition-based benchmarking studies, parameter optimization (“Parameter optimization”) is performed by each team and is not mandatory (marked here as “n/a”); elsewhere, “Y” and “N” indicate whether a study performed parameter optimization. More details about the characteristics of techniques to prepare gold standard data sets are provided in Table 1.
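
The coding scheme described above lends itself to simple tabulation. The following is a minimal sketch, not part of the original study, that transcribes the rows of Table 2 into Python and counts studies per code. The names `Study`, `STUDIES`, and `tally` are hypothetical and chosen for illustration only, and the parameter optimization codes are written as in the legend (“Y”, “N”, “n/a”).

```python
from collections import Counter
from typing import NamedTuple


class Study(NamedTuple):
    """One row of Table 2 (hypothetical container for this sketch)."""
    name: str
    application: str
    n_tools: int
    model: str          # "I", "C", or "C, I"
    raw_data: str       # "R", "S", "R, S", or "DB"
    gold_standard: str  # "SIMUL", "EXPERT", "ALTECH", "SOFTWARE", "MOCK", "DB", "I&A"
    param_opt: str      # "Y", "N", or "n/a"


# Rows transcribed from Table 2.
STUDIES = [
    Study("Yang et al. 2013", "Error correction", 7, "I", "R", "SIMUL", "N"),
    Study("Aghaeepour et al. 2013", "Flow cytometry analysis", 14, "C", "R", "EXPERT", "N"),
    Study("Bradnam et al. 2013", "Genome assembly", 21, "C", "R", "ALTECH", "n/a"),
    Study("Hunt et al. 2014", "Genome assembly", 10, "I", "R, S", "SOFTWARE", "N"),
    Study("Lindgreen et al. 2016", "Microbiome analysis", 14, "I", "S", "SIMUL", "N"),
    Study("McIntyre et al. 2017", "Microbiome analysis", 11, "I", "R, S", "MOCK", "N"),
    Study("Sczyrba et al. 2017", "Microbiome analysis", 25, "C", "S", "SIMUL", "n/a"),
    Study("Altenhoff et al. 2016", "Ortholog prediction", 15, "I", "DB", "DB", "Y"),
    Study("Jiang et al. 2016", "Protein function prediction", 121, "C", "R", "DB", "n/a"),
    Study("Radivojac et al. 2013", "Protein function prediction", 54, "C", "R", "DB", "n/a"),
    Study("Baruzzo et al. 2017", "Read alignment", 14, "I", "S", "SIMUL", "Y"),
    Study("Earl et al. 2014", "Read alignment", 12, "C", "R, S", "SIMUL", "n/a"),
    Study("Hatem et al. 2013", "Read alignment", 9, "I", "R, S", "SIMUL", "Y"),
    Study("Hayer et al. 2015", "RNA-Seq analysis", 7, "I", "R, S", "ALTECH", "N"),
    Study("Kanitz et al. 2015", "RNA-Seq analysis", 11, "I", "R, S", "ALTECH", "N"),
    Study("Łabaj et al. 2016", "RNA-Seq analysis", 7, "I", "R", "ALTECH", "N"),
    Study("Łabaj et al. 2016", "RNA-Seq analysis", 4, "I", "R", "DB", "N"),
    Study("Li et al. 2014", "RNA-Seq analysis", 5, "I", "R", "ALTECH", "Y"),
    Study("Steijger et al. 2013", "RNA-Seq analysis", 14, "C, I", "R", "ALTECH", "n/a"),
    Study("Su et al. 2014", "RNA-Seq analysis", 6, "I", "R", "ALTECH", "Y"),
    Study("Zhang et al. 2014", "RNA-Seq analysis", 3, "I", "R", "ALTECH", "Y"),
    Study("Thompson et al. 2011", "Sequence alignment", 8, "I", "DB", "DB", "N"),
    Study("Bohnert et al. 2017", "Variant analysis", 19, "I", "R, S", "I&A", "Y"),
    Study("Ewing et al. 2015", "Variant analysis", 14, "C", "S", "SIMUL", "n/a"),
    Study("Pabinger et al. 2014", "Variant analysis", 32, "I", "R, S", "SIMUL", "N"),
]


def tally(studies, field):
    """Count how many studies carry each code in the given column."""
    return Counter(getattr(study, field) for study in studies)


by_gold_standard = tally(STUDIES, "gold_standard")
by_model = tally(STUDIES, "model")
by_param_opt = tally(STUDIES, "param_opt")

# Print each summary, most frequent code first.
for label, counts in [("Gold standard", by_gold_standard),
                      ("Model of study", by_model),
                      ("Parameter optimization", by_param_opt)]:
    print(label, dict(counts.most_common()))
```

Counting by column this way makes it easy to see, for instance, how often each gold standard preparation method appears across the surveyed studies, without changing any of the recorded codes.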