2017 Nov 6;2017:bax083. doi: 10.1093/database/bax083

Table 2.

Repositories and the dataset distribution used in the 2016 bioCADDIE Dataset Retrieval Challenge

Repository       Datasets (#)  Datasets (%)  Avg dataset size (KB)  Avg number of attributes (#)
ClinicalTrials   192 500       24.257        4.0                    45
BioProject       155 850       19.638        1.1                    11
PDB              113 493       14.301        4.0                    147
GEO              105 033       13.235        0.4                    14
Dryad            67 455        8.500         2.1                    38
ArrayExpress     60 881        7.672         1.6                    12
Dataverse        60 303        7.599         1.9                    20
NeuroMorpho      34 082        4.295         1.3                    38
Gemma            2285          0.288         1.6                    9
ProteomeXchange  1716          0.216         1.1                    32
PhenDisco        429           0.054         67.2                   36
NursaDatasets    389           0.049         1.6                    34
MPD              235           0.030         2.2                    36
PeptideAtlas     76            0.010         3.2                    24
PhysioBank       70            0.009         1.2                    18
CIA              63            0.008         1.0                    32
CTN              46            0.006         1.4                    17
OpenfMRI         36            0.005         1.5                    20
CVRG             29            0.004         2.0                    20
YPED             21            0.003         1.7                    25