Author manuscript; available in PMC: 2014 Jun 1.
Published in final edited form as: J Biomed Inform. 2013 Apr 4;46(3):480–496. doi: 10.1016/j.jbi.2013.03.008

Table 2.

Summary of the datasets used in our experiments. The class distribution (i.e., the proportion of positive and negative outcomes) is listed for reference.

Dataset | Description | # of covariates | # of samples | Class distribution (positive / negative)
1 | Simulated i.i.d. data | 5 | 500 | 0.618 / 0.382
2 | Simulated correlated data | 6 | 500 | 0.764 / 0.236
3 | Simulated i.i.d. data | 15 | 1500 | 0.641 / 0.359
4 | Simulated correlated data | 15 | 1500 | 0.651 / 0.349
5 | Simulated binary data | 5 | 500 | 0.846 / 0.154
6 | Simulated binary data | 15 | 1500 | 0.726 / 0.274
7 | Biomarker (CA-19 and CA-125) | 2 | 141 | 0.638 / 0.362
8 | Low birth weight study | 8 | 488 | 0.309 / 0.691
9 | UMASS AIDS research | 8 | 575 | 0.256 / 0.744
10 | Mammography experience study | 8 | 412 | 0.432 / 0.568
11 | Myocardial infarction | 9 | 1253 | 0.219 / 0.781