Table 2.
Summary of the datasets used in our experiments. The class distribution (i.e., the proportion of positive and negative outcomes) is listed for reference.
Dataset | Dataset description | # of covariates | # of samples | Class distribution (positive/negative) |
---|---|---|---|---|
1 | Simulated i.i.d. data | 5 | 500 | 0.618 / 0.382 |
2 | Simulated correlated data | 6 | 500 | 0.764 / 0.236 |
3 | Simulated i.i.d. data | 15 | 1500 | 0.641 / 0.359 |
4 | Simulated correlated data | 15 | 1500 | 0.651 / 0.349 |
5 | Simulated binary data | 5 | 500 | 0.846 / 0.154 |
6 | Simulated binary data | 15 | 1500 | 0.726 / 0.274 |
7 | Biomarker (CA-19 and CA-125) | 2 | 141 | 0.638 / 0.362 |
8 | Low birth weight study | 8 | 488 | 0.309 / 0.691 |
9 | UMASS AIDS research | 8 | 575 | 0.256 / 0.744 |
10 | Mammography experience study | 8 | 412 | 0.432 / 0.568 |
11 | Myocardial infarction | 9 | 1253 | 0.219 / 0.781 |