Table 2.
Summary of the six external datasets used to evaluate the clinical association of the malignancy-risk gene signature.
| Dataset | Sample size (n) | Endpoint | Statistical method | Test statistic | p value |
|---|---|---|---|---|---|
| Cancer risk | | | | | |
| Turashvili et al.’s IDC study | 10 | IDC versus normal | random-effects model | | p=0.029 |
| Cancer relapse/progression | | | | | |
| Chanrion et al.’s tamoxifen-treated primary breast cancer study | 155 | relapse of primary breast cancer | Continuous risk score: | | |
| | | | 1. logistic regression | coefficient=0.137 | p<0.0001 |
| | | | 2. ROC | AUC=0.81 | p<0.0001 |
| | | | 3. SVM | accuracy rate=74% | |
| | | | 4. two-sample t-test | | p<0.0001 |
| | | | Binary risk score: | | |
| | | | logistic regression | OR=8.16 | p<0.0001 |
| Ma et al.’s breast cancer study | 61 | histological status (ADH, DCIS, IDC) | correlation analysis | r=0.50 (Pearson or Spearman) | p<0.0001 |
| | | | logistic regression | OR (DCIS)=2.28 (compared to ADH) | p=0.016 |
| | | | logistic regression | OR (IDC)=3.31 (compared to ADH) | p=0.008 |
| Prognosis | | | | | |
| van ’t Veer et al.’s breast metastasis dataset | training=78; test=263 | time to metastasis | Continuous risk score: | | |
| | | | log-rank test | χ²=11.8 (training set); χ²=20.4 (test set) | p=0.0006 (training); p<0.0001 (test) |
| | | | Binary risk score: | | |
| | | | log-rank test | χ²=12.2 (training set); χ²=22.4 (test set) | p=0.0005 (training); p<0.0001 (test) |
| Wang et al.’s breast cancer relapse-free survival study | 286 | metastasis-free survival | Continuous risk score: | | |
| | | | log-rank test | χ²=12.8 | p=0.0004 |
| | | | Binary risk score: | | |
| | | | log-rank test | χ²=12.6 | p=0.0004 |
| Huang et al.’s breast lymph node study | 37 | lymph node status (pos vs. neg) | Continuous risk score: | | |
| | | | 1. logistic regression | coefficient=0.2 | p=0.0107 |
| | | | 2. ROC | AUC=0.75 | p=0.0041 |
| | | | 3. SVM | accuracy rate=73% | |
| | | | 4. two-sample t-test | | p=0.004 |
| | | | Binary risk score: | | |
| | | | logistic regression | OR=7.29 | p=0.007 |
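
As a reading aid for the log-rank rows, the sketch below converts a chi-square statistic into an upper-tail p value. It assumes 1 degree of freedom (a two-group comparison), which the table does not state, and the helper name `chi2_sf_1df` is hypothetical. The statistics are rounded to one decimal in the table, so the computed values should only be expected to agree approximately with the reported p values.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df.

    With 1 df, X = Z**2 for a standard normal Z, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    if x < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(x / 2.0))


# Log-rank statistics copied from the table (1 df is an assumption).
reported = {
    "van 't Veer, training, continuous": 11.8,
    "van 't Veer, test, continuous": 20.4,
    "van 't Veer, training, binary": 12.2,
    "van 't Veer, test, binary": 22.4,
    "Wang, continuous": 12.8,
    "Wang, binary": 12.6,
}

for label, stat in reported.items():
    print(f"{label}: chi-square={stat}, p={chi2_sf_1df(stat):.2g}")
```

Using `math.erfc` keeps the example free of third-party dependencies. A statistics library's chi-square survival function would be the natural choice for other degrees of freedom.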