Table 1.
Transferability of cell type/tissue-specific predictions of regulatory variants across tissues
| | GM12878 (lymphoblastoid cells) | | | HepG2 (liver carcinoma cells) | | | K562 (erythrocytic leukemia cells) | | |
|---|---|---|---|---|---|---|---|---|---|
| Training set | AUPR | AUROC | COR | AUPR | AUROC | COR | AUPR | AUROC | COR |
| GM12878 | **0.536** | **0.723** | **0.442** | 0.400 | 0.627 | 0.212 | 0.322 | 0.604 | 0.183 |
| HepG2 | 0.313 | 0.536 | 0.123 | **0.571** | **0.756** | **0.429** | 0.277 | 0.592 | 0.125 |
| K562 | 0.327 | 0.530 | 0.150 | 0.348 | 0.567 | 0.147 | **0.418** | **0.707** | **0.341** |
Each row corresponds to the cell type whose labels were used to train GenoNet, and each group of three columns to the cell type in which the predictions were evaluated. Each cell reports the AUPR (area under the precision-recall curve), AUROC (area under the receiver operating characteristic curve), and COR (Pearson correlation between predicted and true labels). All three metrics were calculated from the average prediction for each variant over the replicates in which it fell in the test data, using 1000 replicates. For each replicate, the datasets were evenly divided into five parts: four as training data and one as test data. GM12878: MPRA-validated variants in lymphoblastoid cells (693 positive variants, 2772 control variants). HepG2: MPRA-validated variants in liver carcinoma cells (525 positive variants, 1451 control variants). K562: MPRA-validated variants in erythrocytic leukemia cells (342 positive variants, 1368 control variants). The highest value in each column is bolded.
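
The evaluation described in this caption can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: `fit_predict` is a hypothetical callback standing in for training GenoNet and scoring the held-out variants, the split assumes a single held-out fifth per replicate, and AUPR is estimated here by average precision, which is one common estimator among several.

```python
import random
from statistics import mean

def auroc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def aupr(labels, scores):
    """Average precision: mean precision at the rank of each positive.
    Assumption: used as the AUPR estimator; tied scores keep input order."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, precisions = 0, []
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / rank)
    return mean(precisions)

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def averaged_test_predictions(n_variants, fit_predict,
                              n_replicates=1000, n_folds=5, seed=0):
    """Average each variant's prediction over the replicates where it is held out.

    fit_predict(train_idx, test_idx) is a hypothetical stand-in for the model:
    it must return one score per index in test_idx.
    """
    rng = random.Random(seed)
    totals = [0.0] * n_variants
    counts = [0] * n_variants
    indices = list(range(n_variants))
    n_test = n_variants // n_folds
    for _ in range(n_replicates):
        rng.shuffle(indices)
        # Assumption: one fifth is held out per replicate, the rest is training data.
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        for i, p in zip(test_idx, fit_predict(train_idx, test_idx)):
            totals[i] += p
            counts[i] += 1
    # Variants that were never held out have no averaged prediction (None).
    return [t / c if c else None for t, c in zip(totals, counts)]
```

Given the averaged predictions and the MPRA labels of the evaluation cell type, calling `aupr`, `auroc`, and `pearson` on them would yield the three numbers of one table cell under these assumptions.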