Table 2.
Sample and mutation statistics of the dataset for 11 cancer types. The total number of genes is 25,128. Each value in the Mutation’s Effect columns is the number of non-zero entries in the Variants dataset, which is smaller than (# of samples × # of genes).
| Cancer Site | Samples (Train/Test) | Mutation’s Effect: LOW (Train/Test) | Mutation’s Effect: MODERATE (Train/Test) | Mutation’s Effect: MODIFIER (Train/Test) | Mutation’s Effect: HIGH (Train/Test) |
|---|---|---|---|---|---|
| Bladder | 258 / 87 | 25,641 / 7,261 | 61,862 / 17,386 | 11,303 / 3,181 | 9,616 / 2,500 |
| Breast | 201 / 67 | 11,915 / 4,591 | 30,966 / 13,107 | 8,335 / 3,450 | 7,828 / 6,149 |
| Bronchus / Lung | 638 / 213 | 61,277 / 21,114 | 166,945 / 57,898 | 28,039 / 9,590 | 25,819 / 9,089 |
| Cervix uteri | 149 / 50 | 12,084 / 4,608 | 28,515 / 10,913 | 14,715 / 5,284 | 4,278 / 1,751 |
| Colon | 256 / 86 | 42,501 / 11,410 | 105,179 / 26,225 | 29,525 / 7,484 | 24,320 / 6,653 |
| Corpus uteri | 219 / 73 | 139,405 / 32,877 | 364,241 / 87,297 | 167,526 / 41,800 | 60,846 / 16,427 |
| Kidney | 149 / 50 | 5,425 / 1,794 | 13,772 / 4,697 | 3,684 / 1,185 | 3,155 / 1,025 |
| Liver / Intrahepatic bile ducts | 189 / 64 | 7,198 / 2,687 | 19,697 / 6,946 | 6,186 / 2,219 | 3,149 / 1,132 |
| Ovary | 151 / 51 | 6,477 / 2,808 | 17,218 / 6,811 | 3,569 / 1,311 | 3,663 / 1,020 |
| Skin | 254 / 85 | 107,923 / 40,326 | 197,015 / 72,248 | 34,699 / 13,123 | 22,051 / 7,830 |
| Stomach | 249 / 83 | 32,972 / 13,131 | 78,593 / 30,874 | 14,630 / 5,910 | 20,571 / 8,087 |
| Total | 2,713 / 909 | 452,818 / 142,607 | 1,084,003 / 334,402 | 322,211 / 94,537 | 185,296 / 61,663 |
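To make the caption's remark about non-zero counts concrete, the following minimal sketch computes the fraction of non-zero entries for a single table cell. It assumes that each effect category corresponds to its own samples × genes matrix, which the caption suggests but does not state outright. The helper name `density` is hypothetical and is not part of the original work.

```python
TOTAL_GENES = 25_128  # total number of genes, taken from the caption


def density(nonzero: int, samples: int, genes: int = TOTAL_GENES) -> float:
    """Return the fraction of a samples x genes matrix that is non-zero.

    Assumption: one matrix per effect category, so `nonzero` is bounded
    above by samples * genes.
    """
    if samples <= 0 or genes <= 0:
        raise ValueError("samples and genes must be positive")
    return nonzero / (samples * genes)


# Bladder, training split, LOW effect: 25,641 non-zero entries over 258 samples.
bladder_low_train = density(25_641, 258)
print(f"Bladder / LOW / train density: {bladder_low_train:.4%}")
```

Under this assumption the Bladder training matrix for LOW-effect mutations is well under 1% dense, which illustrates why the reported counts are far smaller than # of samples × # of genes.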