Table 2.
Data partitions (groups), the primary cancer sites assigned to each group, and the number of training, validation (Val), and testing samples for each primary site code. Note that the partitioning was produced by the algorithm described in Section 3.3; thus, the groups may not correspond to the cancer topography groupings defined in the ICD-O-3 coding manual.
| Group | Site | Train | Val | Test | Group | Site | Train | Val | Test |
|---|---|---|---|---|---|---|---|---|---|
| 0 | C00 | 642 | 62 | 73 | 6 | C40 | 799 | 103 | 141 |
| 0 | C01 | 3,377 | 362 | 538 | 6 | C41 | 1,393 | 144 | 240 |
| 0 | C02 | 2,491 | 248 | 383 | 7 | C42 | 68,274 | 7,584 | 9,881 |
| 0 | C03 | 503 | 55 | 104 | 8 | C44 | 44,535 | 4,864 | 10,625 |
| 0 | C04 | 959 | 97 | 94 | 9 | C47 | 227 | 17 | 17 |
| 0 | C05 | 810 | 84 | 142 | 9 | C48 | 1,574 | 156 | 225 |
| 0 | C06 | 1,004 | 115 | 140 | 9 | C49 | 5,324 | 538 | 881 |
| 0 | C07 | 1,585 | 161 | 264 | 10 | C50 | 144,230 | 16,089 | 24,746 |
| 0 | C08 | 445 | 43 | 71 | 11 | C51 | 4,881 | 541 | 630 |
| 0 | C09 | 4,332 | 495 | 650 | 11 | C52 | 1,005 | 104 | 117 |
| 0 | C10 | 800 | 86 | 159 | 11 | C53 | 6,362 | 713 | 919 |
| 0 | C11 | 1,065 | 102 | 144 | 12 | C54 | 22,083 | 2,428 | 3,747 |
| 0 | C12 | 466 | 44 | 26 | 12 | C55 | 626 | 57 | 110 |
| 0 | C13 | 580 | 51 | 88 | 12 | C56 | 9,080 | 990 | 1,333 |
| 0 | C14 | 610 | 65 | 83 | 12 | C57 | 1,229 | 128 | 281 |
| 0 | C15 | 6,257 | 673 | 1,043 | 12 | C58 | 37 | 4 | 1 |
| 1 | C16 | 12,374 | 1,443 | 2,367 | 13 | C60 | 843 | 104 | 119 |
| 1 | C17 | 4,241 | 480 | 655 | 14 | C61 | 54,136 | 5,979 | 11,878 |
| 2 | C18 | 44,198 | 4,949 | 7,275 | 15 | C62 | 2,651 | 307 | 417 |
| 3 | C19 | 4,544 | 469 | 689 | 15 | C63 | 149 | 18 | 21 |
| 3 | C20 | 13,392 | 1,498 | 2,346 | 15 | C64 | 16,033 | 1,686 | 2,731 |
| 3 | C21 | 2,856 | 349 | 494 | 15 | C65 | 1,781 | 197 | 272 |
| 3 | C22 | 6,350 | 734 | 1,135 | 15 | C66 | 1,228 | 136 | 177 |
| 3 | C23 | 1,439 | 161 | 240 | 16 | C67 | 27,165 | 3,028 | 3,902 |
| 3 | C24 | 2,044 | 219 | 347 | 16 | C68 | 770 | 110 | 128 |
| 4 | C25 | 13,652 | 1,532 | 2,608 | 16 | C69 | 867 | 123 | 112 |
| 4 | C26 | 1,097 | 134 | 217 | 16 | C70 | 3,460 | 353 | 661 |
| 4 | C30 | 700 | 74 | 89 | 17 | C71 | 9,210 | 1,023 | 1,593 |
| 4 | C31 | 576 | 76 | 111 | 17 | C72 | 1,332 | 159 | 215 |
| 4 | C32 | 6,049 | 730 | 726 | 17 | C73 | 16,866 | 1,940 | 2,447 |
| 4 | C33 | 116 | 16 | 7 | 17 | C74 | 534 | 59 | 39 |
| 5 | C34 | 94,089 | 10,590 | 16,276 | 17 | C75 | 2,296 | 244 | 437 |
| 6 | C37 | 344 | 35 | 46 | 17 | C76 | 419 | 44 | 52 |
| 6 | C38 | 1,954 | 222 | 322 | 18 | C77 | 36,900 | 4,191 | 5,196 |
| 6 | C39 | 4 | 0 | 7 | 19 | C80 | 8,428 | 943 | 1,412 |