2021 Aug 16;11:16605. doi: 10.1038/s41598-021-95747-4

Table 1.

Summary of data used for training and testing the LungCNN-Histo and the TMB classification models.

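| Model | LungCNN-Histo | LungCNN-Histo | WS-S1 | LungCNN-TMB and WS-TMB | LungCNN-TMB and WS-TMB | LungCNN-TMB and WS-TMB | LungCNN-TMB and WS-TMB |
|---|---|---|---|---|---|---|---|
| Dataset name | Histo-Train | Histo-Test | WS-Train | TMB-Train | TMB-Train | TMB-Test | TMB-Test |
| Data source | TCGA LUAD | TCGA LUAD | DS2 | TCGA LUAD | TCGA LUAD | TCGA LUAD (unseen sites) | TCGA LUAD (seen sites) |
| Number of cases (number of slides) | 64 (68) | 38 (40) | 50 (50) | 261 (317) | 242 (295) | 84 (84) | 88 (93) |
| Number of tissue source sites | 22 | 20 | N/A | 23 | 21 | 10 | 17 |
| Age range (median) | 38–84 (67.5) | 48–83 (68.5) | 40–70 (56) | 33–87 (67) | 33–87 (67) | 42–84 (64) | 41–88 (69) |
| Pathologic stage: I | 37 | 20 | 35 | 148 | 138 | 40 | 56 |
| Pathologic stage: II | 15 | 11 | 9 | 60 | 54 | 27 | 17 |
| Pathologic stage: III | 8 | 6 | 6 | 37 | 35 | 10 | 12 |
| Pathologic stage: IV | 4 | 1 | 0 | 15 | 15 | 7 | 3 |
| Pathologic stage: N/A | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Sex: Female | 40 | 17 | 17 | 144 | 134 | 44 | 49 |
| Sex: Male | 24 | 21 | 33 | 117 | 108 | 40 | 39 |
| Smoking status: Non-smoker | 26 | 17 | 0 | 96 | 89 | 38 | 36 |
| Smoking status: Smoker | 30 | 21 | 0 | 157 | 153 | 46 | 52 |
| Smoking status: N/A | 8 | 0 | 50 | 8 | 0 | 0 | 0 |
| TMB status: Low | 47 | 28 | N/A | 182 | 167 | 61 | 59 |
| TMB status: High | 17 | 10 | N/A | 79 | 75 | 23 | 29 |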