2020 Jun 26;10:1051. doi: 10.3389/fonc.2020.01051

Table 1.

Cleaning of patient demographic data.

Variables   Before cleaning   After cleaning (training cohort)   After cleaning (verification cohort)
Age (years)
0–49 48,070 394 254
50–54 10,044 271 174
55–59 11,732 309 238
60–64 12,440 338 250
65–69 12,476 314 206
70–74 11,621 267 160
75–79 10,027 170 103
80–84 7,011 93 52
85+ 5,133 27 15
Race
Black 8,399 98 82
White 112,750 1,981 1,318
Other 6,872 104 52
Unknown 533
Sex
Female 56,923 850 602
Male 71,631 1,333 850
Year of diagnosis
1975–2006 73,624
2007–2009 16,058 744 479
2010–2012 16,341 712 471
2013–2015 17,078 727 502
2016 5,453
Type of follow-up expected
Active follow-up 126,040
Autopsy/death certificate only cases 2,423
SF/Oakland only (originally inactive/now active) 91
NHIA (Hispanic, Non-Hisp)
Non-Spanish-Hispanic-Latino 114,658 1,927 1,288
Spanish-Hispanic-Latino 13,896 256 164
Age at diagnosis
0–49 48,070 394 254
50–65 36,731 980 705
66+ 43,753 809 493
Type of reporting source
Hospital inpatient/outpatient or clinic 122,674
Others 5,880
Insurance
Uninsured 1,949 76 48
Medicaid 7,913 239 176
Insured 41,981 1,868 1,228
Unknown 76,711
Marital status at diagnosis
Single (never married) 33,718 309 208
Unmarried or domestic partner, Married (including common law) 68,259 1,500 991
Separated; Divorced; Widowed 22,336 374 253
Unknown 4,241
Status
Alive 30,554 246 165
Dead 98,000 1,937 1,287
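
The counts in Table 1 can be checked for internal consistency: within each column, the categories of every variable should sum to the same number of patients. The following is a minimal sketch of that arithmetic check, not part of the original analysis. It uses four of the variables transcribed from the table; the names TABLE and column_totals are illustrative, and treating an empty cell as zero is an assumption.

```python
# Counts transcribed from Table 1 as
# (before cleaning, training cohort, verification cohort).
# None marks a cell that is empty in the table.
TABLE = {
    "Sex": [(56923, 850, 602), (71631, 1333, 850)],
    "Race": [
        (8399, 98, 82),
        (112750, 1981, 1318),
        (6872, 104, 52),
        (533, None, None),  # Unknown: no value after cleaning
    ],
    "Age at diagnosis": [(48070, 394, 254), (36731, 980, 705), (43753, 809, 493)],
    "Status": [(30554, 246, 165), (98000, 1937, 1287)],
}


def column_totals(rows):
    """Sum each of the three columns, treating empty cells as zero (assumption)."""
    return tuple(sum(row[i] or 0 for row in rows) for i in range(3))


totals = {name: column_totals(rows) for name, rows in TABLE.items()}
for name, total in totals.items():
    print(f"{name}: {total}")

# Every variable should describe the same patients in each column.
assert len(set(totals.values())) == 1
```

For these four variables the column totals agree at 128,554 patients before cleaning, 2,183 in the training cohort, and 1,452 in the verification cohort.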