2018 Nov 20;15(11):e1002695. doi: 10.1371/journal.pmed.1002695

Table 2. Baseline characteristics of derivation and validation cohorts.

Predictor Derivation cohort (n = 3,749,932) Validation cohort (n = 887,365)
Sex, n (%)
Female 1,937,265 (51.66) 454,424 (51.21)
Male 1,812,667 (48.34) 432,941 (48.79)
Age, mean (SD) 51.0 (19.8) 53.1 (19.9)
Marital status, n (%)
Missing 2,960,949 (78.96) 735,198 (82.85)
Single 481,753 (12.85) 44,941 (5.06)
Married/stable relationship 481,753 (12.85) 94,000 (10.60)
Separated/widowed 64,441 (1.72) 13,226 (1.49)
IMD score (socioeconomic status), mean (SD) 2.8 (1.4) 3.3 (1.4)
Family history of chronic disease, n (%) 646,360 (17.24) 196,800 (22.18)
BMI
Missing, n (%) 1,094,892 (29.20) 242,324 (27.31)
Mean (SD) 26.1 (5.6) 26.4 (5.8)
Strategic health authority (region), n (%)
North East 0 89,004 (10.03)
North West 0 613,460 (69.13)
Yorkshire and the Humber 0 184,901 (20.84)
East Midlands 150,831 (4.02) 0
West Midlands 518,586 (13.83) 0
East of England 540,346 (14.41) 0
South West 558,036 (14.88) 0
South Central 572,791 (15.27) 0
London 817,870 (21.81) 0
South East Coast 591,472 (15.77) 0
Ethnicity, n (%)
Missing 2,625,523 (70.02) 536,806 (60.49)
White 1,039,476 (27.72) 339,466 (38.26)
Indian 16,740 (0.45) 1,571 (0.18)
Pakistani 6,153 (0.16) 2,395 (0.27)
Bangladeshi 1,958 (0.05) 355 (0.04)
Other Asian 7,466 (0.20) 711 (0.08)
Caribbean 9,786 (0.26) 541 (0.06)
Black African 15,499 (0.41) 1,234 (0.14)
Chinese 3,493 (0.09) 810 (0.09)
Other 23,838 (0.64) 3,476 (0.39)
Smoking status, n (%)
Missing 680,838 (18.16) 143,032 (16.12)
Non-smoker 1,678,287 (44.76) 379,395 (42.76)
Ex-smoker 392,806 (10.48) 90,539 (10.20)
Light smoker (<10 cigarettes/day) 286,113 (7.63) 69,704 (7.86)
Moderate smoker (10–20 cigarettes/day) 345,639 (9.22) 102,533 (11.55)
Heavy smoker (>20 cigarettes/day) 250,567 (6.68) 80,713 (9.10)
Smoker, amount not recorded 115,367 (3.08) 21,321 (2.40)
Alcohol intake, n (%)
Missing 1,089,383 (29.05) 235,862 (26.58)
Non-drinker 360,048 (9.60) 82,905 (9.34)
Ex-drinker 24,802 (0.66) 8,339 (0.94)
Trivial (<1 unit/week) 249,020 (6.64) 46,001 (5.18)
Light (1–2 units/week) 447,954 (11.95) 99,739 (11.24)
Moderate (3–6 units/week) 418,400 (11.16) 101,713 (11.46)
Heavy (7–9 units/week) 161,290 (4.30) 40,942 (4.61)
Very heavy (>9 units/week) 627,261 (16.73) 194,999 (21.98)
Drinker, amount not recorded 371,774 (9.91) 76,865 (8.66)
Previous use of healthcare service
No emergency admission, n (%) 3,583,848 (95.57) 834,693 (94.06)
1 emergency admission, n (%) 120,614 (3.22) 36,046 (4.06)
2 emergency admissions, n (%) 30,111 (0.80) 10,546 (1.19)
3+ emergency admissions, n (%) 15,359 (0.41) 6,080 (0.69)
Mean number of days since last admission (SD) 170.7 (103.7) 169.6 (105.8)
Mean number of consultations (SD) 21.7 (24.4) 24.4 (25.9)
Mean consultation duration (SD) 124.9 (227.5) 163.0 (374.3)
Mean number of days since last consultation (SD) 300.8 (83.4) 307.4 (78.7)
Clinical values
Systolic blood pressure
Missing, n (%) 443,729 (11.83) 98,150 (11.06)
Mean (SD) 127.9 (18.4) 128.6 (19.2)
Cholesterol/HDL
Missing, n (%) 2,781,874 (74.18) 593,814 (66.92)
Mean (SD) 3.8 (1.6) 3.8 (1.8)
Haemoglobin
Missing, n (%) 2,012,077 (53.66) 434,005 (48.91)
Haemoglobin < 110 g/l, n (%) 84,396 (2.25) 23,178 (2.61)
Platelets
Missing, n (%) 2,056,437 (54.84) 449,522 (50.66)
Platelets > 480 × 10⁹/l, n (%) 21,305 (0.57) 5,900 (0.66)
Liver function test
Missing, n (%) 2,285,715 (60.95) 489,673 (55.18)
Abnormal liver function test, n (%) 23,217 (0.62) 9,328 (1.05)
ESR
Missing, n (%) 2,908,165 (77.55) 683,599 (77.04)
Abnormal ESR, n (%) 96,436 (2.57) 21,828 (2.46)
Comorbidity, n (%)
Diabetes 326,672 (8.71) 83,309 (9.39)
Atrial fibrillation 122,627 (3.27) 49,647 (5.59)
Cardiovascular disease 379,071 (10.11) 104,215 (11.74)
Congestive cardiac failure 140,439 (3.75) 53,742 (6.06)
Venous thromboembolism 99,083 (2.64) 36,791 (4.15)
Cancer 143,923 (3.84) 36,677 (4.13)
Asthma or COPD 753,223 (20.09) 162,853 (18.35)
Epilepsy 103,800 (2.77) 7,690 (0.87)
Falls 354,748 (9.46) 86,801 (9.78)
Manic depression or schizophrenia 33,716 (0.90) 0 (0.00)
Chronic renal disease 272,292 (7.26) 72,221 (8.14)
Chronic liver disease or pancreatitis 68,726 (1.83) 0 (0.00)
Valvular heart disease 49,274 (1.31) 0 (0.00)
Treated hypertension 892,430 (23.80) 193,826 (21.84)
Rheumatoid arthritis or SLE 58,658 (1.56) 0 (0.00)
Depression (QOF definition) 862,357 (23.00) 173,965 (19.60)
Arthritis 524,936 (14.00) 161,050 (18.15)
Connective tissue disease 32,850 (0.88) 7,079 (0.80)
Hemiplegia 7,097 (0.19) 2,553 (0.29)
HIV/AIDS 29,701 (0.79) 7,176 (0.81)
Hyperlipidaemia 216,304 (5.77) 66,238 (7.46)
Learning disability 18,574 (0.50) 5,069 (0.57)
Obesity 231,123 (6.16) 66,210 (7.46)
Osteoporosis 66,877 (1.78) 20,056 (2.26)
Peripheral arterial disease 56,828 (1.52) 20,761 (2.34)
Peptic ulcer disease 62,122 (1.66) 24,151 (2.72)
Substance abuse 54,517 (1.45) 19,673 (2.22)
Current prescribed medication, n (%)
Statin 552,982 (14.75) 164,814 (18.57)
NSAID 1,505,161 (40.14) 423,637 (47.74)
Anticoagulant 122,803 (3.27) 34,285 (3.86)
Corticosteroid 809,336 (21.58) 214,067 (24.12)
Antidepressant 649,131 (17.31) 210,259 (23.69)
Antipsychotic 114,487 (3.05) 40,060 (4.51)

COPD, chronic obstructive pulmonary disease; ESR, erythrocyte sedimentation rate; HDL, high-density lipoprotein; IMD, Index of Multiple Deprivation; NSAID, non-steroidal anti-inflammatory drug; QOF, Quality and Outcomes Framework; SLE, systemic lupus erythematosus.
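
The n (%) entries in the table can be checked against the cohort sizes given in the column headers. The following is a minimal sketch, assuming each percentage is the count divided by the cohort size, multiplied by 100 and rounded to two decimals; the helper name `pct` and the dictionary layout are illustrative and do not come from the source. It uses the Sex rows as an example.

```python
# Cohort sizes as stated in the table header.
N_DERIVATION = 3_749_932
N_VALIDATION = 887_365


def pct(count: int, total: int) -> float:
    """Share of the cohort in percent, rounded to two decimals (assumed convention)."""
    return round(100 * count / total, 2)


# (derivation count, validation count), copied from the Sex rows of the table.
sex = {
    "Female": (1_937_265, 454_424),
    "Male": (1_812_667, 432_941),
}

for label, (n_dev, n_val) in sex.items():
    print(f"{label}: {pct(n_dev, N_DERIVATION):.2f} / {pct(n_val, N_VALIDATION):.2f}")

# For a variable with no missing category, the counts should add up to the cohort size.
assert sum(dev for dev, _ in sex.values()) == N_DERIVATION
assert sum(val for _, val in sex.values()) == N_VALIDATION
```

The same check can be applied to any categorical block in the table, for example region or number of emergency admissions, by replacing the counts in the dictionary.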