Table 2. Baseline characteristics of derivation and validation cohorts.
Predictor | Derivation cohort (n = 3,749,932) | Validation cohort (n = 887,365) | |
---|---|---|---|
Sex, n (%) | |||
Female | 1,937,265 (51.66) | 454,424 (51.21) | |
Male | 1,812,667 (48.34) | 432,941 (48.79) | |
Age, mean (SD) | 51.0 (19.8) | 53.1 (19.9) | |
Marital status, n (%) | |||
Missing | 2,960,949 (78.96) | 735,198 (82.85) | |
Single | 481,753 (12.85) | 44,941 (5.06) | |
Married/stable relationship | 481,753 (12.85) | 94,000 (10.60) | |
Separated/widowed | 64,441 (1.72) | 13,226 (1.49) | |
IMD score (socioeconomic status), mean (SD) | 2.8 (1.4) | 3.3 (1.4) | |
Family history of chronic disease, n (%) | 646,360 (17.24) | 196,800 (22.18) | |
BMI | |||
Missing, n (%) | 1,094,892 (29.20) | 242,324 (27.31) | |
Mean (SD) | 26.1 (5.6) | 26.4 (5.8) | |
Strategic health authority (region), n (%) | |||
North East | 0 | 89,004 (10.03) | |
North West | 0 | 613,460 (69.13) | |
Yorkshire and the Humber | 0 | 184,901 (20.84) | |
East Midlands | 150,831 (4.02) | 0 | |
West Midlands | 518,586 (13.83) | 0 | |
East of England | 540,346 (14.41) | 0 | |
South West | 558,036 (14.88) | 0 | |
South Central | 572,791 (15.27) | 0 | |
London | 817,870 (21.81) | 0 | |
South East Coast | 591,472 (15.77) | 0 | |
Ethnicity, n (%) | |||
Missing | 2,625,523 (70.02) | 536,806 (60.49) | |
White | 1,039,476 (27.72) | 339,466 (38.26) | |
Indian | 16,740 (0.45) | 1,571 (0.18) | |
Pakistani | 6,153 (0.16) | 2,395 (0.27) | |
Bangladeshi | 1,958 (0.05) | 355 (0.04) | |
Other Asian | 7,466 (0.20) | 711 (0.08) | |
Caribbean | 9,786 (0.26) | 541 (0.06) | |
Black African | 15,499 (0.41) | 1,234 (0.14) | |
Chinese | 3,493 (0.09) | 810 (0.09) | |
Other | 23,838 (0.64) | 3,476 (0.39) | |
Smoking status, n (%) | |||
Missing | 680,838 (18.16) | 143,032 (16.12) | |
Non-smoker | 1,678,287 (44.76) | 379,395 (42.76) | |
Ex-smoker | 392,806 (10.48) | 90,539 (10.20) | |
Light smoker (<10 cigarettes/day) | 286,113 (7.63) | 69,704 (7.86) | |
Moderate smoker (10–20 cigarettes/day) | 345,639 (9.22) | 102,533 (11.55) | |
Heavy smoker (>20 cigarettes/day) | 250,567 (6.68) | 80,713 (9.10) | |
Smoker, amount not recorded | 115,367 (3.08) | 21,321 (2.40) | |
Alcohol intake, n (%) | |||
Missing | 1,089,383 (29.05) | 235,862 (26.58) | |
Non-drinker | 360,048 (9.60) | 82,905 (9.34) | |
Ex-drinker | 24,802 (0.66) | 8,339 (0.94) | |
Trivial (<1 unit/week) | 249,020 (6.64) | 46,001 (5.18) | |
Light (1–2 units/week) | 447,954 (11.95) | 99,739 (11.24) | |
Moderate (3–6 units/week) | 418,400 (11.16) | 101,713 (11.46) | |
Heavy (7–9 units/week) | 161,290 (4.30) | 40,942 (4.61) | |
Very heavy (>9 units/week) | 627,261 (16.73) | 194,999 (21.98) | |
Drinker, amount not recorded | 371,774 (9.91) | 76,865 (8.66) | |
Previous use of healthcare service | |||
No emergency admission, n (%) | 3,583,848 (95.57) | 834,693 (94.06) | |
1 emergency admission, n (%) | 120,614 (3.22) | 36,046 (4.06) | |
2 emergency admissions, n (%) | 30,111 (0.80) | 10,546 (1.19) | |
3+ emergency admissions, n (%) | 15,359 (0.41) | 6,080 (0.69) | |
Mean number of days since last admission (SD) | 170.7 (103.7) | 169.6 (105.8) | |
Mean number of consultations (SD) | 21.7 (24.4) | 24.4 (25.9) | |
Mean consultation duration (SD) | 124.9 (227.5) | 163.0 (374.3) | |
Mean number of days since last consultation (SD) | 300.8 (83.4) | 307.4 (78.7) | |
Clinical values | |||
Systolic blood pressure | |||
Missing, n (%) | 443,729 (11.83) | 98,150 (11.06) | |
Mean (SD) | 127.9 (18.4) | 128.6 (19.2) | |
Cholesterol/HDL | |||
Missing, n (%) | 2,781,874 (74.18) | 593,814 (66.92) | |
Mean (SD) | 3.8 (1.6) | 3.8 (1.8) | |
Haemoglobin | |||
Missing, n (%) | 2,012,077 (53.66) | 434,005 (48.91) | |
Haemoglobin < 110 g/l, n (%) | 84,396 (2.25) | 23,178 (2.61) | |
Platelets | |||
Missing, n (%) | 2,056,437 (54.84) | 449,522 (50.66) | |
Platelets > 480 × 10⁹/l, n (%) | 21,305 (0.57) | 5,900 (0.66) | |
Liver function test | |||
Missing, n (%) | 2,285,715 (60.95) | 489,673 (55.18) | |
Abnormal liver function test, n (%) | 23,217 (0.62) | 9,328 (1.05) | |
ESR | |||
Missing, n (%) | 2,908,165 (77.55) | 683,599 (77.04) | |
Abnormal ESR, n (%) | 96,436 (2.57) | 21,828 (2.46) | |
Comorbidity, n (%) | |||
Diabetes | 326,672 (8.71) | 83,309 (9.39) | |
Atrial fibrillation | 122,627 (3.27) | 49,647 (5.59) | |
Cardiovascular disease | 379,071 (10.11) | 104,215 (11.74) | |
Congestive cardiac failure | 140,439 (3.75) | 53,742 (6.06) | |
Venous thromboembolism | 99,083 (2.64) | 36,791 (4.15) | |
Cancer | 143,923 (3.84) | 36,677 (4.13) | |
Asthma or COPD | 753,223 (20.09) | 162,853 (18.35) | |
Epilepsy | 103,800 (2.77) | 7,690 (0.87) | |
Falls | 354,748 (9.46) | 86,801 (9.78) | |
Manic depression or schizophrenia | 33,716 (0.90) | 0 (0.00) | |
Chronic renal disease | 272,292 (7.26) | 72,221 (8.14) | |
Chronic liver disease or pancreatitis | 68,726 (1.83) | 0 (0.00) | |
Valvular heart disease | 49,274 (1.31) | 0 (0.00) | |
Treated hypertension | 892,430 (23.8) | 193,826 (21.84) | |
Rheumatoid arthritis or SLE | 58,658 (1.56) | 0 (0.00) | |
Depression (QOF definition) | 862,357 (23.0) | 173,965 (19.6) | |
Arthritis | 524,936 (14.0) | 161,050 (18.15) | |
Connective tissue disease | 32,850 (0.88) | 7,079 (0.80) | |
Hemiplegia | 7,097 (0.19) | 2,553 (0.29) | |
HIV/AIDS | 29,701 (0.79) | 7,176 (0.81) | |
Hyperlipidaemia | 216,304 (5.77) | 66,238 (7.46) | |
Learning disability | 18,574 (0.50) | 5,069 (0.57) | |
Obesity | 231,123 (6.16) | 66,210 (7.46) | |
Osteoporosis | 66,877 (1.78) | 20,056 (2.26) | |
Peripheral arterial disease | 56,828 (1.52) | 20,761 (2.34) | |
Peptic ulcer disease | 62,122 (1.66) | 24,151 (2.72) | |
Substance abuse | 54,517 (1.45) | 19,673 (2.22) | |
Current prescribed medication, n (%) | |||
Statin | 552,982 (14.75) | 164,814 (18.57) | |
NSAID | 1,505,161 (40.14) | 423,637 (47.74) | |
Anticoagulant | 122,803 (3.27) | 34,285 (3.86) | |
Corticosteroid | 809,336 (21.58) | 214,067 (24.12) | |
Antidepressant | 649,131 (17.31) | 210,259 (23.69) | |
Antipsychotic | 114,487 (3.05) | 40,060 (4.51) |
COPD, chronic obstructive pulmonary disease; ESR, erythrocyte sedimentation rate; HDL, high-density lipoprotein; IMD, Index of Multiple Deprivation; NSAID, non-steroidal anti-inflammatory drug; QOF, Quality and Outcomes Framework; SLE, systemic lupus erythematosus.
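
The percentages in Table 2 appear to be each count divided by the cohort size given in the column header. The following is a minimal sketch of that arithmetic, not the authors' code: the helper `percent` and the variable names are illustrative assumptions, while the counts and cohort sizes are copied from the Sex rows and column headers of the table.

```python
# Cohort sizes taken from the Table 2 column headers.
N_DERIVATION = 3_749_932
N_VALIDATION = 887_365


def percent(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Sex counts copied from Table 2 (hypothetical variable names).
sex_derivation = {"Female": 1_937_265, "Male": 1_812_667}
sex_validation = {"Female": 454_424, "Male": 432_941}

# With no "Missing" row, the two categories should partition each cohort.
assert sum(sex_derivation.values()) == N_DERIVATION
assert sum(sex_validation.values()) == N_VALIDATION

# Recompute the derivation-cohort percentages reported in the table.
for label, count in sex_derivation.items():
    print(label, percent(count, N_DERIVATION))
# prints:
# Female 51.66
# Male 48.34
```

The same check can be applied to any categorical block in the table: a count larger than the cohort size, or a percentage that does not match its count, usually indicates a transcription error.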