2023 Mar 14;21:3. doi: 10.1186/s12963-023-00302-0

Table 1.

Agreement between ethnic categorisation in CPRD and HES datasets

Rows: algorithm-generated higher-level ethnic categorisation using CPRD GOLD or CPRD Aurum.
Columns: algorithm-generated higher-level ethnic categorisation using all HES datasets.

| CPRD category | Measure | White | Mixed | Asian | Black | Other | Total |
|---|---|---|---|---|---|---|---|
| White | N | 10,641,573 | 96,569 | 59,201 | 51,769 | 99,475 | 10,948,587 |
| | Row % | 97.20 | 0.88 | 0.54 | 0.47 | 0.91 | 100.00 |
| | Column % | 97.77 | 36.38 | 5.81 | 7.46 | 47.88 | 83.77 |
| Mixed | N | 84,799 | 88,143 | 33,154 | 53,501 | 16,333 | 275,930 |
| | Row % | 30.73 | 31.94 | 12.02 | 19.39 | 5.92 | 100.00 |
| | Column % | 0.78 | 33.20 | 3.26 | 7.71 | 7.86 | 2.11 |
| Asian | N | 56,358 | 30,064 | 875,090 | 14,705 | 52,042 | 1,028,259 |
| | Row % | 5.48 | 2.92 | 85.10 | 1.43 | 5.06 | 100.00 |
| | Column % | 0.52 | 11.33 | 85.92 | 2.12 | 25.05 | 7.87 |
| Black | N | 33,396 | 38,521 | 17,281 | 564,302 | 17,448 | 670,948 |
| | Row % | 4.98 | 5.74 | 2.58 | 84.11 | 2.60 | 100.00 |
| | Column % | 0.31 | 14.51 | 1.70 | 81.36 | 8.40 | 5.13 |
| Other | N | 67,683 | 12,162 | 33,741 | 9,324 | 22,451 | 145,361 |
| | Row % | 46.56 | 8.37 | 23.21 | 6.41 | 15.44 | 100.00 |
| | Column % | 0.62 | 4.58 | 3.31 | 1.34 | 10.81 | 1.11 |
| Total | N | 10,883,809 | 265,459 | 1,018,467 | 693,601 | 207,749 | 13,069,085 |
| | Row % | 83.28 | 2.03 | 7.79 | 5.31 | 1.59 | |
| | Column % | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | |

Counts (N) and proportions (%) of currently registered acceptable English patients with ethnicity recorded both in combined CPRD GOLD and CPRD Aurum and in any HES dataset. The table shows the agreement between the algorithm-generated higher-level ethnic categorisation using CPRD data only and the algorithm-generated higher-level ethnic categorisation using HES data only.
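
The row and column percentages in Table 1 can be recomputed from the counts alone. The following is a minimal sketch in Python, assuming the counts are transcribed as a 5 x 5 matrix with CPRD categories as rows and HES categories as columns. The names `COUNTS`, `row_percentages`, `column_percentages` and `overall_agreement` are illustrative and do not come from the paper. The overall agreement function (diagonal cells divided by the grand total) is a derived summary that Table 1 does not itself report.

```python
CATEGORIES = ["White", "Mixed", "Asian", "Black", "Other"]

# Counts copied from Table 1.
# Rows: CPRD-derived category; columns: HES-derived category.
COUNTS = [
    [10_641_573, 96_569, 59_201, 51_769, 99_475],  # White
    [84_799, 88_143, 33_154, 53_501, 16_333],      # Mixed
    [56_358, 30_064, 875_090, 14_705, 52_042],     # Asian
    [33_396, 38_521, 17_281, 564_302, 17_448],     # Black
    [67_683, 12_162, 33_741, 9_324, 22_451],       # Other
]


def row_percentages(counts):
    """Each cell as a percentage of its row (CPRD category) total."""
    return [[round(100 * n / sum(row), 2) for n in row] for row in counts]


def column_percentages(counts):
    """Each cell as a percentage of its column (HES category) total."""
    col_totals = [sum(col) for col in zip(*counts)]
    return [
        [round(100 * n / total, 2) for n, total in zip(row, col_totals)]
        for row in counts
    ]


def overall_agreement(counts):
    """Percentage of patients on the diagonal (same category in both sources).

    Illustrative summary only; not a figure reported in Table 1.
    """
    grand_total = sum(sum(row) for row in counts)
    diagonal = sum(counts[i][i] for i in range(len(counts)))
    return 100 * diagonal / grand_total


ROW_PCT = row_percentages(COUNTS)
COL_PCT = column_percentages(COUNTS)

if __name__ == "__main__":
    for name, row in zip(CATEGORIES, ROW_PCT):
        print(name, row)
```

Row percentages read along a CPRD category (for example, the share of patients categorised as Asian in CPRD who are also Asian in HES), while column percentages read down an HES category, which is why the two differ for the same cell.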