Abstract
Objective:
To examine agreement between administrative and self-reported data on the number and the constituent chronic conditions (CCs) used to measure multimorbidity.
Study Design and Setting:
Cross-sectional self-reported survey data from four Canadian Community Health Survey waves were linked to administrative data for residents of Ontario, Canada. Agreement for each of 12 CCs was assessed using kappa (κ) statistics. For the overall number of CCs, perfect agreement was defined as agreement on both the number and constituent CCs. Jackknife methods were used to assess the impact of individual CCs on perfect agreement.
Results:
The level of chance-adjusted agreement between self-report and administrative data for individual CCs varied widely, from κ = 5.5% (inflammatory bowel disease) to κ = 77.5% (diabetes), and there was no clear pattern on whether using administrative data or self-reported data led to higher prevalence estimates. Only 26.9% of participants had perfect agreement on the number and constituent CCs; 10.6% agreed on the number but not constituent CCs. The impact of each CC on perfect agreement depended on both the level of agreement and the prevalence of the individual CC.
Conclusion:
Our results show that measuring agreement on multimorbidity is more complex than for individual CCs and that even small levels of individual condition disagreement can have a large impact on the agreement on the number of CCs.
Keywords: Multimorbidity, chronic conditions, agreement, administrative data, self-report data
Introduction
The overall number of chronic conditions (CCs) as a measure of multimorbidity is commonly used as an outcome, risk factor, or confounder in population-based and clinical research. Differences in multimorbidity estimates have been attributed to the way that multimorbidity is measured, including the selection and number of CCs considered,1 population differences,2 differences among multimorbidity definitions,3 and the data source used, the most common being administrative data and self-report.4 While there is not broad agreement on how best to measure multimorbidity, in this short report, we focus on one component, data source. Several studies have previously examined agreement across data sources on individual CCs,5,6 and a small number have explored agreement on specific definitions of multimorbidity (e.g. the occurrence of two or more CCs),2 but to the best of our knowledge, no researchers have examined agreement on both the number and constituent CCs. In this article, we present agreement between administrative and self-reported population-based data from Ontario, Canada, on the overall number and the constituent conditions from a list of 12 CCs. We further examine the impact of each condition on agreement with the overall number of CCs between the two data sources.
Methods
Study design and population
This study uses cross-sectional self-reported data from the Canadian Community Health Survey (CCHS) linked to administrative data holdings in Ontario, Canada, with a population of approximately 14.6 million people.7 A full description of the data sources and study sample is included in other publications8–10 and is briefly described below.
Data sources
The CCHS is a national cross-sectional survey that collects information about health status and health determinants. Ontario participants from CCHS cycles 3.1 (2005–2006),11 2007–2008,12 2009–2010,13 and 2011–201214 who were aged 45 years and older and agreed to linkage with administrative data comprised the study sample (N = 73,717; 78.4% of CCHS respondents across cycles agreed to linkage). Administrative data sources included physician visits, inpatient hospitalizations, and emergency department visits. Both CCHS and administrative data sources were used to estimate the frequency of each CC based on the date of CCHS completion (i.e. the index date).
Chronic conditions
The following 12 CCs were used in the analyses: Alzheimer’s disease/dementia, anxiety/depression, arthritis, asthma, cancer, chronic obstructive pulmonary disease (COPD), diabetes, heart disease, hypertension, inflammatory bowel disease (IBD), stomach or intestinal ulcers, and stroke. CC status was ascertained through self-reported physician diagnosis (CCHS) or via algorithms developed for research use (administrative data).15 The number of CCs was operationalized as the sum of the 12 candidate conditions (0, 1, 2, 3, etc.).
Statistical analysis
Agreement was examined for (1) individual CCs, (2) the number of CCs (in which the constituent CCs could differ), and (3) “perfect agreement” on the number and constituent CCs. We assessed raw agreement and chance-adjusted agreement using Cohen’s κ statistics.16 We further used jackknife methods to examine the influence of individual CCs on perfect agreement. This was done by removing each of the 12 CCs sequentially from the condition list and estimating the resulting increase in perfect agreement. All analyses were done using SAS 9.4.17
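For reference, Cohen’s κ for a single condition can be written as below, where $p_o$ is the observed (raw) proportion of participants on whom the two sources agree and $p_e$ is the agreement expected by chance given the marginal prevalences $p_A$ (administrative data) and $p_S$ (self-report). The symbols are introduced here for illustration and are not the authors’ notation.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_e = p_A\,p_S + (1 - p_A)(1 - p_S)
```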
The use of data in this project was authorized under section 45 of Ontario’s Personal Health Information Protection Act. The study was approved by the Hamilton Integrated Research Ethics Board at McMaster University (Ethics certificate no.: 13-590).
Results
Study participants
Approximately 56% of the participants were 45–64 years of age, 56.2% were female, and participants were evenly split among the neighborhood income groups (Table 1). The proportion of participants with no CCs was 18.3% based on administrative data and 24.6% based on self-report. The proportion of participants with each of 2, 3, 4, and 5+ CCs was higher for administrative data than self-report.
Table 1.
Prevalence of demographic characteristics and number of CCs based on administrative data and self-report for 71,317 Ontario participants 45 years or older of the CCHS cycles 3-6.
| Characteristic | Total Cohort (N=71,317) |
|---|---|
| Sex | |
| Male | 31,210 (43.76%) |
| Female | 40,107 (56.24%) |
| Age group (years) | |
| 45–54 | 18,101 (25.38%) |
| 55–64 | 21,765 (30.52%) |
| 65–74 | 16,979 (23.81%) |
| 75–84 | 11,382 (15.96%) |
| 85+ | 3090 (4.33%) |
| Neighborhood income quintile | |
| Lowest | 13,796 (19.3%) |
| 2 | 14,244 (20.0%) |
| 3 | 14,237 (20.0%) |
| 4 | 14,620 (20.5%) |
| Highest | 14,418 (20.2%) |
| Number of CCs based on administrative data | |
| 0 | 13,059 (18.3%) |
| 1 | 19,065 (26.7%) |
| 2 | 17,824 (25.0%) |
| 3 | 11,671 (16.4%) |
| 4 | 6081 (8.5%) |
| 5+ | 3617 (5.1%) |
| Number of CCs based on self-report | |
| 0 | 17,533 (24.6%) |
| 1 | 20,215 (28.4%) |
| 2 | 16,297 (22.9%) |
| 3 | 9692 (13.6%) |
| 4 | 4569 (6.4%) |
| 5+ | 3011 (4.2%) |
CC: chronic condition; CCHS: Canadian Community Health Survey.
Agreement on individual CCs and overall number of CCs
The raw agreement between data sources was over 80% for all conditions except arthritis for which raw agreement was 65.0%. κ (chance-adjusted agreement) varied from 77.5% (95% confidence interval (CI) 76.9–78.2) for diabetes to 5.5% (95% CI 4.6–6.3) for IBD (Table 2). Although there was no clear pattern on which data source led to higher individual condition prevalence estimates, the average overall number of CCs was higher using administrative data (1.87) compared to self-report (1.64). There was disagreement on the number of CCs for 62.5% of participants; 26.9% had perfect agreement on the number and constituent conditions and 10.6% agreed on the number but not the constituent conditions.
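The three mutually exclusive categories reported above (disagreement on the number, agreement on the number only, and perfect agreement) can be derived per participant as in the minimal Python sketch below. It assumes each participant's conditions are stored as one set per data source; the function names and the example participants are hypothetical and this is not the authors' SAS code.

```python
from typing import Set

def count_ccs(conditions: Set[str]) -> int:
    """Number of CCs = how many of the candidate conditions are present."""
    return len(conditions)

def classify_agreement(admin: Set[str], self_report: Set[str]) -> str:
    """Assign one participant to one of the three agreement categories."""
    if count_ccs(admin) != count_ccs(self_report):
        return "disagree on number"
    if admin == self_report:
        # Same number and the same constituent conditions.
        return "perfect agreement"
    # Same number, but at least one constituent condition differs.
    return "agree on number only"

# Hypothetical participants (illustrative data, not from the study):
# (conditions in administrative data, conditions in self-report)
examples = [
    ({"diabetes", "hypertension"}, {"diabetes", "hypertension"}),
    ({"arthritis", "COPD"}, {"arthritis", "asthma"}),
    ({"arthritis"}, set()),
]
labels = [classify_agreement(a, s) for a, s in examples]
print(labels)
```

The second participant shows why agreement on the number alone can be misleading: both sources count two conditions, but they are not the same two.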
Table 2.
Prevalence of individual CCs using administrative data and self-report, and raw and chance-adjusted (κ) agreement between data sources.a
| Condition | Administrative data (%) | Self-reported data (%) | Raw agreement (%; 95% CI) | Kappa (%; 95% CI) |
|---|---|---|---|---|
| Individual CCs | | | | |
| Hypertension | 47.9 | 44.0 | 82.8 (82.5, 83.1) | 65.4 (64.9, 66.0) |
| Arthritis (including fibromyalgia) | 51.0 | 38.7 | 65.0 (64.6, 65.3) | 30.2 (29.5, 30.9) |
| Anxiety/depression | 23.1 | 11.9 | 81.0 (80.7, 81.3) | 35.5 (34.7, 36.3) |
| Diabetes | 16.6 | 12.8 | 94.3 (94.2, 94.5) | 77.5 (76.9, 78.2) |
| Cancer | 9.8 | 14.4 | 93.1 (92.9, 93.3) | 67.8 (66.9, 68.6) |
| Heart disease | 10.4 | 12.5 | 90.0 (89.8, 90.2) | 50.7 (49.7, 51.7) |
| COPD (emphysema, chronic bronchitis) | 15.0 | 6.7 | 86.3 (86.1, 86.6) | 30.3 (29.3, 31.3) |
| Asthma | 5.1 | 8.0 | 91.1 (90.9, 91.3) | 27.4 (26.2, 28.7) |
| Stroke | 4.8 | 2.8 | 95.3 (95.1, 95.4) | 35.2 (33.5, 36.9) |
| IBD | 0.3 | 7.1 | 93.1 (92.9, 93.2) | 5.5 (4.6, 6.3) |
| Stomach or intestinal ulcers | 2.1 | 4.2 | 94.4 (94.2, 94.6) | 8.8 (7.6, 10.1) |
| Alzheimer’s disease or dementia | 1.2 | 0.9 | 98.8 (98.7, 98.9) | 42.8 (39.6, 45.9) |
| Number of CCs (mean, SD) | 1.87 (1.45) | 1.64 (1.44) | | |
CC: chronic condition; CI: confidence interval; COPD: chronic obstructive pulmonary disease; IBD: inflammatory bowel disease; SD: standard deviation.
a Conditions are ordered from the highest to lowest average prevalence between administrative and self-report data.
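To illustrate how a condition can combine high raw agreement with low κ, the sketch below recomputes Cohen's κ for two binary ratings from the rounded prevalences and raw agreement in Table 2. The function name is hypothetical, and because the inputs are rounded to one decimal place, the results only approximate the published values (roughly 0.77 for diabetes and 0.06 for IBD, compared with the reported 77.5% and 5.5%).

```python
def kappa_from_marginals(p_admin: float, p_self: float, raw_agreement: float) -> float:
    """Cohen's kappa for a binary condition, from the two marginal
    prevalences and the raw (observed) agreement, all as proportions."""
    # Chance agreement: both sources say "yes" or both say "no"
    # if the two ratings were statistically independent.
    p_chance = p_admin * p_self + (1 - p_admin) * (1 - p_self)
    return (raw_agreement - p_chance) / (1 - p_chance)

# Rounded values from Table 2, converted from percentages to proportions.
diabetes = kappa_from_marginals(0.166, 0.128, 0.943)
ibd = kappa_from_marginals(0.003, 0.071, 0.931)

print(f"diabetes kappa ~ {diabetes:.2f}")
print(f"IBD kappa ~ {ibd:.2f}")
```

For IBD, almost all of the 93.1% raw agreement is expected by chance because the condition is rare in both sources, which leaves very little agreement beyond chance.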
Impact of individual conditions on perfect agreement
To assess the impact of individual CCs on perfect agreement, we examined the increase in percent of perfect agreement when each of the conditions was excluded one by one from the list of 12 CCs. In Figure 1, the size of the circle represents the average prevalence (using the two data sources) of the condition, and the individual conditions’ agreement is on the x-axis. Alzheimer’s disease had the smallest impact on agreement, its exclusion increasing perfect agreement by 0.1%; arthritis had the largest impact, increasing perfect agreement by 13.4%. The increase in perfect agreement depended on both the prevalence and the level of agreement of the individual CC. For example, COPD and arthritis have a similar level of agreement (κ = 30.3% and κ = 30.2%, respectively), but the average prevalence of COPD was 10.8% compared to 44.8% for arthritis. Removing COPD from the list of CCs increased perfect agreement by 2.5%, whereas removing arthritis led to a roughly five times greater increase of 13.4%.
Figure 1.
Percent improvement in perfect agreement between administrative data and self-report on the number of CCs when each individual CC is removed from the condition list used to generate the number of CCs. The size of each bubble indicates the relative prevalence of the CCs based on administrative data. The κ for individual condition agreement is indicated on the x-axis. κ: Kappa; CC: chronic condition; COPD: chronic obstructive pulmonary disease.
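The leave-one-condition-out procedure behind Figure 1 can be sketched as follows. This is a minimal Python illustration under the assumption that each participant is stored as a pair of condition sets; the names `perfect_agreement` and `leave_one_out_gain` and the toy cohort are hypothetical, and the authors' own analysis was done in SAS.

```python
from typing import Dict, List, Set, Tuple

# (conditions in administrative data, conditions in self-report)
Participant = Tuple[Set[str], Set[str]]

def perfect_agreement(cohort: List[Participant], conditions: Set[str]) -> float:
    """Share of participants whose two sources list exactly the same CCs,
    considering only the conditions in `conditions`."""
    matches = sum(
        (admin & conditions) == (self_rep & conditions)
        for admin, self_rep in cohort
    )
    return matches / len(cohort)

def leave_one_out_gain(cohort: List[Participant], conditions: Set[str]) -> Dict[str, float]:
    """Increase in perfect agreement when each condition is dropped in turn."""
    baseline = perfect_agreement(cohort, conditions)
    return {
        cc: perfect_agreement(cohort, conditions - {cc}) - baseline
        for cc in sorted(conditions)
    }

# Toy cohort (illustrative data, not from the study).
conditions = {"arthritis", "COPD", "diabetes"}
cohort = [
    ({"arthritis"}, set()),                      # sources disagree on arthritis
    ({"arthritis", "diabetes"}, {"diabetes"}),   # sources disagree on arthritis
    ({"COPD"}, set()),                           # sources disagree on COPD
    ({"diabetes"}, {"diabetes"}),                # perfect agreement
]
gains = leave_one_out_gain(cohort, conditions)
print(gains)
```

In this toy cohort, dropping arthritis yields the largest gain because it accounts for the most disagreements, while dropping diabetes yields none because the sources never disagree on it.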
Discussion
We found that for individual conditions, the level of agreement between self-report and administrative data varied widely and there was no clear pattern on whether using administrative data or self-reported data led to higher prevalence estimates. Although estimates of overall number of CCs were consistently higher using administrative data, almost half of the individual conditions had a higher prevalence based on self-report. It has been speculated that diseases which are less familiar to patients and/or have nonspecific and intermittent symptoms, such as chronic lung disease, tend to be underreported by patients18 and thus have lower levels of agreement. In contrast, diseases that are well-defined and relatively easy to diagnose19 or that require ongoing self-management and ongoing contact with the health-care system,20,21 such as diabetes, have higher levels of agreement. This is reflected in our results as we found the highest level of agreement with diabetes, heart disease, hypertension, stroke, and cancer.
Unlike previous studies, we broke down the overall multimorbidity agreement into agreement on the number of CCs as well as the number and constituent conditions (“perfect” agreement). Although agreement on the number of CCs was low, it was even lower when we required that constituent CCs also agree. In our previous work, we also found that perfect agreement dropped consistently as the number of CCs increased.10 All of this demonstrates that the way we think about agreement needs to be more nuanced when we get into a complex topic like multimorbidity.
When we further examined the impact of individual CCs on the level of agreement on the overall number of CCs, we found that both the individual condition’s level of agreement and its prevalence were important. Excluding a disease with a low level of agreement and low prevalence, for example, stomach ulcers, had a smaller impact on perfect agreement than excluding a disease with a higher level of agreement and a prevalence closer to 50%, for example, hypertension. Since disease lists used in multimorbidity research often include the most common CCs,22 even small levels of individual condition disagreement can have a large impact on the agreement on the number of CCs. This is complicated further by the lack of a standardized disease list to assess multimorbidity across studies, as different conditions being counted will lead to different levels of multimorbidity agreement.
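One way to see why small disagreements compound is a back-of-the-envelope calculation that is not part of the study: if disagreement on each condition were independent of disagreement on the others (a strong simplifying assumption), the probability of perfect agreement would be the product of the 12 raw agreement values in Table 2. The Python sketch below uses the rounded Table 2 values; the variable and function names are illustrative.

```python
from math import prod
from typing import Iterable

# Raw agreement (%) per condition, as reported in Table 2.
raw_agreement = {
    "hypertension": 82.8, "arthritis": 65.0, "anxiety/depression": 81.0,
    "diabetes": 94.3, "cancer": 93.1, "heart disease": 90.0,
    "COPD": 86.3, "asthma": 91.1, "stroke": 95.3,
    "IBD": 93.1, "ulcers": 94.4, "Alzheimer's/dementia": 98.8,
}

def product_agreement(values: Iterable[float]) -> float:
    """Perfect agreement implied by per-condition raw agreement,
    ASSUMING disagreements are independent across conditions."""
    return prod(v / 100 for v in values)

all_12 = product_agreement(raw_agreement.values())

# Implied perfect agreement when each condition is left out in turn.
without = {
    cc: product_agreement(v for name, v in raw_agreement.items() if name != cc)
    for cc in raw_agreement
}

print(f"all 12 conditions: {all_12:.1%}")
print(f"without arthritis: {without['arthritis']:.1%}")
print(f"without COPD:      {without['COPD']:.1%}")
```

Under this assumption, 12 conditions that mostly agree more than 80% of the time still imply perfect agreement of only about 22%, in the same range as the observed 26.9%. The approximation also reproduces the ordering reported in the Results, with the removal of arthritis producing a much larger gain than the removal of COPD, although the exact gains differ from the observed ones because real disagreements are not independent.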
Strengths and limitations
This study has some limitations that need to be considered. We included a restricted list of 12 CCs. Although this list includes many of the most common CCs in Canada23 and those commonly included in multimorbidity research,24 we were limited to conditions that could be identified in both data sources. Many disease lists include more than 12 conditions and we do not know the level of agreement between the data sources on other CCs. As well, other methods to measure multimorbidity which rely on weighting24 or that include symptoms25 (which are less well captured in administrative data) will likely have even more complex issues related to agreement. Furthermore, we must take into account that not all administrative data are the same. For example, the use of electronic medical records may provide richer data on symptoms, and we know that CC prevalence estimates have been shown to increase with the number of administrative data sources used6,26 and with increasing duration of retrospective data observation.26 This further complicates our ability to understand how estimates vary across studies based solely on the kind of data used and underscores the importance of a clear and transparent process for choosing and reporting specific conditions and data sources included in multimorbidity research.27
Conclusion
Studying multimorbidity requires a much more nuanced approach to agreement across data sources than studying individual conditions. Each condition used to define multimorbidity has a differential impact on the degree of perfect agreement on the total number of CCs. This underlying complexity needs to be considered in future efforts to improve data for research purposes.
Acknowledgements
Dr Griffith acknowledges the support by the McLaughlin Foundation Professorship in Population and Public Health. Dr Gruneir also acknowledges the support by a Canadian Institutes of Health Research (CIHR) New Investigator Award. We thank IMS Brogan Inc. for the use of their Drug Information Database. The data used in this article are held securely at ICES and data sharing agreements prohibit making them publicly available. Access may be granted to those who meet criteria at www.ices.on.ca/DAS. We also thank Dr. Jennifer Salerno for reviewing the “Agreement between administrative data and self-report data and implications for characterizing multimorbidity and its effects” series.
Authors’ note: The opinions, results, and conclusions reported in this article are those of the authors and are independent from the funding sources. No endorsement by ICES or the Ontario Ministry of Health and Long-Term Care (Ontario MOHLTC) is intended or should be inferred. Parts of this material are based on data and/or information compiled and provided by CIHI. However, the analyses, conclusions, opinions, and statements expressed in the material are those of the author(s) and not necessarily those of CIHI. Parts of this material are based on data and information provided by Cancer Care Ontario (CCO). The opinions, results, view, and conclusions reported in this article are those of the authors and do not necessarily reflect those of CCO. No endorsement by CCO is intended or should be inferred.
Declaration of conflicting interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project was funded through an operating grant from the CIHR (funding reference number: FRN 130546). Dr Markle-Reid is funded through the Canada Research Chairs program. This study was supported by ICES, which is funded by an annual grant from the Ontario MOHLTC.
ORCID iDs:
Lauren E Griffith: https://orcid.org/0000-0002-2794-9692
Kathryn A Fisher: https://orcid.org/0000-0001-8342-1238
Maureen Markle-Reid: https://orcid.org/0000-0002-4019-7077
Jenny Ploeg: https://orcid.org/0000-0001-8168-8449
References
- 1. Griffith LE, Gilsing A, Mangin D, et al. Multimorbidity frameworks impact prevalence and relationships with patient-important outcomes. J Am Geriatr Soc 2019; 67(8): 1632–1640.
- 2. Mokraoui NM, Haggerty J, Almirall J, et al. Prevalence of self-reported multimorbidity in the general population and in primary care practices: a cross-sectional study. BMC Res Notes 2016; 9(1): 314.
- 3. Harrison C, Britt H, Miller G, et al. Examining different measures of multimorbidity, using a large prospective cross-sectional study in Australian general practice. BMJ Open 2014; 4(7): e004694.
- 4. Fortin M, Haggerty J, Sanche S, et al. Self-reported versus health administrative data: implications for assessing chronic illness burden in populations. A cross-sectional study. CMAJ Open 2017; 5(3): E729–E733.
- 5. Hansen H, Schafer I, Schon G, et al. Agreement between self-reported and general practitioner-reported chronic conditions among multimorbid patients in primary care—results of the MultiCare Cohort Study. BMC Fam Pract 2014; 15: 39.
- 6. Lujic S, Simpson JM, Zwar N, et al. Multimorbidity in Australia: comparing estimates derived using administrative data sources and survey data. PLoS One 2017; 12(8): e0183817.
- 7. Statistics Canada. Population estimates on July 1st, by age and sex. Statistics Canada 2019. https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=1710000501&pickMembers%5B0%5D=1.7&pickMembers%5B1%5D=2.1 (accessed 7 February 2020).
- 8. Gruneir A, Fisher K, Perez R, et al. Measuring multimorbidity series. An overlooked complexity—comparison of self-report vs. administrative data in community-living adults: paper 1. Introduction. J Clin Epidemiol. Epub ahead of print 28 April 2020 DOI: S0895-4356 (19)30589–X.
- 9. Griffith LE, Gruneir A, Fisher K, et al. Measuring multimorbidity series. An overlooked complexity—comparison of self-report vs. administrative data in community-living adults: paper 2. Prevalence estimates depend on the data source. J Clin Epidemiol 2020; DOI: S0895-4356 (19)30551–7.
- 10. Gruneir A, Griffith LE, Fisher K, et al. Measuring multimorbidity series. An overlooked complexity—comparison of self-report vs. administrative data in community-living adults: paper 3. Measuring agreement across data sources and implications for estimating associations with health. J Clin Epidemiol 2020; DOI: S0895-4356 (19)30547–5.
- 11. Statistics Canada. Canadian Community Health Survey (CCHS) Cycle 3.1: Public Use Micro Data File (PUMF) User Guide. Statistics Canada Ottawa: 2005. http://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=2264 (2006, accessed 7 February 2020).
- 12. Statistics Canada. Canadian Community Health Survey (CCHS) annual component: user guide, 2007-2008 microdata files. Statistics Canada Ottawa, ON, Canada: http://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=29539 (2009, accessed 7 February 2020).
- 13. Statistics Canada. Canadian Community Health Survey (CCHS) annual component: user guide 2010 and 2009-2010 microdata files. Statistics Canada Ottawa: http://www23.statcan.gc.ca/imdb-bmdi/document/3226_D7_T9_V8-eng.htm (2013, accessed 7 February 2020).
- 14. Statistics Canada. Canadian Community Health Survey (CCHS) annual component: user guide 2012 and 2011-2012 microdata files. Statistics Canada Ottawa: http://www23.statcan.gc.ca/imdb-bmdi/document/3226_D7_T9_V8-eng.htm (2013, accessed 7 February 2020).
- 15. Griffith LE, Gruneir A, Fisher K, et al. Patterns of health service use in community living older adults with dementia and comorbid conditions: a population-based retrospective cohort study in Ontario, Canada. BMC Geriatr 2016; 16(1): 177.
- 16. Fleiss JL. Statistical methods for rates and proportions. 2nd ed. New York: John Wiley & Sons, 1981.
- 17. SAS/STAT. Software [computer program]. Version 14.1. Cary NC: SAS Institute Inc, 2017.
- 18. Okura Y, Urban LH, Mahoney DW, et al. Agreement between self-report questionnaires and medical record data was substantial for diabetes, hypertension, myocardial infarction and stroke but not for heart failure. J Clin Epidemiol 2004; 57(10): 1096–1103.
- 19. Engstad T, Bonaa KH, Viitanen M. Validity of self-reported stroke: the Tromso study. Stroke 2000; 31(7): 1602–1607.
- 20. Lix LM, Yogendran MS, Shaw SY, et al. Population-based data sources for chronic disease surveillance. Chronic Dis Can 2008; 29(1): 31–38.
- 21. Simpson CF, Boyd CM, Carlson MC, et al. Agreement between self-report of disease diagnoses and medical record validation in disabled older women: factors that modify agreement. J Am Geriatr Soc 2004; 52(1): 123–127.
- 22. Fortin M, Stewart M, Poitras ME, et al. Systematic review of prevalence studies on multimorbidity: toward a more uniform methodology. Ann Fam Med 2012; 10(2): 142–151.
- 23. Canadian Institute for Health Information. Seniors and the health care system: What is the impact of multiple chronic conditions? Ottawa, Ontario: CIHI, 2011. Report No: https://secure.cihi.ca/estore/productFamily.htm?locale=en&pf=PFC1575 (accessed 7 February 2020).
- 24. Diederichs C, Berger K, Bartels DB. The measurement of multiple chronic diseases—a systematic review on existing multimorbidity indices. J Gerontol Ser A Biol Sci Med Sci 2011; 66(3): 301–311.
- 25. Willadsen TG, Bebe A, Koster-Rasmussen R, et al. The role of diseases, risk factors and symptoms in the definition of multimorbidity—a systematic review. Scand J Prim Health Care 2016; 34(2): 112–121.
- 26. Chen G, Lix L, Tu K, et al. Influence of using different databases and ‘look back’ intervals to define comorbidity profiles for patients with newly diagnosed hypertension: implications for health services researchers. PLoS One 2016; 11(9): e0162074.
- 27. Stewart M, Fortin M, Britt HC, et al. Comparisons of multi-morbidity in family practice—issues and biases. Fam Pract 2013; 30(4): 473–480.

