Obesity Science & Practice. 2020 Sep 4;6(6):677–693. doi: 10.1002/osp4.450

Validated methods for identifying individuals with obesity in health care administrative databases: A systematic review

Sékou Samadoulougou 1,2, Leanne Idzerda 1,2, Roxane Dault 3, Alexandre Lebel 1,2,4, Anne‐Marie Cloutier 3, Alain Vanasse 5,6
PMCID: PMC7746972  PMID: 33354346

Summary

Background

Health care administrative databases are increasingly used for health studies and public health surveillance. Individuals with obesity are selected from these databases using case‐identification methods. However, evidence on the validity of these methods is fragmentary, and validation is particularly challenging for obesity.

Objective

The objectives of this systematic review are (1) to determine the case‐identification methods used to identify individuals with obesity in health care administrative databases and (2) to summarize the validity of these case‐identification methods when compared with a reference standard.

Methods

A systematic literature search was conducted in six bibliographic databases for the period January 1980 to June 2019 for all studies evaluating obesity case‐identification methods compared with a reference standard.

Results

Seventeen articles met the inclusion criteria. International Classification of Diseases (ICD) codes were the only case‐identification method utilized in selected articles. The performance of obesity‐identification methods varied widely across studies, with positive predictive value ranging from 19% to 100% while sensitivity ranged from 3% to 92%. The sensitivity of these methods was usually low while the specificity was higher.

Conclusion

When obesity is reported in health care administrative databases, it is usually correctly reported; however, obesity tends to be highly underreported in databases. Therefore, case‐identification methods to monitor the prevalence and incidence of obesity within health care administrative databases are not reliable. In contrast, the use of these methods remains relevant for the selection of individuals with obesity for cohort studies, particularly when identifying cohorts of individuals with severe obesity or cohorts where obesity is associated with comorbidities.

Keywords: algorithm, case‐identification, databases, validation

1. INTRODUCTION

Health care administrative databases are extremely useful in many research settings. 1 , 2 They are increasingly being used in epidemiological and health care costing studies, for chronic disease surveillance, as well as for the study of burden of disease, the study of health service effectiveness, compliance and utilization and to examine population health status and its determinants. 1 , 3 , 4 , 5 , 6 , 7 , 8 Health care administrative databases are comprehensive, inexpensive and time saving. 9 When combined with population survey data, they can establish the prevalence and incidence of various diseases. 2 In general, these databases include information on medical procedures such as medical visits, emergency room visits, hospitalizations, prescription drugs and associated costs. In order to identify specific populations within these databases, case‐identification methods are utilized. These case‐identification methods may take the form of identification codes of the International Classification of Diseases (ICD) system, a drug code or a code related to the use of health services.

Given the rising burden of obesity as well as the fact that obesity is an important risk factor for numerous other chronic diseases, it would be particularly useful to rely on validated methods for identifying individuals with obesity in health care administrative databases. Previous studies have shown, however, that the use of case‐identification methods to identify individuals with obesity in health care administrative databases may lead to an imperfect estimate of the prevalence of obesity. 10 , 11 , 12 , 13 The coding of obesity in these databases has also been found to be inconsistent, incomplete or underreported. 12 , 14 , 15 , 16 , 17 In addition, the case‐identification methods used in recent studies have led to variable performance depending on the existence of other comorbidities, the severity of obesity and the type of health care administrative data. 12 , 18 , 19 , 20 , 21 , 22 , 23 Accurate case‐identification methods for health care administrative databases are essential to minimize classification errors, which can represent a significant form of bias related to the use of administrative databases.

One way to assess the accuracy of case‐identification methods is to compare them with a reference standard. For example, data on obesity in the health care administrative database could be compared with clinical data to determine their accuracy. If the health care administrative database and the clinical data agree on whether a patient has obesity, concordance (validity) is high.

The objective of this systematic review is to determine case‐identification methods used to identify individuals with obesity within health care administrative databases and to report the validity of these case‐identification methods when compared with a reference standard. No systematic review of this kind has been conducted to the authors' knowledge.

2. METHODS

2.1. Search strategy, data source and selection criteria

This review is one of a series of systematic reviews that aims to assess the case‐identification methods used to identify chronic diseases in health care administrative data. This series of systematic reviews was conducted by the Quebec Support for People and Patient‐Oriented Research and Trials (SUPPORT Unit) as part of its mandate to facilitate access to and use of health care administrative data. The PRISMA‐P guidelines as well as production standards of the systematic reviews of the National Institute of Excellence in Health and Social Services were followed in the execution of this review. 24 , 25

2.1.1. Data sources

Six bibliographic databases were systematically searched (Medline, CINAHL, PsycINFO, AgeLine, Social Gerontology and Embase, January 1980 to June 2019). PubMed was searched from July 2017 to June 2019 to capture the most recent citations, which may not have been indexed in bibliographic databases. Sources of grey literature from both Quebec provincial organizations (Institut national de santé publique du Québec, Institut national d'excellence en santé et services sociaux, Régie de l'assurance maladie du Québec, Institut de la statistique du Québec and Institut de recherche et d'informations socio‐économiques) and Canadian organizations (Canadian Institute for Health Information and Health Canada) were also reviewed.

2.1.2. Selection criteria

Studies were included if they met all of the following criteria: (1) reported at least one case‐identification method for identifying individuals with obesity in health care administrative databases, (2) validated the case‐identification method(s) against a reference standard, (3) reported at least one of the following key measures of diagnostic accuracy: sensitivity, specificity, positive predictive value (PPV) or negative predictive value (NPV), (4) were written in English or French. To ensure comparability, studies were excluded if only kappa values were reported or if regression models were used to identify cases of obesity. Articles that were not associated with an original study or detailed report (e.g., a research protocol and commentary) were also excluded. Combinations of the following three concepts were used:

  1. the health condition of interest (e.g., obesity, overweight and body mass index [BMI]);

  2. administrative health care data (e.g., administrative database and medico‐administrative data);

  3. case‐identification method validation measures (e.g., sensitivity, specificity, PPV and NPV).

The search strategy for each of the databases is available in Appendix A.

2.1.3. Data management and extraction

The identified articles were imported into bibliographic management software. All results were combined, and duplicates were removed. Data extraction was carried out by two members of the research team (RD and ECC). In order to ensure systematic and standardized collection, the extraction of information was accompanied by a set of specifications that described in detail the collection of each of the variables of interest. In addition, the assistants received prior training on the type of information collected and how to collect it.

Extracted data were entered into a standardized data extraction spreadsheet designed specifically for the objectives of this study. This spreadsheet included the following information: the definition of obesity utilized in the article, the obesity case‐identification method used, the reference population and the performance of the case‐identification method used (sensitivity, specificity, PPV, NPV, Cohen's kappa and area under the curve [AUC]). This review followed a similar approach to what has been used to validate other disease case‐identification methods for administrative health care databases. 26 , 27 , 28 , 29

The primary reviewer was responsible for the initial extraction while the secondary reviewer was responsible for validating the information collected in the data extraction spreadsheet. In the event of nonconsensus, a third member of the research team (AMC) was consulted.

2.2. Evaluation of the validity of obesity case‐identification methods

A case‐identification method refers to the method utilized for identifying specific cases of a disease or other health marker in a health care administrative database. Different terminologies are used for health diagnosis depending on the data source and country. This may include codes for medical procedures, interventions, drugs prescribed or diagnostic codes.

The accuracy of case‐identification methods to identify individuals with obesity in health care administrative databases was evaluated by comparing these methods with a reference sample. A reference sample may be medical records, clinical data, self‐reported outcomes or treatment registries. The primary validation statistics of interest were sensitivity, specificity, PPV and NPV (Table 1). For completeness, the AUC and Cohen's kappa coefficient were also recorded where studies reported them.

TABLE 1.

Indicators to assess the validity of case‐identification methods for identifying obesity cases in health care administrative data

| Term | Theoretical definition | Definition in context |
| --- | --- | --- |
| Sensitivity (Se) | TP/(TP + FN) | Proportion of individuals with a diagnosis of obesity in the reference sample who were identified with a code for obesity in the health care administrative database |
| Specificity (Sp) | TN/(TN + FP) | Proportion of individuals without a diagnosis of obesity in the reference sample who were not identified with a code for obesity in the health care administrative database |
| Positive predictive value (PPV) | TP/(TP + FP) | Probability that individuals identified as suffering from obesity using the case‐identification method are actually affected by obesity in the reference sample |
| Negative predictive value (NPV) | TN/(TN + FN) | Probability that individuals who have not been identified as affected by obesity in the health care administrative database are not affected by obesity in the reference sample |

Note: TP, true positives: individuals who have the condition and are identified as positive by the test. FN, false negatives: individuals who have the condition but are identified as negative by the test. TN, true negatives: individuals who do not have the condition and are correctly identified as negative by the test. FP, false positives: individuals who do not have the condition but are incorrectly identified as positive by the test.
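
As an illustration of the definitions in Table 1, the following Python sketch computes the four indicators from the cells of a 2 × 2 table. The class name, the function names and the counts are hypothetical; they are chosen only to show the arithmetic and are not taken from any of the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2 x 2 cross-classification of administrative code vs. reference standard."""
    tp: int  # coded as obese, obese in the reference standard
    fp: int  # coded as obese, not obese in the reference standard
    fn: int  # not coded, obese in the reference standard
    tn: int  # not coded, not obese in the reference standard


def _ratio(numerator, denominator):
    # Return None rather than raising when a margin of the table is empty.
    return numerator / denominator if denominator else None


def validity_indicators(c):
    """Return the four indicators of Table 1 as proportions between 0 and 1."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
    }


# Hypothetical counts (assumption): 100 obese and 400 non-obese individuals
# according to the reference standard.
example = ConfusionCounts(tp=20, fp=5, fn=80, tn=395)
indicators = validity_indicators(example)
```

With these hypothetical counts, only 20 of the 100 individuals with obesity carry a code, so sensitivity is low, while 20 of the 25 coded individuals truly have obesity, so PPV is high. This is the pattern reported by most of the included studies.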

2.3. Risk of bias analysis

An evaluation of the quality of studies was conducted by two authors (OSS and LI) using a modified version of the Quality Assessment of Diagnostic Accuracy Studies 2 evaluation tool for studies validating administrative data to assess risk of bias within and across studies. 30 This tool is summarized in Appendix B. The risk of bias tool consists of four key domains covering (1) patient selection, (2) the administrative database, (3) the reference standard and (4) flow of patients through the study and timing of the administrative database and reference standard. Each domain was assessed in terms of the risk of bias (graded as high, low or unclear), and the first three domains were also assessed in terms of concerns regarding applicability.

3. RESULTS

3.1. Study selection

The PRISMA study selection flow diagram is shown in Figure 1. A total of 1771 articles were identified. Of these, 1447 unique articles remained after deduplication. Each of the titles and abstracts was screened, and 47 articles were selected for full‐text review. Of these, 14 met the inclusion criteria. Three additional articles, identified by the snowball method or in the grey literature, were added to these articles. A total of 17 original studies that report validated methods for identifying individuals with obesity in health care administrative databases were included in this systematic review (Table 2).

FIGURE 1.

Flowchart of study selection and review

TABLE 2.

Characteristics of the studies included in the systematic review (n = 17)

| Authors (year) | Country | Population characteristics and study period | Reference standard |
| --- | --- | --- | --- |
| Caplan et al., 2018 22 | United States | Patients aged 20–89 years old enrolled in a Medicare Advantage Prescription Drug or in a commercial health plan offered by Humana and having a documented V85 code for obesity, 2010–2015 | Medical Chart review |
| McLynn et al., 2018 31 | United States | Patients who underwent elective posterior lumbar fusion with or without interbody graft at a large academic hospital, 2013–2016 | Medical Chart review |
| Ammann et al., 2017 10 | United States | Adults aged 20 years and older with commercial or Medicare Advantage insurance health plan, 2013, 2014 and 2016 | Medical Chart review |
| Chiu et al., 2017 32 | United States | Veterans of the VA Greater Los Angeles Health Administration with an ICD‐9 diagnosis of diabetes mellitus with visits to the eye clinic, 1999–2016 | Medical Chart review |
| Peng et al., 2017 16 | Canada | Patients aged 18 years and older from four adult teaching hospitals in Alberta, Canada, 2003 | Medical Chart review |
| Nickel et al., 2016 21 | United States | Privately insured women aged 18 to 64 years old who underwent mastectomy for breast cancer, 2004–2011 | Medical Chart review |
| Lau et al., 2015 20 | United States | Patients who underwent primary total joint arthroplasty from three high volume total joint arthroplasty centres, 2010–2014 | Medical Chart review |
| Lloyd et al., 2015 15 | United States | Elderly aged 65 years and older from the National Health and Nutrition Examination Survey, 1999–2004 | Clinical measurements |
| Samuel et al., 2015 33 | United States | Patients aged 18 years and older with proximal tibia fracture, 2011–2012 | Medical Chart review |
| Golinvaux et al., 2014 19 | United States | Patients aged 18 years and older from a large tertiary care medical centre who spent at least one night in the hospital as an inpatient, 2013 | Medical Chart review |
| Martin et al., 2014 12 | Canada | Patients aged 18 years and older from the Alberta Provincial Project for Outcomes Assessment in the Coronary Heart Disease (APPROACH) registry, 2002–2008 | Treatment Registry |
| Bozic et al., 2013 12 | United States | Patients who underwent total joint arthroplasty from three high‐volume institutions, 2009 | Medical Chart review |
| Andrade et al., 2011 14 | United States | Women aged 12 to 49 years old who delivered an infant in a hospital or whose pregnancy ended in an induced abortion or miscarriage, 2006–2008 | Medical Chart review |
| Kuhle et al., 2011 11 | Canada | Children aged 10–11 years old from the Children's Lifestyle and School Performance Study, 2002–2004 | Clinical measurements |
| Quan et al., 2008 13 | Canada | Patients aged 18 years and older from four adult teaching hospitals in Alberta, Canada, 2003 | Medical Chart review |
| Varas‐Lorenzo et al., 2008 34 | Canada | Saskatchewan Health beneficiaries aged 40 to 84 years old eligible for outpatient prescription drug benefits with a confirmed diagnosis of acute coronary syndrome (ICD‐9 410–411), 1999–2001 | Medical Chart review |
| Yasmeen et al., 2006 23 | United States | Women aged 10 to 55 years old who were discharged from a nonfederal licensed acute care hospital in California, after giving birth, 1992–1993 | Medical Chart review |

3.2. Study and population characteristics

The characteristics of the included studies are presented in Table 2. All studies were published in or after 2006. The majority of studies were conducted in the United States (n = 12/17) and the rest were conducted in Canada (n = 5/17). Three studies included only female populations, 14 , 21 , 23 one study included only children 11 and one study included only people aged 65 years and older. 15 The remaining studies focused on mixed populations. All studies were conducted according to a retrospective cohort design with the exception of Golinvaux et al., 19 which was a cross‐sectional survey.

3.3. Case‐identification methods

All studies used the ICD codes to identify individuals with obesity. No study identified individuals with obesity in health care administrative data based on any other codes, such as codes related to the prescription of drugs or the use of health services. The specific ICD codes utilized are outlined in Table 3.

TABLE 3.

Administrative codes and classification of obesity

| ICD edition | Obesity ICD code |
| --- | --- |
| ICD‐8 | Obesity: 277 |
| ICD‐9 codes | Obesity: 278.x, V85.3x, V85.4x. Severe obesity: 278.01, V85.4x. |
| ICD‐9‐CM | Underweight: 783.22, V85.0. Normal weight: V85.1. Overweight: 278.02, V85.21–V85.25. Obese: 278.00, 278.01, 278.03, V85.30–V85.39, V85.41–V85.45. Severe obese (class II): 278.01, V85.35–V85.39, V85.41–V85.45. Severe obese (class III): 278.01, V85.41–V85.45. |
| ICD‐10 codes | Obesity: E66.x, Z68.3x, Z68.4x. Severe obesity: E66.01, Z68.4x. |
| ICD‐10‐CM | Underweight: R63.6, Z68.1. Normal weight: Z68.20–Z68.24. Overweight: E66.3, Z68.25–Z68.29. Obese: E66.09, E66.1, E66.2, E66.8, E66.9, Z68.30–Z68.39, Z68.41–Z68.45. Severe obese (class II): E66.01, E66.2, Z68.35–Z68.39, Z68.41–Z68.45. Severe obese (class III): E66.01, E66.2, Z68.41–Z68.45. |
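
To make the use of these codes concrete, the following Python sketch flags a record as an obesity case when any of its diagnosis codes matches the patterns listed in the "ICD‐9 codes" and "ICD‐10 codes" rows of Table 3. It is a minimal illustration under stated assumptions: codes are assumed to be stored as strings with the decimal point, the trailing "x" in the table is interpreted as a prefix match, and the function and variable names are hypothetical rather than taken from any reviewed study.

```python
# Prefix patterns transcribed from the "ICD-9 codes" and "ICD-10 codes" rows
# of Table 3; "278.x" is interpreted as "any code starting with 278.".
OBESITY_PREFIXES = {
    "ICD-9": ("278.", "V85.3", "V85.4"),
    "ICD-10": ("E66.", "Z68.3", "Z68.4"),
}
SEVERE_OBESITY_PREFIXES = {
    "ICD-9": ("278.01", "V85.4"),
    "ICD-10": ("E66.01", "Z68.4"),
}


def normalize(code):
    # Assumption: codes may differ in case or carry stray whitespace.
    return code.strip().upper()


def has_obesity_code(codes, edition, severe=False):
    """Return True if at least one diagnosis code matches an obesity pattern."""
    table = SEVERE_OBESITY_PREFIXES if severe else OBESITY_PREFIXES
    prefixes = table[edition]
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return any(normalize(code).startswith(prefixes) for code in codes)


# Hypothetical record with one diabetes code and one obesity code.
record_codes = ["250.00", "278.00"]
flagged = has_obesity_code(record_codes, "ICD-9")
```

A database that stores codes without the decimal point, or that uses a national modification with different subcodes, would need its own pattern list.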

3.4. Reference standard

Obesity case‐identification methods (ICD codes) were compared with a reference standard to determine the level of congruence between health care administrative data and other sources of health data. Of the 17 studies reviewed, two compared case‐identification methods with clinical measurements (Table 4), 14 compared against medical records (Table 5) and one compared against a treatment registry (Table 6) to validate the obesity case‐identification method. No studies have compared health care administrative data with self‐reported obesity. All three reference standards defined adults with obesity as those having a BMI greater than or equal to 30 kg/m2. In the case of children, obesity was defined using appropriate BMI cut‐off points on CDC growth charts.
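
The adult reference definition can be sketched in a few lines of Python. The function names are hypothetical, the example measurements are invented, and the sketch applies to adults only; the growth‐chart definition used for children is not reproduced here.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_obese_adult(weight_kg, height_m, threshold=30.0):
    """Reference-standard label for adults: BMI at or above the threshold."""
    return bmi(weight_kg, height_m) >= threshold


# Invented measurements for illustration only.
label = is_obese_adult(95.0, 1.75)  # BMI is about 31.0
```

Passing `threshold=35.0` or `threshold=40.0` gives the stricter cut‐offs that appear in the BMI subgroups of Tables 4 and 5.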

TABLE 4.

Clinical measurements as gold standard (n = 2 studies)

Author, year Years of data collection Codes used Case identification method definitions N Se (95% CI) Sp (95% CI) PPV (95% CI) NPV (95% CI) Characteristics
Lloyd et al., 2015 1999–2007 ICD‐9‐CM: 278.0, 278.00, 278.01. ≥1 claim‐based diagnosis code for obesity
All 3554 18.4 (‐) 97.3 (‐) 73.6 (‐) 74.2 (‐) BMI ≥ 30
34.2 (‐) 95.8 (‐) 47.9 (‐) 92.7 (‐) BMI ≥ 35
Diagnosis of diabetes 520 34.4 (‐) 97.5 (‐) 91.7 (‐) 65.2 (‐) BMI ≥ 30
54.3 (‐) 93.3 (‐) 68.2 (‐) 88.6 (‐) BMI ≥ 35
Diagnosis of CHF 434 25.9 (‐) 96.5 (‐) 81.1 (‐) 69.8 (‐) BMI ≥ 30
51.9 (‐) 94.0 (‐) 54.2 (‐) 93.4 (‐) BMI ≥ 35
Diagnosis of COPD 237 38.8 (‐) 96.4 (‐) 76.1 (‐) 84.1 (‐) BMI ≥ 30
58.8 (‐) 93.0 (‐) 45.6 (‐) 95.8 (‐) BMI ≥ 35
Diagnosis of depression 177 25.2 (‐) 93.7 (‐) 57.6 (‐) 78.8 (‐) BMI ≥ 30
58.6 (‐) 93.2 (‐) 43.7 (‐) 96.2 (‐) BMI ≥ 35
Kuhle et al., 2011 2002–2004

ICD‐9: 278.

ICD‐10: E66–E68.

≥1 diagnosis code for obesity as a primary or secondary diagnosis from either a physician visit or a hospital stay. 3230 7.4 (5.3–9.9) 99.7 (99.4–99.9)

Abbreviations: BMI, body mass index; CHF, congestive heart failure; CI, confidence interval; COPD, chronic obstructive pulmonary disease; ICD‐9, International Classification of Diseases, 9th Revision; ICD‐9‐CM, International Classification of Diseases, 9th Revision, Clinical Modification; ICD‐10, International Classification of Diseases, 10th Revision; N, number; NPV, negative predictive value; PPV, positive predictive value; Se, sensitivity; Sp, specificity.

TABLE 5.

Medical records review as gold‐standard (n = 14 studies)

Author, year Years of data collection Codes used Case identification method definitions N Se (95% CI) Sp (95% CI) PPV (95% CI) NPV (95% CI) Characteristics
Ammann et al., 2018 2013, 2014, 2016

ICD‐9‐CM codes a

ICD‐10‐CM codes b

≥1 ICD‐9‐CM or ICD‐10‐CM diagnosis code for obesity during the calendar years 2013, 2014 and 2016. 1 116 283 c 10.7 (‐) 49.0 (‐) 98.8 (‐) BMI < 18.5
3.7 (‐) 89.6 (‐) 73.9 (‐) BMI 18.5–24
6.0 (‐) 73.4 (‐) 67.6 (‐) BMI 25–29
25.2 (‐) 92.4 (‐) 68.0 (‐) BMI ≥ 30
22.8 (‐) 89.1 (‐) BMI ≥ 35
34.8 (‐) 69.1 (‐) BMI ≥ 40
2013, 2014 ≥1 ICD‐9‐CM diagnosis code for obesity during the calendar years 2013 and 2014. 676 989 c 8.6 (‐) 59.0 (‐) BMI < 18.5
2.7 (‐) 89.2 (‐) BMI 18.5–24
4.3 (‐) 69.9 (‐) BMI 25–29
21.8 (‐) 92.0 (‐) BMI ≥ 30
2016 ≥1 ICD‐10‐CM diagnosis code for obesity during the calendar year 2016. 439 294 c 13.9 (‐) 42.1 (‐) BMI < 18.5
5.2 (‐) 89.9 (‐) BMI 18.5–24
8.6 (‐) 76.5 (‐) BMI 25–29
30.2 (‐) 92.9 (‐) BMI ≥ 30
Caplan et al., 2018 2010–2015

ICD‐9‐CM

V85.0, V85.1, V85.2x, V85.3x, V85.4, V85.4x

≥1 claim‐based code for obesity documented during the index period. 207 90.3 (86.3–94.4) MAPD, all BMI
71.0 (55.0–87.0) MAPD, BMI < 18.5
93.8 (85.4–100) MAPD, BMI 18.5–24
97.4 (92.5–100) MAPD, BMI 25–29
96.9 (90.9–100) MAPD, BMI ≥ 30
97.0 (91.1–100) MAPD, BMI ≥ 35
85.0 (73.3–96.1) MAPD, BMI ≥ 40
21 91.1 (87.3–94.9) Commercial, all BMI
75.9 (60.3–91.4) Commercial, BMI < 18.5
87.8 (77.8–97.8) Commercial, BMI 18.5–24
93.5 (84.9–100) Commercial, BMI 25–29
97.2 (91.9–100) Commercial, BMI ≥ 30
93.0 (85.4–100) Commercial, BMI ≥ 35
97.1 (91.4–100) Commercial, BMI ≥ 40
McLynn et al., 2018 2013–2016

ICD‐9 codes d

ICD‐10 codes e

≥1 ICD‐9 or ICD‐10 code for obesity during the study period. 796 42.5 (‐) 99.2 (‐) All
54.1 (‐) DM (yes)
38.3 (‐) DM (no)
49.6 (‐) ASA class III (yes)
33.3 (‐) ASA class III (no)
69.2 (‐) Post‐op VTE (yes)
41.6 (‐) Post‐op VTE (no)
57.9 (‐) Major AE (yes)
40.9 (‐) Major AE (no)
53.8 (‐) Any AE (yes)
40.3 (‐) Any AE (no)
Chiu et al., 2017 1999–2016 ICD‐9 codes not specified ≥1 diagnosis code for obesity at the follow up time point 100 50 (45–92) 95 (88–98) 91 (81–97) 63 (54–71)
Peng et al., 2017 2003 ICD‐10‐CA codes not specified ≥1 diagnosis code for obesity. 4007 89.6 (77.3–96.5) 91.5 (79.6–97.6) All records
3891 18.3 (14.2–22.9) 84.3 (73.6–91.9) Among records without a status of death
105 30.0 (6.7–65.2) 75.0 (19.4–99.4) Among records with a status of death
Nickel et al., 2016 2004–2011 ICD‐9‐CM: 278.00, 278.01, 278.03, 649.10–649.14, 793.91, V85.30–V85.39, V85.41–V85.45. ≥2 provider or outpatient facility claims for obesity spaced ˃30 days apart or ≥1 inpatient claim for obesity during the 1‐year period before the mastectomy 174 7.14 (‐) 100 (‐) 100 (‐) 69.41 (‐)
≥2 provider or outpatient facility claims for obesity spaced ˃30 days apart or ≥1 inpatient claim for obesity during the 1‐year period before the mastectomy and the 1‐week period after the mastectomy 174 12.50 (‐) 100 (‐) 100 (‐) 70.66 (‐)
≥1 provider or outpatient facility claim for obesity or ≥1 inpatient claim for obesity during the 1‐year period before the mastectomy and the 1‐week period after the mastectomy. 174 17.86 (‐) 99.15 (‐) 90.91 (‐) 71.78 (‐)
Samuel et al., 2015 2011–2012 ICD‐9: 278.00, 278.01. ≥1 diagnosis code for obesity. 32 411 8.9 (‐)
Lau et al., 2015 2010–2014 ICD‐9: V85.xx ≥1 diagnostic code for obesity. 315 100 (‐) THA
100 (‐) THA and BMI ≥ 40
442 91.5 (‐) TKA
90.6 (‐) TKA and BMI ≥ 40
ICD‐9: 278.xx ≥1 diagnostic code for obesity. 315 67.7 (‐) THA
50.0 (‐) THA and BMI ≥ 40
442 65.6 (‐) TKA
78.1 (‐) TKA and BMI ≥ 40
Golinvaux et al., 2014 2013 ICD‐9: 278.00 1 diagnosis code for unspecified obesity 2075 19 (‐) 97 (‐) 70 (‐) 76 (‐)
ICD‐9: 278.01 1 diagnosis code for severe obesity 2075 48 (‐) 99 (‐) 81 (‐) 95 (‐)
Bozic et al., 2013 2009 ICD‐9‐CM: 278.00, 278.01, V85.3, V85.4. ≥1 diagnostic code for obesity during the study period. 1350 76.4 (‐) 91.8 (‐) 49.2 (‐)
Andrade et al., 2011 2006–2008 ICD‐9‐CM: 278, 278.0, 278.00–278.02, 649.10–649.14. ≥1 diagnosis code indicating obesity or overweight status during the 1‐year period prior to end date of pregnancy. 18 312 33 (32–35) 99 (99–99) 93 (92–96)
Quan et al., 2008 2003 ICD‐10‐CA codes not specified ≥1 ICD‐10‐CA code for obesity (original coding). 4008 18.6 (‐) 99.7 (‐) 83.8 (‐) 93.1 (‐)
ICD‐9‐CM codes not specified ≥1 ICD‐9‐CM code for obesity recoded from the original coding. 4008 24.6 (‐) 99.3 (‐) 75.9 (‐) 93.6 (‐)
Varas‐Lorenzo et al., 2008 1999–2001 ICD‐8: 277. ICD‐9: 278. ≥1 code for obesity recorded in the hospital or medical services databases at hospital admission for confirmed ACS 431 43.9 (34.3–53.8) 86.2 (81.4–90.2) 25.5 (19.1–32.0) 35.7 (31.6–39.2)
Yasmeen et al., 2006 1992–1993 ICD‐9‐CM: 278 ≥1 code for obesity 1611 12 (‐) 62 (‐) Unweighted estimates
11 (‐) 49 (‐) Weighted estimates
ICD‐9‐CM: 646.1x ≥1 code for excessive weight gain 1611 5 (‐) 29 (‐) Unweighted estimates
3 (‐) 19 (‐) Weighted estimates
ICD‐9‐CM: 278, 646.1x ≥1 code for obesity or excessive weight gain 1611 14 (‐) 83 (‐) Unweighted estimates
13 (‐) 71 (‐) Weighted estimates

Abbreviations: ACS, acute coronary syndrome; BMI, body mass index; CI, confidence interval; ICD‐8, International Classification of Diseases, 8th Revision; ICD‐9, International Classification of Diseases, 9th Revision; ICD‐9‐CM, International Classification of Diseases, 9th Revision, Clinical Modification; ICD‐10‐CA, International Classification of Diseases, 10th Revision, Canada; ICD‐10‐CM, International Classification of Diseases, 10th Revision, Clinical Modification; N, number; NPV, negative predictive value; PPV, positive predictive value; Se, sensitivity; Sp, specificity; TKA, total knee arthroplasty; THA, total hip arthroplasty.

a ICD‐9‐CM codes: Underweight: 783.22, V85.0. Normal weight V85.1; Overweight 278.02, V85.21–V85.25; Obese 278.00, 278.01, 278.03, V85.30–V85.39, V85.41–V85.45; Severe obese (class II) 278.01, V85.35–V85.39, V85.41–V85.45; Severe obese (class III) 278.01, V85.41–V85.45.

b ICD‐10‐CM codes: Underweight R63.6, Z68.1; Normal weight Z68.20–Z68.24; Overweight E66.3, Z68.25–Z68.29; Obese E66.09, E66.1, E66.2, E66.8, E66.9, Z68.30–Z68.39, Z68.41–Z68.45; Severe obese (class II) E66.01, E66.2, Z68.35–Z68.39, Z68.41–Z68.45; Severe obese (class III) E66.01, E66.2, Z68.41–Z68.45.

c Person‐years.

d ICD‐9 codes: Obesity: 278.x, V85.3x, V85.4x. Severe obesity: 278.01, V85.4x.

e ICD‐10 codes: Obesity: E66.x, Z68.3x, Z68.4x. Severe obesity: E66.01, Z68.4x.
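
Some of the case definitions in Table 5 combine claim counts, care settings and time windows. As an example, the first definition listed for Nickel et al. (at least two provider or outpatient facility claims for obesity more than 30 days apart, or at least one inpatient claim, during the year before the index procedure) could be sketched in Python as follows. The data layout, the setting labels, the function name and the exact handling of window boundaries are assumptions made for illustration and may differ from the original study.

```python
from datetime import date, timedelta


def meets_claims_definition(claims, index_date, lookback_days=365, min_gap_days=30):
    """Apply a '>=2 outpatient claims >30 days apart or >=1 inpatient claim' rule.

    claims: iterable of (service_date, setting) pairs for obesity-coded claims,
    where setting is "inpatient" or "outpatient" (provider claims are assumed
    to be labelled "outpatient" here).
    """
    window_start = index_date - timedelta(days=lookback_days)
    # Assumption: the window includes its start date and excludes the index date.
    in_window = [(d, s) for d, s in claims if window_start <= d < index_date]

    if any(setting == "inpatient" for _, setting in in_window):
        return True

    outpatient_dates = sorted(d for d, s in in_window if s == "outpatient")
    if len(outpatient_dates) < 2:
        return False
    # Two claims more than min_gap_days apart exist exactly when the earliest
    # and the latest claim are more than min_gap_days apart.
    return (outpatient_dates[-1] - outpatient_dates[0]).days > min_gap_days


# Invented claims for illustration only.
index = date(2010, 6, 1)
case = meets_claims_definition(
    [(date(2010, 1, 5), "outpatient"), (date(2010, 3, 1), "outpatient")], index
)
```

Requiring two spaced outpatient claims trades sensitivity for specificity, which is consistent with the lower sensitivity and the 100% PPV reported for this definition in Table 5.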

TABLE 6.

Registry as gold‐standard (n = 1 study)

Author, year Years of data collection Codes used Case identification method definitions N Se (95% CI) Sp (95% CI) PPV (95% CI) NPV (95% CI) Characteristics
Martin et al., 2014 2002–2008 ICD‐10: E65–E68. ≥1 diagnostic code for obesity in any position in the 25 diagnosis coding fields.
Overall 17 380 7.75 (‐) 98.98 (‐) 65.94 (61.38–70.51) 80.84 (‐) 2002–2008
Year 17 380 8.24 (‐) 98.49 (‐) 58.82 (48.36–69.29) 80.41 (‐) 2002
9.48 (‐) 98.58 (‐) 61.25 (50.57–71.93) 82.16 (‐) 2003
8.69 (‐) 99.14 (‐) 73.13 (62.52–83.75) 80.02 (‐) 2004
5.81 (‐) 99.34 (‐) 69.05 (55.07–83.03) 80.60 (‐) 2005
6.44 (‐) 99.53 (‐) 76.92 (63.70–90.15) 81.38 (‐) 2006
5.87 (‐) 99.14 (‐) 63.41 (48.67–78.16) 80.64 (‐) 2007
9.37 (‐) 98.78 (‐) 66.67 (54.74–78.59) 80.67 (‐) 2008
Age 17 380 7.58 (‐) 99.03 (‐) 68.57 (‐) 79.27 (‐) <55 years old
7.63 (‐) 98.77 (‐) 63.64 (‐) 79.19 (‐) 55–65 years old
8.02 (‐) 99.06 (‐) 69.49 (‐) 80.08 (‐) 65–75 years old
7.76 (‐) 99.08 (‐) 60.00 (‐) 85.85 (‐) ≥75 years old
Gender 17 380 9.84 (‐) 98.79 (‐) 76.26 (‐) 73.52 (‐) Female
6.14 (‐) 99.06 (‐) 56.48 (‐) 84.11 (‐) Male
Indication for coronary catheterization 17 380 8.77 (‐) 98.92 (‐) 68.99 (‐) 79.81 (‐) Stable angina
7.85 (‐) 98.92 (‐) 63.19 (‐) 82.01 (‐) MI
6.88 (‐) 98.85 (‐) 63.74 (‐) 78.28 (‐) Unstable angina
6.52 (‐) 99.54 (‐) 74.19 (‐) 83.87 (‐) Other
Diabetes 17 380 6.18 (‐) 99.19 (‐) 62.55 (‐) 82.86 (‐) No
11.39 (‐) 98.08 (‐) 70.76 (‐) 73.05 (‐) Yes
CVD 17 380 7.49 (‐) 98.98 (‐) 65.07 (‐) 80.85 (‐) No history
10.94 (‐) 99.00 (‐) 74.36 (‐) 80.73 (‐) History
CHF 17 380 7.64 (‐) 98.98 (‐) 65.73 (‐) 80.74 (‐) No
8.46 (‐) 98.99 (‐) 67.24 (‐) 81.48 (‐) Yes
Hypertension 17 380 7.80 (‐) 99.18 (‐) 63.87 (‐) 85.27 (‐) No
7.73 (‐) 98.86 (‐) 66.78 (‐) 78.36 (‐) Yes
Hyperlipidaemia 17 380 7.83 (‐) 98.87 (‐) 65.11 (‐) 79.91 (‐) No
7.49 (‐) 99.27 (‐) 68.82 (‐) 83.27 (‐) Yes
Prior MI 17 380 7.43 (‐) 98.99 (‐) 65.66 (‐) 80.59 (‐) No
8.73 (‐) 98.95 (‐) 66.96 (‐) 81.60 (‐) Yes

Abbreviations: BMI, body mass index; CHF, congestive heart failure; CI, confidence interval; CVD, cerebrovascular diseases; ICD‐10, International Classification of Diseases, 10th Revision; N, number; MI, myocardial infarction; NPV, negative predictive value; PPV, positive predictive value; Se, sensitivity; Sp, specificity.

3.5. Comorbidities

In six of the 17 studies, the case‐identification methods were designed to select individuals with obesity with other specified chronic conditions, including pain, 19 , 31 diabetes, 32 fractures, 33 arthroplasty 20 and acute coronary syndromes. 23 In these cases, obesity was assessed as a comorbidity.

3.6. Statistical validity

The statistical validity of the case‐identification methods was assessed using four indicators: sensitivity (n = 16/17), specificity (n = 13/17), PPV (n = 13/17) and NPV (n = 12/17). Three studies also carried out concordance analysis, specifically Cohen's kappa test (n = 2) 11 , 18 and the calculation of the AUC (n = 1). 20
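
For readers less familiar with concordance analysis, Cohen's kappa can be computed from the same 2 × 2 table used for the four indicators. The Python sketch below uses the standard definition, kappa = (observed agreement − chance agreement)/(1 − chance agreement); the function name and the counts are hypothetical and do not come from the included studies.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for agreement between administrative code and reference."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("the 2 x 2 table is empty")
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals of the table.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts (assumption): low sensitivity, high specificity.
kappa = cohens_kappa(tp=20, fp=5, fn=80, tn=395)
```

With these hypothetical counts, overall agreement is 83% but kappa is only about 0.26, because most of the agreement comes from the large group without obesity.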

Table 4 compares obesity case‐identification methods in health care administrative databases with clinical measures (n = 2 studies). These methods had overall high performance in terms of specificity, PPV and NPV. On the other hand, the sensitivity values recorded were very low. Even when the analyses were restricted to individuals with other comorbidities (diabetes, depression, chronic obstructive pulmonary disease, congestive heart failure), sensitivity reached at most 38.8% for the BMI ≥ 30 threshold and 58.8% for the BMI ≥ 35 threshold.

Table 5 compares obesity case‐identification methods for health care administrative databases with medical records (n = 14). These methods showed low sensitivity values (always less than 50%) except in two studies. 12 , 32 On the other hand, the specificities were always above 50% with values reaching 100%. The PPV fluctuated from 19% to 100%, and NPV fluctuated from 36% to 99%. The case‐identification methods were more effective in detecting cases of obesity when combined with other comorbidities. The best performances in detecting cases of obesity, considering both sensitivity and specificity, were observed in a group of 50 patients also suffering from diabetic retinopathy (sensitivity = 50%, specificity = 95%). 32

The only study that compared obesity identification methods with data from a treatment registry (Table 6) reported sensitivity values that were less than 12% regardless of the subgroup considered (age, gender and associated comorbidities). Conversely, the specificity values obtained in this study were high (above 98%). Similarly, the PPV and NPV were also high.
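
The practical consequence of low sensitivity for surveillance can be shown with a short calculation. The proportion of individuals flagged by a code (the apparent prevalence) equals Se × p + (1 − Sp) × (1 − p), where p is the true prevalence. The Python sketch below applies this identity to the overall sensitivity and specificity reported in Table 6; the true prevalence of 20% is an assumption chosen for illustration, not a value taken from the study.

```python
def apparent_prevalence(true_prevalence, sensitivity, specificity):
    """Share of the population flagged by the case-identification method."""
    if not 0 <= true_prevalence <= 1:
        raise ValueError("prevalence must be between 0 and 1")
    true_positives = sensitivity * true_prevalence
    false_positives = (1 - specificity) * (1 - true_prevalence)
    return true_positives + false_positives


# Se and Sp are the overall values in Table 6; the prevalence is hypothetical.
flagged_share = apparent_prevalence(
    true_prevalence=0.20, sensitivity=0.0775, specificity=0.9898
)
```

Under this assumption, roughly 2.4% of the population would carry an obesity code although 20% have obesity, which illustrates why such methods underestimate prevalence even when specificity and PPV are high.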

3.7. Validation of methods for identifying severe obesity

Among the articles included in this review, six studies evaluated the statistical validity of methods in people suffering from severe obesity. 10 , 15 , 19 , 20 , 22 , 31 In individuals with a BMI between 35 and 39 kg/m2, sensitivity varied (depending on the studies and association with other comorbidities) from 22.8% to 58.8%, whereas specificity always exceeded 93%. In individuals with a BMI greater than or equal to 40 kg/m2, sensitivity values ranged from 33.3% to 49.6%. The reporting of validation statistics was often incomplete in the case of severe obesity, likely due to the small number of cases.

3.8. Risk of bias analysis

This review assessed the quality of the included studies using a modified version of the QUADAS‐2 tool. This assessment found that most studies did not include enough information to determine whether the risk of bias was low or high. In cases where enough information was presented, the risk of bias was generally reported as being low.

4. DISCUSSION

The objectives of this systematic review were to (1) determine the case‐identification methods used to identify individuals with obesity in health care administrative databases and (2) to summarize the validity of these case‐identification methods when compared with a reference standard. Understanding the validity of case‐identification methods for identifying individuals with obesity in health care administrative databases is critical to improving surveillance and research studies that utilize these databases. In order to understand the validity of case‐identification methods, the main statistical indicators (sensitivity, PPV, specificity and NPV) that are used to evaluate the performance of these case‐identification methods in the literature were reviewed. The validity of these case‐identification methods was then summarized and compared with a reference standard, in order to assess the implications of using health care administrative databases for epidemiological surveillance and other research activities.

The sensitivity of the case‐identification methods when compared with all three reference standards was generally very low. This means that only a small proportion of individuals identified as having obesity in the reference sample were also identified as having obesity in the health care administrative database. This low sensitivity could be explained by the fact that obesity is often not coded by physicians in health records and registers. This may be due to a lack of time, to the fact that obesity is not always considered a disease, to there being room for only one primary diagnostic code, or to the diagnosis of obesity being based on subjective observation by the clinician rather than on an objective measure of BMI, thus capturing only cases of severe obesity. 11 Finally, in countries such as the United States that rely on reimbursement from insurance companies, obesity‐related comorbidities are often listed as the primary diagnosis because obesity is not reimbursable. It should also be noted that obesity was only recognized as a disease by the American Medical Association in 2013, which falls within the search period. This may influence the results of this study as well as future studies of this issue. 35 The consequences of the poor sensitivity of obesity‐identification methods are twofold. First, it may lead to underestimation of the prevalence of obesity in administrative databases. Second, it may lead to an inaccurate estimation of health care costs associated with obesity from health care administrative databases. To illustrate this point, Kuhle 11 showed that when comparing the costs of health care usage in children with obesity and normal weight children, costs were 16% higher in children with clinically measured obesity. This cost differential increased to 108% when the analysis was based on data extracted from a health care administrative database using case‐identification methods based on ICD codes. Thus, using only obesity‐identification codes to select individuals with obesity in a health care administrative database severely overestimated health care costs for children with obesity. In conclusion, the low sensitivity of the case‐identification methods suggests that they should not be used for monitoring the prevalence or incidence of obesity.
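
The link between low sensitivity and underestimated prevalence can be made explicit with the standard relation between true and apparent prevalence for an imperfect classifier. The sketch below is illustrative only: the function name and the input values are assumptions, not estimates from the included studies.

```python
def apparent_prevalence(true_prevalence: float,
                        sensitivity: float,
                        specificity: float) -> float:
    """Prevalence that would be observed using an imperfect case definition.

    Coded cases are the true cases that are detected plus the non-cases
    that are wrongly coded:
        apparent = Se * p + (1 - Sp) * (1 - p)
    """
    detected_true_cases = sensitivity * true_prevalence
    false_positives = (1 - specificity) * (1 - true_prevalence)
    return detected_true_cases + false_positives


# Hypothetical inputs: true prevalence 25%, sensitivity 15%, specificity 99%.
observed = apparent_prevalence(0.25, sensitivity=0.15, specificity=0.99)
print(f"Apparent prevalence: {observed:.1%}")
```

Under these assumed values, the coded prevalence is several times lower than the true prevalence, even though specificity is close to 100%.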

Conversely, PPV was high in the majority of studies. In other words, if an individual is coded as having a BMI greater than or equal to 30 kg/m2 in a health care administrative database, they are very likely to have a BMI greater than or equal to 30 kg/m2 according to the reference standard. Thus, when an obesity code is present in administrative data, it is usually accurate. Several authors point to the possible explanation that physicians are more likely to diagnose obesity if multiple, potentially obesity‐related comorbidities are present or alternatively if the person suffered from severe obesity. 11 , 14 , 15 , 16 , 18 , 19 , 20 , 21 Case‐identification methods can therefore be used to identify, with good precision, cohorts of individuals suffering from severe obesity from administrative databases. These cohorts will likely be made up of individuals who are ill or who are affected by severe obesity.

The specificity of case‐identification methods for the identification of individuals with obesity was generally higher than the sensitivity. This means that a large proportion of individuals without a diagnosis of obesity in the reference sample also did not have a diagnosis of obesity in the health care administrative database. This finding confirms the fact that the encoding of obesity in administrative databases is incomplete. Indeed, the weight status of individuals with obesity is rarely entered in administrative databases. This poor completeness of reporting of individuals with obesity leads to a significant bias in the estimation of the negative predictive value. The NPV, which is very poorly reported in these studies, should be interpreted with caution.
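
One reason for caution is that predictive values depend on the prevalence of obesity in the validation sample as well as on sensitivity and specificity. The following sketch applies Bayes' theorem to show this dependence; the function name and all input values are hypothetical and are not drawn from the included studies.

```python
def predictive_values(prevalence: float,
                      sensitivity: float,
                      specificity: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a given prevalence, sensitivity and specificity."""
    # Expected proportions of the population in each cell of the 2 x 2 table.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Same hypothetical method (Se = 15%, Sp = 99%) applied to two samples
# that differ only in how common obesity is.
for prevalence in (0.10, 0.40):
    ppv, npv = predictive_values(prevalence, sensitivity=0.15, specificity=0.99)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity held low, the NPV falls as prevalence rises, so NPV values from validation samples with different obesity prevalence are not directly comparable.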

The main contribution of this article is the rigorous manner in which the literature has been searched to identify case‐identification methods compared with a reference standard using identified measures of performance. The main limitation of this research is that none of the available reference standards are validated population measures of obesity. They also suffer from many of the same reliability issues as health care administrative databases such as issues with data entry and reliability of entry as were discussed above.

Despite these limitations, some conclusions could be drawn. Overall, several factors that are likely to positively affect the accuracy of the case‐identification methods were identified. Case‐identification methods tended to be more accurate when reported with comorbidities associated with obesity and in cases of severe obesity. 10 , 15 , 19 , 20 , 22 , 23 , 31 , 32 , 33 This may be explained by the fact that subjects with severe obesity or comorbidities are more likely to be coded as such in administrative databases because the patient's weight has direct relevance to the disease of interest. This may lead to a positive bias in the coding of obesity for these groups.

5. CONCLUSION

The performance of case‐identification methods for identifying individuals with obesity in administrative databases was inconsistent. The sensitivity of these methods was usually low while the specificity was higher. Thus, when obesity is reported in health care administrative databases, it is usually correctly reported; however, obesity tends to be highly underreported in these databases. Therefore, case‐identification methods to monitor the prevalence and incidence of obesity within health care administrative databases should be used with caution. Despite this, the use of these methods remains relevant for the selection of cohorts of individuals with obesity for research activities. Further research on how to improve reporting of obesity in health care administrative databases is urgently required.

CONFLICT OF INTEREST

No conflict of interest was declared.

ACKNOWLEDGEMENTS

We would like to acknowledge Elizabeth Cauvier Charest, a research intern who contributed to the study selection and the data extraction. This study was supported by the Quebec SPOR support Unit, an initiative funded by the Canadian Institutes of Health Research (CIHR), the Ministère de la santé et des services sociaux du Québec and the Fonds de recherche du Québec ‐ Santé. These three funders were not involved in the design, the data collection, the interpretation or the publication of the review.

APPENDIX A. SEARCH STRATEGY USED ACCORDING TO THE DATABASE

Medline
Obesity: (MH ‘obesity’) OR (MH ‘overweight’) OR obes* OR overweight* OR malnutrition OR ‘body mass index*’ OR ‘BMI’ OR ‘waist‐hip ratio*’ OR ‘waist hip ratio*’ OR ‘waist circumference’ OR ‘abdominal fat’
AND
Case‐identification method validation: (MH ‘Sensitivity and Specificity’) OR specificit* OR sensitivit* OR ‘predictive value*’ OR ‘positive predictive value*’ OR ‘ppv’ OR ‘negative predictive value*’ OR ‘npv’ OR valid* OR ‘roc curve*’ OR ‘roc’ OR ‘receiver operating characteristic*’ OR ‘auc’ OR ‘area under curve*’ OR kappa*
AND
Administrative data: (MH ‘Health Information Systems’) OR (MH ‘Billing and Claims’) OR (MH ‘Coding’) OR (MH ‘Databases, Factual’) OR ‘administrative data*’ OR ‘medico‐administrative data*’ OR ‘administrative register data*’ OR ‘health* administrative data*’ OR ‘administrative code*’ OR ‘medico‐administrative code*’ OR ‘health* administrative code*’ OR ‘health* data*’ OR ‘billing data*’ OR ‘billing code*’ OR claim*

CINAHL, PsycINFO, Abstract in Social Gerontology, Ageline
Obesity: obes* OR overweight* OR malnutrition OR ‘mass index*’ OR ‘BMI’ OR ‘waist‐hip ratio*’ OR ‘waist hip ratio*’ OR ‘waist circumference’ OR ‘abdominal fat’
AND
Case‐identification method validation: specificit* OR sensitivit* OR ‘predictive value*’ OR ‘positive predictive value*’ OR ‘ppv’ OR ‘negative predictive value*’ OR ‘npv’ OR valid* OR ‘roc curve*’ OR ‘roc’ OR ‘receiver operating characteristic*’ OR ‘auc’ OR ‘area under curve*’ OR kappa*
AND
Administrative data: ‘administrative data*’ OR ‘medico‐administrative data*’ OR ‘administrative register data*’ OR ‘health* administrative data*’ OR ‘administrative code*’ OR ‘medico‐administrative code*’ OR ‘health* administrative code*’ OR ‘health* data*’ OR ‘billing data*’ OR ‘billing code*’ OR claim*

PubMed
Obesity: obesity OR obese OR overweight OR overweighted OR malnutrition OR ‘body mass index’ OR ‘body mass indexes’ OR bmi OR ‘waist‐hip ratio’ OR ‘waist hip ratio’ OR ‘waist circumference’ OR ‘abdominal fat’
AND
Case‐identification method validation: specificity OR sensitivity OR ‘predictive value’ OR ‘predictive values’ OR ‘positive predictive value’ OR ‘positive predictive values’ OR ‘ppv’ OR ‘negative predictive value’ OR ‘negative predictive values’ OR ‘npv’ OR validation OR validity OR ‘roc curve’ OR ‘roc curves’ OR ‘roc’ OR ‘receiver operating characteristic’ OR ‘receiver operating characteristics’ OR ‘auc’ OR ‘area under curve’ OR ‘area under curves’ OR kappa OR kappas
AND
Administrative data: ‘administrative data’ OR ‘administrative database’ OR ‘medico‐administrative data’ OR ‘medico‐administrative database’ OR ‘administrative register data’ OR ‘administrative register database’ OR ‘healthcare administrative data’ OR ‘healthcare administrative database’ OR ‘healthcare administrative data’ OR ‘healthcare administrative database’ OR ‘administrative code’ OR ‘administrative codes’ OR ‘medico‐administrative code’ OR ‘medico‐administrative codes’ OR ‘healthcare administrative code’ OR ‘healthcare administrative codes’ OR ‘healthcare administrative code’ OR ‘healthcare administrative codes’ OR ‘health data’ OR ‘health database’ OR ‘healthcare data’ OR ‘healthcare database’ OR ‘billing data’ OR ‘billing database’ OR ‘billing code’ OR ‘billing codes’ OR claim OR claims

Embase
Obesity: ‘obesity’/exp OR ‘obese*’:ti,ab,kw OR ‘obesit*’:ti,ab,kw OR ‘overweight’:ti,ab,kw OR ‘overweighted’:ti,ab,kw OR ‘malnutrition’:ti,ab,kw OR ‘body mass index*’:ti,ab,kw OR ‘bmi’:ti,ab,kw OR (‘waist‐hip’ NEAR/1 ‘ratio*’):ti,ab,kw OR (‘waist hip’ NEAR/1 ‘ratio*’):ti,ab,kw OR (‘waist’ NEAR/1 ‘circumference*’):ti,ab,kw OR ‘abdominal fat’:ti,ab,kw
AND
Case‐identification method validation: ‘sensitivity and specificity’/exp OR ‘predictive value’/exp OR ‘validity’/exp OR ‘receiver operating characteristic’/exp OR ‘area under the curve’/exp OR (‘predictive*’ NEXT/1 ‘value*’):ti,ab,kw OR ‘ppv’:ti,ab,kw OR ‘npv’:ti,ab,kw OR ‘sensitivity’:ti,ab,kw OR ‘specificit*’:ti,ab,kw OR ‘valid*’:ti,ab,kw OR ‘receiver operat* characteristic*’:ti,ab,kw OR ‘roc’:ti,ab,kw OR ‘auc’:ti,ab,kw OR ‘area under curve*’:ti,ab,kw OR ‘kappa*’:ti,ab,kw
AND
Administrative data: ‘administrative database’/exp OR ‘administrative data’/exp OR ‘billing and claims’/exp OR ‘medical information system’/exp OR ‘factual database’/exp OR (‘administrative*’ NEAR/2 ‘data*’):ti,ab,kw OR (‘administrative*’ NEAR/2 ‘code*’):ti,ab,kw OR (‘health*’ NEAR/2 ‘data*’):ti,ab,kw OR (‘billing*’ NEAR/2 ‘data*’):ti,ab,kw OR (‘billing*’ NEAR/2 ‘code*’):ti,ab,kw OR (‘billing*’ NEAR/2 ‘billing*’):ti,ab,kw OR claim:ti,ab,kw OR claims:ti,ab,kw
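
The strategies in this appendix share one structure: synonyms within a concept are joined with OR, and the three concept blocks are joined with AND. A minimal sketch of that assembly follows. The helper name `build_query` is hypothetical, only a few terms from the appendix are shown, and the output is a generic Boolean string rather than the exact syntax of any one database.

```python
def build_query(concepts: dict[str, list[str]]) -> str:
    """Join terms with OR inside each concept and concepts with AND."""
    blocks = []
    for terms in concepts.values():
        # Quote multi-word phrases so they are searched as a unit.
        quoted = [f'"{term}"' if " " in term else term for term in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


# Abbreviated term lists, for illustration only.
concepts = {
    "Obesity": ["obes*", "overweight*", "body mass index*", "BMI"],
    "Case-identification method validation": ["sensitivit*", "specificit*", "ppv", "npv"],
    "Administrative data": ["administrative data*", "billing code*", "claim*"],
}
print(build_query(concepts))
```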

APPENDIX B. MODIFIED QUADAS‐2 TOOL ADAPTED FOR IDENTIFYING INDIVIDUALS WITH OBESITY IN HEALTHCARE ADMINISTRATIVE DATABASES 30 , 36

Domain: Patient selection
Description: Describe methods of patient selection. What is the study question?
Signalling questions (yes/no/unclear): Was a consecutive or random sample of patients enrolled? Did the study avoid inappropriate exclusions?
Risk of bias (high/low/unclear): Could the selection of patients have introduced bias?
Concerns regarding applicability (high/low/unclear): Are there concerns that the included patients do not match the study question?

Domain: Administrative database
Description: Describe the administrative database and how it was used and interpreted. Where available, include comment on how coding was done and by whom.
Signalling questions (yes/no/unclear): Were the administrative database diagnosis results interpreted without knowledge of the results of the reference standard diagnosis? If a diagnostic threshold was used, was it pre‐specified?
Risk of bias (high/low/unclear): Could the conduct or interpretation of the administrative database have introduced bias?
Concerns regarding applicability (high/low/unclear): Are there concerns that the administrative database, its conduct, or interpretation differ from the study question?

Domain: Reference standard
Description: Describe the reference standard and how it was conducted and interpreted. Where available, include comment on quality of the reference standard, including the level of experience of clinicians making the diagnosis, access to diagnostic tests such as physical exam evaluating height, weight, waist circumference measurement and lab tests to check for comorbidities.
Signalling questions (yes/no/unclear): Is the reference standard likely to correctly classify obesity? Were the reference standard results interpreted without knowledge of the results of the administrative database diagnosis?
Risk of bias (high/low/unclear): Could the reference standard, its conduct, or its interpretation have introduced bias?
Concerns regarding applicability (high/low/unclear): Are there concerns that obesity, as defined by the reference standard, does not match the study question?

Domain: Flow and timing
Description: Describe any patients in the validation cohort who were not found within the reference standard or who were excluded from cross‐tabulation of the administrative data diagnoses results against the results of the reference standard diagnoses. Describe the time interval and any interventions between administrative database diagnosis and reference standard diagnosis.
Signalling questions (yes/no/unclear): Was there an appropriate interval between administrative database diagnosis and reference standard diagnosis? Did all patients receive a reference standard? Did all patients receive the same reference standard? Were all patients included in the analysis?
Risk of bias (high/low/unclear): Could the patient flow have introduced bias?

Samadoulougou S, Idzerda L, Dault R, Lebel A, Cloutier A‐M, Vanasse A. Validated methods for identifying individuals with obesity in health care administrative databases: A systematic review. Obes Sci Pract. 2020;6:677–693. 10.1002/osp4.450

on behalf of the TorSaDE Cohort researchers' group*

*Gillian Bartlett, Lucie Blais, Marc‐André Blanchette, David Buckeridge, Manon Choinière, Jean‐François Ethier, Catherine Hudon, Sonia Jean, Anaïs Lacasse, Benoit Lamarche, Alexandre Lebel, Thomas Poder, Amélie Quesnel‐Vallée, Pasquale Roberge, Alain Vanasse, Bilkis Vissandjée.

REFERENCES

1. Blais C, Jean S, Sirois C, et al. Quebec Integrated Chronic Disease Surveillance System (QICDSS), an innovative approach. Chronic Dis Inj Can. 2014;34:226‐235.
2. Koné Péfoyo AJ, Rivard M, Laurier C. Public health surveillance and role of administrative data. Rev Epidemiol Sante Publique. 2009;57:99‐111. doi: 10.1016/j.respe.2008.11.003
3. Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol. 2005;58:323‐337. doi: 10.1016/j.jclinepi.2004.10.012
4. Hamm NC, Pelletier L, Ellison J, et al. Trends in chronic disease incidence rates from the Canadian Chronic Disease Surveillance System. Health Promot Chronic Dis Prev Can. 2019;39:216‐224. doi: 10.24095/hpcdp.39.6/7.02
5. Lix LM, Yogendran MS, Shaw SY, Burchill C, Metge C, Bond R. Population‐based data sources for chronic disease surveillance. Chronic Dis Can. 2008;29:31‐38.
6. Cadarette SM, Wong L. An introduction to health care administrative data. Can J Hosp Pharm. 2015;68:232‐237. doi: 10.4212/cjhp.v68i3.1457
7. Iron K, Lu H, Manuel D, Henry D, Gershon A. Using linked healthcare administrative data to assess the clinical and healthcare system impact of chronic diseases in Ontario. Healthc Q. 2011;14:23‐27.
8. Schwartz RM, Gagnon DE, Muri JH, Zhao QR, Kellogg R. Administrative data for quality improvement. Pediatrics. 1999;103:291‐301.
9. Zins M, Goldberg M. L'apport des bases de données d'origine administrative aux cohortes épidémiologiques: l'exemple de la cohorte Constances. Statistique et société. 2014;2:43‐48.
10. Ammann EM, Kalsekar I, Yoo A, Johnston SS. Validation of body mass index (BMI)‐related ICD‐9‐CM and ICD‐10‐CM administrative diagnosis codes recorded in US claims data. Pharmacoepidemiol Drug Saf. 2018;27:1092‐1100. doi: 10.1002/pds.4617
11. Kuhle S, Kirk SF, Ohinmaa A, Veugelers PJ. Comparison of ICD code‐based diagnosis of obesity with measured obesity in children and the implications for health care cost estimates. BMC Med Res Methodol. 2011;11:1–5, 173. doi: 10.1186/1471-2288-11-173
12. Martin BJ, Chen G, Graham M, Quan H. Coding of obesity in administrative hospital discharge abstract data: accuracy and impact for future research studies. BMC Health Serv Res. 2014;14:1–8, 70. doi: 10.1186/1472-6963-14-70
13. Woo JG, Zeller MH, Wilson K, Inge T. Obesity identified by discharge ICD‐9 codes underestimates the true prevalence of obesity in hospitalized children. J Pediatr. 2009;154:327‐331. doi: 10.1016/j.jpeds.2008.09.022
14. Andrade SE, Moore Simas TA, Boudreau D, et al. Validation of algorithms to ascertain clinical conditions and medical procedures used during pregnancy. Pharmacoepidemiol Drug Saf. 2011;20:1168‐1176. doi: 10.1002/pds.2217
15. Lloyd JT, Blackwell SA, Wei II, Howell BL, Shrank WH. Validity of a claims‐based diagnosis of obesity among Medicare beneficiaries. Eval Health Prof. 2015;38:508‐517. doi: 10.1177/0163278714553661
16. Peng M, Southern DA, Williamson T, Quan H. Under‐coding of secondary conditions in coded hospital health data: Impact of co‐existing conditions, death status and number of codes in a record. Health Informatics J. 2017;23:260‐267. doi: 10.1177/1460458216647089
17. Quan H, Li B, Saunders LD, et al. Assessing validity of ICD‐9‐CM and ICD‐10 administrative data in recording clinical conditions in a unique dually coded database. Health Serv Res. 2008;43:1424‐1441.
18. Bozic KJ, Bashyal RK, Anthony SG, Chiu V, Shulman B, Rubash HE. Is administratively coded comorbidity and complication data in total joint arthroplasty valid? Clin Orthop Relat Res. 2013;471:201‐205. doi: 10.1007/s11999-012-2352-1
19. Golinvaux NS, Bohl DD, Basques BA, Fu MC, Gardner EC, Grauer JN. Limitations of administrative databases in spine research: a study in obesity. Spine J. 2014;14:2923‐2928. doi: 10.1016/j.spinee.2014.04.025
20. Lau EC, Son MS, Mossad D, et al. The validity of administrative BMI data in total joint arthroplasty. J Arthroplasty. 2015;30:1683‐1687. doi: 10.1016/j.arth.2015.04.029
21. Nickel KB, Wallace AE, Warren DK, et al. Modification of claims‐based measures improves identification of comorbidities in non‐elderly women undergoing mastectomy for breast cancer: a retrospective cohort study. BMC Health Serv Res. 2016;16:1–12, 388. doi: 10.1186/s12913-016-1636-7
22. Caplan EO, Kamble PS, Harvey RA, et al. Positive predictive value between medical‐chart body‐mass‐index category and obesity versus codes in a claims‐data warehouse. Curr Med Res Opin. 2018;34:117‐121. doi: 10.1080/03007995.2017.1366302
23. Yasmeen S, Romano PS, Schembri ME, Keyzer JM, Gilbert WM. Accuracy of obstetric diagnoses and procedures in hospital discharge data. Am J Obstet Gynecol. 2006;194:992‐1001. doi: 10.1016/j.ajog.2005.08.058
24. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta‐analyses: the PRISMA statement. Ann Intern Med. 2009;151:264‐269.
25. Martin V, Renaud J, Dagenais P. Les normes de production des revues systématiques: Guide méthodologique. Institut National d'excellence en santé et en Services Sociaux (INESS); 2013.
26. Chung CP, Rohan P, Krishnaswami S, McPheeters ML. A systematic review of validated methods for identifying patients with rheumatoid arthritis using administrative or claims data. Vaccine. 2013;31:K41‐K61. doi: 10.1016/j.vaccine.2013.03.075
27. Hohl CM, Karpov A, Reddekopp L, Doyle‐Waters M, Stausberg J. ICD‐10 codes used to identify adverse drug events in administrative data: a systematic review. J Am Med Inform Assoc. 2014;21:547‐557. doi: 10.1136/amiajnl-2013-002116
28. Ronald LA, Ling DI, FitzGerald JM, et al. Validated methods for identifying tuberculosis patients in healthcare administrative databases: systematic review. Int J Tuberc Lung Dis. 2017;21:517‐522. doi: 10.5588/ijtld.16.0588
29. Townsend L, Walkup JT, Crystal S, Olfson M. A systematic review of validated methods for identifying depression using administrative data. Pharmacoepidemiol Drug Saf. 2012;21:163‐173. doi: 10.1002/pds.2310
30. Mbizvo GK, Bennett K, Simpson CR, Duncan SE, Chin RFM. Accuracy and utility of using administrative healthcare databases to identify people with epilepsy: a protocol for a systematic review and meta‐analysis. BMJ Open. 2018;8:1–9, e020824. doi: 10.1136/bmjopen-2017-020824
31. McLynn RP, Geddes BJ, Cui JJ, et al. Inaccuracies in ICD coding for obesity would be expected to bias administrative database spine studies toward overestimating the impact of obesity on perioperative adverse outcomes. Spine (Phila Pa 1976). 2018;43:526‐532. doi: 10.1097/BRS.0000000000002356
32. Chiu S, Davis J, Giaconi J, et al. Variable validity of computer extracted problem lists for complications of diabetes mellitus within the VA Greater Los Angeles Health System. Diabetes Metab Syndr. 2017;11:S611‐S615. doi: 10.1016/j.dsx.2017.04.013
33. Samuel AM, Lukasiewicz AM, Webb ML, et al. ICD‐9 diagnosis codes have poor sensitivity for identification of preexisting comorbidities in traumatic fracture patients: a study of the National Trauma Data Bank. J Trauma Acute Care Surg. 2015;79:622‐630. doi: 10.1097/TA.0000000000000805
34. Varas‐Lorenzo C, Castellsague J, Stang MR, Tomas L, Aguado J, Perez‐Gutthann S. Positive predictive value of ICD‐9 codes 410 and 411 in the identification of cases of acute coronary syndromes in the Saskatchewan Hospital automated database. Pharmacoepidemiol Drug Saf. 2008;17:842‐852. doi: 10.1002/pds.1619
35. Kyle TK, Dhurandhar EJ, Allison DB. Regarding obesity as a disease: evolving policies and their implications. Endocrinol Metab Clin North Am. 2016;45:511‐520.
36. Whiting PF, Rutjes AW, Westwood ME, et al. QUADAS‐2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155:529‐536. doi: 10.7326/0003-4819-155-8-201110180-00009

Articles from Obesity Science & Practice are provided here courtesy of Wiley
