Abstract
Chronic comorbid conditions are important predictors of primary care outcomes, provide context for clinical decisions, and are potential complications of diseases and treatments. Comorbidity indices and multimorbidity categorization strategies based on administrative claims data enumerate diagnostic codes in easily modifiable lists, but usually have inflexible temporal requirements, such as requiring two claims greater than 30 days apart, or three claims in three quarters. Table structures and claims data search algorithms were developed to support flexible temporal constraints. Tables of disease categories allow subgroups with different numbers of events, different times between similar claims, variable periods of interest, and specified diagnostic code substitutability. The strategy was tested on five years of private insurance claims from 2.2 million working age adults. The contrast between rarely recorded, high prevalence diagnoses (smoking and obesity) and frequently recorded but not necessarily chronic diagnoses (musculoskeletal complaints) demonstrated the advantage of flexible temporal criteria.
Introduction
Comorbid conditions are extremely important predictors of primary care outcomes1–3. In addition, comorbid conditions may complicate disease management, and may even result from disease management.
Comorbidity indices merge numerous clinical observations into a single score that predicts an outcome. The Charlson score, published in 1994, predicted mortality in patients undergoing elective surgery based on 19 categories of comorbid conditions4. The Elixhauser comorbidity index, published in 1998, defines 30 categories of comorbid conditions for use with administrative claims data5. These categories comprise sets of diagnosis codes from the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM). The Elixhauser index predicts mortality, charges, and length of hospitalization. In 2000, Klabunde et al. described how outpatient records could identify comorbid conditions that were undocumented during hospitalization6 but that would affect hospital outcomes7.
While comorbidity indices simplify predictive analyses, the complexity of comorbidity, often called multimorbidity, is more relevant to many clinical decisions. Analysis of multimorbidity requires that thousands of very specific diagnostic codes be mapped into scores or hundreds of categories of conditions. Several reports describe categorization strategies for common comorbid conditions in claims data8–10. Even crude measures of multimorbidity, e.g. counts of comorbid conditions, are consistently predictive of outcomes.
Several issues distinguish the identification of comorbidities from ambulatory vs. hospital claims data. First, omissions are common in ambulatory claims. Outpatient claims headings in the USA are restricted to 4 diagnostic codes, although “line codes” may include an additional diagnostic code for each procedure claimed. Inpatient claims headings currently are restricted to 25 diagnostic codes for Medicare claims, but have been as few as 9 codes in the past. Furthermore, diagnoses such as smoking and obesity may be absent for years in outpatient claims, but then be recorded when patients have elective surgery. Undocumented diagnoses may appear to (i) be tangential or subordinate to acute problems, (ii) lack effective treatments, leading to lack of attention, (iii) lack documentation incentives, such as reimbursement or quality measures, (iv) carry a risk of stigmatizing patients, or (v) carry a risk of causing a confrontation when patients see the code descriptions on billing records. Thus, inpatient claims from one hospitalization provide a more complete picture than office claims from one visit. Practical limits on numbers of claims imply that (a) any short series of outpatient visits can fail to document relevant comorbidities and (b) diagnoses that are often missed, neglected, or ignored could appear infrequently in claims. For these reasons, long periods of surveillance may be appropriate.
Second, an outpatient claim does not indicate a definite diagnosis. One problem is that clinicians may record codes for diagnoses being ruled out. Another issue is that clinicians often select codes with little direction and only delayed feedback through reimbursement. Professional coders assign codes to inpatient claims, following guidelines that should increase the odds that the diagnosis is present. Thus, inpatient claims could be more accurate than office claims, although both may be biased toward codes yielding higher reimbursement11–13. Finally, diagnostic conclusions can be wrong14–16. When diagnostic errors are discovered, the incorrect diagnosis should disappear from the record. Ambulatory claims analyses therefore require some strategy for establishing validity of outpatient claims. One rule of thumb requires a diagnosis to appear twice, more than 30 days apart, to infer that an outpatient diagnosis is real17. A more stringent German algorithm required a diagnosis to appear once each quarter (3 months) for 3 consecutive quarters10. Of course, some chronic diseases can actually resolve, especially those related to weight, diet, substance abuse, and other reversible problems. Requiring documentation in the three preceding quarters defines a moving window that eventually removes incorrect and resolved diagnoses.
Third, some broad comorbidity categories used to predict hospital mortality warrant dissection for primary care quality and outcomes analyses. Distinctions between organ impairment and failure are important for renal and hepatic diseases. Patients with injured organs need their physicians to help them minimize the rate at which further injury accumulates, while patients with overt organ failure often face different risks. For instance, the risk of cardiovascular disease initially rises slowly as renal function deteriorates but becomes extreme as kidneys finally fail18,19. Hospital mortality-oriented comorbidity classifications may combine diverse, common neurologic diseases that primary care analyses should subdivide, such as dementia, headaches, seizures, and stroke.
Fourth, many conditions could influence treatments, costs, and outcomes in primary care, or result from medical treatment, without affecting hospital mortality. Topics of primary care include allergies, anxiety, esophageal reflux, migraine and tension headaches, chronic musculoskeletal problems, osteoporosis, smoking, and personality disorders. For instance, osteoporosis may have important outpatient implications (such as increasing the risk of using corticosteroids) without affecting hospital mortality. Conversely, diagnoses with dire implications for hospital mortality may be unmanageable in the outpatient context.
In order to analyze outpatient claims data describing family physicians’ management of specific diseases, we needed to classify patients’ comorbid conditions. Due to the issues just enumerated and other experiences with outpatient claims, we believed that the inference of comorbid conditions from outpatient data could be further refined. We particularly wanted to increase the temporal flexibility of comorbidity definitions in response to common documentation patterns. We report the development and testing of a chronic disease list with temporal criteria for analyzing outpatient claims data.
Methods
Categories of diagnosis codes
The Elixhauser5 and German ambulatory care10 lists of comorbid conditions were reviewed by category. Categories that included common problems with very diverse clinical implications were subdivided into categories with more uniform implications. Categories that were unlikely to occur in primary care were deleted, and categories for common and clinically influential problems were added. ICD-9-CM diagnosis codes were sought using the hierarchical structure of ICD, and by searching for specific codes and string values.
We determined that assigning codes to a single comorbidity category still limited our ability to make higher resolution inferences that would be desirable in some analyses. We defined subgroups within categories, including a “subgroup zero” containing codes that could match any other subgroup within a category.
After defining subgroups, we noted that requiring repeated documentation of codes in a subgroup could lead to false positive matches unless mutually substitutable codes were specified. We therefore defined bundles of codes that we considered mutually substitutable. For instance, the musculoskeletal group has a subgroup for joint problems. Within the joint subgroup, there are several codes for arthritis of the shoulder and others for arthritis of the hip. The shoulder codes are mutually substitutable and belong to one bundle; the hip codes are in a different bundle. Again, a “bundle zero” identified very imprecise codes that could match any bundle in the subgroup, e.g. “arthritis, NOS.”
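The matching rule for subgroups and bundles can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the name `codes_substitutable`, the `CodeKey` layout, and the numeric identifiers in the example are all assumptions made for the illustration.

```python
from typing import NamedTuple


class CodeKey(NamedTuple):
    """Position of a diagnosis code in the hierarchy (hypothetical layout)."""
    group: int
    subgroup: int
    bundle: int


def codes_substitutable(a: CodeKey, b: CodeKey) -> bool:
    """Return True if two codes may count toward the same series of claims.

    Assumed reading of the text: codes must share a group; subgroup zero
    matches any subgroup in the group; bundle zero matches any bundle in
    the subgroup; otherwise subgroup and bundle must both be equal.
    """
    if a.group != b.group:
        return False
    if a.subgroup == 0 or b.subgroup == 0:
        return True
    if a.subgroup != b.subgroup:
        return False
    return a.bundle == 0 or b.bundle == 0 or a.bundle == b.bundle


# Hypothetical identifiers: group 5 = musculoskeletal, subgroup 2 = joints,
# bundle 1 = shoulder arthritis, bundle 2 = hip arthritis,
# bundle 0 = imprecise codes such as "arthritis, NOS".
shoulder = CodeKey(5, 2, 1)
hip = CodeKey(5, 2, 2)
arthritis_nos = CodeKey(5, 2, 0)
```

Under this reading, a shoulder claim followed by a hip claim does not count as repeated documentation, whereas a shoulder claim followed by "arthritis, NOS" does.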
Timing of claims
Flexible temporal constraints were added to subgroup definitions. First, we provided minimum numbers of outpatient events as a subgroup specification. Fixed requirements for at least two or at least three outpatient claims over a period were judged unlikely to work consistently. More or fewer claims could be appropriate. For diseases that are rarely documented at all and normally are recorded after being “ruled in,” such as smoking, obesity, and personality disorders, one appearance in an outpatient claim suggests relevance for years before and after the claim. Conversely, an acute problem recurring many times over a few years could be considered a chronic problem.
Second, we added minimum separation between outpatient events as a subgroup specification. Any fixed temporal separation may be excessively or insufficiently restrictive for some condition. For instance, a patient presenting with angina could be evaluated and receive an effective medical intervention in less than 30 days. If chest pain resolves and other risk factors command attention at subsequent visits, angina codes could disappear from claims records. Nevertheless, the patient would have evidence of coronary artery disease as a comorbid condition. Clinical scenarios requiring longer minimum intervals are rare, but consider provoked and unprovoked venous thrombosis. Provoked venous thrombosis will generate a series of claims over 3 to 6 months. Clotting disorders will generate claims over longer intervals, due to recurrence of clots. Therefore, a minimum interval of 90, 180, or even 270 days between venous thromboembolic claims could be appropriate when attempting to infer a clotting disorder.
Third, we added a maximum separation between outpatient events as a subgroup specification. Most algorithms have not specified maximum separation between codes, but some maximum separation is often justifiable. Invasive breast cancer codes may appear annually in claims for women aged 40 to 60; this probably reflects the intent of the diagnostic test (to identify preclinical breast cancer) rather than an active neoplasm. Discovery of a new neoplasm should generate a series of claims at short intervals. At the other extreme, a seizure disorder documented at annual visits could indicate a stable patient receiving annual medication refills.
Finally, we added persistence, a look-back period within which the other requirements must be met, as a subgroup specification. This change relaxes the fixed three quarters implemented in the German ambulatory care algorithm. In a long series of claims data, some potentially chronic diseases will remit: weight loss will cure some diabetics and abstinence will cure some alcoholics’ hepatitis. However, if an incurable diagnosis disappears from the series, then the diagnosis is in question. For instance, Parkinson’s disease is currently incurable and debilitating: it should appear regularly in a patient’s claims. If the diagnosis was not recorded in the most recent two years, then any earlier diagnosis of Parkinson’s disease is doubtful.
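The separation and persistence checks described above reduce to simple date arithmetic. The following is a minimal sketch, assuming claim dates are available as `datetime.date` values; the function names are hypothetical, and the convention that a persistence of zero means "no limit" is taken from the pseudocode in the Algorithm section.

```python
from datetime import date, timedelta


def in_lookback(claim_date: date, assessment_date: date, persist_days: int) -> bool:
    """True if a claim falls inside the persistence (look-back) window.

    Assumption from the Algorithm section: persist_days == 0 imposes no limit.
    """
    if persist_days == 0:
        return True
    start = assessment_date - timedelta(days=persist_days)
    return start <= claim_date <= assessment_date


def separation_ok(earlier: date, later: date, min_sep: int, max_sep: int) -> bool:
    """True if two claims are at least min_sep and at most max_sep days apart."""
    days = (later - earlier).days
    return min_sep <= days <= max_sep


# Parkinson's disease example from the text: a two-year (730-day) look-back.
assessment = date(2012, 6, 30)
old_claim_counts = in_lookback(date(2009, 1, 15), assessment, 730)
recent_claim_counts = in_lookback(date(2011, 9, 1), assessment, 730)

# Venous thrombosis example: require at least 90 days between claims
# (the 1500-day maximum is the value listed for this subgroup in Table 2).
close_pair = separation_ok(date(2010, 1, 10), date(2010, 2, 20), 90, 1500)
distant_pair = separation_ok(date(2010, 1, 10), date(2010, 8, 1), 90, 1500)
```

Here the 2009 Parkinson's claim falls outside the window and the 2011 claim falls inside it; the thrombosis claims 41 days apart fail the minimum separation, while the pair 203 days apart satisfies it.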
Table structure
We implemented these constraints in two tables, subgroups and codes. The subgroup table has these fields, with each combination of Group and Subgroup being unique:
| Field | Description |
|---|---|
| Group | positive integer identifier |
| Abbrev | short name for the group |
| Subgroup | non-negative integer identifier |
| Name | short name for the subgroup |
| IP Count | number of inpatient claims (within a bundle) required to establish the subgroup; zero if inpatient claims never establish the subgroup being present |
| OP Count | number of outpatient claims (within a bundle) required to establish the subgroup |
| MinSep | minimum number of days between claims (within a bundle) |
| MaxSep | maximum number of days between claims (within a bundle) |
| Persist | maximum number of days to look back from a specified date |
The code table has these fields:
| Field | Description |
|---|---|
| Group | positive integer identifier |
| Subgroup | non-negative integer identifier |
| Bundle | non-negative integer identifier |
| Code | ICD-9-CM code |
| Description | description of the ICD-9-CM code |
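As a sketch of how these two tables might be represented in code, the rows can be modeled as immutable records. The class and field names below simply mirror the field lists above; the group number `1` and the assignment of code 4011 to this subgroup are illustrative assumptions, while the temporal values are those listed for the benign hypertension subgroup in Table 2.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SubgroupRow:
    group: int      # positive integer identifier
    abbrev: str     # short name for the group
    subgroup: int   # non-negative integer identifier (0 = wild card)
    name: str       # short name for the subgroup
    ip_count: int   # inpatient claims required; 0 = inpatient never establishes
    op_count: int   # outpatient claims required
    min_sep: int    # minimum days between claims
    max_sep: int    # maximum days between claims
    persist: int    # days to look back from the assessment date


@dataclass(frozen=True)
class CodeRow:
    group: int
    subgroup: int
    bundle: int     # non-negative integer identifier (0 = wild card)
    code: str       # ICD-9-CM code
    description: str


def subgroup_keys_unique(rows: Iterable[SubgroupRow]) -> bool:
    """Check the stated constraint that each (Group, Subgroup) pair is unique."""
    keys = [(r.group, r.subgroup) for r in rows]
    return len(keys) == len(set(keys))


# Hypothetical group number; temporal values from Table 2 (Hypertension, Benign).
htn_benign = SubgroupRow(1, "Htn", 1, "Benign", 1, 2, 30, 400, 730)
# Assumed membership of this code in the subgroup, for illustration only.
htn_code = CodeRow(1, 1, 1, "4011", "Benign essential hypertension")
```

Frozen dataclasses are used here only so that table rows cannot be modified accidentally during a search; any tabular store with the same columns would serve.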
General approach to specifying groups and subgroups
Given the above considerations and the lack of standardization in assignment of diagnostic codes for outpatient billing claims, we found that subgroup specifications were quite subjective, especially in regard to time intervals. We therefore developed some general principles for assigning subgroups’ attributes.
- Number of instances required
  ○ One inpatient code may establish a clearly chronic condition
  ○ Inpatient codes for potentially acute or iatrogenic conditions may be ignored, so that the condition must be established by outpatient codes
  ○ One outpatient code may establish a clearly chronic condition that is under-documented
  ○ Two outpatient codes are needed to infer most chronic conditions
  ○ More than two outpatient codes may be used to infer that a normally acute condition is functionally chronic
- Separation of outpatient codes
  ○ As a default setting, pairs of claims are at least 30 days apart and not more than 180 days apart
  ○ Acute events that imply chronic problems may have shorter minimum separation requirements (strokes and heart attacks)
  ○ Chronic conditions that could be stabilized and managed with annual checks were allowed a maximum separation of 400 days (13 months) (seizure disorders)
  ○ Infrequent events that imply chronic problems may be set farther apart (smoking, obesity), but in the USA, private claims data may not capture two infrequent events due to limited periods of enrollment (3 years on average)
- Time since last diagnosis (Persist field)
  ○ Incurable, confidently diagnosed, and rarely recorded diseases may persist indefinitely (strokes)
  ○ Potentially curable chronic diseases should be confirmed by periodic reappearance (restless leg syndrome) or change to “history of” codes (neoplasms)
  ○ Inexorably progressive diseases should be confirmed by periodic reappearance (degenerative neurologic diseases)
  ○ Rarely recorded but curable diseases should be inferred from appearance over a long but not indefinite interval (smoking, obesity)
Algorithm
These definitions reflect the possibility of evolving comorbidity status in primary care analyses of claims data. In procedural pseudo code, the logic for interpreting these data is:
    Establish assessment date D
    Get individual’s claims data
    For each Group G in the subgroup table
        Put claims for Group G, Subgroup zero, Bundle zero into wild card Subgroup claims
        For each Subgroup S > 0 in Group G
            Put claims for Group G, Subgroup S, Bundle zero into wild card Bundle claims
            For each Bundle B > 0 in Group G, Subgroup S
                Put claims for Group G, Subgroup S, Bundle B into Bundle B claims
                Merge wild card Subgroup and Bundle claims into the Bundle B claims list
                If Group G, Subgroup S defines Persist > 0 then
                    Limit Bundle B claims list to dates between D - Persist and D
                End if
                If Group G, Subgroup S defines IP Count > 0 then
                    If number of Inpatient claims in Bundle B claims >= IP Count then
                        Patient has comorbidity Group G
                        Patient has comorbidity Subgroup S
                        Next Subgroup
                    End if
                End if
                For each claim in the claims list
                    Put claim into Sequence Q
                    For each subsequent claim in the claims list
                        Get days between last claim in Q and this subsequent claim
                        If days >= MinSep and days <= MaxSep then
                            Add claim to Sequence Q
                        End if
                    End for
                    If number of claims in Q >= OP Count for Group G, Subgroup S then
                        Patient has comorbidity Group G
                        If at least one claim in Q is in Bundle B then
                            Patient has comorbidity Subgroup S
                        End if
                        Next Subgroup
                    End if
                End for
            End for
        End for
    End for
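The inner part of this logic, evaluating one bundle's merged claims list against one subgroup's temporal rule, can be sketched in Python as follows. This is a hedged illustration, not the authors' code: `Claim`, `Rule`, and `evaluate_bundle` are hypothetical names, counts are compared with "at least" so that a required count of one is satisfied by a single claim, and the claim count is checked once a candidate sequence has been built. As in the pseudocode, the separation search is greedy and runs over every claim in the list.

```python
from datetime import date, timedelta
from typing import List, NamedTuple, Tuple


class Claim(NamedTuple):
    day: date        # service date of the claim
    inpatient: bool  # True for an inpatient claim
    wild: bool       # True if the code came from subgroup zero or bundle zero


class Rule(NamedTuple):
    ip_count: int    # inpatient claims required (0 = never sufficient)
    op_count: int    # claims required in a separated sequence
    min_sep: int     # minimum days between consecutive claims
    max_sep: int     # maximum days between consecutive claims
    persist: int     # look-back in days (0 = no limit)


def evaluate_bundle(claims: List[Claim], rule: Rule, assessment: date) -> Tuple[bool, bool]:
    """Return (group_met, subgroup_met) for one bundle's merged claims list."""
    if rule.persist > 0:
        start = assessment - timedelta(days=rule.persist)
        claims = [c for c in claims if start <= c.day <= assessment]
    claims = sorted(claims, key=lambda c: c.day)

    # Inpatient rule: enough inpatient claims establish group and subgroup.
    if rule.ip_count > 0:
        if sum(1 for c in claims if c.inpatient) >= rule.ip_count:
            return True, True

    # Sequence rule: start a sequence at each claim and extend it greedily.
    # Skipping the search when op_count is zero is an assumption of this sketch.
    if rule.op_count > 0:
        for i, first in enumerate(claims):
            seq = [first]
            for nxt in claims[i + 1:]:
                days = (nxt.day - seq[-1].day).days
                if rule.min_sep <= days <= rule.max_sep:
                    seq.append(nxt)
            if len(seq) >= rule.op_count:
                # The subgroup is set only if a bundle-specific claim is present.
                return True, any(not c.wild for c in seq)
    return False, False


# Temporal values for the back subgroup as listed in Table 2 (1;2, 20-400, 400).
back_rule = Rule(ip_count=1, op_count=2, min_sep=20, max_sep=400, persist=400)
assessment_day = date(2012, 12, 31)
too_close = [Claim(date(2012, 3, 1), False, False), Claim(date(2012, 3, 11), False, False)]
far_enough = [Claim(date(2012, 3, 1), False, False), Claim(date(2012, 4, 15), False, False)]
```

With these inputs, `evaluate_bundle(too_close, back_rule, assessment_day)` fails the temporal criteria because the two claims are only 10 days apart, while `far_enough` (45 days apart) establishes both the group and the subgroup.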
Trial
We implemented the search algorithm and applied it to five years of private insurance claims data from three states in the Truven Health Analytics MarketScan® Commercial Claims and Encounters Databases. Medical claims were restricted to (i) persons aged 18 to 64 years and (ii) claims with a specialty code for general internists or family physicians, thus excluding facility, laboratory, and other specialty claims. Comorbidity prevalence was calculated for each category and subgroup. Patient-level prevalence over the five years was calculated for each ICD-9-CM claim that was not categorized as a chronic comorbidity. These codes were reviewed manually to identify (a) codes that were included in the codes table but failed temporal criteria, (b) codes that should have been included in the codes table, and (c) codes that were reasonably excluded from the codes table. After reviewing results from the preliminary draft, some comorbid conditions were added and temporal criteria were modified for others. The search algorithm was refined to obtain the form described above and applied to the revised tables to obtain the ensuing results.
Results
Categories, subgroups and bundles
We defined 50 categories with 154 specific subgroups and 13 wild card subgroups. The 50 categories included 2,733 ICD-9-CM codes in 367 bundles. The most commonly assigned temporal requirement was 2 claims 30 to 400 days apart, over the past 1 or more years (67 uses), followed by 2 claims 30 to 180 days apart, over the past 1 or more years (33 uses) and 2 claims 30 to 730 days apart, over the past 2 or more years (10 uses). Persistence was about equally divided among 1-year, 2-year, 3-year, 5-year, and indefinite time spans. Three subgroups required a single claim within 4 or 5 years. Six subgroups required three or more claims.
Test run results
The test run analyzed records of up to five years’ duration from 2.2 million adult patients seen by family physicians and/or general internists between 2006 and 2012. The average duration of enrollment was 2.5 years.
Table 1 summarizes the frequency with which patients met criteria for the 50 categories, sorted from most to least frequent positive matches as a percentage of the 2.2 million patients. The number of subgroups and ICD-9-CM codes specified in each category is given. The “Time Criteria Fails” column lists the percentage of patients who had one of the codes in the group documented but did not meet a time criterion. The largest of these, musculoskeletal diagnoses, involves 12% of patients. An extremely low percentage of patients were documented to be smokers (0.83%) or obese or morbidly obese (1%). Two thirds of the smokers and half of the obese patients had exactly one claim with the diagnosis. In contrast, the population prevalence of smoking is 5% for individuals with graduate degrees20, and above 20% for all workers21, and obesity affects nearly 30% of the workforce22.
Table 1. Categories with most frequently matched subgroups
| Categories | Patients matched (%) | Subgroups (#) | ICD-9-CM codes (#) | Time criteria fails (%) | Main subgroup | Main subgroup matched (%) |
|---|---|---|---|---|---|---|
| Hypertension | 7.9 | 1 | 2 | 5.0 | Benign | 7.9 |
| Lipid disorder | 6.7 | 1 | 11 | 5.5 | Hyperlipidemia | 6.7 |
| Htn, complicated | 6.6 | 4 | 45 | 0.10 | Other | 6.4 |
| Type II diabetes | 3.3 | 4 | 28 | 1.2 | Uncomplicated | 3.1 |
| Musculoskeletal | 2.2 | 8 | 295 | 12.3 | Back | 1.1 |
| Mental health | 2 | 6 | 175 | 3.6 | Depression | 1.1 |
| Endocrine | 1.5 | 3 | 46 | 1.7 | Hypothyroid | 1.3 |
| Unexpl. illnesses | 1.5 | 2 | 31 | 3.6 | Functional Somatic | 1.2 |
| Gastrointestinal | 1 | 6 | 106 | 2.6 | GE reflux disease | 0.82 |
| Body mass index | 1 | 2 | 26 | 0 | Obesity | 0.78 |
| Smoking | 0.83 | 1 | 6 | 0 | Tobacco | 0.83 |
| Asthma | 0.75 | 2 | 14 | 1.4 | Reactive airway dis | 0.73 |
| Sleep problems | 0.73 | 4 | 73 | 1.4 | Other | 0.51 |
| Anemia | 0.54 | 4 | 36 | 0.73 | Other | 0.27 |
| Headaches | 0.53 | 5 | 101 | 0.95 | Migraine | 0.52 |
| Coronary art dis | 0.45 | 4 | 48 | 0.35 | Ischemia | 0.41 |
| Chr. Obst. Lung | 0.33 | 3 | 8 | 0.86 | Other | 0.19 |
| Rheumatic dis. | 0.26 | 3 | 44 | 0.25 | Rheum arthritis | 0.13 |
| Rhythm disturb | 0.25 | 5 | 19 | 0.41 | Atrial fibrillation | 0.13 |
| Type I diabetes | 0.19 | 2 | 14 | 0.12 | Uncomplicated | 0.16 |
| Allergy | 0.18 | 5 | 51 | 0.44 | Contact | 0.12 |
| Infection | 0.18 | 5 | 23 | 1.1 | Sinusitis | 0.17 |
| Skin diseases | 0.18 | 3 | 48 | 1.1 | Various | 0.16 |
| Chr. kidney disease | 0.17 | 2 | 61 | 0.08 | Stage I–IV | 0.16 |
| Neoplasms | 0.15 | 2 | 437 | 0.26 | Local | 0.15 |
| Coagulopathy | 0.14 | 3 | 63 | 0.10 | VTE | 0.057 |
| Liver disease | 0.13 | 3 | 40 | 0.35 | Cirrhosis/Enceph. | 0.051 |
| Bone diseases | 0.13 | 1 | 6 | 0.21 | Osteoporosis | 0.13 |
| Sexual dysfunction | 0.11 | 1 | 10 | 0.34 | Various | 0.11 |
| Genitourinary | 0.11 | 4 | 33 | 0.40 | Calculi | 0.074 |
| Heart valve | 0.093 | 5 | 40 | 0.17 | Mitral valve | 0.046 |
| Neurology | 0.086 | 7 | 42 | 0.92 | Multiple sclerosis | 0.038 |
| Pain syndromes | 0.072 | 2 | 15 | 0.08 | Neuropathy | 0.069 |
| Pulm emboli/htn | 0.071 | 3 | 8 | 0.03 | Pulm embolism | 0.058 |
| Seizures | 0.068 | 7 | 29 | 0.14 | Traumatic seizures | 0.037 |
| Heart failure | 0.063 | 3 | 33 | 0.04 | Hypertensive HF | 0.062 |
| Periph vasc disease | 0.045 | 7 | 66 | 0.08 | Atherosclerosis | 0.038 |
| HIV | 0.041 | 1 | 12 | 0.00 | HIV | 0.041 |
| Alcohol misuse | 0.033 | 3 | 26 | 0.07 | End organ damage | 0.019 |
| Drug misuse | 0.032 | 2 | 73 | 0.03 | Dependence | 0.024 |
| Stroke | 0.03 | 3 | 24 | 0.03 | Thromboembolic | 0.021 |
| Lymphoma | 0.021 | 3 | 254 | 0.00 | Non-Hodgkins | 0.015 |
| Reproductive | 0.019 | 1 | 38 | 0.00 | Various | 0.019 |
| Paralysis | 0.012 | 4 | 63 | 0.02 | Syndromes | 0.004 |
| Heart blocks | 0.008 | 5 | 21 | 0.04 | AV block | 0.002 |
| Nutrition | 0.006 | 2 | 11 | 0.01 | Anorexia / Bulimia | 0.003 |
| Dementia | 0.005 | 5 | 35 | 0.01 | Frontal dementias | 0.003 |
| Cystic Fibrosis | 0.005 | 1 | 9 | 0.01 | Various | 0.005 |
| Restrictive pulm dis | 0.002 | 2 | 15 | 0.01 | Various | 0.002 |
| Sensory losses | 0.001 | 1 | 68 | 0.18 | Vision loss | 0.001 |
Figure 1 plots time criteria failures (Table 1, Column 5) as a function of comorbidity prevalence (Table 1, Column 2) for each category. For instance, the outlier point at (2%, 12%) is the musculoskeletal category: 2% of patients have chronic musculoskeletal conditions, as defined here, and another 12% have diagnostic codes included in the musculoskeletal category but did not meet the specified time constraints. A person having two claims related to back injuries less than 20 days apart, for example, would fail the temporal criteria (see Table 2). Other musculoskeletal subgroups require more than two claims.
Figure 1. Prevalence vs. time criteria failure rate for major categories
Table 2. Temporal definitions of the most frequently matched subgroups in each category
| Categories | Main subgroup | IP;OP claims required (#;#) | Min-Max separation (days) | Look back (days) | Codes (#) | Bundles (#) |
|---|---|---|---|---|---|---|
| Hypertension | Benign | 1;2 | 30–400 | 730 | 2 | 1 |
| Lipid disorder | Hyperlipidemia | 0;2 | 30–400 | 1095 | 11 | 1 |
| Htn, complicated | Other | 0;0 | 0–0 | 0 | 3 | 0 |
| Type II diabetes | Uncomplicated | 1;2 | 30–400 | 730 | 2 | 1 |
| Musculoskeletal | Back | 1;2 | 20–400 | 400 | 19 | 3 |
| Mental health | Depression | 1;2 | 30–180 | 1095 | 21 | 2 |
| Endocrine | Hypothyroid | 1;2 | 30–400 | 400 | 8 | 1 |
| Unexpl. illnesses | Functional Somatic | 1;2 | 30–400 | 730 | 6 | 1 |
| Gastrointestinal | GE reflux disease | 1;2 | 30–400 | 1095 | 10 | 1 |
| Body mass index | Obesity | 1;1 | 0–1500 | 1500 | 18 | 1 |
| Smoking | Tobacco | 1;1 | 0–1825 | 1825 | 6 | 1 |
| Asthma | Reactive airway dis | 1;2 | 30–400 | 1095 | 11 | 1 |
| Sleep problems | Other | 0;2 | 20–400 | 400 | 52 | 3 |
| Anemia | Other | 0;0 | 0–0 | 0 | 1 | 0 |
| Headaches | Migraine | 1;2 | 30–400 | 1825 | 42 | 1 |
| Coronary art dis | Ischemia | 1;2 | 14–270 | 0 | 15 | 1 |
| Chr. Obst. Lung | Other | 1;2 | 30–400 | 1095 | 1 | 1 |
| Rheumatic dis. | Rheumatoid arthritis | 1;2 | 30–730 | 730 | 12 | 1 |
| Rhythm disturb | Atrial fibrillation | 1;2 | 30–400 | 400 | 3 | 1 |
| Type I diabetes | Uncomplicated | 1;2 | 30–180 | 3650 | 2 | 1 |
| Allergy | Contact | 1;2 | 30–400 | 1095 | 22 | 5 |
| Infection | Sinusitis | 0;2 | 30–400 | 400 | 7 | 1 |
| Skin diseases | Various | 1;2 | 30–180 | 730 | 21 | 9 |
| Chr. kidney disease | Stage I–IV | 1;2 | 30–400 | 730 | 42 | 1 |
| Neoplasms | Local | 1;2 | 14–90 | 1825 | 402 | 38 |
| Coagulopathy | Venous thromb | 2;2 | 90–1500 | 0 | 13 | 1 |
| Liver disease | Cirrhosis/Enceph. | 1;2 | 30–400 | 400 | 13 | 4 |
| Bone diseases | Osteoporosis | 1;2 | 14–400 | 3650 | 6 | 1 |
| Sexual dysfunction | Various | 1;2 | 30–400 | 730 | 10 | 1 |
| Genitourinary | Calculi | 1;2 | 14–1095 | 1095 | 11 | 1 |
| Heart valve | Mitral valve | 1;2 | 30–730 | 1825 | 8 | 1 |
| Neurology | Multiple sclerosis | 1;2 | 30–400 | 730 | 5 | 1 |
| Pain syndromes | Neuropathy | 0;2 | 30–180 | 730 | 6 | 1 |
| Pulm emboli/htn | Pulm. embolism | 1;2 | 10–180 | 1825 | 3 | 1 |
| Seizures | Traumatic seizures | 1;2 | 30–400 | 400 | 2 | 1 |
| Heart failure | Hypertensive HF | 1;2 | 30–400 | 1825 | 26 | 1 |
| Periph vasc disease | Atherosclerosis | 1;2 | 30–180 | 0 | 25 | 1 |
| HIV | HIV | 1;2 | 30–400 | 730 | 12 | 1 |
| Alcohol misuse | End organ damage | 1;2 | 30–180 | 0 | 20 | 1 |
| Drug misuse | Dependence | 1;2 | 30–270 | 0 | 45 | 1 |
| Stroke | Thromboembolic | 1;2 | 10–180 | 0 | 12 | 1 |
| Lymphoma | Non-Hodgkins | 1;2 | 30–180 | 1825 | 171 | 1 |
| Reproductive | Various | 1;2 | 30–180 | 730 | 38 | 3 |
| Paralysis | Syndromes | 2;2 | 30–400 | 1095 | 31 | 1 |
| Heart blocks | AV block | 1;2 | 30–400 | 1825 | 5 | 1 |
| Nutrition | Anorexia / Bulimia | 1;2 | 20–180 | 400 | 2 | 2 |
| Dementia | Frontal dementias | 1;2 | 30–180 | 400 | 11 | 3 |
| Cystic Fibrosis | Various | 1;2 | 30–400 | 1095 | 9 | 2 |
| Restrictive pulm dis | Various | 1;4 | 30–400 | 1095 | 9 | 1 |
| Sensory losses | Vision loss | 1;2 | 30–400 | 730 | 34 | 1 |
Table 2 summarizes the main subgroup’s requirements in each category. Usually, the other subgroups in a category will have comparable constraints. The number of Inpatient (IP) and Outpatient (OP) events required is listed. Zero IP events means that only outpatient diagnoses are accepted. Zero OP events means that the subgroup is a wildcard. The minimum and maximum separation between events is listed, and the persistence of a diagnosis is listed in the look back column. The number of distinct ICD-9-CM codes used and number of bundles in each main subgroup completes the table.
Claims that did not match categorization criteria involved 9,158 distinct ICD-9-CM codes, of which 7,316 codes did not map to any of the 50 categories. The remaining 1,842 matched a code in a category, but were not in a series of claims that met temporal requirements. Table 3 lists the uncategorized codes affecting at least 0.5% of the 2.2 million people in the sample. Codes that are included in a comorbidity category are in bold italics.
Table 3. ICD-9-CM codes from claims that did not match a category
| Code | Description | Patients (#) | % |
|---|---|---|---|
| V700 | Routine general medical examination | 236058 | 5.8 |
| 4619 | Acute sinusitis, unspecified | 111050 | 2.73 |
| 4659 | Acute upper respiratory infections of unspecified site | 102665 | 2.52 |
| 462 | Acute pharyngitis | 90305 | 2.22 |
| 4660 | Acute bronchitis | 86781 | 2.13 |
| V7231 | Routine gynecological examination | 80650 | 1.98 |
| V0481 | Vaccination against influenza | 79894 | 1.96 |
| 2724 | Other and unspecified hyperlipidemia | 63177 | 1.55 |
| 4011 | Benign essential hypertension | 62957 | 1.55 |
| 78079 | Other malaise and fatigue | 58502 | 1.44 |
| 4779 | Allergic rhinitis, cause unspecified | 57454 | 1.41 |
| 4019 | Unspecified essential hypertension | 48904 | 1.2 |
| 7862 | Cough | 45206 | 1.11 |
| 5990 | Urinary tract infection, site not specified | 43967 | 1.08 |
| 7242 | Lumbago | 41305 | 1.01 |
| 78650 | Chest pain, unspecified | 37557 | 0.92 |
| 2720 | Pure hypercholesterolemia | 37220 | 0.91 |
| 7295 | Pain in limb | 37053 | 0.91 |
| 7840 | Headache | 34046 | 0.84 |
| V061 | Vaccination against diphtheria-tetanus-pertussis | 33382 | 0.82 |
| 78900 | Abdominal pain, unspecified site | 33303 | 0.82 |
| 4610 | Acute maxillary sinusitis | 32290 | 0.79 |
| 6929 | Contact dermatitis and other eczema, unspecified cause | 31538 | 0.77 |
| 2449 | Unspecified acquired hypothyroidism | 29010 | 0.71 |
| 53081 | Esophageal reflux | 28176 | 0.69 |
| 71946 | Pain in joint, lower leg | 27279 | 0.67 |
| 311 | Depressive disorder, not elsewhere classified | 26877 | 0.66 |
| 30000 | Anxiety state, unspecified | 26528 | 0.65 |
| 7245 | Backache, unspecified | 24594 | 0.6 |
| 7804 | Dizziness and giddiness | 21465 | 0.53 |
| 7231 | Cervicalgia | 20879 | 0.51 |
| 7821 | Rash and other nonspecific skin eruption | 20796 | 0.51 |
| 71941 | Pain in joint, shoulder region | 20786 | 0.51 |
| 49390 | Asthma, unspecified type, unspecified | 20607 | 0.51 |
| 4739 | Unspecified sinusitis (chronic) | 20596 | 0.51 |
| 3829 | Unspecified otitis media | 20224 | 0.5 |
Discussion
We present a refinement in strategies for defining comorbid conditions based on provider (outpatient) rather than facility (inpatient) claims. The disease categories presented here are familiar from previous work, especially by Charlson4, Elixhauser5, and van den Bussche10. We extended these systems by adding flexible temporal constraints, including the option to require any number of instances of a diagnosis in provider claims. The value of this flexibility in required numbers was evident in the test run. Smoking and obesity documentation is so sparse that less than 5% of cases are documented during the average 2.5 years of claims data, and half or more of the documented cases were based on only one claim. Requiring two or three claims to confirm these diagnoses would risk eliminating much of the already sparse documentation of these fundamental problems. Conversely, six out of seven patients with musculoskeletal claims do not generate enough claims to be categorized as having chronic conditions; these problems are usually of short duration.
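The two proportions quoted in this paragraph follow from figures given in the Results. The short calculation below is a back-of-the-envelope check rather than a reanalysis; it assumes that the matched and time-criteria-fail percentages in Table 1 describe disjoint groups of patients (as the "another 12%" wording suggests) and uses the quoted population prevalences (above 20% for smoking, nearly 30% for obesity) as rough denominators.

```python
# Percent of the 2.2 million patients, from Table 1 and the Results text.
smoking_documented = 0.83
obesity_documented = 1.0

# Population prevalence quoted in the text (percent), used as rough denominators.
smoking_population = 20.0
obesity_population = 30.0

# Share of expected cases that were actually documented in claims.
smoking_capture = smoking_documented / smoking_population
obesity_capture = obesity_documented / obesity_population

# Musculoskeletal category: matched vs. coded but failing time criteria.
msk_matched = 2.2
msk_time_fail = 12.3
msk_fail_share = msk_time_fail / (msk_matched + msk_time_fail)

print(f"smoking capture: {smoking_capture:.3f}")
print(f"obesity capture: {obesity_capture:.3f}")
print(f"musculoskeletal fail share: {msk_fail_share:.3f}")
```

Both capture ratios come out below 0.05, consistent with "less than 5% of cases," and the musculoskeletal share is close to 6/7.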
In addition, we have explicitly described three levels of hierarchy, starting with 50 broad categories, followed by 154 specific subgroups. The subgroups have much more uniform primary care management implications than the categories. The diagnoses in many subgroups share similar pathology, treatment strategies, complication risks, and morbidity implications. The third level of the hierarchy consists of substitutable bundles of diagnoses within subgroups. Given the frequency of claims in some categories, such as musculoskeletal, false positive categorization would occur over time because of related processes occurring at unrelated sites unless the sites’ (or processes’) codes are grouped in substitutable sets.
These temporal definitions are quite flexible. Transient comorbidities could be defined. For instance, respiratory tract infections in the last 30 days could be specified as a transient comorbidity in a study of asthma. A respiratory infection is a transient risk factor for asthma exacerbations, but one that persists for only a few weeks: farther removed infections are usually irrelevant.
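A transient comorbidity of this kind could be expressed as an ordinary subgroup row with a short look-back. The sketch below is hypothetical: the row values (one outpatient claim, a 30-day persistence) are chosen to match the respiratory infection example, and `has_transient` is an illustrative helper that handles only the single-claim case, so no separation check is needed.

```python
from datetime import date, timedelta
from typing import Dict, List

# Hypothetical subgroup row for a transient comorbidity (values are assumptions).
recent_resp_infection: Dict[str, int] = {
    "ip_count": 0,   # inpatient claims do not establish the subgroup
    "op_count": 1,   # a single outpatient claim is enough
    "min_sep": 0,
    "max_sep": 30,
    "persist": 30,   # only the last 30 days before the assessment date count
}


def has_transient(claim_dates: List[date], assessment: date, row: Dict[str, int]) -> bool:
    """True if enough claims fall inside the look-back window.

    Valid only for rows with op_count == 1, where separation is irrelevant.
    """
    start = assessment - timedelta(days=row["persist"])
    recent = [d for d in claim_dates if start <= d <= assessment]
    return len(recent) >= row["op_count"]


asthma_visit = date(2012, 3, 20)
recent_case = has_transient([date(2012, 3, 5)], asthma_visit, recent_resp_infection)
remote_case = has_transient([date(2011, 12, 1)], asthma_visit, recent_resp_infection)
```

An infection claim 15 days before the asthma visit is counted, while one from the previous December is not.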
Limitations
The categorization outlined here may improve with the addition of treatment information or inferences based on other diagnostic claims. For instance, both smoking and obesity can be inferred from the appearance of associated diseases. Patients with mental health problems (smoking risk factor) and acute bronchitis (smoking consequence) are likely to smoke. Chronic obstructive pulmonary disease or lung cancer would make a history of smoking nearly certain. Obesity is likely when recent claims include knee osteoarthritis, gastroesophageal reflux, sleep apnea, hypertension and type II diabetes. Many other diseases and risk of death can be inferred from the presence of prescription claims23,24. However, pharmacy claims analysis introduces new challenges. Paradoxical relationships between claims and outcomes have been attributed to “selective under-use of drugs by elderly patients”25. Increasingly popular “$4 drug list” prescriptions26 are not captured in claims data. Anonymity as well as price may motivate use of these deeply discounted pharmacies27. Widespread use by insured patients will limit inferences based on prescription claims28–30.
Another issue is that longitudinal surveillance identifies patterns of claims, which may have distinct implications for outcomes31. The use of temporal criteria to identify comorbid conditions risks obscuring this complexity if claims patterns are not considered in analyses.
Conclusions
Temporal constraints applied to ambulatory claims may improve comorbid condition categorization. Nevertheless, incomplete documentation of relevant conditions impedes complete description of primary care patients’ multimorbidity.
Acknowledgments
This work was funded in part by Washington University Institute of Clinical and Translational Sciences grant UL1 TR000448 from the National Center for Advancing Translational Sciences (NIH), by grant number R24 HS19455 (PI: V. Fraser) from the Agency for Healthcare Research and Quality (AHRQ), and by the American Board of Family Medicine.
References
1. Feinstein A. The pre-therapeutic classification of co-morbidity in chronic disease. Journal of chronic diseases. 1970;23(7):455–468. doi: 10.1016/0021-9681(70)90054-8.
2. Kadam UT, Croft PR. Clinical multimorbidity and physical function in older adults: a record and health status linkage study in general practice. Fam Pract. 2007 Oct;24(5):412–419. doi: 10.1093/fampra/cmm049.
3. Saver BG, Wang CY, Dobie SA, Green PK, Baldwin LM. The central role of comorbidity in predicting ambulatory care sensitive hospitalizations. Eur J Public Health. 2014 Feb;24(1):66–72. doi: 10.1093/eurpub/ckt019.
4. Charlson M, Szatrowski TP, Peterson J, Gold J. Validation of a combined comorbidity index. Journal of clinical epidemiology. 1994 Nov;47(11):1245–1251. doi: 10.1016/0895-4356(94)90129-5.
5. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Medical care. 1998 Jan;36(1):8–27. doi: 10.1097/00005650-199801000-00004.
6. Klabunde CN, Potosky AL, Legler JM, Warren JL. Development of a comorbidity index using physician claims data. Journal of clinical epidemiology. 2000 Dec;53(12):1258–1267. doi: 10.1016/s0895-4356(00)00256-0.
7. Wang CY, Baldwin LM, Saver BG, et al. The contribution of longitudinal comorbidity measurements to survival analysis. Medical care. 2009 Jul;47(7):813–821. doi: 10.1097/MLR.0b013e318197929c.
8. Huntley AL, Johnson R, Purdy S, Valderas JM, Salisbury C. Measures of multimorbidity and morbidity burden for use in primary care and community settings: a systematic review and guide. Ann Fam Med. 2012 Mar-Apr;10(2):134–141. doi: 10.1370/afm.1363.
9. Schafer I, Hansen H, Schon G, et al. The German MultiCare-study: Patterns of multimorbidity in primary health care - protocol of a prospective cohort study. BMC Health Serv Res. 2009;9:145. doi: 10.1186/1472-6963-9-145.
10. van den Bussche H, Schon G, Kolonko T, et al. Patterns of ambulatory medical care utilization in elderly patients with special reference to chronic diseases and multimorbidity–results from a claims data based observational study in Germany. BMC Geriatr. 2011;11:54. doi: 10.1186/1471-2318-11-54.
11. Bowden K. Managing up to maximize medicare reimbursement for outpatient care. Am J Health Syst Pharm. 2001 Oct 1;58(Suppl 1):S14–16. doi: 10.1093/ajhp/58.suppl_1.S14.
12. Kesselheim AS, Brennan TA. Overbilling vs. downcoding--the battle between physicians and insurers. The New England journal of medicine. 2005 Mar 3;352(9):855–857. doi: 10.1056/NEJMp058011.
13. Holmberg S, Rothstein B. Dying of corruption. Health Econ Policy Law. 2011 Oct;6(4):529–547. doi: 10.1017/S174413311000023X.
14. Thammasitboon S, Singhal G. Diagnosing diagnostic error. Curr Probl Pediatr Adolesc Health Care. 2013 Oct;43(9):227–231. doi: 10.1016/j.cppeds.2013.07.002.
15. Ely JW, Kaldjian LC, D’Alessandro DM. Diagnostic errors in primary care: lessons learned. J Am Board Fam Med. 2013 Jan-Feb;25(1):87–97. doi: 10.3122/jabfm.2012.01.110174.
16. Singh H, Giardina TD, Forjuoh SN, et al. Electronic health record-based surveillance of diagnostic errors in primary care. BMJ Qual Saf. 2013 Feb;21(2):93–100. doi: 10.1136/bmjqs-2011-000304.
17. Baldwin LM, Klabunde CN, Green P, Barlow W, Wright G. In search of the perfect comorbidity measure for use with administrative claims data: does it exist? Medical care. 2006 Aug;44(8):745–753. doi: 10.1097/01.mlr.0000223475.70440.07.
18. Ninomiya T, Perkovic V, Turnbull F, et al. Blood pressure lowering and major cardiovascular events in people with and without chronic kidney disease: meta-analysis of randomised controlled trials. BMJ. 2013;347:f5680. doi: 10.1136/bmj.f5680.
19. Marenzi G, Cabiati A, Assanelli E. Chronic kidney disease in acute coronary syndromes. World J Nephrol. 2013 Oct 6;1(5):134–145. doi: 10.5527/wjn.v1.i5.134.
20. Cigarette smoking among adults and trends in smoking cessation - United States, 2008. MMWR Morb Mortal Wkly Rep. 2009 Nov 13;58(44):1227–1232.
21. Lee DJ, Fleming LE, Arheart KL, et al. Smoking rate trends in U.S. occupational groups: the 1987 to 2004 National Health Interview Survey. J Occup Environ Med. 2007 Jan;49(1):75–81. doi: 10.1097/JOM.0b013e31802ec68c.
22. Hertz RP, Unger AN, McDonald M, Lustik MB, Biddulph-Krentar J. The impact of obesity on work limitations and cardiovascular risk factors in the U.S. workforce. J Occup Environ Med. 2004 Dec;46(12):1196–1203.
23. Clark DO, Von Korff M, Saunders K, Baluch WM, Simon GE. A chronic disease score with empirically derived weights. Medical care. 1995 Aug;33(8):783–795. doi: 10.1097/00005650-199508000-00004.
24. Von Korff M, Wagner EH, Saunders K. A chronic disease score from automated pharmacy data. Journal of clinical epidemiology. 1992 Feb;45(2):197–203. doi: 10.1016/0895-4356(92)90016-g.
25. Glynn RJ, Knight EL, Levin R, Avorn J. Paradoxical relations of drug treatment with mortality in older persons. Epidemiology (Cambridge, Mass.). 2001 Nov;12(6):682–689. doi: 10.1097/00001648-200111000-00017.
26. Gatwood J, Tungol A, Truong C, Kucukarslan SN, Erickson SR. Prevalence and predictors of utilization of community pharmacy generic drug discount programs. J Manag Care Pharm. 2011 Jul-Aug;17(6):449–455. doi: 10.18553/jmcp.2011.17.6.449.
27. Rucker NL. $4 generics: How low, how broad, and why patient engagement is priceless. J Am Pharm Assoc. 2010 Nov-Dec;50(6):761–763. doi: 10.1331/JAPhA.2010.10546.
28. Czechowski JL, Tjia J, Triller DM. Deeply discounted medications: Implications of generic prescription drug wars. J Am Pharm Assoc. 2010 Nov-Dec;50(6):752–757. doi: 10.1331/JAPhA.2010.09114.
29. Tungol A, Starner CI, Gunderson BW, Schafer JA, Qiu Y, Gleason PP. Generic drug discount programs: are prescriptions being submitted for pharmacy benefit adjudication? J Manag Care Pharm. 2012 Nov-Dec;18(9):690–700. doi: 10.18553/jmcp.2012.18.9.690.
30. Omojasola A, Hernandez M, Sansgiry S, Paxton R, Jones L. Predictors of $4 generic prescription drug discount programs use in the low-income population. Res Social Adm Pharm. 2014 Jan-Feb;10(1):141–148. doi: 10.1016/j.sapharm.2013.04.001.
31. Freund T, Kunz CU, Ose D, Szecsenyi J, Peters-Klimm F. Patterns of multimorbidity in primary care patients at high risk of future hospitalization. Popul Health Manag. 2012 Apr;15(2):119–124. doi: 10.1089/pop.2011.0026.
