Abstract
BACKGROUND: Nontuberculous mycobacterial lung disease (NTMLD) is a debilitating disease. Chronic obstructive pulmonary disease (COPD) is the leading comorbidity associated with NTMLD in the United States. Because the 2 conditions share symptoms and have overlapping radiological findings, NTMLD diagnosis may be delayed in patients with COPD.
OBJECTIVE: To develop a predictive model that identifies potentially undiagnosed NTMLD among patients with COPD.
METHODS: This retrospective cohort study developed a predictive model of NTMLD using US Medicare beneficiary claims data (2006 - 2017). Patients with COPD with NTMLD were matched 1:3 to patients with COPD without NTMLD by age, sex, and year of COPD diagnosis. The predictive model was developed using logistic regression modeling risk factors such as pulmonary symptoms, comorbidities, and health care resource utilization. The final model was based on model fit statistics and clinical inputs. Model performance was evaluated for both discrimination and generalizability with c-statistics and receiver operating characteristic curves.
RESULTS: There were 3,756 patients with COPD with NTMLD identified and matched to 11,268 patients with COPD without NTMLD. A higher proportion of patients with COPD with NTMLD, compared with those with COPD without NTMLD, had claims for pulmonary symptoms and conditions, including hemoptysis (12.6% vs 1.4%), cough (63.4% vs 24.7%), dyspnea (72.5% vs 38.2%), pneumonia (59.2% vs 13.4%), chronic bronchitis (40.5% vs 16.3%), emphysema (36.7% vs 11.1%), and lung cancer (15.7% vs 3.5%). A higher proportion of patients with COPD with NTMLD had pulmonologist and infectious disease (ID) specialist visits than patients with COPD without NTMLD (≥ 1 pulmonologist visit: 81.3% vs 23.6%, respectively; ≥ 1 ID visit: 28.3% vs 4.1%, respectively, P < 0.0001). The final model consists of 10 risk factors (≥ 2 ID specialist visits; ≥ 4 pulmonologist visits; the presence of hemoptysis, cough, emphysema, pneumonia, tuberculosis, lung cancer, or idiopathic interstitial lung disease; and being underweight during a 1-year pre-NTMLD period) predicting NTMLD with high sensitivity and specificity (c-statistic, 0.9). The validation of the model on new testing data demonstrated similar discrimination and showed the model was able to predict NTMLD earlier than the receipt of the first diagnostic claim for NTMLD.
CONCLUSIONS: This predictive algorithm uses a set of criteria comprising patterns of health care use, respiratory symptoms, and comorbidities to identify patients with COPD and possibly undiagnosed NTMLD with high sensitivity and specificity. It has potential application in raising timely clinical suspicion of patients with possibly undiagnosed NTMLD, thereby reducing the period of undiagnosed NTMLD.
DISCLOSURES: Dr Wang and Dr Hassan are employees of Insmed, Inc. Dr Chatterjee was an employee of Insmed, Inc, at the time of this study. Dr Marras is participating in multicenter clinical trials sponsored by Insmed, Inc, has consulted for RedHill Biopharma, and has received a speaker’s honorarium from AstraZeneca. Dr Allison is an employee of Statistical Horizons, LLC.
This study was funded by Insmed Inc.
Plain language summary
Nontuberculous mycobacterial (NTM) lung infections are hard to detect in people with chronic obstructive pulmonary disease (COPD) because the symptoms are similar (eg, cough or shortness of breath), but it is important to find and treat NTM infections to stop or slow down lung damage. This study created a way to predict which people with COPD might also have an undiagnosed NTM infection. It is useful to alert doctors to test high-risk patients for NTM infection to help start treatment quickly and improve lung health.
Implications for managed care pharmacy
NTM lung disease (NTMLD) in patients with COPD often progresses silently and remains undiagnosed. Without timely management, NTMLD could add substantial illness burden to the already burdensome COPD. This model can identify possible undiagnosed NTMLD among patients with COPD with high sensitivity and specificity. Applying it to patients with COPD enables the early diagnosis and treatment of NTMLD. It informs payers regarding benefit design, coverage, and formulary placement to help this subpopulation access effective treatment.
Nontuberculous mycobacterial lung disease (NTMLD) is a debilitating disease associated with significant treatment challenges and a high burden of disease.1 NTMLD prevalence varies by region and has increased worldwide over the past decade according to population-based data from North America, Europe, and Asia.2 For example, a study based on a national managed care claims database reported that the annual prevalence of NTMLD in the United States increased from 6.8 per 100,000 persons in 2008 to 11.7 per 100,000 persons in 2015; for those aged 65 years and older, the annual prevalence increased from 30.27 to 47.48 per 100,000 persons, which is much higher than the prevalence in the population aged younger than 65 years.3 Without appropriate and timely management, NTMLD infection can worsen over time, leading to progressive lung damage and poor clinical outcomes.1,4,5
Many patients with NTMLD have respiratory comorbidities associated with structural lung disease, such as chronic obstructive pulmonary disease (COPD), bronchiectasis, and cystic fibrosis.1,6-8 These comorbidities are among the key host risk factors for NTMLD9 (which include COPD,8 cystic fibrosis,10 bronchiectasis,11,12 prior tuberculosis infection,11 and immunosuppression8,13) and, as such, may be contributing factors in NTMLD development. Among these, COPD is the leading comorbidity of NTMLD in the United States, with an estimated 81% - 87% of Medicare beneficiaries with NTMLD having comorbid COPD.8,14
Overall, the prevalence and disease burden of COPD in the United States is high, with more than 6% of adults aged at least 40 years diagnosed with COPD in 2014 and 2015.15 Globally, more than 300 million people have COPD, and it represents the third leading cause of death worldwide, with 3.23 million deaths in 2019.16-18 NTMLD has been shown to add a substantial incremental mortality risk and health care resource utilization burden to COPD.19,20 The mortality risk among patients with COPD and NTMLD is approximately 2 times higher than for those with COPD without NTMLD.20 Additionally, hospitalizations and emergency department (ED) visits during a 1-year period following the diagnosis of NTMLD are markedly increased in patients with COPD and NTMLD compared with those without NTMLD (odds ratio [95% CI] of respiratory-related hospitalization, 2.5 [2.2 - 2.7]).19
Similarities in symptom presentation and overlapping radiological findings with other underlying pulmonary diseases may delay the diagnosis of NTMLD for years.1,4-6,21,22 Given that there is increasing patient morbidity and suffering as lung damage worsens,23-25 it would follow that an earlier diagnosis should permit timely treatment to prevent progressive lung damage, improve clinical outcomes, and reduce disease burden. As most patients with NTMLD have comorbid COPD,14,26 and COPD is a risk factor for the development of NTMLD,9 early identification of NTMLD specifically among patients with COPD could help to address this issue. To facilitate this, a predictive modeling algorithm using administrative claims data was developed to help identify potentially undiagnosed NTMLD among patients with COPD.
Methods
MODELING APPROACH
This predictive model of NTMLD among patients with COPD was developed based on logistic regression, which estimates the probability that a patient with COPD has NTMLD given their risk factors. The logistic regression takes the following form:
$$P(x_1,\ldots,x_n) = \frac{1}{1 + \exp\left(-\left(\beta_0 + \sum_{i=1}^{n} \beta_i x_i\right)\right)}$$
where P(x1,…,xn) is the probability that a patient has NTMLD, given the risk factors x1,…,xn. The intercept is β0, and βi (i = 1,…,n) is the regression coefficient for risk factor xi.
To train and evaluate the model’s performance, training, validation, and testing data were extracted from the medical claims of patients with COPD with NTMLD and matched patients with COPD without NTMLD. The learning process selects risk factors from a candidate set of risk factors characterizing patients’ comorbidities and resource utilization, and the model learns the intercept β0 and the coefficient βi for each selected risk factor xi. The underlying assumption was that training data consisting of a 1-year history of risk factors prior to the diagnosis of NTMLD would be sufficient to identify NTMLD. The learned model was then tested on new testing data, and model performance on that dataset is reported.27
DATA SOURCE AND STUDY POPULATION
This study was conducted using 100% of the beneficiary records from the US Medicare Part A and Part B claims database (inpatient, outpatient, and carrier files) from 2006 to 2017, sourced from the Centers for Medicare & Medicaid Services. Medicare beneficiaries (limited to those eligible for Medicare because of being aged ≥ 65 years) with COPD and NTMLD were identified as follows. Patients with COPD were defined as beneficiaries who had at least 2 ambulatory encounters with diagnostic codes for COPD (Supplementary Table 1, available in online article) dated at least 30 days apart or at least 1 hospitalization with a principal or secondary diagnosis for COPD.28 The diagnosis date of COPD was the date of the first claim that fulfilled the COPD definition. Patients with NTMLD were defined as beneficiaries with at least 2 medical encounters with diagnostic codes for NTMLD (Supplementary Table 1) from an office visit (diagnostic code assigned by a physician), a hospital inpatient stay, or a hospital outpatient visit dated at least 30 days apart but within 365 days.3 The diagnosis date of NTMLD was defined as the date of the first claim that fulfilled the NTMLD definition. Additional requirements were that patients with NTMLD must have been newly diagnosed between 2011 and 2016 (no prior claims with NTMLD diagnostic codes since 1999) and that COPD was diagnosed prior to the NTMLD diagnosis. Patients with bronchiectasis were excluded from this study because of its complex and poorly understood relationship with NTMLD in terms of both causality and risk.29,30 Patients with COPD with NTMLD were matched (without replacement) 1:3 to patients with COPD without NTMLD by age, sex, and year of COPD diagnosis.
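The two claims-based case definitions above can be sketched in code. This is a minimal illustration, not the authors' implementation: the function names and the toy claim dates are hypothetical, and the sketch assumes the diagnostic-code filtering (Supplementary Table 1) has already been applied so that each list holds only qualifying claim dates.

```python
from datetime import date
from itertools import combinations

def has_qualifying_pair(dates, min_gap_days=30, max_gap_days=None):
    """Return True if any two claim dates are at least min_gap_days apart
    and, when max_gap_days is given, no more than max_gap_days apart."""
    for first, second in combinations(sorted(dates), 2):
        gap = (second - first).days
        if gap >= min_gap_days and (max_gap_days is None or gap <= max_gap_days):
            return True
    return False

def meets_copd_definition(ambulatory_dates, n_copd_hospitalizations):
    # >= 2 ambulatory COPD encounters dated >= 30 days apart,
    # or >= 1 hospitalization with a COPD diagnosis
    return n_copd_hospitalizations >= 1 or has_qualifying_pair(ambulatory_dates, 30)

def meets_ntmld_definition(encounter_dates):
    # >= 2 NTMLD-coded encounters dated >= 30 days apart but within 365 days
    return has_qualifying_pair(encounter_dates, 30, 365)

# Hypothetical toy data for illustration only
copd_claims = [date(2012, 1, 10), date(2012, 1, 25), date(2012, 3, 1)]
ntm_claims = [date(2014, 5, 1), date(2014, 5, 20), date(2015, 6, 15)]

print(meets_copd_definition(copd_claims, 0))  # True: Jan 10 and Mar 1 are 51 days apart
print(meets_ntmld_definition(ntm_claims))     # False: every pair is < 30 or > 365 days apart
```

The sketch returns only whether the definition is met; assigning the diagnosis date would additionally require a rule for which claim in the qualifying pair counts as "the first claim that fulfilled the definition."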
STUDY DESIGN AND ANALYSIS PLAN
This was a retrospective cohort study. The index date (defined as the date of the first claim with a diagnosis of NTMLD) of a given patient with COPD with NTMLD was assigned to the 3 matched patients with COPD without NTMLD. All patients with COPD with NTMLD and matched patients with COPD without NTMLD had 5-year pre-index continuous coverage in Medicare Parts A and B.
The predictive model of NTMLD in patients with COPD was developed and validated in 4 steps.31 The first step was data inspection and coding to identify risk factors that could be used for predicting potentially undiagnosed NTMLD. Patients’ demographics (age, sex, and race and ethnicity); clinical characteristics (pulmonary symptoms, pulmonary and nonpulmonary comorbidities, and Charlson Comorbidity Index score); and resource utilization, such as visits to pulmonologists and infectious disease (ID) specialists, hospitalizations, and ED visits, during the 1-year pre-index period were compared between the 2 groups. Clinical characteristics were binary categorical variables, and health care resource utilization data were mostly continuous variables. After data inspection, some continuous variables were converted into dichotomous variables to support the ease of interpretation of risk factors in the final model. For example, visits to ID specialists were dichotomized as having at least 2 visits during the 1-year pre-index period or not.
The second step was model specification and estimation. The patients from the 2 cohorts (patients with COPD with NTMLD and matched patients with COPD without NTMLD) were split uniformly at random into mutually exclusive datasets of training (50%), validation (25%), and testing (25%). The training data were used to fit the model, the validation data were used to estimate the prediction error for model selection, and the testing data were used for assessing the generalization error of the final chosen model. Logistic regression with forward stepwise selection was applied for predictive modeling. The risk factors of the predictive model were selected sequentially based on the Bayesian information criterion (BIC) estimated from the training dataset. The model fitting started as a null model without any risk factors; the risk factor added to the model at the kth iteration was chosen based on the BIC evaluated on the training data, where BIC = −2 log(P(x1,…,xk)) + k log(N), P(x1,…,xk) is the likelihood of the model containing risk factors x1,…,xk, and N is the number of patients in the training data. A smaller BIC indicates better model fit. The iterative selection process stopped when the average squared error of the predictive model evaluated on the validation set at the current iteration was greater than at the previous iteration. The c-statistics (ie, concordance) evaluated on the training, validation, and testing datasets were reported. Variable coding was adjusted as necessary. Ease of interpretation was considered, and some continuous risk factors were converted to categorical ones during model development, for example, by thresholding the number of visits to pulmonologists. As a result, the final model reflects a joint consideration of clinical context/medical rationale and model discrimination.
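The data split, the BIC, and the stopping rule described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the SAS HPLOGISTIC implementation used in the study: the function names are hypothetical, the split uses exact 50/25/25 fractions (the reported split sizes differ slightly), and the log-likelihood is assumed to be supplied by whatever routine fits the logistic regression.

```python
import math
import random

def split_cohort(patient_ids, seed=0):
    """Uniform random split into training (50%), validation (25%), and testing (25%)."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = len(ids) // 2
    n_valid = len(ids) // 4
    train = ids[:n_train]
    valid = ids[n_train:n_train + n_valid]
    test = ids[n_train + n_valid:]
    return train, valid, test

def bic(log_likelihood, k, n):
    """BIC = -2 * log-likelihood + log(n) * k; a smaller value indicates better fit."""
    return -2.0 * log_likelihood + math.log(n) * k

def should_stop(validation_ase_history):
    """Stop forward selection once the validation average squared error
    at the current iteration exceeds that of the previous iteration."""
    return (len(validation_ase_history) >= 2
            and validation_ase_history[-1] > validation_ase_history[-2])

train, valid, test = split_cohort(range(15024))
print(len(train), len(valid), len(test))  # 7512 3756 3756
print(should_stop([0.120, 0.110, 0.112]))  # True: error rose at the last step
```

At each iteration, a forward stepwise procedure would fit one candidate model per remaining risk factor on the training data, keep the candidate with the smallest `bic`, append its validation error to the history, and consult `should_stop`.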
In the third step, discriminatory performance of the final model was evaluated using c-statistics and receiver operating characteristic (ROC) curves. The associated odds ratios of risk factors in the final model for predicting NTMLD were estimated via logistic regression.
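For readers unfamiliar with the c-statistic, the following minimal sketch computes it directly from its definition: the proportion of (case, control) pairs in which the case receives the higher predicted score, with ties counted as one half. This equals the area under the ROC curve. The function name and toy data are hypothetical, and the quadratic-time loop is chosen for clarity rather than efficiency.

```python
def c_statistic(scores, labels):
    """Concordance: share of (case, control) pairs where the case scores higher.
    Tied pairs contribute 0.5. Labels are 1 for cases and 0 for controls."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))

# Hypothetical toy scores: 2 cases and 3 controls give 6 pairs
scores = [0.9, 0.4, 0.7, 0.2, 0.4]
labels = [1, 1, 0, 0, 0]
print(c_statistic(scores, labels))  # 0.75 (4 concordant pairs + 1 tie out of 6)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.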
In the fourth step, the predictive model was validated on a new testing dataset different from the testing data used in the model estimation. Model performance on the new testing dataset was evaluated by ROC curve.
Categorical variables were presented as the numbers and percentages of patients; continuous variables were summarized by mean and SD and medians and quartiles. Statistical tests comparing patients with COPD with NTMLD and patients with COPD without NTMLD included the McNemar test for categorical variables and the Wilcoxon signed-rank test for continuous variables; an α of 0.05 was defined as statistically significant. The analysis was conducted on the research-identifiable files residing in the Centers for Medicare & Medicaid Services virtual research data center with SAS Enterprise Guide (SAS Institute, Inc). The modeling was implemented with the SAS HPLOGISTIC Procedure.
Results
STUDY POPULATION
A total of 3,756 patients with COPD with NTMLD who met the study criteria were identified from the US Medicare claims database and successfully matched in a 1:3 ratio to patients with COPD without NTMLD (n = 11,268) (Figure 1). The median index age for both patients with COPD with NTMLD and patients with COPD without NTMLD was 78 years, with 56.6% of patients being female in both groups and >90% being White (Table 1).
FIGURE 1.
Patient Identification Flowchart
TABLE 1.
Demographic and Clinical Characteristics of Patients With COPD With NTMLD and Patients With COPD Without NTMLD During the 1-Year Pre-Index Period
Demographic characteristics | Patients with COPD with NTMLD (n = 3,756) | Patients with COPD without NTMLD (n = 11,268) |
---|---|---|
Age at index date, years | ||
Mean (SD) | 78.7 (5.8) | 78.7 (5.8) |
Median (Q1, Q3) | 78 (74, 83) | 78 (74, 83) |
Female sex, n (%) | 2,125 (56.6) | 6,375 (56.6) |
Male sex, n (%) | 1,631 (43.4) | 4,893 (43.4) |
Race and ethnicity, n (%) | ||
White | 3,465 (92.3) | 10,180 (90.3) |
Black | 108 (2.9) | 627 (5.6) |
Asian | 75 (2.0) | 166 (1.5) |
Hispanic | 33 (0.9) | 130 (1.2) |
North American Native | 16 (0.4) | 39 (0.3) |
Other or unknown | 59 (1.6) | 126 (1.1) |
Clinical characteristics | ||
Charlson Comorbidity Index, mean (SD) | 2.8 (1.6) | 2.2 (1.7) |
Select nonpulmonary comorbidities, n (%) | ||
All cardiovascular diseases | 3,531 (94.0) | 10,293 (91.3) |
Hypertension | 3,096 (82.4) | 9,340 (82.9) |
Gastroesophageal reflux disease | 1,570 (41.8) | 3,350 (29.7) |
Diabetes | 993 (26.4) | 3,928 (34.9) |
Malnutrition | 992 (26.4) | 2,365 (21.0) |
All cancer, excluding lung cancer | 913 (24.3) | 1,884 (16.7) |
Underweight or abnormal weight loss | 802 (21.4) | 756 (6.7) |
Chronic kidney disease | 738 (19.6) | 2,080 (18.5) |
Dementia | 454 (12.1) | 1,805 (16.0) |
Overweight and obesity | 296 (7.9) | 1,285 (11.4) |
Rheumatoid arthritis | 264 (7.0) | 470 (4.2) |
Moderate or severe liver disease | 26 (0.7) | 47 (0.4) |
Transplant of kidney, heart, or liver | 21 (0.6) | 14 (0.2) |
Select pulmonary symptoms and comorbidities, n (%) | ||
Dyspnea | 2,722 (72.5) | 4,309 (38.2) |
Cough | 2,380 (63.4) | 2,788 (24.7) |
Pneumonia | 2,225 (59.2) | 1,509 (13.4) |
Simple and mucopurulent chronic bronchitis | 1,520 (40.5) | 1,841 (16.3) |
Smoking history | 1,428 (38.0) | 2,041 (18.1) |
Emphysema | 1,379 (36.7) | 1,251 (11.1) |
Idiopathic interstitial lung disease | 1,017 (27.1) | 501 (4.4) |
Asthma | 1,011 (26.9) | 1,838 (16.3) |
Malignant neoplasm of bronchus and lung | 590 (15.7) | 393 (3.5) |
Hemoptysis | 475 (12.6) | 159 (1.4) |
Pulmonary tuberculosis | 354 (9.4) | 13 (0.1) |
Idiopathic pulmonary fibrosis | 102 (2.7) | 44 (0.4) |
Lung transplant | 18 (0.5) | 2 (0) |
Cystic fibrosis with pulmonary manifestations | 1 (0) | 0 |
P values based on McNemar test indicated there was a statistically significant difference between patients with COPD with NTMLD and patients with COPD without NTMLD for all clinical characteristics (P < 0.01) except for transplant of kidney, heart, or liver (P = 0.66) and cystic fibrosis (not applicable).
COPD = chronic obstructive pulmonary disease; NTMLD = nontuberculous mycobacterial lung disease; Q = quartile.
PATIENT CLINICAL CHARACTERISTICS DURING THE 1-YEAR PRE-INDEX PERIOD
A higher percentage of patients with COPD with NTMLD had nonpulmonary comorbidities of rheumatoid arthritis (7.0% vs 4.2%) and gastroesophageal reflux disease (41.8% vs 29.7%) and were underweight (21.4% vs 6.7%) than those with COPD without NTMLD (Table 1). A higher percentage of patients with COPD with NTMLD had claims for pulmonary symptoms of hemoptysis (12.6% vs 1.4%), cough (63.4% vs 24.7%), dyspnea (72.5% vs 38.2%), and most pulmonary conditions, such as pneumonia (59.2% vs 13.4%), chronic bronchitis (40.5% vs 16.3%), emphysema (36.7% vs 11.1%), and lung cancer (15.7% vs 3.5%), than those with COPD without NTMLD. Additionally, the Charlson Comorbidity Index score was higher in patients with COPD with NTMLD than in those with COPD without NTMLD (2.8 vs 2.2).
HEALTH CARE RESOURCE UTILIZATION DURING THE 1-YEAR PRE-INDEX PERIOD
Overall, a significantly higher proportion of patients with COPD with NTMLD had visits to pulmonologists and ID specialists compared with patients with COPD without NTMLD (81.3% vs 23.6%, respectively, had at least 1 pulmonologist visit, and 28.3% vs 4.1%, respectively, had at least 1 ID visit, P < 0.0001). There was also a higher number of visits per patient among patients with COPD with NTMLD than in patients with COPD without NTMLD (mean [SD] of 9.91 [12.20] vs 1.87 [5.68], respectively, for pulmonologist visits and 2.12 [8.42] vs 0.35 [3.64], respectively, for ID visits). Visits to a pulmonologist were more common than visits to ID specialists in both groups. A notably greater proportion of patients with COPD with NTMLD visited a pulmonologist on at least 4 occasions compared with patients with COPD without NTMLD (65.6% vs 14.9%, respectively), and the proportion who had at least 4 visits to an ID specialist was nearly 7 times higher (15.3% vs 2.3%, respectively). There were more hospitalizations and ED visits among patients with COPD with NTMLD than among patients with COPD without NTMLD, especially for respiratory- and COPD-associated hospitalizations (mean [SD] of 0.52 [0.90] vs 0.17 [0.54], respectively, for respiratory-associated hospitalizations and 0.15 [0.49] vs 0.06 [0.30], respectively, for COPD-associated hospitalizations) (Table 2).
TABLE 2.
Health Care Resource Utilization During 1-Year Pre-Index Period
Hospitalizations, ED visits, and physician specialties | Patients with COPD with NTMLD (n = 3,756) | Patients with COPD without NTMLD (n = 11,268) |
---|---|---|
All-cause hospitalizations | ||
Counts among all patients, mean (SD); median (Q1, Q3) | 0.85 (1.25); 0 (0,1) | 0.52 (1.07); 0 (0,1) |
1, n (%) | 999 (26.6) | 1,980 (17.6) |
2-3, n (%) | 650 (17.3) | 1,027 (9.1) |
≥ 4, n (%) | 143 (3.8) | 307 (2.7) |
Respiratory-associated hospitalizations | ||
Counts among all patients, mean (SD); median (Q1, Q3) | 0.52 (0.90); 0 (0,1) | 0.17 (0.54); 0 (0,0) |
1, n (%) | 895 (23.8) | 998 (8.9) |
2-3, n (%) | 361 (9.6) | 303 (2.7) |
≥ 4, n (%) | 49 (1.3) | 48 (0.4) |
COPD-associated hospitalizations | ||
Counts among all patients, mean (SD); median (Q1, Q3) | 0.15 (0.49); 0 (0,0) | 0.06 (0.30); 0 (0,0) |
1, n (%) | 354 (9.4) | 431 (3.8) |
2-3, n (%) | 82 (2.2) | 93 (0.8) |
≥ 4, n (%) | 10 (0.3) | 7 (0.1) |
ED visit followed by hospitalization | ||
Counts among all patients, mean (SD); median (Q1, Q3) | 0.55 (0.97); 0 (0,1) | 0.36 (0.86); 0 (0,0) |
1, n (%) | 839 (22.3) | 1,600 (14.2) |
2-3, n (%) | 381 (10.1) | 715 (6.4) |
≥ 4, n (%) | 81 (2.2) | 164 (1.5) |
ED visit without subsequent hospitalization | ||
Counts among all patients, mean (SD); median (Q1, Q3) | 1.02 (2.49); 0 (0,1) | 0.85 (2.12); 0 (0,1) |
1, n (%) | 596 (15.9) | 1,472 (13.1) |
2-3, n (%) | 513 (13.7) | 1,318 (11.7) |
≥ 4, n (%) | 315 (8.4) | 801 (7.1) |
Visits to pulmonologist^a | ||
Counts among all patients, mean (SD); median (Q1, Q3) | 9.91 (12.20); 6 (2, 13) | 1.87 (5.68); 0 (0, 0) |
1, n (%) | 187 (5.0) | 332 (2.9) |
2-3, n (%) | 404 (10.8) | 654 (5.8) |
≥ 4, n (%) | 2,463 (65.6) | 1,682 (14.9) |
Visits to ID specialist^a | ||
Counts among all patients, mean (SD); median (Q1, Q3) | 2.12 (8.42); 0 (0, 1) | 0.35 (3.64); 0 (0, 0) |
1, n (%) | 219 (5.8) | 65 (0.6) |
2-3, n (%) | 269 (7.2) | 139 (1.2) |
≥ 4, n (%) | 575 (15.3) | 257 (2.3) |
a A significantly higher proportion of patients with COPD with NTMLD than patients with COPD without NTMLD had visits to pulmonologists and ID specialists (P < 0.0001, Wilcoxon signed-rank test for continuous variables).
COPD = chronic obstructive pulmonary disease; ED = emergency department; ID = infectious disease; NTMLD = nontuberculous mycobacterial lung disease; Q = quartile.
PREDICTIVE MODEL SPECIFICATION, ESTIMATION, AND PERFORMANCE
The predictive model was a joint consideration of medical rationale/clinical context and model discrimination and generalizability. Following the inspection of clinical characteristics and resource utilization, candidate risk factors were calculated for each patient, including most pulmonary symptoms and conditions (eg, hemoptysis, cough, dyspnea, asthma, idiopathic pulmonary fibrosis, chronic bronchitis, emphysema, idiopathic interstitial lung disease, tuberculosis, lung cancer, lung transplant, pneumonia, and smoking history), nonpulmonary conditions that were more common among patients with COPD with NTMLD (eg, gastroesophageal reflux disease, underweight or abnormal weight loss, and malnutrition), and resource utilization (≥ 2 visits to an ID specialist or ≥ 4 visits to a pulmonologist). The candidate risk factors extracted from the total 15,024 patients (3,756 patients with COPD with NTMLD and the matched 11,268 patients with COPD without NTMLD) were split into mutually exclusive training (n = 7,413), validation (n = 3,750), and testing (n = 3,861) datasets. Within the training, validation, and testing data, the ratio between the number of patients with NTMLD and the number of patients without NTMLD was approximately 1:3 because of uniform random sampling. A final set of 10 risk factors for NTMLD among patients with COPD was identified by model estimation through logistic regression with forward stepwise selection. These risk factors, along with their estimated coefficients and odds ratios, are shown in Table 3. This binary logistic regression model predicts potentially undiagnosed NTMLD with high sensitivity and specificity, as demonstrated by the ROC curves and concordances of 0.8982, 0.8998, and 0.8908 (ie, area under ROC curves) evaluated on the training set, validation set, and testing set, respectively (Figure 2).
TABLE 3.
Risk Factors for NTMLD Among Patients With COPD and Their Odds Ratios Associated With NTMLD Based on Logistic Regression
Risk factors for NTMLD among COPD | Maximum likelihood estimates | Pr > chi-square | Odds ratio (had risk factor vs no) | 95% CI |
---|---|---|---|---|
Intercept | −3.1418 | < 0.0001 | — | — |
≥ 2 ID specialist visits | 0.9324 | < 0.0001 | 2.54 | 2.14-3.01 |
≥ 4 Pulmonologist visits | 1.6322 | < 0.0001 | 5.12 | 4.62-5.67 |
Hemoptysis | 1.476 | < 0.0001 | 4.38 | 3.47-5.52 |
Cough | 0.9064 | < 0.0001 | 2.48 | 2.24-2.74 |
Emphysema | 0.6144 | < 0.0001 | 1.85 | 1.64-2.08 |
Tuberculosis | 3.7594 | < 0.0001 | 42.92 | 23.98-76.84 |
Lung cancer | 0.7344 | < 0.0001 | 2.08 | 1.75-2.49 |
Idiopathic interstitial lung disease | 1.0391 | < 0.0001 | 2.83 | 2.44-3.27 |
Underweight | 0.7938 | < 0.0001 | 2.21 | 1.91-2.56 |
Pneumonia | 1.0828 | < 0.0001 | 2.95 | 2.65-3.29 |
COPD = chronic obstructive pulmonary disease; ID = infectious disease; NTMLD = nontuberculous mycobacterial lung disease; Pr = probability.
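To illustrate how the coefficients in Table 3 translate into a score, the sketch below plugs them into the standard logistic form. The dictionary keys and function names are hypothetical labels chosen for this example. One caveat, stated as an assumption about interpretation rather than a result of the study: because the model was fitted on a 1:3 matched sample, the intercept reflects the sampled proportion of cases, so the output is better read as a relative score than as an absolute probability of NTMLD in an unselected COPD population.

```python
import math

# Maximum likelihood estimates copied from Table 3
INTERCEPT = -3.1418
COEFFICIENTS = {
    "id_visits_ge_2": 0.9324,
    "pulmonologist_visits_ge_4": 1.6322,
    "hemoptysis": 1.476,
    "cough": 0.9064,
    "emphysema": 0.6144,
    "tuberculosis": 3.7594,
    "lung_cancer": 0.7344,
    "idiopathic_interstitial_lung_disease": 1.0391,
    "underweight": 0.7938,
    "pneumonia": 1.0828,
}

def predicted_score(present_risk_factors):
    """Logistic score for a patient, given the names of the risk factors present."""
    linear = INTERCEPT + sum(COEFFICIENTS[name] for name in present_risk_factors)
    return 1.0 / (1.0 + math.exp(-linear))

def odds_ratio(name):
    """Odds ratio for a binary risk factor is exp(coefficient)."""
    return math.exp(COEFFICIENTS[name])

# Hypothetical patient: cough, pneumonia, and >= 4 pulmonologist visits in the past year
patient = ["cough", "pneumonia", "pulmonologist_visits_ge_4"]
print(round(predicted_score(patient), 2))   # 0.62
print(round(predicted_score([]), 2))        # 0.04 with no risk factors present
print(round(odds_ratio("hemoptysis"), 2))   # 4.38, matching Table 3
```

Exponentiating each coefficient reproduces the odds ratio column of Table 3, which is a quick consistency check on the tabulated values.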
FIGURE 2.
Receiver Operating Characteristic Curves and Concordances of the Predictive Model of Nontuberculous Mycobacterial Lung Disease Among Patients With Chronic Obstructive Pulmonary Disease That Was Learned on the Uniformly Randomly Split Training Data (50%), Validation Data (25%), and Testing Data (25%)
PREDICTIVE MODEL VALIDITY
To further test the generalization of the predictive model that was developed based on data from the 1-year pre-index period, the model was applied to a new testing dataset consisting of risk factors extracted from a 5-year pre-index period from the population of 3,756 patients with COPD with NTMLD and 11,268 patients with COPD without NTMLD. As both COPD and NTMLD are chronic conditions, the risk factors would fluctuate over time for the same patient. The 5-year period was chosen because prior studies reported that the average time from symptom onset to NTMLD diagnosis was 5 years.27 For each year from the first to the fifth year pre-index, the risk factors were calculated using that year’s claims data, and the predictive model generated a predictive score of having NTMLD from that year’s risk factors. Patients were labeled as having NTMLD if the score passed a predefined cutoff value. When a patient had multiple years of risk factors that predicted NTMLD, the predictive score from the earliest such year was used as that patient’s score.
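The year-by-year labeling rule can be sketched as below. This is an illustration under assumptions: the function name, the toy scores, and the cutoff of 0.5 are hypothetical (the cutoff value is not stated in the text), and "passed the cutoff" is read here as greater than or equal to.

```python
def earliest_flagged_year(yearly_scores, cutoff):
    """yearly_scores maps years before index (1 = year just before index,
    5 = fifth year before index) to the model score for that year.
    Returns the earliest year (largest number of years before index)
    whose score passes the cutoff, or None if no year is flagged."""
    flagged = [year for year, score in yearly_scores.items() if score >= cutoff]
    return max(flagged) if flagged else None

# Hypothetical patient scored separately on each pre-index year
scores_by_year = {1: 0.80, 2: 0.60, 3: 0.30, 4: 0.55, 5: 0.20}
print(earliest_flagged_year(scores_by_year, cutoff=0.5))  # 4
print(earliest_flagged_year({1: 0.1, 2: 0.2}, cutoff=0.5))  # None
```

In this toy case the patient is flagged in years 1, 2, and 4 before index, so the score from year 4 would be used and the patient would count as identified in the fourth year before diagnosis.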
The testing results showed that the predictive algorithm was discriminant and generalizable, as demonstrated by the ROC curve on the new testing data (Supplementary Figure 1). At the cutoff with sensitivity of 0.81 and specificity of 0.78, the model was able to correctly predict a total of 3,045 patients who were later diagnosed with NTMLD (true positives). Among these correctly predicted patients, 14.8% (n = 450) had possible NTMLD onset based on the risk factors present in the fifth year prior to their first claim with an NTMLD diagnosis; 11.3% (n = 343) had possible NTMLD onset based on the risk factors present in the fourth year prior to their first diagnosis of NTMLD; 13.5% (n = 412) had possible NTMLD onset based on the risk factors present in the third year prior to their first diagnosis of NTMLD; and 13.1% (n = 398) had possible NTMLD onset based on the risk factors present in the second year prior to their first diagnosis of NTMLD. The results demonstrated the potential of the predictive model for the early diagnosis of NTMLD among patients with COPD.
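The reported figures can be cross-checked with simple arithmetic. The short sketch below uses only the counts given in the text; the variable names are illustrative.

```python
true_positives = 3045          # correctly predicted patients later diagnosed with NTMLD
total_cases = 3756             # patients with COPD with NTMLD in the new testing data
flagged_by_year = {5: 450, 4: 343, 3: 412, 2: 398}  # years before first NTMLD diagnosis

sensitivity = true_positives / total_cases
print(round(sensitivity, 2))   # 0.81

for year, count in flagged_by_year.items():
    # 14.8, 11.3, 13.5, and 13.1 percent for years 5, 4, 3, and 2
    print(year, round(100 * count / true_positives, 1))

early_total = sum(flagged_by_year.values())
print(early_total, round(100 * early_total / true_positives, 1))  # 1603 52.6
```

The sum of 1,603 patients (52.6% of true positives) flagged in the second through fifth years before diagnosis matches the figure cited in the Discussion.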
Discussion
Using comprehensive Medicare data, a model was developed to identify patients with COPD who have undiagnosed, coexisting NTMLD. This study is the first to address the high-risk population of patients with COPD and identify risk factors for possible comorbid NTMLD that are easy to recognize in daily clinical practice. A potential application could be integration into electronic health record systems or managed care systems to facilitate the timely diagnosis of, and intervention in, patients with potentially undiagnosed NTMLD by appropriately raising clinical suspicion and encouraging diagnostic interventions, such as sputum collection.
Prior studies aiming to generally identify patients with NTMLD to improve estimates of prevalence and incidence have used claims data from the United Kingdom and Germany.32,33 Consistent with this predictive model, those studies identified COPD and pneumonia as the most common preexisting diagnoses among patients with NTMLD, and both concluded that there are likely large numbers of undiagnosed patients with NTMLD in each of their respective countries.32,33 In contrast with the prior works, we focused on the population of high-risk patients with COPD to develop a practical model that could be implemented to identify patients who should be screened.
This predictive algorithm used US Medicare administrative claims data from a nationwide representative population of older patients with COPD among whom the prevalence of NTMLD is increasing.2 It is reported that patients with NTMLD have a long diagnostic journey that typically involves multiple referrals (eg, from primary care to pulmonologists or ID specialists) and hospital services (inpatient, outpatient, or both).1 Using claims data allowed the model to capture a holistic and longitudinal view of patients’ health care resource utilization across inpatient, outpatient, and office visit settings, which helps characterize the journey of patients with NTMLD.
This predictive model of NTMLD among patients with COPD based on logistic regression has 10 easy-to-interpret risk factors comprising patterns of health care resource use, pulmonary symptoms, and comorbidities. The model demonstrated strong generalizability and discrimination for predicting potentially undiagnosed NTMLD among patients with COPD. With a sensitivity of 0.81 and specificity of 0.78, the model correctly predicted a total of 3,045 patients who were later diagnosed with NTMLD. Of these, 1,603 (52.6%) had possible NTMLD onset more than 1 year earlier than their first diagnosis of NTMLD. These findings suggest that the model has potential utility for screening patients with COPD to identify those with potentially undiagnosed NTMLD and support earlier diagnosis.
The predictive model described here is the first to demonstrate the ability to identify patients with COPD and possibly undiagnosed NTMLD with high sensitivity and specificity. Patients with COPD are at greater risk of developing NTMLD than the general population,8,26 and the overlapping clinical presentation between the 2 diseases often results in delayed diagnosis, faster disease progression, and increased mortality.4,20,22,28,34 In addition, the high burden of each condition separately is compounded in patients with both COPD and NTMLD, with these patients’ conditions being particularly challenging to treat and manage.1,18-20 Applying our results would reduce diagnostic delays and facilitate the implementation of the current guideline recommendation of considering early treatment.1,6
LIMITATIONS
Limitations to this study include that patients with NTMLD may have heterogeneous disease presentation and severity, and some of the risk factors identified by this model might not generalize well to individuals who differ from Medicare patients with COPD.35 Other limitations are inherent with the use of claims data. For example, the continuous coverage eligibility requirement could exclude patients with NTMLD without coverage or those who died prior to being diagnosed with NTMLD. The identification of patients with NTMLD was based on International Classification of Diseases, Ninth Revision (ICD-9) and ICD-10 codes instead of clinical confirmation via sputum culture results and radiographical findings as indicated by the 2020 American Thoracic Society/Infectious Diseases Society of America guidelines.1 However, the ICD-9 code of NTMLD has been associated with high positive predictive values (PPVs) and moderate sensitivity in diverse populations. Ku et al found a PPV (95% CI) of 72.1% (63.3%-79.9%) and a sensitivity (95% CI) of 41.6% (35.2%-48.0%) for ICD-9-CM 031.0 when assigned by clinicians at least 2 times and dated at least 30 days apart but within 365 days, and Winthrop et al found a PPV range of 74%-82% and a sensitivity range of 50%-65% when ICD-9-CM 031 was received at least 1 time in a managed care cohort and a Veterans Affairs Medical Center’s cohort, respectively.36,37 Finally, the retrospective design may have included, as negative examples, patients without NTMLD who might become positive for NTMLD in the near future (beyond our data cutoff date), thus reducing the discriminative power of the model.
Conclusions
This predictive algorithm uses a set of criteria comprising patterns of health care use, respiratory symptoms, and comorbidities to identify patients with COPD and possibly undiagnosed NTMLD with high sensitivity and specificity. Although further validation of the model is required, its application is expected to appropriately raise clinical suspicion for NTMLD among patients with COPD, facilitating timely diagnosis and reducing the undiagnosed period of NTMLD.
ACKNOWLEDGMENTS
Medical writing support was provided by Kat Hendrix, PhD, of Curo Consulting, a division of Envision Pharma Group, and funded by Insmed Incorporated.
REFERENCES
- 1. Daley CL, Iaccarino JM, Lange C, et al. Treatment of nontuberculous mycobacterial pulmonary disease: An official ATS/ERS/ESCMID/IDSA Clinical Practice Guideline. Clin Infect Dis. 2020;71(4):e1-36. doi:10.1093/cid/ciaa241
- 2. Prevots DR, Marras TK. Epidemiology of human pulmonary infection with nontuberculous mycobacteria: A review. Clin Chest Med. 2015;36(1):13-34. doi:10.1016/j.ccm.2014.10.002
- 3. Winthrop KL, Marras TK, Adjemian J, Zhang H, Wang P, Zhang Q. Incidence and prevalence of nontuberculous mycobacterial lung disease in a large U.S. managed care health plan, 2008-2015. Ann Am Thorac Soc. 2020;17(2):178-85. doi:10.1513/AnnalsATS.201804-236OC
- 4. Kotilainen H, Valtonen V, Tukiainen P, Poussa T, Eskola J, Järvinen A. Clinical findings in relation to mortality in nontuberculous mycobacterial infections: Patients with Mycobacterium avium complex have better survival than patients with other mycobacteria. Eur J Clin Microbiol Infect Dis. 2015;34(9):1909-18. doi:10.1007/s10096-015-2432-8
- 5. Mirsaeidi M, Hadid W, Ericsoussi B, Rodgers D, Sadikot RT. Non-tuberculous mycobacterial disease is common in patients with non-cystic fibrosis bronchiectasis. Int J Infect Dis. 2013;17(11):E1000-4. doi:10.1016/j.ijid.2013.03.018
- 6. Griffith DE, Aksamit T, Brown-Elliott BA, et al. An official ATS/IDSA statement: Diagnosis, treatment, and prevention of nontuberculous mycobacterial diseases. Am J Respir Crit Care Med. 2007;175(4):367-416. doi:10.1164/rccm.200604-571ST
- 7. Olivier KN, Weber DJ, Wallace RJJ, et al. Nontuberculous mycobacteria. I: Multicenter prevalence study in cystic fibrosis. Am J Respir Crit Care Med. 2003;167(6):828-34. doi:10.1164/rccm.200207-678OC
- 8. Adjemian J, Olivier KN, Seitz AE, Holland SM, Prevots DR. Prevalence of nontuberculous mycobacterial lung disease in U.S. Medicare beneficiaries. Am J Respir Crit Care Med. 2012;185(8):881-6. doi:10.1164/rccm.201111-2016OC
- 9. Drummond WK, Kasperbauer SH. Nontuberculous mycobacteria: Epidemiology and the impact on pulmonary and cardiac disease. Thorac Surg Clin. 2019;29(1):59-64. doi:10.1016/j.thorsurg.2018.09.006
- 10. Adjemian J, Olivier KN, Prevots DR. Nontuberculous mycobacteria among patients with cystic fibrosis in the United States: Screening practices and environmental risk. Am J Respir Crit Care Med. 2014;190(5):581-6. doi:10.1164/rccm.201405-0884OC
- 11. Axson EL, Bual N, Bloom CI, Quint JK. Risk factors and secondary care utilisation in a primary care population with non-tuberculous mycobacterial disease in the UK. Eur J Clin Microbiol Infect Dis. 2019;38(1):117-24. doi:10.1007/s10096-018-3402-8
- 12. Andréjak C, Nielsen R, Thomsen V, Duhaut P, Sørensen HT, Thomsen RW. Chronic respiratory disease, inhaled corticosteroids and risk of non-tuberculous mycobacteriosis. Thorax. 2013;68(3):256-62. doi:10.1136/thoraxjnl-2012-201772
- 13. Winthrop KL, Baxter R, Liu L, et al. Mycobacterial diseases and antitumour necrosis factor therapy in USA. Ann Rheum Dis. 2013;72(1):37-42. doi:10.1136/annrheumdis-2011-200690
- 14. Prevots DR, Marras TK, Wang P, Mange KC, Flume PA. Hospitalization risk for Medicare beneficiaries with nontuberculous mycobacterial pulmonary disease. Chest. 2021;160(6):2042-50. doi:10.1016/j.chest.2021.07.034
- 15. Biener AI, Decker SL, Rohde F. Prevalence and treatment of chronic obstructive pulmonary disease (COPD) in the United States. JAMA. 2019;322(7):602. doi:10.1001/jama.2019.10241
- 16. World Health Organization. Chronic obstructive pulmonary disease (COPD). 2022. March 16, 2023. Accessed June 23, 2022. https://www.who.int/news-room/fact-sheets/detail/chronic-obstructive-pulmonary-disease-(copd)
- 17. Adeloye D, Song P, Zhu Y, Campbell H, Sheikh A, Rudan I. Global, regional, and national prevalence of, and risk factors for, chronic obstructive pulmonary disease (COPD) in 2019: A systematic review and modelling analysis. Lancet Respir Med. 2022;10(5):447-58. doi:10.1016/s2213-2600(21)00511-7
- 18. Quaderi SA, Hurst JR. The unmet global burden of COPD. Glob Health Epidemiol Genom. 2018;3:e4. doi:10.1017/gheg.2018.1
- 19. Wang P, Hassan M, Chatterjee A. 1493. The incremental burden of nontuberculous mycobacterial lung disease (NTMLD) among patients with chronic obstructive pulmonary disease (COPD): Hospitalizations and ER visits among US Medicare beneficiaries. Open Forum Infectious Diseases. 2020;7(suppl 1):S748-9. doi:10.1093/ofid/ofaa439.1674
- 20. Wang P, Marras T, Alemao E, Hassan M, Chatterjee A. Incremental mortality associated with nontuberculous mycobacterial lung disease among US Medicare beneficiaries with chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2021;203:A1192. doi:10.1164/ajrccm-conference.2021.203.1_MeetingAbstracts.A1192
- 21. Aksamit TR, Philley JV, Griffith DE. Nontuberculous mycobacterial (NTM) lung disease: The top ten essentials. Respir Med. 2014;108(3):417-25. doi:10.1016/j.rmed.2013.09.014
- 22. van Ingen J, Obradovic M, Hassan M, et al. Nontuberculous mycobacterial lung disease caused by Mycobacterium avium complex - disease burden, unmet needs, and advances in treatment developments. Expert Rev Respir Med. 2021;15(11):1387-401. doi:10.1080/17476348.2021.1987891
- 23. Aksamit TR, O’Donnell AE, Barker A, et al. Adult patients with bronchiectasis: A first look at the US bronchiectasis research registry. Chest. 2017;151(5):982-92. doi:10.1016/j.chest.2016.10.055
- 24. Huang CT, Tsai YJ, Wu HD, et al. Impact of non-tuberculous mycobacteria on pulmonary function decline in chronic obstructive pulmonary disease. Int J Tuberc Lung Dis. 2012;16(4):539-45. doi:10.5588/ijtld.11.0412
- 25. Park HY, Jeong BH, Chon HR, Jeon K, Daley CL, Koh WJ. Lung function decline according to clinical course in nontuberculous mycobacterial lung disease. Chest. 2016;150(6):1222-32. doi:10.1016/j.chest.2016.06.005
- 26. Marras TK, Campitelli MA, Kwong JC, et al. Risk of nontuberculous mycobacterial pulmonary disease with obstructive lung disease. Eur Respir J. 2016;48(3):928-31. doi:10.1183/13993003.00033-2016
- 27. Kim RD, Greenberg DE, Ehrmantraut ME, et al. Pulmonary nontuberculous mycobacterial disease: Prospective study of a distinct preexisting syndrome. Am J Respir Crit Care Med. 2008;178(10):1066-74. doi:10.1164/rccm.200805-686OC
- 28. Seifer FD, Hansen G, Weycker D. Health-care utilization and expenditures among patients with comorbid bronchiectasis and chronic obstructive pulmonary disease in US clinical practice. Chron Respir Dis. 2019;16:1479973119839961. doi:10.1177/1479973119839961
- 29. Honda JR, Knight V, Chan ED. Pathogenesis and risk factors for nontuberculous mycobacterial lung disease. Clin Chest Med. 2015;36(1):1-11. doi:10.1016/j.ccm.2014.10.001
- 30. Chan ED, Iseman MD. Underlying host risk factors for nontuberculous mycobacterial lung disease. Semin Respir Crit Care Med. 2013;34(1):110-23. doi:10.1055/s-0033-1333573
- 31. Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J. 2014;35(29):1925-31. doi:10.1093/eurheartj/ehu207
- 32. Doyle OM, van der Laan R, Obradovic M, et al. Identification of potentially undiagnosed patients with nontuberculous mycobacterial lung disease using machine learning applied to primary care data in the UK. Eur Respir J. 2020;56(4):2000045. doi:10.1183/13993003.00045-2020
- 33. Ringshausen FC, Ewen R, Multmeier J, et al. Predictive modeling of nontuberculous mycobacterial pulmonary disease epidemiology using German health claims data. Int J Infect Dis. 2021;104:398-406. doi:10.1016/j.ijid.2021.01.003
- 34. van Ingen J, Wagner D, Gallagher J, et al. Poor adherence to management guidelines in nontuberculous mycobacterial pulmonary diseases. Eur Respir J. 2017;49(2):1601855. doi:10.1183/13993003.01855-2016
- 35. Pravosud V, Mannino DM, Prieto D, et al. Symptom burden and medication use among patients with nontuberculous mycobacterial lung disease. Chronic Obstr Pulm Dis. 2021;8(2):243-54. doi:10.15326/jcopdf.2020.0184
- 36. Ku JH, Henkle EM, Carlson KF, Marino M, Winthrop KL. Validity of diagnosis code-based claims to identify pulmonary NTM disease in bronchiectasis patients. Emerg Infect Dis. 2021;27(3):982-5. doi:10.3201/eid2703.203124
- 37. Winthrop KL, Baxter R, Liu L, et al. The reliability of diagnostic coding and laboratory data to identify tuberculosis and nontuberculous mycobacterial disease among rheumatoid arthritis patients using anti-tumor necrosis factor therapy. Pharmacoepidemiol Drug Saf. 2011;20(3):229-35. doi:10.1002/pds.2049