AMIA Annual Symposium Proceedings. 2024 Jan 11;2023:1209–1217.

Integrating Clinical and Air Quality Data to Improve Prediction of COPD Exacerbations

Grace E Ratcliff 1, Michael E Matheny 2,4, Jeremiah R Brown 3, Iben Sullivan 3, Bradley W Richmond 2,4, Laura M Paulin 5, Adrienne K Conger 2, Sharon E Davis 2
PMCID: PMC10785856  PMID: 38222356

Abstract

Several studies have found associations between air pollution and respiratory disease outcomes. However, there is minimal prognostic research exploring whether integrating air quality into clinical prediction models can improve accuracy and utility. In this study, we built models using both logistic regression and random forests to determine the benefits of including air quality data with meteorological and clinical data in prediction of COPD exacerbations requiring medical care. Logistic models were not improved by inclusion of air quality. However, the net benefit curves of random forest models showed greater clinical utility with the addition of air quality data. These models demonstrate a practical and relatively low-cost way to incorporate environmental information into clinical prediction tools to improve the clinical utility of COPD prediction. Findings could be used to provide population-level health warnings as well as individual-patient risk assessments.

Introduction

Chronic obstructive pulmonary disease (COPD) is the third leading cause of death globally1 and impacts 24 million people across the United States2. COPD is a progressive respiratory disease in which patients experience periods of exacerbation. These exacerbations lead to over 700,000 hospitalizations and 1 million emergency visits each year3 and diminish the quality of life for patients with COPD and their caregivers2. Population management and clinical decision support tools leveraging prediction models may present opportunities for early intervention and effective delivery of limited preventive case management resources to reduce complications and improve the quality of life of those living with COPD.

A recent review found over 400 prediction models have been developed to predict complications among patients with COPD, including models for mortality, acute exacerbation, and health care utilization4. The potential utility and adoption of these models are limited by failure to adhere to best prognostic modeling practices, a lack of rigorous validation, and development using limited sample sizes. Two commonly adopted prediction tools for COPD—the BODE5 and ADO6 indices—consider combinations of only age, body mass index, airflow obstruction, dyspnea, and exercise capacity. While the BODE index has been applied to exacerbation7, both indices were trained to predict mortality and have not been updated to incorporate the breadth of clinical history data routinely in electronic health records (EHRs). The more recently developed ACCEPT model8 highlights the potential for using detailed, routinely collected clinical data to predict exacerbations, but was trained using patients enrolled in clinical trials rather than a general population of patients with COPD.

In addition to EHR data on clinical history and disease progression, information on environmental conditions may be useful for predicting COPD exacerbations9, 10. Studies have found links between pollutants regulated by the United States Environmental Protection Agency under the Clean Air Act (including fine particulate matter [PM2.5], coarse particulate matter [PM10], nitrogen dioxide [NO2], and ozone [O3]) and outcomes for patients with COPD11, 12, 13. Air pollution has been associated with increased mortality9 and emergency room visits10. Estimated risks from air pollution were greater when a joint effects model was used to capture the complex interactions among air pollutants14. Research linking regional monitoring data on air pollution with COPD has focused on establishing associations with clinical outcomes rather than predicting patient-level risk of adverse events. Many of these studies were also conducted outside the United States, where air pollution may not reflect the exposure levels and compositions experienced by patients in other regions. While some studies using personal monitoring data have explored patient-level prediction13, these models have limited generalizability as personal monitoring systems are expensive to provide and maintain.

With the ability to handle larger numbers of predictors and high-order interactions, machine learning provides the opportunity to analyze complex associations between air pollution, clinical status, and health endpoints. Such associations are difficult to capture in traditional statistical regression models. As a result, machine learning has become more common in air pollution epidemiology15, extending consideration from the effects of individual pollutants to multi-pollutant exposures16. Studies of environmental contributions to adverse outcomes in pediatric asthma examined the joint effects of multiple air pollutants. Using classification and regression trees, they found that interactions among multiple pollutants increased the risk of emergency room visits14, 17. Although these studies aimed to characterize associations rather than develop predictive models, their findings highlight the potential of machine learning to support improved prediction of health outcomes given complex environmental exposures. In a study aiming to determine the hospital-level resources needed for chronic respiratory emergency room visits, the random forest model had the best predictive performance among the machine learning techniques evaluated18.

Despite established links between air quality and clinical outcomes for COPD and numerous other health conditions, environmental data has yet to be routinely integrated into EHR-based clinical prediction models. This study aims to close this gap by integrating air pollution and meteorological data with electronic health records to predict acute COPD exacerbations. We develop our clinical prediction model in a large population of patients with COPD in the southeastern United States, combining publicly available air quality monitoring data with routinely collected clinical EHR data. We focus on developing a clinically useful model to support population management and patient-level interventions.

Methods

We collected data on patients with COPD receiving treatment at Vanderbilt University Medical Center between 2014 and 2020. Patients entered our COPD cohort at the time of their first outpatient primary care, outpatient pulmonary care, or emergency department visit with a COPD diagnosis code (ICD-9: 491, 491.1, 491.2, 491.21, 491.22, 491.8, 491.9, 492, 492.8, 493.92, 496 or ICD-10: J41.0, J41.1, J41.8, J42, J43.1, J43.2, J43.8, J43.9, J44.0, J44.1, J44.9). We collected data for each non-urgent (defined as encounters scheduled >48 hours in advance) outpatient primary care and pulmonary care encounter starting with the qualifying index encounter. Encounters during which the patient was under 40 years of age were excluded. For this study, we further limited our patient population to individuals living within middle Tennessee counties (see Figure 1). This study was approved by the Institutional Review Board at Vanderbilt University Medical Center.
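The eligibility rules in this paragraph can be sketched as a filter over encounter records. This is an illustrative sketch only, not the study's extraction code: the `Encounter` record, its field names, and the setting labels are hypothetical, and we assume that encounters after the index visit do not themselves need a COPD diagnosis code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, FrozenSet, List, Optional

# Diagnosis codes listed in the text (ICD-9 and ICD-10).
COPD_CODES = frozenset({
    "491", "491.1", "491.2", "491.21", "491.22", "491.8", "491.9",
    "492", "492.8", "493.92", "496",
    "J41.0", "J41.1", "J41.8", "J42", "J43.1", "J43.2", "J43.8",
    "J43.9", "J44.0", "J44.1", "J44.9",
})
INDEX_SETTINGS = {"primary_care", "pulmonary", "emergency"}  # hypothetical labels
MODEL_SETTINGS = {"primary_care", "pulmonary"}


@dataclass(frozen=True)
class Encounter:  # hypothetical record layout
    patient_id: str
    visit_date: date
    setting: str
    diagnosis_codes: FrozenSet[str]
    hours_scheduled_ahead: Optional[float]  # None for unscheduled visits
    age_years: int


def index_dates(encounters: List[Encounter]) -> Dict[str, date]:
    """Date of each patient's first qualifying COPD-coded visit."""
    first: Dict[str, date] = {}
    for e in encounters:
        if e.setting in INDEX_SETTINGS and COPD_CODES & e.diagnosis_codes:
            if e.patient_id not in first or e.visit_date < first[e.patient_id]:
                first[e.patient_id] = e.visit_date
    return first


def eligible_encounters(encounters: List[Encounter]) -> List[Encounter]:
    """Non-urgent outpatient encounters on or after the index date, age >= 40."""
    first = index_dates(encounters)
    kept = []
    for e in encounters:
        start = first.get(e.patient_id)
        if start is None or e.visit_date < start:
            continue  # patient never qualified, or visit precedes cohort entry
        if e.setting not in MODEL_SETTINGS:
            continue
        if e.hours_scheduled_ahead is None or e.hours_scheduled_ahead <= 48:
            continue  # non-urgent means scheduled more than 48 hours ahead
        if e.age_years < 40:
            continue
        kept.append(e)
    return kept
```

The residence restriction to middle Tennessee counties is omitted here because the county list appears only in Figure 1.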

Figure 1. Map of included Tennessee counties and monitor locations

For each encounter in our cohort, we extracted data on age, sex, race, ethnicity, smoking history, body mass index, forced expiratory volume in 1 second (FEV1), health care utilization in the prior year, the Charlson comorbidity index, and oxygen supplementation. These features reflect those previously considered in outpatient clinical models for COPD complications9. Race and ethnicity data were not used as predictors but were instead collected to evaluate potential bias during model validation. Since data collection methods cannot be confirmed, we are unable to guarantee that all demographics were self-identified. We recorded outcomes data over the 7 days following each encounter, defining an acute exacerbation as an urgent outpatient, emergency department, or inpatient encounter with a COPD exacerbation diagnosis code (ICD-9-CM: 493.22, ICD-10-CM: J44.1).
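The outcome definition can be illustrated with a short labeling function. This is a sketch under stated assumptions: the tuple layout and setting labels are hypothetical, and we assume the 7-day window starts the day after the encounter and includes day 7, which the text does not specify.

```python
from datetime import date, timedelta
from typing import Iterable, Sequence, Tuple

EXACERBATION_CODES = {"493.22", "J44.1"}  # codes given in the text
ACUTE_SETTINGS = {"urgent_outpatient", "emergency", "inpatient"}  # hypothetical labels

Visit = Tuple[date, str, Sequence[str]]  # (visit date, setting, diagnosis codes)


def exacerbation_within_7_days(
    encounter_date: date, later_visits: Iterable[Visit], window_days: int = 7
) -> bool:
    """True if an acute visit with an exacerbation code falls in days 1..window_days."""
    window_end = encounter_date + timedelta(days=window_days)
    for visit_date, setting, codes in later_visits:
        in_window = encounter_date < visit_date <= window_end  # assumed boundaries
        if in_window and setting in ACUTE_SETTINGS and EXACERBATION_CODES & set(codes):
            return True
    return False
```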

Ambient air quality data was collected by the United States Environmental Protection Agency. We downloaded daily monitoring data from sites across Tennessee, Kentucky, and Alabama from 2013 to 2020 for ozone, NO2, PM2.5, and PM10. Values for each pollutant were recorded using the Air Quality Index19, a metric ranging from 0-500 based on the level of severity of health risk20. Data on daily maximum temperature and relative humidity was collected by the National Centers for Environmental Information and is publicly available in the Local Climatological Data21. We integrated these data with patient-level clinical data based on the nearest monitor to each patient’s geocoded residence. As monitoring sites can vary in terms of which pollutants are measured, we determined the nearest monitor to each patient for each pollutant, as well as for meteorological data. For each outpatient primary care or pulmonology encounter in our dataset, we collected daily values for each pollutant, temperature, and humidity for the 7 days prior to the encounter date to account for the lag seen between environmental triggers and emergency room visits22. Weather and air quality in the 7 days prior to each encounter were summarized as the mean, minimum, and maximum values. In middle Tennessee, ozone is not monitored between November and February as winter conditions are not conducive to high levels of this pollutant. The minimum air quality index value observed across ozone monitors was used for encounters during this period.
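The linkage described here has two steps: pick the nearest monitor that measures a given pollutant, then summarize that monitor's readings over the prior week. The sketch below illustrates both steps. It assumes great-circle (haversine) distance, which the text does not specify, and a window covering the 7 days before the encounter date, excluding the encounter day. The monitor dictionary layout is hypothetical, and the winter ozone fallback is not shown.

```python
import math
from datetime import date, timedelta
from typing import Dict, List, Optional, Tuple


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_monitor(
    home: Tuple[float, float], monitors: List[dict], pollutant: str
) -> Optional[str]:
    """ID of the closest monitor that measures `pollutant`, or None if there is none."""
    candidates = [m for m in monitors if pollutant in m["pollutants"]]
    if not candidates:
        return None
    closest = min(
        candidates,
        key=lambda m: haversine_km(home[0], home[1], m["lat"], m["lon"]),
    )
    return closest["id"]


def summarize_prior_week(
    daily_values: Dict[date, float], encounter_date: date, days: int = 7
) -> Optional[Dict[str, float]]:
    """Mean, minimum, and maximum over the `days` days before the encounter date."""
    window = [encounter_date - timedelta(days=k) for k in range(1, days + 1)]
    values = [daily_values[d] for d in window if d in daily_values]
    if not values:
        return None  # no readings in the window; such observations were excluded
    return {"mean": sum(values) / len(values), "min": min(values), "max": max(values)}
```

Because the nearest monitor is chosen separately for each pollutant, a single patient can be linked to different sites for ozone, NO2, PM2.5, PM10, and weather.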

Statistical analysis

We divided our cohort into 80% training and 20% test sets using patient-stratified sampling to ensure all encounters for each patient were included in either the training or the test sample. We excluded any features with missing values for more than 70% of all encounters. Among remaining clinical features, missing data was imputed separately for the training and test sets using predictive mean matching.
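A patient-level split and the missingness filter can be sketched as follows. This is an illustration rather than the study's code: we assume the 80/20 proportion is applied to patients rather than encounters, which the text leaves open. The function names are hypothetical, and predictive mean matching is not shown.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple


def split_by_patient(
    patient_ids: Sequence[str], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (train, test) encounter indices with no patient in both sets."""
    unique = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique)
    n_test = round(len(unique) * test_fraction)
    test_patients = set(unique[:n_test])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train_idx, test_idx


def drop_sparse_features(
    rows: List[Dict[str, Optional[float]]], max_missing: float = 0.70
) -> List[str]:
    """Names of features missing (None) in at most `max_missing` of rows."""
    kept = []
    for name in rows[0]:
        n_missing = sum(1 for r in rows if r.get(name) is None)
        if n_missing / len(rows) <= max_missing:
            kept.append(name)
    return kept
```

Splitting on patients rather than encounters matters because repeated encounters from one patient are correlated. Allowing them on both sides of the split would tend to inflate test-set performance.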

We trained logistic regression and random forest models to predict 7-day acute COPD exacerbations. Logistic regressions did not specify non-linearity or interactions between features, while the random forest models, by design, considered the possibility of non-linearity and interactions. We selected these learning algorithms to compare a standard, baseline approach and the potential gains of enabling more complex learning. To explore the utility of incorporating environmental data alongside clinical data, we trained models with 1) only clinical predictors; 2) clinical and meteorological features; and 3) clinical, meteorological, and air quality index features. In addition to separate models being developed for each pollutant, we fit a multi-pollutant model controlling for all four air pollutants. Observations with no meteorological or air quality data in the 7 days prior to the encounter were excluded from pollutant specific models.
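The nested feature sets can be written out explicitly. The sketch below reflects our reading of this paragraph, in which each single-pollutant model adds one pollutant's 7-day summaries to the clinical and meteorological features. All column names are hypothetical placeholders, not the study's variable names.

```python
from typing import Dict, List

STATS = ("mean", "min", "max")  # 7-day summaries described in the Methods


def summary_columns(variable: str) -> List[str]:
    """Column names for the three 7-day summaries of one variable."""
    return [f"{variable}_{stat}" for stat in STATS]


# Hypothetical column names; race and ethnicity are deliberately not predictors.
CLINICAL = [
    "age", "sex", "smoking_status", "bmi", "fev1_pct_predicted",
    "prior_year_utilization", "charlson_index", "oxygen_supplementation",
]
WEATHER = summary_columns("max_temperature") + summary_columns("relative_humidity")
POLLUTANTS = {"Ozone": "ozone", "NO2": "no2", "PM10": "pm10", "PM2.5": "pm25"}


def feature_sets() -> Dict[str, List[str]]:
    """One feature list per model row reported for each learning algorithm."""
    sets = {"Clinical": list(CLINICAL), "Weather": CLINICAL + WEATHER}
    for label, prefix in POLLUTANTS.items():
        sets[label] = CLINICAL + WEATHER + summary_columns(f"{prefix}_aqi")
    combined = [c for prefix in POLLUTANTS.values() for c in summary_columns(f"{prefix}_aqi")]
    sets["AQ combined"] = CLINICAL + WEATHER + combined
    return sets
```

Seven feature sets fitted with two learning algorithms give the fourteen models compared in the Results.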

We validated each model in the 20% test set using discrimination, calibration, and clinical utility metrics. Discrimination was measured with the area under the receiver operating characteristic curve (AUC). Calibration was measured with the observed to expected outcome ratio (O:E), integrated calibration index (ICI), and calibration plots. Net benefit curves were used to compare clinical utility across models with and without air quality information.
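Three of these metrics have simple closed forms, sketched below in plain Python for illustration. These are textbook definitions, not the study's code. The pairwise AUC is quadratic in sample size and is meant only to make the definition concrete. The ICI is omitted because it requires a smoothed calibration curve.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random event is ranked above a random non-event."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def observed_to_expected(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Observed events divided by the sum of predicted probabilities."""
    return sum(y_true) / sum(y_prob)
```

An O:E ratio above 1 means more events were observed than predicted (underprediction on average). A ratio below 1 means overprediction.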

Results

Patients with COPD residing in middle Tennessee participated in 255,987 eligible non-urgent primary care and pulmonary specialist encounters during the study period. These encounters involved 6,635 unique patients. Table 1 describes the characteristics of patients at the time of each encounter, including demographics and select risk factors collected from electronic health records. Patients were predominantly white with an average age of 68 years across all encounters. The majority of the population smoked at some point during their life, with higher rates of patients being current smokers among encounters followed by an exacerbation in the next 7 days compared to encounters not followed by an exacerbation (49% vs 39%, respectively). FEV1 as percent of predicted lung capacity was 62% and 67% for encounters with and without subsequent exacerbations, respectively, both well below the level considered healthy (80%)23.

Table 1. Baseline characteristics of patients with COPD at the time of eligible non-urgent primary care and pulmonary specialist encounters as noted on their electronic health records

| Characteristic | No exacerbation in next 7 days | Exacerbation in next 7 days |
|---|---|---|
| # encounters | 254,229 | 1,758 |
| Age (mean and SD) | 68.6 (10.5) | 67.1 (11.1) |
| % female | 52.0 | 55.6 |
| Race | | |
| % Black | 16.3 | 26.8 |
| % White | 82.7 | 72.6 |
| % Other | 1 | 0.5 |
| Smoking status (%) | | |
| Current smoker | 38.6 | 49.3 |
| Former smoker | 43.7 | 39.2 |
| Nonsmoker | 17.7 | 11.4 |
| Body mass index (mean and SD) | 28.7 (7.7) | 28.9 (8.1) |
| Bronchiectasis in prior year | 1.9 | 3.1 |
| Acute bronchitis in prior year | 2.1 | 4.8 |
| Acute upper infection in prior year | 1.9 | 3.6 |
| Pneumonia in prior year | 1.9 | 2.1 |
| Last FEV1 % of predicted value | 67.0 (21.0) | 62.2 (21.6) |

Performance of the logistic regression and random forest models with and without air quality information is presented in Table 2. Using only clinical features, the logistic model had poor discrimination (AUC: 0.68). The logistic model was not improved by adding meteorological and air quality information, with AUCs of approximately 0.70 or below for all models. The random forest model using only clinical features provided excellent discrimination with an AUC of 0.83 (95% CI: 0.81-0.86). Discrimination of the random forest model was improved by incorporating nitrogen dioxide (AUC: 0.85) and PM2.5 (AUC: 0.85). Though not statistically different from alternative models, the highest discrimination was observed in the multi-pollutant model (AUC=0.86, 95% CI=0.82-0.89).

Table 2. Performance of each model in the test set.

| Model | AUC | Brier | O:E | ICI |
|---|---|---|---|---|
| Logistic regression | | | | |
| Clinical | 0.683 [0.652 - 0.713] | 0.007 [0.006 - 0.007] | 0.989 [0.890 - 1.092] | 0.001 [0.001 - 0.002] |
| Weather | 0.685 [0.658 - 0.715] | 0.007 [0.006 - 0.007] | 0.983 [0.884 - 1.089] | 0.002 [0.001 - 0.002] |
| Ozone | 0.684 [0.653 - 0.715] | 0.007 [0.006 - 0.007] | 0.981 [0.884 - 1.084] | 0.001 [0.001 - 0.002] |
| NO2 | 0.689 [0.658 - 0.719] | 0.007 [0.006 - 0.007] | 0.959 [0.860 - 1.056] | 0.002 [0.001 - 0.002] |
| PM10 | 0.680 [0.648 - 0.712] | 0.007 [0.006 - 0.008] | 1.007 [0.898 - 1.121] | 0.002 [0.001 - 0.002] |
| PM2.5 | 0.702 [0.669 - 0.736] | 0.007 [0.006 - 0.007] | 0.983 [0.878 - 1.097] | 0.002 [0.001 - 0.002] |
| AQ combined | 0.700 [0.669 - 0.736] | 0.007 [0.006 - 0.008] | 0.989 [0.867 - 1.109] | 0.002 [0.001 - 0.002] |
| Random forest | | | | |
| Clinical | 0.834 [0.807 - 0.860] | 0.006 [0.005 - 0.006] | 1.150 [1.039 - 1.252] | 0.003 [0.002 - 0.003] |
| Weather | 0.830 [0.801 - 0.860] | 0.005 [0.005 - 0.006] | 1.014 [0.926 - 1.104] | 0.003 [0.003 - 0.004] |
| Ozone | 0.829 [0.800 - 0.857] | 0.005 [0.005 - 0.006] | 0.938 [0.851 - 1.022] | 0.004 [0.004 - 0.005] |
| NO2 | 0.850 [0.823 - 0.877] | 0.005 [0.004 - 0.005] | 0.916 [0.836 - 0.995] | 0.004 [0.004 - 0.005] |
| PM10 | 0.833 [0.805 - 0.861] | 0.005 [0.005 - 0.006] | 0.976 [0.886 - 1.066] | 0.004 [0.003 - 0.004] |
| PM2.5 | 0.846 [0.818 - 0.874] | 0.004 [0.004 - 0.005] | 0.940 [0.857 - 1.030] | 0.004 [0.004 - 0.005] |
| AQ combined | 0.856 [0.822 - 0.885] | 0.005 [0.004 - 0.005] | 0.858 [0.778 - 0.950] | 0.005 [0.005 - 0.006] |

Both the logistic and random forest models were calibrated on average, with O:E ratios generally approaching 1.0 and ICI values near 0 (see Table 2). The logistic regression models with and without air quality information produced well calibrated predictions across the range of probabilities (see Figure 2). Calibration curves for the random forest models showed predictions were calibrated in the lower range of predicted risk, particularly for probabilities under 20%. However, these models tended to be less calibrated in the higher risk range where calibration curves highlight underprediction for both the clinical and multi-pollutant models.
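Calibration across the risk range can be inspected by comparing mean predicted risk with the observed event rate within bands of predicted probability. The sketch below uses fixed bins as a simple stand-in for smoothed calibration curves. The bin edges are arbitrary illustrative choices, not taken from the study.

```python
from typing import List, Sequence, Tuple

CalibrationRow = Tuple[float, float, int, float, float]  # lo, hi, n, mean predicted, observed


def calibration_bins(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    edges: Sequence[float] = (0.0, 0.05, 0.1, 0.2, 0.5, 1.0),
) -> List[CalibrationRow]:
    """Mean predicted risk and observed event rate per probability bin.

    Bins are half-open [lo, hi), except the last, which also includes its
    upper edge. Empty bins are skipped.
    """
    rows: List[CalibrationRow] = []
    for k in range(len(edges) - 1):
        lo, hi = edges[k], edges[k + 1]
        is_last = k == len(edges) - 2
        idx = [
            i for i, p in enumerate(y_prob)
            if lo <= p < hi or (is_last and p == hi)
        ]
        if not idx:
            continue
        mean_pred = sum(y_prob[i] for i in idx) / len(idx)
        observed = sum(y_true[i] for i in idx) / len(idx)
        rows.append((lo, hi, len(idx), mean_pred, observed))
    return rows
```

A bin whose observed rate exceeds its mean predicted risk indicates underprediction in that range, the pattern described above for the higher-risk predictions of the random forest models.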

Figure 2. Calibration curves for clinical and multi-pollutant air quality (AQ) model.

Decision curve analyses with net benefits evaluation highlighted improvement in clinical utility by incorporating air quality information in the random forest model (Figure 3). The model with only clinical features provided a net benefit over the extreme strategies of treat none and treat all for probability thresholds up to 60% predicted probability. Adding meteorological features provided incrementally more net benefit, as did subsequently adding air quality features. Between decision thresholds of approximately 10% and 90% predicted probability of exacerbation, the multi-pollutant model provided a net benefit above the default strategies and other models considered. At a threshold of 10%, compared with a treat-none approach, incorporating air quality, meteorological, and clinical data into exacerbation predictions was equivalent to a strategy that identified 35 impending exacerbations among 10,000 encounters without predicting any false positives. In comparison, at this 10% threshold, the clinical model would identify only 26 exacerbations in 10,000 encounters without predicting any false positives. Using decision thresholds between 25% and 45%, the multi-pollutant model would correctly identify 17 more opportunities for intervention per 10,000 encounters than the model with only clinical features.
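The "per 10,000 encounters without any false positives" reading follows from the standard net benefit formula, which subtracts false positives weighted by the odds of the decision threshold from the true positives. A minimal sketch follows. The confusion counts in the usage note are hypothetical values chosen only to show the arithmetic, and the prevalence uses the overall cohort counts from Table 1 rather than the test set.

```python
def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Net benefit of acting on predictions at or above `threshold`.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(prevalence: float, threshold: float) -> float:
    """Net benefit of intervening on every encounter."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


def per_10000(nb: float) -> float:
    """Express net benefit as net true positives per 10,000 encounters."""
    return nb * 10_000


# Hypothetical counts for illustration only (not from the study).
example = per_10000(net_benefit(tp=40, fp=45, n=10_000, threshold=0.10))

# Overall cohort event rate from Table 1: 1,758 of 255,987 encounters.
cohort_prevalence = 1_758 / 255_987
treat_all_at_10pct = treat_all_net_benefit(cohort_prevalence, 0.10)
```

With these hypothetical counts, `example` is about 35: 40 true positives minus 45 false positives weighted by 0.10/0.90 leaves 35 net true positives per 10,000. At the cohort event rate of roughly 0.7%, `treat_all_at_10pct` is negative, so treat-none (net benefit 0) is the better default strategy at that threshold. The curves in Figure 3 are standardized, which conventionally means net benefit divided by prevalence.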

Figure 3. Standardized net benefit curves for random forest models with clinical features; clinical and meteorological features; and the multi-pollutant air pollution model with clinical and meteorological features.

In evaluations by EHR-recorded sex and race, we observed few differences in overall performance metrics between subgroups. However, net benefit analyses revealed variation in clinical utility across subpopulations in the random forest models (see Figure 4). At a 10% decision threshold, the multi-pollutant model would correctly identify 91 impending exacerbations among 10,000 encounters of Black patients without predicting any false positives compared to 69 identified by the model with only clinical features (difference of 21 cases). Among white patients, the multi-pollutant model would identify 26 impending exacerbations among 10,000 encounters compared to 19 identified by the model with only clinical features (difference of 7 cases). For thresholds between 25% and 45%, the benefit of incorporating air quality over only considering clinical features increased to approximately 38 and 14 more opportunities for intervention per 10,000 encounters involving Black and white patients, respectively. Compared to the model with only clinical features, for thresholds between 25% and 45% incorporating air quality would identify approximately 22 and 12 more opportunities for intervention per 10,000 encounters involving female and male patients, respectively.

Figure 4. Standardized net benefit curves by race and sex for random forest models with clinical features; clinical and meteorological features; and the multi-pollutant model with clinical and meteorological features.

Discussion

In this study, we explored the impact of supplementing clinical data with air quality information to support care of patients with COPD. We focused on patient-level clinical prediction and on developing clinically useful insight into exacerbation risk among patients with COPD. This work builds on prior studies primarily aimed at identifying associations between air quality and COPD outcomes, as well as on work exploring the use of personal air quality monitors to predict clinical outcomes. By integrating publicly available, routinely collected air quality monitoring data with EHR data, our models provide a practical and relatively low-cost approach to incorporating environmental risk factors into clinically oriented prediction tools that can reach a broad patient population.

We found that basic logistic regression models were not improved by adding air pollution features. However, discrimination and net benefit of random forest models increased when air pollution features were incorporated. This finding may indicate that the influence of air quality on COPD outcomes is complex rather than additive, as the logistic approach assumes. There may be important interactions between clinical status and air quality that the random forest design was able to capture.

We also found that incorporating environmental data had differential impacts across subpopulations, particularly by race. At a 10% high-risk cutoff, the additional benefit of the air pollution model over the clinical data model was 3 times higher for the Black population than for the white population (21 vs 7 more cases correctly identified per 10,000 encounters). This additional benefit of the model may be related to different levels of air pollution exposure across populations and persisting racial residential segregation24, 25, 26. However, despite these differences by race, the improved clinical utility for the whole population, and particularly for traditionally underserved populations, may be regarded as an added benefit of our model.

In random forest models, each air pollutant independently improved predictions, but the model performed best when all pollutants were considered in a single model. Other studies have found mixed results of interaction between the four included pollutants (ozone, nitrogen dioxide, PM2.5, and PM10)27. By including all the pollutants, our multi-pollutant model was able to account for seasonal changes in the primary pollutant of concern and potential cumulative effects of concurrent exposures to multiple pollutants. With environmental data, the net benefit curve demonstrated the model’s ability to correctly predict more exacerbations while avoiding false positives. These results show that there is meaningful value in including environmental data in patient-level and population-level prediction. The ability to correctly predict more exacerbations provides an opportunity to create more effective interventions and monitoring of at-risk patients. Results also show that there is a complex relationship between clinical and environmental data, and that considering only clinical information reduces predictive ability and utility.

This study is not without limitations. First, we have not accounted for the distance between each patient’s residence and the nearest air quality monitor. Middle Tennessee is rural outside of the metro Nashville area, and there were limited monitors in the more rural counties, reflecting a national lack of air quality monitoring infrastructure in rural areas26, 28. Controlling for monitor distance or urbanization near patient locations may provide more accurate predictions. Additionally, using air quality monitors only captures outdoor air pollution levels. Research indicates some patients, particularly older individuals, may spend more time indoors and therefore are more affected by indoor air pollution levels29, 30. While other studies have considered data on indoor air quality using personal monitors, such data may not be feasible for building generalizable models for predicting exacerbation in large populations. Finally, using residential addresses to link patients to air quality monitors does not account for patterns of daily living that may include regular regional travel. While we are not able to account for regional commuting patterns, given the distribution of monitors, many patients may be closest to the same monitor throughout their day.

Conclusion

In conclusion, we built multivariable models using clinical and environmental data to evaluate whether including environmental data would improve COPD exacerbation predictions. The multi-pollutant random forest model outperformed the clinical model, providing both more accurate results and a net benefit in decision analyses. Models integrating EHR data with publicly available environmental data may be useful for prioritizing and targeting interventions for patient populations with COPD. These models may support individual-level preventive interventions at an encounter scheduled during periods of poor air quality, as well as population health programs that provide outreach to high-risk patients during periods of elevated air pollution levels.

Acknowledgements

This study was funded by the National Heart, Lung, and Blood Institute (grant number 5R01HL157130). This project used data curated in the Vanderbilt Research Derivative. This data resource was supported by CTSA award No. UL1TR000445 from the National Center for Advancing Translational Sciences.


References

  • 1. Halpin DMG, Vogelmeier CF, Agusti A. Lung health for all: Chronic obstructive lung disease and World Lung Day 2022. Am J Respir Crit Care Med. 2022;206(6):669–71. doi: 10.1164/rccm.202207-1407ED.
  • 2. Stanford RH, Nag A, Mapel DW, et al. Claims-based risk model for first severe COPD exacerbation. Am J Manag Care. 2018;24(2):e45–e53.
  • 3. Rinne ST, Castaneda J, Lindenauer PK, Cleary PD, Paz HL, Gomez JL. Chronic obstructive pulmonary disease readmissions and other measures of hospital quality. Am J Respir Crit Care Med. 2017;196(1):47–55. doi: 10.1164/rccm.201609-1944OC.
  • 4. Bellou V, Belbasis L, Konstantinidis AK, Tzoulaki I, Evangelou E. Prognostic models for outcome prediction in patients with chronic obstructive pulmonary disease: Systematic review and critical appraisal. BMJ. 2019;367:l5358. doi: 10.1136/bmj.l5358.
  • 5. Abu Hussein N, Ter Riet G, Schoenenberger L, et al. The ADO index as a predictor of two-year mortality in general practice-based chronic obstructive pulmonary disease cohorts. Respiration. 2014;88(3):208–14. doi: 10.1159/000363770.
  • 6. Puhan MA, Garcia-Aymerich J, Frey M, et al. Expansion of the prognostic assessment of patients with chronic obstructive pulmonary disease: The updated BODE index and the ADO index. Lancet. 2009;374(9691):704–11. doi: 10.1016/S0140-6736(09)61301-5.
  • 7. Marin JM, Carrizo SJ, Casanova C, et al. Prediction of risk of COPD exacerbations by the BODE index. Respir Med. 2009;103(3):373–8. doi: 10.1016/j.rmed.2008.10.004.
  • 8. Adibi A, Sin DD, Safari A, et al. The acute COPD exacerbation prediction tool (ACCEPT): A modelling study. Lancet Respir Med. 2020;8(10):1013–21. doi: 10.1016/S2213-2600(19)30397-2.
  • 9. Faustini A, Stafoggia M, Cappai G, Forastiere F. Short-term effects of air pollution in a cohort of patients with chronic obstructive pulmonary disease. Epidemiology. 2012;23(6):861–79. doi: 10.1097/EDE.0b013e31826767c2.
  • 10. Weichenthal SA, Lavigne E, Evans GJ, Godri Pollitt KJ, Burnett RT. Fine particulate matter and emergency room visits for respiratory illness. Effect modification by oxidative potential. Am J Respir Crit Care Med. 2016;194(5):577–86. doi: 10.1164/rccm.201512-2434OC.
  • 11. Schikowski T, Mills IC, Anderson HR, et al. Ambient air pollution: A cause of COPD? Eur Respir J. 2014;43(1):250–63. doi: 10.1183/09031936.00100112.
  • 12. DeVries R, Kriebel D, Sama S. Outdoor air pollution and COPD-related emergency department visits, hospital admissions, and mortality: A meta-analysis. COPD. 2017;14(1):113–21. doi: 10.1080/15412555.2016.1216956.
  • 13. Cooper CB, Sirichana W, Arnold MT, et al. Remote patient monitoring for the detection of COPD exacerbations. Int J Chron Obstruct Pulmon Dis. 2020;15:2005–13. doi: 10.2147/COPD.S256907.
  • 14. Winquist A, Kirrane E, Klein M, et al. Joint effects of ambient air pollutants on pediatric asthma emergency department visits in Atlanta, 1998-2004. Epidemiology. 2014;25(5):666–73. doi: 10.1097/EDE.0000000000000146.
  • 15. Bellinger C, Mohomed Jabbar MS, Zaiane O, Osornio-Vargas A. A systematic review of data mining and machine learning for air pollution epidemiology. BMC Public Health. 2017;17(1):907. doi: 10.1186/s12889-017-4914-3.
  • 16. Bobb JF, Valeri L, Claus Henn B, et al. Bayesian kernel machine regression for estimating the health effects of multi-pollutant mixtures. Biostatistics. 2015;16(3):493–508. doi: 10.1093/biostatistics/kxu058.
  • 17. Gass K, Klein M, Chang HH, Flanders WD, Strickland MJ. Classification and regression trees for epidemiologic research: An air pollution example. Environ Health. 2014;13(1):17. doi: 10.1186/1476-069X-13-17.
  • 18. Peng J, Chen C, Zhou M, Xie X, Zhou Y, Luo CH. Peak outpatient and emergency department visit forecasting for patients with chronic respiratory diseases using machine learning methods: Retrospective cohort study. JMIR Med Inform. 2020;8(3):e13075. doi: 10.2196/13075.
  • 19. US Environmental Protection Agency. Air Quality System Data Mart [internet database]. Available from: https://www.epa.gov/outdoor-air-quality-data.
  • 20. US Environmental Protection Agency. Air Quality Index (AQI) basics. Available from: https://www.airnow.gov/aqi/aqi-basics/
  • 21. National Oceanic and Atmospheric Administration. Local Climatological Data (LCD). https://www.ncei.noaa.gov/; 2023. Available from: https://www.ncei.noaa.gov/products/land-based-station/local-climatological-data.
  • 22. Stieb DM, Beveridge RC, Smith-Doiron M, et al. Beyond administrative data: Characterizing cardiorespiratory disease episodes among patients visiting the emergency department. Can J Public Health. 2000;91(2):107–12. doi: 10.1007/BF03404921.
  • 23. David S, Edwards CW. Forced expiratory volume. StatPearls. Treasure Island (FL); 2022.
  • 24. Lane HM, Morello-Frosch R, Marshall JD, Apte JS. Historical redlining is associated with present-day air pollution disparities in U.S. cities. Environ Sci Technol Lett. 2022;9(4):345–50. doi: 10.1021/acs.estlett.1c01012.
  • 25. Liu J, Clark LP, Bechle MJ, et al. Disparities in air pollution exposure in the United States by race/ethnicity and income, 1990-2010. Environ Health Perspect. 2021;129(12):127005. doi: 10.1289/EHP8584.
  • 26. Miranda ML, Edwards SE, Keating MH, Paul CJ. Making the environmental justice grade: The relative burden of air pollution exposure in the United States. Int J Environ Res Public Health. 2011;8(6):1755–71. doi: 10.3390/ijerph8061755.
  • 27. Duan RR, Hao K, Yang T. Air pollution and chronic obstructive pulmonary disease. Chronic Dis Transl Med. 2020;6(4):260–9. doi: 10.1016/j.cdtm.2020.05.004.
  • 28. Bell ML, Ebisu K. Environmental inequality in exposures to airborne particulate matter components in the United States. Environ Health Perspect. 2012;120(12):1699–704. doi: 10.1289/ehp.1205201.
  • 29. Klepeis NE, Nelson WC, Ott WR, et al. The National Human Activity Pattern Survey (NHAPS): A resource for assessing exposure to environmental pollutants. J Expo Anal Environ Epidemiol. 2001;11(3):231–52. doi: 10.1038/sj.jea.7500165.
  • 30. Hansel NN, McCormack MC, Belli AJ, et al. In-home air pollution is linked to respiratory morbidity in former smokers with chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2013;187(10):1085–90. doi: 10.1164/rccm.201211-1987OC.

Articles from AMIA Annual Symposium Proceedings are provided here courtesy of American Medical Informatics Association
