Abstract
Background
Atopic dermatitis (AD) is a chronic inflammatory skin disease that affects almost 20% of children worldwide. Environmental factors including weather and air pollutants have been shown to be associated with AD symptoms. However, the time‐dependent nature of such a relationship has not been adequately investigated. This paper aims to assess whether real‐time data on weather and air pollutants can be used to make short‐term predictions of AD severity scores.
Methods
Using longitudinal data from a published panel study of 177 paediatric patients followed up daily for 17 months, we developed a statistical machine learning model to predict daily AD severity scores for individual study participants. Exposures consisted of daily meteorological variables and concentrations of air pollutants, and outcomes were daily recordings of scores for six AD signs. We developed a mixed‐effect autoregressive ordinal logistic regression model, validated it in a forward‐chaining setting and evaluated the effects of the environmental factors on the predictive performance.
Results
Our model successfully made daily predictions of the AD severity scores, but the predictive performance was not improved by the addition of measured environmental factors. Any potential short‐term influence of environmental exposures on daily AD severity scores was outweighed by the underlying persistence of preceding scores.
Conclusions
Our data do not offer enough evidence to support a claim that weather or air pollutant data can be used to make short‐term predictions of AD signs. Inferences about the magnitude of the effect of environmental factors on AD severity scores require consideration of their time‐dependent dynamic nature.
Keywords: atopic dermatitis, environmental factors, longitudinal data, prediction, statistical machine learning
1. INTRODUCTION
Atopic dermatitis (AD, also called eczema) is a chronic inflammatory skin disease characterized by inflammatory flares as well as dry and itchy skin. 1 AD patients often suffer from symptoms that fluctuate every day, resulting in a decreased quality of life due to the unforeseeable dynamic nature of the symptoms. AD affects almost 20% of the paediatric population worldwide and the prevalence of AD in children is still increasing globally. 2 The rising prevalence of AD coincides with increased urbanization and industrialization worldwide, 3 and the assessment of the effects of environmental factors on AD has gained a growing importance.
AD pathophysiology is considered to be affected by external environmental factors, such as air pollution from particulates, ultraviolet radiation, temperature and humidity—collectively known as the skin exposome. 4 , 5 Environmental factors have been shown to be associated with AD development and aggravation, 6 , 7 as well as other aspects of AD including barrier dysfunction 8 or care visits. 9 Prior studies investigated whether environmental factors were associated with the current AD severity, 10 , 11 , 12 , 13 but none have considered the dynamic nature of the severity nor have they investigated whether the future AD severity can be predicted by environmental factors. Despite this evidence gap, a profusion of smartphone eczema apps has emerged offering to track disease severity and environmental factors with bold claims of being able to predict AD flares. 14
We have recently developed statistical machine learning models to predict AD severity scores on a daily basis at an individual level. 15 The models demonstrated that it was possible to decipher much of the apparent unpredictable dynamics of AD severity scores from each patient's longitudinal data. The models investigated the effects of age, filaggrin mutations and the treatments used, such as calcineurin inhibitors and corticosteroids, on daily change of AD severity scores. However, we could not investigate the effects of environmental factors due to the lack of availability of such data in the training data sets.
In this paper, we aim to assess the impact of environmental factors in predicting future AD severity scores. We developed a statistical machine learning model to predict daily AD severity scores for individual patients using a longitudinal data set with high‐quality environmental and AD symptom data. We used that model to evaluate whether environmental factors including weather and air pollutants are important determinants in predicting the next day's AD scores from today's scores.
2. METHODS
2.1. Data
We used the longitudinal data from a published panel study 10 that investigated the short‐term impact of environmental factors on AD symptoms in Seoul, South Korea. The cohort included 177 Korean paediatric patients (67 girls and 110 boys) aged five or younger (average age of 2.0 years old, SD = 1.6) with mild to severe AD (mean severity scoring of atopic dermatitis [SCORAD] at enrolment of 31.1, SD = 12.8). The data contained the daily recording of the atopic dermatitis symptom score (ADSS) 16 over 17 months (Figures 1 and S1). ADSS is a sum of scores for six AD signs (dryness, oedema, itching, oozing, redness and sleep disturbance), each on a discrete scale from 0 (none) to 4 (severe). In this study, we used the six AD sign scores, rather than their sum (ADSS), to extract more information from the data. Of the daily AD sign scores, 18.9% were missing. We removed five patients with fewer than 10 daily observations, resulting in a total of 34,921 patient‐day observations.
FIGURE 1. Example trajectories of the six atopic dermatitis (AD) sign scores and the derived AD symptom state for a representative patient.
The use of topical corticosteroids (TCS; yes/no answer) was recorded daily. Weather variables (mean temperature, relative humidity, total rainfall and diurnal temperature range) and the concentration of air pollutants (PM10, NO2 and O3) were collected daily for each patient. A binary AD symptom state was derived in Kim et al. 10 from the sign scores: the state was 1 when the sum of itching and sleep disturbance scores was greater than or equal to 2 and the scores of at least 2 of redness, dryness, oedema or oozing were non‐zero, and the state was 0 otherwise (Figure 1).
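The rule for the binary AD symptom state can be written as a short function. The following is a minimal sketch in Python; the dictionary keys and the function name `ad_symptom_state` are illustrative assumptions, not identifiers from the original study.

```python
from typing import Mapping

# The four skin signs counted by the second condition of the rule.
SKIN_SIGNS = ("redness", "dryness", "oedema", "oozing")


def ad_symptom_state(scores: Mapping[str, int]) -> int:
    """Return 1 if the day's sign scores (each 0 to 4) meet the symptom rule, else 0."""
    # Condition 1: itching + sleep disturbance is at least 2.
    subjective = scores["itching"] + scores["sleep_disturbance"]
    # Condition 2: at least two of the four skin signs are non-zero.
    n_skin_signs = sum(1 for sign in SKIN_SIGNS if scores[sign] > 0)
    return int(subjective >= 2 and n_skin_signs >= 2)


# Hypothetical day of recordings, for illustration only.
day = {
    "dryness": 2, "oedema": 0, "itching": 1,
    "oozing": 0, "redness": 1, "sleep_disturbance": 1,
}
print(ad_symptom_state(day))  # 1
```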
2.2. Mixed‐effect autoregressive ordinal logistic regression model
We developed a model to predict the patient‐dependent dynamics for each of the six AD sign scores. The model has a similar structure to that of our previously published model, 15 namely an autoregressive model with patient‐dependent parameters, and uses an ordinal logistic regression to model the ordinal signs. The mixed effect autoregressive ordinal logistic regression is described by
$$y^{(k)}(t+1) \sim \mathrm{OrderedLogistic}\left(\eta^{(k)}(t),\, c\right), \qquad \eta^{(k)}(t) = b^{(k)} + \sum_{i} \beta_i\, \delta_{i,\, y^{(k)}(t)},$$
where $y^{(k)}(t)$ is a sign score for the kth patient at day $t$, $b^{(k)}$ is the patient‐dependent intercept (the random effect), the $\beta_i$ are the regression coefficients, $\delta_{i,j}$ is the Kronecker delta and $c$ is the vector of cut‐off values of the ordered logistic distribution (details in the supplementary text). The linear predictor, $\eta^{(k)}(t)$, corresponds to the location parameter of the ordered logistic distribution. To evaluate the impact of environmental factors, we also considered a model that includes all covariates of interest (the environmental variables and TCS usage at day $t$) in the linear predictor, and models with one covariate each. Cross‐correlation analysis did not support the inclusion of higher order time lags for sign scores or covariates in the model. The models were fitted, using the ‘lme4’ package in R, to pairs of successive scores $(y^{(k)}(t), y^{(k)}(t+1))$. Pairs with at least one missing value were removed from the training set.
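To illustrate how an ordered logistic distribution turns a linear predictor and a vector of cut‐off values into probabilities for the five score levels, the sketch below uses the common parameterisation P(y ≤ j) = logistic(c_j − η). It is written in Python rather than R, it is not the authors' fitted model, and all numerical values (intercept, coefficients, cut‐offs) are hypothetical. Treating score 0 as the reference level of the autoregressive term is also an assumption.

```python
import math
from typing import Dict, List, Sequence


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def linear_predictor(intercept: float, betas: Dict[int, float], previous_score: int) -> float:
    """Patient intercept plus the coefficient selected by yesterday's score.

    The dictionary lookup plays the role of the Kronecker delta; a score
    without a coefficient (here 0, by assumption) is the reference level.
    """
    return intercept + betas.get(previous_score, 0.0)


def ordered_logistic_probs(eta: float, cutpoints: Sequence[float]) -> List[float]:
    """P(y = j) for j = 0..len(cutpoints), given location eta and increasing cutpoints."""
    cdf = [logistic(c - eta) for c in cutpoints] + [1.0]
    return [cdf[0]] + [cdf[j] - cdf[j - 1] for j in range(1, len(cdf))]


# Hypothetical parameter values, for illustration only.
cutpoints = [0.0, 1.5, 3.0, 4.5]
betas = {1: 1.5, 2: 3.0, 3: 4.5, 4: 6.0}
intercept = -0.5

for previous in range(5):
    eta = linear_predictor(intercept, betas, previous)
    probs = ordered_logistic_probs(eta, cutpoints)
    print(previous, [round(p, 3) for p in probs])
```

With increasing coefficients, a higher score today shifts probability mass towards higher scores tomorrow, which is how the autoregressive term encodes persistence.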
2.3. Model validation
We evaluated the predictive performance of our models in a forward‐chaining setting where the models were trained with the first day's data and tested on the second day's data, then re‐trained on the first 2 days' data and tested on the third day's data, and so on. The performance of predicting AD sign scores was quantified by the ranked probability score (RPS), a proper scoring rule for ordinal probabilistic forecasts. The performance of predicting binary AD symptom states was evaluated with the Brier score.
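Both scores can be computed in a few lines. The sketch below is in Python and normalises the RPS by the number of categories minus one, so that it lies between 0 and 1; this normalisation is one common convention and is an assumption here, as the paper does not state which one was used.

```python
from typing import Sequence


def ranked_probability_score(probs: Sequence[float], observed: int) -> float:
    """RPS of a forecast `probs` over ordered categories 0..K-1 (lower is better)."""
    n_categories = len(probs)
    cum_forecast = 0.0
    total = 0.0
    # Compare cumulative forecast and cumulative observation at each threshold.
    for j in range(n_categories - 1):
        cum_forecast += probs[j]
        cum_observed = 1.0 if observed <= j else 0.0
        total += (cum_forecast - cum_observed) ** 2
    return total / (n_categories - 1)


def brier_score(prob_event: float, outcome: int) -> float:
    """Squared error between a predicted probability and a binary outcome (0 or 1)."""
    return (prob_event - outcome) ** 2


# A forecast concentrated near the observed score is rewarded more than a
# forecast concentrated far from it, even when both miss the exact score.
print(ranked_probability_score([0.0, 1.0, 0.0, 0.0, 0.0], observed=0))
print(ranked_probability_score([0.0, 0.0, 0.0, 0.0, 1.0], observed=0))
```

Unlike a simple accuracy measure, the RPS accounts for the ordering of the categories: predicting a score of 1 when 0 is observed is penalised less than predicting a score of 4.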
We compared the performance of our models to that of two benchmark models: the uniform forecast model that predicts each of the five possible outcomes of a sign score with the probability of one‐fifth and the historical forecast model where the probability of each possible outcome is equal to its occurrence in the patient's training data. We also compared our models to the logistic regression model proposed in Kim et al. 10 for the prediction of AD symptom states.
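The two benchmark forecasts, and the way the historical forecast is refreshed under forward chaining, can be sketched as follows. This is a simplified Python illustration for a single patient with no missing days; the function names and the fallback to the uniform forecast when no history is available are assumptions.

```python
from collections import Counter
from typing import List, Sequence, Tuple

N_LEVELS = 5  # sign scores take the values 0, 1, 2, 3, 4


def uniform_forecast() -> List[float]:
    """Each of the five possible scores is predicted with probability one-fifth."""
    return [1.0 / N_LEVELS] * N_LEVELS


def historical_forecast(training_scores: Sequence[int]) -> List[float]:
    """Probability of each score equals its relative frequency in the training data."""
    if not training_scores:
        return uniform_forecast()  # assumption: fallback when there is no history
    counts = Counter(training_scores)
    n = len(training_scores)
    return [counts.get(level, 0) / n for level in range(N_LEVELS)]


def forward_chaining_forecasts(scores: Sequence[int]) -> List[Tuple[int, List[float]]]:
    """Return (day index, forecast) pairs where each forecast only uses earlier days."""
    return [(t, historical_forecast(scores[:t])) for t in range(1, len(scores))]


# Hypothetical trajectory of one sign for one patient.
trajectory = [0, 0, 1, 0, 2]
for day, forecast in forward_chaining_forecasts(trajectory):
    print(day, forecast)
```

Each forecast is then scored against the observation of the day it targets, so no forecast is ever evaluated on data it was trained on.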
3. RESULTS
3.1. Model validation
We trained each of the six mixed‐effect autoregressive ordinal logistic regression models without covariates, where each model was developed for prediction of one of the AD sign scores. The models learnt the patient‐dependent dynamics of the sign scores as more data came in and outperformed the benchmark models in predicting the next day's score for all AD signs (Figure 2). The performance of the benchmark models varied across signs, confirming that the scores of some signs are more imbalanced than others (Figure S1) and easier to predict. For instance, the historical forecast model (and our model) achieved an almost perfect prediction for oedema, for which the outcome is 0 nearly 90% of the time. For other signs, such as dryness, the RPS of our model was about 60% lower (i.e., achieved a better performance) than that of the historical forecast model after 200 days of training.
FIGURE 2. Comparison of the predictive performance of our model (the mixed‐effect autoregressive ordinal logistic regression without covariates) to that of the uniform forecast and the historical forecast models, for prediction of each of the six atopic dermatitis (AD) signs. The performance of predicting AD sign scores is measured by the ranked probability score (RPS) (the lower RPS indicates the better predictive performance). Learning curves were obtained using locally weighted scatterplot smoothing. Shaded areas correspond to ±1.96 standard error.
We derived a prediction for the binary AD symptom state from the six mixed‐effect autoregressive ordinal logistic regression models for AD sign scores (Figure S2), assuming their predictions are independent random variables. Our model outperformed the two benchmark models and the logistic regression model proposed in Kim et al. 10 The Brier score of our model was about 40% lower (i.e., achieved a better performance) than that of the logistic regression model, whose performance was as low as that of the historical forecast model.
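Under the independence assumption, the probability of the symptom state being 1 factorises into the probability that itching plus sleep disturbance is at least 2, multiplied by the probability that at least two of redness, dryness, oedema and oozing are non‐zero. The following Python sketch shows one way to compute this from six per‐sign forecast distributions; the function names and dictionary keys are illustrative assumptions, and the calculation may differ in detail from the one used for Figure S2.

```python
from itertools import product
from typing import List, Mapping, Sequence


def prob_sum_at_least(p_a: Sequence[float], p_b: Sequence[float], threshold: int) -> float:
    """P(A + B >= threshold) for independent scores A and B with distributions p_a, p_b."""
    return sum(
        pa * pb
        for (i, pa), (j, pb) in product(enumerate(p_a), enumerate(p_b))
        if i + j >= threshold
    )


def prob_at_least_k_nonzero(p_nonzero: Sequence[float], k: int) -> float:
    """P(at least k of the independent events occur), by dynamic programming."""
    dist: List[float] = [1.0]  # dist[n] = P(exactly n events so far)
    for p in p_nonzero:
        new = [0.0] * (len(dist) + 1)
        for count, mass in enumerate(dist):
            new[count] += mass * (1.0 - p)
            new[count + 1] += mass * p
        dist = new
    return sum(dist[k:])


def prob_symptom_state(forecasts: Mapping[str, Sequence[float]]) -> float:
    """P(symptom state = 1) from six independent forecast distributions over scores 0..4."""
    p_subjective = prob_sum_at_least(forecasts["itching"], forecasts["sleep_disturbance"], 2)
    p_nonzero = [1.0 - forecasts[s][0] for s in ("redness", "dryness", "oedema", "oozing")]
    return p_subjective * prob_at_least_k_nonzero(p_nonzero, 2)


# Hypothetical forecasts: every sign is predicted with the uniform distribution.
signs = ("dryness", "oedema", "itching", "oozing", "redness", "sleep_disturbance")
uniform = {sign: [0.2] * 5 for sign in signs}
print(prob_symptom_state(uniform))
```

The resulting probability can be scored against the observed binary state with the Brier score.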
3.2. Effect of environmental factors on the model's predictions
To assess the effects of exogenous factors (weather, air pollution and TCS usage) on the prediction of AD sign scores, we computed the pairwise difference in the RPS between the model without covariates, the models with a single covariate and the model with all covariates (Figure 3).
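A paired comparison of this kind can be sketched as follows in Python. The sign convention (RPS without covariates minus RPS with covariates, so that a positive value favours the model with covariates) and the summary by a mean and standard error are assumptions made for illustration; the RPS values are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_rps_difference(
    rps_without: Sequence[float], rps_with: Sequence[float]
) -> Tuple[float, float]:
    """Mean and standard error of the paired differences in RPS.

    Each pair must refer to the same prediction (same patient, same day).
    At least two pairs are needed to estimate the standard error.
    """
    if len(rps_without) != len(rps_with):
        raise ValueError("both sequences must score the same predictions")
    diffs = [a - b for a, b in zip(rps_without, rps_with)]
    return mean(diffs), stdev(diffs) / math.sqrt(len(diffs))


# Hypothetical RPS values for three predictions.
delta, se = paired_rps_difference([0.10, 0.20, 0.30], [0.10, 0.18, 0.31])
print(f"difference = {delta:.4f}, approximate 95% interval half-width = {1.96 * se:.4f}")
```

An interval that comfortably includes 0 would indicate no evidence that the covariates improve the predictions.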
FIGURE 3. Effects of environmental factors (mean temperature [Temp], relative humidity [RH], total rainfall [RF], diurnal temperature range [DTR], and the concentration of air pollutants [PM10, NO2, O3]) and treatment usage (topical corticosteroids [TCS]) on AD sign score prediction. (A) The pairwise difference in predictive performance (ranked probability score [RPS]) between the model without covariates and the model with covariates (single or all), computed as the RPS of the model without covariates minus the RPS of the model with covariates. A difference > 0 indicates that the model with covariates has a higher predictive performance. (B) The coefficients for the covariates in the single‐covariate models. A positive coefficient means that an increase in the covariate is associated with a higher probability for more severe outcomes.
No evidence was found that the inclusion of exogenous factors improved the predictive performance of the model for any of the signs (Figure 3A). Even though some of the coefficients associated with the covariates have confidence bounds that do not cross 0, all of them were small in magnitude, accounting for only approximately 1% of the linear predictor (Figure 3B). These small coefficients explain the lack of a noticeable improvement in the predictive performance of the model when the covariates are added.
4. DISCUSSION
The assessment of the effects of environmental factors on AD has gained a growing importance. Many prior studies investigated whether environmental factors were associated with the current AD severity, but they neither considered the dynamic nature of the severity nor investigated whether the future AD severity can be predicted by real‐time data on environmental factors.
We developed a mixed‐effect autoregressive ordinal logistic regression model that can predict the next day's AD severity scores, using the longitudinal data from a published panel study. 10 Our model successfully made daily predictions of the AD severity scores: it outperformed two benchmark models for the prediction of AD sign scores (Figure 2) and outperformed both the benchmark models and the logistic regression model proposed in Kim et al. 10 for the prediction of the AD symptom state (Figure S2). However, the inclusion of environmental factors did not improve the predictive performance of the model (Figure 3).
Our results from a comprehensive data set of South Korean children do not present any convincing evidence to support the claim made in Kim et al. 10 that AD symptoms were associated with weather or air pollutants on a short‐term basis. Environmental factors can be considered predictive only if their inclusion in a predictive model improves its predictive performance, which was not the case here. The short‐term influence of the environmental factors on AD sign scores was outweighed by the persistence of the previous scores, and the next day's score for a patient is more accurately predicted from the patient's score today than from environmental data. Neglecting the time‐dependence of the AD severity scores, as in previous studies, 9 , 11 , 12 , 13 may misguide inferences about the effect size of environmental associations. The extent to which AD severity can be predicted from the measurement of environmental factors remains unclear. Our results cast serious doubt on the claims of many AD apps that purport to use real‐time environmental measures to inform users when their AD symptoms are likely to flare.
It is possible that other ‘internal’ factors such as the development of skin autoimmunity may be more important than external factors in determining disease fluctuations over time. 17 It is also important to state that factors that determine disease incidence may be different from those that determine disease chronicity, so it is still possible that environmental factors may be more predictive of the AD onset and long‐term disease trajectories rather than short‐term symptom fluctuations.
This study used a high‐quality data set on South Korean children with high rates of data completion. Modelling each of the six AD signs enabled us to extract more information from the data and to generate predictions for any quantity of interest to the practitioner, be it ADSS or any combination of the sign scores. In terms of study limitations, the AD sign scores used in this study were obtained by subjective assessment by the patients (or their carers) on a discrete scale. Further investigation of the seemingly small effects of environmental factors on AD severity scores may benefit from more data or better‐quality data, for example, by recording time series of SCORAD or eczema area and severity index, or their self‐assessed versions, as they are more objective and may offer better responsiveness to environmental changes. However, dichotomization of AD sign scores into a binary AD symptom state as proposed in Kim et al. 10 reduces the power of the analysis 18 and is not recommended. Our model might be improved by taking measurement errors into account using hidden Markov models or by modelling the correlations between AD signs in a multi‐outcome regression. However, we believe the additional complexity in the model would only result in marginal improvements in the already solid predictive performance.
Whilst this study focused on the association between environmental factors and future AD severity scores, whether environmental factors cause a change in AD scores is of more interest for the AD community. Estimating the causal effect is challenging, as most causal inference methods assume the absence of unobserved confounders, 19 an assumption that is deemed unrealistic. For example, ‘staying indoors’ was not recorded in the original study 10 but could lead to reverse causation if patients decided to stay indoors during a pollution peak. Estimation of non‐linear interactions may also be required, if patients react differently to environmental triggers depending on their severity: mild patients could be less sensitive than severe patients who may be subject to a ‘ceiling effect’. Constructing causal diagrams using specialist background knowledge could be a promising approach.
The methods presented in this study could be applied to other diseases, such as asthma, for which associations between environmental factors and asthma exacerbations are of interest.
CONFLICT OF INTEREST
The authors declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported.
AUTHOR CONTRIBUTIONS
Guillem Hurault: Conceptualization (equal); data curation (equal); formal analysis (equal); investigation (equal); methodology (equal); software (equal); validation (equal); visualization (equal); writing‐original draft (equal). Valentin Delorieux: Data curation (equal); formal analysis (equal); investigation (equal); methodology (equal); software (equal); validation (equal); writing‐original draft (equal). Young‐Min Kim: Resources (supporting). Kangmo Ahn: Resources (supporting); writing‐review & editing (supporting). Hywel Williams: Conceptualization (supporting); funding acquisition (supporting); writing‐review & editing (equal). Reiko J. Tanaka: Conceptualization (equal); funding acquisition (lead); project administration (lead); resources (lead); supervision (lead); validation (supporting); writing‐original draft (supporting); writing‐review & editing (equal).
Supporting information
Supplementary Material 1
ACKNOWLEDGMENTS
This study was funded by the British Skin Foundation (005/R/18) and by the Environmental Health Center Project of the Ministry of Environment, Republic of Korea.
REFERENCES
- 1. Williams HC. Atopic Dermatitis: The Epidemiology, Causes and Prevention of Atopic Eczema. Cambridge University Press; 2000.
- 2. Langan SM, Irvine AD, Weidinger S. Atopic dermatitis. Lancet. 2020;396(10247):345‐360.
- 3. Williams H, Stewart A, von Mutius E, Cookson W, Anderson HR. Is eczema really on the increase worldwide? J Allergy Clin Immunol. 2008;121(4):947‐954.
- 4. Passeron T, Krutmann J, Andersen ML, Katta R, Zouboulis CC. Clinical and biological impact of the exposome on the skin. J Eur Acad Dermatol Venereol. 2020;34(S4):4‐25.
- 5. Ahn K, Kim BE, Kim J, Leung DY. Recent advances in atopic dermatitis. Curr Opin Immunol. 2020;66:14‐21.
- 6. Ahn K. The role of air pollutants in atopic dermatitis. J Allergy Clin Immunol. 2014;134(5):993‐999.
- 7. Kathuria P, Silverberg JI. Association of pollution and climate with atopic eczema in US children. Pediatr Allergy Immunol. 2016;27(5):478‐485.
- 8. Hendricks AJ, Eichenfield LF, Shi VY. The impact of airborne pollution on atopic dermatitis: a literature review. Br J Dermatol. 2020;183(1):16‐23.
- 9. Baek J‐O, Cho J, Roh J‐Y. Associations between ambient air pollution and medical care visits for atopic dermatitis. Environ Res. 2020;195:110153.
- 10. Kim YM, Kim J, Han Y, Jeon BH, Cheong HK, Ahn K. Short‐term effects of weather and air pollution on atopic dermatitis symptoms in children: a panel study in Korea. PLoS One. 2017;12(4):e0175229.
- 11. Kim Y‐M, Kim J, Jung K, Eo S, Ahn K. The effects of particulate matter on atopic dermatitis symptoms are influenced by weather type: application of spatial synoptic classification (SSC). Int J Hyg Environ Health. 2018;221(5):823‐829.
- 12. Noh SR, Kim J‐S, Kim E‐H, et al. Spectrum of susceptibility to air quality and weather in individual children with atopic dermatitis. Pediatr Allergy Immunol. 2019;30(2):179‐187.
- 13. Patella V, Florio G, Palmieri M, et al. Atopic dermatitis severity during exposure to air pollutants and weather changes with an Artificial Neural Network (ANN) analysis. Pediatr Allergy Immunol. 2020;31(8):938‐945.
- 14. Galen LS, Xu X, Koh MJA, Thng S, Car J. Eczema apps conformance with clinical guidelines: a systematic assessment of functions, tools and content. Br J Dermatol. 2020;182(2):444‐453.
- 15. Hurault G, Domínguez‐Hüttinger E, Langan SM, Williams HC, Tanaka RJ. Personalized prediction of daily eczema severity scores using a mechanistic machine learning model. Clin Exp Allergy. 2020;50(11):1258‐1266.
- 16. Lee JY, Kim M, Yang H‐K, et al. Reliability and validity of the atopic dermatitis symptom score (ADSS). Pediatr Allergy Immunol. 2018;29(3):290‐295.
- 17. Tang TS, Bieber T, Williams HC. Does "autoreactivity" play a role in atopic dermatitis? J Allergy Clin Immunol. 2012;129(5):1209‐1215.
- 18. Royston P, Altman DG, Sauerbrei W. Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med. 2006;25(1):127‐141.
- 19. Glymour C, Zhang K, Spirtes P. Review of causal discovery methods based on graphical models. Front Genet. 2019;10:524.
