Abstract
The aim of this study was to develop and explore the diagnostic accuracy of a decision tree derived from a large real-life primary care population.
Data from 9297 primary care patients (45% male, mean age 53±17 years) with suspicion of an obstructive pulmonary disease were derived from an asthma/chronic obstructive pulmonary disease (COPD) service where patients were assessed using spirometry, the Asthma Control Questionnaire, the Clinical COPD Questionnaire, history data and medication use. All patients were diagnosed through the Internet by a pulmonologist. The Chi-squared Automatic Interaction Detection method was used to build the decision tree. The tree was externally validated in another real-life primary care population (n=3215).
Our tree correctly diagnosed 79% of the asthma patients, 85% of the COPD patients and 32% of the asthma–COPD overlap syndrome (ACOS) patients. External validation showed a comparable pattern (correct: asthma 78%, COPD 83%, ACOS 24%).
Our decision tree is considered to be promising because it was based on real-life primary care patients with a specialist's diagnosis. In most patients the diagnosis could be correctly predicted. Predicting ACOS, however, remained a challenge. The total decision tree can be implemented in computer-assisted diagnostic systems for individual patients. A simplified version of this tree can be used in daily clinical practice as a desk tool.
Short abstract
A real-life diagnostic decision tree that can be implemented in digital decision-making programmes http://ow.ly/VnHut
Introduction
Diagnostic reasoning and clinical decision making are essential in daily clinical practice and depend on the physician's ability to synthesise and interpret clinical information. Different attempts have been made to support physicians in this process by developing decision support tools. These tools have the potential to improve care and decrease variation in care delivery [1], and can provide useful diagnostic suggestions leading to a decrease in diagnostic errors [2]. Probably the most promising approach to improve diagnostic accuracy is to incorporate decision aids directly into daily clinical practice using computer-assisted diagnostic support systems [3]. These decision support tools based on expert opinion can provide expert consultation to physicians [2].
Many clinicians who have to deal with individual patients have a negative attitude towards these systems, as most are not developed in real-life situations, thus reducing generalisability [1]. Another shortcoming of currently available tools is that they are mostly based on regression and, hence, are too complex and time-consuming for use in daily clinical practice [4]. A new way to develop decision support tools is using data from real-life clinical decisions to develop decision trees.
Decision trees based on real-life data are promising because they can detect previously unknown interactions between the various items of clinical information and reveal relationships between assessment outcomes and patient characteristics. Additionally, decision trees are visually easy to interpret and transparent so that clinicians see the thresholds leading to the outcome. Moreover, they can trace back the model [5] and they can see what can be expected if the patient's status changes [6].
We set out to develop a decision tree to predict asthma, chronic obstructive pulmonary disease (COPD) and asthma–COPD overlap syndrome (ACOS) diagnosis based on careful analysis of 9297 real-life individual patient assessments in a primary care-based diagnostic support system [7]. All patients were suspected to have an obstructive pulmonary disease (OPD) and were assessed identically according to a structured protocol. Each patient was diagnosed by an experienced pulmonologist (n=10). The aim of this study was to enhance diagnostic accuracy and decrease diagnostic variation. We present a decision tree that is intended to be implemented as a decision aid in computer-assisted diagnostic support systems, together with a simplified, compact version that is intended to be used on paper as a desk tool in daily clinical practice.
Method
Study design
We retrospectively analysed data obtained from 2007 until 2012 from the Groningen asthma/COPD service for primary care (the Netherlands) [7]. The Standards for Reporting Diagnostic Accuracy (STARD) guidelines were used as a basis for this study. According to Dutch regulations, a separate ethical committee approval was not required because data were used anonymously and encrypted.
Patient cohort for dataset derivation
We only included patient data from experienced pulmonologists (n=10), who had each assessed ≥300 patients in the asthma/COPD service, in order to avoid the influence of learning effects in our results. Patients (aged >15 years) referred to the asthma/COPD service by their general practitioner for diagnostic assessment were included in the study (table 1). This was an unselected primary care population of patients with respiratory complaints. The proportion of no-show in the asthma/COPD service is on average 12%. The initial dataset consisted of 10 058 patients. Data from 761 patients were excluded because they could not perform an assessable spirometry (n=626) or had missing data at random (n=135). The analysis was therefore based on the remaining 9297 patients.
TABLE 1.
 | Derivation database# | Validation database¶ |
Patients | 9297 | 3142+ |
Diagnosis | ||
COPD | 1716 (18.5) | 555 (17.7) |
Asthma | 4125 (44.4) | 685 (21.8) |
Probable asthma | | 836 (26.6) |
ACOS | 711 (7.6) | 247 (7.9) |
Other | 2745 (29.5) | 818 (26.0) |
Patient characteristics | ||
Male | 4146 (44.6) | 1347 (42.9) |
Smoked | ||
Never smoked | 2833 (30.5) | 1182 (37.7) |
Ever smoked | 6464 (69.5) | 1895 (62.3) |
Family history | ||
No or unknown family history | 4525 (48.7) | 2146 (68.3) |
Positive family history | 4772 (51.3) | 996 (31.7) |
Allergy | ||
No allergy | 5542 (58.9) | 932 (29.7) |
≥1 allergy | 3755 (39.9) | 1651 (52.5) |
Missing data | 105 (1.1) | 559 (17.8) |
Hyperreactivity | ||
No hyperreactivity | 3105 (33.4) | 2347 (74.7) |
Hyperreactivity present | 6192 (66.6) | 795 (25.3) |
Occupational risk | ||
No occupational risk | 8742 (94.0) | Unknown |
Occupational risk present | 555 (6.0) | |
Age years | 53.3±17.1 | 49.4±16.8 |
Age of onset years | 35.4±23.3 | 36.1±21.6 |
Total ACQ score | 1.2±0.9 | 1.3±0.9 |
Total CCQ score | 1.4±0.9 | 1.5±0.9 |
Lung function post bronchodilator | ||
FEV1 L | 2.9±1.0 | 2.9±1.0 |
FEV1 % predicted | 89.4±19.3 | 92.1±20.0 |
FVC L | 3.9±1.1 | 4.0±1.1 |
FVC % predicted | 101.6±16.5 | 106.7±35.9 |
FEV1/FVC | 73.0±12.9 | 72.1±13.6 |
Reversibility % | 6.1±7.5 | 6.9±9.0 |
Data are presented as n, n (%) or mean±sd. COPD: chronic obstructive pulmonary disease; ACOS: asthma–COPD overlap syndrome; ACQ: Asthma Control Questionnaire; CCQ: Clinical COPD Questionnaire; FEV1: forced expiratory volume in 1 s; FVC: forced vital capacity. #: database from the asthma/COPD service used for development of the decision tree (Groningen, the Netherlands); ¶: database from the asthma/COPD service used in the external validation (Rotterdam, the Netherlands); +: total diagnosed was 3141 because one patient could not perform a proper lung function test.
Predictors
Predictors could be divided into 1) patient characteristics, 2) patient-reported outcomes (PROs) and 3) spirometry results. All 22 predictors were collected during one regular baseline assessment procedure in the asthma/COPD service. No adverse effects were to be expected from the assessments.
Patient characteristics
A medical history questionnaire with questions about sex, age, age of onset of respiratory symptoms, family history, current and past symptoms, exacerbations, allergy and other stimuli provoking symptoms, current medication, occupation and smoking was collected.
PROs: the Asthma Control Questionnaire and the Clinical COPD Questionnaire
The Asthma Control Questionnaire (ACQ) [8] was used to measure asthma control and contains six questions. The Clinical COPD Questionnaire (CCQ) [9] was used to measure COPD health status and contains 10 questions. In the decision tree analysis we included all individual questions from the ACQ and CCQ and the total score on each questionnaire, to examine whether disease severity and specific symptoms could be used to distinguish between the different diagnoses.
Spirometry results
Spirometry was performed according to current guidelines [10, 11]. We analysed post-bronchodilator (post-BD) forced expiratory volume in 1 s (FEV1), post-BD forced vital capacity (FVC) and post-BD FEV1/FVC ratio. Also, reversibility of FEV1 (in litres) after 400 μg salbutamol was examined.
Statistical analyses
SPSS package 22.0 (IBM Corp., Armonk, NY, USA) was used for the statistical analyses. Initially, continuous variables were divided into categorical counterparts using optimal binning [12] to enhance the performance and accuracy of the decision tree. The number of predefined categorical counterparts was two, except for the body mass index and FEV1 post-BD, where we chose to accept a maximum of four counterparts (table 2) [3, 13].
TABLE 2.
Predictor | Established categories |
Patient characteristics | |
Age years | <55; ≥55 |
Age of onset years | <38; ≥38 |
BMI kg·m−2 | <22; ≥22 and <36; ≥36 |
Allergy total | No allergy; ≥1 allergy |
Hyperreactivity | No hyperreactivity; ≥1 hyperreactivity |
ACQ and CCQ | |
ACQ1 | 0 or 1; ≥2 |
ACQ total, ACQ2, ACQ4, ACQ5, ACQ6 | 0; ≥1 |
CCQ subscale mental, CCQ1, CCQ2, CCQ4 | 0; ≥1 |
CCQ subscale symptoms, CCQ6 | 0 or 1; ≥2 |
CCQ7 | <6; ≥6 |
Spirometry results | |
FEV1 % predicted | <78; ≥78 and <92; ≥92 and <102; ≥102 |
FVC % predicted | <81; ≥81 |
Reversibility % | <7; ≥7 |
Continuous predictors were transformed to ordinal predictors using minimum description length discretisation. It was not possible to create bins for Asthma Control Questionnaire (ACQ) question 3, Clinical COPD Questionnaire (CCQ) questions 3, 5, 8, 9 or 10, CCQ total or CCQ subscale functional, because of low association with the dependent variable. BMI: body mass index; FEV1: forced expiratory volume in 1 s; FVC: forced vital capacity.
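As an illustration of how the categories in table 2 could be applied to a new patient, the following minimal sketch maps a continuous value to its category index. The dictionary keys and the function name are hypothetical and chosen only for this example; the cut points are copied from table 2, and the sketch applies the published bins rather than re-deriving them.

```python
from bisect import bisect_right

# Cut points copied from table 2. Each list holds the lower bound of every
# category after the first one. The key names are illustrative assumptions.
CUT_POINTS = {
    "age_years": [55],
    "age_of_onset_years": [38],
    "bmi": [22, 36],
    "fev1_pct_predicted": [78, 92, 102],
    "fvc_pct_predicted": [81],
    "reversibility_pct": [7],
}


def to_category(predictor: str, value: float) -> int:
    """Return the 0-based index of the table 2 category that contains value."""
    # bisect_right puts a value equal to a cut point into the upper category,
    # which matches the "<55" versus ">=55" convention used in table 2.
    return bisect_right(CUT_POINTS[predictor], value)
```

For example, `to_category("fev1_pct_predicted", 85)` returns the index of the "≥78 and <92" category.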
Development of the decision tree
We used the exhaustive Chi-squared Automatic Interaction Detection (CHAID) method [13] to develop our decision tree. For an overview of relevant decision tree concepts see figure 1. In the decision tree we combined “indication of restriction”, “diagnosis unclear” or “no disease” with “other”. The maximum tree depth was five levels and the significance level for merging nodes was 0.01. Bonferroni correction was applied to correct for overstating of the significance level caused by multiple comparisons. The minimum number of patients in a child leaf was 94 (>1% of the total number of patients).
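The core of a CHAID split decision is a chi-squared test of independence between a candidate predictor and the diagnosis. The sketch below is a simplified illustration of that test, not a reproduction of the SPSS procedure: it uses the counts of two leaves from table 6, multiplies the p-value by the number of candidate predictors as a crude stand-in for the Bonferroni multiplier (exhaustive CHAID derives its multiplier from the category merges), and reuses the 0.01 level that the text gives for merging nodes. All names are hypothetical.

```python
import math


def pearson_chi2(table):
    """Pearson chi-squared statistic for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Counts (ACOS, COPD, asthma, other) of the two reversibility leaves under
# "FEV1/FVC >=70%, onset age <38 years, >=1 allergy" in table 6.
children = [
    [11, 3, 1108, 293],  # reversibility <7%
    [11, 1, 647, 65],    # reversibility >=7%
]

N_CANDIDATE_PREDICTORS = 22  # assumption: used as a simple Bonferroni multiplier
ALPHA = 0.01                 # assumption: the merging level reused as split level

# Two child nodes and four diagnoses give (2 - 1) * (4 - 1) = 3 degrees of freedom.
stat = pearson_chi2(children)
p_adjusted = min(1.0, chi2_sf_df3(stat) * N_CANDIDATE_PREDICTORS)
split_is_significant = p_adjusted < ALPHA
```

Some expected counts in this example are small (the COPD column), so the chi-squared approximation is rough there; the sketch is meant to show the mechanics only.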
A simplified compact version of the decision tree [13] was developed by reducing the initial decision tree with a technique called pruning. Branches were pruned if the difference in main category between the parent leaf and the child leaf was <10%. For example, if the proportion of asthmatics in the parent leaf is 43% and the proportion of asthmatics in the child leaf is 40%, the branch will be pruned because the difference is <10%. To enhance usability we determined the maximum tree depth to be four levels and discussed this tool with experienced clinicians.
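The pruning criterion described above can be written as a one-line check. This is a minimal sketch that assumes the "<10%" difference is measured in absolute percentage points, as the 43% versus 40% example suggests; the function and constant names are hypothetical.

```python
PRUNE_THRESHOLD = 0.10  # 10 percentage points, expressed as a proportion


def should_prune(parent_share: float, child_share: float) -> bool:
    """True if the main-category share changes by less than the threshold."""
    return abs(parent_share - child_share) < PRUNE_THRESHOLD
```

With the example from the text, `should_prune(0.43, 0.40)` evaluates to `True`, so that branch would be removed.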
Internal validation
We validated our decision tree with the “10-fold cross validation” method (figure 2). The dataset was randomly divided into 10 mutually exclusive subsets, and each subset was held out in turn to function as the validation sample while the decision tree was developed on the nine remaining subsets combined, so that each subset was used once as validation set. Following Witten et al. [14], this procedure was repeated 10 times, so that the final decision tree was based on 100 tree analyses.
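The partitioning step of 10-fold cross validation can be sketched with the standard library alone. The code below only builds the folds and the training/validation index pairs; fitting a tree on each training set is left out. The function names and the fixed seed are assumptions made for this example.

```python
import random


def k_fold_indices(n_items: int, k: int = 10, seed: int = 0):
    """Split range(n_items) into k mutually exclusive folds of near-equal size."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Taking every k-th shuffled index gives fold sizes that differ by at most 1.
    return [indices[i::k] for i in range(k)]


def train_validation_pairs(folds):
    """Yield (training indices, validation indices), holding out each fold once."""
    for held_out, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != held_out for idx in fold]
        yield training, validation


folds = k_fold_indices(9297)  # size of the derivation dataset
```

Calling `k_fold_indices` with 10 different seeds and iterating over `train_validation_pairs` each time would give 100 training/validation pairs, in line with the 100 tree analyses mentioned in the text.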
External validation
We validated our decision tree in an external database of another Dutch asthma/COPD service for primary care that operates in Rotterdam and has a similar structure to the service in Groningen. Patients were assessed by two pulmonologists and two specialised general practitioners. This database is called the validation database.
Results
Patient characteristics
We included 9297 patients (mean age 53±17 years, 44.6% male, diagnosis by pulmonologist: 44.4% asthma, 18.5% COPD, 7.6% ACOS and 29.5% other). Patients from the validation dataset (n=3142) were comparable with patients from the derivation dataset (mean age 49±17 years, 42.9% male, 21.8% asthma, 26.6% “probable asthma”, 17.7% COPD, 7.9% ACOS and 26.0% other). However, the proportion of asthma diagnoses given by the pulmonologists differed (derivation: 44.4% asthma; validation: 21.8% asthma) (table 1).
Exhaustive CHAID analysis
The final decision tree consisted of the following predictors (in order of importance): FEV1/FVC, age of onset, smoking, allergy, reversibility, ACQ question 5 (“In general, during the past week, how much of the time did you wheeze?”), age, FEV1 and bronchial hyperreactivity. Comparisons between the predicted diagnoses and actual pulmonologists’ diagnoses are given in tables 3–5. The average predictive value of the decision tree before pruning was 69.0% (proportion correct: asthma 78.9%, COPD 84.7%, ACOS 31.6% and other 53.9%) (table 3). The most important pathways leading to diagnoses were: 1) no obstruction, onset age <38 years, allergy and reversibility ≥7%, leading to asthma (89% correct); and 2) obstruction, smoked, onset age ≥38 years and FEV1 <78% predicted, leading to COPD (81% correct). ACOS was only predicted by one pathway (obstruction, smoked, onset age <38 years). The pathway “no obstruction, no allergy, reversibility <7% and onset age ≥38 years” did not predict diagnosis and led to the category “other” in 1961 patients, which is 21.2% of the total patient population and is the largest branch. For an overview of all pathways, see table 6.
TABLE 3.
Diagnosis by pulmonologist | Diagnosis predicted by decision tree | | | | Total | Correct |
 | ACOS | COPD | Asthma | Other# | | |
ACOS | 225 | 355 | 98 | 33 | 711 (7.6) | 225 (31.6) |
COPD | 135 | 1454 | 68 | 59 | 1716 (18.5) | 1454 (84.7) |
Asthma | 162 | 101 | 3253 | 609 | 4125 (44.4) | 3253 (78.9) |
Other# | 28 | 128 | 1109 | 1480 | 2745 (29.5) | 1480 (53.9) |
Total | 550 (5.9) | 2038 (21.9) | 4528 (48.7) | 2181 (23.5) | 9297 (100) | 6412 (69.0) |
Data are presented as n or n (%). ACOS: asthma–COPD overlap syndrome; COPD: chronic obstructive pulmonary disease. #: “diagnosis unclear”, “indication of restriction” or “no disease”. Bold indicates diagnoses that were correctly predicted.
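The "Correct" column and the overall predictive value in table 3 follow directly from the cross-tabulated counts. The short sketch below recomputes them; the variable and function names are chosen for this example only.

```python
LABELS = ["ACOS", "COPD", "Asthma", "Other"]

# Rows: diagnosis by pulmonologist; columns: diagnosis predicted by the tree,
# both in the order of LABELS. Counts are copied from table 3.
TABLE_3 = [
    [225, 355, 98, 33],
    [135, 1454, 68, 59],
    [162, 101, 3253, 609],
    [28, 128, 1109, 1480],
]


def proportion_correct(matrix):
    """Return ({label: % correct}, overall % correct), rounded to one decimal."""
    per_class = {
        label: round(100 * matrix[i][i] / sum(matrix[i]), 1)
        for i, label in enumerate(LABELS)
    }
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return per_class, round(100 * correct / total, 1)


per_class, overall = proportion_correct(TABLE_3)
# per_class -> {"ACOS": 31.6, "COPD": 84.7, "Asthma": 78.9, "Other": 53.9}
# overall   -> 69.0
```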
TABLE 5.
Diagnosis by validation pulmonologist | Diagnosis predicted by decision tree | | | | Total | Correct |
 | ACOS | COPD | Asthma | Other# | | |
ACOS | 59 | 151 | 35 | 2 | 247 (7.9) | 59 (23.9) |
COPD | 53 | 459 | 37 | 6 | 555 (17.7) | 459 (82.7) |
Asthma | 42 | 32 | 533 | 78 | 685 (21.8) | 533 (77.8) |
Probable asthma | 11 | 7 | 580 | 238 | 836 (26.6) | 580 (69.4) |
Other# | 10 | 59 | 336 | 413 | 818 (26.0) | 413 (50.5) |
Total | 175 | 708 | 1521 | 737 | 3141 (100.0) | 2044 (65.1) |
Data are presented as n or n (%). ACOS: asthma–COPD overlap syndrome; COPD: chronic obstructive pulmonary disease. #: “diagnosis unclear”, “indication of restriction” or “no disease”. Bold indicates diagnoses that were correctly predicted.
TABLE 6.
Rule branch | Main outcome | Total leaf n (% total leaf) | ACOS n (% total) | COPD n (% total) | Asthma n (% total) | Other# n (% total) |
FEV1/FVC ≥70% predicted; Onset age <38 years; ≥1 allergy; Reversibility <7% | Asthma | 1415 (15.2) | 11 (0.8) | 3 (0.2) | 1108 (78.3) | 293 (20.7) |
FEV1/FVC ≥70% predicted; Onset age <38 years; ≥1 allergy; Reversibility ≥7% | Asthma | 724 (7.8) | 11 (1.5) | 1 (0.1) | 647 (89.4) | 65 (9.0) |
FEV1/FVC ≥70% predicted; Onset age <38 years; No allergy; Wheezing | Asthma | 829 (8.9) | 16 (1.9) | 5 (0.6) | 593 (71.5) | 215 (25.9) |
FEV1/FVC ≥70% predicted; Onset age ≥38 years; ≥1 allergy; Reversibility <7% | Asthma | 548 (5.9) | 11 (2.0) | 7 (1.3) | 276 (50.4) | 254 (46.4) |
FEV1/FVC ≥70% predicted; Onset age ≥38 years; ≥1 allergy; Reversibility ≥7% | Asthma | 181 (1.9) | 8 (4.4) | 2 (1.1) | 133 (73.5) | 38 (21.0) |
FEV1/FVC <70% predicted; Never smoked | Asthma | 356 (3.8) | 35 (9.8) | 43 (12.2) | 219 (61.5) | 59 (16.6) |
FEV1/FVC <70% predicted; Onset age <38 years; Smoked | ACOS | 783 (8.4) | 302 (38.6) | 252 (32.2) | 183 (23.4) | 46 (5.9) |
FEV1/FVC <70% predicted; Onset age ≥38 years; Smoked; FEV1 <78% predicted | COPD | 1142 (12.3) | 164 (14.4) | 928 (81.3) | 19 (1.7) | 31 (2.7) |
FEV1/FVC <70% predicted; Onset age ≥38 years; Smoked; FEV1 ≥78% and <92% predicted; Reversibility <7% | COPD | 257 (2.8) | 26 (10.1) | 203 (79.0) | 11 (4.3) | 17 (6.6) |
FEV1/FVC <70% predicted; Onset age ≥38 years; Smoked; FEV1 ≥78% and <92% predicted; Reversibility ≥7% | COPD | 168 (1.8) | 56 (33.3) | 79 (47.0) | 18 (10.7) | 15 (8.9) |
FEV1/FVC <70% predicted; Onset age ≥38 years; Smoked; FEV1 ≥92% predicted | COPD | 238 (2.6) | 32 (13.4) | 127 (53.4) | 32 (13.4) | 47 (19.7) |
FEV1/FVC ≥70% predicted; Onset age ≥38 years; No allergy | Other# | 1961 (21.2) | 33 (1.7) | 61 (3.1) | 561 (28.6) | 1306 (66.6) |
FEV1/FVC ≥70% predicted; Onset age <38 years; No allergy; No wheezing | Other# | 695 (7.5) | 6 (0.9) | 5 (0.7) | 323 (46.8) | 359 (51.7) |
ACOS: asthma–COPD overlap syndrome; COPD: chronic obstructive pulmonary disease; FEV1: forced expiratory volume in 1 s; FVC: forced vital capacity. #: “diagnosis unclear”, “indication of restriction” or “no disease”. Bold indicates diagnoses that were correctly predicted. Spirometry results were taken after administration of a bronchodilator.
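To show how the pathways in table 6 might be encoded in a digital decision aid, the following sketch follows the rule branches and returns the main outcome of the matching leaf together with the percentage of patients in that leaf who received that diagnosis from the pulmonologist. The function name and parameter names are hypothetical, the `wheezing` flag is assumed to be derived from ACQ question 5, and all spirometry inputs are assumed to be post-bronchodilator values. It is an illustration of the published table, not the authors' software.

```python
def predict_diagnosis(fev1_fvc_pct, onset_age, smoked, has_allergy,
                      reversibility_pct, wheezing, fev1_pct_predicted):
    """Follow the table 6 pathways.

    Returns (main outcome, % of the leaf with that pulmonologist diagnosis).
    """
    if fev1_fvc_pct >= 70:  # no obstruction
        if has_allergy:
            if onset_age < 38:
                return ("Asthma", 78.3) if reversibility_pct < 7 else ("Asthma", 89.4)
            return ("Asthma", 50.4) if reversibility_pct < 7 else ("Asthma", 73.5)
        if onset_age >= 38:
            return ("Other", 66.6)
        return ("Asthma", 71.5) if wheezing else ("Other", 51.7)

    # Obstruction (FEV1/FVC below 70%)
    if not smoked:
        return ("Asthma", 61.5)
    if onset_age < 38:
        return ("ACOS", 38.6)
    if fev1_pct_predicted < 78:
        return ("COPD", 81.3)
    if fev1_pct_predicted < 92:
        return ("COPD", 79.0) if reversibility_pct < 7 else ("COPD", 47.0)
    return ("COPD", 53.4)
```

Returning the leaf percentage alongside the label keeps the uncertainty of each pathway visible, for instance the 38.6% attached to the single ACOS pathway.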
The simplified compact version of the decision tree (figure 3) was slightly more efficient, with 11 termination leaves. The simplified tree is practical in clinical practice. However, the overall precision of this tree was slightly lower than the complete decision tree: overall 67.5% were correctly predicted (proportion correct: asthma 72.1%, COPD 77.9%, ACOS 42.5% and other 60.7%). After discussion with experienced clinicians (n=3), we decided to exclude FEV1 post-BD, to enhance applicability. For a comparison between the predicted diagnoses from this simplified decision tree and the actual pulmonologists' diagnoses, see table 4.
TABLE 4.
Diagnosis by pulmonologist | Diagnosis predicted by simplified tree | | | | Total | Correct |
 | ACOS | COPD | Asthma | Other# | | |
ACOS | 302 | 278 | 92 | 39 | 711 (7.6) | 302 (42.5) |
COPD | 252 | 1337 | 61 | 66 | 1716 (18.5) | 1337 (77.9) |
Asthma | 183 | 80 | 2976 | 886 | 4125 (44.4) | 2976 (72.1) |
Other# | 46 | 110 | 924 | 1665 | 2745 (29.5) | 1665 (60.7) |
Total | 783 (8.4) | 1805 (19.4) | 4053 (43.6) | 2656 (28.6) | 9297 (100) | 6280 (67.5) |
Data are presented as n or n (%). ACOS: asthma–COPD overlap syndrome; COPD: chronic obstructive pulmonary disease. #: “diagnosis unclear”, “indication of restriction” or “no disease”. Bold indicates diagnoses that were correctly predicted.
Internal validation
The error rates of the 10 repeated decision tree analyses ranged from 0.314 to 0.318, with an average error of 0.316. Variation in error rates exists because the random splits used for the “10-fold cross validation” differ slightly between repetitions. We selected the decision tree with the lowest error rate (0.314).
External validation
Our decision tree could correctly predict diagnosis in 54.2% of the patients in the validation dataset (proportion correct: asthma 77.8%, COPD 82.7%, ACOS 23.9% and other 39.4%). In 836 (26.6%) patients from the validation database with unclear diagnosis, the assessing pulmonologists added a remark in the database with the notion “probable asthma”. We repeated the validation procedure and included “probable asthma” patients in the asthma group. The accuracy of our decision tree improved substantially: the overall proportion correct became 65.1% (ACOS 23.9%, COPD 82.7%, asthma 77.8% and other 50.5%), which is comparable with the accuracy of the decision tree in the derivation dataset (table 5).
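The two overall accuracies reported for the validation dataset can be recomputed from the counts in table 5. The sketch below treats the “probable asthma” patients first as part of “other” and then as asthma; the variable names are chosen for this example only.

```python
# Correctly predicted counts per pulmonologist diagnosis, excluding the
# "probable asthma" row, and row totals. Counts are copied from table 5.
CORRECT = {"ACOS": 59, "COPD": 459, "Asthma": 533, "Other": 413}
TOTAL = {"ACOS": 247, "COPD": 555, "Asthma": 685, "Probable asthma": 836, "Other": 818}

# "Probable asthma" patients, split by what the tree predicted for them.
PROBABLE_PREDICTED_ASTHMA = 580
PROBABLE_PREDICTED_OTHER = 238

n_patients = sum(TOTAL.values())  # 3141


def overall_pct(correct_count):
    """Overall percentage correct, rounded to one decimal."""
    return round(100 * correct_count / n_patients, 1)


# Scenario 1: "probable asthma" counted as "other" (diagnosis unclear).
as_other = overall_pct(sum(CORRECT.values()) + PROBABLE_PREDICTED_OTHER)

# Scenario 2: "probable asthma" counted as asthma.
as_asthma = overall_pct(sum(CORRECT.values()) + PROBABLE_PREDICTED_ASTHMA)
# as_other -> 54.2, as_asthma -> 65.1
```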
Discussion
Main results
In this study, we have presented a thoroughly developed diagnostic support tool, based on a large database with real-life primary care patients suspected to have OPD who have received a structured assessment and an expert diagnosis. We chose this patient population because OPDs like asthma and COPD are common in primary care, and underdiagnosis of COPD and misdiagnosis between COPD and asthma are an important clinical problem [15]. Our tool was able to correctly predict diagnosis in 69% of the patients (proportion correct: asthma 79%, COPD 85% and ACOS 32%) and was based on a combination of patient characteristics, symptoms and spirometry results, which are part of guideline recommended assessments. Our decision tree provides a simple, well interpretable and practical overview that generates a diagnostic suggestion for primary care patients suspected to have an OPD. Additionally, we have developed a simplified version of the decision tree to be used as a desk tool in clinical practice. This slightly decreased the accuracy of the original decision tree (proportion correct: overall 68%, asthma 72% and COPD 78%) but increased the proportion of correctly predicted ACOS patients (43%).
Limitations
Although most patients could be correctly diagnosed with our decision tree, still 31% of the patients could not be diagnosed correctly using the diagnosis originally made by the pulmonologist as gold standard. This might have been caused by the diagnostic variation among pulmonologists, which was previously described by Metting et al. [7]. Despite this diagnostic variation between the pulmonologists, additional data from 1856 patients showed that most diagnoses were confirmed at follow-up (confirmed in 92% of the asthma patients, in 86% of the COPD patients and in 73% of the ACOS patients). According to Buffels et al. [16], in the absence of a gold standard, a pulmonologist's diagnosis is most accurate. Of course, elimination of all uncertainty in a diagnostic support tool is not feasible; this would cost too much in terms of resources [3]. Response to treatment might determine whether the predicted diagnosis was satisfactory [17] and the predicted diagnosis can be considered as a working diagnosis.
Another limitation is that the decision tree does not differentiate between patients with or without disease. The diagnosis “no disease” is combined with “indication of restriction” and “diagnosis unclear” in the umbrella term “other”. However, the proportion of patients without disease was very small (n=709, 7.6%) and would therefore be difficult to predict with a decision tree.
Finally, the decision tree has a low accuracy in diagnosing ACOS. Because the diagnosis originally made by the pulmonologist served as the gold standard, this suggests that the pulmonologists had little agreement about this diagnosis at the time the data were collected. ACOS is known to be difficult to distinguish from both asthma and COPD, which was reflected in our decision tree. Differentiating between asthma, COPD and ACOS is important because the treatment and prognosis are different [15]. ACOS patients have more respiratory symptoms, more functional limitations, and are more frequently hospitalised [18]. In 2014, the Global Initiative for Chronic Obstructive Lung Disease (GOLD) and the Global Initiative for Asthma (GINA) presented new guidelines for ACOS that might enhance future diagnostic accuracy [19] and will probably lead to more consensus among physicians.
Strengths and weaknesses
Internal validity
CHAID is based on the maximum likelihood ratio and is considered to be at least as good as log regression techniques; however, it is easier to interpret and no calculation of risk scores is needed because the user can simply follow the tree [20]. The exhaustive CHAID method provides an even more thorough heuristic for finding the optimal way of grouping the categories of each predictor, and provides a better suited approximation for the Bonferroni correction [13]. We performed the “10-fold cross validation” method because this method is considered to be the best validation method [14].
We used specialists' diagnoses, which we considered to be the gold standard. Patients in the asthma/COPD service were diagnosed from spirometry and history data through the Internet. Previously, Lucas et al. [21] showed that pulmonologists can reliably diagnose patients from written spirometry and history data. However, all diagnoses in this system were based on the available variables and were not confirmed by, for example, bronchial hyperresponsiveness testing, exhaled nitric oxide fraction or extended radiology, because these are not used in primary care practice. One can therefore argue that these diagnoses are not fully confirmed and are just a step in the diagnostic process.
External validation
The decision tree correctly predicted the diagnosis in 54% of the patients in the validation dataset. However, adding “probable asthma” to the asthma group improved the accuracy substantially (from 54% to 65%). The lower overall prediction performance in the validation dataset might be caused by a difference in opinion between the pulmonologists who assessed the patients in the validation dataset and the pulmonologists in the original dataset. We make this assumption because the proportion of patients diagnosed with asthma by the pulmonologists was lower (22% in the validation dataset, compared with 44% in the original dataset). Most patients with “probable asthma” in Rotterdam were referred for a histamine provocation test (n=628, 75%). Apparently, pulmonologists from the derivation asthma/COPD service in Groningen establish the diagnosis of asthma more quickly than the pulmonologists in the validation asthma/COPD service. Additional analyses showed that probable asthma patients had on average lower reversibility compared with asthma patients (mean±sd reversibility: probable asthma patients 3.6±4.9%, asthma patients 12.5±12.1%; p<0.001).
An effectiveness study has shown that patients who were diagnosed and followed-up by the asthma/COPD service in Groningen improved in health status, asthma control and exacerbation rate [7]. We therefore assume that our decision tree is of added value for primary care respiratory patients and that the external validity of our decision tree is high because we have included a large sample of real-life primary care patients, while our decision tree is developed with common predictors that are part of guideline recommended assessments in patients suspected to have an OPD [10, 11]. Therefore, the generalisability of our decision tree is expected to be high.
Comparison with existing literature
In the field of respiratory medicine, several decision trees have been developed to predict severity [6], mortality [22], hospitalisation [4] and clinical outcomes [23]. In this article, we have presented the first real-life decision tree to predict diagnosis in patients suspected to have an OPD in primary care daily clinical practice. This is important because diagnostic errors are common [3, 24, 25] in general practice [26]. An estimated 10–15% of all diagnoses are incorrect [26]. These errors affect patient outcomes [24, 27], and can lead to inappropriate patient care and increased healthcare costs [2, 3]. Being a physician can be demanding [25] and making decisions under time pressure can negatively influence diagnostic performance [17].
In the past 20 years, a consensus has been reached about a dual-system theory that proposes two modes of clinical decision making. The first system is a nonverbal intuitive cognition system, which is fast but error prone [17], while the second system is based on the classical analytical reasoning approach [26]. Experienced physicians use both systems while novices mostly rely on the second hypothesis-testing system [17]. The decision support tool presented here matches both pathways by providing diagnostic suggestions. It points out possible diagnoses along with an estimation of probability, which can support the nonverbal intuitive cognition system. It also supports the analytic reasoning approach by giving feedback so that the initial diagnosis can be confirmed or dismissed. Our decision tree can be used by novices and experienced physicians, so that novices can function like a more experienced physician [1, 24] and experts can use the tree as a feedback tool to confirm their initial diagnosis or suggest another.
Spirometry is considered to be essential for proper diagnosis, according to the GOLD and GINA guidelines [18]. Symptom-based questionnaires in combination with spirometry enhance diagnostic accuracy of OPD even more [15]. Our decision tree combined both and produced transparent thresholds for continuous variables like age or reversibility that can be used in clinical practice.
In the past years, more emphasis has been given to personalised medicine instead of the “one size fits all” approach. We found that there are different pathways leading to the same diagnosis. We found six pathways leading to asthma and four leading to COPD (table 6). This is consistent with the new insights that asthma and COPD are heterogeneous diseases.
Implementation
We have presented a computer-assisted diagnostic support system for OPDs based on real-life primary care data that can be implemented in digital automated decision-making programmes. The transparency of our decision tree is valuable because the proposed diagnosis is accompanied by a probability that can support the physicians in diagnosing and treating their individual patients. This might enhance diagnostic accuracy. The simplified and compact paper version of the decision tree could be helpful in clinical practice as a desk tool.
Recommendation for future research
The next step is to validate our decision tree in other primary care populations and in clinical practice, to optimise the predictive value and the applicability in individual patients with suspicion of OPD.
Footnotes
Support statement: Funding was received from the Universitair Medisch Centrum Groningen (regular PhD salary) and cofunding was received from Novartis (grant for department). Funding information for this article has been deposited with FundRef.
Conflict of interest: Disclosures can be found alongside this article at openres.ersjournals.com
References
- 1. Berner ES, Graber ML. Overconfidence as a cause of diagnostic error in medicine. Am J Med 2008; 121: Suppl. 5, S2–S23.
- 2. McDonald KM, Matesic B, Contopoulos-Ioannidis DG, et al. Patient safety strategies targeted at diagnostic errors: a systematic review. Ann Intern Med 2013; 158: 381–389.
- 3. Graber M, Gordon R, Franklin N. Reducing diagnostic errors in medicine: what's the goal? Acad Med 2002; 77: 981–992.
- 4. Tsai CL, Clark S, Camargo CA Jr. Risk stratification for hospitalization in acute asthma: the CHOP classification tree. Am J Emerg Med 2010; 28: 803–808.
- 5. Le Loët X, Berthelot JM, Cantagrel A, et al. Clinical practice decision tree for the choice of the first disease modifying antirheumatic drug for very early rheumatoid arthritis: a 2004 proposal of the French Society of Rheumatology. Ann Rheum Dis 2006; 65: 45–50.
- 6. Esteban C, Arostegui I, Moraza J, et al. Development of a decision tree to assess the severity and prognosis of stable COPD. Eur Respir J 2011; 38: 1294–1300.
- 7. Metting EI, Riemersma RA, Kocks JH, et al. Feasibility and effectiveness of an asthma/COPD service for primary care: a cross-sectional baseline description and longitudinal results. NPJ Prim Care Respir Med 2015; 25: 14101.
- 8. Juniper EF, O'Byrne PM, Guyatt GH, et al. Development and validation of a questionnaire to measure asthma control. Eur Respir J 1999; 14: 902–907.
- 9. van der Molen T, Willemse BW, Schokker S, et al. Development, validity and responsiveness of the Clinical COPD Questionnaire. Health Qual Life Outcomes 2003; 1: 13.
- 10. Global Initiative for Chronic Obstructive Lung Disease (GOLD). Global Strategy for Diagnosis, Management, and Prevention of COPD. 2013. Available from: www.goldcopd.org
- 11. Global Initiative for Asthma. Pocket Guide for Asthma Management and Prevention. 2012. Available from: www.ginasthma.org
- 12. Maslove DM, Podchiyska T, Lowe HJ. Discretization of continuous features in clinical datasets. J Am Med Inform Assoc 2013; 20: 544–553.
- 13. Ritschard G. CHAID and Earlier Supervised Tree Methods. Geneva, University of Geneva, 2010. www.unige.ch/ses/metri/cahiers/2010_02.pdf
- 14. Witten IH, Frank E, Hall MA. Data Mining: Practical Machine Learning Tools and Techniques. 3rd Edn. Burlington, Elsevier Inc., 2011.
- 15. Miravitlles M, Andreu I, Romero Y, et al. Difficulties in differential diagnosis of COPD and asthma in primary care. Br J Gen Pract 2012; 62: e68–e75.
- 16. Buffels J, Degryse J, Liistro G, et al. Differential diagnosis in a primary care population with presumed airway obstruction: a real-life study. Respiration 2012; 84: 44–54.
- 17. Elstein AS. Thinking about diagnostic thinking: a 30-year perspective. Adv Health Sci Educ Theory Pract 2009; 14: Suppl. 1, 7–18.
- 18.de Marco R, Pesce G, Marcon A, et al. . The coexistence of asthma and chronic obstructive pulmonary disease (COPD): prevalence and risk factors in young, middle-aged and elderly people from the general population. PLoS One 2013; 8: e62985. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Global Initiative for Asthma, Global Initiative for Chronic Obstructive Lung Disease. Asthma, COPD, and Asthma–COPD Overlap Syndrome. 2014. Available from: www.ginasthma.org and www.goldcopd.org
- 20.Zhang J, Goode KM, Rigby A, et al. . Identifying patients at risk of death or hospitalisation due to worsening heart failure using decision tree analysis: evidence from the Trans-European Network-Home-Care Management System (TEN-HMS) study. Int J Cardiol 2013; 163: 149–156. [DOI] [PubMed] [Google Scholar]
- 21.Lucas A, Smeenk F, Smeele I, et al. . The validity of diagnostic support of an asthma/COPD service in primary care. Br J Gen Pract 2007; 57: 892–896. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Asiimwe AC, Brims FJ, Andrews NP, et al. . Routine laboratory tests can predict in-hospital mortality in acute exacerbations of COPD. Lung 2011; 189: 225–232. [DOI] [PubMed] [Google Scholar]
- 23.Eisner MD, Yegin A, Trzaskoma B. Severity of asthma score predicts clinical outcomes in patients with moderate to severe persistent asthma. Chest 2012; 141: 58–65. [DOI] [PubMed] [Google Scholar]
- 24.Schiff GD, Hasan O, Kim S, et al. . Diagnostic error in medicine: analysis of 583 physician-reported errors. Arch Intern Med 2009; 169: 1881–1887. [DOI] [PubMed] [Google Scholar]
- 25.Redelmeier DA, Ferris LE, Tu JV, et al. . Problems for clinical judgement: introducing cognitive psychology as one more basic science. CMAJ 2001; 164: 358–360. [PMC free article] [PubMed] [Google Scholar]
- 26.Croskerry P, Nimmo GR. Better clinical decision making and reducing diagnostic error. J R Coll Physicians Edinb 2011; 41: 155–162. [DOI] [PubMed] [Google Scholar]
- 27.Norman GR, Eva KW. Diagnostic error and clinical reasoning. Med Educ 2010; 44: 94–100. [DOI] [PubMed] [Google Scholar]