Abstract
Background
The prevalence of multimorbidity has been increasing in recent years, and patients with multimorbidity often have a decreased quality of life and require more health care. The aim of this study was to explore the evolution of multimorbidity, taking the sequence of diseases into consideration.
Methods
We used a Belgian database collected by extracting coded parameters and more than 100 chronic conditions from the Electronic Health Records of general practitioners to study patients older than 40 years with multiple diagnoses between 1991 and 2015 (N = 65 939). We applied Markov chains to estimate the probability of developing another condition in the next state after a diagnosis. The results of Weighted Association Rule Mining (WARM) allow us to show strong associations among multiple conditions.
Results
About 66.9% of the selected patients had multimorbidity. Conditions with high prevalence, such as hypertension and depressive disorder, were likely to occur after the diagnosis of most conditions. Patterns in several disease groups were apparent based on the results of both Markov chain and WARM, such as musculoskeletal diseases and psychological diseases. Psychological diseases were frequently followed by irritable bowel syndrome.
Conclusions
Our study used Markov chains and WARM for the first time to provide a comprehensive view of the relations among 103 chronic conditions, taking sequential chronology into consideration. Some strong associations among specific conditions were detected and the results were consistent with current knowledge in literature, meaning the approaches were valid to be used on larger data sets, such as National Health care Systems or private insurers.
Keywords: Chronic conditions, Chronology of disease, Machine learning
Multimorbidity, the co-occurrence of 2 or more chronic diseases (1,2), is frequent, especially among older people, and its occurrence is increasing. The consequences of multimorbidity include a decrease in quality of life and functional status and an increase in health care utilization (3–5). A growing number of studies look into frequent combinations of diseases, most often pairs and triads (6,7).
Ng et al. (8) conducted a systematic review of analytical methods used to identify multimorbidity patterns. The results showed that more than half of the studies (62/103) only adopted descriptive measures of multimorbidity to explain the associations between health conditions and selected indices, and 90% of the remaining 41 studies applied factor analysis or clustering. Some other studies used methods such as latent class analysis or latent Dirichlet allocation (9,10). These models are similar to clustering, which creates disease groups based on measures constructed by the algorithms. Apart from the disadvantage of being unable to track the development of multimorbidity, the results from these methods are often clusters of diseases instead of one specific disease, making it difficult to provide accurate clinical decision support in practice.
Some studies used Association Rule Mining (ARM) to explore multimorbidity (11,12). ARM is a method to uncover the combinations of items that occur together frequently (13,14). However, in traditional ARM, the sequence of the items is not taken into account. In addition, association rules among less frequent diseases can appear strong when disease prevalence varies widely, because with fewer observations each observation carries a larger share of the total. If 2 low-prevalence diseases with limited observations coincidentally co-occur several times, traditional ARM will produce strong association rules, although these cases are of no interest. Weighted Association Rule Mining (WARM) is a good solution to overcome the 2 limitations of traditional ARM (15). It introduces weights so that co-occurrences of the same items are distinguished when their sequence differs; the model adjusts the weight of each item and combination based on selected criteria and prioritizes the rules according to their importance.
Previous studies on multimorbidity mostly include a small group of common chronic diseases. Held et al. (12) studied about 17 common diseases for men aged 70 years or older, such as heart diseases, diabetes, and stroke, and Zemedikun et al. (11) collected data on 36 conditions. As a result, a large number of less frequent chronic conditions were neglected, excluding the potentially relevant information concerning a large number of patients, as 6%–8% of the general population is affected by a rare disease (16).
Models applied in previous studies group the diseases based on associations or correlations, but are not able to present the evolution of multimorbidity over time. Analyses over time regarding patient trajectories of multimorbidity can help to identify vulnerable patient groups and provide suggestions for improving care for complex patients. This study aims to take into account the sequence of occurrence. Another contribution of this article is that the associations among more than 100 chronic conditions were included in the analyses.
Method
Study Design and Participants
We used Intego (17), a longitudinal database of Electronic Health Records of patients in general practice from the Flanders region in Belgium. About 300 000 individual patients are recorded in the database, corresponding to more than 2.3% of the Flemish population, and they are representative of the general population in Flanders, Belgium (18). Intego uses an opt-out methodology and is approved by the local ethical committee of the KU Leuven and in line with Belgian privacy regulations. The register used for our analyses is based on daily clinical practice in primary care. All general practitioners (GPs) participating in the Intego registry are trained and receive regular feedback on their registration skills to minimize differences in prevalence caused by the registry. Only the data of sufficient quality are used and the quality of the data is secured through the testing of quality criteria (18).
Instead of the whole age span, health records of patients older than 40 were selected because multimorbidity is particularly prevalent in adults aged 40 years and older. For this study, we analyzed the data concerning the period between 1991 and 2015. A total of 98 632 patients fulfilled the requirements and had records of the selected chronic conditions. A patient was regarded as having multimorbidity if more than 1 of the selected chronic conditions was recorded in an overlapping period. As these health conditions were all chronic, it was assumed that they lasted until the end of follow-up. In total, 65 939 patients had multimorbidity; among this group, the average duration between the first and the last diagnosis was 8.29 years and the median was 7.33 years (P25–P75: 2.79–12.87). The sequence of the health conditions was derived from the diagnosis dates. The distribution of the total patients and multimorbidity patients in each year is given in Supplementary Table 1.
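The step from dated diagnoses to ordered sequences can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout, the patient identifiers, and the function name `diagnosis_sequences` are assumptions, and ties between diagnoses on the same date are not given any special treatment.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (patient_id, ICPC-2 code, diagnosis date).
records = [
    ("p1", "T90", date(2003, 5, 1)),
    ("p1", "K86", date(1998, 2, 10)),
    ("p1", "K90", date(2010, 9, 3)),
    ("p2", "P76", date(2005, 1, 20)),
]

def diagnosis_sequences(rows):
    """Return {patient_id: [codes ordered by first diagnosis date]}."""
    first_seen = defaultdict(dict)
    for patient, code, when in rows:
        # Keep the earliest date per condition; chronic conditions are
        # assumed to persist until the end of follow-up.
        prev = first_seen[patient].get(code)
        if prev is None or when < prev:
            first_seen[patient][code] = when
    return {
        patient: [code for code, _ in sorted(codes.items(), key=lambda kv: kv[1])]
        for patient, codes in first_seen.items()
    }

sequences = diagnosis_sequences(records)
# Multimorbidity here means at least 2 of the selected chronic conditions.
multimorbid = {p: seq for p, seq in sequences.items() if len(seq) >= 2}
```

In this toy input, only the hypothetical patient `p1` counts as multimorbid, with the sequence hypertension (K86), diabetes (T90), stroke (K90).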
Intego contains all the coded data registered in general practices, including clinical parameters, laboratory tests, disease diagnosis, and prescriptions. The diagnosis is coded based on the International Classification of Primary Care-2 (ICPC-2) and the exact date of the diagnosis is also coded (19). ICPC is a classification system aiming to reflect the content of primary care, with 17 chapters for different disease groups and about 1300 codes. ICPC codes cover the most frequent complaints or symptoms that GPs encounter in primary care and allow classification of the patient’s reason for encounter, the diagnosis managed, interventions, and the ordering of these data in an episode of care structure. Therefore, ICPC codes include not only medical diagnosis, but complaints, lifestyle factors, and risk factors, for example, alcohol abuse and limited function.
Selection of Conditions
A dedicated group of 4 GPs and an epidemiologist of the Academic Centre for General Practice of the KU Leuven categorized all diagnostic codes and complaint codes of the ICPC into either acute or chronic, adjusted from a previous categorization of ICPC codes (20). The purpose of this was to arrive at more valid prevalence estimates of chronic conditions, and the categorization was based on clinical experience and epidemiological knowledge concerning the duration of a certain condition. An expected duration of 3 years or longer was classified as chronic. All 5 members of the group prepared the categorization independently, and the final categorization was discussed until consensus in 4 group meetings. This resulted in a selection of 105 conditions that were considered chronic. However, there was no record of A21 (risk factor for malignancy) in Intego, and we only had one record of W72 (malignant neoplasm related to pregnancy), which had no co-occurrence with any other condition. As a consequence, A21 and W72 were excluded from analysis, as the subsequent analyses required patients with at least 2 conditions in the data set. The list of the 103 chronic conditions is given in Supplementary Table 2.
Statistical Analyses
The prevalence of multimorbidity in the selected population was estimated by sex and age groups (11). Then Markov chains were applied to study the sequence of development of selected conditions. A Markov chain is a stochastic model that describes a sequence of possible events whose probability only depends on the state attained in the previous event (21). The result of this model is the transition matrix, whose elements can be interpreted as the probability of having condition B after condition A. In this study, we calculated the first-order transition matrix, meaning the transfer from condition A to condition B occurs in only one step. More details on Markov chains are given in Supplementary Method 1.
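A first-order transition matrix of this kind can be estimated by counting adjacent pairs in each patient's diagnosis sequence and normalizing each row. The sketch below assumes this simple count-based estimator; the authors' exact procedure is described in Supplementary Method 1 and may differ. The function name `transition_matrix` and the toy sequences are hypothetical.

```python
from collections import Counter, defaultdict

def transition_matrix(sequences):
    """Estimate first-order transition probabilities from ordered code lists."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Each adjacent pair (A, B) is one observed one-step transition.
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    matrix = {}
    for a, followers in counts.items():
        total = sum(followers.values())
        # Normalize so that each row sums to 1.
        matrix[a] = {b: n / total for b, n in followers.items()}
    return matrix

# Hypothetical sequences of ICPC-2 codes, one list per patient.
toy = [
    ["K86", "T90", "K90"],
    ["K86", "T93"],
    ["K86", "T90"],
    ["P15", "P76"],
]
P = transition_matrix(toy)
```

Here `P["K86"]["T90"]` is the estimated probability that diabetes (T90) is the next recorded condition after hypertension (K86); conditions that never precede another condition simply have no row.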
The first-order transition matrix estimates the probability of transition between 2 conditions. To further select closely related groups of conditions, WARM was applied (15). In the WARM algorithm, the record of one condition is called an item and the co-occurrence of several conditions is called a transaction. For example, {Hypertension, Diabetes, Stroke} is a transaction with 3 items. The results of WARM are rules derived from a large number of transactions and can be written as X→Y, where X and Y are 2 different sets of items, known as itemsets. The itemsets can include either one or multiple conditions and they can differ from the combinations that appear in the transactions, as these itemsets are summarized results instead of original data. Using the example transaction together with some other transactions, it is possible to derive a rule of the form X→{Hypertension}.
Three important measures can be used to evaluate the strength of association rules, namely support, confidence, and lift. Support refers to the frequency of the itemset; higher support indicates that the rule appears often among all the transactions. Confidence indicates how often the rule holds when its antecedent X is present and thus reveals the reliability of the rule. Lift is the ratio of the confidence of the rule to the support of the consequent Y and expresses the importance of the rule. A lift larger than 1 means that itemset Y is more likely to occur if itemset X occurs, and a larger value indicates a closer association. Thresholds for each criterion were defined to select the strong rules. Based on the distribution of all the measures, the selection criteria in the article were set as support more than 0.001, confidence more than 0.1, and lift more than 1. The calculation of weighted association rules was done with the R package arules (22). More details on WARM are given in Supplementary Method 2.
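To make the three measures concrete, the sketch below computes them for a single rule on a handful of made-up transactions. It uses the standard unweighted definitions, which is an assumption: WARM additionally applies weights (Supplementary Method 2), and the study itself used the R package arules rather than this code. The helper names `support` and `rule_measures` are hypothetical, and the sketch does not guard against an antecedent that never occurs.

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_measures(x, y, transactions):
    """Return (support, confidence, lift) for the rule X -> Y."""
    s_xy = support(set(x) | set(y), transactions)
    s_x = support(x, transactions)
    s_y = support(y, transactions)
    confidence = s_xy / s_x   # how often Y is present when X is present
    lift = confidence / s_y   # confidence relative to how common Y is overall
    return s_xy, confidence, lift

# Made-up transactions: each set is one patient's recorded conditions.
transactions = [
    frozenset({"Hypertension", "Diabetes", "Stroke"}),
    frozenset({"Hypertension", "Diabetes"}),
    frozenset({"Hypertension", "Gout"}),
    frozenset({"Depressive disorder", "Migraine"}),
    frozenset({"Diabetes", "Retinopathy"}),
]

s, c, l = rule_measures({"Diabetes"}, {"Hypertension"}, transactions)
# Thresholds as stated in the text: support > 0.001, confidence > 0.1, lift > 1.
passes = s > 0.001 and c > 0.1 and l > 1
```

With these toy data, the rule {Diabetes}→{Hypertension} has support 2/5, confidence 2/3, and lift 10/9, so it would pass all three thresholds.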
Results
Multimorbidity and Prevalence
About two thirds of the selected patients had multimorbidity (Table 1). The prevalence was much higher than that of the general population, mainly because the majority of the selected patients were older than 60 years. Based on the age in 2015, 63.7% of the patients were in the group of 60 years or older. Females had a higher prevalence than males, and the rates were evidently higher in the older age groups. Among patients older than age 74, 77.8% had multimorbidity.
Table 1.
Group | Number of Patients in Intego (TN), N (%) | Number of Multimorbidity Patients (MN), N (%) | Prevalence Rate of Multimorbidity, MN/TN (%) |
---|---|---|---|
Total | 98 632 | 65 939 | 66.9 |
Gender group | |||
Male | 45 796 (46.4) | 29 712 (45.1) | 64.9 |
Female | 52 836 (53.6) | 36 227 (54.9) | 68.6 |
Age group in 2015 | |||
40–49 | 13 400 (13.6) | 6719 (10.2) | 50.1 |
50–59 | 22 412 (22.7) | 13 017 (19.7) | 58.1 |
60–74 | 29 764 (30.2) | 20 482 (31.1) | 68.8 |
≥75 | 33 056 (33.5) | 25 721 (39.0) | 77.8 |
The prevalence of each chronic condition is given in Supplementary Table 2. The top 5 conditions with the highest prevalence were hypertension, lipid disorder, depressive disorder, type 2 diabetes, and osteoarthrosis. In contrast to these common conditions, some conditions had only limited records, such as poliomyelitis and neoplasm urinary tract. In total, 12 chronic conditions had fewer than 100 observations while 17 chronic conditions had a prevalence higher than 5%. The prevalence of a further 35 chronic conditions ranged from 1% to 5%.
Probabilities of Follow-Up Conditions
Figure 1 is a heatmap of the transition probability matrix calculated by using a Markov chain. The conditions distributed on the y-axis are the conditions at the first status, and those on the x-axis stand for the second status. Each small box in the heatmap represents the probability of developing the condition on the x-axis after the diagnosis of the condition on the y-axis and darker color means higher probability. The scale of the color bar on the right side of the heatmap was defined between 0 and 0.1 because most of the probabilities were within this range. The probabilities below 0.1 could be distinguished by the color shade while the probabilities higher than 0.1 would have the same darkest color as the probability of 0.1.
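The saturation of the color scale at 0.1 amounts to clipping each probability before mapping it to a shade. The following sketch illustrates that mapping only; the function name `clip_for_colorbar` is hypothetical and the actual plotting code used for Figure 1 is not given in the article.

```python
def clip_for_colorbar(prob, vmax=0.1):
    """Map a probability to a shade in [0, 1], saturating at `vmax`."""
    return min(max(prob, 0.0), vmax) / vmax

# Probabilities of 0.1 and 0.45 end up with the same (darkest) shade.
shades = [clip_for_colorbar(p) for p in (0.0, 0.02, 0.1, 0.45)]
```

The consequence is that differences among the rare probabilities above 0.1 are not visible in the figure, while the more numerous small probabilities remain distinguishable.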
Some patterns can be observed from the heatmap and can be explained based on the properties of these chronic conditions. Distinct columns with deep color often appear when the conditions on the x-axis have high prevalence, for example, hypertension (K86), depressive disorder (P76), diabetes (T90), and lipid disorder (T93). There are also some rows that are totally white in most of the boxes, making a visual horizontal division. These rows start from conditions with low prevalence on the y-axis. They have fewer records, leading to a higher probability in each observed co-occurrence and a probability of 0 if no co-occurrence was observed. Poliomyelitis (N70), neoplasm cardiovascular (K72), and limited function/disability coming after psychological disorder (P28) are representatives of these conditions. Apart from the columns and rows, there are some triangle areas with deep color, including multiple conditions from the same disease group. Taking cardiovascular diseases as an example, K74–K86 forms the largest triangle area, meaning that once the patients develop one condition out of the group, they are at higher risk for other cardiovascular diseases. Some other disease groups, such as musculoskeletal diseases, endocrine, metabolic, and nutritional diseases, and blood diseases, have similar patterns. Almost all these triangles are above the diagonal, because in general the conditions from the same group are more severe if they have a larger number in ICPC codes; thus, it is more probable that they occur in the late stage as a subsequent condition.
The heatmap can also be interpreted in terms of a specific condition. For instance, the cells with dark color shade in the row of chronic alcohol abuse (P15, prevalence = 2.915%), a complaint with a medium level of prevalence in the data set, showed that it had a higher probability of being followed by K86, P76, T90, and T93. It was also likely to be followed by chronic obstructive pulmonary disease (R95), asthma (R96), and dermatitis/atopic eczema (S87). The dark blue box representing the probability of developing depressive disorder (P76) after chronic alcohol abuse requires special attention, implying a strong association between them. On the other hand, when observing the conditions prior to chronic alcohol abuse, HIV infection/AIDS (B90), limited function/disability (P28), malignant neoplasm male genital other (Y78), and epilepsy (N88) became conspicuous. In this way, clinically interested investigators can inspect the heatmap for associations relevant to their interest.
As it may be difficult to go through the whole heatmap to find all interesting associations, we conducted an automatic search to detect the important associations. The method is described in detail in Supplementary Method 3, and the results are given in Supplementary Table 3.
Weighted Association Rules
The results of the Markov chain analysis included 103 chronic conditions to show a general picture of condition relations. WARM was then applied to identify strong associations among all chronic conditions, focusing on robust and strong relations instead of incidental co-occurrence. Table 2 presents the results of WARM, which are the strongest rules selected based on the thresholds of support, confidence, and lift. These rules are presented in a more intuitive way in Figure 2. The results present the sequence of condition diagnoses that occurs more often than other possibilities based on the statistics; however, this does not imply causality. For example, the rule with antecedent “Retinopathy” and consequent “Diabetes” reflects that blurred vision is one of the early signs of diabetes, so patients may be diagnosed with diabetes after an eye examination, although retinopathy is in fact a complication of diabetes. In some cases, we observed strong relations in both directions, for example, between depressive disorder and irritable bowel syndrome, which meant that these 2 conditions were probably strongly correlated and often occurred together, making the sequence irrelevant. Similar to the findings from Figure 1, hypertension and diabetes followed many chronic conditions, and many strong relations were within the same disease group, such as musculoskeletal diseases, psychological diseases, respiratory diseases, and skin diseases. Some rules nevertheless deserve special attention. Severe pain, such as migraine, could be followed by depressive disorder, while multiple psychological diseases might be followed by irritable bowel syndrome. As was already known from some studies, tobacco abuse could precede diabetes, which was also reflected in the WARM results. In addition, associations between dementia and hypertension, gout and osteoarthrosis, and presbyacusis and hypertension were also detected.
Table 2.
Antecedents | Consequents | Support | Confidence | Lift |
---|---|---|---|---|
Suicide/suicide attempt | Depressive disorder | 0.00139 | 0.505 | 3.384 |
Retinopathy | Diabetes non-insulin dependent | 0.00236 | 0.521 | 2.923 |
Retinopathy and hypertension | Diabetes non-insulin dependent | 0.00129 | 0.476 | 2.67 |
Anxiety disorder/anxiety state | Depressive disorder | 0.00373 | 0.297 | 1.995 |
Acquired deformity of spine | Back syndrome w/o radiating pain | 0.00106 | 0.12 | 1.986 |
Somatization disorder | Depressive disorder | 0.00268 | 0.264 | 1.767 |
Somatization disorder | Irritable bowel syndrome | 0.00136 | 0.134 | 1.738 |
Rheumatoid/seropositive arthritis | Osteoarthrosis other | 0.00207 | 0.193 | 1.697 |
Dermatitis contact/allergic | Dermatitis/atopic eczema | 0.00521 | 0.136 | 1.647 |
Diabetes insulin-dependent | Diabetes non-insulin dependent | 0.00155 | 0.292 | 1.64 |
Chronic alcohol abuse | Depressive disorder | 0.00405 | 0.231 | 1.551 |
Chronic bronchitis | Asthma | 0.00132 | 0.153 | 1.513 |
Migraine | Depressive disorder | 0.00812 | 0.221 | 1.484 |
Depressive disorder | Irritable bowel syndrome | 0.01565 | 0.105 | 1.36 |
Irritable bowel syndrome | Depressive disorder | 0.01565 | 0.203 | 1.36 |
Anxiety disorder/anxiety state | Irritable bowel syndrome | 0.00131 | 0.105 | 1.356 |
Hypertension complicated | Hypertension uncomplicated | 0.00207 | 0.782 | 1.266 |
Malignant neoplasm of kidney | Hypertension uncomplicated | 0.00105 | 0.705 | 1.141 |
Tobacco abuse | Diabetes non-insulin dependent | 0.00173 | 0.202 | 1.133 |
Gout | Osteoarthrosis other | 0.00537 | 0.127 | 1.114 |
Heart failure | Diabetes non-insulin dependent | 0.00171 | 0.195 | 1.092 |
Presbyacusis | Hypertension uncomplicated | 0.0023 | 0.668 | 1.082 |
Chronic obstructive pulmonary disease | Asthma | 0.00365 | 0.109 | 1.08 |
Acute myocardial infarction | Lipid disorder | 0.00587 | 0.304 | 1.046 |
Gout | Hypertension uncomplicated | 0.02732 | 0.645 | 1.044 |
Glomerulonephritis/nephrosis | Hypertension uncomplicated | 0.00121 | 0.641 | 1.038 |
Malignant neoplasm colon/rectum | Diabetes non-insulin dependent | 0.00281 | 0.182 | 1.021 |
Osteoarthrosis of knee | Osteoarthrosis other | 0.00634 | 0.115 | 1.015 |
Stroke/cerebrovascular accident | Hypertension uncomplicated | 0.01438 | 0.627 | 1.015 |
Malignant neoplasm prostate | Hypertension uncomplicated | 0.00636 | 0.626 | 1.013 |
Dementia | Hypertension uncomplicated | 0.00742 | 0.625 | 1.013 |
Acquired deformity of limb | Osteoarthrosis other | 0.00287 | 0.114 | 1.005 |
Note: The selection criteria are support >0.001, confidence >0.1, and lift >1.
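One way to read Table 2 is to check that its columns are mutually consistent. Under the standard definition lift = confidence / support(consequent), every rule with the same consequent should imply the same consequent support. The sketch below applies this check to the rules ending in depressive disorder, using values copied from the table. Treating the weighted measures as obeying the standard identity is an assumption, so small deviations would not necessarily indicate an error.

```python
# (antecedent, confidence, lift) for Table 2 rules whose consequent is
# "Depressive disorder"; values are copied from the table.
rules = [
    ("Suicide/suicide attempt", 0.505, 3.384),
    ("Anxiety disorder/anxiety state", 0.297, 1.995),
    ("Somatization disorder", 0.264, 1.767),
    ("Chronic alcohol abuse", 0.231, 1.551),
    ("Migraine", 0.221, 1.484),
    ("Irritable bowel syndrome", 0.203, 1.36),
]

# Implied support of the consequent for each rule: confidence / lift.
implied = {name: conf / lift for name, conf, lift in rules}
spread = max(implied.values()) - min(implied.values())
```

All six rules imply a consequent support of roughly 0.149, with a spread below 0.001 that is compatible with rounding in the reported values.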
Discussion
In this study, we found that about 67% of the selected patients had multimorbidity and that the rate was as high as 77.8% among patients older than 74 years. Conditions with relatively high prevalence, such as hypertension, depressive disorder, diabetes, and lipid disorder, were very likely to occur after the diagnosis of most conditions as they are the common complications of many conditions. Multiple conditions from the same disease group often occurred together; for example, once a patient developed one condition from the cardiovascular disease group, he or she would be at higher risk of other cardiovascular diseases. Several morbidity clusters were apparent based on the results of the Markov chain and WARM, such as musculoskeletal diseases, psychological diseases, respiratory diseases, and skin diseases. Hypertension and diabetes followed many chronic conditions, and multiple psychological diseases might be followed by irritable bowel syndrome.
Strengths and Limitations
In this article, the Markov chain analysis and WARM were used to study multimorbidity among people older than 40 years, making it one of the first studies to investigate multimorbidity considering the sequence of evolution. In addition, instead of using surveys and self-reported health complaints as previous studies did, this article used the registered chronic health conditions from Belgian data collected from patients’ daily consultations with their GP. The real medical records made our study reliable and generalizable to the Belgian population. Moreover, we presented a comprehensive view of the associations among 103 chronic conditions, covering chronic conditions from all disease groups, which might shed light on some potential relations that were not widely discussed before. It is worth noting that both Markov chains and WARM have rarely been applied in epidemiologic studies, although they are very useful techniques to explore sequential rules. To the best of our knowledge, 2 articles have used ARM to summarize multimorbidity patterns (11,12), but no studies have applied WARM so far. The current results are in agreement with previous clinical findings, meaning that these approaches were valid and could be used on larger data sets in future studies, such as National Health care Systems or large private insurers. Because of the dense information stored in the results of the Markov chain and the WARM, it is worth examining the results more carefully to discover potential new clinically relevant relations that have not been widely discussed before.
One limitation of the Markov chain analysis is that our results were derived under the first-order Markov assumption, meaning that the future state depends only on the current state and not on the sequence of earlier events. For example, for a patient with the sequence of records {Hypertension, Diabetes, Stroke}, State 3 {Stroke} depends only on State 2 {Diabetes}, while {Hypertension} cannot be taken into account. In this case, the transition matrix can only present the probability between 2 conditions. The limitation can be overcome by a higher-order Markov chain; for instance, a second-order Markov chain involves 3 states. However, our study included 103 chronic conditions, and it was too complicated to interpret the results for all combinations in multiple steps. Other studies have already discussed relations among disease groups or a small number of chronic conditions. This article aims to provide a comprehensive study of the relations among more than 100 chronic conditions to provide clinical decision support for GPs in primary care. It is much easier to track the follow-up of one specific condition by using Markov chains. Other methods can provide results for disease groups (eg, clustering and latent Dirichlet allocation) or selected results with the best criteria (eg, WARM), but fail to provide an accurate estimation between any 2 individual conditions. Considering that 54.5% of the patients with multimorbidity had only 2 conditions, we chose to apply the first-order Markov chain analysis in spite of the limitation. We also used WARM in the second step, which could include multiple conditions in the itemsets. Therefore, these 2 methods were complementary.
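The interpretability argument can be made concrete by counting how many transition probabilities each model order would need with 103 conditions. The sketch below is a simple upper-bound count that includes structurally empty cells (for example, a condition following itself), and the helper `ngrams` is a hypothetical name used only for illustration.

```python
n_conditions = 103

# First-order: one row per current condition, one column per next condition.
first_order_cells = n_conditions * n_conditions

# Second-order: one row per ordered pair of previous conditions,
# one column per next condition.
second_order_cells = n_conditions ** 2 * n_conditions

def ngrams(seq, n):
    """Consecutive n-tuples of a sequence."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

history = ["Hypertension", "Diabetes", "Stroke"]
pairs = ngrams(history, 2)    # transitions a first-order chain counts
triples = ngrams(history, 3)  # transitions a second-order chain counts
```

The first-order matrix has 10 609 cells, whereas a second-order model would have 1 092 727, while a patient with only 2 conditions contributes a pair but no triple at all.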
Another limitation of the study is the lack of a time concept in the analysis. Although the sequence of occurrence is considered, it is not possible to incorporate the exact interval time between diagnoses. Finally, the results of the Markov chain and WARM could suggest strong associations, but causal relationships remain unclear and further studies are needed to confirm causality. Thus, more studies using new methods are still expected in future work to overcome these limitations.
Validity
The results of a Markov chain analysis may be questioned because the majority of the probabilities are very small, lower than 0.1. Although the values are small, they are significantly larger than the average level if we take the total number into consideration, meaning that the methodology provides significant contrast in the results to distinguish likely or unlikely sequence of diseases. It is the same in the case of WARM where the support of the selected rules might be interpreted at first sight as very low but in fact selects the strongest rules. The detailed calculation is given in Supplementary Method 4.
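One plausible way to see why small probabilities can still be informative is to compare them with a uniform baseline. The sketch below assumes, purely for illustration, that "the average level" corresponds to the next condition being equally likely to be any of the other 102 conditions; the authors' actual calculation is in Supplementary Method 4 and may be defined differently.

```python
n_conditions = 103

# Assumed baseline: the next condition is equally likely to be any of the
# other conditions (an illustrative assumption, not the authors' definition).
uniform_baseline = 1 / (n_conditions - 1)

# How many times larger than the baseline a probability of 0.1 would be.
ratio_at_cap = 0.1 / uniform_baseline
```

Under this assumption the baseline is just under 0.01, so a transition probability of 0.1 is about 10 times what a uniform spread over the other conditions would give.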
The clinical relevance was also checked. For instance, the triangle of cardiovascular diseases detected in the heatmap of the Markov chain analysis reflects the fact that patients with multiple cardiovascular conditions are common in real-world clinical practice (23). Chronic alcohol abuse was used as an example to demonstrate the Markov chain results and it could be the precursor of hypertension (24), depression (25), and diabetes (26), which was in correspondence with previous studies. Similarly, WARM selected the strongest association rules with robustness and validity among all possible combinations; hence, it is not surprising to find that almost all the association rules have been discussed in previous research, for example, the relationship between psychological symptoms and irritable bowel syndrome (27,28), tobacco abuse and diabetes (29), and hypertension and many diseases such as dementia (30,31), and the sequence of condition development was in agreement with these studies. The correspondence supports the validity of the results and shows the possibility of using the model to detect more potential relations. Considering the large number of combinations, it would be acceptable to further lower the thresholds for support and confidence to examine a broader range of association rules, which might reveal some potential rules that were not discussed before. The automatic search results from the Markov chain work in the same way and provide a comprehensive view of the relations among 103 chronic conditions, making them helpful for decision support for practice-oriented health care professionals.
Finally, subgroup analyses by sex and age were conducted separately to check the robustness and consistency (Supplementary Figures 1 and 2). The results of the Markov chain were robust in all subgroups. The horizontal and vertical belt patterns and triangle patterns were similar in all figures, and the main follow-up conditions of one specific condition were almost always the same. There were some differences in WARM results among the subgroups, mainly because WARM selected the strongest and most robust rules and the importance of the rules may vary in different age and gender groups (32). Generally, more rules were detected in the group with a higher prevalence of multimorbidity, but most of the rules selected in the subgroup analysis were the same as the rules in Figure 2, for example, depression and irritable bowel syndrome, hypertension, and lipid disorder. Some group-specific findings are in correspondence with clinical knowledge; for instance, gout did not appear in the female group and the younger age group, as the risk of gout is higher in men and at older ages (33–35). Therefore, WARM is a consistent tool for subgroup analysis, and it is useful to discover group-specific findings, which can be used by subject matter experts to further examine clinical relevance.
Funding
This work was supported by KU Leuven: Research Fund (projects C16/15/059, C32/16/013, C24/18/022), Industrial Research Fund (Fellowship 13-0260), and several Leuven Research and Development bilateral industrial projects, Flemish Government Agencies: Research Foundation - Flanders (FWO) (EOS Project no 30468160 [SeLMA], SBO project I013218N, PhD grants [SB/1SA1319N, SB/1S93918, SB/151622]), EWI (PhD and postdoc grants Flanders AI Impulse Program), Flanders Agency Innovation & Entrepreneurship (VLAIO) (City of Things [COT.2018.018], PhD grants: Baekeland [HBC.20192204] and Innovation mandate [HBC.2019.2209], Industrial Projects [HBC.2018.0405]), European Commission (EU H2020-SC1-2016-2017 grant agreement No. 727721: MIDAS), and the European Research Council (ERC) (advanced grant No. 885682 [B.D.M.]).
Conflict of Interest
None declared.
Supplementary Material
Acknowledgments
X.S. and M.V.D.A. developed the study concept and design. G.V.P. contributed to data collection, and X.S. performed the data analysis, interpreted the results, and drafted the manuscript. G.V.P., M.V.D.A., and R.V. contributed to results interpretation. All authors provided critical revisions and approved the final version of the manuscript for submission.
References
- 1. Van den Akker M, Buntinx F, Knottnerus J. Comorbidity or multimorbidity: what’s in a name? A review of literature. Eur J Gen Pract. 1996;2:65–70. doi: 10.3109/13814789609162146
- 2. Van den Akker M, Vaes B, Goderis G, Van Pottelbergh G, De Burghgraeve T, Henrard S. Trends in multimorbidity and polypharmacy in the Flemish–Belgian population between 2000 and 2015. PLoS One. 2019;14():e0212046. doi: 10.1371/journal.pone.0212046 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. Xu X, Mishra G, Jones M. Evidence on multimorbidity from definition to intervention: an overview of systematic reviews. Ageing Res Rev. 2017;37:53–68. doi: 10.1016/j.arr.2017.05.003
- 4. Makovski T, Schmitz S, Zeegers M, Stranges S, Van den Akker M. Multimorbidity and quality of life: systematic literature review and meta-analysis. Ageing Res Rev. 2019;53:100903. doi: 10.1016/j.arr.2019.04.005
- 5. Lund J, Pedersen H, Vestergaard M, Mercer S, Glumer C, Prior A. The impact of socioeconomic status and multimorbidity on mortality: a population-based cohort study. Clin Epidemiol. 2017;9:279–289. doi: 10.2147/CLEP.S129415
- 6. Schäfer I, Kaduszkiewicz H, Wagner HO, Schön G, Scherer M, Van den Bussche H. Reducing complexity: a visualisation of multimorbidity by combining disease clusters and triads. BMC Public Health. 2014;14:1285. doi: 10.1186/1471-2458-14-1285
- 7. Jensen AB, Moseley PL, Oprea TI, et al. Temporal disease trajectories condensed from population-wide registry data covering 6.2 million patients. Nat Commun. 2014;5:4022. doi: 10.1038/ncomms5022
- 8. Ng SK, Tawiah R, Sawyer M, Scuffham P. Patterns of multimorbid health conditions: a systematic review of analytical methods and comparison analysis. Int J Epidemiol. 2018;47:1687–1704. doi: 10.1093/ije/dyy134
- 9. Olaya B, Moneta MV, Caballero FF, et al. Latent class analysis of multimorbidity patterns and associated outcomes in Spanish older adults: a prospective cohort study. BMC Geriatr. 2017;17:186. doi: 10.1186/s12877-017-0586-1
- 10. Park S, Choi D, Kim M, Cha W, Kim C, Moon IC. Identifying prescription patterns with a topic model of diseases and medications. J Biomed Inform. 2017;75:35–47. doi: 10.1016/j.jbi.2017.09.003
- 11. Zemedikun DT, Gray LJ, Khunti K, Davies MJ, Dhalwani NN. Patterns of multimorbidity in middle-aged and older adults: an analysis of the UK biobank data. Mayo Clin Proc. 2018;93:857–866. doi: 10.1016/j.mayocp.2018.02.012
- 12. Held FP, Blyth F, Gnjidic D, et al. Association rules analysis of comorbidity and multimorbidity: the concord health and aging in men project. J Gerontol A Biol Sci Med Sci. 2016;71:625–631. doi: 10.1093/gerona/glv181
- 13. Piatetsky-Shapiro G. Discovery, analysis, and presentation of strong rules. Presented at: Conference Proceedings of Knowledge Discovery in Databases 1991. December 1991:229–248; Cambridge, MA.
- 14. Agrawal R, Imieliński T, Swami A. Mining association rules between sets of items in large databases. Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data - SIGMOD ’93. June 1993:207–216. New York, NY. doi: 10.1145/170035.170072
- 15. Ramkumar GD, Ranka S, Tsur S. Weighted association rules: model and algorithm. Proceedings of ACM SIGKDD. 1998.
- 16. Mendlovic J, Barash H, Yardeni H, Banet-Levi Y, Yonath H, Raas-Rothschild A. Rare diseases DTC: diagnosis, treatment and care. Harefuah. 2016;155(4):241–253.
- 17. Department of General Practice, KU Leuven. Intego-project. 2011. http://www.intego.be. Accessed March 30, 2020.
- 18. Truyers C, Goderis G, Dewitte H, Van den Akker M, Buntinx F. The Intego database: background, methods and basic results of a Flemish general practice-based continuous morbidity registration project. BMC Med Inform Decis Mak. 2014;14:48. doi: 10.1186/1472-6947-14-48
- 19. Lamberts H, Wood M. The birth of the International Classification of Primary Care (ICPC). Serendipity at the border of Lac Léman. Fam Pract. 2002;19:433–435. doi: 10.1093/fampra/19.5.433
- 20. Nielen M, Davids R, Gommer M, et al. Berekening morbiditeitscijfers op basis van Nivel Zorgregistraties Eerste Lijn. www.nivel.nl/nl/nivel-zorgregistraties-eerste-lijn/incidentie-en-prevalentiecijfers. Updated May 29, 2019. Accessed January 10, 2020.
- 21. Gagniuc PA. Markov Chains: From Theory to Implementation and Experimentation. Hoboken, NJ: John Wiley & Sons; 2017:9–24. doi: 10.1002/9781119387596
- 22. Hornik K, Grün B, Hahsler M. arules - a computational environment for mining association rules and frequent item sets. J Stat Softw. 2005;14(15):1–25. doi: 10.18637/jss.v014.i15
- 23. Kennedy B. Treating patients with multiple cardiovascular conditions: an analysis of outpatient data in the United States, 2005. J Natl Med Assoc. 2008;100(11):1260–1270. doi: 10.1016/S0027-9684(15)31504-2
- 24. Sesso HD, Cook NR, Buring JE, Manson JE, Gaziano JM. Alcohol consumption and the risk of hypertension in women and men. Hypertension. 2008;51:1080–1087. doi: 10.1161/HYPERTENSIONAHA.107.104968
- 25. Boden JM, Fergusson DM. Alcohol and depression. Addiction. 2011;106:906–914. doi: 10.1111/j.1360-0443.2010.03351.x
- 26. Howard AA, Arnsten JH, Gourevitch MN. Effect of alcohol consumption on diabetes mellitus: a systematic review. Ann Intern Med. 2004;140:211–219. doi: 10.7326/0003-4819-140-6-200403160-00011
- 27. Creed F, Guthrie E. Psychological factors in the irritable bowel syndrome. Gut. 1987;28:1307–1318. doi: 10.1136/gut.28.10.1307
- 28. Tripathi R, Mehrotra S. Irritable bowel syndrome and its psychological management. Ind Psychiatry J. 2015;24:91–93. doi: 10.4103/0972-6748.160947
- 29. Haire-Joshu D, Glasgow RE, Tibbs TL. Smoking and diabetes. Diabetes Care. 1999;22:1887–1898. doi: 10.2337/diacare.22.11.1887
- 30. Nagai M, Hoshide S, Kario K. Hypertension and dementia. Am J Hypertens. 2010;23:116–124. doi: 10.1038/ajh.2009.212
- 31. Faraco G, Iadecola C. Hypertension: a harbinger of stroke and dementia. Hypertension. 2013;62:810–817. doi: 10.1161/HYPERTENSIONAHA.113.01063
- 32. Atzmueller M. Subgroup discovery. WIREs Data Mining Knowl Discov. 2015;5:35–49. doi: 10.1002/widm.1144
- 33. Dehlin M, Jacobsson L, Roddy E. Global epidemiology of gout: prevalence, incidence, treatment patterns and risk factors. Nat Rev Rheumatol. 2020;16:380–390. doi: 10.1038/s41584-020-0441-1
- 34. Robinson PC, Taylor WJ, Dalbeth N. An observational study of gout prevalence and quality of care in a national Australian general practice population. J Rheumatol. 2015;42:1702–1707. doi: 10.3899/jrheum.150310
- 35. Chen-Xu M, Yokose C, Rai S, Pillinger M, Choi H. Contemporary prevalence of gout and hyperuricemia in the United States and decadal trends: the National Health and Nutrition Examination Survey, 2007–2016. Arthritis Rheumatol. 2019;71:991–999. doi: 10.1002/art.40807