Abstract
Earlier studies on hospitalization risk are largely based on regression models. To our knowledge, network modeling of multiple comorbidities is novel and inherently enables multidimensional scoring and unbiased feature reduction. Network modeling was conducted using an independent validation design starting from 38,695 patients, 1,446,581 visits, and 430 distinct clinical facilities/hospitals. Odds ratios (OR) were calculated for every pair of comorbidities using patient counts, and their trends were compared with hospitalization rates and emergency department (ED) visits. Network topology analyses were performed, defining significant comorbidity associations as having OR≥5 and False Discovery Rate≤10⁻⁷. Four COPD-associated comorbidity sub-networks emerged, incorporating multiple clinical systems: (i) metabolic syndrome, (ii) substance abuse and mental disorder, (iii) pregnancy-associated conditions, and (iv) fall-related injury. The latter two have not been reported previously. Features prioritized from the network are predictive of hospitalizations in an independent set (p<0.004). Therefore, we suggest that network topology is a scalable and generalizable method predictive of hospitalization.
Introduction
Chronic Obstructive Pulmonary Disease (COPD) is the third leading cause of death in the United States and an important cause of disability and hospitalizations, particularly in aged populations1,2. Currently, about 12 million adults (aged 18 and over) have been diagnosed with COPD, but there are likely to be more who have yet to be diagnosed3. Deaths attributable to COPD are higher in women than in men, with more than 6,000 deaths for women alone in 2010 (ref. 4). Hospitalizations for COPD account for a large economic burden to the patients and to society5,6. In that same year, COPD resulted in $49.9 billion in direct and indirect costs, and total costs incurred by COPD patients are approximately $6,000 higher than those of patients without COPD6. Due to its large burden on individuals and health care systems, it is crucial to find evidence-based strategies to identify those who are at the highest risk of being hospitalized in order to target preventive interventions7,8.
Comorbidities such as hypertension, ischemic heart diseases, diabetes, and pneumonia are known contributing causes of COPD hospitalizations. Therefore, reducing COPD-associated comorbid conditions may decrease the hospitalization rate, which ultimately reduces the economic burden of COPD patients7,9. Multiple studies have developed predictive models of COPD hospitalization and re-hospitalization that include comorbidities; however, those models were built exclusively on regression (Table 1). These models aim at predicting COPD hospitalization from comorbidity data. However, they do not take into account the impact of potential interactions between comorbid conditions unless these conditions and their interactions are specified a priori in the functional model. Also, COPD hospitalization risk predictions are based on labor-intensive scoring systems such as the Charlson Comorbidity Index (CCI)10 or Elixhauser11, which are the best-known comorbidity indices. However, these tools only provide a pre-selected list of diseases12; therefore, many diseases are not taken into account, and the tools do not provide the level of granularity necessary to understand the clinical dynamics of COPD hospitalization risk. In other words, they exploit a small and biased number of features for predictions. As a result, comorbidity findings may be biased and limited in their ability to identify novel patterns of comorbidities.
Table 1.
Summary of three studies on the prediction of hospitalization among COPD patients
| Author (yr) | Austin et al, 2012 (ref. 7) | Fan et al, 2002 (ref. 8) | Coventry et al, 2011 (ref. 15) |
|---|---|---|---|
| Total patients | 855,661 | 3,282 | 79 |
| Validation | | | |
| Independent study | Yes | Yes | |
| Cross-validation | Yes | | |
| Measurement of features | | | |
| Index name 1 | Charlson | Charlson | Charlson |
| Count of features 1 | 19 | 19 | 19 |
| Index name 2 | Elixhauser | | |
| Count of features 2 | 30 | | |
| Index name 3 | Aggregated Diagnosis Groups (ADGs) | | |
| Count of features 3 | 32 | | |
| Outcome measures | | | |
| Hospitalization for COPD | Yes | Yes | |
| Rehospitalization for COPD | Yes | | |
| Any hospitalization | Yes | Yes | |
| Analysis method | Logistic regression | Logistic regression | Logistic regression |
When investigating conditions or features that can predict high risk of admission in COPD patients, it is necessary to understand the association between COPD and its comorbidities. Studying the structure defined by the entire set of comorbidities requires a methodology that can capture enriched associations and unveil hidden structure. Recently, network-based approaches have been applied to human disease and have revealed unknown connections between diseases, which shed new light on clinical research (Background)13. For example, Hidalgo et al. introduced a Phenotypic Disease Network (PDN), which uses data obtained from the medical claims of more than 30 million patients to demonstrate that highly connected diseases are more lethal than barely connected ones14. However, these patterns have not been translated into classifiers of hospitalization.
In this study, we hypothesized that network topology modeling of COPD-associated comorbidities with higher risk of hospitalization and emergency department visits can predict future hospitalizations.
Background on Network Analysis
These approaches aim to discover, map, and quantify complex relationships among variables that may not be revealed by correlational or clustering methods. Combining exploratory analytics with statistical decision points and graphical displays, these methods may discover complex networks of variables that were heretofore unknown, yield new insights into the dynamics among these variables, and link these networks to clinical outcomes.
Methodology
Sample
We collected data from Illinois Health Connect Medicaid, which included patients from 430 distinct clinical facilities and hospitals from January 1, 2010 through December 31, 2010 and covered 1,446,581 visits. These clinical institutions were identified using the distinct provider ID from outpatient and inpatient billing. If a site had multiple locations and a single billing address, it was counted as one institution. Claims include all claims adjudicated for payment through April 29, 2011. Location (emergency department, inpatient, and outpatient), admission date, and primary and secondary disease information with ICD-9-CM codes were extracted for the network analysis. Although the system provided ICD-9-CM codes at the 5-digit level, we used ICD-9-CM codes at the 3-digit level for the analysis. COPD was defined as ICD-9-CM codes 490–492 and 496 based on the literature. COPD with asthma (ICD-9-CM code 493) was not included in this study because it is classified under the asthma category, which is distinct from COPD. In total, 38,695 patients and 1,049 ICD-9-CM codes at the 3-digit level for COPD and comorbid diseases were included in the dataset.
Initially, 3,862 patients had a COPD diagnosis and potentially other comorbid conditions (886 total comorbidities). To prevent any future re-identification, we removed from further analysis ICD-9-CM codes associated with fewer than 20 patients. This filtering resulted in a final dataset of 3,831 patients and 754 comorbidities. Among those, 880 COPD patients were hospitalized at least once, and 2,711 patients visited the ED at least once. Further, we created a subgroup of patients that we labeled “higher risk of acute exacerbation (e.g. bronchospasms) (AE-risk) COPD patients”, given the following inclusion criteria: 1) patients with COPD as either primary or secondary diagnosis, and 2) patients who visited the emergency department (ED) 5 or more times and were hospitalized at least once for COPD between January 1, 2010 and December 31, 2010. We identified 238 AE-risk COPD patients who met those two criteria (ED ≥ 5 AND hospitalization ≥ 1).
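The two inclusion criteria can be read as a filter over per-patient visit counts. The following is a minimal Python sketch under stated assumptions, not the study's code (the counts in this study were computed in MySQL): the record layout and the names `Visit`, `count_visits`, and `select_ae_risk` are hypothetical, and the input is assumed to be already restricted to claims of COPD patients.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Visit(NamedTuple):
    """One claim row (hypothetical layout, not the study's schema)."""
    patient_id: str
    location: str  # "ED", "inpatient", or "outpatient"


def count_visits(visits: Iterable[Visit]) -> tuple[Counter, Counter]:
    """Return per-patient counts of ED visits and hospitalizations."""
    ed, hosp = Counter(), Counter()
    for v in visits:
        if v.location == "ED":
            ed[v.patient_id] += 1
        elif v.location == "inpatient":
            hosp[v.patient_id] += 1
    return ed, hosp


def select_ae_risk(visits, min_ed=5, min_hosp=1):
    """Patients with >= min_ed ED visits AND >= min_hosp hospitalizations."""
    ed, hosp = count_visits(visits)
    return {p for p in ed if ed[p] >= min_ed and hosp[p] >= min_hosp}


# Toy data (made up): "a" meets both criteria, "b" has no hospitalization,
# "c" has too few ED visits.
toy = (
    [Visit("a", "ED")] * 5 + [Visit("a", "inpatient")]
    + [Visit("b", "ED")] * 6
    + [Visit("c", "ED")] * 2 + [Visit("c", "inpatient")]
)
ae_risk = select_ae_risk(toy)
```

Each visit or admission is counted separately here, so repeated visits by one patient accumulate toward the thresholds.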
Data analysis
First, we randomly selected 25% of the “AE-risk COPD patients” to create a validation set of 60 patients and kept the remaining 178 patients (of the 238 AE-risk COPD patients) in the background set. The aim was to identify comorbidities associated with an increased hospitalization risk. Specifically, we searched among the 754 comorbidities for those that were significantly associated with the AE-risk patients, and for the associations between these comorbidities. We prioritized the comorbidities using the result of Fisher’s Exact Test (FET) as the statistical criterion to determine whether a given comorbidity was retained in the network analysis (3,771 patients; Figure 1). Our model can be viewed as an alternate feature selection procedure, where the features are comorbidities of the COPD condition. We created the association network of all comorbidities with the AE-risk COPD condition, and extracted the most significant ones and their interactions/associations. The levels of significance (p-values) by FET of all tested paired associations were adjusted by False Discovery Rate (FDR16) to correct for multiple comparisons and for the possibility of finding a statistically significant result merely because of high statistical power. The analyses were performed using custom scripts written in Python17 and R18. We used a stringent cutoff of significance for the association between diseases: Odds Ratio (OR)≥5 of hospitalization with and without the comorbidity, and FDR≤10⁻⁷. For the one-year period, the number of hospitalizations and ED visits was computed for each patient using MySQL Community Server 5.6. We bundled the high cost of recurrent ED visits with that of hospitalizations as proxies for high risk of COPD exacerbation associated with high overall health system costs.
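The prioritization step (odds ratio, Fisher's Exact Test, FDR adjustment, then the OR≥5 and FDR≤10⁻⁷ cutoff) can be sketched as follows. This is an illustration under stated assumptions, not the study's scripts: the 2x2 table layout `(a, b, c, d)` = (AE-risk with comorbidity, AE-risk without, other COPD with, other COPD without) is assumed, the Benjamini-Hochberg adjustment stands in for the `fdrtool` estimate cited in the text, and the function names and toy counts are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """OR of the 2x2 table [[a, b], [c, d]]; inf when b * c is zero."""
    return (a * d) / (b * c) if b * c else float("inf")


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value from the hypergeometric distribution."""
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, col1)

    def prob(x):  # P(top-left cell == x) with all margins fixed
        return comb(row1, x) * comb(n - row1, col1 - x) / total

    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one.
    tail = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, tail)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def prioritize(tables, min_or=5.0, max_fdr=1e-7):
    """Names of comorbidities with OR >= min_or and adjusted p <= max_fdr."""
    names = list(tables)
    ors = [odds_ratio(*tables[k]) for k in names]
    fdr = benjamini_hochberg([fisher_exact_two_sided(*tables[k]) for k in names])
    return [k for k, o, q in zip(names, ors, fdr) if o >= min_or and q <= max_fdr]


# Made-up counts for two comorbidities among 178 AE-risk and 3,593 other patients.
toy_tables = {"X": (40, 138, 60, 3533), "Y": (5, 173, 100, 3493)}
selected = prioritize(toy_tables)
```

In this toy input, comorbidity `X` is strongly enriched among AE-risk patients and passes both cutoffs, while `Y` has an odds ratio close to 1 and is dropped.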
Figure 1.
Summary of the study design
Network modeling
To explore the pattern of COPD with multiple-system comorbidities, we constructed a network consisting of the significant associations (according to FET) among comorbidities of the AE-risk COPD patients (Figure 2). The network was constructed using Cytoscape19. Next, we investigated the tendency of hospitalization and ED visits according to the topology of the network. Our research group has extensive experience in, and has pioneered, network modeling (i) of diseases20–23, (ii) in translational bioinformatics24–39, and (iii) across multiple scales of the molecules of life40–43.
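To make the network construction concrete, the sketch below builds an undirected graph from significant comorbidity pairs and extracts its connected groups and highest-degree nodes. It is an illustration only: the study built and inspected its network in Cytoscape and identified sub-networks visually, so connected components are a stand-in here, and the function names and the odds ratios attached to the ICD-9-CM codes are hypothetical.

```python
from collections import defaultdict


def build_network(edges):
    """Undirected adjacency map from (node_u, node_v, odds_ratio) triples."""
    adj = defaultdict(dict)
    for u, v, or_uv in edges:
        adj[u][v] = or_uv
        adj[v][u] = or_uv
    return adj


def sub_networks(adj):
    """Connected components, largest first, each as a sorted list of nodes."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        components.append(sorted(comp))
    return sorted(components, key=len, reverse=True)


def hubs(adj, top=1):
    """Nodes ranked by degree (number of significant associations)."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))[:top]


# ICD-9-CM codes at the 3-digit level; the odds ratios are made up.
toy_edges = [
    ("250", "401", 6.0), ("250", "272", 5.5), ("401", "272", 7.0),
    ("250", "276", 5.2), ("303", "305", 9.0),
]
network = build_network(toy_edges)
groups = sub_networks(network)
top_hub = hubs(network)
```

With these toy edges, the metabolic codes form one group with diabetes mellitus (250) as the highest-degree node, and the substance-related codes form a second, separate group.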
Figure 2.
COPD-derived comorbidity network. Four sub-networks are shown in Panel A (details in Table 2). Panels B and C are highlighted for the number of ED visits and OR of hospitalizations respectively.
Evaluation
After the model was constructed from the background data, significant comorbidities were prioritized. We then retained the same comorbidities in the validation set (60 patients) and applied a clustering procedure to automatically separate those 60 patients into two subgroups based on their associated comorbidities. We used the Partitioning Around Medoids (PAM) method for unsupervised clustering, as it is a well-established method for identifying subgroups (clusters) in data44. The R implementation of PAM44 was used in a parameter-free way. It resulted in two clusters, between which we compared the distributions of hospitalizations and ED visits. The comparison was statistically assessed using a two-tailed non-parametric Mann-Whitney test (GraphPad PRISM v.5.0d). Of note, the PAM algorithm used comorbidity data exclusively and was not given the per-patient hospitalization rates or ED visit counts.
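As a rough illustration of the clustering step, the sketch below implements a simplified PAM (greedy BUILD phase followed by a first-improvement SWAP phase) on binary comorbidity vectors. It is not the R `pam` routine used in the study; the Hamming distance is an assumption, since the text does not state the dissimilarity used, and the function names and toy patient vectors are hypothetical.

```python
def hamming(x, y):
    """Number of comorbidity indicators on which two patients differ."""
    return sum(a != b for a, b in zip(x, y))


def total_cost(dist, medoids):
    """Sum over patients of the distance to their nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def pam(points, k=2, metric=hamming):
    """Simplified PAM; returns (medoid indices, cluster label per point)."""
    n = len(points)
    dist = [[metric(p, q) for q in points] for p in points]
    # BUILD: greedily add the medoid that lowers the total cost the most.
    medoids = []
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        medoids.append(min(candidates, key=lambda i: total_cost(dist, medoids + [i])))
    # SWAP: replace a medoid by a non-medoid whenever that lowers the cost.
    best, improved = total_cost(dist, medoids), True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels


# Six made-up patients described by six binary comorbidity indicators.
patients = [
    (1, 1, 1, 0, 0, 0), (1, 1, 0, 0, 0, 0), (1, 1, 1, 1, 0, 0),
    (0, 0, 0, 1, 1, 1), (0, 0, 0, 0, 1, 1), (0, 0, 1, 1, 1, 1),
]
medoids, labels = pam(patients, k=2)
```

The clustering sees only the comorbidity indicators; outcome data such as hospitalization counts would be compared between the resulting clusters afterwards.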
IRB
The research project was approved by the University of Illinois Institutional Review Board id#2012-0150 (non-human subject, de-identified dataset).
Results
Of the 754 comorbidities found in COPD patients, 215 were significantly associated with rehospitalization (OR≥5 and FDR≤10⁻⁷; Methods), that is, with patients at high risk of hospitalization and ED visits (AE-risk patients). These 215 comorbidities were regarded as prioritized features extracted from the training data (Methods). They were associated with each other through 280 interactions, which we represented in a network (Figure 2b). Since all comorbidities were connected to COPD, we removed the COPD node from the network to further explore the associations between comorbidities. Of note, to improve the readability of the network structures, only comorbidities associated with other comorbidities are shown in Figure 2. We provide the full network in the supplement (Supplement Figure S1, http://lussierlab.org/publications/COPD_networks/SuppFigureS1.pdf).
We further annotated the network with the number of hospitalizations and ED visits for each comorbidity condition, in order to highlight the patterns they create in the network topology.
To express the multiplicity of information, we kept the original network but represented its nodes in three ways: 1) ICD-9-CM class of diseases and number of hospitalizations (Figure 2, Panel A); 2) the number of ED visits per disease (Figure 2, Panel B; darker=more visits); and 3) odds ratios between AE-risk patients (≥1 hospitalization and ≥5 ED visits) and each comorbidity (Figure 2, Panel C; darker=higher OR). In Panel A, the size of the node corresponds to the number of hospitalizations, while it corresponds to the relative numbers in the other two panels. Moreover, in Panel A, the link thickness corresponds to the strength of the association (OR) between two comorbid diseases (thicker link=higher OR). Of note, if patients visited the ED or were admitted to the hospital more than once, each visit or admission was counted separately. The network topology showed that, among the 754 comorbidities, the 215 prioritized comorbidities had an increased risk of hospitalizations and ED visits (FDR≤0.05; Panel B).
The clinical dynamics of this network analysis become evident from a pattern analysis of Figure 2, Panel A. Visual inspection reveals four distinct sub-networks: I = multisystem and metabolic syndrome, II = drug abuse and mental disorder, III = pregnancy-associated conditions and IV = fall-related injury. Sub-network I comprises hypertension, obesity, myocardial infarction, diabetes mellitus, disorders of lipoid metabolism, angina pectoris, and chronic ischemic heart diseases. The metabolic syndrome has already been defined as “a cluster of cardiovascular risk factors that is frequently associated with insulin resistance.”45 In this network each comorbid disease is connected to the others, forming a complex sub-network. The metabolic syndrome is part of the largest sub-network (Figure 2, Panel A), and many additional metabolic diseases cluster with this multi-system group. In sub-network II, mental disorders and drug abuse are connected to each other through their related comorbidities. This study also shows that falls, injuries and pregnancy-associated complications seem to be associated with COPD and its hospitalization (Figure 2, sub-networks III and IV).
The analysis of the sub-networks reveals that sub-network I is centered on the metabolic syndrome and is very complex in terms of clinical systems. Sub-networks I, II and IV each comprise diseases associated with some of the highest odds ratios of re-hospitalization and ED visits (Figure 2, Panel C; Table 2). Sub-network III is centered on pregnancy; while its odds ratio of re-hospitalization and ED visits is statistically and clinically significant, none of its comorbidities are among the top 20 shown in Table 2. This sub-network merits further investigation in which ED visits and hospitalizations are studied separately. Additionally, contusion of the trunk (sub-network IV) stands out as a high risk for ED visits and hospitalization in COPD populations and may merit subsequent investigation of potential preventive measures.
Table 2.
Top COPD comorbidities associated with increased hospitalizations and ED visits, as measured by OR. Of note, none of the top ORs was part of sub-network III (Figure 2).
| Sub-network of Figure 2* | ICD-9-CM code | Odds ratio (≥5 ED visits & ≥1 hospitalization) | # patients | # ED visits | # Hosp. |
|---|---|---|---|---|---|
| I | 250 Diabetes mellitus | 2.9 | 65 | 449 | 229 |
| | 272 Disorders of lipoid metabolism | 3.0 | 72 | 197 | 132 |
| | 275 Disorders of mineral metabolism | 8.2 | 21 | 18 | 21 |
| | 276 Disorders of fluid, electrolyte, and acid-base balance | 5.3 | 92 | 135 | 112 |
| | 285 Other and unspecified anemia | 4.8 | 83 | 139 | 113 |
| | 288 Diseases of white blood cells | 7.2 | 28 | 30 | 32 |
| | V12 Personal history of certain other diseases | 6.0 | 84 | 159 | 72 |
| | 401 Essential hypertension | 2.8 | 126 | 625 | 308 |
| | 453 Other venous embolism and thrombosis | 6.3 | 24 | 20 | 21 |
| II | 295 Schizophrenic disorders | 4.3 | 32 | 108 | 124 |
| | 296 Episodic mood disorders | 3.8 | 79 | 135 | 126 |
| | 298 Other nonorganic psychoses | 8.7 | 33 | 19 | 13 |
| | 303 Alcohol dependence syndrome | 9.3 | 34 | 60 | 49 |
| | 305 Nondependent abuse of drugs | 3.9 | 133 | 481 | 288 |
| | V15 Other personal history presenting hazards to health | 5.3 | 84 | 164 | 134 |
| IV | 920 Contusion of face, scalp, and neck except eye(s) | 4.6 | 19 | 16 | 1 |
| | 922 Contusion of trunk | 8.0 | 21 | 22 | 5 |
| | 959 Injury, other and unspecified | 3.2 | 76 | 25 | 2 |
| | V62 Other psychosocial circumstances | 5.5 | 35 | 62 | 56 |
* I = multisystem and metabolic syndrome, II = drug abuse and mental disorder, IV = fall-related injury
In Figure 2, Panel B, the comorbid condition with the highest number of ED visits is ‘Essential hypertension’, followed by ‘Nondependent abuse of drugs’ and ‘Diabetes mellitus’, with 625, 481 and 449 visits, respectively (Table 2). However, the number of ED visits does not appear to be correlated with the OR; the corresponding odds ratios are only modestly increased (2.8, 3.9 and 2.9, respectively; Table 2). Figure 2, Panel C shows that ‘Alcohol dependence syndrome’, ‘Disorders of mineral metabolism’ and ‘Contusion of trunk’ are the top three comorbidities by OR (9.3, 8.2 and 8.0, respectively; Table 2), while their numbers of ED visits are low.
Figure 3 describes how the 215 prioritized comorbidities are predictive of hospitalization in an independent data set. The same 215 comorbidities were retained for the 60 patients of the validation set (Figure 1, Methods). Then, we applied the PAM clustering procedure (Methods) to automatically separate those 60 patients into two clusters44. Of note, PAM divides the patients only on their comorbidity patterns and is not informed of the hospitalization data. The two clusters obtained from the 215 comorbidities (named A and B; respective sizes=9 and 51) have different risks of hospitalization (p=0.004; Mann-Whitney U test). This supports the conclusion that the discovered patterns (the 215 comorbidities) are predictive of the hospitalization rate. Of note, the PAM analysis and Mann-Whitney tests were repeated for models built from different OR and FDR cutoffs and led to results of the same order (data not shown).
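The comparison between the two clusters relies on the Mann-Whitney U test. A minimal sketch is given below, using the normal approximation with tie and continuity corrections; it is not the GraphPad Prism routine used in the study, and exact p-values for small samples may differ. The function name and the toy hospitalization counts are hypothetical.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic of sample x, approximate two-sided p-value).
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2, n = len(x), len(y), len(x) + len(y)
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1  # extend over a run of tied values
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:  # every observation is identical
        return u, 1.0
    z = (abs(u - mean) - 0.5) / sqrt(var)  # continuity correction
    return u, erfc(max(z, 0.0) / sqrt(2))


# Made-up hospitalization counts per patient in two clusters.
cluster_a = [4, 6, 5, 7, 3, 8, 6, 5, 9]
cluster_b = [1, 0, 2, 1, 3, 0, 1, 2, 1, 0, 2, 1]
u_stat, p_value = mann_whitney_u(cluster_a, cluster_b)
```

A small p-value here would indicate that the two clusters, formed without access to outcome data, differ in their hospitalization counts.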
Figure 3.
Evaluation in an independent dataset using disease features prioritized in the learning set. Using the comorbidities identified in the learning set (Figure 2, Panel A), we clustered the independent validation set using PAM clustering (restricted to two clusters), an unsupervised vector quantization method. Of note, the PAM analysis did not have access to the hospitalization rates of the patients in the validation set. As shown here, the identified clusters were validated by comparing their hospitalization rates in the validation set.
Discussion
Comorbidities are known to be a key factor in ED admission and hospitalization among COPD patients; however, previous studies relied on regression models. This study showed that network topology may contribute to hospitalization prediction using comorbidity features among COPD patients. Some of the features were consistent with previous studies. Hypertension, adult-onset diabetes, mental illnesses and substance abuse have all been previously associated with increased ED visit and hospitalization rates in COPD patients46, and they appear as central hubs in sub-networks I and II of Figure 2.
Meanwhile, fall-related and pregnancy-related comorbidities have not been shown in previous studies on COPD comorbidities. While falls are the leading cause of injury-related ED visits and hospitalizations in the elderly population, to our knowledge the increased OR of fall-related hospitalizations in COPD patients had not been previously reported47,48. However, since age increases both the risk of COPD and that of falls, additional studies are required to determine whether the risk of falls in COPD patients is higher than that of matched controls. Furthermore, to our knowledge, no studies have described pregnancy as a comorbid condition of COPD. Since this study was based on Medicaid data, it included a pregnant population because of the Medicaid eligibility criteria49. The Pregnancy Risk Assessment Monitoring System study conducted by the Centers for Disease Control and Prevention (CDC) showed that half of respondents (50.2%) were enrolled in Medicaid at some point from preconception through pregnancy and delivery49. Therefore, the pregnancy-related comorbidities that appear in our analysis may not be related to COPD, and the conclusions from the network analysis must be interpreted carefully.
The strength of this study lies in using claims data, which capture a large number of diseases (n=754) as recorded through medical claims, to construct the model of hospitalization among COPD patients, whereas previous studies used a pre-selected list of about twenty to thirty comorbidities. Consequently, we discovered previously unreported comorbidities associated with hospitalization of COPD patients. Administrative data have become an essential research resource since they are easily accessible, contain heterogeneous information on a large population, and can be promptly available. Claims data have been used to predict hospitalization and re-hospitalization in other studies50. CCI or Elixhauser can also be applied to administrative data; however, since they pre-select a fixed number of features, they may limit the possibility of finding new patterns or results.
Furthermore, compared to the traditional descriptive table, the network topology uncovered distinctive patterns of multiple comorbidities related to COPD hospitalization and their co-dependency. The network approach may inform clinicians about which comorbidities need to be treated intensively. Focusing on a hub may decrease the risk of the other comorbidities connected to it. In the metabolic syndrome sub-network, diabetes mellitus and heart failure are the hubs; managing those hubs may reduce the risk of heart and lung diseases, which would in turn decrease the frequency of medical utilization and the economic burden. However, the threshold used to construct the network must be chosen with caution. Depending on the threshold, the network topology varies, which affects the quality of the information. Too low a threshold produces a hairball-type network that obscures informative patterns. Conversely, too high a threshold risks losing hubs and may provide information that is not clinically actionable.
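The sensitivity of the topology to the cutoff can be illustrated with a small threshold sweep. The sketch below is only a toy demonstration under stated assumptions: the pairwise odds ratios are made up, and the helper names `edges_above` and `degree` are hypothetical rather than part of the study's pipeline.

```python
def edges_above(edges, min_or):
    """Keep associations whose odds ratio meets the cutoff."""
    return [(u, v, o) for u, v, o in edges if o >= min_or]


def degree(edges):
    """Number of retained associations per node."""
    deg = {}
    for u, v, _ in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


# Hypothetical pairwise odds ratios (not values from the study).
toy_edges = [
    ("250", "401", 9.0), ("250", "272", 6.0), ("250", "276", 5.5),
    ("401", "272", 3.0), ("272", "276", 2.5), ("303", "305", 12.0),
]

# For each cutoff, record (number of edges kept, largest node degree).
summary = {}
for cutoff in (2, 5, 10):
    kept = edges_above(toy_edges, cutoff)
    summary[cutoff] = (len(kept), max(degree(kept).values(), default=0))
```

In this toy case the lowest cutoff keeps every edge, the middle cutoff keeps a single clear hub (code 250), and the highest cutoff leaves one isolated pair with no hub at all.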
Future studies and Limitations
The generalizability of this study to other COPD patients is limited by the short data collection period. Furthermore, we limited this proof-of-concept study to identifying valid “features” of the classifier (the comorbidities) with a judicious model-free validation strategy. In future studies, a fully specified classifier is required for individual patient prediction, inclusive of “features,” a mathematical model, and computed weighted parameters. We also intend to extend the types of features with ten years of unstructured clinical narratives (radiology reports, discharge summaries and pathology reports) that we coded in UMLS with MedLEE51, and with more relationships in SNOMED52, to improve the predictive power and to compare the accuracy rates with those reported in the literature for alternate approaches. Further studies need to include heterogeneous variable types such as length of stay, demographic information, and severity of comorbidities, including its change over time.
Further, this proof-of-concept study was designed to address the hypothesis that network analyses could identify comorbidity features predictive of future hospitalizations in clinical datasets, while mitigating the curse of dimensionality that plagues other feature discovery methods (such as FDR over multiple comparisons). However, future studies are required to compare the accuracy of predictions from features discovered by network analyses against those found by classical methods such as statistical regression or vector space discrimination (e.g. support vector machines).
Conclusion
The proposed network modeling of COPD hospitalized patients unveils several sub-networks of comorbidities. Descriptive information based on ICD-9 codes alone is insufficient to reveal the underlying biologic or pathophysiologic connections among comorbidities. The network topology may reveal the codependency of comorbidities, which remains buried in traditional prevalence studies. Further, in high-dimensional datasets, reducing features by design while controlling for multiplicity of comparisons decreases the statistical power of conventional approaches. Finally, we propose that network analysis of clinical comorbidities and their dependencies provides an unbiased and straightforward approach to predictor development that merits further investigation in order to prevent future hospitalizations and ED visits for COPD patients.
Acknowledgments
This work was supported in part by the Patient-Centered Outcomes Research Institute (PCORI) “PATient Navigator to rEduce Readmissions” (PArTNER) grant (http://www.pcori.org/pfaawards/patient-navigator-to-reduce-readmissions-partner/). JK and YAL are funded in part by the University of Illinois at Chicago Center for Clinical and Translational Science NIH/NCATS UL1TR000050 and 1UL1RR029879 grants and the Office of the Vice-President for Health Affairs of the University of Illinois Hospital and Health Science Center. Role of the Sponsor: None of the funding sources had a role in the design and conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation, review, or approval of the manuscript.
References
- 1.Holguin F, Folch E, Redd SC, Mannino DM. Comorbidity and mortality in COPD-related hospitalizations in the United States 1979 to 2001. CHEST Journal. 2005;128(4) doi: 10.1378/chest.128.4.2005. [DOI] [PubMed] [Google Scholar]
- 2.Lin P-J, Shaya FT, Scharf SM. Economic implications of comorbid conditions among Medicaid beneficiaries with COPD. Respiratory medicine. 2010;104(5):697–704. doi: 10.1016/j.rmed.2009.11.009. [DOI] [PubMed] [Google Scholar]
- 3.Wier LM, Elixhauser A, Pfuntner A, Au DH. Overview of Hospitalizations among Patients with COPD. 2008:2011. [PubMed] [Google Scholar]
- 4.Murphy SL, Xu J, Kochanek KD. Deaths: final data for 2010. National vital statistics reports. 2013;61(4):1–118. [PubMed] [Google Scholar]
- 5.Miravitlles M, Ferrer M, Pont A, et al. Characteristics of a population of COPD patients identified from a population-based study. Focus on previous diagnosis and never smokers. Respiratory medicine. 2005;99(8):985–995. doi: 10.1016/j.rmed.2005.01.012. [DOI] [PubMed] [Google Scholar]
- 6.COPD Foundation Impact of COPD on Health Care Costs. 2012. http://www.copdfoundation.org/pdfs/Impact%20on%20Costs.pdf.
- 7.Austin PC, Stanbrook MB, Anderson GM, Newman A, Gershon AS. Comparative ability of comorbidity classification methods for administrative data to predict outcomes in patients with chronic obstructive pulmonary disease. Annals of epidemiology. 2012;22(12):881–887. doi: 10.1016/j.annepidem.2012.09.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Fan VS, Curtis JR, Tu S-P, McDonell MB, Fihn SD. Using quality of life to predict hospitalization and mortality in patients with obstructive lung diseases. CHEST Journal. 2002;122(2):429–436. doi: 10.1378/chest.122.2.429. [DOI] [PubMed] [Google Scholar]
- 9.Chatila WM, Thomashow BM, Minai OA, Criner GJ, Make BJ. Comorbidities in chronic obstructive pulmonary disease. Proceedings of the American Thoracic Society. 2008;5(4):549–555. doi: 10.1513/pats.200709-148ET. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. Journal of chronic diseases. 1987;40(5):373–383. doi: 10.1016/0021-9681(87)90171-8. [DOI] [PubMed] [Google Scholar]
- 11.Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Medical care. 1998;36(1):8–27. doi: 10.1097/00005650-199801000-00004. [DOI] [PubMed] [Google Scholar]
- 12.Sharabiani MT, Aylin P, Bottle A. Systematic review of comorbidity indices for administrative data. Medical care. 2012;50(12):1109–1118. doi: 10.1097/MLR.0b013e31825f64d0. [DOI] [PubMed] [Google Scholar]
- 13.Barabási A-L, Gulbahce N, Loscalzo J. Network medicine: a network-based approach to human disease. Nature Reviews Genetics. 2011;12(1):56–68. doi: 10.1038/nrg2918. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Hidalgo CA, Blumm N, Barabási A-L, Christakis NA. A dynamic network approach for the study of human phenotypes. PLoS computational biology. 2009;5(4):e1000353. doi: 10.1371/journal.pcbi.1000353. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Coventry PA, Gemmell I, Todd CJ. Psychosocial risk factors for hospital readmission in COPD patients on early discharge services: a cohort study. BMC pulmonary medicine. 2011;11(1):49. doi: 10.1186/1471-2466-11-49. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Strimmer K. fdrtool: a versatile R package for estimating local and tail area-based false discovery rates. Bioinformatics. 2008;24(12):1461–1462. doi: 10.1093/bioinformatics/btn209. [DOI] [PubMed] [Google Scholar]
- 17.van Rossum G, de Boer J. Interactively testing remote servers using the Python programming language. CWI Quarterly. 1991;4(4):283–303. [Google Scholar]
- 18.Team RC. R: A language and environment for statistical computing. 2012. [Google Scholar]
- 19.Lopes CT, Franz M, Kazi F, Donaldson SL, Morris Q, Bader GD. Cytoscape Web: an interactive web-based network browser. Bioinformatics. 2010;26(18):2347–2348. doi: 10.1093/bioinformatics/btq430. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Wang X, Quek HN, Cantor M, Kra P, Schultz A, Lussier Y. Automating terminological networks to link heterogeneous biomedical databases. Studies in health technology and informatics. 2003;107(Pt 1):555–559. [PMC free article] [PubMed] [Google Scholar]
- 21.Pantazatos SP, Li J, Pavlidis P, Lussier YA. Integration of neuroimaging and Microarray Datasets through Mapping and Model-Theoretic semantic Decomposition of Unstructured phenotypes. Cancer informatics. 2009;8:75. doi: 10.4137/cin.s1046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Cantor MN, Lussier YA. Mining OMIM™’for Insight into Complex Diseases. Studies in health technology and informatics. 2004;107(Pt 2):753. [PMC free article] [PubMed] [Google Scholar]
- 23. Bales ME, Lussier YA, Johnson SB. Topological analysis of large-scale biomedical terminology structures. Journal of the American Medical Informatics Association. 2007;14(6):788–797. doi: 10.1197/jamia.M2080.
- 24. Li H, Lee Y, Chen JL, Rebman E, Li J, Lussier YA. Complex Disease Networks of Trait–Associated SNPs Unveiled by Information Theory. Journal of the American Medical Informatics Association. 2012;19(2):295–305. doi: 10.1136/amiajnl-2011-000482.
- 25. Chen JL, Hsu A, Yang X, et al. Curation-free biomodules mechanisms in prostate cancer predict recurrent disease. BMC medical genomics. 2013;6(Suppl 2):S4. doi: 10.1186/1755-8794-6-S2-S4.
- 26. Chen JL, Li J, Kiriluk KJ, et al. Deregulation of a Hox protein regulatory network spanning prostate cancer initiation and progression. Clinical Cancer Research. 2012;18(16):4291–4302. doi: 10.1158/1078-0432.CCR-12-0373.
- 27. Cantor M, Sarkar I, Gelman R, Hartel F, Bodenreider O, Lussier Y. An Evaluation of Hybrid Methods for Matching Biomedical Terminologies: Mapping the Gene Ontology to the UMLS®. Studies in health technology and informatics. 2003;95:62.
- 28. Yang X, Li J, Lee Y, Lussier YA. GO-Module: functional synthesis and improved interpretation of Gene Ontology patterns. Bioinformatics. 2011;27(10):1444–1446. doi: 10.1093/bioinformatics/btr142.
- 29. Goh C-S, Gianoulis TA, Liu Y, et al. Integration of curated databases to identify genotype-phenotype associations. BMC genomics. 2006;7(1):257. doi: 10.1186/1471-2164-7-257.
- 30. Liu Y, Li J, Sam L, Goh C-S, Gerstein M, Lussier YA. An integrative genomic approach to uncover molecular mechanisms of prokaryotic traits. PLoS computational biology. 2006;2(11):e159. doi: 10.1371/journal.pcbi.0020159.
- 31. Yang X, Huang Y, Crowson M, Li J, Maitland ML, Lussier YA. Kinase inhibition-related adverse events predicted from in vitro kinome and clinical trial data. Journal of biomedical informatics. 2010;43(3):376–384. doi: 10.1016/j.jbi.2010.04.006.
- 32. Sarkar IN, Cantor MN, Gelman R, Hartel F, Lussier YA. Linking biomedical language information and knowledge resources: GO and UMLS; Paper presented at: Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing; 2003.
- 33. Lee Y, Li H, Li J, et al. Network models of genome-wide association studies uncover the topological centrality of protein interactions in complex diseases. Journal of the American Medical Informatics Association. 2013. doi: 10.1136/amiajnl-2012-001519.
- 34. Gardeux V, Achour I, Li J, et al. ‘N-of-1-pathways’ unveils personal deregulated mechanisms from a single pair of RNA-Seq samples: towards precision medicine. Journal of the American Medical Informatics Association. 2014. doi: 10.1136/amiajnl-2013-002519.
- 35. Sam LT, Mendonça EA, Li J, Blake J, Friedman C, Lussier YA. PhenoGO: an integrated resource for the multiscale mining of clinical and biological data. BMC bioinformatics. 2009;10(Suppl 2):S8. doi: 10.1186/1471-2105-10-S2-S8.
- 36. Lussier Y, Borlawsky T, Rappaport D, Liu Y, Friedman C. PhenoGO: assigning phenotypic context to gene ontology annotations with natural language processing; Paper presented at: Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing; 2006.
- 37. Chen J, Sam L, Huang Y, et al. Protein interaction network underpins concordant prognosis among heterogeneous breast cancer signatures. Journal of biomedical informatics. 2010;43(3):385–396. doi: 10.1016/j.jbi.2010.03.009.
- 38. Chen JL, Li J, Stadler WM, Lussier YA. Protein-network modeling of prostate cancer gene signatures reveals essential pathways in disease recurrence. Journal of the American Medical Informatics Association. 2011;18(4):392–402. doi: 10.1136/amiajnl-2011-000178.
- 39. Regan K, Wang K, Doughty E, et al. Translating Mendelian and complex inheritance of Alzheimer’s disease genes for predicting unique personal genome variants. Journal of the American Medical Informatics Association. 2012;19(2):306–316. doi: 10.1136/amiajnl-2011-000656.
- 40. Gamazon ER, Im H-K, Duan S, et al. Exprtarget: an integrative approach to predicting human microRNA targets. PLoS One. 2010;5(10):e13534. doi: 10.1371/journal.pone.0013534.
- 41. Yang X, Lee Y, Fan H, Sun X, Lussier YA. Identification of common microRNA-mRNA regulatory biomodules in human epithelial cancer. Chinese Science Bulletin. 2010;55(31):3576–3589. doi: 10.1007/s11434-010-4051-1.
- 42. Yang X, Huang Y, Chen JL, Xie J, Sun X, Lussier YA. Mechanism-anchored profiling derived from epigenetic networks predicts outcome in acute lymphoblastic leukemia. BMC bioinformatics. 2009;10(Suppl 9):S6. doi: 10.1186/1471-2105-10-S9-S6.
- 43. Lee Y, Yang X, Huang Y, et al. Network modeling identifies molecular functions targeted by miR-204 to suppress head and neck tumor metastasis. PLoS computational biology. 2010;6(4):e1000730. doi: 10.1371/journal.pcbi.1000730.
- 44. Kaufman L, Rousseeuw PJ. Finding groups in data: an introduction to cluster analysis. Vol. 344. John Wiley & Sons; 2009.
- 45. Reynolds K, Muntner P, Fonseca V. Metabolic syndrome underrated or underdiagnosed? Diabetes care. 2005;28(7):1831–1832. doi: 10.2337/diacare.28.7.1831.
- 46. Baty F, Putora PM, Isenring B, Blum T, Brutsche M. Comorbidities and burden of COPD: a population based case-control study. PloS one. 2013;8(5):e63285. doi: 10.1371/journal.pone.0063285.
- 47. Wu S, Keeler EB, Rubenstein LZ, Maglione MA, Shekelle PG. A cost-effectiveness analysis of a proposed national falls prevention program. Clinics in geriatric medicine. 2010;26(4):751–766. doi: 10.1016/j.cger.2010.07.005.
- 48. Stevens JA, Baldwin GT, Ballesteros MF, Noonan RK, Sleet DA. An older adult falls research agenda from a public health perspective. Clinics in geriatric medicine. 2010;26(4):767–779. doi: 10.1016/j.cger.2010.06.006.
- 49. Centers for Disease Control and Prevention. Pregnancy Risk Assessment Monitoring System Report on CDC’s Winnable Battles. 2012. http://www.cdc.gov/prams/PRAMSReport.html.
- 50. He D, Mathews SC, Kalloo AN, Hutfless S. Mining high-dimensional administrative claims data to predict early hospital readmissions. Journal of the American Medical Informatics Association. 2014;21(2):272–279. doi: 10.1136/amiajnl-2013-002151.
- 51. Friedman C, Shagina L, Lussier Y, Hripcsak G. Automated encoding of clinical documents based on natural language processing. Journal of the American Medical Informatics Association. 2004;11(5):392–402. doi: 10.1197/jamia.M1552.
- 52. Lussier Y, Rothwell D, Côté R. The SNOMED model: a knowledge source for the controlled terminology of the computerized patient record. Methods of information in medicine. 1998;37(2):161–164.

