Abstract
Microorganisms in the human body play a vital role in metabolism, immune defense, nutrient absorption, cancer control, and prevention of pathogen colonization. A growing number of biological and clinical studies have shown that imbalance of microbial communities is closely related to the occurrence and development of various complex human diseases. Finding potential microbe-disease associations is critical for understanding disease pathology and thus for improving disease diagnosis and prognosis. In this study, we propose a novel computational model to predict disease-associated microbes. Specifically, we first construct a heterogeneous interconnection network based on known microbe-disease associations deposited in a public database, the similarity between diseases, and the similarity between microorganisms. We then predict novel microbe-disease associations with a new method, the double-ended restart random walk model (DRWHMDA), implemented on the interconnection network. In addition, we performed case studies of colon cancer and asthma for further evaluation. For each of the two diseases, 9 of the top 10 predicted microorganisms were validated by the relevant literature. Our method is expected to be effective in identifying disease-related microorganisms and will help to reveal the relationship between microorganisms and complex human diseases.
1. Introduction
Microorganisms include bacteria, archaea, protozoa, fungi, and viruses. Different types of microorganisms live on the surface of the human body and in cavities connected to the outside world, such as the oral cavity, respiratory tract, intestinal tract, and urogenital tract [1, 2]. Microbes play important roles in human health, metabolism, immune defense, nutrient absorption, cancer control, and prevention of pathogen colonization [3]. Microorganisms of the human body are mainly distributed on the body surface, in the intestine, and in the oral cavity, and their types and numbers differ between sites. The number of microorganisms in the intestine is about ten times that of the body's own cells. The density of microorganisms in the colon is the highest found in nature, and 60% of the dry weight of human feces is bacteria [4].
Numerous studies have shown that many diseases are related to changes in microorganisms. For example, patients with type 2 diabetes have been found to have moderate intestinal microecological disorder and a lack of butyric acid-producing bacteria [5]. Intestinal microbial disorders lead to dysfunction of the intestinal immune system. In patients with irritable bowel syndrome (IBS), the number of chronic inflammatory cells in the colonic mucosa increases, a large number of T cells are activated, and the inflammatory response is accelerated [6]. In addition, epidemiological studies have shown that common mental illnesses such as autism and schizophrenia are associated with perinatal pathogen infections [7–11].
As mentioned above, discovering the potential links between microbes and diseases allows us to better understand the mechanisms by which diseases form and develop. By regulating the microbial environment, medical solutions for disease prevention, diagnosis, treatment, and prognosis can be provided to some extent. With traditional biological or clinical experiments, identifying a new link between a microorganism and a disease takes considerable time and cost. In recent years, many computational biology methods have provided new and effective tools for identifying the key links between microorganisms and disease. Ma et al. constructed a microbe-disease association database called HMDAD, which helps in studying the relationship between microbes and diseases [12] and provides data support for computational methods that predict new associations.
In recent years, machine learning algorithms have achieved good performance in various fields [13], and several have been applied to the prediction of associations between microorganisms and diseases. Chen et al. established a microbe-human disease association network and developed a KATZ measure-based model for human microbe-disease association prediction (KATZHMDA) under the assumption that functionally similar microorganisms tend to be associated with similar diseases [14]. Huang et al. [15] proposed a path-based human microbe-disease association prediction (PBHMDA) method that integrates known microbe-disease associations and Gaussian interaction profile kernel similarity into a heterogeneous network of diseases and microorganisms. The model traverses all possible paths between microbes and diseases with a special depth-first search algorithm to predict the microorganisms most likely to be associated with a disease. In addition, Wang et al. [16] proposed LRLSHMDA, a Laplacian regularized least squares model, to reveal potential disease-related microorganisms. LRLSHMDA applies a semisupervised learning framework. In this model, a microbial similarity network and a disease similarity network are constructed from the Gaussian interaction profile kernel similarity calculated from known microbe-disease associations; cost functions in the microbe space and the disease space are then constructed and optimized, and the resulting optimal classifiers are integrated to calculate the association probability of each microbe-disease pair. Although the reliable prediction performance of LRLSHMDA has been verified, the model still has some shortcomings and needs further improvement. For example, the number of experimentally confirmed microbe-disease associations is small, and the resulting sparse network of known associations may limit the predictive performance of the model.
Shen et al. [17] combined known microbe-disease associations with Gaussian interaction profile kernel similarity and established a collaborative matrix factorization model for human microbe-disease association prediction (CMFHMDA). A dedicated matrix factorization algorithm updates the association matrix between microorganisms and diseases and infers the microorganisms most likely to be related to each disease. However, the performance of this model needs improvement.
In summary, despite the tremendous progress made in the computational prediction of microbe-disease associations, some limitations remain. To better reveal associations between microbes and diseases, we propose a computational model, DRWHMDA, that performs a double-ended restart random walk on a heterogeneous network built from known microbe-disease associations and Gaussian interaction profile kernel similarity to predict disease-related microorganisms. To assess DRWHMDA, we applied 5-fold cross-validation (5-fold CV) and global leave-one-out cross-validation (LOOCV) to evaluate its prediction performance. In addition, we used DRWHMDA for case studies of two diseases.
2. Materials and Methods
2.1. Materials
The general workflow of DRWHMDA is shown in Figure 1. First, we preprocess the data. The original data come from a microbe-disease association dataset named HMDAD constructed by Ma et al. [12]. HMDAD contains 483 manually curated microbe-disease associations involving 39 diseases and 292 microorganisms. Because some associations are supported by multiple pieces of evidence, we extracted 450 distinct disease-microbe associations. Second, based on these known microbe-disease associations, we constructed a disease network, a microbe network, and a microbe-disease association network. Here, Nd = 39 indicates the number of diseases, and Nm = 292 indicates the number of microorganisms. Finally, a double-ended random walk is performed on the heterogeneous network, and the resulting prediction scores are linearly combined into the final association probability.
Figure 1.

The workflow of DRWHMDA for inferring potential microbe-disease associations.
2.2. Symptom-Based Disease Similarity (SDM)
In the field of information retrieval, text documents or concepts are usually represented by feature vectors. Here, we describe the vector dj of each disease j through symptoms.
$$d_j = (w_{1,j}, w_{2,j}, \ldots, w_{n,j}) \tag{1}$$
where wi,j quantifies the strength of the association between symptom i and disease j. The prevalence of different symptoms and diseases varies widely. To account for this heterogeneity, we do not use the absolute co-occurrence Wi,j of symptom i and disease j directly as the association strength; instead, we weight it by the inverse document frequency, in the manner of term frequency-inverse document frequency (TF-IDF) weighting:

$$w_{i,j} = W_{i,j} \log \frac{N}{n_i} \tag{2}$$
where N represents the number of all diseases in the data set and ni represents the number of diseases with symptom i.
Therefore, the similarity between the vectors dx and dy of the two diseases x and y is calculated as follows:
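$$\mathrm{SDM}(x, y) = \cos(d_x, d_y) = \frac{\sum_i w_{i,x}\, w_{i,y}}{\sqrt{\sum_i w_{i,x}^2}\,\sqrt{\sum_i w_{i,y}^2}} \tag{3}$$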
The cosine similarity ranges from 0 (no shared symptoms) to 1 (identical symptoms).
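The weighting and cosine steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `symptom_disease_similarity` and the input layout (a symptoms-by-diseases matrix of raw co-occurrence counts) are assumptions made for the example.

```python
import numpy as np


def symptom_disease_similarity(cooccurrence):
    """Cosine similarity between diseases from TF-IDF weighted symptom vectors.

    cooccurrence: (n_symptoms, n_diseases) array of raw symptom-disease
    co-occurrence counts (an assumed input layout for this sketch).
    Returns an (n_diseases, n_diseases) similarity matrix.
    """
    W = np.asarray(cooccurrence, dtype=float)
    n_diseases = W.shape[1]

    # n_i: number of diseases in which symptom i occurs at least once
    n_i = np.count_nonzero(W > 0, axis=1)

    # Inverse document frequency log(N / n_i); unused symptoms get weight 0
    idf = np.zeros(W.shape[0])
    present = n_i > 0
    idf[present] = np.log(n_diseases / n_i[present])

    # Weighted symptom vectors, one column per disease
    w = W * idf[:, None]

    # Normalize each disease vector; all-zero vectors are left as zeros
    norms = np.linalg.norm(w, axis=0)
    norms[norms == 0] = 1.0
    unit = w / norms

    # Dot products of unit vectors are the cosine similarities
    return unit.T @ unit
```

Note that a symptom shared by every disease receives zero weight under this scheme, so it does not contribute to the similarity of any disease pair.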
2.3. Gaussian Interaction Profile Kernel Similarity for Diseases
Based on the assumption that diseases with similar phenotypes tend to share similar association and nonassociation patterns with functionally similar microorganisms, the Gaussian interaction profile kernel similarity between diseases can be calculated. We define the binary vector VP(di) as the interaction profile of disease di, which records whether di is known to be associated with each microorganism (i.e., the ith row of the adjacency matrix). From these profiles, the similarity of every disease pair is calculated, giving the Gaussian interaction profile kernel similarity matrix (KD):
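$$KD(d_i, d_j) = \exp\left(-\gamma_d \lVert VP(d_i) - VP(d_j) \rVert^2\right) \tag{4}$$

$$\gamma_d = \gamma'_d \Big/ \left(\frac{1}{N_d} \sum_{k=1}^{N_d} \lVert VP(d_k) \rVert^2\right) \tag{5}$$

The parameter γd controls the bandwidth of the Gaussian kernel. As shown in (5), γd is obtained by dividing a new bandwidth parameter γ′d by the average number of microorganisms associated with each disease. Here, we set γ′d = 1 according to previous research [18].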
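As a hedged illustration of the kernel described above, the following NumPy sketch computes a Gaussian interaction profile kernel matrix from a binary association matrix, following the standard formulation of reference [18]. The function name `gip_kernel` and the argument names are assumptions for this example.

```python
import numpy as np


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    profiles: (n_entities, n_partners) binary matrix; row i is the
    interaction profile of entity i (a disease if rows are diseases).
    gamma_prime: the bandwidth parameter, set to 1 in the text.
    """
    V = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(V ** 2, axis=1)

    # Bandwidth: gamma' divided by the mean squared norm of the profiles,
    # i.e. the average number of known partners per entity
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        raise ValueError("profiles contain no known associations")
    gamma = gamma_prime / mean_sq

    # Pairwise squared Euclidean distances via the expansion
    # |a - b|^2 = |a|^2 + |b|^2 - 2 a.b, clipped at 0 against round-off
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (V @ V.T)
    sq_dist = np.maximum(sq_dist, 0.0)

    return np.exp(-gamma * sq_dist)
```

With a diseases-by-microbes association matrix `A`, `gip_kernel(A)` gives the disease kernel and `gip_kernel(A.T)` gives the corresponding kernel for microbes.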
From the above, we can see that the Gaussian interaction profile kernel similarity is based only on adjacency matrix A. To predict potential disease-related microorganisms effectively, it is necessary to complement this similarity with other data sources. Using the diseases and corresponding symptoms recorded in PubMed bibliographic records, Zhou et al. (2014) calculated similarities between diseases and established a symptom-based human disease network (HSDN). Here, we combine the Gaussian interaction profile kernel similarity of diseases KD with the symptom-based disease similarity SDM to obtain the integrated disease similarity SD, which is calculated as follows:
| (6) |
2.4. Gaussian Interaction Profile Kernel Similarity for Microbes
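In the same way, the Gaussian interaction profile kernel similarity between microorganisms mi and mj is calculated, giving the Gaussian kernel similarity matrix (KM) between microorganisms:

$$KM(m_i, m_j) = \exp\left(-\gamma_m \lVert VP(m_i) - VP(m_j) \rVert^2\right), \qquad \gamma_m = \gamma'_m \Big/ \left(\frac{1}{N_m} \sum_{k=1}^{N_m} \lVert VP(m_k) \rVert^2\right) \tag{7}$$

where VP(mi) is the interaction profile of microorganism mi (the ith column of the adjacency matrix) and γ′m is set to 1.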
2.5. Building a Heterogeneous Network
A heterogeneous network can be expressed as G = (D, E), where D represents the set of all 331 nodes (39 diseases and 292 microorganisms) and E represents the edges between microorganisms, between diseases, and between diseases and microorganisms. The heterogeneous network is represented by an n × n adjacency matrix A, where n is the total number of diseases and microorganisms. The similarity between microorganisms (KM) and the similarity between diseases (KD) provide the similarity coefficients used to construct the heterogeneous network. In the adjacency matrix A, the entry in row i and column j is set to 1 if nodes i and j interact and to zero otherwise. The adjacency matrix A is then normalized:
| (8) |
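One common way to assemble and normalize such a network is sketched below. The text does not fully specify how the blocks are weighted or which normalization is used, so two choices here are assumptions: the similarity matrices are kept as weighted diagonal blocks, and each column is scaled to sum to 1 so that the matrix can serve as a transition matrix. The function names are hypothetical.

```python
import numpy as np


def build_heterogeneous_network(SD, KM, A):
    """Stack disease similarity, microbe similarity, and known associations
    into one (Nd + Nm) x (Nd + Nm) matrix.

    SD: (Nd, Nd) disease similarity, KM: (Nm, Nm) microbe similarity,
    A:  (Nd, Nm) binary disease-microbe association matrix.
    Assumption: similarity values are used directly as edge weights.
    """
    SD, KM, A = (np.asarray(x, dtype=float) for x in (SD, KM, A))
    n_d, n_m = A.shape
    if SD.shape != (n_d, n_d) or KM.shape != (n_m, n_m):
        raise ValueError("block shapes do not match the association matrix")
    return np.block([[SD, A], [A.T, KM]])


def column_normalize(H):
    """Scale every column to sum to 1 (assumed normalization for this sketch).

    Columns that sum to zero are left unchanged to avoid division by zero.
    """
    H = np.asarray(H, dtype=float)
    col_sums = H.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    return H / col_sums
```

Column normalization keeps the total probability mass constant during a random walk, which is why it is a frequent choice for this kind of model.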
2.6. Restart Random Walk Algorithm in Both Directions
On the heterogeneous network, random walks are used to uncover potential associations between diseases and microorganisms. Running the walk to convergence yields, for a given disease or microbe seed, a probability for every node in the heterogeneous network. The relationship between a microorganism and a disease is then indicated by the correlation between the probability distributions obtained for the disease and for the microorganism.
For a given disease, we list all diseases and microorganisms related to it in the known data set; this collection of related diseases and microorganisms is taken as the seed set of the disease:
| (9) |
Here, the entries corresponding to the seed set ψdis of related diseases and microorganisms are set to 1, and all other entries are set to zero. Pdis is then normalized:
| (10) |
Similarly, for a given microbe, we list all diseases and microorganisms related to it in the known data set and take this collection as the seed set, which gives Pmic.
The random walk then starts from the seed nodes and moves to adjacent nodes at each time step (t⟶t + 1). The state probability Pt+1 at time t + 1 is
| (11) |
where Pt is the state probability at time t and r is the restart probability. According to previous studies, we set r to 0.7 [19]. The walk is considered to have reached a steady state when the difference between Pt and Pt+1 falls below 10−6, the threshold used in previous studies [20, 21]. Using Pdis as the seed of the disease and Pmic as the seed of the microbe, we implemented a two-way random walk algorithm to obtain the association probability scored with the disease as the seed and the association probability scorem with the microorganism as the seed. The final association probability between disease di and microorganism mj is obtained by linearly combining the two predicted probabilities.
| (12) |
where β represents the parameter of the linear combination; we set the default value to 0.7.
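The iteration and the final combination can be sketched as follows. This is the standard random walk with restart update, P(t+1) = (1 − r)·W·P(t) + r·P(0), written under stated assumptions rather than taken from the authors' code: `W` is assumed to be a column-normalized network matrix, the L1 norm is assumed for the convergence check, and assigning β to the disease-seeded score is an assumption because the text does not say which term carries β.

```python
import numpy as np


def random_walk_with_restart(W, seed, r=0.7, tol=1e-6, max_iter=1000):
    """Iterate p <- (1 - r) * W @ p + r * p0 until the L1 change is below tol.

    W: (n, n) column-normalized network matrix (assumed input).
    seed: length-n vector with nonzero entries at the seed nodes.
    r: restart probability (0.7 in the text); tol: 1e-6 in the text.
    """
    W = np.asarray(W, dtype=float)
    p0 = np.asarray(seed, dtype=float)
    total = p0.sum()
    if total == 0:
        raise ValueError("seed vector must contain at least one nonzero entry")
    p0 = p0 / total  # normalize the seed so it is a probability vector

    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - r) * (W @ p) + r * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p  # not converged within max_iter; return the last iterate


def combine_scores(score_d, score_m, beta=0.7):
    """Linear combination of the two walk scores.

    Assumption: beta weights the disease-seeded score and (1 - beta)
    weights the microbe-seeded score.
    """
    return beta * score_d + (1.0 - beta) * score_m
```

Because each step shrinks the distance to the fixed point by a factor of at most 1 − r for a column-stochastic `W`, a restart probability of 0.7 converges in a handful of iterations.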
3. Results
3.1. Performance Evaluation
To verify the predictive performance of DRWHMDA, we implemented 5-fold CV and global LOOCV on the model based on the HMDAD database. In 5-fold CV, the known associations in the association matrix Y are divided into 5 folds; each fold is taken in turn as the test set, and the remaining 4 folds are used as the training set. In global LOOCV, each known microbe-disease association is left out in turn as the test sample, and the other known associations are used as training samples. In both settings, all microbe-disease pairs without known evidence of association are considered candidate samples, and each held-out test sample is ranked against the candidate samples by its predicted score. A test sample ranked above a given threshold is considered successfully predicted. We evaluated the predictive performance of the model by the area under the receiver operating characteristic (ROC) curve (AUC). For each threshold, we computed the true-positive rate (TPR, sensitivity) and the false-positive rate (FPR, 1 − specificity); plotting FPR on the horizontal axis against TPR on the vertical axis over all thresholds gives the ROC curve, and the area under it is the AUC. An AUC of 0.5 corresponds to random prediction, and an AUC of 1 corresponds to perfect prediction; between these values, a larger AUC indicates better prediction performance.
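The threshold sweep described above can be sketched directly in NumPy. This is an illustrative implementation, not the evaluation code used in the study; the function name `roc_auc` and the input format (one score and one binary label per microbe-disease pair) are assumptions. Tied scores are not given special treatment here: each sample acts as its own threshold.

```python
import numpy as np


def roc_auc(scores, labels):
    """ROC points and AUC from predicted scores and binary labels.

    scores: predicted association scores (higher means more likely).
    labels: 1 for held-out known associations, 0 for candidate pairs.
    Returns (fpr, tpr, auc).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    # Rank samples from highest to lowest score (stable sort)
    order = np.argsort(-scores, kind="mergesort")
    ranked = labels[order]

    # Lowering the threshold one sample at a time traces the ROC curve
    tpr = np.concatenate(([0.0], np.cumsum(ranked) / n_pos))
    fpr = np.concatenate(([0.0], np.cumsum(~ranked) / n_neg))

    # Trapezoidal rule for the area under the curve
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc
```

For 5-fold CV, the scores of each test fold and of the candidate pairs would be passed to this function once per fold and the resulting AUC values averaged.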
As shown in Figure 2, the AUC of DRWHMDA under 5-fold CV was 0.8676, higher than those of KATZHMDA (0.8382), LRLSHMDA (0.8493), and ABHMDA (0.8571). Moreover, the AUC of our model under global LOOCV reached 0.8897, also higher than those of KATZHMDA (0.8644), LRLSHMDA (0.8843), and ABHMDA (0.8861). These results support the superior prediction performance of DRWHMDA.
Figure 2.

The ROC curves for DRWHMDA and other approaches in microbe-disease association prediction for 5-fold cross-validation and global LOOCV.
To investigate the effect of the restart probability r on the performance of DRWHMDA, we set r to values ranging from 0.1 to 0.9 and calculated the AUC in the framework of 5-fold CV. As shown in Table 1, as the restart probability r increases, the prediction performance of DRWHMDA first increases and then decreases.
Table 1.
Prediction AUCs of DRWHMDA at different choices of restart probability r.
| Restart probability | AUC | Restart probability | AUC |
|---|---|---|---|
| r = 0.1 | 0.8511 | r = 0.6 | 0.8684 |
| r = 0.2 | 0.8513 | r = 0.7 | 0.8674 |
| r = 0.3 | 0.8525 | r = 0.8 | 0.8666 |
| r = 0.4 | 0.8597 | r = 0.9 | 0.8590 |
| r = 0.5 | 0.8695 | | |
3.2. Case Study
In the present study, double-ended random walks were used to screen candidate microorganisms for all the investigated diseases. To further evaluate the predictive performance of DRWHMDA, we considered the 10,938 microbe-disease pairs without a known association in HMDAD (39 × 292 − 450), involving 39 diseases and 292 microorganisms. These unknown pairs are scored and ranked by the DRWHMDA algorithm, and for each disease under study we checked whether the literature supports the association of each of its top ten predicted microorganisms. Independent case analyses were performed for colon cancer and asthma.
3.3. Relationship between Colon Cancer and Microorganisms
According to previous research, the intestinal microflora is the most complex microbial community in the human body and is the one most closely related to various human diseases. Imbalance of the human intestinal microbial flora can lead to autoimmune diseases [22], obesity [23, 24], inflammatory bowel disease (IBD) [25], diabetes [26], and even cancer [27, 28]. According to global cancer statistics, colon cancer has been among the high-risk cancers for both men and women over the past few decades [29]. Therefore, it is necessary to study the pathogenesis of colon cancer in order to explore new treatment methods. A growing body of evidence shows that imbalance of the microbial community is closely related to the occurrence and development of colon cancer. For example, sequence analysis of the 16S rRNA gene V3 region in patients with sporadic colorectal cancer showed that Proteobacteria are underrepresented [30]; Staphylococcus produces tannase, whose activity may be related to the development of colon cancer [31]. Compared with noncancerous tissues, Lactococcus and Fusobacterium are more abundant in cancerous tissues, and Pseudomonas and Escherichia coli are less abundant [32]. We applied DRWHMDA to colon cancer as the first case study. Of the top 10 predicted microorganisms, 9 have been validated by recent experimental literature (see Table 2). For Clostridium difficile (first in the prediction list), evidence suggests that Clostridium difficile-associated colitis is a known complication of colon and rectal surgery and can increase perioperative morbidity and mortality, thereby increasing the length and cost of the hospital stay [33].
Table 2.
The 10 microbes predicted to be most likely to be associated with colon cancer.
| Microbe | Evidence |
|---|---|
| Clostridium difficile | PMID:21152135 |
| Helicobacter pylori | PMID:22294430 |
| Proteobacteria | PMID:25699023 |
| Prevotella | PMID:25699024 |
| Staphylococcus aureus | Unconfirmed |
| Clostridium globosum | PMID:18237311 |
| Firmicutes | PMID:25699024 |
| Bacteroides | PMID:25699024 |
| Actinomycetes | PMID:26811603 |
| Clostridium | PMID:19807912 |
3.4. The Relationship between Asthma and Microbes
Asthma is a common chronic inflammatory disease of the lungs and is generally thought to be caused by a combination of genetic and environmental factors. The incidence of asthma has been rising in recent decades, with the number of asthma patients increasing from 183 million in 1990 to 242 million in 2013. Infection by pathogenic microorganisms (especially viruses, chlamydia, mycoplasma, and mold) is one of the main causes of severe asthma [34–37]. For example, studies have shown that Proteus accounts for a higher proportion of microorganisms in asthma patients and that Firmicutes are reduced in asthma patients compared with healthy people. Moreover, there is evidence that newborns whose hypopharyngeal area is colonized by Streptococcus pneumoniae have a higher risk of developing asthma than uninfected newborns [38, 39]. Therefore, research on asthma-related microorganisms is crucial and will help us to gain a deeper understanding of the pathogenesis and treatment of asthma. We prioritized candidate microorganisms with DRWHMDA, and recent clinical evidence validated 9 of the top 10 predicted microorganisms (see Table 3). Among the top five confirmed asthma-associated microorganisms, Clostridium difficile and Staphylococcus aureus (No. 1 and No. 5 in the prediction table) were found at increased levels in the airways of asthma patients, while Firmicutes and Actinomycetes were found to be reduced [40]. Importantly, the XIVa subclass of Clostridium globosum (No. 3 in the prediction table) has been reported to be an early indicator of future asthma, which can help in preventing and diagnosing asthma and provide guidance for clinical treatment.
Table 3.
The 10 microbes predicted to be most likely to be associated with asthma.
| Microbe | Evidence |
|---|---|
| Clostridium difficile | PMID:25974301 |
| Firmicutes | PMID:23265859 |
| Clostridium globosum | PMID:21477358 |
| Actinomycetes | PMID:23265859 |
| Staphylococcus aureus | PMID:12743582 |
| Lactobacillus | PMID:20592920 |
| Clostridium | PMID:21477358 |
| Burkholderia | PMID:24451910 |
| Gracilariaceae | PMID:17433177 |
| Lachnospiraceae | PMID:27433177 |
For clarity, Figure 3 illustrates the association network of the top 10 predicted microbial candidates for the two diseases. It is worth noting that some top candidates were found to be related to both diseases. For example, both Firmicutes and Clostridium have been documented to be related to the occurrence of asthma as well as colon cancer.
Figure 3.

The network of the top 10 predicted associations for the two diseases via DRWHMDA. A dotted line indicates an association that has not been confirmed by the literature.
4. Discussion
Over the years, a large body of evidence has shown that microorganisms living in the human body are closely related to human life activities and human diseases. Abnormal levels of specific microorganisms are closely related to the development of various human diseases. Knowledge of microbe-disease associations can provide valuable insights into complex disease mechanisms and into preventing, diagnosing, and treating various diseases. However, little work has been done to predict microbial candidates for complex human diseases on a large scale. Therefore, in this paper, a computational model based on known microbe-disease associations is proposed. A microbial similarity network and a disease similarity network are constructed using Gaussian kernel similarity, and the two networks are connected using the existing experimentally validated associations. The double-ended restart random walk method is run on the resulting network to obtain a ranking of association probabilities for candidate microbe-disease pairs. Networks constructed with different parameter settings were compared to optimize prediction performance and obtain the optimal prediction parameters. The results show that DRWHMDA achieved AUCs of 0.8676 and 0.8897 in the 5-fold cross-validation and LOOCV frameworks, respectively. Given its good predictive performance, we believe that the model can be used as one of the effective tools to accelerate the identification of disease-related microorganisms in biomedical research.
Although DRWHMDA has achieved satisfactory results, this method still has some limitations. First, we only use Gaussian kernel similarity to construct the similarity network, which is too simplistic. Integrating disease or microbe similarity from multiple data sources (such as sequence similarity) may help improve the predictive performance of DRWHMDA. Second, as more and more microbe-disease associations are identified, collecting more validated data will help us conduct further research. Finally, some of the predicted candidate microbes have not yet been verified in the literature; verifying these candidates through wet-lab experiments will be one of the important directions of our subsequent research.
5. Conclusion
The main goal of the current research is to predict, by computational means, the microorganisms that may be related to a disease, thereby reducing the cost of verification by wet-lab experiments and enabling deeper exploration of the impact of microorganisms on complex human diseases. To this end, this paper proposes a computational model of microbe-disease association based on a double-ended random walk. The results show that DRWHMDA achieved more reliable and stable prediction performance than the other algorithms compared. We believe that the model can be used as one of the effective tools for accelerating the identification of potential disease-related microorganisms in biomedical research.
Acknowledgments
This research was funded by the Harbin Applied Technology Research and Development Project (grant no. 2017RAXXJ073), Hai Yan Foundation of Harbin Medical University Cancer Hospital (grant no. JJMS2016-04), Wu Jieping Medical Foundation Technology Research Project (grant no. 320.6750.17107), and Heilongjiang Postdoctoral Financial Assistance (LBH-Z16216).
Data Availability
The database used in this study was downloaded from the Human Microbe-Disease Association Database (HMDAD, http://www.cuilab.cn/hmdad).
Conflicts of Interest
The authors declare no competing interests.
Authors' Contributions
Di Wang and Hui Chen conceived the concept of the work. Yan Cui, Yuxuan Cao, and Yuehan He performed the experiments. Yan Cui and Di Wang wrote the paper. All authors approved the final version of this manuscript. Di Wang and Yan Cui contributed equally to this work.
References
- 1. Ventura M., O'Flaherty S., Claesson M. J., et al. Genome-scale analyses of health-promoting bacteria: probiogenomics. Nature Reviews. Microbiology. 2009;7(1):61–71. doi: 10.1038/nrmicro2047.
- 2. Costello E. K., Lauber C. L., Hamady M., Fierer N., Gordon J. I., Knight R. Bacterial community variation in human body habitats across space and time. Science. 2009;326(5960):1694–1697. doi: 10.1126/science.1177486.
- 3. Das B., Nair G. B. Homeostasis and dysbiosis of the gut microbiome in health and disease. Journal of Biosciences. 2019;44(5). doi: 10.1007/s12038-019-9926-y.
- 4. Pickard J. M., Zeng M. Y., Caruso R., Núñez G. Gut microbiota: role in pathogen colonization, immune responses, and inflammatory disease. Immunological Reviews. 2017;279(1):70–89. doi: 10.1111/imr.12567.
- 5. Qin J., Li Y., Cai Z., et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature. 2012;490(7418):55–60. doi: 10.1038/nature11450.
- 6. Pimentel M., Chow E. J., Lin H. C. Eradication of small intestinal bacterial overgrowth reduces symptoms of irritable bowel syndrome. The American Journal of Gastroenterology. 2000;95(12):3503–3506. doi: 10.1111/j.1572-0241.2000.03368.x.
- 7. Mittal V. A., Ellman L. M., Cannon T. D. Gene-environment interaction and covariation in schizophrenia: the role of obstetric complications. Schizophrenia Bulletin. 2008;34(6):1083–1094. doi: 10.1093/schbul/sbn080.
- 8. Finegold S. M., Molitoris D., Song Y., et al. Gastrointestinal microflora studies in late-onset autism. Clinical Infectious Diseases. 2002;35(Suppl 1):S6–S16. doi: 10.1086/341914.
- 9. Chen Y., Yang F., Lu H., et al. Characterization of fecal microbial communities in patients with liver cirrhosis. Hepatology. 2011;54(2):562–572. doi: 10.1002/hep.24423.
- 10. Xu J., Cai L., Liao B., et al. Identifying potential miRNAs–disease associations with probability matrix factorization. Frontiers in Genetics. 2019;10:p. 1234. doi: 10.3389/fgene.2019.01234.
- 11. Xu J., Zhu W., Cai L., et al. LRMCMDA: predicting miRNA-disease association by integrating low-rank matrix completion with miRNA and disease similarity information. IEEE Access. 2020;8:80728–80738. doi: 10.1109/ACCESS.2020.2990533.
- 12. Ma W., Zhang L., Zeng P., et al. An analysis of human microbe-disease associations. Briefings in Bioinformatics. 2017;18(1):85–97. doi: 10.1093/bib/bbw005.
- 13. Xu J., Cai L., Liao B., Zhu W., Yang J. L. CMF-impute: an accurate imputation tool for single-cell RNA-seq data. Bioinformatics. 2020;36(10):3139–3147. doi: 10.1093/bioinformatics/btaa109.
- 14. Chen X., Huang Y. A., You Z. H., Yan G. Y., Wang X. S. A novel approach based on KATZ measure to predict associations of human microbiota with non-infectious diseases. Bioinformatics. 2018;34(8):p. 1440. doi: 10.1093/bioinformatics/btx773.
- 15. Huang Z.-A., Chen X., Zhu Z., et al. PBHMDA: path-based human microbe-disease association prediction. Frontiers in Microbiology. 2017;8:p. 233. doi: 10.3389/fmicb.2017.00233.
- 16. Wang F., Huang Z. A., Chen X., et al. LRLSHMDA: Laplacian regularized least squares for human microbe-disease association prediction. Scientific Reports. 2017;7(1):p. 7601. doi: 10.1038/s41598-017-08127-2.
- 17. Shen Z., Jiang Z., Bao W. CMFHMDA: Collaborative Matrix Factorization for Human Microbe-Disease Association Prediction. Intelligent Computing Theories and Application. 2017. doi: 10.1007/978-3-319-63312-1_24.
- 18. van Laarhoven T., Nabuurs S. B., Marchiori E. Gaussian interaction profile kernels for predicting drug-target interaction. Bioinformatics. 2011;27(21):3036–3043. doi: 10.1093/bioinformatics/btr500.
- 19. Shi H., Xu J., Zhang G., et al. Walking the interactome to identify human miRNA-disease associations through the functional link between miRNA targets and disease genes. BMC Systems Biology. 2013;7(1):p. 101. doi: 10.1186/1752-0509-7-101.
- 20. Hofree M. Network-based stratification of tumor mutations. Nature Methods. 2013;10(11):1108–1115. doi: 10.1038/nmeth.2651.
- 21. Zhong X., Yang H., Zhao S., Shyr Y., Li B. Network-based stratification analysis of 13 major cancer types using mutations in panels of cancer genes. BMC Genomics. 2015;16(Suppl 7):p. S7. doi: 10.1186/1471-2164-16-s7-s7.
- 22. Lee Y. K., Mazmanian S. K. Has the microbiota played a critical role in the evolution of the adaptive immune system? Science. 2010;330(6012):1768–1773. doi: 10.1126/science.1195568.
- 23. Vijay-Kumar M., Aitken J. D., Carvalho F. A., et al. Metabolic syndrome and altered gut microbiota in mice lacking toll-like receptor 5. Science. 2010;328(5975):228–231. doi: 10.1126/science.1179721.
- 24. Abenavoli L., Scarpellini E., Colica C., et al. Gut microbiota and obesity: a role for probiotics. Nutrients. 2019;11(11):p. 2690. doi: 10.3390/nu11112690.
- 25. Kaakoush N. O., Day A. S., Huinao K. D., et al. Microbial dysbiosis in pediatric patients with Crohn's disease. Journal of Clinical Microbiology. 2012;50(10):3258–3266. doi: 10.1128/JCM.01396-12.
- 26. Wen L., Ley R. E., Volchkov P. Y., et al. Innate immunity and intestinal microbiota in the development of type 1 diabetes. Nature. 2008;455(7216):1109–1113. doi: 10.1038/nature07336.
- 27. Castellarin M., Warren R. L., Freeman J. D., et al. Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma. Genome Research. 2012;22(2):299–306. doi: 10.1101/gr.126516.111.
- 28. Eslami M., Sadrifar S., Karbalaei M., Keikha M., Kobyliak N. M., Yousefi B. Importance of the microbiota inhibitory mechanism on the Warburg effect in colorectal cancer cells. Journal of Gastrointestinal Cancer. 2019. doi: 10.1007/s12029-019-00329-3.
- 29. Jemal A., Bray F., Center M. M., Ferlay J., Ward E., Forman D. Global cancer statistics. CA: a Cancer Journal for Clinicians. 2011;61(2):69–90. doi: 10.3322/caac.20107.
- 30. Gao Z., Guo B., Gao R., Zhu Q., Qin H. Microbiota disbiosis is associated with colorectal cancer. Frontiers in Microbiology. 2015;6:p. 20. doi: 10.3389/fmicb.2015.00020.
- 31. Zhu Q., Jin Z., Wu W., et al. Analysis of the intestinal lumen microbiota in an animal model of colorectal cancer. PLoS One. 2014;9(3):e90849. doi: 10.1371/journal.pone.0090849.
- 32. Noguchi N., Fukuzawa M., Wajima T., et al. Specific clones of Staphylococcus lugdunensis may be associated with colon carcinoma. Journal of Infection and Public Health. 2018;11(1):39–42. doi: 10.1016/j.jiph.2017.03.012.
- 33. Furet J. P., Kong L. C., Tap J., et al. Differential adaptation of human gut microbiota to bariatric surgery-induced weight loss: links with metabolic and low-grade inflammation markers. Diabetes. 2010;59(12):3049–3057. doi: 10.2337/db10-0253.
- 34.Dumas M. E., Barton R. H., Toye A., et al. Metabolic profiling reveals a contribution of gut microbiota to fatty liver phenotype in insulin-resistant mice. Proceedings of the National Academy of Sciences of the United States of America. 2006;103(33):12511–12516. doi: 10.1073/pnas.0601056103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Backhed F., Ding H., Wang T., et al. The gut microbiota as an environmental factor that regulates fat storage. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(44):15718–15723. doi: 10.1073/pnas.0407076101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Ma L., Pan Y. P., Zhang J. Q. Study of putative periodontal pathogens colonies in type 2 diabetes with chronic periodontitis. Shanghai Kou Qiang Yi Xue. 2010;19(6):611–615. [PubMed] [Google Scholar]
- 37.Hu R., Zeng F., Wu L., et al. Fermented carrot juice attenuates type 2 diabetes by mediating gut microbiota in rats. Food & Function. 2019;10(5):2935–2946. doi: 10.1039/C9FO00475K. [DOI] [PubMed] [Google Scholar]
- 38.Murphy R., Tsai P., Jüllig M., Liu A., Plank L., Booth M. Differential changes in gut microbiota after gastric bypass and sleeve gastrectomy bariatric surgery vary according to diabetes remission. Obesity Surgery. 2017;27(4):917–925. doi: 10.1007/s11695-016-2399-2. [DOI] [PubMed] [Google Scholar]
- 39.Long J., Cai Q., Steinwandel M., et al. Association of oral microbiome with type 2 diabetes risk. Journal of Periodontal Research. 2017;52(3):636–643. doi: 10.1111/jre.12432. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Vael C., Vanheirstraeten L., Desager K. N., Goossens H. Denaturing gradient gel electrophoresis of neonatal intestinal microbiota in relation to the development of asthma. BMC Microbiology. 2011;11(1):p. 68. doi: 10.1186/1471-2180-11-68. [DOI] [PMC free article] [PubMed] [Google Scholar]
Data Availability Statement
The data used in this study were downloaded from the Human Microbe-Disease Association Database (HMDAD, http://www.cuilab.cn/hmdad).
