Abstract
16S rRNA gene profiling is currently the most widely used technique in microbiome research and allows the study of microbial diversity, taxonomic profiling, phylogenetics, functional and network analysis. While a plethora of tools have been developed for the analysis of 16S rRNA gene data, only a few platforms offer a user-friendly interface and none comprehensively covers the whole analysis pipeline from raw data processing down to complex analysis. We introduce Namco, an R shiny application that offers a streamlined interface and serves as a one-stop solution for microbiome analysis. We demonstrate Namco’s capabilities by studying the association between a fibre-rich diet and the gut microbiota composition. Namco helped to support the hypothesis that butyrate-producing bacteria are promoted by a fibre-enriched intervention. Namco provides a broad range of features from raw data processing and basic statistics down to machine learning and network analysis, thus covering complex data analysis tasks that are not comprehensively covered elsewhere. Namco is freely available at https://exbio.wzw.tum.de/namco/.
Keywords: bioinformatics pipeline, data visualization, microbial co-occurrence networks, microbiome data analysis, microbial functional profiling
Data Summary
Namco was implemented as an R shiny app (https://exbio.wzw.tum.de/namco/) under the GNU General Public License v3.0.
The complete source code is publicly available on GitHub (https://github.com/biomedbigdata/namco).
Sequence data shown in the use case were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under the BioProject ID PRJNA774891.
A user manual is available at https://docs.google.com/document/d/1A_3oUV7xa7DRmPzZ-J-IIkk5m1b5bPxo59iF9BgBH7I/edit?usp=sharing
Impact Statement.
Amplicon sequencing is a key technology of microbiome research and has yielded many insights into the complexity and diversity of microbiota. To fully utilize these data, a wide range of tools have been developed for raw data processing, normalization, statistical analysis and visualization. These tools are mostly available as R packages but cannot be easily linked in an automated pipeline due to the heterogeneous characteristics of microbiome data. Instead, user-friendly tools for explorative analysis are needed to give biomedical researchers without experience in scripting languages the possibility to fully exploit their data. Several tools for microbiome data analysis have been proposed in recent years which cover a broad range of functionality, but only a few offer a user-friendly and beginner-friendly interface while covering the entire value chain from raw data processing down to complex analysis. With Namco (https://exbio.wzw.tum.de/namco/), we present a beginner-friendly one-stop solution for microbiome analysis that covers upstream analyses such as raw data processing and taxonomic binning and downstream analyses such as basic statistics, machine learning and network analysis, among other features.
Introduction
Over the past decade, microbiome research has contributed to our understanding of human health and microbiome-associated diseases with implications for diagnosis, prevention and treatment [1]. Several studies have discovered important links between the gut microbiome and human diseases including diabetes [2], cancer [3], inflammatory bowel disease [4] and brain disorders [5].
Currently, microbiome datasets are generated using either targeted (amplicon) gene sequencing to characterize the microbial composition and phylogeny or shotgun metagenomics to study, in addition, the gene-coding and functional potential of the microbiome. Due to the comparably lower sequencing costs, sequencing of the 16S rRNA gene is the most commonly available method. Analysis of 16S (or 18S in the case of eukaryotes) rRNA genes can be grouped into four steps. First, amplicons are clustered into operational taxonomic units (OTUs) [6] or resolved into amplicon sequence variants (ASVs) [7]. For OTUs, sequencing errors are addressed by choosing a similarity threshold of typically 97% for clustering, whereas ASV methods employ a denoising strategy to identify the unique error-corrected sequence of an organism [8]. Several benchmark studies have shown that denoising methods provide better resolution and accuracy than OTU clustering methods [7, 9]. Along with taxonomic profiling, various measures of alpha and beta diversity are typically computed to study the microbial diversity within and between samples, respectively. Additional analysis steps include, for instance, (i) differential abundance analysis to identify bacteria and/or functions that differ between groups of interest, (ii) in silico inference of metagenomes for functional profiling and (iii) microbial association analysis through correlation, co-occurrence and network inference methods.
Many open-source tools have been developed for microbiome data analysis (Fig. 1). Mothur [10], QIIME2 [11], DADA2 [8] and LotuS2 [12] offer processing of raw sequencing files through clustering and annotation of 16S rRNA genes and provide OTU or ASV tables which serve as an input for further downstream analysis. The QIIME2 [11] pipeline offers more than 20 plugins for downstream analysis including the q2-sample-classifier for supervised classification and regression analysis [12], q2-longitudinal for time-series analysis [13], and plugins for compositional data analysis [14]. In addition, a plethora of R packages have been implemented to perform statistical analysis and high-quality visualizations on amplicon tables, such as phyloseq [15, 16] and themetagenomics [17]. Even though R scripts utilizing these tools offer a powerful approach to analyse microbial data, their application can be demanding for users without scripting knowledge and bioinformatics training. Hence, there is a need for user-friendly tools that can support end-to-end exploration of microbiomes. To this end, several web-based tools have been developed, including MicrobiomeAnalyst [18], IMNGS [19], iMAP [20], MG-RAST [21], wiSDOM [22], VAMPS [23] and Shiny-phyloseq [16]. However, most of these tools (i) cover only the downstream part of the analysis and often omit raw data processing, (ii) offer only standard analysis and are thus not sufficient for more complex data sets, (iii) omit functional profiling or use outdated approaches, such as Tax4FUN [24] and PICRUSt1 [25], which were outperformed by PICRUSt2 [26], (iv) do not offer confounder analysis, (v) offer no support for time-series, (vi) lack support for machine learning, and (vii) do not construct microbial association networks and differential networks on different taxonomic ranks.
To address these limitations, we introduce Namco, an R (v 4.1) shiny application that offers a streamlined and user-friendly interface and serves as a one-stop solution for microbiome analysis. Namco provides a broad range of features from raw data processing and basic statistics down to machine learning and network analysis, thus covering complex data analysis tasks that are not comprehensively covered elsewhere (see Fig. 1, for a comparison with other tools). Namco’s thoroughly documented and easy-to-use graphical interface is intended to eliminate the use of command-line arguments during data processing, making advanced microbiome analysis accessible to a broad range of biomedical researchers. Privacy legislation such as the European Union’s General Data Protection Regulation (GDPR) can prevent users from uploading their data to web tools such as Namco. Hence, we make Namco available under an open source licence and release it as a Docker container, which can be executed locally (e.g. using Docker Desktop) or can be safely deployed on a local protected server.
Methods
Input file formats
For raw data processing, Namco accepts both single- and paired-end FASTQ files which are processed internally based on DADA2 [8] or LotuS2 [12]. Alternatively, users can start their analysis with a previously generated OTU or ASV table, which can either include relative abundance values or read counts. For simplicity, we refer to these as features throughout the paper. Features should be labelled either by their taxonomy or alternatively by a unique id which Namco can map to a separate tabular file with taxonomic labels that can be uploaded separately by the user. Optionally, users can upload a metadata file containing additional information for each sample which can be utilized for groupwise differences, correlation, confounder or condition-specific analysis, for instance. Finally, users can optionally provide a tree file in newick format for phylogenetic tree analysis, for diversity analysis or to create ecologically organized heatmaps. The entire workflow of Namco is presented in Fig. 2.
Denoising and taxonomic assignment
Namco accepts both Illumina single- and paired-end reads in FASTQ format. It supports two pipelines for upstream analysis, DADA2 [8] and LotuS2 [27]. The latter is a user-friendly pipeline providing access to six different sequence clustering algorithms including USEARCH [6], UNOISE3 [28] and DADA2 [8]. It features 21 quality filtering metrics [27] which improve the quality and consistency of the output. Users should be aware of the filtering techniques used in LotuS2 before applying these to their own data. In addition, Namco also provides a stand-alone DADA2 option for denoising steps. Users can change the default parameters such as trimming length depending on the amplicon length and the quality score threshold. While filtering primer regions, the DADA2 option in Namco expects that the primer sequences are present at the start of the input reads with a constant length. The SILVA (v. 138) [29] database is used as a reference for taxonomic classification. Namco stores the LotuS2/DADA2 output as a phyloseq object using the phyloseq R package [15] and passes it down for further downstream analysis. Alternatively, users can perform their own upstream processing elsewhere and upload abundance tables and metadata as inputs to Namco.
Data overview and filtering
The data overview section in Namco summarizes sample details, the total number of features identified during the denoising step and the number of groups provided by the metadata file. This gives an overall picture of the input data and helps in preparing the input data for further processing. Normalization and filtering are considered crucial steps in microbial analyses [30, 31]. By default, Namco applies a 0.25% abundance filter to the DADA2-generated features and normalizes abundance to 10000 reads before downstream analysis. Filtering both OTUs and ASVs at a relative abundance threshold of 0.25% [32] was identified as effective in largely preventing the identification of spurious taxa. Alternatively, users can choose a different filtering percentage and other normalization methods such as normalization by sampling depth, rarefaction and centred log-ratio (CLR) transformation [33]. In addition to the normalization methods, Namco offers different filtering options based on sample prevalence and relative abundance, as microbiome data are very sparse in nature and many features have zero counts in most samples. Such rare taxa are often caused by sequencing artefacts, contamination and/or sequencing errors [32]. In addition, Namco utilizes the decontam R package [34], which can differentiate contaminants from non-contaminants across diverse studies [35, 36] and hence improves the quality of biological conclusions in microbiome studies.
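The default filtering and normalization described above can be sketched as follows. This is a minimal illustration in Python, not Namco's R implementation: the function names are hypothetical, and the sketch assumes that a feature is kept when it reaches the 0.25% relative abundance threshold in at least one sample and that normalization means rescaling every sample to a total of 10000.

```python
from typing import Dict, List

# Feature table: feature id -> list of counts, one entry per sample.
Table = Dict[str, List[float]]


def relative_abundance(table: Table) -> Table:
    """Divide every count by the total of its sample (column)."""
    if not table:
        return {}
    n_samples = len(next(iter(table.values())))
    totals = [sum(row[j] for row in table.values()) for j in range(n_samples)]
    return {
        feature: [row[j] / totals[j] if totals[j] else 0.0 for j in range(n_samples)]
        for feature, row in table.items()
    }


def abundance_filter(table: Table, min_rel: float = 0.0025) -> Table:
    """Keep features reaching min_rel in at least one sample (assumed rule)."""
    rel = relative_abundance(table)
    return {f: row for f, row in table.items() if max(rel[f]) >= min_rel}


def normalize_to_depth(table: Table, depth: float = 10000.0) -> Table:
    """Rescale each sample so that its counts sum to `depth`."""
    rel = relative_abundance(table)
    return {f: [v * depth for v in row] for f, row in rel.items()}


# Toy example with three features and two samples.
counts = {"ASV1": [900, 500], "ASV2": [99, 499], "ASV3": [1, 1]}
filtered = abundance_filter(counts)
normalized = normalize_to_depth(filtered)
```

In this toy table, ASV3 never exceeds 0.1% of a sample and is dropped; the remaining features are rescaled so that both samples sum to 10000.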
Basic analysis
Visualization of taxonomic binning and ecologically organized heatmap
Namco integrates R-scripts from Rhea [37] and phyloseq [15] to perform taxonomic profiling and diversity analysis and provides different options to visualize the distribution of dominant taxa at different ranks (domain, phylum, class, order, family and genus) among groups using barplots. In addition, taxonomic distribution can also be inferred based on intra-individual differences by visualizing taxa for individual samples. Users can download feature tables of relative abundances aggregated at different taxonomic levels and export any of the generated plots. The advanced heatmap option in Namco creates a heatmap using ordination methods to organize the rows and columns instead of a hierarchical clustering approach which gives an overview of the abundance of features across sample groups that are very high/low in abundance.
Diversity analysis
Alpha diversity quantifies the diversity of the microbiome within a sample. Namco supports five common alpha diversity measures, namely Shannon entropy [38] and Simpson index [39] together with their counterparts accounting for the effective number of species [40], as well as richness. Users can select different categories from the metadata to visualize alpha diversity and determine significant differences via a Wilcoxon test. Beta diversity analysis describes the variation between samples and is calculated from the feature table, together with a phylogenetic tree for phylogeny-based distances. Namco supports the most common distance metrics including weighted and unweighted UniFrac distances, generalized UniFrac [41], variance-adjusted UniFrac distance and Bray–Curtis dissimilarity [42]. Calculation of UniFrac distances is only possible if a tree file has been uploaded. The results are presented using non-metric multidimensional scaling (NMDS) and principal coordinates analysis (PCoA) [43]. In addition, it is also possible to visualize the distance as a hierarchically clustered dendrogram, which helps to identify closely related samples. Significance between groups is determined by a permutational multivariate analysis of variances using the adonis function of the vegan R-package [44]. P-values are corrected for multiple testing following Benjamini–Hochberg (BH) [45].
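For orientation, the alpha diversity measures and the Bray–Curtis dissimilarity named above can be written in a few lines. The sketch below is illustrative Python rather than the Rhea or phyloseq code that Namco calls. It assumes that the Simpson index is the concentration (sum of squared proportions; some packages report one minus this value) and that the effective numbers are exp(Shannon) and the inverse of the Simpson index. UniFrac distances are not shown because they additionally require the phylogenetic tree.

```python
from math import exp, log
from typing import Dict, Sequence


def alpha_diversity(counts: Sequence[float]) -> Dict[str, float]:
    """Alpha diversity of one sample from its feature counts."""
    positive = [c for c in counts if c > 0]
    total = sum(positive)
    if total == 0:
        raise ValueError("sample has no reads")
    props = [c / total for c in positive]
    shannon = -sum(p * log(p) for p in props)  # natural-log Shannon entropy
    simpson = sum(p * p for p in props)        # Simpson concentration (assumed form)
    return {
        "richness": float(len(positive)),
        "shannon": shannon,
        "simpson": simpson,
        "effective_shannon": exp(shannon),
        "effective_simpson": 1.0 / simpson,
    }


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two samples on the same features."""
    if len(a) != len(b):
        raise ValueError("samples must cover the same features")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


even_sample = alpha_diversity([10, 10, 10, 10, 0])
```

For a perfectly even sample with four observed features, richness and both effective numbers equal four, which is why the effective measures are easier to interpret than the raw indices.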
Differential analysis
Differential abundance testing using simple statistical tests and association analysis
A key aim of microbial research is to identify differences in microbial composition between conditions or phenotypes. Namco reports statistically significant features between the sample groups using the non-parametric Wilcoxon test (SIAMCAT R-package [46]), which was shown to reliably control the false discovery rate in differential abundance analysis [47]. Users can choose groups that should be compared against each other and adjust the significance level as well as other filtering parameters. Differential abundance can be calculated at different taxonomic levels such as phylum or genus, where Namco aggregates the feature table accordingly. Namco shows the distribution of microbial relative abundance along with the significance and a generalized fold change [48] as a non-parametric measure of effect size. In addition to the Wilcoxon test, Namco also offers the Kruskal–Wallis test to find significant differences between more than two groups.
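Because one test is run per feature, the resulting P-values are usually adjusted for multiple testing; the use case in this paper applies the Benjamini–Hochberg (BH) procedure after the Wilcoxon test. The following is a minimal Python sketch of the BH adjustment on a plain list of P-values. The function name is hypothetical and the code is not taken from SIAMCAT or Namco.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return BH-adjusted P-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [q <= 0.05 for q in adjusted]
```

A feature is then reported as significant when its adjusted value is below the chosen significance level.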
Correlation analysis
Namco provides correlation analysis to reveal significant associations between taxa or between taxa and metadata such as continuous experimental variables. Namco further considers relative abundances of features at different levels (phylum, class, order, etc).
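The use case later in this paper reports Spearman's correlation between taxa and continuous metadata. As a rough illustration of what such a coefficient measures, the Python sketch below ranks both variables (averaging tied ranks) and computes the Pearson correlation of the ranks. Function names are hypothetical, significance testing is omitted, and this is not the routine Namco calls.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long vectors with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return covariance / sqrt(var_x * var_y)
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed distributions that are typical of relative abundance data.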
Topic modelling
Topic modelling was originally designed to uncover hidden thematic structures in document collections [49]. This concept was adapted to metagenomic analyses to explore co-occurring taxa as topics and to find topics associated with the provided sample metadata [50]. Namco employs the themetagenomics R package [50] to predict topics and to study their association with sample metadata, which can be continuous, binary, categorical or factor covariates.
Functional profiling
Microbial composition varies widely between individuals, making the robust identification of phenotype-associated microbial features challenging. One can hypothesize that the functional potential of the microbiome is more robustly associated with a phenotype than the microbial composition. While investigating the functional potential of the microbiome is not directly feasible with 16S rRNA gene sequencing data, several tools have been proposed for inferring the functional profile with the help of reference sequencing databases. To this end, Namco adapts the PICRUSt2 [26] approach, which showed improved accuracy and flexibility compared to related tools including PICRUSt1, Tax4FUN2, Piphillin and PanFP [26]. Namco also performs differential analysis on the predicted KEGG orthologues, enzyme classification numbers and pathways using ALDEx2 [51], which was recently reported to perform best for this type of analysis [52]. The relative abundances of significant KEGG annotation terms are plotted in a barplot along with the P-value.
Phylogenetic tree analysis
Phylogenetic analysis is part of the basic steps to get an overview of the evolutionary relationships between features. Namco displays the provided/calculated phylogenetic tree in circular or rectangular format. In addition, users can add two heatmap layers as taxonomic ranks and/or a meta-group. The meta-group heatmaps are coloured based on the abundance of the features in the corresponding meta-group.
Network analysis
Several methods for inferring and analysing microbial co-occurrence networks were developed to study the role of microbial interactions in association with the host [53, 54]. Namco implements multiple strategies for network construction. As a simple approach, the feature abundance matrix is converted into a binary presence/absence indicator matrix using an abundance cutoff. The default cutoff is 1: all features with a value <1 are considered absent, while the rest are considered present. This cutoff can be adapted manually to obtain a stricter binary representation of the abundance matrix. Next, the number of samples in which each pair of features co-occurs is counted separately for each group (e.g. case and control). The difference in co-occurrence counts, as well as the log2 fold-change between the groups, is then calculated and displayed as a network where nodes represent features and edges represent frequent group-specific interactions. For more advanced approaches, Namco employs the NetCoMi [55] R package, where users can build microbial association networks at different taxonomy levels using nine different network construction algorithms. In addition, NetCoMi offers a method for differential network analysis between two conditions to identify pairs of taxa differentially associated between two groups.
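The simple presence/absence approach described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Namco's code: the function names are hypothetical, and the pseudocount added before taking the log2 fold-change is an assumption made here to avoid division by zero, since the text does not say how pairs that are absent in one group are handled.

```python
from itertools import combinations
from math import log2
from typing import Dict, List, Tuple

Sample = Dict[str, float]  # feature id -> abundance in one sample
Pair = Tuple[str, str]


def cooccurrence_counts(samples: List[Sample], cutoff: float = 1.0) -> Dict[Pair, int]:
    """Count, per feature pair, the samples in which both features are present."""
    counts: Dict[Pair, int] = {}
    for sample in samples:
        # Values below the cutoff are treated as absent.
        present = sorted(f for f, value in sample.items() if value >= cutoff)
        for pair in combinations(present, 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


def differential_cooccurrence(
    case: List[Sample],
    control: List[Sample],
    cutoff: float = 1.0,
    pseudocount: float = 1.0,  # assumption: avoids log2 of zero
) -> Dict[Pair, Tuple[int, float]]:
    """Return (count difference, log2 fold-change) for every observed pair."""
    in_case = cooccurrence_counts(case, cutoff)
    in_control = cooccurrence_counts(control, cutoff)
    edges: Dict[Pair, Tuple[int, float]] = {}
    for pair in sorted(set(in_case) | set(in_control)):
        a, b = in_case.get(pair, 0), in_control.get(pair, 0)
        edges[pair] = (a - b, log2((a + pseudocount) / (b + pseudocount)))
    return edges


case_group = [{"A": 5, "B": 3, "C": 0}, {"A": 2, "B": 1, "C": 4}]
control_group = [{"A": 5, "B": 0, "C": 2}]
edges = differential_cooccurrence(case_group, control_group)
```

Each key of `edges` corresponds to one candidate edge of the network; pairs with large absolute differences are the group-specific interactions that would be drawn.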
Confounder analysis and explained variation
Confounding variables may mask the actual relationship between the dependent and independent variables in a study [56]. In particular, microbiome composition is associated with several host variables including body mass index (BMI), sex, age and geographical location, among others [57]. Namco utilizes the permutational multivariate analysis of variances (adonis function of the vegan R-package) [44] to rule out confounding factors using available information from the user-provided metadata table. The explained variation of covariates is determined by R² values, which are considered significant at P≤0.05.
Classification based on random forest
Beyond differential abundance analysis, an important question is if a classification model can be trained on a minimal set of features to robustly predict the outcome (e.g. disease state or treatment response). Such models highlight the potential of microbiome data for prognostic and diagnostic purposes through biomarkers and surrogate endpoints. Namco allows users to build classification models and identify important features using machine learning algorithms. Within Namco, random forest (ranger [58] R package) is used as a classification tool, since it has shown good performance even on comparably small sample sizes in microbial data analysis [59, 60]. By default, Namco splits the data into training and test sets and performs 10-fold repeated cross-validation. Experienced users can modify advanced parameters such as the ratio of training and test sets, the number of cross-validation folds, the resampling method and the number of decision trees. The results are summarized in a confusion matrix and a receiver-operator-characteristic (ROC) plot which helps in evaluating the model performance. The most informative features that were used for classification can be extracted as biomarker candidates for hypothesis generation and further research.
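The data partitioning behind this evaluation scheme can be illustrated as follows. The Python sketch only generates sample indices for a train/test split and for repeated k-fold cross-validation on the training set; it does not train a random forest. The function names, the 75% training fraction and the three repeats are placeholders chosen for the example, not Namco defaults.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def train_test_split(n_samples: int, train_fraction: float = 0.75,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices and split them into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_samples * train_fraction))
    return indices[:cut], indices[cut:]


def repeated_kfold(indices: Sequence[int], k: int = 10, repeats: int = 3,
                   seed: int = 0) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (repeat, training indices, validation indices) for each fold."""
    if k < 2 or k > len(indices):
        raise ValueError("k must be between 2 and the number of samples")
    rng = random.Random(seed)
    for repeat in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)  # new random fold assignment in every repeat
        folds = [shuffled[i::k] for i in range(k)]
        for i, validation in enumerate(folds):
            training = [x for j, fold in enumerate(folds) if j != i for x in fold]
            yield repeat, training, validation


train, test = train_test_split(40, train_fraction=0.75, seed=1)
splits = list(repeated_kfold(train, k=10, repeats=3, seed=1))
```

The held-out test set is never used during cross-validation, so the confusion matrix and ROC curve computed on it give an estimate of performance on unseen samples.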
Time-series analysis and clustering
Time-series analysis in Namco allows users to determine how microbial communities including taxa, OTUs/ASVs and other features such as richness change over time. For instance, time-series analysis helps to study the microbial changes in response to a treatment over multiple timepoints or during different stages of host development. Namco offers different options to modify the inputs for time-series line charts, including displays of changes in either relative abundance or absolute abundance or richness.
Use case
To illustrate the broad utility of Namco, we analysed human faecal samples from an interventional cross-over study. The study’s aim was to develop healthier convenience food products with an increased fibre content and to foster customer acceptance of such products. Here, we analysed whether the stool microbiota is altered by the fibre-rich diet.
Ethics statement
The study protocol was approved by the ethics committee of the Faculty of Medicine of the Technical University of Munich in Germany (approval no. 529/16S). The guidelines of the International Conference on Harmonization of Good Clinical Practice and the World Medical Association Declaration of Helsinki (in the revised version of Fortaleza, Brazil 2013) were considered. All study participants gave written informed consent. The study was registered at the German Clinical Trial Register (DRKS00011526).
Study design
The human intervention study was a single-blinded, controlled cross-over study. Volunteers were recruited from a cohort of middle-aged subjects who were broadly phenotyped within the enable nutrition cluster; 50% were male and 50% female. Inclusion criteria were an age of 40–65 years and an elevated waist circumference. For detailed information on inclusion and exclusion criteria of the enable cohort see Brandl et al. [61]. Study participants were invited four times to the study centre. During the first visit, baseline characteristics were collected. In general, the intervention of giving a meatloaf in a bun and pizza was performed as described in Rennekamp et al. [62].
Phenotypic characteristics of the study group
The study group was age- and sex-matched (N=11 females, N=10 males) and received the same intervention and placebo (Table 1). Baseline measurements were performed in the morning after an overnight fast. Body composition and body weight were measured by using a Seca Medical Body Composition Analyser, mBCA 515 (Seca). Body height was measured in a standing position without shoes using a stadiometer (Seca). BMI was calculated as weight (kg) divided by the square of height (m²). Waist circumference was measured at the midpoint between the lowest rib and the iliac crest with a measuring tape (Seca).
Table 1.
| | Mean | sd | Differences between sexes |
|---|---|---|---|
| Weight [kg] | 90.14 | 11.42 | 0.0080 (**) |
| Height [m] | 1.73 | 0.08 | 1.35e-05 (***) |
| BMI [kg m–2] | 30.12 | 2.41 | 0.8490 (ns) |
| Fat-free mass [%] | 62.98 | 6.74 | 2.40e-08 (***) |
| Fat mass [%] | 37.02 | 6.74 | 2.40e-08 (***) |
| Skeletal muscle mass [kg] | 27.55 | 6.28 | 9.34e-07 (***) |
| Visceral fat [kg] | 3.24 | 1.32 | 4.55e-05 (***) |
| Waist circumference [cm] | 101.3 | 7.26 | 0.0058 (**) |

***Significance level at the < 0.001
**Significance level at the < 0.01
*Significance level at the < 0.05
Sample preparation
The participants were asked to visit the study centre in a fasted state (10 h before the visit) and received the intervention or placebo meal in the study centre. Additionally, a capsule with food colouring was administered together with each meal. The dye stains the stool visibly green, which allowed the collected stool samples to be assigned to the corresponding meal; a recognizable coloration was noted in the data. The time of the meal and the time of excretion were recorded and showed a mean transit time of 34.74±24.69 h. Participants consumed two different types of food (meatloaf in a bun and a pizza), both either enriched with fibre (intervention) (IM) or not (placebo) (M). For the first interventional meal (meatloaf in a bun, IM1) the white bread roll in the fibre-enriched meal contained an additional 5.7% wheat fibre (VITACEL WF600) and the meatloaf (Fleischkäse) a mixture of 3.1% wheat fibre and 4.5% resistant dextrin. The second intervention (pizza, IM2) was also fibre-enriched, containing up to 20 g fibre with 3.0% wheat fibre, 2.4% powdered cellulose and 2.1% inulin (Table 2). The intervention meals thus constituted a major part of the recommended daily fibre intake. As the fibre content is above 6 g per 100 g, the food products are considered high-fibre products.
Table 2.
| | Meatloaf with bun (240 g portion), enriched | Meatloaf with bun (240 g portion), standard | Salami pizza (320 g portion), enriched | Salami pizza (320 g portion), standard |
|---|---|---|---|---|
| Energy [kcal] | 413 | 587 | 829 | 876 |
| Fat [g] | 13 | 35 | 41 | 45 |
| Carbohydrate [g] | 47 | 47 | 75 | 83 |
| Total fibre [g] | 19 | 2.9 | 20 | 6 |
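
The high-fibre classification can be checked directly from the portion sizes and total fibre values in Table 2. The short Python calculation below is only a worked example: it assumes the first enriched/standard column pair of the table belongs to the meatloaf with bun and the second to the salami pizza, and it uses the threshold of 6 g fibre per 100 g stated in the text.

```python
# Threshold for a "high fibre" product, as stated in the text.
HIGH_FIBRE_THRESHOLD = 6.0  # g fibre per 100 g

# (total fibre in g, portion size in g), values taken from Table 2.
meals = {
    "meatloaf with bun, enriched": (19.0, 240.0),
    "meatloaf with bun, standard": (2.9, 240.0),
    "salami pizza, enriched": (20.0, 320.0),
    "salami pizza, standard": (6.0, 320.0),
}


def fibre_per_100g(fibre_g: float, portion_g: float) -> float:
    """Convert the fibre content of a portion to g per 100 g of product."""
    return 100.0 * fibre_g / portion_g


density = {name: fibre_per_100g(*values) for name, values in meals.items()}
high_fibre = {name: value > HIGH_FIBRE_THRESHOLD for name, value in density.items()}

for name, value in density.items():
    print(f"{name}: {value:.2f} g fibre per 100 g, high fibre: {high_fibre[name]}")
```

Under these assumptions, both enriched meals exceed the threshold (about 7.9 and 6.25 g per 100 g), whereas both standard meals stay below 2 g per 100 g.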
Sample preparation and sequencing
For the analysis of gut microbiota, the 16S rRNA gene was sequenced at the ZIEL Core Facility Microbiome, Technical University of Munich, Germany. A detailed description of the sample preparation and sequencing is provided elsewhere [63]. Briefly, sample DNA was isolated following an in-house developed protocol. For targeting the V3V4 region of the 16S rRNA gene, samples were amplified and purified. Pooled amplicons were paired-end sequenced on an Illumina MiSeq. Sequencing data are available under BioProject ID PRJNA774891.
Research question
An increased fibre intake has been shown to be protective against the development of cardiovascular [64, 65] and malignant diseases [66, 67] and there are specific health claims associated with specific types of fibres. In this study, we examined the presence of butyrate-producing bacteria, which could be promoted by the fibre-enriched intervention and thus prove that dietary aspects can have a permanent effect on the gut microbial composition.
Results
Diversity analysis
We studied changes in the microbial composition following dietary intervention using Namco. Paired-end FASTQ files were processed using the DADA2 denoising step embedded in Namco with default parameters. During the DADA2 step, a 0.25% abundance-based filter was applied to reduce sparsity [32]. ASVs were normalized to 10000 reads before downstream analysis, and outliers were removed. Additionally, a prevalence filter cutoff of 10% was introduced. No significant differences between the IM and M groups were observed in any alpha diversity measure (Shannon entropy, richness, Simpson index, effective Shannon entropy or effective Simpson entropy; Fig. 3). Likewise, no significant clustering was found in beta diversity, including unweighted and weighted UniFrac, between IM and M (Fig. 3).
Taxonomic distribution
Overall, six phyla and 77 genera were observed in all groups. Dominating phyla in both the intervention (IM1 and IM2) and non-intervention groups (M1 and M2) were Firmicutes and Bacteroidota, contributing up to 90% to the total bacterial composition. The relative abundance of Firmicutes was slightly higher in the IM group compared to the M group. Bacteroidota were slightly more abundant in the M group. Other phyla such as Actinobacteria, Verrucomicrobia and Proteobacteria showed <5% mean relative abundance in the IM and M groups (Fig. 4a). Ruminococcaceae Incertae Sedis was significantly different between the IM and M groups. Overall, except for Ruminococcaceae Incertae Sedis, no other significant difference was observed at the phylum level between the IM and M groups. However, intra-individual heterogeneity was observed in abundance at the phylum level (Fig. 4b) at each intervention. At the phylum level, the relative abundance of Firmicutes ranged from 34 to 77% and that of Actinobacteria from 0.46 to 10.56% across the IM and M groups. Similarly, the relative abundance of Proteobacteria varied from 0.03 to 5.6% and from 0.07 to 8.6% in the IM and M groups, respectively. At the genus level, most individuals showed a uniform distribution except for one individual who showed a high level of Prevotella (54% of relative abundance).
The top 20 genera are shown in Fig. 4c and intra-individual differences are plotted in Fig. 4d. Bacteroides was the most abundant genus in both the IM and M groups, followed by Faecalibacterium, Prevotella and Agathobacter. Due to the similarity observed at the genus level between IM and M meals, a non-parametric paired Wilcoxon test was applied to the relative abundance to identify microbial changes between the intervention groups (IM1, IM2) and their respective controls (M1 and M2). Samples with missing information regarding which meal was administered were removed from the following analysis. In total, five genera, Anaerostipes, Ruminococcaceae Incertae Sedis, Parabacteroides, Fusicatenibacter and Butyricicoccus, were significantly different in abundance between intervention meals and normal meals prior to multiple testing correction. After multiple testing correction with BH, only Ruminococcaceae Incertae Sedis remained significant. The relative abundance of Anaerostipes, a butyrate-producing bacterium, was found to be higher in IM2 compared to M2. Anaerostipes is a Gram-positive, anaerobic bacterium from the family Lachnospiraceae and is highly abundant in a normal healthy gut [68]. In line with this, previous studies noted that the abundance of Anaerostipes increases with fibre-rich diets and is negatively correlated with BMI [69, 70]. Ruminococcaceae Incertae Sedis and Parabacteroides also showed a significant difference between IM1 and M1 (Fig. 5). Ruminococcaceae are also known to produce short-chain fatty acids (SCFAs) including butyrate, which promotes a healthy bowel [71] and is nominally protective against weight gain [72]. In addition, Ruminococcus bromii is reported as the key species in fermenting resistant starch, which in turn helps in conferring health benefits including weight control and protection against diabetes [73]. Parabacteroides have been reported to have metabolic benefits and to have a negative correlation with BMI [74, 75].
One species of Parabacteroides (P. distasonis) is also reported to be part of the core gut microbiome [76–78] and has been linked to the production of succinic acid and of bile acids involved in regulating host metabolism [75, 79]. Ezeji et al. also found Parabacteroides to be enriched in fibre-rich dietary intervention groups [80]. At the genus level, dietary fibre intervention significantly promoted the growth of the beneficial genera Anaerostipes, Ruminococcaceae Incertae Sedis and Parabacteroides. Additionally, Fusicatenibacter was significantly higher in IM1 than M2 and Butyricicoccus was higher in the M1 group compared to IM2.
Correlation of gut microbial composition and clinical metadata
To study possible associations between features and continuous variables of interest such as fat-free mass [%], fat mass [%], skeletal muscle mass [kg], BMI and age, Spearman's correlation was calculated at the phylum level (Fig. 6a), showing that the (formerly) Deltaproteobacteria were negatively correlated with BMI, followed by Verrucomicrobiota and Firmicutes. At the genus level (Fig. 6b), the genera Phascolarctobacterium, Lachnospira, Lachnospiraceae FCS020 group, Prevotella, Alistipes and Oscillospiraceae UCG-005 were significantly and positively correlated with BMI, whereas Bacteroides, Ruminococcus and Lachnoclostridium were negatively correlated with BMI. Prevotella, Lachnospiraceae FCS020 group and Phascolarctobacterium, which were positively correlated with BMI, were also slightly less abundant in the IM group compared to the M group. Anaerostipes, [Eubacterium] ruminantium group and Lachnospira were also positively associated with fat mass [%]. Conversely, the Rikenellaceae RC9 gut group and Clostridia UCG-014 were negatively associated with fat mass [%].
Functional analysis
The built-in PICRUSt2 option of Namco was used to infer functional differences of the microbial communities between the IM and M groups. As in the differential abundance analysis at the taxon level, significant differences between the IM and M groups were assessed with a paired Wilcoxon signed-rank test. In total, 82 KEGG Orthologues (KO) were significantly different between the groups without correction for multiple testing. Significant KO terms with P<0.05 were grouped according to KO categories in order to understand their functions. The majority of the 76 KO terms belonged to metabolic categories (level 1) and were further divided into 11 sub-categories (level 2) including carbohydrate metabolism, amino acid metabolism, energy metabolism (oxidative phosphorylation), lipid metabolism, metabolism of cofactors and vitamins, and glycan biosynthesis and metabolism. Among these, the KO term K00845 (glucokinase), which is part of amino sugar and nucleotide sugar metabolism, was enriched in the IM groups but not in the M groups. Previous studies suggested that high fibre intake has a positive impact on glucose and fat metabolism in humans [81]. In support of this, ATP-binding cassette (ABC) transporters such as K02018, K10823, K15580 and K15583 were also upregulated in IM. Previous studies suggested that Firmicutes, which were slightly more abundant in the IM group, encode ABC transporters that belong to the transport ATPase groups on the bacterial plasma membrane. These transporters, which are essential for transferring glucose across the plasma membrane [82], also help to transport anti-inflammatory butyrate resulting from bacterial digestion of dietary fibres [83]. Differences in the abundance of ABC transporters and glucokinase are shown in Fig. 7. Aspartate aminotransferase (AST) (K00812) was downregulated in the IM group compared to the M group. AST is an important biomarker for liver damage.
A few studies have shown that fibre-enriched diets reduce the level of AST [84]. We also identified 12 pathways as significantly different between the IM and M groups, namely fatty acid elongation saturated (FASYN-ELONG-PWY), super pathway of N-acetylglucosamine, N-acetylmannosamine and N-acetylneuraminate degradation (GLCMANNANAUT-PWY), lipid IVA biosynthesis (NAGLIPASYN-PWY), O-antigen building blocks biosynthesis (Escherichia coli) (OANTIGEN-PWY), acetylene degradation (P161-PWY), polyisoprenoid biosynthesis (E. coli) (POLYISOPRENSYN-PWY), urate biosynthesis/inosine 5′-phosphate degradation (PWY-5695), Kdo transfer to lipid IVA III (Chlamydia) (PWY-6467), guanosine ribonucleotides de novo biosynthesis (PWY-7221), super pathway of GDP-mannose-derived O-antigen building blocks biosynthesis (PWY-7323), super pathway of UDP-N-acetylglucosamine-derived O-antigen building blocks biosynthesis (PWY-7332) and super pathway of thiamine diphosphate biosynthesis I (THISYN-PWY) (Fig. 8). We repeated this differential analysis using the ALDEx2 option provided in Namco. ALDEx2 applies CLR transformation to the raw counts to address potential biases introduced through compositionality. Following this approach, no significant KO terms were identified. We hypothesize that CLR transformation might increase the specificity in functional analysis at the cost of sensitivity, suggesting that users need to carefully reflect on their method of choice when interpreting their results.
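The CLR transformation mentioned above can be sketched in a few lines of Python. Note the assumptions: ALDEx2 itself draws Monte Carlo instances from a Dirichlet distribution before transforming, whereas this sketch shows only a plain CLR with a pseudocount for zero counts; the function name, the pseudocount of 0.5 and the example counts are hypothetical choices for illustration.

```python
from math import log


def clr(counts, pseudocount=0.5):
    """Centred log-ratio transform of one sample's feature counts.

    Each value is the log of the (pseudocount-shifted) count minus the mean
    log over all features, i.e. the log-ratio to the geometric mean.
    """
    logs = [log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical KO counts for one sample at two sequencing depths.
shallow = [1, 2, 4]
deep = [10, 20, 40]

# Without a pseudocount the transform depends only on the ratios between
# features, so both depths give the same CLR values.
print(clr(shallow, pseudocount=0))
print(clr(deep, pseudocount=0))
```

The transformed values of a sample always sum to zero, so each feature is expressed relative to the sample's geometric mean rather than as an absolute quantity, which is how the transform addresses compositionality.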
Network analysis
Co-occurrence networks were analysed in Namco to characterize bacterial interactions in the IM and M groups, using the SPRING method for network construction (the default option). To focus on the most abundant ASVs, an abundance cutoff of 0.25% and a prevalence cutoff of 10% were applied. The 270 remaining ASVs were used as input for network comparison. The genus-level co-occurrence network for IM and M is illustrated in Fig. 9; it was generated using the SPRING [85] method as an association measure (nlambda and the number of replications were set to 50 and 100, respectively). Eigenvector centrality was used for defining hubs and for scaling node sizes. All global measures, including degree and eigenvector centrality, were compared, but no significant differences were observed for any of the four centrality measures. The two networks shared similar properties and no hub nodes were identified in either group. The largest differences in closeness centrality between the IM and M groups were observed for the genera CAG-56, Eubacterium coprostanoligenes, Lachnospira, Lachnospiraceae UCG-004 and Ruminococcaceae Incertae Sedis. Among these genera, only Ruminococcaceae Incertae Sedis was found to be significantly differentially abundant in the IM group in the previous analysis. Overall, no significant difference was observed between the genus-level networks of the IM and M groups.
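Two steps of this workflow, filtering by abundance and prevalence and scoring nodes by eigenvector centrality, can be illustrated with the Python sketch below. It is not the SPRING or Namco implementation. In particular, the reading of the cutoffs (a taxon is kept if its relative abundance reaches the abundance cutoff in at least the given fraction of samples) is an assumption, and the function names, the example counts and the toy network are hypothetical.

```python
from math import sqrt


def filter_taxa(counts, min_abundance=0.0025, min_prevalence=0.10):
    """Keep taxa reaching min_abundance in at least min_prevalence of samples.

    counts maps a taxon name to its list of per-sample read counts.
    """
    n_samples = len(next(iter(counts.values())))
    totals = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    kept = {}
    for taxon, row in counts.items():
        hits = sum(1 for c, t in zip(row, totals)
                   if t > 0 and c / t >= min_abundance)
        if hits / n_samples >= min_prevalence:
            kept[taxon] = row
    return kept


def eigenvector_centrality(adjacency, iterations=200):
    """Power iteration on (A + I) for an undirected, connected graph.

    Adding the identity leaves the leading eigenvector unchanged but
    prevents oscillation on bipartite graphs.
    """
    n = len(adjacency)
    x = [1.0] * n
    for _ in range(iterations):
        nxt = [x[i] + sum(adjacency[i][j] * x[j] for j in range(n))
               for i in range(n)]
        norm = sqrt(sum(v * v for v in nxt))
        x = [v / norm for v in nxt]
    return x


# Hypothetical counts: taxon "B" appears in only one of four samples.
counts = {"A": [50, 60, 40, 55], "B": [0, 1, 0, 0], "C": [50, 39, 60, 45]}
print(sorted(filter_taxa(counts, min_abundance=0.0025, min_prevalence=0.5)))

# Toy network: a triangle (nodes 0, 1, 2) with node 3 attached to node 2.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
print([round(c, 3) for c in eigenvector_centrality(adjacency)])
```

In the toy network, the node with the most well-connected neighbours receives the highest score, which is the property that makes eigenvector centrality a natural basis for defining hubs.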
Discussion
To gain biologically and clinically relevant insights from the large amount of available microbiome sequencing data, a plethora of algorithms, statistical methods and software packages have been developed. We implemented Namco to serve as a one-stop data analysis platform that performs raw data processing as well as basic and advanced downstream analyses of microbial datasets. Namco integrates previously available tools into a single coherent computational workflow and allows the user to construct, analyse and understand microbial composition in a fast and reproducible manner. The platform is intended to remove the need for command-line tools during data processing. Namco is accessible in the web browser and hence does not require the installation of any software packages. Namco also allows results from each analysis to be saved as an R session, which can be used to resume the analysis at any time, thus simplifying the sharing of research results. Since Namco is available as a Docker image, it can be conveniently installed locally or on a clinical server behind a firewall to facilitate the GDPR-compliant analysis of sensitive data without the need for an upload to the public Namco instance. A detailed comparison with other web-based tools (Fig. 1) showed that Namco offers a unique set of functions, such as time series clustering, functional profiling using PICRUSt2, confounder analysis and topic modelling.
We used a dietary intervention study as a case study to explore the features of Namco. We examined the association of fibre-rich dietary intake with the gut microbiota composition through basic and advanced analysis in Namco. We investigated differences in relative abundance between the IM and M groups, comparing the most abundant taxa and studying the intra-individual variation in the gut microbiome with respect to fibre-enriched diets. After exploring the datasets in terms of relative abundance and diversity analysis, we found that genera with significant differences between the IM and M groups were involved in the production of butyrate, an SCFA that helps to maintain homeostasis of the gut via anti-inflammatory and antimicrobial actions [86–88]. Namco not only provided information about differentially abundant taxa but also helped to determine the significantly different KO terms and pathways. Namco revealed that the IM group showed a positive association with the presence of glucokinase, which belongs to amino sugar and nucleotide sugar metabolism. This is consistent with reports that high fibre intake has a positive impact on glucose metabolism in humans and that long-term fibre intake improves glucose homeostasis [89]. Namco also identified ABC transporters, which play a major role in the transport of glucose across plasma membranes, as significantly associated with the IM group [82]. It was also possible to study microbial interactions by generating differential microbial co-occurrence networks at the genus level using Namco. The topological features of the resulting differential network from the IM and M groups showed only a slight difference in the estimated associations. Overall, Namco provides a much-needed interface to analyse microbial community data in a more intuitive way.
Conclusion
We present Namco, an R shiny application that provides end-to-end analysis of 16S rRNA gene sequencing data. We incorporate leading R packages for both upstream and downstream analysis in an efficient framework for researchers to characterize and understand the microbial community structure in their data, leading to valuable insights into the connection between the microbial community and phenotypes of interest. We plan to further expand Namco with support for novel analysis techniques and for correlation of microbial abundances with other data sources such as metabolomics and transcriptomics.
Funding information
This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Projektnummer 395357507 – SFB 1371. This project (J.B.) has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 777111. This publication reflects only the authors' view and the European Commission is not responsible for any use that may be made of the information it contains. In addition, J.B. was partially funded by his VILLUM Young Investigator Grant no. 13154. Parts of the study received funding by a grant from the German Ministry for Education and Research (BMBF, 01EA1409C). The preparation of this paper was supported by the enable Cluster.
Acknowledgements
We thank Luise Rauer and Claudia Hülpüsch from IEM Augsburg for constructive feedback during the development of Namco.
Author contributions
CRediT taxonomy from CASRAI: B.B., T.S. and H.H.; Investigation, Methodology, Resources, Validation: B.B., T.S., H.H., A.D. and M.M.; Data curation: A.D., M.Z., M.LA. and B.Ö.; Software: A.D. and M.M.; Formal analysis: A.D. and M.M; Visualization: M.M; Writing – original draft: M.LI. and S.R.; Conceptualization, Project administration, Supervision: D.H., J.B. and H.H.; Funding acquisition. All authors were involved in Writing – review & editing.
Conflicts of interest
M.L.I. receives consulting fees from mbiomics GmbH outside this work.
Ethical statement
The study protocol was approved by the ethics committee of the Faculty of Medicine of the Technical University of Munich in Germany (approval no. 529/16S). The guidelines of the International Conference on Harmonization of Good Clinical Practice and the World Medical Association Declaration of Helsinki (in the revised version of Fortaleza, Brazil 2013) were considered. All study participants gave written informed consent. The study was registered at the German Clinical Trial Register (DRKS00011526).
Footnotes
Abbreviations: ABC, ATP-binding cassette; AST, aspartate aminotransferase; ASV, amplicon sequence variant; BH, Benjamini–Hochberg; BMI, body mass index; CLR, centered log-ratio transform; FASYN-ELONG-PWY, fatty acid elongation saturated pathway; GDPR, General Data Protection Regulation; GLCMANNANAUT-PWY, super pathway of N-acetylglucosamine, N-acetylmannosamine and N-acetylneuraminate degradation; NAGLIPASYN-PWY, lipid IVA biosynthesis; NMDS, non-metric multidimensional scaling; OANTIGEN-PWY, O-antigen building blocks biosynthesis (Escherichia coli); OTU, operational taxonomic unit; PCoA, principal coordinates analysis; POLYISOPRENSYN-PWY, polyisoprenoid biosynthesis (E. coli); P161-PWY, acetylene degradation; ROC, receiver-operator-characteristic; SCFA, short-chain fatty acid.
References
- 1. Cho I, Blaser MJ. The human microbiome: at the interface of health and disease. Nat Rev Genet. 2012;13:260–270. doi: 10.1038/nrg3182.
- 2. Devaraj S, Hemarajata P, Versalovic J. The human gut microbiome and body metabolism: implications for obesity and diabetes. Clin Chem. 2013;59:617–628. doi: 10.1373/clinchem.2012.187617.
- 3. Sepich-Poore GD, Zitvogel L, Straussman R, Hasty J, Wargo JA, et al. The microbiome and human cancer. Science. 2021;371:eabc4552. doi: 10.1126/science.abc4552.
- 4. Glassner KL, Abraham BP, Quigley EMM. The microbiome and inflammatory bowel disease. J Allergy Clin Immunol. 2020;145:16–27. doi: 10.1016/j.jaci.2019.11.003.
- 5. Morais LH, Schreiber HL, Mazmanian SK. The gut microbiota-brain axis in behaviour and brain disorders. Nat Rev Microbiol. 2021;19:241–255. doi: 10.1038/s41579-020-00460-0.
- 6. Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26:2460–2461. doi: 10.1093/bioinformatics/btq461.
- 7. Callahan BJ, McMurdie PJ, Holmes SP. Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. ISME J. 2017;11:2639–2643. doi: 10.1038/ismej.2017.119.
- 8. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, et al. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods. 2016;13:581–583. doi: 10.1038/nmeth.3869.
- 9. Eren AM, Morrison HG, Lescault PJ, Reveillaud J, Vineis JH, et al. Minimum entropy decomposition: unsupervised oligotyping for sensitive partitioning of high-throughput marker gene sequences. ISME J. 2015;9:968–979. doi: 10.1038/ismej.2014.195.
- 10. Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, et al. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol. 2009;75:7537–7541. doi: 10.1128/AEM.01541-09.
- 11. Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat Biotechnol. 2019;37:852–857. doi: 10.1038/s41587-019-0209-9.
- 12. Bokulich NA, Dillon MR, Bolyen E, Kaehler BD, Huttley GA, et al. q2-sample-classifier: machine-learning tools for microbiome classification and regression. J Open Source Softw. 2018;3:934. doi: 10.21105/joss.00934.
- 13. Bokulich NA, Dillon MR, Zhang Y, Rideout JR, Bolyen E, et al. q2-longitudinal: Longitudinal and Paired-Sample Analyses of Microbiome Data. mSystems. 2018;3:e00219-18. doi: 10.1128/mSystems.00219-18.
- 14. Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R, et al. Analysis of composition of microbiomes: a novel method for studying microbial composition. Microb Ecol Health Dis. 2015;26:27663. doi: 10.3402/mehd.v26.27663.
- 15. McMurdie PJ, Holmes S. phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS One. 2013;8:e61217. doi: 10.1371/journal.pone.0061217.
- 16. McMurdie PJ, Holmes S. Shiny-phyloseq: Web application for interactive microbiome analysis with provenance tracking. Bioinformatics. 2015;31:282–283. doi: 10.1093/bioinformatics/btu616.
- 17. Woloszynek S, Mell JC, Zhao Z, Simpson G, O’Connor MP, et al. Themetagenomics: exploring thematic structure and predicted functionality of 16s rRNA amplicon data. Bioinformatics. 2019:678110. doi: 10.1101/678110.
- 18. Dhariwal A, Chong J, Habib S, King IL, Agellon LB, et al. MicrobiomeAnalyst: a web-based tool for comprehensive statistical, visual and meta-analysis of microbiome data. Nucleic Acids Res. 2017;45:W180–W188. doi: 10.1093/nar/gkx295.
- 19. Lagkouvardos I, Joseph D, Kapfhammer M, Giritli S, Horn M, et al. IMNGS: a comprehensive open resource of processed 16S rRNA microbial profiles for ecology and diversity studies. Sci Rep. 2016;6:33721. doi: 10.1038/srep33721.
- 20. Buza TM, Tonui T, Stomeo F, Tiambo C, Katani R, et al. iMAP: an integrated bioinformatics and visualization pipeline for microbiome data analysis. BMC Bioinformatics. 2019;20:374. doi: 10.1186/s12859-019-2965-4.
- 21. Wilke A, Bischof J, Gerlach W, Glass E, Harrison T, et al. The MG-RAST metagenomics database and portal in 2015. Nucleic Acids Res. 2016;44:D590–4. doi: 10.1093/nar/gkv1322.
- 22. Su S-C, Galvin JE, Yang S-F, Chung W-H, Chang L-C, et al. wiSDOM: a visual and statistical analytics for interrogating microbiome. Bioinformatics. 2021;37:2795–2797. doi: 10.1093/bioinformatics/btab057.
- 23. Huse SM, Mark Welch DB, Voorhis A, Shipunova A, Morrison HG, et al. VAMPS: a website for visualization and analysis of microbial population structures. BMC Bioinformatics. 2014;15. doi: 10.1186/1471-2105-15-41.
- 24. Aßhauer KP, Wemheuer B, Daniel R, Meinicke P. Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data. Bioinformatics. 2015;31:2882–2884. doi: 10.1093/bioinformatics/btv287.
- 25. Langille MGI, Zaneveld J, Caporaso JG, McDonald D, Knights D, et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol. 2013;31:814–821. doi: 10.1038/nbt.2676.
- 26. Douglas GM, Maffei VJ, Zaneveld JR, Yurgel SN, Brown JR, et al. PICRUSt2 for prediction of metagenome functions. Nat Biotechnol. 2020;38:685–688. doi: 10.1038/s41587-020-0548-6.
- 27. Özkurt E, Fritscher J, Soranzo N, Ng DYK, Davey RP, et al. LotuS2: an ultrafast and highly accurate tool for amplicon sequencing analysis. bioRxiv. 2021. https://www.biorxiv.org/content/biorxiv/early/2021/12/24/2021.12.24.474111
- 28. Edgar RC. UNOISE2: improved error-correction for Illumina 16S and ITS amplicon reads. bioRxiv. 2016.
- 29. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013;41:D590–6. doi: 10.1093/nar/gks1219.
- 30. Cao Q, Sun X, Rajesh K, Chalasani N, Gelow K, et al. Effects of rare microbiome taxa filtering on statistical analysis. Front Microbiol. 2020;11:607325. doi: 10.3389/fmicb.2020.607325.
- 31. McKnight DT, Huerlimann R, Bower DS, Schwarzkopf L, Alford RA, et al. Methods for normalizing microbiome data: an ecological perspective. Methods Ecol Evol. 2019;10:389–400. doi: 10.1111/2041-210X.13115.
- 32. Reitmeier S, Hitch TCA, Treichel N, Fikas N, Hausmann B, et al. Handling of spurious sequences affects the outcome of high-throughput 16S rRNA gene amplicon profiling. ISME COMMUN. 2021;1. doi: 10.1038/s43705-021-00033-z.
- 33. Gloor GB, Macklaim JM, Pawlowsky-Glahn V, Egozcue JJ. Microbiome datasets are compositional: and this Is not optional. Front Microbiol. 2017;8:2224. doi: 10.3389/fmicb.2017.02224.
- 34. Davis NM, Proctor DM, Holmes SP, Relman DA, Callahan BJ. Simple statistical identification and removal of contaminant sequences in marker-gene and metagenomics data. Microbiome. 2018;6:226. doi: 10.1186/s40168-018-0605-2.
- 35. Lauder AP, Roche AM, Sherrill-Mix S, Bailey A, Laughlin AL, et al. Comparison of placenta samples with contamination controls does not provide evidence for a distinct placenta microbiota. Microbiome. 2016;4:29. doi: 10.1186/s40168-016-0172-3.
- 36. Callahan BJ, DiGiulio DB, Goltsman DSA, Sun CL, Costello EK, et al. Replication and refinement of a vaginal microbial signature of preterm birth in two racially distinct cohorts of US women. Proc Natl Acad Sci U S A. 2017;114:9966–9971. doi: 10.1073/pnas.1705899114.
- 37. Lagkouvardos I, Fischer S, Kumar N, Clavel T. Rhea: a transparent and modular R pipeline for microbial profiling based on 16S rRNA gene amplicons. PeerJ. 2017;5:e2836. doi: 10.7717/peerj.2836.
- 38. Shannon CE. A mathematical theory of communication. Bell Syst Tech J. 1948;27:379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x.
- 39. Simpson EH. Measurement of diversity. Nature. 1949;163:688. doi: 10.1038/163688a0.
- 40. Chao A, Chiu C-H, Jost L. Phylogenetic diversity measures based on Hill numbers. Philos Trans R Soc Lond B Biol Sci. 2010;365:3599–3609. doi: 10.1098/rstb.2010.0272.
- 41. Lozupone C, Knight R. UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol. 2005;71:8228–8235. doi: 10.1128/AEM.71.12.8228-8235.2005.
- 42. Bray JR, Curtis JT. An Ordination of the upland forest communities of Southern Wisconsin. Ecol Monogr. 1957;27:325–349. doi: 10.2307/1942268.
- 43. Zuur AF, Ieno EN, Smith GM, editors. Analysing Ecological Data. New York, NY: Springer New York; 2007. Principal coordinate analysis and non-metric multidimensional scaling; pp. 259–264.
- 44. Oksanen J, Kindt R, Legendre P, O’Hara B, Stevens MHH, et al. The vegan package. Community ecology package. 2007;10(631637):719. https://www.researchgate.net/profile/Gavin_Simpson/publication/228339454_The_vegan_Package/links/0912f50be86bc29a7f000000/The-vegan-Package.pdf
- 45. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc. 1995;57:289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x.
- 46. Wirbel J, Zych K, Essex M, Karcher N, Kartal E, et al. Microbiome meta-analysis and cross-disease comparison enabled by the SIAMCAT machine learning toolbox. Genome Biol. 2021;22:93. doi: 10.1186/s13059-021-02306-1.
- 47. Hawinkel S, Mattiello F, Bijnens L, Thas O. A broken promise: microbiome differential abundance methods do not control the false discovery rate. Brief Bioinform. 2019;20:210–221. doi: 10.1093/bib/bbx104.
- 48. Wirbel J, Pyl PT, Kartal E, Zych K, Kashani A, et al. Meta-analysis of fecal metagenomes reveals global microbial signatures that are specific for colorectal cancer. Nat Med. 2019;25:679–689. doi: 10.1038/s41591-019-0406-6.
- 49. Blei DM, Lafferty JD. A correlated topic model of Science. Ann Appl Stat. 2007;1. doi: 10.1214/07-AOAS114.
- 50. Woloszynek S, Mell JC, Zhao Z, Simpson G, O’Connor MP, et al. Exploring thematic structure and predicted functionality of 16S rRNA amplicon data. PLoS One. 2019;14:e0219235. doi: 10.1371/journal.pone.0219235.
- 51. Fernandes AD, Macklaim JM, Linn TG, Reid G, Gloor GB. ANOVA-like differential expression (ALDEx) analysis for mixed population RNA-Seq. PLoS One. 2013;8:e67019. doi: 10.1371/journal.pone.0067019.
- 52. Nearing JT, Douglas GM, Hayes M, MacDonald J, Desai D, et al. Microbiome differential abundance methods produce disturbingly different results across 38 datasets. Bioinformatics. 2021:2021. doi: 10.1101/2021.05.10.443486.
- 53. Hooper LV, Littman DR, Macpherson AJ. Interactions between the microbiota and the immune system. Science. 2012;336:1268–1273. doi: 10.1126/science.1223490.
- 54. Matchado MS, Lauber M, Reitmeier S, Kacprowski T, Baumbach J, et al. Network analysis methods for studying microbial communities: a mini review. Comput Struct Biotechnol J. 2021;19:2687–2698. doi: 10.1016/j.csbj.2021.05.001.
- 55. Peschel S, Müller CL, von Mutius E, Boulesteix A-L, Depner M. NetCoMi: network construction and comparison for microbiome data in R. Bioinformatics. 2020. doi: 10.1101/2020.07.15.195248.
- 56. Clayden A. Causal relationships in medicine: a practical system for critical appraisal. Ann Intern Med. 1991;114:916. doi: 10.7326/0003-4819-114-10-916_1.
- 57. Vujkovic-Cvijin I, Sklar J, Jiang L, Natarajan L, Knight R, et al. Host variables confound gut microbiota studies of human disease. Nature. 2020;587:448–454. doi: 10.1038/s41586-020-2881-9.
- 58. Wright MN, Ziegler A. Ranger: a fast implementation of random forests for high dimensional data in C++ and R. J Stat Softw Articles. 2017;77:1–17.
- 59. Asnicar F, Berry SE, Valdes AM, Nguyen LH, Piccinno G, et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nat Med. 2021;27:321–332. doi: 10.1038/s41591-020-01183-8.
- 60. Polster SP, Sharma A, Tanes C, Tang AT, Mericko P, et al. Permissive microbiome characterizes human subjects with a neurovascular disease cavernous angioma. Nat Commun. 2020;11:2659. doi: 10.1038/s41467-020-16436-w.
- 61. Brandl B, Skurk T, Rennekamp R, Hannink A, Kiesswetter E, et al. A phenotyping platform to characterize healthy individuals across four stages of life - The Enable Study. Front Nutr. 2020;7:582387. doi: 10.3389/fnut.2020.582387.
- 62. Rennekamp R, Brandl B, Giesbertz P, Skurk T, Hauner H. Metabolic and satiating effects and consumer acceptance of a fibre-enriched Leberkas meal: a randomized cross-over trial. Eur J Nutr. 2021;60:3203–3210. doi: 10.1007/s00394-020-02472-1.
- 63. Reitmeier S, Kiessling S, Clavel T, List M, Almeida EL, et al. Arrhythmic gut microbiome signatures predict risk of type 2 diabetes. Cell Host Microbe. 2020;28:258–272. doi: 10.1016/j.chom.2020.06.004.
- 64. McRae MP. Dietary fiber is beneficial for the prevention of cardiovascular disease: an umbrella review of meta-analyses. J Chiropr Med. 2017;16:289–299. doi: 10.1016/j.jcm.2017.05.005.
- 65. Anderson JW, Baird P, Davis RH, Ferreri S, Knudtson M, et al. Health benefits of dietary fiber. Nutr Rev. 2009;67:188–205. doi: 10.1111/j.1753-4887.2009.00189.x.
- 66. Aune D, Chan DSM, Lau R, Vieira R, Greenwood DC, et al. Dietary fibre, whole grains, and risk of colorectal cancer: systematic review and dose-response meta-analysis of prospective studies. BMJ. 2011;343:d6617. doi: 10.1136/bmj.d6617.
- 67. Xu X, Zhu Y, Li J, Wang S. Dietary fiber, glycemic index, glycemic load and renal cell carcinoma risk. Carcinogenesis. 2019;40:441–447. doi: 10.1093/carcin/bgz049.
- 68. da Cruz AG, Senaka Ranadheera C, Nazzaro F, Mortazavian A. Probiotics and Prebiotics in Foods: Challenges, Innovations, and Advances. Academic Press; 2021. p. 346. https://play.google.com/store/books/details?id=6WIFEAAAQBAJ
- 69. Bailén M, Bressa C, Martínez-López S, González-Soltero R, Montalvo Lominchar MG, et al. Microbiota features associated with a high-fat/low-fiber diet in healthy adults. Front Nutr. 2020;7:583608. doi: 10.3389/fnut.2020.583608.
- 70. Aoe S, Nakamura F, Fujiwara S. Effect of wheat bran on fecal butyrate-producing bacteria and wheat bran combined with barley on Bacteroides abundance in Japanese healthy adults. Nutrients. 2018;10:E1980. doi: 10.3390/nu10121980.
- 71. Van den Abbeele P, Belzer C, Goossens M, Kleerebezem M, De Vos WM, et al. Butyrate-producing clostridium cluster XIVa species specifically colonize mucins in an in vitro gut model. ISME J. 2013;7:949–961. doi: 10.1038/ismej.2012.158.
- 72. Menni C, Jackson MA, Pallister T, Steves CJ, Spector TD, et al. Gut microbiome diversity and high-fibre intake are related to lower long-term weight gain. Int J Obes. 2017;41:1099–1105. doi: 10.1038/ijo.2017.66.
- 73. Higgins JA, Higbee DR, Donahoo WT, Brown IL, Bell ML, et al. Resistant starch consumption promotes lipid oxidation. Nutr Metab. 2004;1:8. doi: 10.1186/1743-7075-1-8.
- 74. Kverka M, Zakostelska Z, Klimesova K, Sokol D, Hudcovic T, et al. Oral administration of Parabacteroides distasonis antigens attenuates experimental murine colitis through modulation of immunity and microbiota composition. Clin Exp Immunol. 2011;163:250–259. doi: 10.1111/j.1365-2249.2010.04286.x.
- 75. Wang K, Liao M, Zhou N, Bao L, Ma K, et al. Parabacteroides distasonis alleviates obesity and metabolic dysfunctions via production of succinate and secondary bile acids. Cell Rep. 2019;26:222–235. doi: 10.1016/j.celrep.2018.12.028.
- 76. Qin J, Li Y, Cai Z, Li S, Zhu J, et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature. 2012;490:55–60. doi: 10.1038/nature11450.
- 77. Cuffaro B, Assohoun ALW, Boutillier D, Súkeníková L, Desramaut J, et al. In vitro characterization of gut microbiota-derived commensal strains: selection of Parabacteroides distasonis strains alleviating TNBS-induced colitis in mice. Cells. 2020;9:E2104. doi: 10.3390/cells9092104.
- 78. Falony G, Joossens M, Vieira-Silva S, Wang J, Darzi Y, et al. Population-level analysis of gut microbiome variation. Science. 2016;352:560–564. doi: 10.1126/science.aad3503.
- 79. Ezeji JC, Sarikonda DK, Hopperton A, Erkkila HL, Cohen DE, et al. Parabacteroides distasonis: intriguing aerotolerant gut anaerobe with emerging antimicrobial resistance and pathogenic and probiotic roles in human health. Gut Microbes. 2021;13:1922241. doi: 10.1080/19490976.2021.1922241.
- 80. Tian T, Zhang X, Luo T, Wang D, Sun Y, et al. Effects of short-term dietary fiber intervention on gut microbiota in young healthy people. Diabetes Metab Syndr Obes. 2021;14:3507–3516. doi: 10.2147/DMSO.S313385.
- 81. Shortt C, Hasselwander O, Meynier A, Nauta A, Fernández EN, et al. Systematic review of the effects of the intestinal microbiota on selected nutrients and non-nutrients. Eur J Nutr. 2018;57:25–49. doi: 10.1007/s00394-017-1546-4.
- 82. Sun B, Hou L, Yang Y. Effects of altered dietary fiber on the gut microbiota, short-chain fatty acids and cecum of chickens during different growth periods. Biology (Basel). 2020. doi: 10.20944/preprints202002.0109.v1.
- 83. Andersen V, Svenningsen K, Knudsen LA, Hansen AK, Holmskov U, et al. Novel understanding of ABC transporters ABCB1/MDR/P-glycoprotein, ABCC2/MRP2, and ABCG2/BCRP in colorectal pathophysiology. World J Gastroenterol. 2015;21:11862–11876. doi: 10.3748/wjg.v21.i41.11862.
- 84. Abazarfard Z, Eslamian G, Salehi M, Keshavarzi S. A randomized controlled trial of the effects of an almond-enriched, hypocaloric diet on liver function tests in overweight/obese women. Iran Red Crescent Med J. 2016;18:e23628. doi: 10.5812/ircmj.23628.
- 85. Yoon G, Gaynanova I, Müller CL. Microbial networks in SPRING - Semi-parametric rank-based correlation and partial correlation estimation for quantitative microbiome data. Front Genet. 2019;10:516. doi: 10.3389/fgene.2019.00516.
- 86. Corrêa-Oliveira R, Fachi JL, Vieira A, Sato FT, Vinolo MAR. Regulation of immune cell function by short-chain fatty acids. Clin Transl Immunology. 2016;5:e73. doi: 10.1038/cti.2016.17.
- 87. Donohoe DR, Garge N, Zhang X, Sun W, O’Connell TM, et al. The microbiome and butyrate regulate energy metabolism and autophagy in the mammalian colon. Cell Metab. 2011;13:517–526. doi: 10.1016/j.cmet.2011.02.018.
- 88. Wang G. Human antimicrobial peptides and proteins. Pharmaceuticals. 2014;7:545–594. doi: 10.3390/ph7050545.
- 89. Fukagawa NK, Anderson JW, Hageman G, Young VR, Minaker KL. High-carbohydrate, high-fiber diets increase peripheral insulin sensitivity in healthy young and old adults. Am J Clin Nutr. 1990;52:524–528. doi: 10.1093/ajcn/52.3.524.