This is a preprint.

It has not yet been peer reviewed by a journal.

[Preprint]. 2022 Jun 7:rs.3.rs-1717624. [Version 1] doi: 10.21203/rs.3.rs-1717624/v1

Metagenomic assessment of gut microbial communities and risk of severe COVID-19

Peggy Lai 1, Long Nguyen 2, Daniel Okin 3, David Drew 4, Vincent Battista 5, Sirus Jesudasen 6, Thomas Kuntz 7, Amrisha Bhosle 8, Kelsey Thompson 9, Trenton Reinicke 10, Chun-Han Lo 11, Jacqueline Woo 12, Alexander Caraballo 13, Lorenzo Berra 14, Jacob Vieira 15, Ching-Ying Huang 16, Upasana Das Adhikari 17, Minsik Kim 18, Hui-Yu Sui 19, Marina Magicheva-Gupta 20, Lauren McIver 21, Marcia Goldberg 22, Douglas Kwon 23, Curtis Huttenhower 24, Andrew Chan 25
PMCID: PMC9176657  PMID: 35677075

Abstract

The gut microbiome is a critical modulator of host immunity and is linked to the immune response to respiratory viral infections. However, few studies have gone beyond describing broad compositional alterations in severe COVID-19, defined as acute respiratory or other organ failure. We profiled 127 hospitalized patients with COVID-19 (n=79 with severe COVID-19 and 48 with moderate) who collectively provided 241 stool samples from April 2020 to May 2021 to identify links between COVID-19 severity and gut microbial taxa, their biochemical pathways, and stool metabolites. 48 species were associated with severe disease after accounting for antibiotic use, age, sex, and various comorbidities. These included significant in-hospital depletions of Fusicatenibacter saccharivorans and Roseburia hominis, each previously linked to post-acute COVID syndrome or “long COVID”, suggesting these microbes may serve as early biomarkers for the eventual development of long COVID. A random forest classifier achieved excellent performance when tasked with predicting whether stool was obtained from patients with severe vs. moderate COVID-19. Dedicated network analyses demonstrated fragile microbial ecology in severe disease, characterized by fracturing of clusters and reduced negative selection. We also observed shifts in predicted stool metabolite pools, implicating perturbed bile acid metabolism in severe disease. Here, we show that the gut microbiome differentiates individuals with a more severe disease course after infection with COVID-19 and offer several tractable and biologically plausible mechanisms through which gut microbial communities may influence COVID-19 disease course. Further studies are needed to validate these observations to better leverage the gut microbiome as a potential biomarker for disease severity and as a target for therapeutic intervention.

Introduction

Over 530 million individuals worldwide have been infected with SARS-CoV-2 and developed coronavirus disease-2019 (COVID-19), culminating in more than 6 million lives lost1. The gut microbiome is a critical modulator of host immunity2 and affects the immune response to respiratory viral infections (e.g., influenza A virus subtype H1N1, Severe Acute Respiratory Syndrome [SARS], and Middle East Respiratory Syndrome)3–6. Several early studies have explored the link between broad alterations in gut microbial communities and COVID-19, demonstrating the generalized enrichment of opportunistic pathogens and depletion of commensals7–18.

Most prior studies have focused on the presence, absence, or differential abundance of specific microbes in COVID-19,7,9–16, and few have interrogated microbial network dynamics to identify which co-occurring or co-excluded species are foundational to maintaining microbial homeostasis. This represents a missed opportunity to identify potential bacterial targets to restore a more favorable, health-promoting gut configuration. Similarly, other studies have not considered how these shifts might influence gut metabolite pools. Finally, prior studies of the gut microbiome in COVID-19 have largely sought to characterize the differences between healthy controls and infected patients rather than between patients with moderate and severe disease7,10–12,14,16. Establishing a predictive biomarker of disease severity may improve early identification of at-risk patient populations that require immediate intervention or those that are more likely to benefit from effective antiviral therapies19.

It remains unclear what role the gut microbiome plays in regulating the severity of COVID-19 in hospitalized patients and what specific microbially-mediated mechanisms may underlie this relationship. To address these questions, we conducted a study of hospitalized patients with COVID-19 at a U.S. tertiary medical center. Using metagenomic profiling of fecal samples collected from these patients, we demonstrate significant depletions of Fusicatenibacter saccharivorans and Roseburia hominis in severe COVID-19, reductions of which have previously been linked to post-acute COVID-19 syndrome (PASC) or long COVID18,20. Strikingly, we observed these declines during patients’ index hospitalizations, suggesting the presence of an early microbial signal that may predict the development of a long-term complication. We further use network analysis to identify several critical taxa central to maintaining a gut microbial configuration less likely to be found in severe COVID-19 and perform complementary predicted metabolite analyses to further link these changes to alterations in bile acid pool and short-chain fatty acid (SCFA) levels, offering biologically plausible mechanisms to explain the link between gut microbial communities and COVID-19 disease severity.

Results

Participant characteristics and overall gut community structure

From April 2020 to May 2021, we prospectively enrolled hospitalized patients aged ≥ 18 years with confirmed COVID-19 at the Massachusetts General Hospital in a longitudinal COVID-19 disease surveillance study. Patients were categorized as having severe COVID-19 if they required admission to the intensive care unit with acute respiratory failure (the need for oxygen supplementation ≥ 15 liters per minute (LPM), non-invasive positive pressure ventilation, or mechanical ventilation) or other organ failure (such as shock requiring vasopressor initiation)21. Otherwise, they were categorized as having moderate COVID-19. We enrolled 127 hospitalized COVID-19 patients, of whom 79 (62.2%) had severe disease and 48 (37.8%) had moderate disease. Collectively, they provided 241 stool samples (Fig. 1a). No statistically significant differences were observed between severity groups in age, sex, race, ethnicity, various comorbidities, or smoking history (Suppl. Table 1). Patients with severe COVID-19 had a higher mean body mass index (BMI) as well as higher Simplified Acute Physiology Score II (SAPS II)22 and Sequential Organ Failure Assessment (SOFA) scores23, each a validated clinical assessment tool used to stratify hospitalized patients by risk of mortality24,25. Severe COVID-19 patients more frequently received antibiotics, antivirals, and ICU therapies. Patients with severe COVID-19 had higher 90-day mortality compared to those with moderate disease (22.8% vs. 4.2%, p-value = 0.01).
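The severity definition above can be written as a simple rule. The following is a minimal sketch, assuming one reading of the definition (ICU admission together with either acute respiratory failure or other organ failure); the `Admission` record and its field names are hypothetical and are not taken from the study's data dictionary.

```python
from dataclasses import dataclass


@dataclass
class Admission:
    """Hypothetical per-patient record; field names are illustrative only."""
    icu_admission: bool
    oxygen_lpm: float              # peak supplemental oxygen, liters per minute
    nippv: bool                    # non-invasive positive pressure ventilation
    mechanical_ventilation: bool
    other_organ_failure: bool      # e.g., shock requiring vasopressors


def classify_severity(a: Admission) -> str:
    """Return 'severe' or 'moderate' under the assumed reading of the definition."""
    respiratory_failure = (
        a.oxygen_lpm >= 15 or a.nippv or a.mechanical_ventilation
    )
    if a.icu_admission and (respiratory_failure or a.other_organ_failure):
        return "severe"
    return "moderate"


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)
```

With the reported counts, `percent(79, 127)` and `percent(48, 127)` reproduce the 62.2% and 37.8% split given above.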

Figure 1. Study overview and overall community structure.

a. Study enrollment of hospitalized patients with confirmed COVID-19 with weekly stool sampling until the time of discharge or death, whichever occurred first. b. Marked reduction in species richness and evenness in severe COVID-19 (inverse Simpson α-diversity metric, p-value <0.0001 from multivariable linear modeling adjusting for age, sex, prior antibiotic use, race, ethnicity, body mass index, Charlson Comorbidity Index, use of remdesivir or corticosteroids, days since admission, SARS-CoV-2 stool viral load, sequencing depth, and a participant-level random effect). Boxes represent median and interquartile range, while whiskers represent 95%ile. c. Community-level disturbances in severe vs. moderate COVID-19 as depicted by joint ordination and principal coordinates analysis (PCoA), not fully explained by characteristic trade-offs in Bacteroidetes/Firmicutes or prior antibiotic use.

Gut microbial diversity was significantly reduced in severe COVID-19 after adjusting for factors such as recent antibiotic use (Fig. 1b, p-value < 0.001). Overall gut community structure also appeared to differ based on COVID-19 disease severity (multivariable R2 = 2.4%, p-value = 0.002), a finding not fully explained by characteristic trade-offs along the Bacteroidetes/Firmicutes axes of variation or prior antibiotic usage (Fig. 1c).

Differential abundance testing

Using multivariable linear mixed-effects modeling accounting for age, sex, antibiotic use, race/ethnicity, SARS-CoV-2 stool viral load, and other relevant clinical metadata (Methods), we observed statistically significant differences in 48 species-level taxa between severe and moderate COVID-19 (FDR-corrected p-value < 0.05, Fig. 2a & Suppl. Table 2). All but two of these taxa (Candida albicans & Enterococcus faecalis) were relatively depleted in severe disease (Fig. 2a & 2b), a trend concordant with the observed decrease in species richness and evenness. We identified significant depletions of Fusicatenibacter saccharivorans and Roseburia hominis (Fig. 2b), consistent with prior work showing the relative contraction of each in patients with post-acute COVID-19 syndrome (PASC), also known as “long COVID”18,20. Eight taxa were positively associated with stool SARS-CoV-2 viral load, including several linked to pro-inflammatory sulfur metabolism, such as Methanobrevibacter smithii and Bilophila wadsworthia, as well as several Alistipes spp. (Suppl. Table 2). Interestingly, an expansion of R. hominis was associated with increased stool viral load despite a corresponding decrease among patients with severe COVID-19, suggesting an interaction between stool SARS-CoV-2 viral load, R. hominis, and severe COVID-19 (Suppl. Table 2). Corresponding to community-wide depletions in microbial diversity, biochemical pathways encoded by gut bacteria were also significantly altered in severe COVID-19, including reductions in amino acid biosynthesis (e.g., glutamine synthesis), isoprenoid biosynthesis, and short-chain fatty acid (SCFA) production pathways, such as glycerol degradation, acetyl-CoA fermentation, and methanogenesis from acetate (Suppl. Table 3 & Suppl. Figure 1).
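The text reports FDR-corrected p-values without naming the correction procedure in this section. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the function name is ours, and the authors' pipeline may differ.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here purely for illustration; the source only
    states that p-values were FDR-corrected.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_features(names, p_values, alpha=0.05):
    """Names of features whose adjusted p-value falls below alpha."""
    adjusted = benjamini_hochberg(p_values)
    return [n for n, q in zip(names, adjusted) if q < alpha]
```

Applied to the per-species p-values from a mixed-effects model, `significant_features` would return the taxa passing the FDR-corrected threshold of 0.05 used above.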

Figure 2. Taxonomic depletions linked to COVID-19 severity.

a. Volcano plot of species-level expansions and depletions linked to severe vs. moderate COVID-19. Effect sizes (β-coefficients) from multivariable linear modeling plotted against FDR-corrected p-value. Full results in Suppl. Table 2. b. Highlighted box and scatter plots of taxon abundance by COVID-19 severity. For visualization purposes, technical/true 0s were imputed with a given taxon’s minimum non-zero value. Boxes represent median and interquartile ranges, while whiskers represent 95%ile.

Machine learning to predict severe COVID-19

Given our findings of both community-wide and feature-level alterations linked to severe COVID-19, we next used a machine learner to predict whether metagenomic features could serve as inputs to classify samples derived from patients with severe vs. moderate COVID-19. To assess whether non-microbial metadata (i.e., participant characteristics) should be jointly considered with microbial taxa in training our classifier, we generated an entropy heatmap to quantify the unique row-wise information with respect to column-wise data (in which non-informative variables would have a value of 0). As all the covariates used in our prior linear modeling (Methods) contributed unique information to label/disease severity prediction (Suppl. Figure 2), each was included in our machine learning workflow.

Using both differentially abundant microbial features and clinical characteristics as our input (Fig. 3a), our random forest classifier achieved an area under the receiver operating characteristic curve (AUROC) of 0.925 when tasked with predicting whether stool was obtained from patients with severe vs. moderate COVID-19 (Fig. 3b & Fig. 3c). Performance was only modestly attenuated when the model was trained without clinical metadata (AUROC 0.922) or without stool SARS-CoV-2 viral load (AUROC 0.923). To assess the robustness of this result, we trained our model using only the top 20 differentially abundant microbial features, which only modestly degraded task performance (AUROC 0.898). Finally, although we ensured that samples from the same individual were confined to a single cross-validation fold, we further minimized the possibility of overfitting to personalized gut microbial communities by training and assessing our model using only the first stool sample from each participant, which again performed well (AUROC 0.871), further supporting the role of metagenomic profiling as a diagnostic biomarker for disease severity.
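Two evaluation details in this paragraph lend themselves to a short illustration: the AUROC itself, and keeping all samples from one participant in a single cross-validation fold. The sketch below is not the authors' pipeline. It computes AUROC as the probability that a randomly chosen severe sample scores higher than a randomly chosen moderate one, and assigns folds by participant with a simple round-robin scheme. The function names and the round-robin assignment are assumptions made for illustration.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison; ties count as half a win.

    labels: 1 for severe, 0 for moderate. scores: classifier outputs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def grouped_folds(participant_ids, n_folds):
    """Assign a fold to each sample so that a participant never spans folds."""
    unique_ids = sorted(set(participant_ids))
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique_ids)}
    return [fold_of[pid] for pid in participant_ids]


def first_sample_indices(participant_ids):
    """Indices of the first sample per participant (input assumed time-ordered)."""
    seen = set()
    keep = []
    for i, pid in enumerate(participant_ids):
        if pid not in seen:
            seen.add(pid)
            keep.append(i)
    return keep
```

`first_sample_indices` mirrors the sensitivity analysis restricted to each participant's first stool sample, under the assumption that samples are listed in collection order.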

Figure 3. Stool-based classifier for COVID-19 disease severity.

a. Box and scatter plots of the top 50 microbial features and their differential abundance by COVID-19 severity with barplots indicating univariate/nominal p-value, fold change by study group, prevalence, and taxa-level contribution to area under the curve for a random forest-based machine learner. b. Receiver operating characteristic (ROC) and precision-recall curves demonstrating excellent performance in classifying stool samples by COVID-19 severity. The removal of stool SARS-CoV-2 viral load and clinical metadata resulted in only modestly decreased task performance, as did limiting our input to only the top 20 differentially abundant microbes by disease class. A sensitivity analysis using only the first provided stool from each participant, which should minimize the possibility of overfitting due to repeated measures and longitudinal sampling, still performed well.

Systems approaches to interrogate microbial assemblages

To explore the possible biological mechanisms underlying our observations, we next sought to characterize whether ecological networks were significantly altered based on COVID-19 severity (Methods). We hypothesized that the community-wide and feature-level alterations observed in moderate vs. severe COVID-19 would change microbial network topology. First, we evaluated global microbial network properties. The adjusted Rand Index (ARI) is a measure of similarities in clustering, quantifying the likelihood that pairs of microbial clades would be assigned to the same cluster in both networks. An ARI value of 0 indicates random clustering across comparator groups, a value of 1 indicates identical clustering, and a value of −1 indicates perfect disagreement26,27. When comparing moderate to severe COVID-19, the ARI was 0.199 (p-value < 0.001). Jaccard’s index (JI) evaluates differences among central nodes between our two severity-specific networks, where a value of 0 indicates completely different sets of central nodes and a value of 1 indicates identical central nodes28. While there were no statistically significant differences in overall centrality measures when comparing moderate to severe cases, there were alterations in proportion of positive edges network-wide (92.9% vs 100%, p-value < 0.001), indicating a loss of moderate negative correlations in severe COVID-19. For example, C. albicans, which was relatively more abundant in severe compared to moderate COVID-19, has 0 vs. 3 negative edges in each disease state, respectively, raising the possibility that the loss of negative selective pressure can promote the growth of certain microbial clades in severe COVID-19.
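For readers unfamiliar with the two global comparison measures described above, the following minimal sketch computes the adjusted Rand index from two cluster assignments of the same taxa and the Jaccard index from two sets of central nodes. It uses the standard pair-counting formula for the ARI; the function names are ours, and the sketch does not reproduce the permutation-based p-values reported in the text.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """ARI between two clusterings of the same items (pair-counting form)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("clusterings must cover the same items")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items")
    # Pairs co-clustered in both partitions, in A only, and in B only.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    in_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = in_a * in_b / comb(n, 2)
    maximum = (in_a + in_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivially identical
    return (both - expected) / (maximum - expected)


def jaccard_index(set_a, set_b):
    """Size of the intersection divided by size of the union."""
    set_a, set_b = set(set_a), set(set_b)
    if not set_a and not set_b:
        raise ValueError("Jaccard index is undefined for two empty sets")
    return len(set_a & set_b) / len(set_a | set_b)
```

For example, `jaccard_index` applied to the hub sets of the moderate and severe networks would quantify how many central taxa the two networks share.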

We identified 16 taxa as network hubs, i.e., species with high putative importance given their centrality to the surrounding microbial networks (Fig. 4 & Suppl. Table 4a). Five species were identified as hubs in both moderate and severe disease (Blautia wexlerae, Eubacterium hallii, Gordonibacter pamelaeae, Odoribacter splanchnicus, and Alistipes shahii), while 11 were unique to one network or the other (Suppl. Table 4b, Suppl. Table 4c, & Suppl. Figure 3). Critically, 9 of these 16 identified hubs, including Blautia wexlerae and Eubacterium hallii, were shown to be differentially abundant by disease severity (Suppl. Table 2), and the relative abundances of two hubs, Eubacterium rectale and Alistipes putredinis, were associated with stool viral load. We further observed that highly connected clusters in moderate disease become fragmented in severe COVID-19, as evidenced by an increase in singletons, a decrease in the number of hubs, and dynamic taxa-level cluster reassignment (Fig. 4). Notably, all but one of the hubs shown to be differentially abundant by disease severity belonged to the same cluster, suggesting that significant loss of these central taxa in severe disease may contribute to the observed network instability.

Figure 4. Comparative microbial assemblages in moderate vs. severe COVID-19.

We assembled discrete microbial networks for moderate vs. severe disease to demonstrate significant ecological heterogeneity characterized by fractured clustering and taxa-level reassignment in severe disease. Species are represented by circles (nodes) and species-species correlations were weighted by strength of correlation (edges drawn if absolute Pearson’s ρ>0.4). Node size indicates normalized relative abundance, and node colors indicate cluster membership. Cluster colors are retained across networks if two or more taxa are shared. Edge color reflects the direction of correlation, with red edges indicating a negative, and green edges indicating a positive correlation, respectively. Hubs have been numbered, while clusters are referred to by their nominate node, or the taxa with the highest edge count in a given cluster by network.
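The edge rule in this caption (an edge is drawn when the absolute correlation exceeds 0.4, colored by sign) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' network workflow: it uses a plain Pearson correlation on untransformed abundances, and the function names and the `abundance` dictionary layout are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def build_edges(abundance, threshold=0.4):
    """List (taxon_a, taxon_b, r) for every pair with |r| above the threshold.

    abundance: dict mapping taxon name -> abundances across the same samples.
    """
    edges = []
    for a, b in combinations(sorted(abundance), 2):
        r = pearson(abundance[a], abundance[b])
        if abs(r) > threshold:
            edges.append((a, b, r))
    return edges


def positive_edge_fraction(edges):
    """Share of edges with a positive correlation (green edges in Fig. 4)."""
    if not edges:
        raise ValueError("network has no edges")
    return sum(1 for _, _, r in edges if r > 0) / len(edges)
```

Building one edge list per severity group and comparing `positive_edge_fraction` between them corresponds to the network-wide comparison of positive edges (92.9% vs. 100%) reported in the Results.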

Predicted stool metabolites linked to disease severity

We next sought to evaluate whether changes in microbial communities affected local metabolite production. Using a validated computational workflow to generate putative metabolic profiles from stool metagenomes29 (Methods), we found 57 of 80 well-predicted known stool metabolites to be differentially abundant based on COVID-19 disease severity (all FDR-corrected p-value < 0.05; Fig. 5a & Suppl. Table 5). We identified the perturbation of bile acid metabolism in severe COVID-19, with relative enrichment of primary bile acids (chenodeoxycholate, cholate, and ketodeoxycholate) alongside depletion of secondary bile acids (lithocholate, lithocholic acid, and deoxycholic acid) (Fig. 5b). Similar to our microbial pathway analysis which revealed reductions in MetaCyc pathways related to SCFA production, predicted levels of butyrate, isobutyrate, and propionate were also reduced in severe COVID-19 (Suppl. Table 5). Furthermore, we confirmed prior data showing relative enrichment of bilirubin30, creatine and polyamines (e.g., acetyl-spermidine31), and pantothenic acid32 in severe COVID-19, as well as a relative depletion of deoxyinosine32 (Suppl. Table 5).

Figure 5. Predicted stool metabolite profiles.

a. Volcano plot of enrichments and depletions in predicted stool metabolites linked to severe compared to moderate COVID-19. Adjusted log2 fold change calculated from β-coefficients extracted from multivariable linear modeling plotted against FDR-corrected p-value. Full results in Suppl. Table 5. b. Highlighted box and scatter plots of predicted metabolite abundance by COVID-19 severity. For visualization purposes, technical/true 0s were imputed with a given metabolite’s minimum non-zero value prior to log-transformation. Boxes represent medians and interquartile ranges, while whiskers represent 95%ile.
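The zero-handling step described in this caption can be sketched in a few lines. The example below replaces zeros with the feature's minimum non-zero value and then log-transforms; the base-10 logarithm and the function name are assumptions, since the caption does not state the base used for plotting.

```python
import math


def impute_zeros_and_log(values):
    """Replace zeros with the minimum non-zero value, then take log10.

    Assumption: base-10 log is used for illustration only.
    """
    nonzero = [v for v in values if v > 0]
    if not nonzero:
        raise ValueError("no non-zero values to impute from")
    floor = min(nonzero)
    return [math.log10(v if v > 0 else floor) for v in values]
```

Because the imputed values sit exactly at the feature's observed minimum, this step is suitable for visualization but would distort variance if used for inference.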

Discussion

In a large U.S. hospital-based cohort of diverse patients admitted with confirmed COVID-19 during the initial year of the pandemic, we found community- and species-level alterations linked to disease severity. Using a random forest machine learner, these microbial features could accurately classify patients based on disease severity, indicating that specific gut microbial configurations may predict a more severe disease course. Network analyses identified significant disruptions to gut ecologic topology in severe COVID-19. Differential abundance testing of microbial pathways and predicted stool metabolites suggest that these disruptions may change the balance of bile acids and SCFAs in the gut, identifying novel treatment opportunities that may ameliorate the severity of COVID-19. We also found significant depletions of two microbes previously associated with long COVID, suggesting early gut microbial disturbances may precede the development of a long-term complication.

Determining who will require a higher level of care remains one of the most challenging questions facing clinicians caring for patients with COVID-19. Our machine learning algorithm demonstrated excellent discrimination between moderate and severe COVID-19 using only gut microbial features. Notably, the inclusion of clinical data did not significantly improve the classification accuracy of our model. Prior work has incorporated such information from initial presentation33, multi-cytokine panels34, and previously validated illness severity scores35 to forecast whether a given patient will suffer from a more severe COVID-19 course. However, based on their performance characteristics, these approaches appear to be less predictive than our microbiome-centered approach.

Our findings expand on prior research linking changes in gut microbial ecology to COVID-19. However, it should be noted that much of the initial work has been done on a smaller scale7,9–11,14 and typically outside of North America7–15, limiting their generalizability. Further, these comparative analyses may have focused on specialized populations, such as the very young, the asymptomatic, or patients in recovery12,16–18, and may not have been well-suited to consider clinical factors that may confound the relationship between gut microbial communities and COVID-19 using more robust multivariable approaches7,8,10–17. Prior studies also predominantly relied on 16S rRNA sequencing to demonstrate community- or genus-level shifts related to COVID-197,14–17, falling short of the species-level resolution and biochemical insights gained by employing next-generation sequencing of gut metagenomes and other functional multi-omic technologies. In contrast, we assembled a large, representative North American patient population admitted with symptomatic COVID-19 whose gut microbial communities were interrogated using metagenomic techniques, allowing us to identify novel microbial features to more comprehensively characterize disease severity with high predictive accuracy.

Prior investigations have observed similar community- and taxa-level alterations in microbial composition in COVID-19. In the earliest phase of the pandemic, a study from Hong Kong (n = 36) also demonstrated relative reductions in the group Eubacterium among the gut metagenomes of COVID-19-infected patients compared to referent populations, and like our work, found widespread depletion of typical gut colonizers such as Faecalibacterium and Roseburia spp. in severe COVID-199. In an expanded population of 100 patients, the same group reaffirmed a reduction in diversity and a loss of health-associated gut commensals in severe COVID-1913. Finally, a study of 30 SARS-CoV-2 infected patients in mainland China using 16S rRNA-based sequencing similarly demonstrated a change in gut community structure with reductions in α-diversity compared to referent counterparts14. Notably, they also achieved success in classifying stool samples from patients with COVID-19 compared to those from healthy controls or those infected with influenza, indicating the relatively distinct gut ecology of COVID-19. However, their classification tasks were conducted in a smaller population using supervised feature selection (i.e., the top results from their linear discriminant analysis) of genus-level taxa, and arguably, the role of a gut microbial biomarker in discriminating COVID-19 from non-infected individuals is uncertain now that SARS-CoV-2 testing is more widely available36.

Our work offers insights beyond these broad characterizations of the gut microbiome in COVID-19. It is appreciated that gut microbial ecology influences the host immune response to viral respiratory infections36. Our identification of Blautia wexlerae and Eubacterium hallii as network hubs depleted in severe COVID-19 (both Lachnospiraceae implicated in other immune-mediated diseases37) suggests these bacteria may play important roles in the regulation of immunity to SARS-CoV-2. Predicted depletion of secondary bile acids in severe disease provides another mechanism by which changes in gut microbial communities may influence the immune response to SARS-CoV-2. Bile acids regulate mucosal and systemic immunity in several ways38. Prior work has suggested that secondary bile acids are the primary ligand for TGR539 through which they may suppress pro-inflammatory signaling38,40, resulting in impaired immunity to viral infections41,42. The predicted shift in bile acid pools may also result in increased regulation of bile acid-sensitive transcription factors, as increased primary bile acids will preferentially activate farnesoid X receptor (FXR), while depletions in secondary bile acids will reduce activation of vitamin D receptor (VDR)43,44 and pregnane X receptor (PXR)45. Decreased VDR/PXR signaling during active infection is associated with increased systemic inflammation and increased morbidity and mortality46,47, possibly contributing to the clinical milieu observed in severe COVID-19. This is a particularly noteworthy hypothesis given emerging epidemiologic data on the link between diet48, vitamin D status49, and COVID-19 disease risk and severity, as well as early work linking depletion of secondary bile acids to COVID-19-related mortality50.

Our study has several key strengths. First, we assembled a large representative cohort of patients at a U.S.-based tertiary care center for whom we collected relevant clinical metadata to complement serial stool sampling. Second, our computational workflow allowed us to link not only community-level changes in gut microbial ecology but also species-resolved signatures to severe COVID-19. Third, network analyses identified critical taxa central to maintaining a fragile gut microbial configuration less likely to be found in severe COVID-19, and complementary MetaCyc pathway and predicted metabolite analyses further link these changes to alterations in bile acid pool and SCFA levels. Taken together, these observations serve as proof of principle that using next-generation sequencing (NGS) to interrogate gut microbial ecology may generate tractable hypotheses to be explored in follow-up investigations. Finally, our results fit well in the context of independent works from other groups, lending credence to our findings, and using a machine learning classifier, we demonstrate excellent performance in discriminating samples from moderate vs. severe COVID-19. These findings hint at the possibility that modulating gut microbial communities may be a viable disease prevention or therapeutic strategy in COVID-19.

We acknowledge several limitations. We were not positioned to assess whether findings differed on the basis of SARS-CoV-2 strain or variants. Our study enrolled patients from April 2020 to May 2021 during which genomic surveillance infrastructure in the U.S. was not equipped to comprehensively explore this question. Prior to the Delta variant wave beginning in June 2021, the majority of COVID-19 cases were either Alpha or other less consequential variants of interest51. Given the observational nature of our study, we cannot exclude the possibility of residual confounding. However, we adjusted for multiple potential confounders. All enrolled patients were hospitalized, which may minimize study heterogeneity at the expense of overall generalizability. We also assessed the gut microbiome at the earliest feasible time point on admission. This resulted in variation in the timing of collection, which limits our ability to infer causality. Despite these limitations, our findings are intended to be hypothesis-generating to inform the continuum of research that may logically follow.

Leveraging the gut microbiome as a potential biomarker for disease severity and modulating this fragile ecology to improve COVID-19 outcomes each hold significant appeal in the fight to end this pandemic. Multidisciplinary approaches will be needed to confirm our early findings. Validation of a non-invasive indicator predictive of disease severity could readily identify and target at-risk individuals for more aggressive therapy. Finally, directed probiotic restoration or targeted depletion of severe COVID-19-linked microbes could offer a novel therapeutic modality to complement existing therapies.

Methods

Study population

Patients were screened daily for inclusion from among all admitted individuals for whom a designation of possible SARS-CoV-2 infection was flagged by hospital infection control. COVID-19 infection status was subsequently confirmed with at least one positive nasopharyngeal SARS-CoV-2 polymerase chain reaction (PCR) test. An optional biospecimen collection protocol was nested within this longitudinal study, which allowed collection of additional clinically relevant biospecimens, including stool samples.

Sample/data collection

Fresh stool was collected from adult patients enrolled in the prospective biospecimen collection study and refrigerated at 4°C until aliquoting and freezing at −80°C (typically within 4 hours of collection). Participants could provide stool samples as frequently as once daily and could decline donation on any given day while remaining in the study. Study coordinators blinded to case status abstracted data from the electronic health record using a double data entry approach, with discrepancies adjudicated by re-abstraction or after discussions with supervising authors. We collected information on admission age (years), biological sex (male, female, other), race (White, Black, Asian, American Indian, Mixed, or Other), ethnicity (non-Hispanic or Hispanic), admission BMI (kg/m²), comorbidities including history of cancer, pulmonary, or cardiac disease, hypertension, hyperlipidemia, and diabetes mellitus (each yes/no), smoking history (active, former, never, unknown, and pack-years among smokers), and their composite admission Charlson Comorbidity Index, a validated score predictive of in-hospital mortality52. Measures of hospital course, including admission Simplified Acute Physiology Score II (SAPS II)22 and Sequential Organ Failure Assessment (SOFA) scores23, were calculated from routine laboratory results and clinical assessments. The use of antibiotics, antivirals including remdesivir, hydroxychloroquine, corticosteroids, anti-IL-6 therapy, any form of oxygen support, high-flow oxygen, bilevel positive airway pressure (BiPAP) ventilation, or mechanical ventilation (each yes/no) was collected. Mortality within 90 days of admission was ascertained in the post-study period.

Extraction protocols

Stool samples, reagent-only negative controls, and mock community positive controls (Zymo Research) were extracted using either the AllPrep PowerFecal DNA/RNA 96 Kit (Qiagen) or the Maxwell HT 96 gDNA Blood Isolation System (Promega)53. SARS-CoV-2 viral load was quantified as per CDC guidelines54 using the 2019-nCoV N1 primer and probe set54, as well as human RNaseP as an internal control. Each RT-qPCR reaction contained TaqPath™ 1-Step RT-qPCR Master Mix (Thermo Fisher), RNA template, the CDC N1 or RNaseP forward and reverse primers (IDT), probe, and RNase-free water to a total reaction volume of 10 μl. Viral copy numbers were quantified using N1 quantitative PCR (qPCR) standards (IDT) in 10-fold dilutions to generate a standard curve. The assay was run in triplicate for each sample with three no-template control wells per 384 well plate.
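Quantifying viral copy number from a 10-fold dilution series amounts to fitting a line of cycle threshold (Ct) against log10 copies and inverting it for each sample. The sketch below illustrates that calculation with ordinary least squares; the function names are ours, the Ct values in the usage note are hypothetical, and averaging triplicate Ct values before conversion is an assumption about how replicates were combined.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


def mean_ct(replicates):
    """Average Ct across technical replicates (assumed handling of triplicates)."""
    return sum(replicates) / len(replicates)


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1 / slope) - 1
```

As a usage illustration with made-up standards at 10, 100, 1,000, and 10,000 copies and Ct values of 35.0, 31.7, 28.4, and 25.1, the fitted slope is about −3.3 and a sample with a mean Ct of 30.05 maps to roughly 316 copies per reaction.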

Microbial Sequencing

Samples were sequenced by two metagenomic sequencing facilities, at the Broad Institute and Baylor College of Medicine, according to their standard established platforms. DNA was prepared for sequencing using the Illumina Nextera XT DNA library preparation kit. All libraries were sequenced with a target of 3GB output at 2×150 bp read length using the Illumina NovaSeq platform. No major batch effects attributable to sequencing center were observed, and thus, subsequent analyses were conducted on pooled samples (multivariable PERMANOVA R² for batch = 1.2%, p-value = 0.12, Suppl. Figure 4).

Sequence bioinformatics

Taxonomic and functional profiles were generated using the bioBakery 3 shotgun metagenome workflow 3.0.0, the details of which have previously been described55. Briefly, human reads were filtered using KneadData 0.10.0 and taxonomic profiles generated using MetaPhlAn 3.0.056. Functional profiling was conducted using HUMAnN 3.0.056, resulting in gene family abundance tables assembled into higher order MetaCyc pathways57.

Given the tight coupling and relatively conserved nature of gut taxonomic and metabolite profiles58, we used the MelonnPan-predict 0.99.023 workflow29 to interrogate the functional relationship between COVID-19 severity and microbial community metabolism. In brief, MelonnPan uses an elastic net model to conservatively predict putative metabolite levels based on stool UniRef90 gene family abundance.

Statistical Analysis

To compare patient characteristics between study groups, we used standard statistical tests, including chi-squared (χ²) or Fisher’s exact tests for categorical variables, Student’s t-test for normally distributed, non-categorical variables, and nonparametric Wilcoxon rank sum tests for all others. Differences with a two-tailed p-value ≤ 0.05 were considered significant.

α-diversity was calculated using the Shannon index with the “diversity” function from the R package vegan. Principal coordinates analyses (PCoA) were performed using species-level Bray-Curtis dissimilarity metrics with the “vegdist” function in the vegan package.
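For readers less familiar with these metrics, the two quantities can be written out directly. This is a minimal sketch of the standard definitions, not the vegan implementation itself; it assumes the natural logarithm for the Shannon index, and the function names and example abundances are hypothetical.

```python
import math

def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with nonzero abundance."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("sample has no nonzero abundances")
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)

def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x_i - y_i| / sum(x_i + y_i)."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator

# Hypothetical species-level abundances for two stool samples.
sample_a = [10.0, 0.0, 5.0]
sample_b = [5.0, 5.0, 5.0]

print(shannon_index(sample_a), shannon_index(sample_b))
print(bray_curtis(sample_a, sample_b))
```

A pairwise matrix of such dissimilarities across all samples is the input that a principal coordinates analysis then ordinates.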

After filtering out features with no variance and low (< 10%) prevalence, we performed differential abundance testing of species-level taxonomy, MetaCyc pathways, and predicted stool metabolites using linear mixed-effects models to account for a nested data structure from repeated sampling:

log(feature) ~ intercept + COVID-19 severity + age + sex + prior antibiotic use + race + ethnicity + BMI + Charlson Comorbidity Index + remdesivir + corticosteroids + days since admission + SARS-CoV-2 stool viral load + sequencing depth + (1 | participant)
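The feature filtering that precedes this model can be sketched as follows. This is an illustrative example only: it assumes prevalence is defined as the fraction of samples in which a feature has nonzero abundance, and the function name, table layout, and example values are hypothetical. The mixed-effects fit itself is not reproduced here.

```python
from statistics import pvariance

def filter_features(table, min_prevalence=0.10):
    """Drop features with zero variance or prevalence below min_prevalence.

    table maps a feature name to its abundances across samples.
    Prevalence is taken (as an assumption) to be the fraction of samples
    with nonzero abundance.
    """
    kept = {}
    for feature, values in table.items():
        prevalence = sum(1 for v in values if v > 0) / len(values)
        if prevalence < min_prevalence:
            continue  # present in too few samples
        if pvariance(values) == 0:
            continue  # constant across samples, carries no signal
        kept[feature] = values
    return kept

# Hypothetical abundance table with 20 samples per feature.
table = {
    "common_species": [0.0, 1.2, 3.4, 0.5] * 5,
    "rare_species": [0.7] + [0.0] * 19,   # 5% prevalence
    "constant_feature": [1.0] * 20,       # no variance
}

print(sorted(filter_features(table)))
```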

Machine learning model building and evaluation were conducted using the SIAMCAT v.1.13.3 package59. Log-transformed species abundances (with a pseudocount) were filtered to remove features with low overall abundance and then z-transformed. A nested cross-validation procedure was applied to calculate prediction accuracy by splitting data into training and testing sets for twice-repeated, five-fold cross-validation. To account for longitudinal sampling59, data splits were stratified by participant ID, ensuring that samples from the same individual were always assigned to the same fold. For each split, a random forest (RF) classifier was trained and subsequently used to predict COVID-19 disease severity. To evaluate model performance, we used the lambda parameter to maximize the area under the receiver operating characteristic curve (AUROC) with a 95% confidence interval (CI) for cross-validation error.
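The participant-level splitting described above can be illustrated with a short sketch. It is not the SIAMCAT implementation: it simply assigns whole participants to folds at random so that no individual contributes samples to both the training and testing sets, and it does not additionally balance folds by disease severity. The function names, seed handling, and example participant IDs are hypothetical.

```python
import random

def grouped_folds(participant_ids, n_folds=5, seed=0):
    """Assign sample indices to folds so a participant never spans two folds."""
    participants = sorted(set(participant_ids))
    rng = random.Random(seed)
    rng.shuffle(participants)
    fold_of = {pid: i % n_folds for i, pid in enumerate(participants)}
    folds = [[] for _ in range(n_folds)]
    for index, pid in enumerate(participant_ids):
        folds[fold_of[pid]].append(index)
    return folds

def train_test_splits(participant_ids, n_folds=5, n_repeats=2, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for repeated k-fold CV."""
    for repeat in range(n_repeats):
        folds = grouped_folds(participant_ids, n_folds, seed + repeat)
        for k, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield repeat, k, train_idx, test_idx

# Hypothetical longitudinal design: some participants gave several samples.
ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p6", "p6"]

for repeat, k, train_idx, test_idx in train_test_splits(ids):
    held_out = sorted({ids[i] for i in test_idx})
    print(repeat, k, held_out)
```

Keeping each participant within a single fold avoids the optimistic bias that arises when repeated samples from one person appear on both sides of a split.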

To assess whether ecological dynamics may help explain observed differences in taxonomy, we performed dedicated microbial network analyses. To account for our longitudinal data structure and to avoid overfitting, we restricted this analysis to each participant’s first collected stool. Networks were constructed using the “netConstruct” function in NetCoMi v.1.0.260 after normalization with a modified centered log-ratio transformation, and the resulting networks were limited to microbes with an absolute Pearson correlation ≥ 0.4 (approximately equal to the 95th percentile of the correlation matrix distribution). Network hubs were identified as those in the top quintile of degree, betweenness, and closeness centrality in each network (moderate and severe COVID-19). Finally, comparison of moderate and severe networks was performed using the “netCompare” function with 10,000 permutations.
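The thresholding step can be made concrete with a small sketch. This is not the NetCoMi implementation: it assumes the input values have already been normalized, computes pairwise Pearson correlations, keeps edges with an absolute correlation of at least 0.4, and reports node degree only (betweenness and closeness centrality are omitted). The function names, taxon labels, and values are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom

def build_network(profiles, threshold=0.4):
    """Keep taxon pairs whose absolute Pearson correlation meets the threshold."""
    edges = {}
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges

def node_degree(edges):
    """Count the number of retained edges touching each taxon."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)

# Hypothetical normalized abundances for four taxa across four samples.
profiles = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [1.0, -1.0, -1.0, 1.0],
}

edges = build_network(profiles)
print(sorted(edges))
print(node_degree(edges))
```

The sign of each retained correlation is kept on the edge so that positive and negative associations can be examined separately.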

Regulatory compliance and data availability

Study protocols were approved by the Mass General Brigham Institutional Review Board. Study enrollment with written informed consent was conducted with the patient or their healthcare proxy. Prior to publication, raw sequencing data will be deposited at the National Center for Biotechnology Information’s (NCBI) Sequence Read Archive (SRA) under a to-be-determined BioProject accession ID.

Supplementary Material

Supplement 1

Acknowledgments

Computational work was conducted on the FASRC Cannon cluster supported by the FAS Division of Science Research Computing Group at Harvard University. We thank the MGH Translational and Clinical Research Center (TCRC) for their support of the project. The TCRC is supported by Grant Number 1UL1TR002541.

Funding:

This work was supported in part by the American Gastroenterological Association Research Foundation’s AGA-Takeda COVID-19 Rapid Response Research Award 2021-5102 (L.H.N. and D.A.D.) and Research Scholars Award (L.H.N.), the Massachusetts Consortium on Pathogen Readiness (MassCPR), Mark and Lisa Schwartz (A.T.C.), the Crohn’s and Colitis Foundation Career Development Award and Research Fellowship Award (L.H.N.), and the NIH/NIDDK K23DK125838 (L.H.N.) and K01DK120742 (D.A.D.). Study sponsors had no role in study design, data collection, analysis, or interpretation of data. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Center for Research Resources, the National Center for Advancing Translational Science, or the National Institutes of Health.

Contributor Information

Peggy Lai, Massachusetts General Hospital, Harvard Medical School.

Long Nguyen, Massachusetts General Hospital and Harvard Medical School.

Daniel Okin, Massachusetts General Hospital.

David Drew, Clinical and Translational Epidemiology Unit, Massachusetts General Hospital.

Vincent Battista, Massachusetts General Hospital.

Sirus Jesudasen, Massachusetts General Hospital.

Thomas Kuntz, Harvard T.H. Chan School of Public Health.

Amrisha Bhosle, Harvard T.H. Chan School of Public Health.

Kelsey Thompson, Harvard T.H. Chan School of Public Health.

Trenton Reinicke, Massachusetts General Hospital.

Chun-Han Lo, Massachusetts General Hospital and Harvard Medical School.

Jacqueline Woo, Massachusetts General Hospital.

Alexander Caraballo, Massachusetts General Hospital.

Lorenzo Berra, Massachusetts General Hospital.

Jacob Vieira, Massachusetts General Hospital.

Ching-Ying Huang, Massachusetts General Hospital.

Upasana Das Adhikari, Massachusetts General Hospital.

Minsik Kim, Massachusetts General Hospital.

Hui-Yu Sui, Massachusetts General Hospital.

Marina Magicheva-Gupta, Massachusetts General Hospital.

Lauren McIver, Harvard T.H. Chan School of Public Health.

Marcia Goldberg, Massachusetts General Hospital.

Douglas Kwon, Harvard Medical School.

Curtis Huttenhower, Harvard T.H. Chan School of Public Health.

Andrew Chan, Massachusetts General Hospital.

References

  • 1. Tracking. Johns Hopkins Coronavirus Resource Center https://coronavirus.jhu.edu/data.
  • 2. Lynch S. V. & Pedersen O. The Human Intestinal Microbiome in Health and Disease. N. Engl. J. Med. 375, 2369–2379 (2016).
  • 3. Keely S., Talley N. J. & Hansbro P. M. Pulmonary-intestinal cross-talk in mucosal inflammatory disease. Mucosal Immunol. 5, 7–18 (2012).
  • 4. Bordon Y. Antibiotics can impede flu vaccines. Nature reviews. Immunology vol. 19 663 (2019).
  • 5. Hagan T. et al. Antibiotics-Driven Gut Microbiome Perturbation Alters Immunity to Vaccines in Humans. Cell 178, 1313–1328.e13 (2019).
  • 6. Chen C.-J., Wu G.-H., Kuo R.-L. & Shih S.-R. Role of the intestinal microbiota in the immunomodulation of influenza virus infection. Microbes Infect. 19, 570–579 (2017).
  • 7. Ren Z. et al. Alterations in the human oral and gut microbiomes and lipidomics in COVID-19. Gut 70, 1253–1265 (2021).
  • 8. Zhang F. et al. Prolonged Impairment of Short-Chain Fatty Acid and L-Isoleucine Biosynthesis in Gut Microbiome in Patients With COVID-19. Gastroenterology 162, 548–561.e4 (2022).
  • 9. Zuo T. et al. Alterations in Gut Microbiota of Patients With COVID-19 During Time of Hospitalization. Gastroenterology 159, 944–955.e8 (2020).
  • 10. Zuo T. et al. Depicting SARS-CoV-2 faecal viral activity in association with gut microbiota composition in patients with COVID-19. Gut 70, 276–284 (2021).
  • 11. Zuo T. et al. Alterations in Fecal Fungal Microbiome of Patients With COVID-19 During Time of Hospitalization until Discharge. Gastroenterology 159, 1302–1310.e5 (2020).
  • 12. Ng S. C. et al. Gut microbiota composition is associated with SARS-CoV-2 vaccine immunogenicity and adverse events. Gut (2022) doi: 10.1136/gutjnl-2021-326563.
  • 13. Yeoh Y. K. et al. Gut microbiota composition reflects disease severity and dysfunctional immune responses in patients with COVID-19. Gut 70, 698–706 (2021).
  • 14. Gu S. et al. Alterations of the Gut Microbiota in Patients With Coronavirus Disease 2019 or H1N1 Influenza. Clin. Infect. Dis. 71, 2669–2678 (2020).
  • 15. Schult D. et al. Gut bacterial dysbiosis and instability is associated with the onset of complications and mortality in COVID-19. Gut Microbes 14, 2031840 (2022).
  • 16. Nashed L. et al. Gut microbiota changes are detected in asymptomatic very young children with SARS-CoV-2 infection. Gut (2022) doi: 10.1136/gutjnl-2021-326599.
  • 17. Newsome R. C. et al. The gut microbiome of COVID-19 recovered patients returns to uninfected status in a minority-dominated United States cohort. Gut Microbes 13, 1–15 (2021).
  • 18. Liu Q. et al. Gut microbiota dynamics in a prospective cohort of patients with post-acute COVID-19 syndrome. Gut 71, 544–552 (2022).
  • 19. Nonhospitalized adults: Therapeutic management. COVID-19 Treatment Guidelines https://www.covid19treatmentguidelines.nih.gov/management/clinical-management/nonhospitalized-adults--therapeutic-management/.
  • 20. Zhou Y., Zhang J., Zhang D., Ma W.-L. & Wang X. Linking the gut microbiota to persistent symptoms in survivors of COVID-19 after discharge. J. Microbiol. 59, 941–948 (2021).
  • 21. Berlin D. A., Gulick R. M. & Martinez F. J. Severe Covid-19. N. Engl. J. Med. 383, 2451–2460 (2020).
  • 22. Gall J.-R. L. & Le Gall J.-R. A New Simplified Acute Physiology Score (SAPS II) Based on a European/North American Multicenter Study. JAMA: The Journal of the American Medical Association vol. 270 2957 (1993).
  • 23. Vincent J. L. et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. On behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine. Intensive Care Med. 22, 707–710 (1996).
  • 24. Beck D. H., Smith G. B., Pappachan J. V. & Millar B. External validation of the SAPS II, APACHE II and APACHE III prognostic models in South England: a multicentre study. Intensive Care Med. 29, 249–256 (2003).
  • 25. Ferreira F. L., Bota D. P., Bross A., Mélot C. & Vincent J. L. Serial evaluation of the SOFA score to predict outcome in critically ill patients. JAMA 286, 1754–1758 (2001).
  • 26. Rand W. M. Objective Criteria for the Evaluation of Clustering Methods. J. Am. Stat. Assoc. 66, 846–850 (1971).
  • 27. Qannari E. M., Courcoux P. & Faye P. Significance test of the adjusted Rand index. Application to the free sorting task. Food Qual. Prefer. 32, 93–97 (2014).
  • 28. Real R. & Vargas J. M. The Probabilistic Basis of Jaccard’s Index of Similarity. Syst. Biol. 45, 380–385 (1996).
  • 29. Mallick H. et al. Predictive metabolomic profiling of microbial communities using amplicon or metagenomic sequences. Nat. Commun. 10, 3136 (2019).
  • 30. Shen B. et al. Proteomic and Metabolomic Characterization of COVID-19 Patient Sera. Cell 182, 59–72.e15 (2020).
  • 31. Thomas T. et al. COVID-19 infection alters kynurenine and fatty acid metabolism, correlating with IL-6 levels and renal status. JCI Insight 5, (2020).
  • 32. Lv L. et al. The faecal metabolome in COVID-19 patients is altered and associated with clinical features and gut microbes. Anal. Chim. Acta 1152, 338267 (2021).
  • 33. Gallo Marin B. et al. Predictors of COVID-19 severity: A literature review. Rev. Med. Virol. 31, 1–10 (2021).
  • 34. Cabaro S. et al. Cytokine signature and COVID-19 prediction models in the two waves of pandemics. Sci. Rep. 11, 20793 (2021).
  • 35. Raschke R. A., Agarwal S., Rangan P., Heise C. W. & Curry S. C. Discriminant Accuracy of the SOFA Score for Determining the Probable Mortality of Patients With COVID-19 Pneumonia Requiring Mechanical Ventilation. JAMA 325, 1469–1470 (2021).
  • 36. Peeling R. W., Heymann D. L., Teo Y.-Y. & Garcia P. J. Diagnostics for COVID-19: moving from pandemic response to control. Lancet 399, 757–768 (2022).
  • 37. Vacca M. et al. The Controversial Role of Human Gut Lachnospiraceae. Microorganisms vol. 8 573 (2020).
  • 38. Chen M. L., Takeda K. & Sundrud M. S. Emerging roles of bile acids in mucosal immunity and inflammation. Mucosal Immunol. 12, 851–861 (2019).
  • 39. Kawamata Y. et al. A G Protein-coupled Receptor Responsive to Bile Acids *. J. Biol. Chem. 278, 9435–9440 (2003).
  • 40. Hao H. et al. Farnesoid X Receptor Regulation of the NLRP3 Inflammasome Underlies Cholestasis-Associated Sepsis. Cell Metab. 25, 856–867.e5 (2017).
  • 41. Ichinohe T. et al. Microbiota regulates immune defense against respiratory tract influenza A virus infection. Proc. Natl. Acad. Sci. U. S. A. 108, 5354–5359 (2011).
  • 42. Stefan K. L., Kim M. V., Iwasaki A. & Kasper D. L. Commensal Microbiota Modulation of Natural Resistance to Virus Infection. Cell 183, 1312–1324.e10 (2020).
  • 43. Li T. & Chiang J. Y. L. Nuclear receptors in bile acid metabolism. Drug Metab. Rev. 45, 145–155 (2013).
  • 44. Makishima M. et al. Vitamin D receptor as an intestinal bile acid sensor. Science 296, 1313–1316 (2002).
  • 45. Staudinger J. L. et al. The nuclear receptor PXR is a lithocholic acid sensor that protects against liver toxicity. Proc. Natl. Acad. Sci. U. S. A. 98, 3369–3374 (2001).
  • 46. Qiu Z. et al. Pregnane X Receptor Regulates Pathogen-Induced Inflammation and Host Defense against an Intracellular Bacterial Infection through Toll-like Receptor 4. Sci. Rep. 6, 31936 (2016).
  • 47. Kongsbak M., Levring T. B., Geisler C. & von Essen M. R. The vitamin d receptor and T cell function. Front. Immunol. 4, 148 (2013).
  • 48. Merino J. et al. Diet quality and risk and severity of COVID-19: a prospective cohort study. Gut 70, 2096–2104 (2021).
  • 49. Ma W. et al. Associations between predicted vitamin D status, vitamin D intake, and risk of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and coronavirus disease 2019 (COVID-19) severity. Am. J. Clin. Nutr. 115, 1123–1133 (2022).
  • 50. Stutz M. R. et al. Association of Fecal Microbiota Composition and Function with Progression of Respiratory Failure and Death in Patients with COVID-19 Critical Illness. in D94. PANDEMIC OUTPUT: ALL THAT IS COVID-19 A5289-A5289 (American Thoracic Society, 2022). doi: 10.1164/ajrccm-conference.2022.205.1_MeetingAbstracts.A5289.
  • 51. Corum J. & Zimmer C. Tracking Omicron and Other Coronavirus Variants. The New York Times (2021).
  • 52. Sundararajan V. et al. New ICD-10 version of the Charlson comorbidity index predicted in-hospital mortality. J. Clin. Epidemiol. 57, 1288–1294 (2004).
  • 53. Sui H.-Y. et al. Impact of DNA Extraction Method on Variation in Human and Built Environment Microbial Community and Functional Profiles Assessed by Shotgun Metagenomics Sequencing. Front. Microbiol. 11, 953 (2020).
  • 54. CDC. Labs. Centers for Disease Control and Prevention; https://www.cdc.gov/coronavirus/2019-ncov/lab/rt-pcr-panel-primer-probes.html.
  • 55. McIver L. J. et al. bioBakery: a meta’omic analysis environment. Bioinformatics 34, 1235–1237 (2018).
  • 56. Beghini F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. Elife 10, (2021).
  • 57. Caspi R. et al. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res. 42, D459–71 (2014).
  • 58. Chong J. & Xia J. Computational Approaches for Integrative Analysis of the Metabolome and Microbiome. Metabolites 7, (2017).
  • 59. Wirbel J. et al. Microbiome meta-analysis and cross-disease comparison enabled by the SIAMCAT machine learning toolbox. Genome Biol. 22, 93 (2021).
  • 60. Peschel S., Müller C. L., von Mutius E., Boulesteix A.-L. & Depner M. NetCoMi: network construction and comparison for microbiome data in R. Brief. Bioinform. 22, (2021).

Articles from Research Square are provided here courtesy of American Journal Experts
