Abstract
Immediately after birth, newborn babies experience rapid colonisation by microorganisms from their mothers and the surrounding environment1. Diseases in childhood and later in life are potentially mediated through perturbation of the infant gut microbiota colonisations2. However, the impact of modern clinical practices, such as caesarean section delivery and antibiotic usage, on the earliest stages of gut microbiota acquisition and development during the neonatal period (≤1 month) remains controversial3,4. Here we report disrupted maternal transmission of Bacteroides strains and high-level colonisation by healthcare-associated opportunistic pathogens, including Enterococcus, Enterobacter and Klebsiella species, in babies delivered by caesarean section (C-section), and to a lesser extent, in those delivered vaginally with maternal antibiotic prophylaxis or not breastfed during the neonatal period. Applying longitudinal sampling and whole-genome shotgun metagenomic analysis on 1,679 gut microbiotas of 771 full term, UK-hospital born babies and mothers, we demonstrate that the mode of delivery is a significant factor impacting gut microbiota composition during the neonatal period that persists into infancy (1 month - 1 year). Matched large-scale culturing and whole-genome sequencing (WGS) of over 800 bacterial strains cultured from these babies identified virulence factors and clinically relevant antimicrobial resistance (AMR) in opportunistic pathogens that may predispose to opportunistic infections. Our findings highlight the critical early roles of the local environment (i.e. mother and hospital) in establishing the gut microbiota in very early life, and identify colonisation with AMR carrying, healthcare-associated opportunistic pathogens as a previously unappreciated risk factor.
Keywords: gastrointestinal microbiota, early-life microbiota colonisation, clinical metagenomics, neonatal, c-section, intrapartum antibiotic prophylaxis, paediatric, opportunistic pathogens, antimicrobial resistance (AMR), Enterococcus, Klebsiella
The acquisition and development of the early-life gut microbiota follow successive waves of microbial exposures and colonisation that shapes the longer-term microbiota composition and function5. Early life events, including Caesarean section delivery1,6, formula feeding7,8 and antibiotic exposure8,9 that could perturb the gut microbiota composition are associated with the development of childhood asthma and atopy10–12. While recent studies8,9,13–15 have provided substantial insights into the gut microbiota development during the first 3 years of life, many were limited by the taxonomic resolution provided by 16S rRNA gene profiling, small sample size or limited sampling during the first month of life (neonatal period). High-resolution metagenomic studies of large, longitudinal cohorts are required to establish the impact and risks of early life events on the gut microbiota assembly, particularly during the neonatal period where pioneering microbes could influence subsequent microbiota and immune system development16,17.
To characterise the trajectory of gut microbiota acquisition and development during the neonatal period, we enrolled 596 healthy, term babies (39.5 ± 1.37 gestation weeks, 314 vaginal and 282 C-section births, Fig. 1a, Extended Data Table 1, Supplementary Table 1) through the Baby Biome Study (BBS). Faecal samples were collected from all babies at least once during their neonatal period (<1 month) with 302 babies re-sampled later in infancy (8.75 ± 1.98 months). Maternal faecal samples were also obtained from 175 mothers paired with 178 babies. Metagenomic analysis of 1,679 faecal samples from 771 babies and mothers revealed temporal dynamics of the gut microbiota development (Fig. 1b) and increased diversity with age (Extended Data Fig. 1a). Strikingly, the gut microbiotas exhibited substantial heterogeneity (inter-individual) and instability (intra-individual) during the first weeks of life (Extended Data Fig. 1b). Inter-individual differences explained 57% of the microbial taxonomic variation (Permutational multivariate analysis of variance (PERMANOVA), P < 0.001, 1,000 permutations), followed by sampling age at 5.7% of the variance (P < 0.001). These results indicate that the gut microbiotas were highly dynamic and individualised during the neonatal period, even more than observed in infancy (Extended Data Fig. 1c).
To determine the impact of clinical covariates on the composition of the gut microbial community, we performed cross-sectional PERMANOVA, stratified by age. Mode of delivery was the most significant factor driving gut microbiota variation during the neonatal period (Fig. 2a, Supplementary Table 2), while other clinical covariates associated with hospital birth (e.g. perinatal antibiotics, duration of hospital stay) and breastfeeding exhibited smaller effects (Supplementary Note 1). The largest effect of delivery mode was observed on day 4 (Extended Data Fig. 2, R2=7.64%, P<0.001), which dissipated with age but remained significant at the point of infancy sampling (R2=1.00%, P<0.01). No difference was observed in maternal gut microbiotas by delivery modes or neonatal gut microbiotas between elective and emergency C-section births (Supplementary Table 2).
Given the significant effect of the mode of delivery during the neonatal period, we next sought to understand how the microbiota composition and developmental trajectory were altered. Samples from babies delivered vaginally were enriched with Bifidobacterium (e.g. B. longum, B. breve), Escherichia (E. coli) and Bacteroides/Parabacteroides species (e.g. B. vulgatus, P. distasonis) with these commensal genera comprising 68.3% (95% CI 65.7-71.0%) of the neonatal gut microbial communities (Fig. 2b, Supplementary Table 3), which validated the recent observations in other cohorts4,13. In contrast, the gut microbiota of C-section delivered babies were depleted of these commensal genera and instead were dominated by Enterococcus (E. faecalis, E. faecium), Staphylococcus epidermidis, Streptococcus parasanguinis, Klebsiella (K. oxytoca, K. pneumoniae), Enterobacter cloacae and Clostridium perfringens, which are commonly associated with hospital environments18 and hospitalised preterm babies19–21. On day 4, species belonging to these genera accounted for 68.25% (95% CI 62.74-73.75%) of the total microbiota composition in C-section delivered babies (Fig. 2b).
Previous studies reported that, compared to C-section delivered babies, the gut microbiotas of vaginally delivered babies were enriched in lactobacilli associated with the mother’s vaginal microbiota1,22. However, here we observed no statistical difference in the prevalence (vaginal 11.9% vs C-section 15.7% present at over 1% abundance) or abundance of Lactobacillus between vaginally (1.217%, 95% CI 0.81-1.621%) and C-section (2.21%, 95% CI 1.54-2.88%) delivered babies. Rather, commensal species from the Bacteroides genus were detected at high abundance in the gut microbiota of 49.0% (154/314) of vaginally delivered babies (mean relative abundance 8.13%, 95% CI 6.88-9.39%, Extended Data Fig. 3). In contrast, Bacteroides species were low or absent in 99.6% (281/282) of C-section delivered babies (mean relative abundance 0.43%, 95% CI 0.11-0.74%). In 60.6% (86/142) of the C-section babies, this low-Bacteroides profile (defined in Methods) persisted into infancy, when Bacteroides became the only differentially abundant species between vaginally and C-section delivered babies (Supplementary Table 3). Although we could not assess the independent effect of maternal antibiotic exposure during C-section delivery as antibiotics were administered in all C-section deliveries, among vaginally delivered babies we observed a statistically significant association between the low-Bacteroides profile and maternal intrapartum antibiotic prophylaxis (IAP, OR=1.77, 95% CI: 1.17-2.71, P=0.0074), which also accounted for the greatest amount of gut microbiota variation in vaginally delivered babies (R2=5.88-13.6%, Supplementary Table 2). These results expand on previous findings9,23 and further highlight a low-Bacteroides profile as the perturbation signature associated with C-section and maternal IAP in vaginal delivery.
Maternal transmission of gastrointestinal bacteria to their babies is an underappreciated form of kinship24. To assess if the neonatal microbiota variation could be attributed to differential transmission of maternal microbiota, we profiled the bacterial strain transmission across 178 mother-baby dyads. We show that the majority of maternal strain transmissions during the neonatal period occurred in vaginally delivered babies (74.39%), at much higher frequency in comparison with those delivered by C-section (12.56%, Fisher’s exact test, P<0.0001, Fig 3a, Extended Data Fig. 4, Supplementary Table 4). Bacteroides spp., Parabacteroides spp., E. coli and Bifidobacterium spp. were most frequently transmitted from mothers to babies through vaginal birth, in agreement with previous observation in smaller cohorts4,25–27. For Bacteroides species such as B. vulgatus (Fig. 3b), the lack of transmission continued far beyond the neonatal period in C-section born babies25 with the late transmission of B. vulgatus rarely detected later in infancy. This is in contrast to the transmission pattern of other common early colonisers such as B. longum (Fig. 3c) and E. coli, for which colonisations of maternal strains occurred more frequently later in infancy (Fisher’s exact tests, P=0.0479 and P=0.0226, respectively). This result highlights the neonatal period as a critical early window of maternal transmission with the disrupted transmission of pioneering Bacteroides species evident in C-section babies with long-term Bacteroides absence.
While C-section babies were deprived of maternally transmitted commensal bacteria, they had a substantially higher relative abundance of opportunistic pathogens commonly associated with the healthcare environment. These enriched species included E. faecalis, E. faecium, E. cloacae, K. pneumoniae, K. oxytoca and C. perfringens (Fig. 4a, Supplementary Table 3), some of which are members of the ESKAPE pathogens responsible for the majority of nosocomial infections28. Indeed, their frequent gut microbiota colonisation in C-section newborns was under-reported in previous smaller cohorts3,13 with insufficient statistical power (Supplementary Note 2). Among C-section born babies, 83.7% carried opportunistic pathogen species during the neonatal period (as defined in Methods), in comparison to 49.4% of the vaginally born babies (Fig. 4a). During the first 21 days of life, these healthcare-associated opportunistic pathogens accounted for 30.4% (95% CI 27.86-32.96%) of the species level abundance in the gut microbiota of C-section babies, compared to 9.8% (95% CI 8.19-11.4%) in the vaginal babies, with the greatest difference observed on day 4 (Extended Data Fig. 5a). Longitudinally, the difference in combined opportunistic pathogen abundance persisted in the C-section babies re-sampled later in infancy (C-section 2.8% versus vaginal 1.6%, P=0.0375, Welch’s t-test). Interestingly, frequent and abundant carriage of opportunistic pathogens was also observed in low-Bacteroides vaginally delivered babies (Extended Data Fig. 5b), while the absence of breastfeeding during the neonatal period was associated with a higher carriage of C. perfringens, K. oxytoca and E. faecalis (Supplementary Table 3).
Given the prevalent carriage of opportunistic pathogens in the neonatal gut metagenomes, we sought to validate their presence and viability with culturing. We undertook targeted large-scale culturing of 836 opportunistic pathogen strains in the faecal samples of 177 babies (70 vaginal and 107 C-section babies, total 741 isolates) and 38 mothers (95 isolates) using selective media (Fig. 4b, Supplementary Table 5). Subsequent WGS and genomic characterisation of E. faecalis (n=356), E. cloacae (n=52), K. oxytoca (n=150) and K. pneumoniae (n=78) allowed us to perform high-resolution phylogenetic analysis and to delineate strain-specific carriage of AMR genes and virulence factors.
Focusing on the most prevalent opportunistic pathogen in C-section born babies, we analysed the genomes of a diverse population of BBS E. faecalis strains in the context of publicly available genomes of human and environmental strains (Fig. 4c). We found that 53.9% of the BBS strains were represented by five major lineages, each of which was distributed across vaginal and C-section babies and mothers in the three BBS hospitals (Extended Data Fig. 6a) and UK hospital patients, but did not include high-risk UK epidemic lineages enriched in multi-drug resistance (MDR) and virulence. In congruence with the phylogenetic placement of the BBS strains with the human gastrointestinal and environmental strains, these non-epidemic E. faecalis exhibited comparable levels of carriage of AMR genes (Extended Data Fig. 6b-e, Supplementary Note 3). Similar to E. faecalis, the BBS Enterobacter and Klebsiella strains also exhibited high-level population diversities with the phylogenetic under-representation of epidemic lineages (Fig. 4d, Extended Data Fig. 7), and levels of AMR and virulence gene carriage indicative of non-epidemic lineages circulating in hospital environments and healthy populations, rather than hypervirulent and ESBL-enriched epidemic lineages (Extended Data Fig. 8, Supplementary Note 3). Given the prior isolation of the major BBS lineages in hospitalised patients and their AMR and virulence capabilities, any level of opportunistic pathogen carriage represents a significant risk of future infections, especially for the C-section born babies with high prevalence (83.7%) of carriage.
Whilst there is insufficient evidence from metagenomics and cultured isolate WGS that indicates an apparent maternal origin of the opportunistic pathogens (Supplementary Note 4), the absence of lineage-specific colonisation suggests hospital environmental exposure as the primary factor driving opportunistic pathogen colonisation of the BBS babies. Although our study was not designed for retrospective sampling of the hospital environmental sources, opportunistic pathogens are frequently found in hospital environments, where hospital-born babies have been shown to carry the same bacteria present in operating rooms29 and neonatal intensive care units30.
Undertaking the largest, longitudinal WGS characterisation of the human gut microbiota in the previously under-sampled neonatal period (≤1 month), we consolidate the recent findings that mode of delivery is a major factor shaping the gut microbiota in the first few weeks of life4, with the diminished effect persisting into infancy14,15. The disrupted transmission of the maternal gastrointestinal bacteria, particularly the pioneering Bacteroides species in birth via C-section and maternal IAP, predisposed newborn babies to colonisation by clinically important opportunistic pathogens circulating in healthcare and hospital environments. However, the clinical consequences of the early life microbiota perturbations and carriage of immunogenic pathogens during this critical window of immune development remain to be determined. This highlights the need for large-scale, long-term cohort studies that also sample home births31 to better understand the consequence of hospital birth and establish if neonatal microbiota perturbation negatively impacts health outcomes in childhood and later life.
Methods
Study population
The study was approved by the NHS London - City and East Research Ethics Committee (REC reference 12/LO/1492). Participants were recruited at the Barking, Havering and Redbridge University Hospitals NHS Trust (BHR), the University Hospitals Leicester NHS Trust (LEI), and the University College London Hospitals NHS Foundation Trust (UCLH), through the Baby Biome Study (previously Life Study enhancement pilot study) from May 2014 to December 2017. Mothers provided written, informed consent to participate and for their children to participate in the study. The study was performed in compliance with all relevant ethical regulations.
Sample collection
Faecal samples were collected from babies with at least one sample in the first 21 days of life, primarily on day 4, 7 or 21. For a subset of babies who provided neonatal samples, a follow-up faecal sample collection was performed between 4 and 12 months of life. Maternal faecal samples were collected in the maternity unit before or after delivery, or stool was collected during delivery by midwives. Baby samples were collected at home by mothers and returned to the processing laboratory by post at ambient temperature within 24 hours. On arrival at the lab, all faecal samples were immediately stored at 4°C for an average of 2.41 days (95% CI 2.06-2.76 days) before further processing. Samples were aliquoted into six vials, four of which were stored at -80°C for raw faeces biobanking while the other two vials were processed immediately for DNA extraction. Although this sample storage protocol (no preservation buffer for room temperature and 4°C storage) was shown to be robust to technical variation in microbiome profiles at the time of study design (Supplementary Note 5), state-of-the-art preservation methods should be utilised in future large-scale microbiome studies to minimise the potential effect of sample storage on the microbiota composition32. DNA was extracted from 30 mg of faecal samples as described in the BBS collection and processing protocol33. Negative controls using ultrapure water were included in parallel for each kit as well as each extraction batch, and DNA concentration was quantified to confirm the absence of contamination. Total DNA was eluted in 60 μl DNase/Pyrogen-free water, and stored at -80°C until shipment to the Wellcome Sanger Institute for metagenomic sequencing.
Shotgun metagenomic sequencing and analysis
DNA samples, including negative controls, were quantified by PicoGreen dsDNA assay (Thermo Fisher), and samples with >100 ng DNA material proceeded to paired-end (2 x 125bp) metagenomics sequencing on the HiSeq 2500 v4 platform. Low-quality bases were trimmed (SLIDINGWINDOW:4:20), and reads below 87 nucleotides (70% of original read length) were removed (MINLEN:87) using Trimmomatic34. To remove potential human contaminants, quality trimmed reads were screened against the human genome (GRCh38) with Bowtie2 v2.3.035. On average, 22.4 (95% CI 22.1-22.6) million raw reads were generated per sample. 19.3 (95% CI 19.1-19.6) million reads (87.3% of the raw reads) per sample passed decontamination and quality trimming steps for downstream analysis. Sequencing depth was accounted for as a potential technical confounding factor in analyses of microbiota species and strain measurements, and significant species association with clinical covariates (Supplementary Note 6). Taxonomic classification from metagenomics reads was performed using Kraken v1.036, a k-mer based sequence classification approach against the Human Gastrointestinal Bacteria Genome Collection (HGG) genomes37. Bracken v1.038 was run on the Kraken classification output to estimate taxonomic abundance down to the species level. Metagenomic samples were compared at the genus and species levels by relative abundance. A cut-off of 100 Kraken-assigned paired-end reads (corresponding to 0.001% relative abundance given the sampling depth of ~10 million paired-end reads) was applied to determine metagenomic species detection. To assess whether the trade-off between the observed level of Bacteroides and opportunistic pathogens was an artefact of compositional effects, the proportion of abundances and reads corresponding to Bacteroides were removed separately, prior to relative abundance normalisation. In the normalised datasets, the statistical enrichment of opportunistic pathogen species in C-section babies was consistent with the observation with the original data. The R packages phyloseq39 and microbiome40 were used for metagenomic data analysis and results visualised using ggplot241 in RStudio.
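The arithmetic behind the detection cut-off can be checked with a short sketch. The function names are illustrative and do not come from the study's pipeline, and treating a count of exactly 100 read pairs as detected is an assumption, since the text does not say whether the cut-off is inclusive.

```python
# Minimal sketch of the species detection cut-off described above.
# Function names are hypothetical; the inclusive comparison (>=) is an assumption.

def cutoff_as_percent(min_read_pairs: int, depth_read_pairs: int) -> float:
    """Relative abundance (in %) that a read-pair cut-off corresponds to."""
    return 100.0 * min_read_pairs / depth_read_pairs


def is_detected(assigned_read_pairs: int, min_read_pairs: int = 100) -> bool:
    """Call a species detected when enough read pairs were assigned to it."""
    return assigned_read_pairs >= min_read_pairs


if __name__ == "__main__":
    # 100 read pairs at a depth of ~10 million read pairs
    print(cutoff_as_percent(100, 10_000_000))
    print(is_detected(250), is_detected(40))
```

With a depth of 10 million read pairs, 100 assigned pairs correspond to 0.001% relative abundance, matching the figure quoted in the text.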
Classification of the low-Bacteroides babies
For each baby, the median relative abundance of the Bacteroides genus was calculated across the neonatal period samples. Based on the threshold described previously9, babies with a median abundance of less than 0.1% were assigned low-Bacteroides status.
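As an illustration of this rule, the following sketch assigns low-Bacteroides status from a baby's neonatal samples. The function name and the input layout (a list of genus-level relative abundances in percent, one per neonatal sample) are assumptions made for the example.

```python
from statistics import median

# Threshold from the text: median genus-level abundance below 0.1%.
LOW_BACTEROIDES_THRESHOLD_PCT = 0.1


def is_low_bacteroides(neonatal_abundances_pct):
    """Return True if the median Bacteroides abundance (in %) across a
    baby's neonatal-period samples is below the 0.1% threshold.

    `neonatal_abundances_pct` is a hypothetical input layout: one
    relative abundance value per neonatal sample.
    """
    if not neonatal_abundances_pct:
        raise ValueError("at least one neonatal sample is required")
    return median(neonatal_abundances_pct) < LOW_BACTEROIDES_THRESHOLD_PCT


if __name__ == "__main__":
    print(is_low_bacteroides([0.0, 0.05, 12.0]))  # median 0.05
    print(is_low_bacteroides([0.0, 8.0, 12.0]))   # median 8.0
```

Using the median rather than the mean means that a single high-abundance sample does not change a baby's status.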
Classification of the opportunistic pathogen carriage
Total opportunistic pathogen load was estimated by calculating the median relative abundance of combined opportunistic pathogen species (C. perfringens, E. cloacae, E. faecalis, E. faecium, K. oxytoca, K. pneumoniae) per individual across their neonatal period samples, and independently for the infancy period and maternal samples. To prioritise relatively high-level opportunistic pathogen carriage feasible for downstream strain cultivation experiments, individuals with a median total opportunistic pathogen load of over 1% were defined as positive for carriage.
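The carriage definition can be sketched as follows. The data layout (one dictionary of species-level relative abundances in percent per sample) and the function names are assumptions for illustration. The abundances of the six species are summed within each sample before the median is taken across samples, which is one reading of "median relative abundance of combined opportunistic pathogen species".

```python
from statistics import median

# The six species named in the text.
OPPORTUNISTIC_PATHOGENS = (
    "Clostridium perfringens",
    "Enterobacter cloacae",
    "Enterococcus faecalis",
    "Enterococcus faecium",
    "Klebsiella oxytoca",
    "Klebsiella pneumoniae",
)


def pathogen_load(samples):
    """Median, across samples, of the summed abundance (%) of the six species.

    `samples` is a hypothetical layout: a list of dicts mapping species
    name to relative abundance in percent, one dict per sample.
    """
    combined = [
        sum(sample.get(species, 0.0) for species in OPPORTUNISTIC_PATHOGENS)
        for sample in samples
    ]
    return median(combined)


def is_carrier(samples, threshold_pct=1.0):
    """Positive carriage: median combined load strictly above 1%."""
    return pathogen_load(samples) > threshold_pct


if __name__ == "__main__":
    baby = [
        {"Enterococcus faecalis": 2.0, "Klebsiella oxytoca": 1.0},
        {"Escherichia coli": 50.0},
        {"Clostridium perfringens": 0.5},
    ]
    print(pathogen_load(baby), is_carrier(baby))
```

The same function would be applied separately to the neonatal, infancy and maternal samples, since the text computes the load independently for each.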
Maternal strain transmission analysis
Strain transmissions in mother-baby paired samples were determined using a single-nucleotide variant calling method42. StrainPhlAn was run on pre-processed metagenomes to generate consensus species-specific marker genes for phylogenetic reconstruction of all detectable strains (one dominant strain per sample), using default parameters and with the options "--alignment_program mafft" and "--relaxed_parameters3" as previously described26. No statistically significant variation in sequencing depth was observed between vaginal and C-section born subjects across age groups that had any impact on coverage-dependent microbiota species and strains detection (Supplementary Note 6). For each species and strains with sufficient coverage for strain profiling, we generated a species-specific phylogenetic tree using RAxML43. As previously described26, the strain distance for each pair of mother-baby sample strains was computed by calculating the pairwise normalised phylogenetic distance on the corresponding species tree.
To define strain transmission events, a previously described26, conservative threshold of 0.1 on the strain distance value was used. The detectable strains in a given pair of mother-baby samples were considered identical (strain distance less than 0.1, transmission) or distinct (strain distance greater than 0.1, no transmission). For all mother-baby pairs shown in Extended Data Fig. 4, early transmission event was counted once per species per mother-baby pair, considering the detected transmission (or evidence for no transmission) at the earliest time point (primary transmission), irrespective of the subsequent transmission events in any later neonatal period samples. For a subset of mother-baby pairs with both neonatal and infancy period sampled (Fig. 3a), late transmission events were counted separately, including cases of no early transmission due to insufficient coverage (no detectable strains). To highlight the transmission pattern shared by phylogenetically related species, a neighbour-joining44 tree of the eligible species was constructed based on the mash distance matrix45 of the respective reference genomes included in the StrainPhlAn database (Supplementary Table 4). The same approach and strain distance threshold (core-genome SNPs) were applied to the cultured strains to count the number of identical and distinct strains within mother-baby and longitudinal paired samples.
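The counting rule for early transmission events can be sketched as below. The input layout (tuples of species, sampling day and strain distance for one mother-baby pair) and the function names are hypothetical. A distance of exactly 0.1 is treated here as no transmission, which is an assumption because the text only defines the cases below and above the threshold.

```python
STRAIN_DISTANCE_THRESHOLD = 0.1  # conservative threshold from the text


def classify_pair(strain_distance):
    """Label one mother-baby strain comparison from its normalised distance."""
    if strain_distance < STRAIN_DISTANCE_THRESHOLD:
        return "transmission"
    return "no transmission"  # assumption: exactly 0.1 falls here


def early_transmission_calls(observations):
    """Count one call per species for a single mother-baby pair.

    `observations` is a hypothetical layout: an iterable of
    (species, sampling_day, strain_distance) tuples covering the
    neonatal samples in which a strain was detectable. Only the
    earliest time point per species is used (primary transmission).
    """
    earliest = {}
    for species, day, distance in observations:
        if species not in earliest or day < earliest[species][0]:
            earliest[species] = (day, distance)
    return {
        species: classify_pair(distance)
        for species, (_, distance) in earliest.items()
    }


if __name__ == "__main__":
    pair = [
        ("Bacteroides vulgatus", 7, 0.02),
        ("Bacteroides vulgatus", 4, 0.50),
        ("Escherichia coli", 21, 0.05),
    ]
    print(early_transmission_calls(pair))
```

In this example the day 4 comparison decides the B. vulgatus call, so the matching strain seen on day 7 does not count as an early transmission event.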
Statistical analysis
To calculate the effect of clinical covariates on the gut microbiota composition, we stratified by age groups and then assessed the proportion of explained variance (R2 from PERMANOVA) in Bray-Curtis distance for each clinical covariate, using the adonis function from the R package vegan46. While PERMANOVA is mostly unaffected by group dispersion effects in balanced designs47 (e.g. mode of delivery comparisons), for unbalanced designs (e.g. breastfeeding comparisons) more sensitive to group dispersion effects, the group variance homogeneity condition was validated using the betadisper function. Group dispersions were not significantly different (betadisper P>0.05) in all comparisons, which lent support to the statistically significant, albeit visibly weak effects of breastfeeding as reported by PERMANOVA. Samples with missing metadata (NA) for the given clinical covariate were excluded prior to running each cross-sectional analysis. Effect sizes and statistical significance were determined by 1,000 permutations, and P-values corrected for multiple testing using the Benjamini-Hochberg false discovery rate (FDR = 5%). Statistical tests of between-group taxonomic abundance comparisons (Welch’s t-test with p-values FDR-corrected) were performed in the Statistical Analysis of Metagenomics Profiles program v2.048. MaAsLin49 was used for adjustment of covariates when determining the significance of species associated with a specific variable while accounting for potentially confounding covariates, as previously described14,15. All the covariates tested in the PERMANOVA were included in the adjustment along with the sequencing depth used as fixed effects. The default MaAsLin parameters were applied (maximum percentage of samples NA in metadata 10%, minimum percentage relative abundance 0.01%, P < 0.05, q < 0.25).
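To make the R2 reported by PERMANOVA concrete, the sketch below computes Bray-Curtis dissimilarities and the one-factor partition of squared distances (explained variance, pseudo-F, and a permutation P-value). It is a from-scratch illustration in Python of the standard single-factor procedure, not the vegan adonis code used in the study, and all function names are hypothetical. It assumes string group labels and at least some variation between profiles.

```python
import random
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(a + b for a, b in zip(x, y))
    if total == 0:
        return 0.0  # assumption: two empty profiles are treated as identical
    return sum(abs(a - b) for a, b in zip(x, y)) / total


def distance_matrix(profiles):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(profiles)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = bray_curtis(profiles[i], profiles[j])
    return d


def sums_of_squares(d, labels):
    """Total and within-group sums of squared distances."""
    n = len(labels)
    ss_total = sum(d[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for group in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == group]
        ss_within += sum(
            d[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    return ss_total, ss_within


def pseudo_f(d, labels):
    """Pseudo-F statistic for a single grouping factor."""
    n, a = len(labels), len(set(labels))
    ss_total, ss_within = sums_of_squares(d, labels)
    if ss_within == 0:
        return float("inf")
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))


def permanova(d, labels, permutations=1000, seed=0):
    """Return (R2, pseudo-F, permutation P-value) for one factor."""
    ss_total, ss_within = sums_of_squares(d, labels)
    r2 = (ss_total - ss_within) / ss_total
    f_obs = pseudo_f(d, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # permute group labels across samples
        if pseudo_f(d, shuffled) >= f_obs - 1e-12:
            hits += 1
    return r2, f_obs, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    profiles = [[10, 0], [8, 2], [0, 10], [2, 8]]  # toy two-taxon profiles
    labels = ["vaginal", "vaginal", "c-section", "c-section"]
    print(permanova(distance_matrix(profiles), labels, permutations=99))
```

With only four toy samples the P-value is not meaningful. The example is intended to show how R2 is the among-group share of the total squared distance, which is the quantity reported per covariate in the text.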
Bacterial isolation and whole-genome sequencing
Raw faecal samples from neonates stored in the biobank lab at -80°C were requested based on faecal carriage of targeted species over 1% relative abundance in metagenomes. Selected frozen faecal aliquots, where available (> 100 ng), were couriered on dry ice to the Wellcome Sanger Institute within 6 hours of shipment from the biobank lab. Bacterial isolates were cultured using the following culture media: Enterococcus faecium ChromoSelect Agar Base (Sigma-Aldrich) for Enterococcus spp., CP ChromoSelect Agar (Sigma-Aldrich) for Clostridium spp., Coliform ChromoSelect Agar (Sigma-Aldrich) and Klebsiella ChromoSelect Selective Agar (Sigma-Aldrich) for species of Enterobacteriaceae. Between 2 and 5 colonies per sample were picked for full-length 16S rRNA gene sequencing to confirm species identification, as described previously50. Bacterial isolates with species identification congruent with metagenomic identification were re-streaked and purified for genomic DNA extraction using DNeasy 96 kit. DNA sequencing was performed on the Illumina HiSeq X, generating paired-end reads (2 x 151bp). Multiple strains per species per faecal sample were also sequenced based on variation across the full-length 16S rRNA sequences. Bacterial genomes were assembled and annotated using the pipeline described previously51. Genome assemblies were subjected to quality check and contaminant screening with CheckM52 and Mash53, respectively. Where applicable, the suspected contaminant (non-target organism) sequences were confirmed and filtered out via raw read mapping using Bowtie2 v2.3.0, prior to re-assembly.
Bacterial phylogenetic analysis
The phylogenetic analysis of the complete diverse species collection was conducted by extracting the amino acid sequence of 40 universal core marker genes54,55 from the BBS bacterial culture collection using SpecI56. The protein sequences were concatenated and aligned with MAFFT v.7.2040, and maximum-likelihood trees were constructed using RAxML43 with default settings. The four most prevalent opportunistic pathogen species in the BBS collection, E. faecalis, E. cloacae, K. oxytoca and K. pneumoniae, were further analysed in the context of the public genomes (Supplementary Table 5), including the UK hospital strain collections57–60, the gut microbiota-cultured strains from the HGG and the Culturable Genome Reference (CGR)61 collections, and the environmental strains in the Genome Taxonomy Database (GTDB, v86)62. To generate phylogenetic trees of individual species, the public genome assemblies were combined with the assemblies of the study isolates, annotated with Prokka63, and a pangenome was estimated using Roary (Page et al. 2015). Where multiple identical strains (no SNP difference in species core-genome) were cultured from the same faecal sample, only one representative strain was included in the species phylogenetic trees. A 95% identity cut-off was used, and core genes were defined as those in 99% of isolates unless otherwise stated. A maximum likelihood tree of the SNPs in the core genes was created using RAxML (Stamatakis 2014) and 100 bootstraps. To illustrate the population structure of the closely related Enterobacter and Klebsiella strain isolates, FastANI (Jain et al. 2017) was used to estimate the pairwise average nucleotide identity distance between all public and BBS genome assemblies, which was then used as an input to generate a neighbour-joining tree with BIONJ (Gascuel 1997). All phylogenetic trees were visualised in iTOL (Letunic & Bork 2016). Sequence types were determined using MLSTcheck (Page et al. n.d.), which was used to compare the assembled genomes against the MLST database for the corresponding species.
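The core-gene definition (genes present in 99% of isolates) can be illustrated with a gene presence/absence table. This sketch is not Roary's implementation. The input layout, the function name and the reading of the definition as "at least 99%" are assumptions for the example.

```python
def core_genes(presence, n_isolates, min_percent=99):
    """Return the genes carried by at least `min_percent` % of isolates.

    `presence` is a hypothetical layout: a dict mapping each gene
    cluster name to the set of isolate identifiers that carry it.
    Integer arithmetic avoids floating-point edge cases at exactly 99%.
    """
    return {
        gene
        for gene, carriers in presence.items()
        if len(carriers) * 100 >= min_percent * n_isolates
    }


if __name__ == "__main__":
    isolates = [f"isolate_{i}" for i in range(200)]
    presence = {
        "geneA": set(isolates),        # 200/200 isolates
        "geneB": set(isolates[:198]),  # 198/200 = 99%
        "geneC": set(isolates[:150]),  # 150/200 = 75%
    }
    print(sorted(core_genes(presence, n_isolates=len(isolates))))
```

Only the SNPs within genes passing this filter would feed into the core-genome tree described in the text.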
Detecting virulence and resistance genes
ABRicate (v0.8.13, https://github.com/tseemann/abricate) was used to screen bacterial genome assemblies for known, acquired resistance genes and virulence factors. For AMR genes, the assemblies were screened against a comprehensive BLAST database integrating 5,556 non-redundant sequences in the NCBI Bacterial Antimicrobial Resistance Reference Gene Database (PRJNA313047), CARD v2.0.3, ARG-ANNOT and ResFinder. For virulence factor screening, a BLAST database was built from 3,202 non-redundant, experimentally validated core virulence genes in VFDB (version 5 Oct 2018).
Extended Data
Extended Data Table 1. Baseline clinical characteristics of the Baby Biome Study cohort.
Sampling age group (subjects) | |
Day 4 | 310 |
Day 7 | 532 |
Day 21 | 325 |
Additional neonatal samplings | 35 |
Infancy (8.75 ± 1.98 months) | 302 |
Mother | 175 |
Mode of delivery | |
Caesarean section | 282 (47.3%) |
Mode of feeding | |
Non-breastfeeding (day 4) | 52 (17.5%) |
Non-breastfeeding (day 7) | 83 (16.8%) |
Non-breastfeeding (day 21) | 59 (19.0%) |
Non-breastfeeding (infancy) | 33 (13.4%) |
Antibiotics | |
Intrapartum antibiotic prophylaxis (vaginal delivery) | 23 (7.3%) |
Acknowledgements
This work was supported by the Wellcome Trust (WT101169MA) and Wellcome Sanger core funding (WT098051). Y.S. is supported by a Wellcome Trust PhD Studentship. S.C.F. is supported by the Australian National Health and Medical Research Council [1091097, 1159239 and 1141564] and the Victorian Government’s Operational Infrastructure Support Program. We are grateful to the participating families for their time and contribution to the Baby Biome Study, the research midwives at recruiting hospitals for recruitment and clinical metadata collection, Nadia Moreno, Henna Ali, Samra Bibi and Alfred Takyi for raw sample processing. We thank the Core Sequencing and Pathogen Informatics teams at the Wellcome Sanger Institute for informatics support, H. Browne and A. Almeida for critical feedback of the manuscript.
Footnotes
Data availability
All sequencing data generated and analysed in this study have been deposited in the European Nucleotide Archive under accession numbers ERP115334 and ERP024601. Raw faecal samples and bacterial isolates are available from the corresponding authors upon request.
Contributions
S.C.F., A.R., P.B., N.F. and T.D.L. conceived and designed the project. S.C.F., E.T., N.K. and M.D.S. carried out the pilot study, and designed sample collection and processing protocols, overseen by N.F. and T.D.L.; E.T., A.S., N.S. and N.F. managed participant recruitment and coordinated clinical metadata collection; Y.S. performed bacterial culturing and DNA extraction with assistance from M.D.S.; Y.S. generated and analysed the data with assistance from K.V.; Y.S., S.C.F., N.F. and T.D.L. wrote the manuscript. All authors read and approved the manuscript.
Competing interests
The authors declare no competing financial interests.
References
- 1. Dominguez-Bello MG, et al. Delivery mode shapes the acquisition and structure of the initial microbiota across multiple body habitats in newborns. PNAS. 2010;107:11971–11975. doi: 10.1073/pnas.1002601107.
- 2. Tamburini S, Shen N, Wu HC, Clemente JC. The microbiome in early life: implications for health outcomes. Nat Med. 2016;22:713–722. doi: 10.1038/nm.4142.
- 3. Chu DM, et al. Maturation of the infant microbiome community structure and function across multiple body sites and in relation to mode of delivery. Nat Med. 2017. doi: 10.1038/nm.4272.
- 4. Wampach L, et al. Birth mode is associated with earliest strain-conferred gut microbiome functions and immunostimulatory potential. Nat Commun. 2018;9:5091. doi: 10.1038/s41467-018-07631-x.
- 5. Koenig JE, et al. Succession of microbial consortia in the developing infant gut microbiome. Proc Natl Acad Sci USA. 2011;108(Suppl 1):4578–4585. doi: 10.1073/pnas.1000081107.
- 6. Stokholm J, et al. Cesarean section changes neonatal gut colonization. Journal of Allergy and Clinical Immunology. 2016;138:881–889.e2. doi: 10.1016/j.jaci.2016.01.028.
- 7. Baumann-Dudenhoeffer AM, D’Souza AW, Tarr PI, Warner BB, Dantas G. Infant diet and maternal gestational weight gain predict early metabolic maturation of gut microbiomes. Nat Med. 2018;5:178. doi: 10.1038/s41591-018-0216-2.
- 8. Bokulich NA, et al. Antibiotics, birth mode, and diet shape microbiome maturation during early life. Science Translational Medicine. 2016;8:343ra82. doi: 10.1126/scitranslmed.aad7121.
- 9. Yassour M, et al. Natural history of the infant gut microbiome and impact of antibiotic treatment on bacterial strain diversity and stability. Science Translational Medicine. 2016;8:343ra81. doi: 10.1126/scitranslmed.aad0917.
- 10. Arrieta M-C, et al. Early infancy microbial and metabolic alterations affect risk of childhood asthma. Science Translational Medicine. 2015;7:307ra152. doi: 10.1126/scitranslmed.aab2271.
- 11. Fujimura KE, et al. Neonatal gut microbiota associates with childhood multisensitized atopy and T cell differentiation. Nat Med. 2016. doi: 10.1038/nm.4176.
- 12. Stokholm J, et al. Maturation of the gut microbiome and risk of asthma in childhood. Nat Commun. 2018;9:141. doi: 10.1038/s41467-017-02573-2.
- 13. Bäckhed F, et al. Dynamics and Stabilization of the Human Gut Microbiome during the First Year of Life. Cell Host & Microbe. 2015;17:690–703. doi: 10.1016/j.chom.2015.04.004.
- 14. Stewart CJ, et al. Temporal development of the gut microbiome in early childhood from the TEDDY study. Nature. 2018;562:583–588. doi: 10.1038/s41586-018-0617-x.
- 15. Vatanen T, et al. The human gut microbiome in early-onset type 1 diabetes from the TEDDY study. Nature. 2018;562:589–594. doi: 10.1038/s41586-018-0620-2.
- 16. Vatanen T, et al. Variation in Microbiome LPS Immunogenicity Contributes to Autoimmunity in Humans. Cell. 2016;165:1551. doi: 10.1016/j.cell.2016.05.056.
- 17. Olin A, et al. Stereotypic Immune System Development in Newborn Children. Cell. 2018;174:1277–1292.e14. doi: 10.1016/j.cell.2018.06.045.
- 18. Lax S, et al. Bacterial colonization and succession in a newly opened hospital. Science Translational Medicine. 2017;9:eaah6500. doi: 10.1126/scitranslmed.aah6500.
- 19. Stewart CJ, et al. Preterm gut microbiota and metabolome following discharge from intensive care. Scientific Reports. 2015;5:17141. doi: 10.1038/srep17141.
- 20. Gibson MK, et al. Developmental dynamics of the preterm infant gut microbiota and antibiotic resistome. Nature Microbiology. 2016;1:1–10. doi: 10.1038/nmicrobiol.2016.24.
- 21. Raveh-Sadka T, et al. Evidence for persistent and shared bacterial strains against a background of largely unique gut colonization in hospitalized premature infants. The ISME Journal. 2016;10:2817–2830. doi: 10.1038/ismej.2016.83.
- 22. Dominguez-Bello MG, et al. Partial restoration of the microbiota of cesarean-born infants via vaginal microbial transfer. Nat Med. 2016;22:250–253. doi: 10.1038/nm.4039.
- 23. Jakobsson HE, et al. Decreased gut microbiota diversity, delayed Bacteroidetes colonisation and reduced Th1 responses in infants delivered by caesarean section. Gut. 2014;63:559–566. doi: 10.1136/gutjnl-2012-303249.
- 24. Funkhouser LJ, Bordenstein SR. Mom Knows Best: The Universality of Maternal Microbial Transmission. PLOS Biology. 2013;11:e1001631. doi: 10.1371/journal.pbio.1001631.
- 25. Nayfach S, Rodriguez-Mueller B, Garud N, Pollard KS. An integrated metagenomics pipeline for strain profiling reveals novel patterns of bacterial transmission and biogeography. Genome Res. 2016;26:1612–1625. doi: 10.1101/gr.201863.115.
- 26. Ferretti P, et al. Mother-to-Infant Microbial Transmission from Different Body Sites Shapes the Developing Infant Gut Microbiome. Cell Host & Microbe. 2018;24:133–145.e5. doi: 10.1016/j.chom.2018.06.005.
- 27. Yassour M, et al. Strain-Level Analysis of Mother-to-Child Bacterial Transmission during the First Few Months of Life. Cell Host & Microbe. 2018;24:146–154.e4. doi: 10.1016/j.chom.2018.06.007.
- 28. Boucher HW, et al. Bad bugs, no drugs: no ESKAPE! An update from the Infectious Diseases Society of America. Clin Infect Dis. 2009;48:1–12. doi: 10.1086/595011.
- 29. Shin H, et al. The first microbial environment of infants born by C-section: the operating room microbes. Microbiome. 2015;3:59. doi: 10.1186/s40168-015-0126-1.
- 30. Brooks B, et al. The developing premature infant gut microbiome is a major factor shaping the microbiome of neonatal intensive care unit rooms. Microbiome. 2018;6:112. doi: 10.1186/s40168-018-0493-5.
- 31. Combellick JL, et al. Differences in the fecal microbiota of neonates born at home or in the hospital. Scientific Reports. 2018;8:15660. doi: 10.1038/s41598-018-33995-7.
- 32. Vandeputte D, Tito RY, Vanleeuwen R, Falony G, Raes J. Practical considerations for large-scale gut microbiome studies. FEMS Microbiol Rev. 2017;41:S154–S167. doi: 10.1093/femsre/fux027.
- 33. Bailey SR, et al. A pilot study to understand feasibility and acceptability of stool and cord blood sample collection for a large-scale longitudinal birth cohort. BMC Pregnancy Childbirth. 2017;17:439. doi: 10.1186/s12884-017-1627-7.
- 34. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
- 35. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923.
- 36. Wood DE, Salzberg SL. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biology. 2014;15:R46. doi: 10.1186/gb-2014-15-3-r46.
- 37. Forster SC, et al. A human gut bacterial genome and culture collection for improved metagenomic analyses. Nature Biotechnology. 2019;37:186–192. doi: 10.1038/s41587-018-0009-7.
- 38. Lu J, Breitwieser FP, Thielen P, Salzberg SL. Bracken: estimating species abundance in metagenomics data. PeerJ Computer Science. 2017;3:e104.
- 39. McMurdie PJ, Holmes S. phyloseq: An R Package for Reproducible Interactive Analysis and Graphics of Microbiome Census Data. PLOS ONE. 2013;8:e61217. doi: 10.1371/journal.pone.0061217.
- 40. Lahti L, Shetty S. Tools for microbiome analysis in R. Version 1.1.10013. 2017. URL: http://microbiome.github.com/microbiome.
- 41. Wickham H. ggplot2: elegant graphics for data analysis. 2016.
- 42. Truong DT, Tett A, Pasolli E, Huttenhower C, Segata N. Microbial strain-level population structure and genetic diversity from metagenomes. Genome Res. 2017. doi: 10.1101/gr.216242.116.
- 43. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033.
- 44. Gascuel O. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Molecular Biology and Evolution. 1997;14:685–695. doi: 10.1093/oxfordjournals.molbev.a025808.
- 45. Ondov BD, et al. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biology. 2016;17:132. doi: 10.1186/s13059-016-0997-x.
- 46. Oksanen J, Blanchet FG, Kindt R, Legendre P. R Package ‘vegan’: Community Ecology Package. R Package version 2.2–0. 2014.
- 47. Anderson MJ, Walsh DCI. PERMANOVA, ANOSIM, and the Mantel test in the face of heterogeneous dispersions: What null hypothesis are you testing? Ecological Monographs. 2013;83:557–574.
- 48. Parks DH, Tyson GW, Hugenholtz P, Beiko RG. STAMP: statistical analysis of taxonomic and functional profiles. Bioinformatics. 2014;30:3123–3124. doi: 10.1093/bioinformatics/btu494.
- 49. Morgan XC, et al. Dysfunction of the intestinal microbiome in inflammatory bowel disease and treatment. Genome Biology. 2012;13:R79. doi: 10.1186/gb-2012-13-9-r79.
- 50. Browne HP, et al. Culturing of ‘unculturable’ human microbiota reveals novel taxa and extensive sporulation. Nature. 2016;533:543–546. doi: 10.1038/nature17645.
- 51. Page AJ, et al. Robust high-throughput prokaryote de novo assembly and improvement pipeline for Illumina data. Microbial Genomics. 2016;2:e000083. doi: 10.1099/mgen.0.000083.
- 52. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25:1043–1055. doi: 10.1101/gr.186072.114.
- 53. Ondov BD, et al. Mash Screen: High-throughput sequence containment estimation for genome discovery. bioRxiv. 2019:557314. doi: 10.1101/557314.
- 54. Sorek R, et al. Genome-Wide Experimental Determination of Barriers to Horizontal Gene Transfer. Science. 2007;318:1449–1452. doi: 10.1126/science.1147112.
- 55. Ciccarelli FD, et al. Toward Automatic Reconstruction of a Highly Resolved Tree of Life. Science. 2006;311:1283–1287. doi: 10.1126/science.1123061.
- 56. Mende DR, Sunagawa S, Zeller G, Bork P. Accurate and universal delineation of prokaryotic species. Nat Methods. 2013;10:881–884. doi: 10.1038/nmeth.2575.
- 57. Raven KE, et al. Genome-based characterization of hospital-adapted Enterococcus faecalis lineages. Nature Microbiology. 2016;1:15033. doi: 10.1038/nmicrobiol.2015.33.
- 58. Moradigaravand D, Reuter S, Martin V, Peacock SJ, Parkhill J. The dissemination of multidrug-resistant Enterobacter cloacae throughout the UK and Ireland. Nature Microbiology. 2016;1:16173. doi: 10.1038/nmicrobiol.2016.173.
- 59. Moradigaravand D, Martin V, Peacock SJ, Parkhill J. Population Structure of Multidrug-Resistant Klebsiella oxytoca within Hospitals across the United Kingdom and Ireland Identifies Sharing of Virulence and Resistance Genes with K. pneumoniae. Genome Biology and Evolution. 2017;9:574–584. doi: 10.1093/gbe/evx019.
- 60. Moradigaravand D, Martin V, Peacock SJ, Parkhill J, Chiller T. Evolution and Epidemiology of Multidrug-Resistant Klebsiella pneumoniae in the United Kingdom and Ireland. MBio. 2017;8:e01976-16. doi: 10.1128/mBio.01976-16.
- 61. Zou Y, et al. 1,520 reference genomes from cultivated human gut bacteria enable functional microbiome analyses. Nature Biotechnology. 2019;37:179–185. doi: 10.1038/s41587-018-0008-8.
- 62. Parks DH, et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nature Biotechnology. 2018;36:996. doi: 10.1038/nbt.4229.
- 63. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30:2068–2069.
- 64. Page AJ, et al. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics. 2015;31:3691–3693. doi: 10.1093/bioinformatics/btv421.
- 65. Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High-throughput ANI Analysis of 90K Prokaryotic Genomes Reveals Clear Species Boundaries. bioRxiv. 2017:225342. doi: 10.1101/225342.
- 66. Letunic I, Bork P. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucl Acids Res. 2016;44:W242–W245. doi: 10.1093/nar/gkw290.
- 67. Page AJ, Taylor B, Keane JA. Multilocus sequence typing by blast from de novo assemblies against PubMLST. Journal of Open Source Software. 2016.