Author manuscript; available in PMC: 2022 Apr 5.
Published in final edited form as: Int J Cancer. 2021 Nov 5;150(6):916–927. doi: 10.1002/ijc.33847

Prospective study of oral microbiome and gastric cancer risk among Asian, African American and European American populations

Yaohua Yang 1, Jirong Long 1, Cong Wang 1, William J Blot 1, Zhiheng Pei 2,3,4, Xiang Shu 5, Fen Wu 6, Nathaniel Rothman 7, Jie Wu 1, Qing Lan 7, Qiuyin Cai 1, Wei Zheng 1, Yu Chen 6,8, Xiao-Ou Shu 1
PMCID: PMC8982516  NIHMSID: NIHMS1789849  PMID: 34664266

Abstract

Colonization of specific bacteria in the human mouth was reported to be associated with gastric cancer risk. However, previous studies were limited by retrospective study designs and low taxonomic resolutions. We performed a prospective case-control study nested within three cohorts to investigate the relationship between oral microbiome and gastric cancer risk. Shotgun metagenomic sequencing was employed to characterize the microbiome in prediagnostic buccal samples from 165 cases and 323 matched controls. Associations of overall microbial richness and abundance of microbial taxa, gene families and metabolic pathways with gastric cancer risk were evaluated via conditional logistic regression. Analyses were performed within each cohort, and results were combined by meta-analyses. We found that overall microbial richness was associated with decreased gastric cancer risk, with an odds ratio (OR) per standard deviation (SD) increase in Simpson’s reciprocal index of 0.77 (95% confidence interval [CI] = 0.61-0.99). Nine taxa, 38 gene families and six pathways also showed associations with gastric cancer risk at P < .05. Neisseria mucosa and Prevotella pleuritidis were enriched, while Mycoplasma orale and Eubacterium yurii were depleted among cases with ORs and 95% CIs per SD increase in centered log-ratio transformed taxa abundance of 1.31 (1.03-1.67), 1.26 (1.00-1.57), 0.74 (0.59-0.94) and 0.80 (0.65-0.98), respectively. The top two gene families (P = 3.75 × 10⁻⁴ and 3.91 × 10⁻⁴) and pathways (P = 1.75 × 10⁻³ and 1.53 × 10⁻³) associated with gastric cancer were related to the decreased risk and are involved in hexitol metabolism. Our study supports the hypothesis that oral microbiota may play a role in gastric cancer etiology.

Keywords: gastric cancer risk, oral microbiome, shotgun metagenomic sequencing

1 |. INTRODUCTION

Gastric cancer is the fifth most commonly diagnosed malignancy and the fourth leading cause of cancer death worldwide.1 The incidence of gastric cancer is higher in Eastern Asia but lower in North America.2 In the United States, African Americans and Asian Americans have higher gastric cancer incidences than European Americans.3 Helicobacter pylori infection, cigarette smoking and alcohol consumption are well-recognized risk factors for gastric cancer.4 However, our knowledge on the etiology of this malignancy is far from complete, hindering the development of effective prevention.

Oral microbiota is the second most complex microbial community in the human body5 and plays important roles in both oral and systemic health.6 Multiple oral microbial taxa have been reported to be associated with increased risks of cancers.7-11 In addition, a recent prospective cohort study showed that poor oral health was associated with increased gastric cancer risk.12 To date, only four case-control studies have explored the relationship between oral microbiome and gastric cancer.13-16 However, these studies, all conducted in Asian populations, were limited by retrospective study designs and small sample sizes. In addition, all of them profiled the microbiome using 16S rRNA gene sequencing, which has a lower taxonomic resolution than shotgun metagenomic sequencing.

Herein, we conducted a prospective study nested within three population-based cohort studies, the Shanghai Women’s Health Study and Shanghai Men’s Health Study (SWMHS) and the Southern Community Cohort Study (SCCS), to investigate the relationship between baseline oral microbiome and subsequent risk of gastric cancer. Shotgun metagenomic sequencing was utilized to assess the oral microbiome in prediagnostic buccal samples from 165 gastric cancer cases and 323 controls from these cohorts.

2 |. MATERIALS AND METHODS

2.1 |. Study population

2.1.1 |. Parent cohorts

The SWMHS recruited 74 941 women aged 40 to 70 years and 61 469 men aged 40 to 74 years, who were permanent residents of Shanghai between 1997 and 2006. Detailed information on the study designs and methods has been described elsewhere.17,18 At study enrollment, in-person interviews were conducted to collect information on demographics and lifestyle factors. At that time, 17 431 participants, including 8934 females and 8497 males, donated buccal cell samples using a modified mouthwash protocol, which consisted of brushing plus rinsing and spitting into a tube with 70% alcohol. Cohort members were followed up by a combination of in-person follow-ups every 2 to 4 years and annual record linkages to the Shanghai Cancer Registry and Vital Statistics Databases.

The SCCS is an ongoing prospective cohort study, including ~86 000 participants recruited between 2002 and 2009 from 12 Southeastern states in the United States. A detailed description of the SCCS can be found elsewhere.19 Nearly 67% of study participants are African Americans (AAs), and the remainder are European American (EA) subjects. Approximately 86% of participants were recruited from community health centers that provide basic health care and preventative services in underserved areas. The remaining study participants were recruited from the general population via mailed surveys. During enrollment, all study participants completed baseline surveys, through which information on subject characteristics, including self-identified ethnicity, demographics and lifestyle factors, was obtained. Mouth rinse samples were collected from ~34 100 participants using Scope mouthwash. Identification of incident cancer cases and deaths was conducted via linkages to state cancer registries operating in the 12-state study area and national mortality registries.

2.1.2 |. Nested case-control study selection

Incident gastric cancer cases were identified from the cohorts according to the International Classification of Diseases (ICD-10) code C16. Sampling was performed in 2014, by which time a total of 912 and 166 incident patients with gastric cancer had been identified in the SWMHS and SCCS, respectively. Among them, 117 cases from the SWMHS and 70 cases from the SCCS donated a buccal sample at study enrollment. After excluding those who used antibiotics within a week before buccal sample collection, 107 gastric cancer cases from the SWMHS and 58 from the SCCS were retained. Among subjects who had not used antibiotics within a week before buccal sample collection, two controls were randomly selected and individually matched to each case on age at gastric cancer diagnosis (±2 years), ethnicity (AA/EA; SCCS only), sex, smoking status (current/former/never) and buccal sample collection date (±30 days) and time (morning/afternoon). Finally, 495 participants, including 107 and 58 case-control sets from the SWMHS and SCCS, respectively, were selected for the present study.
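The matching criteria above can be read as a simple eligibility test applied to each candidate control. The following is a minimal sketch only: the `Participant` record, its field names and the function `is_eligible_control` are hypothetical and are not taken from the study's actual selection procedure.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Participant:
    # Hypothetical record layout; field names are illustrative assumptions.
    matching_age: float          # age used for matching, in years
    sex: str                     # "M" or "F"
    smoking: str                 # "current", "former" or "never"
    sample_date: date            # buccal sample collection date
    sample_time: str             # "morning" or "afternoon"
    ethnicity: Optional[str] = None  # "AA"/"EA" in the SCCS, None in the SWMHS


def is_eligible_control(case: Participant, control: Participant,
                        age_tol: float = 2.0, day_tol: int = 30) -> bool:
    """Return True if `control` satisfies every matching criterion for `case`."""
    return (
        abs(case.matching_age - control.matching_age) <= age_tol
        and case.sex == control.sex
        and case.smoking == control.smoking
        and case.ethnicity == control.ethnicity
        and abs((case.sample_date - control.sample_date).days) <= day_tol
        and case.sample_time == control.sample_time
    )
```

In a full selection procedure, two controls would then be drawn at random from the eligible pool of each case; that sampling step is omitted here.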

2.2 |. Microbiome profiling

2.2.1 |. DNA extraction and shotgun metagenomic sequencing

In both the SWMHS and SCCS, buccal samples were kept at 4°C during transportation and stored at −75°C afterwards until assays. The DNeasy PowerSoil Kit (Qiagen) was used to isolate DNA from buccal samples, following protocols provided by the manufacturer. Then, the TruePrep DNA Library Prep Kit V2 or Nextera XT DNA Library Preparation Kit (Illumina) was utilized to build sequencing libraries from DNA samples for shotgun metagenomic sequencing, following manufacturers’ instructions. Finally, sequencing was conducted at paired-end 150 bp using the Illumina HiSeq System at BGI Americas. DNA extractions, library preparation and sequencing for all samples were conducted in one batch. Of 495 participants, buccal samples from seven control subjects, two from the SWMHS and five from the SCCS, were excluded due to low DNA yields. The remaining 488 participants, with sequencing data available, were included in downstream analyses.

2.2.2 |. Sequencing data processing

The quality statistics of sequencing data for each sample are summarized in Supporting Information Table 1. On average, 12.4 and 11.8 million raw sequencing reads were obtained for each sample from the SWMHS and SCCS, respectively. Raw reads were processed by Trimmomatic v0.39 [20] to trim low-quality bases (sequencing quality score <20), after which reads with fewer than 105 nucleotides, that is, 70% of the original read length, were discarded. Then, Bowtie2 v2.3.0 [21] was used to remove reads that could be mapped onto the human genome (GRCh38). After the quality-trimming and human read removal steps, averages of 8.5 and 8.1 million clean reads per sample for SWMHS and SCCS participants, respectively, were retained for downstream analyses. Kraken v2.1.1 [22] and Bracken v2.6 [23] were used for taxonomic profiling to estimate absolute abundance of microbial taxa, with oral bacterial genomes from the expanded Human Oral Microbiome Database (eHOMD) [5] as the reference. Within each sample, only taxa with a relative abundance of >0.001% were considered detected [24]. We chose Kraken and Bracken for taxonomic profiling because a recent study found that this combination showed the best taxonomic classification performance among 20 benchmarked metagenomic classifiers, including MetaPhlAn2 [25]. Functional profiling was performed using HUMAnN2 (v0.11.1) [26], with the UniRef90 comprehensive protein database as the reference, to estimate relative abundance of microbial gene families and metabolic pathways.
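As an illustration of the per-sample detection rule (relative abundance >0.001%), the sketch below filters a dictionary of read counts. The function name `detected_taxa` and the input layout are assumptions made for illustration; this is not the output format of Kraken or Bracken, and the counts are invented.

```python
def detected_taxa(read_counts, min_rel_abundance_pct=0.001):
    """Keep taxa whose within-sample relative abundance (in %) exceeds the threshold.

    read_counts: dict mapping taxon name -> estimated read count in one sample.
    Returns a dict mapping each detected taxon -> relative abundance in percent.
    """
    total = sum(read_counts.values())
    if total == 0:
        return {}
    rel = {taxon: 100.0 * n / total for taxon, n in read_counts.items()}
    return {taxon: pct for taxon, pct in rel.items() if pct > min_rel_abundance_pct}


# Invented counts for one sample with roughly 8 million clean reads.
sample = {"taxon_A": 8_000_000, "taxon_B": 500, "taxon_C": 50}
kept = detected_taxa(sample)  # taxon_C falls below 0.001% and is dropped
```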

2.3 |. Statistical analysis

We first evaluated associations of overall microbial richness (alpha diversity) and composition (beta diversity) with gastric cancer risk. Considering that both alpha and beta diversity estimates are affected by sequencing depth,27 we first rarefied the species level absolute abundance, that is, read counts, of every sample to the minimum number of clean reads (n = 451 330) of all samples, using the R function vegan::rarefy.28 Then, alpha diversity (Simpson’s reciprocal index and Shannon index) and beta diversity (Bray-Curtis dissimilarity matrix [Bray-Curtis], weighted UniFrac distance matrix [wUniFrac] and unweighted UniFrac distance [uwUniFrac]) were calculated based on the rarefied species level absolute abundance data using the R functions vegan::diversity and vegan::vegdist, respectively.28 Finally, the associations of Simpson’s reciprocal indexes and Bray-Curtis dissimilarity matrices with gastric cancer risk were examined through conditional logistic regression and MiRKAT,29 respectively.
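The alpha diversity measures named above have short closed forms: the Shannon index is −Σ p_i ln p_i and Simpson's reciprocal index is 1/Σ p_i², where p_i is the proportion of reads assigned to species i. The sketch below shows rarefaction by subsampling without replacement followed by both indices. It is a toy pure-Python illustration written for this purpose, not the vegan implementation used in the study, and it expands the read pool in memory, which is only practical for small examples.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a dict of species counts."""
    pool = [species for species, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


def shannon(counts):
    """Shannon index with natural logarithm: -sum(p_i * ln p_i)."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def inverse_simpson(counts):
    """Simpson's reciprocal index: 1 / sum(p_i ** 2)."""
    total = sum(counts.values())
    return 1.0 / sum((n / total) ** 2 for n in counts.values())
```

For a perfectly even community of k species, Simpson's reciprocal index equals k and the Shannon index equals ln k, which gives a quick sanity check on any implementation.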

We then evaluated the associations of microbial taxa, gene families and metabolic pathways with gastric cancer risk via conditional logistic regression analyses. Only microbial markers that were observed among >10% of all control participants were included in the analyses. In addition, given the large number of gene families, we limited our analyses to those with a median relative abundance of >0.0001% among all control subjects. Centered log-ratio (clr) transformation30 was utilized to normalize the absolute abundance of taxa at each taxonomic level, with zeros replaced by the minimum read count value of the whole dataset. Arcsine square root (asr) transformation was used to normalize the relative abundance of gene families and metabolic pathways. Covariates adjusted for in all analyses included age, highest level of education, pack-years of smoking (ethnic-specific tertiles among ever-smokers), alcohol drinking status (g/day), body mass index (BMI; weight [kg] divided by height [m] squared), diet quality (HEI-2010 standard),31 hypertension status at enrollment and type 2 diabetes (T2D) status at enrollment. We further conducted analyses stratified according to time of cancer diagnosis (within or beyond 5 years after sample collection), ethnicity, sex (male/female), smoking status (ever/never) and alcohol drinking status (ever/never). All of these analyses were performed separately for the SWMHS and SCCS, and results were combined through random- or fixed-effect meta-analyses, according to whether heterogeneity was detected (P for Cochran’s Q test < .05) or not, respectively, using the R function meta::metagen. All statistical analyses were based on two-sided tests, and associations with P < .05 were considered nominally significant.
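The two normalizations described above can be sketched in a few lines. For a sample with counts x_1, ..., x_D, the clr value of taxon i is ln x_i minus the mean of ln x_j over all taxa; the asr value of a relative abundance p (as a proportion between 0 and 1) is arcsin(√p). The function names and the `pseudocount` argument are illustrative assumptions; in the study, zeros were replaced by the minimum read count of the whole dataset, which the caller would pass in here.

```python
import math


def clr(counts, pseudocount):
    """Centered log-ratio transform of one sample's counts.

    Zeros are replaced by `pseudocount` before taking logs.
    """
    filled = [c if c > 0 else pseudocount for c in counts]
    logs = [math.log(x) for x in filled]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def asr(rel_abundance):
    """Arcsine square root transform of a relative abundance in [0, 1]."""
    return math.asin(math.sqrt(rel_abundance))
```

By construction the clr values of a sample sum to zero, so they describe each taxon relative to the geometric mean of the sample rather than on an absolute scale.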

3 |. RESULTS

3.1 |. Characteristics of study participants

Demographic characteristics of study participants are shown in Table 1. Of 488 participants, there were 107 cases and 212 controls from the SWMHS (105 sets with two controls per case and two sets with one control per case), and 58 cases and 111 controls from the SCCS (53 sets with two controls per case and five sets with one control per case). In general, participants included in the present study had relatively low educational and income levels, with only ~15% of subjects having graduated from high school and ~10% of subjects having income levels higher than the middle level. Cases and controls were comparable on all matching variables, including age at enrollment, sex, ethnicity (SCCS) and smoking status. For unmatched variables, compared to controls, cases had a higher prevalence of T2D and higher smoking pack-years among ever-smokers, although neither difference reached statistical significance. Within Asians and AAs, the proportion of ever-drinkers was higher among cases; however, within EAs, cases had a lower proportion of ever-drinkers. Compared to Asians, AAs and EAs were more likely to be ever-smokers (40.7% vs 77.1% vs 64.7%) and ever-drinkers (21.3% vs 45.8% vs 43.1%), and had a higher mean BMI value (23.8 vs 30.2 vs 31.2) and a higher proportion of low-income status (13.5% vs 54.2% vs 43.1%). In addition, at study enrollment, AA and EA subjects had a higher prevalence of hypertension (66.9% and 56.9%) and T2D (28.0% and 29.4%) than Asians (33.2% [hypertension] and 6.6% [T2D]).

TABLE 1.

Characteristics of study participants from the Shanghai Women’s Health Study and Shanghai Men’s Health Study (SWMHS) and Southern Community Cohort Study (SCCS)

Column groups: Asian (SWMHS; N = 319), African American (AA; SCCS; N = 118), European American (EA; SCCS; N = 51).

| Characteristics | Asian cases (N = 107) | Asian controls (N = 212) | P a | AA cases (N = 41) | AA controls (N = 77) | P a | EA cases (N = 17) | EA controls (N = 34) | P a |
|---|---|---|---|---|---|---|---|---|---|
| Age at enrollment b | 60.4 ± 9.7 | 60.4 ± 9.5 | 1.00 | 58.1 ± 10.1 | 58.0 ± 10.4 | .95 | 58.6 ± 9.6 | 58.2 ± 9.4 | .87 |
| Sex | | | 1.00 | | | .89 | | | 1.00 |
| Male | 66 (38.3%) | 130 (38.7%) | | 21 (48.8%) | 37 (51.9%) | | 11 (35.3%) | 22 (35.3%) | |
| Female | 41 (61.7%) | 82 (61.3%) | | 20 (51.2%) | 40 (48.1%) | | 6 (64.7%) | 12 (64.7%) | |
| Body mass index (BMI) b | 23.9 ± 2.8 | 23.7 ± 3.3 | .45 | 29.2 ± 8.4 | 30.7 ± 7.5 | .18 | 31.3 ± 6.7 | 31.1 ± 6.0 | .85 |
| Highest level of education | | | .48 | | | .80 | | | .24 |
| <High school | 55 (51.4%) | 95 (44.8%) | | 15 (36.6%) | 33 (42.9%) | | 6 (35.3%) | 5 (14.7%) | |
| High school | 37 (34.6%) | 79 (37.3%) | | 22 (53.7%) | 37 (48.0%) | | 8 (47.1%) | 22 (64.7%) | |
| College | 15 (14.0%) | 38 (17.9%) | | 4 (9.7%) | 7 (9.1%) | | 3 (17.6%) | 7 (20.6%) | |
| Income c | | | .97 | | | .80 | | | .43 |
| <Middle | 14 (13.1%) | 29 (13.7%) | | 22 (53.7%) | 42 (54.5%) | | 9 (52.9%) | 13 (38.2%) | |
| Middle | 84 (78.5%) | 164 (77.3%) | | 17 (41.4%) | 29 (37.7%) | | 6 (35.3%) | 12 (35.3%) | |
| >Middle | 9 (8.4%) | 19 (9.0%) | | 2 (4.9%) | 6 (7.8%) | | 2 (11.8%) | 9 (26.5%) | |
| Smoking status | | | .99 | | | .98 | | | 1.00 |
| Current | 29 (27.1%) | 56 (26.4%) | | 18 (43.9%) | 34 (44.2%) | | 3 (17.6%) | 6 (17.6%) | |
| Former | 15 (14.0%) | 30 (14.2%) | | 14 (34.1%) | 25 (32.5%) | | 8 (47.1%) | 16 (47.1%) | |
| Never | 63 (58.9%) | 126 (59.4%) | | 9 (22.0%) | 18 (23.3%) | | 6 (35.3%) | 12 (35.3%) | |
| Pack years of smoking b,d | 30.8 ± 23.2 | 27.2 ± 19.4 | .57 | 26.1 ± 20.5 | 20.4 ± 21.7 | .11 | 43.8 ± 28.0 | 37.0 ± 23.5 | .64 |
| Drinking status | | | 1.00 | | | .15 | | | .02 |
| Ever | 26 (24.3%) | 42 (19.8%) | | 23 (56.1%) | 31 (40.3%) | | 3 (17.6%) | 19 (55.9%) | |
| Never | 81 (75.7%) | 170 (80.2%) | | 18 (43.9%) | 46 (59.7%) | | 14 (82.4%) | 15 (44.1%) | |
| Drinking amount b,d | 2.3 ± 2.1 | 1.8 ± 1.2 | .63 | 3.7 ± 6.7 | 1.5 ± 2.6 | .28 | 2.1 ± 3.4 | 3.2 ± 6.6 | .96 |
| Hypertension at enrollment | | | .20 | | | .21 | | | .62 |
| Yes | 30 (28.0%) | 76 (35.8%) | | 31 (75.6%) | 48 (62.3%) | | 11 (64.7%) | 18 (52.9%) | |
| No | 77 (72.0%) | 136 (64.2%) | | 10 (24.4%) | 29 (37.7%) | | 6 (35.3%) | 16 (47.1%) | |
| Type 2 diabetes at enrollment | | | .49 | | | .08 | | | .74 |
| Yes | 9 (8.4%) | 12 (5.7%) | | 16 (39.0%) | 17 (22.1%) | | 4 (23.5%) | 11 (32.4%) | |
| No | 98 (91.6%) | 200 (94.3%) | | 25 (61.0%) | 60 (77.9%) | | 13 (76.5%) | 23 (67.6%) | |
a: P values were calculated through two-sided t-test or chi-squared (χ2) test with missing values excluded.

b: Mean ± standard deviation (SD) is presented for age at enrollment, BMI, pack years of smoking and drinking amount per day.

c: Annual per capita income (¥) and annual household income ($) are presented for SWMHS and SCCS participants, respectively. Middle level income was defined as ¥6000-¥10 000, ¥4000-¥8000 and $15 000-$50 000 for SWMHS men, SWMHS women and SCCS participants, respectively.

d: Pack years of smoking and drinking amount (grams per day) were calculated only among current and former smokers, and current and former drinkers, respectively.

3.2 |. Associations of alpha and beta diversity with gastric cancer risk

In general, Asians had a higher alpha diversity than AAs and EAs, with mean Simpson’s reciprocal indexes of 28.1, 13.3 and 11.9, respectively. A significant association between alpha diversity and decreased risk of gastric cancer was found among Asians (Supporting Information Figure 1), with mean Simpson’s reciprocal indexes of 25.5 in cases and 29.5 in controls. The odds ratio (OR) per standard deviation (SD) increase in Simpson’s reciprocal index was 0.68 (95% confidence interval [CI]: 0.52-0.90). No significant differences were observed in Simpson’s reciprocal index between cases and controls in either AAs or EAs. In meta-analyses, the Simpson’s reciprocal index retained a significant association with decreased gastric cancer risk (OR = 0.77; 95% CI: 0.61-0.99) (Supporting Information Figure 1). This association was more pronounced in never-drinkers, with an OR of 0.69 (95% CI: 0.51-0.93). Meta-analysis also revealed a significant inverse association with gastric cancer risk for Shannon index (OR = 0.76; 95% CI: 0.59-0.99).
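A reported OR and 95% CI can be converted back to an approximate standard error, z statistic and two-sided P value on the log-odds scale, which is a convenient way to check that an interval and its significance statement agree. The sketch below assumes a Wald-type interval that is symmetric on the log scale; the function name `wald_from_or_ci` is illustrative, and because the published values are rounded, the result is only approximate.

```python
import math


def wald_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover (SE, z, two-sided P) from an OR and its 95% CI.

    Assumes the CI is exp(log(OR) +/- z_crit * SE).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return se, z, p


# Meta-analysis estimate for Simpson's reciprocal index: OR = 0.77 (0.61-0.99).
se, z, p = wald_from_or_ci(0.77, 0.61, 0.99)
```

Applied to the pooled estimate for Simpson's reciprocal index, the recovered P value falls below .05, consistent with the upper confidence limit lying just under 1.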

No significant differences were found in beta diversity between gastric cancer cases and controls in the analysis of all samples combined, with P values of .76 (Bray-Curtis), .39 (wUniFrac) and .81 (uwUniFrac), respectively. Similarly, null associations were observed in cohort specific analyses (all P > .05).
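For reference, the Bray-Curtis dissimilarity between two samples is the sum of absolute differences in species counts divided by the total count across both samples; it ranges from 0 (identical profiles) to 1 (no shared species). The sketch below is a toy pure-Python version with an illustrative function name, not the vegan::vegdist call used in the study, and it does not cover the UniFrac distances, which additionally require a phylogenetic tree.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two dicts of species -> count."""
    taxa = set(sample_a) | set(sample_b)
    numerator = sum(abs(sample_a.get(t, 0) - sample_b.get(t, 0)) for t in taxa)
    denominator = sum(sample_a.values()) + sum(sample_b.values())
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator
```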

3.3 |. Associations of microbial taxa with gastric cancer risk

At the phylum level, Asian participants had higher median relative abundance of Proteobacteria (24.6% vs 12.0% vs 13.6%) and Fusobacteria (6.6% vs 1.4% vs 1.4%) but a lower level of Firmicutes (36.3% vs 50.8% vs 51.1%) compared to AA and EA subjects (Supporting Information Figure 2). Although the positivity rate of plasma/serum antibody to H. pylori was high among both SWMHS (~94%) and SCCS (~89% for AAs and ~69% for EAs) participants,32 this bacterium was not detected in buccal samples of any participants in the present study. We investigated six previously reported periodontal pathogens,8,33 including Porphyromonas gingivalis, Treponema denticola, Aggregatibacter actinomycetemcomitans, Prevotella intermedia, Tannerella forsythia and Fusobacterium nucleatum, in association with gastric cancer risk. Although none of these pathogenic bacteria showed associations in meta-analyses (all P > .05), P. gingivalis was found to be associated with an increased risk of gastric cancer among Asians. The OR per SD increase in clr-transformed absolute abundance of this bacterium in Asians was 1.37 (95% CI: 1.01-1.86), with P = .04 (Supporting Information Table 2).

In addition, a total of 427 other microbial taxa were included in meta-analyses, nine of which were found to be significantly associated with gastric cancer risk at P < .05 (Table 2 and Figure 1). Five of the nine taxa were associated with an increased risk of gastric cancer: four taxa under the phylum Proteobacteria, namely Betaproteobacteria (class), Neisseriales (order), Neisseriaceae (family) and Neisseria mucosa (species), and one species, Prevotella pleuritidis, from the phylum Bacteroidetes. The species N. mucosa was associated with increased gastric cancer risk, with an OR of 1.31 (95% CI: 1.03-1.67) and P = .03. This association possibly drove the associations observed for three taxa of higher taxonomic ranks, that is, Betaproteobacteria, Neisseriales and Neisseriaceae, all of which were associated with increased gastric cancer risk (P < .05). Of note, the association of N. mucosa with gastric cancer showed a consistent pattern across ethnic groups and reached significance in meta-analyses, and the relative abundance of N. mucosa in Asian participants was approximately 10 times higher than in AA and EA subjects (Supporting Information Table 3). The abundance of P. pleuritidis was comparable across all three ethnic groups (Supporting Information Table 3), and the association of this species with gastric cancer risk showed the same direction across ethnic groups, although only the pooled estimate from meta-analyses reached statistical significance (OR = 1.26; 95% CI: 1.00-1.57).

TABLE 2.

Microbial taxa associated with gastric cancer risk in the meta-analyses

| Taxa | Meta-analyses OR (95% CI) a | P a | Asians (SWMHS) OR (95% CI) b | P b | African Americans (SCCS) OR (95% CI) b | P b | European Americans (SCCS) OR (95% CI) b | P b |
|---|---|---|---|---|---|---|---|---|
| Phylum Proteobacteria | | | | | | | | |
| Betaproteobacteria (C) | 1.31 (1.03-1.67) | .03 | 1.22 (0.90-1.65) | .19 | 1.16 (0.67-2.00) | .59 | 2.29 (1.13-4.67) | .02 |
| Neisseriales (O) | 1.30 (1.03-1.65) | .03 | 1.17 (0.87-1.56) | .30 | 1.22 (0.72-2.05) | .46 | 2.32 (1.13-4.75) | .02 |
| Neisseriaceae (F) | 1.28 (1.01-1.62) | .04 | 1.14 (0.85-1.53) | .39 | 1.29 (0.76-2.18) | .35 | 2.31 (1.13-4.72) | .02 |
| Neisseria mucosa (S) | 1.31 (1.03-1.67) | .03 | 1.34 (0.99-1.81) | .06 | 1.24 (0.68-2.24) | .48 | 1.63 (0.72-3.68) | .24 |
| Phylum Bacteroidetes | | | | | | | | |
| Prevotella pleuritidis (S) | 1.26 (1.00-1.57) | .05 | 1.30 (0.99-1.69) | .06 | 1.15 (0.66-1.99) | .62 | 1.87 (0.79-4.42) | .15 |
| Phylum Tenericutes | 0.70 (0.55-0.88) | 2.98 × 10⁻³ | 0.58 (0.42-0.81) | 1.24 × 10⁻³ | 1.37 (0.79-2.37) | .26 | 1.16 (0.64-2.09) | .63 |
| Mycoplasma orale (S) | 0.74 (0.59-0.94) | .01 | 0.66 (0.49-0.89) | 7.18 × 10⁻³ | 0.96 (0.57-1.63) | .89 | 1.04 (0.52-2.06) | .92 |
| Phylum Firmicutes | | | | | | | | |
| Eubacterium yurii (S) | 0.80 (0.65-0.98) | .03 | 0.73 (0.57-0.93) | 9.69 × 10⁻³ | 0.63 (0.31-1.27) | .20 | 1.77 (0.89-3.52) | .10 |
| Phylum Actinobacteria | | | | | | | | |
| Cutibacterium (G) | 0.77 (0.62-0.96) | .02 | 0.87 (0.67-1.12) | .29 | 0.58 (0.32-1.07) | .08 | 0.65 (0.33-1.29) | .22 |

Abbreviations: C, class; CI, confidence interval; F, family; G, genus; O, order; OR, odds ratio; S, species.

a: ORs and 95% CIs per SD increase in centered log-ratio (clr)-transformed absolute abundance of taxa and P values were derived from meta-analyses.

b: ORs and 95% CIs per SD increase in clr-transformed absolute abundance of taxa and P values were calculated from conditional logistic regression models, additionally adjusting for age, education, pack-years of smoking, drinking status, hypertension status and type 2 diabetes (T2D) status.

FIGURE 1.

Forest plots of the associations between nine microbial taxa and gastric cancer risk. Odds ratio and 95% confidence intervals per SD increase in clr-transformed absolute abundance of taxa and P values were estimated via conditional logistic regression. P, phylum; C, class; F, family; G, genus; O, order; S, species

Among the remaining four taxa that were associated with a decreased risk of gastric cancer in meta-analyses at P < .05, the phylum Tenericutes showed the most significant association, with an OR of 0.70 (95% CI: 0.55-0.88) and P = 2.98 × 10⁻³. This association was mainly attributed to the species Mycoplasma orale, which was associated with a decreased risk of gastric cancer, with an OR of 0.74 (95% CI: 0.59-0.94) and P = .01. The association between Eubacterium yurii and decreased gastric cancer risk in meta-analyses was primarily attributable to data of Asians and AAs. Cutibacterium was associated with a decreased risk of gastric cancer in the meta-analyses (P = .02). The association patterns of this genus were consistent across all three ethnic groups; however, the OR appeared to be strongest among AAs. It is worth noting that the prevalence of Cutibacterium among AAs (69.5%) and EAs (70.6%) was more than five times higher than among Asians (13.5%) (Supporting Information Table 3).
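To illustrate how stratum-specific estimates are pooled, the sketch below applies fixed-effect inverse-variance weighting and computes Cochran's Q from the three ethnic-specific ORs for N. mucosa in Table 2. Two assumptions should be noted: standard errors are recovered from the rounded 95% CIs, and the three ethnic groups are pooled directly, whereas the study combined cohort-level results with the R function meta::metagen. The pooled value is therefore expected to be close to, but not necessarily identical with, the reported 1.31 (1.03-1.67). Function names are illustrative, and the random-effects branch is not shown.

```python
import math

Z_CRIT = 1.959964  # normal quantile for a two-sided 95% interval


def log_or_and_se(odds_ratio, lower, upper):
    """Convert an OR with 95% CI to (log OR, SE), assuming a Wald-type CI."""
    return math.log(odds_ratio), (math.log(upper) - math.log(lower)) / (2 * Z_CRIT)


def fixed_effect_pool(estimates):
    """Inverse-variance pooling of (log OR, SE) pairs.

    Returns (pooled OR, lower 95% limit, upper 95% limit, Cochran's Q).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    se_pooled = math.sqrt(1.0 / sum(weights))
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    return (math.exp(pooled),
            math.exp(pooled - Z_CRIT * se_pooled),
            math.exp(pooled + Z_CRIT * se_pooled),
            q)


# N. mucosa, Table 2: Asians, African Americans, European Americans.
n_mucosa = [log_or_and_se(1.34, 0.99, 1.81),
            log_or_and_se(1.24, 0.68, 2.24),
            log_or_and_se(1.63, 0.72, 3.68)]
pooled_or, ci_low, ci_high, cochran_q = fixed_effect_pool(n_mucosa)
```

Cochran's Q is compared with a chi-squared distribution with k − 1 degrees of freedom (here 2); a small Q relative to that reference indicates little evidence of heterogeneity, in which case the fixed-effect estimate would be the one reported.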

In stratified analyses by time of cancer diagnosis, except for P. pleuritidis, all the other eight taxa showed a significant association with gastric cancer risk only among cases identified beyond 5 years after sample collection (Supporting Information Table 4). In stratified analyses by ethnicity, within Asians, AAs and EAs, 18, seven and 19 taxa, respectively, were associated with gastric cancer risk at P < .05 (Supporting Information Table 2). Most association patterns were not consistent across the three ethnic groups except the few presented in Table 2. For example, among Asians, three phyla, Tenericutes, Actinobacteria and Proteobacteria, were significantly associated with gastric cancer risk (Supporting Information Table 2), even at a false discovery rate (FDR) of <0.05. However, none of these showed significant associations (P < .05) with gastric cancer among AAs or EAs, and association directions were inconsistent (Supporting Information Table 2). In analyses stratified by sex, smoking status and alcohol drinking status, few of the nine significant associations from meta-analyses remained significant among all stratification groups. Associations for Betaproteobacteria, Neisseriales and Neisseriaceae were significant in males, ever-smokers and ever-drinkers (Supporting Information Tables 5-7); however, N. mucosa was only significant among males (Supporting Information Table 5). The association for P. pleuritidis was not significant in any specific group (all P > .05). Both Tenericutes and M. orale were associated with significantly decreased risk of gastric cancer in never-smokers (P of 6.06 × 10⁻³ and .04) and never-drinkers (P of .03 and .02) (Supporting Information Tables 6 and 7). In addition, the association for Tenericutes was also significant in females (P = 7.56 × 10⁻³), while that for M. orale was significant in males (P = .04) (Supporting Information Table 5). E. yurii was significantly associated with increased gastric cancer risk in males (P = 9.53 × 10⁻³) (Supporting Information Table 5) and never-drinkers (P = .03) (Supporting Information Table 7), while a significant association of Cutibacterium with decreased risk of gastric cancer was only found in females (P = .02) (Supporting Information Table 5).
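The text above refers to associations that hold at a false discovery rate below 0.05 but does not name the FDR procedure. As an assumption made purely for illustration, the sketch below implements the widely used Benjamini-Hochberg step-up adjustment; the function name and the example P values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four taxa.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A taxon would be declared significant at FDR < 0.05 when its adjusted value is below 0.05.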

3.4 |. Associations of microbial gene families and metabolic pathways with gastric cancer risk

A total of 3080 gene families and 279 metabolic pathways were tested in the meta-analyses. Among them, 38 gene families (Supporting Information Table 8) and six pathways were significantly associated with gastric cancer risk at P < .05. Ten of the 38 gene families (gene ontology [GO] terms), five showing the highest ORs and the other five showing the lowest ORs, are presented in Table 3 and Figure 2. The top two GO terms that were most significantly associated with decreased gastric cancer risk were GO:0009010 (sorbitol-6-phosphate 2-dehydrogenase activity) and GO:0006062 (sorbitol catabolic process). The ORs per SD increase in asr-transformed relative abundance of these two GO terms were 0.64 (95% CI: 0.50-0.82) and 0.64 (95% CI: 0.50-0.82), with P values of 3.75 × 10⁻⁴ and 3.91 × 10⁻⁴, respectively. The maltose phosphorylase activity (GO:0050082) and the protein-containing complex (GO:0043234) were the top two gene families that were most significantly associated with increased risk of gastric cancer, with ORs of 1.27 (95% CI: 1.02-1.58) and 1.26 (95% CI: 1.03-1.55), respectively, and identical P values of .03.

TABLE 3.

Functional microbial markers associated with gastric cancer risk in the meta-analyses

| Functional markers | Function | Meta-analyses OR (95% CI) a | P a | Asians (SWMHS) OR (95% CI) b | P b | African Americans (SCCS) OR (95% CI) b | P b | European Americans (SCCS) OR (95% CI) b | P b |
|---|---|---|---|---|---|---|---|---|---|
| Selected c gene families | | | | | | | | | |
| GO:0009010 [MF] | sorbitol-6-phosphate 2-dehydrogenase activity | 0.64 (0.50-0.82) | 3.75 × 10⁻⁴ | 0.62 (0.45-0.84) | 1.92 × 10⁻³ | 0.76 (0.46-1.27) | .30 | 0.56 (0.23-1.38) | .21 |
| GO:0006062 [BP] | sorbitol catabolic process | 0.64 (0.50-0.82) | 3.91 × 10⁻⁴ | 0.62 (0.46-0.84) | 1.95 × 10⁻³ | 0.75 (0.44-1.25) | .27 | 0.59 (0.25-1.42) | .24 |
| GO:0016774 [MF] | phosphotransferase activity, carboxyl group as acceptor | 0.73 (0.57-0.92) | 8.89 × 10⁻³ | 0.64 (0.46-0.88) | 7.00 × 10⁻³ | 0.75 (0.46-1.23) | .25 | 1.18 (0.65-2.14) | .59 |
| GO:0008679 [MF] | 2-hydroxy-3-oxopropionate reductase activity | 0.74 (0.59-0.93) | 8.89 × 10⁻³ | 0.77 (0.59-1.01) | .06 | 0.60 (0.36-1.00) | .05 | 1.21 (0.59-2.50) | .60 |
| GO:0009605 [BP] | response to external stimulus | 0.75 (0.61-0.92) | 5.98 × 10⁻³ | 0.79 (0.61-1.02) | .07 | 0.76 (0.46-1.24) | .27 | 0.39 (0.16-0.96) | .04 |
| GO:0050082 [MF] | maltose phosphorylase activity | 1.27 (1.02-1.58) | .03 | 1.26 (0.96-1.65) | .10 | 0.97 (0.55-1.72) | .92 | 1.24 (0.67-2.29) | .50 |
| GO:0043234 [CC] | protein-containing complex | 1.26 (1.03-1.55) | .03 | 1.13 (0.89-1.44) | .32 | 1.76 (0.99-3.13) | .06 | 1.54 (0.80-2.97) | .20 |
| GO:0004185 [MF] | serine-type carboxypeptidase activity | 1.25 (1.01-1.56) | .04 | 1.27 (0.97-1.65) | .08 | 1.37 (0.81-2.33) | .24 | 1.20 (0.60-2.41) | .61 |
| GO:0015074 [BP] | DNA integration | 1.24 (1.01-1.51) | .04 | 1.26 (0.98-1.62) | .07 | 0.97 (0.57-1.65) | .92 | 1.19 (0.62-2.26) | .60 |
| GO:0015668 [MF] | type III site-specific deoxyribonuclease activity | 1.23 (1.01-1.50) | .04 | 1.17 (0.92-1.50) | .20 | 1.29 (0.80-2.07) | .30 | 1.21 (0.67-2.19) | .53 |
| Metabolic pathways | | | | | | | | | |
| HEXITOLDEGSUPER-PWY | Superpathway of hexitol degradation | 0.71 (0.57-0.88) | 1.75 × 10⁻³ | 0.71 (0.55-0.92) | .01 | 0.76 (0.44-1.32) | .33 | 0.68 (0.33-1.38) | .28 |
| P461-PWY | Hexitol fermentation to lactate, formate, ethanol and acetate | 0.69 (0.55-0.87) | 1.53 × 10⁻³ | 0.68 (0.52-0.90) | 7.67 × 10⁻³ | 0.79 (0.48-1.29) | .34 | 0.55 (0.24-1.26) | .16 |
| POLYAMINSYN3-PWY | Superpathway of polyamine biosynthesis II | 0.80 (0.64-0.99) | .04 | 0.82 (0.62-1.08) | .16 | 0.97 (0.60-1.57) | .91 | 0.69 (0.35-1.37) | .28 |
| PWY-7013 | L-1,2-propanediol degradation | 0.77 (0.63-0.94) | .01 | 0.76 (0.59-0.98) | .04 | 0.86 (0.52-1.42) | .55 | 0.78 (0.41-1.48) | .44 |
| PWY-5690 | TCA cycle II | 1.29 (1.04-1.59) | .02 | 1.18 (0.91-1.52) | .22 | 1.20 (0.70-2.04) | .51 | 2.26 (0.99-5.15) | .05 |
| PWY-7254 | TCA cycle VII | 1.25 (1.01-1.54) | .04 | 1.14 (0.88-1.46) | .32 | 1.05 (0.60-1.83) | .87 | 2.02 (0.97-4.21) | .06 |

Abbreviations: BP, biological process; CC, cellular component; CI, confidence interval; MF, molecular function; OR, odds ratio; TCA, tricarboxylic acid.

a: ORs and 95% CIs per SD increase in arcsine square root (asr)-transformed relative abundance of functional markers and P values were derived from meta-analyses.

b: ORs and 95% CIs per SD increase in asr-transformed relative abundance of functional markers and P values were calculated from conditional logistic regression models, additionally adjusting for age, education, pack-years of smoking, drinking status, hypertension status and T2D status.

c: Selected from 38 gene families that showed an association with gastric cancer risk at P < .05 in the combined data of SWMHS and SCCS. Ten gene families, five with the lowest ORs and five with the highest ORs, are presented. The full list of all 38 gene families is presented in Supporting Information Table 8.

FIGURE 2.

Forest plots of the associations between 10 selected microbial gene families and gastric cancer risk. Ten gene families, five with the highest odds ratios (ORs) and five with the lowest ORs, are shown. ORs and 95% confidence intervals per SD increase in asr-transformed relative abundance of gene families and P values were estimated via conditional logistic regression. BP, biological process; CC, cellular component; MF, molecular function

The six metabolic pathways that were significantly associated with gastric cancer risk in meta-analyses at P < .05 are shown in Table 3 and Figure 3. Consistent with the results for the gene families, two hexitol metabolism-related pathways, the superpathway of hexitol degradation (HEXITOLDEGSUPER-PWY) and the hexitol fermentation to lactate, formate, ethanol and acetate pathway (P461-PWY), were associated with decreased gastric cancer risk, with ORs of 0.71 (95% CI: 0.57-0.88) and 0.69 (95% CI: 0.55-0.87) and P values of 1.75 × 10−3 and 1.53 × 10−3, respectively. The superpathway of polyamine biosynthesis II (POLYAMINSYN3-PWY) was also associated with decreased risk of gastric cancer, with an OR of 0.80 (95% CI: 0.64-0.99) and P = .04. In contrast, two variants of the tricarboxylic acid (TCA) cycle (Krebs cycle), II and VII, were associated with increased gastric cancer risk, with ORs of 1.29 (95% CI: 1.04-1.59) and 1.25 (95% CI: 1.01-1.54) and P values of .02 and .04, respectively.

FIGURE 3.

Forest plots of the associations between six microbial metabolic pathways and gastric cancer risk. Odds ratios and 95% confidence intervals per SD increase in asr-transformed relative abundance of pathways and P values were estimated via conditional logistic regression.
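How group-specific estimates combine into a single meta-analytic OR can be sketched with a fixed-effect, inverse-variance model. Two assumptions are made here: the text says only that results were combined "by meta-analyses" without naming the model, and the three input estimates are taken to be the three group-specific columns that follow the combined estimate in the P461-PWY row of Table 3. Standard errors are recovered from the rounded 95% CIs, so small rounding differences are expected.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def log_or_and_se(odds_ratio, lower, upper):
    """Recover the log OR and its standard error from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_975)
    return math.log(odds_ratio), se


def fixed_effect_meta(estimates):
    """Inverse-variance pooled OR, 95% CI bounds and two-sided P value."""
    pairs = [log_or_and_se(*e) for e in estimates]
    weights = [1.0 / se ** 2 for _, se in pairs]
    pooled = sum(w * b for w, (b, _) in zip(weights, pairs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    # Two-sided normal P value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(pooled / pooled_se) / math.sqrt(2))
    return (
        math.exp(pooled),
        math.exp(pooled - Z_975 * pooled_se),
        math.exp(pooled + Z_975 * pooled_se),
        p_value,
    )


# (OR, lower, upper) for P461-PWY in the three group-specific columns.
p461_by_group = [(0.68, 0.52, 0.90), (0.79, 0.48, 1.29), (0.55, 0.24, 1.26)]
pooled_or, ci_low, ci_high, p_value = fixed_effect_meta(p461_by_group)
```

Under these assumptions the pooled estimate rounds to the combined OR and CI reported for P461-PWY, 0.69 (0.55-0.87), and the P value is of the same order as the reported 1.53 × 10−3. This agreement is consistent with, but does not prove, a fixed-effect model.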

In analyses stratified by the time interval between sample collection and cancer diagnosis, 12 of the 38 gene families and four of the six pathways showed a significant association with gastric cancer in the stratum comprising cases diagnosed more than 5 years after sample collection (Supporting Information Table 9). Among the 38 gene families, significant associations (P < .05) were observed more often in males (n = 11 vs n = 5 in females; Supporting Information Table 10), ever-smokers (n = 27 vs n = 4 in never-smokers; Supporting Information Table 11) and never-drinkers (n = 19 vs n = 3 in ever-drinkers; Supporting Information Table 12). For the six metabolic pathways, significant associations (P < .05) were observed almost exclusively among males (n = 5 vs none in females; Supporting Information Table 10) and ever-smokers (n = 6 vs none in never-smokers; Supporting Information Table 11). When stratified by drinking status, associations for the two hexitol metabolism-related pathways were significant only among never-drinkers, while the two TCA cycles showed significant associations only among ever-drinkers (Supporting Information Table 12).

Within Asians, AAs and EAs, 127, 71 and 152 gene families were found to be associated with gastric cancer risk at P < .05, respectively (Supporting Information Table 2). In addition, 24, two and four metabolic pathways also showed associations with gastric cancer risk within these three ethnic groups, respectively (Supporting Information Table 2). Similar to findings for microbial taxa, few of these associations could be consistently observed across all ethnic groups, with the exception of those shown in Table 3 and Supporting Information Table 8. For instance, of the 151 gene families and pathways that showed significant associations (P < .05) with gastric cancer in Asians, only ~10% (n = 15) showed consistent association directions in both AAs and EAs (Supporting Information Table 2).
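The notion of a "consistent association direction" across groups can be made concrete with a short sketch. The check below treats a marker as consistent when all group-specific ORs fall on the same side of 1. The pathway ORs are the three group-specific point estimates from Table 3 (their mapping to Asians, AAs and EAs is an assumption based on column order), and `hypothetical_marker` is an invented counterexample.

```python
def same_direction(odds_ratios):
    """True when every OR lies on the same side of the null value 1."""
    return all(o > 1 for o in odds_ratios) or all(o < 1 for o in odds_ratios)


# Counts reported in the text for Asians: 127 gene families + 24 pathways.
n_markers_asians = 127 + 24
n_consistent = 15
share_consistent = n_consistent / n_markers_asians

# Group-specific ORs from Table 3, plus one invented inconsistent marker.
group_ors = {
    "P461-PWY": (0.68, 0.79, 0.55),
    "POLYAMINSYN3-PWY": (0.82, 0.97, 0.69),
    "PWY-7013": (0.76, 0.86, 0.78),
    "PWY-5690": (1.18, 1.20, 2.26),
    "PWY-7254": (1.14, 1.05, 2.02),
    "hypothetical_marker": (1.30, 0.90, 1.10),
}
consistent = [name for name, ors in group_ors.items() if same_direction(ors)]
```

The two counts for Asians sum to the 151 markers cited above, and 15 of 151 is just under 10%. The five Table 3 pathways listed here all pass the direction check, in line with the statement that the Table 3 markers are the exception.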

4 |. DISCUSSION

In this first prospective study of oral microbiome and gastric cancer risk, decreased overall microbial richness and abundance of Tenericutes, M. orale, E. yurii and Cutibacterium, and increased abundance of Betaproteobacteria, Neisseriales, Neisseriaceae, N. mucosa, and P. pleuritidis, were associated with an increased risk of gastric cancer. In addition, decreased abundance of four hexitol metabolism-related microbial gene families and metabolic pathways, and increased abundance of microbial TCA cycles II and VII were associated with increased gastric cancer risk. Our findings suggest that, in the human mouth, overall microbial richness and multiple microbes may play etiological roles in gastric cancer.

In a study of the tongue coating microbiome, alpha diversity was higher in healthy participants than in patients with gastric cancer.15 Two other studies of gastric biopsies found that the gastric microbiota of healthy individuals had higher alpha diversity than that of patients with gastric cancer.34,35 Furthermore, a more recent study found that saliva samples from gastritis patients had higher alpha diversity than those from patients with gastric cancer.16 Together, these findings suggest that higher alpha diversity in the human mouth and gastric tissues may protect hosts from gastric tumorigenesis. Our study is the first prospective study to show that higher alpha diversity in prediagnostic buccal samples is associated with decreased gastric cancer risk.
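The alpha diversity measure reported in this study is Simpson's reciprocal index. A minimal sketch of that index, computed from per-taxon counts, is given below. The function name and the example communities are illustrative and are not taken from the study's pipeline.

```python
def simpson_reciprocal(counts):
    """Simpson's reciprocal index, 1 / sum(p_i^2), from per-taxon counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


# Hypothetical communities with four taxa each.
even_community = [25, 25, 25, 25]    # all taxa equally abundant
dominated_community = [97, 1, 1, 1]  # one taxon dominates

even_index = simpson_reciprocal(even_community)
dominated_index = simpson_reciprocal(dominated_community)
```

The index equals the number of taxa when all are equally abundant and approaches 1 as a single taxon dominates, so higher values indicate a richer, more even community.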

Poor oral health has been associated with increased risk of gastric cancer.12 A recent study in the United States compared the abundance of periodontal pathogens in saliva and dental plaque samples from 35 patients with precancerous gastric lesions and 70 controls and found that the burdens of T. forsythia, T. denticola and A. actinomycetemcomitans were associated with an increased risk of precancerous gastric lesions.36 We found that P. gingivalis was associated with increased gastric cancer risk among Asians. This pathogen was also associated with risk of pancreatic cancer and esophageal squamous cell carcinoma in two prospective studies of the oral microbiome among EAs.8,10

Betaproteobacteria, Neisseriales, Neisseriaceae and N. mucosa were associated with increased gastric cancer risk in our study. In a parallel case-control study conducted by our team,37 in which shotgun metagenomic sequencing was used to assess the oral microbiome in buccal samples from 89 patients with precancerous gastric lesions and 89 matched controls, all four taxa were consistently associated with increased risk of precancerous gastric lesions, and the associations for Neisseriales and Neisseriaceae reached statistical significance (P < .05). In addition, among 97 subjects with both buccal and gastric microbiome data available, the abundance of these four taxa in buccal samples was highly correlated (Spearman correlation r of .75-.78, all P < .01) with that in gastric samples.37 N. mucosa is a commensal bacterium of the human upper respiratory tract38 and has rarely been reported to be associated with human diseases.

Tenericutes, M. orale and E. yurii showed inverse associations with gastric cancer in the present study. Tenericutes consists of gram-negative bacteria without cell walls and was reported to be more abundant among healthy controls than patients with gastric cancer by the earlier-mentioned tongue coating microbiome study.13 M. orale is a commensal bacterium in the healthy human oropharynx, and no study has associated it with any human cancer. Very little information is available for E. yurii. Cutibacterium acnes is a common species of Cutibacterium and was associated with decreased gastric cancer risk among AAs in the present study. This species is found mainly in the skin microbiota of healthy individuals.39 The role of these bacteria in the mouth, and their association with gastric cancer, are not clear. Tenericutes, M. orale and E. yurii were also detected in gastric samples in a parallel case-control study conducted by our team,37 and their abundance in gastric samples was correlated (Spearman correlation r of .56-.60, all P < .01) with that in buccal samples.37
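The buccal-gastric concordance reported in this and the preceding paragraph is expressed as a Spearman correlation, that is, a Pearson correlation computed on ranks. The sketch below is a plain-Python illustration with tied values given average ranks. It is not the authors' code, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation; undefined if either input is constant."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)
```

Because only ranks enter the calculation, the coefficient is unchanged by monotone transformations of abundance, which makes it a natural choice when buccal and gastric abundances are measured on different scales.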

GO:0006062, GO:0009010, HEXITOLDEGSUPER-PWY and P461-PWY, all associated with decreased gastric cancer risk in the present study, are related to hexitol metabolism. In addition, all except the HEXITOLDEGSUPER-PWY were consistently associated with decreased risk of precancerous gastric lesions at P < .05 in the aforementioned parallel case-control study by our team.37 In general, hexitols are degraded through HEXITOLDEGSUPER-PWY and fermented to lactate, formate, ethanol and acetate through P461-PWY. The sorbitol catabolic biological process (GO:0006062) belongs to the biological process of hexitol catabolism. The activity of sorbitol-6-phosphate 2-dehydrogenase (GO:0009010) belongs to the oxidoreductases family and is included in P461-PWY. Hexitols are used as sugar substitutes to prevent the formation of dental caries, which requires the fermentation of sugars to acid by microbes in dental plaque.40 The inverse associations of gastric cancer with multiple metabolic markers in hexitol metabolism, coupled with a previously reported link between periodontal disease and gastric cancer,41 suggest that hexitols may be involved in gastric cancer etiology.

The main limitation of the present study is the limited sample size for ethnic- or sex-specific analyses, although our study is the largest on this topic to date. In addition, buccal samples were collected only once for study participants, which made it difficult to evaluate the dynamics of the oral microbiota and its relationship to the development of gastric cancer. Furthermore, buccal samples were collected up to 10 years before the diagnosis of gastric cancer, which raises a concern that the oral microbiome may have changed over time. However, previous studies have suggested that the oral microbiome is quite stable over long periods of time.42 Moreover, additional analyses stratified by the time interval between sample collection and cancer diagnosis (≤5 vs >5 years) showed similar association patterns. Finally, different protocols, that is, brushing plus rinsing in the SWMHS and mouth washing in the SCCS, were used to collect oral samples, which may have introduced heterogeneity and reduced the chance of finding consistent associations.

Our study has several strengths. First, it is the first prospective investigation of the oral microbiome and gastric cancer risk. Our time lag analyses indicated that most of our findings were unlikely to be attributable to reverse causation. In addition, the present study employed shotgun metagenomic sequencing, which not only provided species-level resolution of the microbial profile but also made it possible to directly assess associations of functional microbial markers with gastric cancer risk. Furthermore, our study is the first population-based study involving participants at high risk of gastric cancer (Asians, AAs) and an understudied population (AAs). Finally, a variety of established and/or possible risk factors for gastric cancer, including amounts of smoking and alcohol drinking, education level, BMI, diet quality, hypertension and T2D, were adjusted for throughout the analyses, which reduced the possibility that our findings were confounded by these factors.

In conclusion, our prospective metagenomics study revealed that overall microbial richness, nine microbial taxa, 38 microbial gene families and six microbial metabolic pathways were associated with gastric cancer risk. Our findings provide evidence supporting the hypothesis that the oral microbiome may play a role in the etiology of gastric cancer. Larger studies in the future are needed to validate our findings.


What’s new?

Previous studies of oral microbiome and gastric cancer risk were mainly conducted in Asians and were limited by retrospective designs and low taxonomic resolutions. This study is the first multi-racial study to prospectively investigate the relationship between oral microbiome and gastric cancer risk, utilizing shotgun metagenomic sequencing to assess the microbiome in prediagnostic buccal samples. Decreased overall microbial richness, altered abundance of several microbial taxa, and multiple microbial functional markers were found to be associated with gastric cancer risk. The study supports the hypothesis that oral microbiota may play a role in gastric cancer etiology.

ACKNOWLEDGMENTS

The authors wish to thank study participants and research team members who have enabled this work to be carried out. We thank Regina Courtney, Dr Hui Cai and Dr Mary Shannon Byers for their help with sample preparation, data management, technical support and editorial assistance for this project. This project was supported by the National Cancer Institute (NCI) grant R01CA204113. Parent studies, the SMHS, the SWHS and the SCCS, were supported by National Institutes of Health (NIH) grants UM1CA173640, UM1CA182910 and U01CA202979, respectively. The data analyses were conducted using the Advanced Computing Center for Research and Education (ACCRE) at Vanderbilt University. Sample preparation was conducted at the Survey and Biospecimen Shared Resources, which is supported in part by the Vanderbilt-Ingram Cancer Center (P30 CA68485). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the study.

Funding information

National Cancer Institute, Grant/Award Numbers: R01CA204113, U01CA202979, UM1CA173640, UM1CA182910; Vanderbilt-Ingram Cancer Center, Grant/Award Number: P30 CA68485

Abbreviations: AA, African American; BMI, body mass index; CI, confidence interval; EA, European American; eHOMD, expanded Human Oral Microbiome Database; FDR, false discovery rate; GO, gene ontology; ICD, International Classification of Diseases; OR, odds ratio; SCCS, Southern Community Cohort Study; SWMHS, Shanghai Women’s Health Study and the Shanghai Men’s Health Study; T2D, type 2 diabetes; TCA, tricarboxylic acid; uwUniFrac, unweighted UniFrac; wUniFrac, weighted UniFrac.

Footnotes

CONFLICT OF INTEREST

The authors declare no potential conflicts of interest.

ETHICS STATEMENT

The SWMHS was approved by the Institutional Review Boards at Shanghai Cancer Institute and Vanderbilt University Medical Center. The SCCS was approved by the Institutional Review Boards at Vanderbilt University Medical Center and Meharry Medical College. Written informed consent was obtained from all study participants. No animal experiments were involved in the present study.

SUPPORTING INFORMATION

Additional supporting information may be found in the online version of the article at the publisher’s website.

DATA AVAILABILITY STATEMENT

Metagenomic sequencing data generated in our study are available in the database of Genotypes and Phenotypes (dbGaP) under accession code phs002566.v1.p1. Investigators who would like to access individual-level study data should submit an application to the NIH Data Access Committee (DAC) to request the datasets. Further information is available from the corresponding author upon request.

REFERENCES

  • 1. Sung H, Ferlay J, Siegel RL, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–249.
  • 2. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68:394–424.
  • 3. Shah SC, McKinley M, Gupta S, Peek RM Jr, Martinez ME, Gomez SL. Population-based analysis of differences in gastric cancer incidence among races and ethnicities in individuals age 50 years and older. Gastroenterology. 2020;159:1705–14.e2.
  • 4. Rawla P, Barsouk A. Epidemiology of gastric cancer: global trends, risk factors and prevention. Prz Gastroenterol. 2019;14:26.
  • 5. Escapa IF, Chen T, Huang Y, Gajare P, Dewhirst FE, Lemon KP. New insights into human nostril microbiome from the expanded Human Oral Microbiome Database (eHOMD): a resource for the microbiome of the human aerodigestive tract. mSystems. 2018;3:e00187–e00118.
  • 6. Sampaio-Maia B, Caldas I, Pereira M, Perez-Mongiovi D, Araujo R. The oral microbiome in health and its implication in oral and systemic diseases. Adv Appl Microbiol. 2016;97:171–210.
  • 7. Hayes RB, Ahn J, Fan X, et al. Association of oral microbiome with risk for incident head and neck squamous cell cancer. JAMA Oncol. 2018;4:358–365.
  • 8. Fan X, Alekseyenko AV, Wu J, et al. Human oral microbiome and prospective risk for pancreatic cancer: a population-based nested case-control study. Gut. 2018;67:120–127.
  • 9. Yang Y, Cai Q, Shu XO, et al. Prospective study of oral microbiome and colorectal cancer risk in low-income and African American populations. Int J Cancer. 2019;144:2381–2389.
  • 10. Peters BA, Wu J, Pei Z, et al. Oral microbiome composition reflects prospective risk for esophageal cancers. Cancer Res. 2017;77:6777–6787.
  • 11. Hosgood HD, Cai Q, Hua X, et al. Variation in oral microbiome is associated with future risk of lung cancer among never-smokers. Thorax. 2021;76:256–263.
  • 12. Ndegwa N, Ploner A, Liu Z, Roosaar A, Axéll T, Ye W. Association between poor oral health and gastric cancer: a prospective cohort study. Int J Cancer. 2018;143:2281–2288.
  • 13. Wu J, Xu S, Xiang C, et al. Tongue coating microbiota community and risk effect on gastric cancer. J Cancer. 2018;9:4039.
  • 14. Sun JH, Li XL, Yin J, Li YH, Hou BX, Zhang Z. A screening method for gastric cancer by oral microbiome detection. Oncol Rep. 2018;39:2217–2224.
  • 15. Xu S, Xiang C, Wu J, et al. Tongue coating bacteria as a potential stable biomarker for gastric cancer independent of lifestyle. Dig Dis Sci. 2020;66(9):1–17.
  • 16. Huang K, Gao X, Wu L, et al. Salivary microbiota for gastric cancer prediction: an exploratory study. Front Cell Infect Microbiol. 2021;11(640309):1–10.
  • 17. Zheng W, Chow W-H, Yang G, et al. The Shanghai women’s health study: rationale, study design, and baseline characteristics. Am J Epidemiol. 2005;162:1123–1131.
  • 18. Shu X-O, Li H, Yang G, et al. Cohort profile: the shanghai men’s health study. Int J Epidemiol. 2015;44:810–818.
  • 19. Signorello LB, Hargreaves MK, Blot WJ. The southern community cohort study: investigating health disparities. J Health Care Poor Underserv. 2010;21:26.
  • 20. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120.
  • 21. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–359.
  • 22. Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol. 2019;20:257.
  • 23. Lu J, Breitwieser FP, Thielen P, Salzberg SL. Bracken: estimating species abundance in metagenomics data. PeerJ Comput Sci. 2017;3:e104.
  • 24. Shao Y, Forster SC, Tsaliki E, et al. Stunted microbiota and opportunistic pathogen colonization in caesarean-section birth. Nature. 2019;574:117–121.
  • 25. Simon HY, Siddle KJ, Park DJ, Sabeti PC. Benchmarking metagenomics tools for taxonomic classification. Cell. 2019;178:779–794.
  • 26. Franzosa EA, McIver LJ, Rahnavard G, et al. Species-level functional profiling of metagenomes and metatranscriptomes. Nat Methods. 2018;15:962.
  • 27. Weiss S, Xu ZZ, Peddada S, et al. Normalization and microbial differential abundance strategies depend upon data characteristics. Microbiome. 2017;5:27.
  • 28. Oksanen J, Blanchet FG, Kindt R, et al. Package ‘vegan’. Community Ecology Package, Version 2, 1–295; 2013.
  • 29. Zhao N, Chen J, Carroll IM, et al. Testing in microbiome-profiling studies with MiRKAT, the microbiome regression-based kernel association test. Am J Hum Genet. 2015;96:797–807.
  • 30. Gloor GB, Wu JR, Pawlowsky-Glahn V, Egozcue JJ. It’s all relative: analyzing microbiome data as compositions. Ann Epidemiol. 2016;26:322–329.
  • 31. Guenther PM, Casavale KO, Reedy J, et al. Update of the healthy eating index: HEI-2010. J Acad Nutr Diet. 2013;113:569–580.
  • 32. Epplein M, Signorello LB, Zheng W, et al. Race, African ancestry, and Helicobacter pylori infection in a low-income United States population. Cancer Epidemiol Prev Biomark. 2011;20:826–834.
  • 33. Newman MG, Takei H, Klokkevold PR, Carranza FA. Carranza’s Clinical Periodontology. St Louis, Missouri: Elsevier Health Sciences; 2011.
  • 34. Gunathilake MN, Lee J, Choi IJ, et al. Association between the relative abundance of gastric microbiota and the risk of gastric cancer: a case-control study. Sci Rep. 2019;9:1–11.
  • 35. Gunathilake M, Lee J, Choi IJ, et al. Alterations in gastric microbial communities are associated with risk of gastric cancer in a Korean population: a case-control study. Cancer. 2020;12:2619.
  • 36. Sun J, Zhou M, Salazar CR, et al. Chronic periodontal disease, periodontal pathogen colonization, and increased risk of precancerous gastric lesions. J Periodontol. 2017;88:1124–1134.
  • 37. Wu F, Yang L, Hao Y, et al. Foregut microbiome and gastric intestinal metaplasia. Forthcoming 2021. 10.1002/ijc.33848
  • 38. Ryan KJ, Ray CG. Medical Microbiology. McGraw Hill. 2004;4:370.
  • 39. Kolar SL, Tsai C-M, Torres J, Fan X, Li H, Liu GY. Propionibacterium acnes-induced immunopathology correlates with health and disease association. JCI Insight. 2019;4(5):1–11.
  • 40. Mäkinen KK. Sugar alcohol sweeteners as alternatives to sugar with special consideration of xylitol. Med Princ Pract. 2011;20:303–320.
  • 41. Lo C-H, Kwon S, Wang L, et al. Periodontal disease, tooth loss, and risk of oesophageal and gastric adenocarcinoma: a prospective study. Gut. 2021;70:620–621.
  • 42. Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI, Knight R. Bacterial community variation in human body habitats across space and time. Science. 2009;326:1694–1697.
