Carcinogenesis. 2014 Jul 18;35(9):2089–2096. doi: 10.1093/carcin/bgu131

Fecal metabolomics: assay performance and association with colorectal cancer

James J Goedert 1,*, Joshua N Sampson 1, Steven C Moore 1, Qian Xiao 1, Xiaoqin Xiong 1, Richard B Hayes 2, Jiyoung Ahn 2, Jianxin Shi 1, Rashmi Sinha 1
PMCID: PMC4146421  PMID: 25037050

Summary

Of 1043 molecules in feces, 26 were increased and 15 were decreased in CRC patients. Because fecal metabolites vary over time, very large studies with prediagnostic specimens will be needed to reliably define molecular pathways that affect colorectal carcinogenesis.

Abstract

Metabolomic analysis of feces may provide insights on colorectal cancer (CRC) if assay performance is satisfactory. In lyophilized feces from 48 CRC cases, 102 matched controls, and 48 masked quality control specimens, 1043 small molecules were detected with a commercial platform. Assay reproducibility was good for 527 metabolites [technical intraclass correlation coefficient (ICC) >0.7 in quality control specimens], but reproducibility in 6-month paired specimens was lower for the majority of metabolites (within-subject ICC ≤0.5). In the CRC cases and controls, significant differences (false discovery rate ≤0.10) were found for 41 of 1043 fecal metabolites. Direct cancer association was found with three fecal heme-related molecules [covariate-adjusted 90th versus 10th percentile odds ratio (OR) = 17–345], 18 peptides/amino acids (OR = 3–14), palmitoyl-sphingomyelin (OR = 14), mandelate (OR = 3) and p-hydroxy-benzaldehyde (OR = 4). Conversely, cancer association was inverse with acetaminophen metabolites (OR <0.1), tocopherols (OR = 0.3), sitostanol (OR = 0.2), 3-dehydrocarnitine (OR = 0.4), pterin (OR = 0.3), conjugated-linoleate-18-2N7 (OR = 0.2), N-2-furoyl-glycine (OR = 0.3) and p-aminobenzoate (PABA, OR = 0.2). Correlations suggested an independent role for palmitoyl-sphingomyelin and a central role for PABA (which was stable over 6 months, within-subject ICC 0.67) modulated by p-hydroxy-benzaldehyde. Power calculations based on ICCs indicate that only 45% of metabolites with a true relative risk 5.0 would be found in prospectively collected, prediagnostic specimens from 500 cases and 500 controls. Thus, because fecal metabolites vary over time, very large studies will be needed to reliably detect associations of many metabolites that potentially contribute to CRC.

Introduction

Metabolomics technology can measure hundreds of small molecules in a biospecimen, which may advance understanding of mechanisms and facilitate early diagnosis of disease. Colorectal cancer (CRC) is a good candidate, as it arises as a consequence of genetic mutations that accumulate and are driven by carcinogens related to diet (1). By convention, biomarker assays have been developed for serum or urine, but this approach may be limited because these fluids are anatomically remote from the gut mucosa in which CRC arises. Metabolomic study of feces may prove to be more effective, because feces is in close proximity to the colorectal mucosa and is a product of interactions between dietary components and the microbiota. In addition to host cell metabolites that may be shed into the gut lumen, the gut microbiota is thought to contribute to colorectal neoplasia through several immunologic and metabolic pathways (2,3). The gut microbiota produces many important metabolites including short-chain fatty acids, biotin and vitamin K (4,5) and may affect the metabolism of suspected dietary carcinogens, such as benzo(a)pyrene and acetaldehyde (6–10).

To date, metabolomic analyses of fecal samples have mostly been restricted to experimental studies in animals and small cross-sectional studies in humans (2,5,11). A few have considered CRC (12–16), but comprehensive identification of human CRC-associated metabolic end products is lacking. Thus, the current study had three objectives. We quantified technical, between-subject and within-subject variances in fecal metabolites as determined by ultra high-performance liquid phase chromatography and gas chromatography coupled with tandem mass spectrometry (HPLC-GC/MS-MS). Then, in a case–control study, we compared fecal metabolomic profiles in a small group of well-characterized CRC cases and matched controls. Finally, we estimated expected power for a larger case–control study nested within a prospective cohort to detect true associations between a fecal metabolite and a subsequent disease event.

Materials and methods

Study participants and specimens

The current project used data and stored frozen specimens from a CRC case–control study of fecal mutagens that was reviewed and approved by an Institutional Review Board at the National Cancer Institute (17,18). Briefly, during 1985–89 at three Washington DC area hospitals, following signed informed consent, patients suspected to have CRC were recruited before surgery or initiation of treatment. Only newly diagnosed, histologically confirmed cases of adenocarcinoma of the colon or rectum were retained. Likewise, contemporaneous patients awaiting elective surgery for non-oncologic, non-gastrointestinal conditions at these hospitals were recruited as controls. A median of 6 days (interquartile range 3–13 days) prior to hospitalization and surgery, participants completed diet and demographic questionnaires and provided 2-day fecal samples that were frozen at home on dry ice and subsequently lyophilized. The 2-day lyophilates were pooled, mixed and stored at −40°C.

Of 69 cases and 114 controls in the original study (17,18), the case–control analysis included 48 cases and 102 controls for whom at least 100 mg of lyophilized feces was available. Controls were frequency matched to cases by gender and body mass index. Tumors in the cases were classified by stage and site in the large intestine.

To quantify technical variability of the metabolomics assay, 48 identical, masked aliquots of lyophilized feces were prepared from the same patients, including 11 from each of 2 controls, and 26 from 7 other controls. To quantify within-subject variability over time, half of the 48 replicate control specimens were collected 6 months after the baseline specimens, using identical collection and handling methods.

Laboratory methods

A range of small molecules (most <1000 Daltons) was detected in the lyophilized fecal specimens by HPLC-GC/MS-MS (Metabolon, Durham, NC) as described previously (19,20). Briefly, non-targeted single methanol extraction was performed, followed by protein precipitation. Individual molecules and their relative levels were identified by comparing mass spectral peaks, retention times and mass-to-charge ratios with a chemical reference library generated from 2500 standards. The molecules include, but are not limited to, amino acids, carbohydrates, fatty acids, androgens and xenobiotics. Volatile molecules, such as short-chain fatty acids, may be lost during lyophilization or extraction. However, such loss is generally equivalent across specimens, and lyophilization is optimal for fecal specimens to assure equal loading of dry weight.

Assay performance statistical methods

As we have previously evaluated overall reliability and validity of this platform for serum and urine (21), herein we assessed the technical, within-subject and between-subject variability of the fecal metabolite data by calculating intraclass correlation coefficients (ICC) for the fecal quality control samples, as follows. We decomposed the total variance of each metabolite, $\sigma_T^2$, into three different components: the between-subject variance, $\sigma_B^2$, which represents the variance of the ‘usual’ level for subjects in a population; the within-subject variance, $\sigma_W^2$, which represents the variability over time around the ‘usual’ level within an individual; and the technical variance, $\sigma_E^2$, which is the variance introduced by measurement error in the laboratory procedures.

From these three variance components, we defined the following additional quantities:

  • 1) Technical ICC: the proportion of the total variance that is attributed to biological variance, as opposed to random laboratory variation. High technical ICC indicates high laboratory reproducibility.

$$\mathrm{ICC} = \frac{\sigma_B^2 + \sigma_W^2}{\sigma_T^2} = 1 - \frac{\sigma_E^2}{\sigma_T^2}$$

  • 2) Within-subject ICC: the ratio of between-subject variance/total variance ($\pi_{TB}$). Higher $\pi_{TB}$ indicates higher stability over time in vivo and thus higher power to act as a marker for long-term risk analysis.

$$\pi_{TB} = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_W^2 + \sigma_E^2}$$
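
As a minimal sketch of the two ratios defined above, the following Python function computes both ICCs from the three variance components. The function name and the numeric inputs are hypothetical illustrations, not values from the study, and the variance components are assumed to have been estimated already (for example, from a mixed model fitted to the replicate data).

```python
def variance_icc(var_between, var_within, var_technical):
    """Return (technical ICC, within-subject ICC) from three variance components.

    Hypothetical helper: the components are assumed to be estimated elsewhere.
    """
    total = var_between + var_within + var_technical
    if total <= 0:
        raise ValueError("total variance must be positive")
    # Biological share of total variance, equal to 1 - var_technical / total
    technical_icc = (var_between + var_within) / total
    # Between-subject share of total variance
    within_subject_icc = var_between / total
    return technical_icc, within_subject_icc


# Illustrative (made-up) variance components
tech, within = variance_icc(var_between=0.5, var_within=0.4, var_technical=0.1)
print(round(tech, 2), round(within, 2))
```

With these made-up inputs, a metabolite can have high laboratory reproducibility and still only moderate stability over time, because the within-subject variance counts as biological signal in the first ratio but as noise in the second.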

Case–control statistical methods

Demographic data for the cases and controls were compared by Fisher’s exact test for categorical variables. For this pilot study of fecal metabolites, our primary analysis modeled the association between each metabolite and CRC by unconditional logistic regression, adjusting for body mass index, age, gender, race and hospital. Metabolite values below the level of detection were assigned the minimum observed value for that metabolite. For metabolites present in <80% of the individuals, we categorized the metabolite as present or absent (i.e. dichotomous) and report the associated odds ratio (OR) and its P value from a likelihood ratio test. For metabolites present in at least 80% of individuals, we report the ORs comparing the 90th to the 10th percentile of the metabolite values, the corresponding confidence intervals, and the P value from the likelihood ratio test comparing models with and without the metabolite. Letting $X_{90}$, $X_{10}$ and $\beta$ denote the 90th percentile, the 10th percentile and the per-unit log(OR) from the logistic regression, respectively, we defined the OR of interest as $e^{\beta(X_{90} - X_{10})}$. The q values, which reflect the false discovery rate (FDR), were calculated separately for metabolites present in <80% and metabolites in at least 80% of individuals.
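
The percentile-based OR can be illustrated with a short Python sketch. It assumes the per-unit log(OR) has already been estimated by a logistic regression fitted elsewhere; the function name, the choice of percentile interpolation (`method="inclusive"`) and the input values are assumptions made for illustration, since the text does not specify how percentiles were computed.

```python
import math
import statistics


def percentile_or(beta, values):
    """OR comparing the 90th to the 10th percentile of `values`.

    `beta` is the per-unit log(OR), assumed to come from a logistic
    regression fitted elsewhere. The percentile rule is an assumption.
    """
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    x10, x90 = deciles[0], deciles[-1]  # first and last of the 9 cut points
    return math.exp(beta * (x90 - x10))


# Hypothetical metabolite levels and coefficient, for illustration only
values = list(range(11))
beta = math.log(2) / 8
or_90_10 = percentile_or(beta, values)
print(or_90_10)
```

Scaling the coefficient by the 90th-to-10th percentile spread puts metabolites measured on different scales onto a common footing, which is why ORs for different metabolites can be compared directly.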

We examined the proportion of metabolites associated with cancer by a quantile–quantile (QQ) plot. We plotted, on a log10 scale, the expected P values (n/(n+1), (n−1)/(n+1), …, 1/(n+1)) against the observed P values, ordered from largest to smallest ($p_{(1)}, p_{(2)}, \ldots, p_{(n)}$), where n is the number of metabolites. We also plotted a point-wise 95% confidence interval showing the range of $p_{(i)}$ that can occur by chance. Specifically, we created 1000 permuted datasets by randomly assigning case/control status, calculated ($p_{(1)}^b, p_{(2)}^b, \ldots, p_{(n)}^b$) for each permutation $b \in \{1, \ldots, 1000\}$, and then extracted the 2.5th and 97.5th quantiles of each $p_{(i)}^b$.
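
The permutation envelope described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: a Welch-type statistic with a normal approximation stands in for the covariate-adjusted logistic regression likelihood ratio test, the data are simulated, and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance


def welch_p(values, labels):
    """Two-sided P value from a Welch statistic, normal approximation.

    Stand-in for the logistic regression test used in the study.
    """
    cases = [v for v, y in zip(values, labels) if y == 1]
    controls = [v for v, y in zip(values, labels) if y == 0]
    se = math.sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    if se == 0:
        return 1.0
    z = (mean(cases) - mean(controls)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def sorted_pvalues(data, labels):
    """One P value per metabolite (column), ordered from largest to smallest."""
    return sorted((welch_p(col, labels) for col in data), reverse=True)


def permutation_envelope(data, labels, n_perm=1000, seed=0):
    """Point-wise 2.5th and 97.5th quantiles of each ordered P value."""
    rng = random.Random(seed)
    shuffled = list(labels)
    per_rank = [[] for _ in data]
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomly reassign case/control status
        for rank, p in enumerate(sorted_pvalues(data, shuffled)):
            per_rank[rank].append(p)
    lower, upper = [], []
    for ps in per_rank:
        ps.sort()
        lower.append(ps[int(0.025 * (n_perm - 1))])
        upper.append(ps[int(0.975 * (n_perm - 1))])
    return lower, upper


# Simulated null data: 5 metabolites, 10 cases and 10 controls
rng = random.Random(42)
labels = [1] * 10 + [0] * 10
data = [[rng.gauss(0, 1) for _ in labels] for _ in range(5)]
lower, upper = permutation_envelope(data, labels, n_perm=200, seed=1)
n = len(data)
expected = [(n - i) / (n + 1) for i in range(n)]  # evenly spaced, largest first
observed = sorted_pvalues(data, labels)
```

Observed ordered P values that fall below the lower bound at a given rank suggest more small P values than chance alone would produce.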

We performed a standard pathway analysis. Specifically, we evaluated whether the metabolites within predefined pathways were associated with the outcome of CRC by analysis of variance (globalAncova) (22). P values were calculated by permutation and are therefore valid. We also performed Pearson pairwise correlations, overall and by case–control status, of all FDR significant metabolites, using natural log-transformed values for metabolites detected in at least 80% of specimens, else presence versus absence for lower prevalence metabolites.

Statistical power calculations

Using the observed estimates of technical, within-subject and between-subject variability, we estimated the expected power for a case–control study nested within a cohort focused on a single outcome. Specifically, we assumed that a nested study will have n participants, with an equal number of cases and controls. We further assumed that the study will use a t-test to compare the metabolite levels between cases and controls to detect associations between metabolites and disease, using a Bonferroni-corrected significance threshold. We defined the effect size as the relative risk (RR) of disease comparing individuals in the top to the bottom quartiles of the ‘usual’ metabolite level. At a given effect size, we calculated, across metabolites, the mean probability of detecting a statistically significant association, accounting for the three sources of variability. This average probability, or the average power, indicates the proportion of true metabolite–disease associations that we expect to discover in a given prospective study. Statistical power then was applied to selected metabolites that were observed to be associated with CRC in the case–control study.
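
To illustrate why a low within-subject ICC erodes power, the sketch below uses a simplified normal approximation rather than the authors' calculation. It assumes the true case–control difference in the ‘usual’ level is expressed as a standardized mean difference `d_usual` (in between-subject standard deviation units); the mapping from a quartile-based RR to `d_usual` is not reproduced here. Under these assumptions, the observed standardized difference is attenuated by the square root of the within-subject ICC. All names and the value of `d_usual` are hypothetical.

```python
from statistics import NormalDist


def approx_power(d_usual, icc, n_total, n_tests=579, alpha=0.05):
    """Approximate two-sided power with a Bonferroni-corrected threshold.

    Assumes equal numbers of cases and controls and a normal approximation
    to the two-sample t-test. `icc` is between-subject / total variance.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / (2 * n_tests))
    # Attenuated standardized difference times sqrt(n per group / 2)
    ncp = d_usual * icc ** 0.5 * (n_total / 4) ** 0.5
    return (1 - norm.cdf(z_crit - ncp)) + norm.cdf(-z_crit - ncp)


def average_power(d_usual, iccs, n_total):
    """Mean power across metabolites with different within-subject ICCs."""
    return sum(approx_power(d_usual, icc, n_total) for icc in iccs) / len(iccs)


# Within-subject ICCs of the 11 metabolites listed in Table II
iccs = [0.68, 0.67, 0.53, 0.50, 0.48, 0.46, 0.32, 0.31, 0.25, 0.21, 0.01]
d_usual = 0.5  # hypothetical effect size, not taken from the study
power_500 = average_power(d_usual, iccs, 500)
power_1000 = average_power(d_usual, iccs, 1000)
print(round(power_500, 3), round(power_1000, 3))
```

Because the ICC enters under a square root alongside the sample size, halving the ICC costs roughly as much power as halving the number of participants in this approximation.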

All analyses were performed with SAS software version 9.1.3 (SAS Institute, Cary, NC) and the R statistical language version 3.0.1.

Results

Assay performance

In the fecal specimens, there were 1043 small molecules detected. These included 773 characterized and 270 uncharacterized (‘X’) molecules. Of the 579 molecules detected in at least 10% of the fecal specimens, overall laboratory reproducibility in masked replicate specimens was high, with the technical ICCs exceeding 0.7 for 527 (91%) of the metabolites (Figure 1A).

Fig. 1.


Distributions of ICCs across 579 metabolites detected in at least 10% of fecal specimens. Percentage beside each asterisk indicates the proportion of metabolites with at least the indicated ICC. (A) Technical ICCs, a measure of laboratory variability; 91% of metabolites have technical ICC ≥0.7. (B) Within-subject ICCs, a measure of stability over 6 months; 44% of metabolites have within-subject ICC ≥ 0.5.

In addition to technical reproducibility, within-subject and between-subject variance contribute to total variance. Considering the between-subject/total variance ratio, which is equivalent to within-subject ICC, Figure 1B shows that within-subject ICC was relatively low. Only 44% of the metabolites had a within-subject ICC ≥0.5. Only 5% of the metabolites had a within-subject ICC ≥0.7.

CRC case–control study

The 48 CRC cases tended to be older than the 102 frequency-matched controls (mean 62.9 versus 58.3 years, P = 0.06), but they did not differ by sex (60% male), body mass index, race, smoking history, attained education or hospital of recruitment (Supplementary Table 1, available at Carcinogenesis Online). The primary CRC tumors arose in approximately equal proportions in the proximal colon, distal colon and rectum (29, 33 and 27%, respectively). The cases included 35% with metastases at diagnosis (Dukes’ stage C/D), 42% with local invasion but no known metastases (Dukes’ stage B) and 21% with only localized disease (Dukes’ stage A).

Fecal metabolite associations with CRC

Global assessment indicated that many of the 1043 fecal metabolites differed significantly between cases and controls (Supplementary Figure 1, available at Carcinogenesis Online). The prevalence of all 1043 fecal metabolites in cases and controls is presented in Supplementary Table 2, available at Carcinogenesis Online. Of these, 41 (3.9%) were significantly associated with CRC at FDR ≤0.10. As summarized in Supplementary Table 3, available at Carcinogenesis Online, CRC-associated metabolites were overrepresented among cofactors and vitamins (P = 0.032, especially tocopherol-related) and xenobiotics (P = 0.006, especially drugs and food components/plants), and marginally among lipids (P = 0.064) and uncharacterized metabolites (P = 0.074).
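
For readers unfamiliar with FDR thresholds, the sketch below shows one common way to obtain q values from a list of P values, the Benjamini–Hochberg step-up adjustment. The text does not state which q-value procedure was used, so this choice is an assumption, and the P values in the example are made up. In the study, such an adjustment would be applied separately to the two metabolite groups (present in <80% and in at least 80% of individuals).

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P values (q values), in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices, smallest P first
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        q[i] = running_min
    return q


# Made-up P values for illustration
pvals = [0.01, 0.04, 0.03, 0.005]
q = bh_qvalues(pvals)
significant = [i for i, qi in enumerate(q) if qi <= 0.10]
```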

Heme was detected in 29% of cases and 2% of controls (OR = 16.55, Table I). Heme met the Bonferroni threshold for statistical significance, as did two heme-related metabolites that have not been fully characterized: X_18565 (67 versus 1%, OR = 345) and X_19549 (48 versus 3%, OR = 30). The heme-related molecules were positively but far from perfectly correlated (X_19549 with heme, R = 0.36; X_18565 with heme, R = 0.49; and X_19549 with X_18565, R = 0.73). In combination analyses, CRC risk was increased 24-fold (58 versus 5%) with detection of heme or X_19549, 97-fold (69 versus 3%) with detection of heme or X_18565 and 49-fold (71 versus 5%) with detection of any of these heme-related metabolites.

Table I.

Fecal metabolites associated with CRC at FDR 0.10

| Metabolite | Prevalence in cases (%)a | Prevalence in controls (%)a | P valueb | OR (CI)b |
|---|---|---|---|---|
| Heme-related | | | | |
| Heme | 29 | 2 | 1.5E-05 | 16.55 (3.46–79.09) |
| X_18565 | 67 | 1 | <1.0E-10 | 345.3 (35.43–3364.92) |
| X_19549 | 48 | 3 | 1.8E-10 | 29.81 (7.99–111.14) |
| Cofactors and vitamins | | | | |
| α-Tocopherol | 96 | 100 | 6.0E-03 | 0.25 (0.08–0.74) |
| γ-Tocopherol | 98 | 100 | 1.8E-03 | 0.26 (0.1–0.64) |
| Pterin | 90 | 99 | 4.0E-03 | 0.33 (0.15–0.73) |
| Xenobiotics | | | | |
| 4-Acetamidophenol | 92 | 98 | 9.1E-06 | 0.04 (0.01–0.27) |
| 2-Hydroxyacetaminophen sulfate | 0 | 10 | 1.4E-03 | 0 (0–Infinity) |
| 3-Cystein-S-YL-acetaminophen | 2 | 23 | 7.6E-04 | 0.07 (0.01–0.58) |
| p-Acetamidophenylglucuronide | 0 | 8 | 1.9E-03 | 0 (0–Infinity) |
| PABA | 98 | 100 | 8.8E-04 | 0.22 (0.08–0.57) |
| N-2-Furoyl-glycine | 98 | 100 | 3.8E-03 | 0.26 (0.1–0.69) |
| Sitostanol | 90 | 99 | 3.0E-04 | 0.20 (0.08–0.5) |
| p-Hydroxybenzaldehyde | 100 | 99 | 1.6E-03 | 3.96 (1.56–10.05) |
| Mandelate | 96 | 94 | 5.1E-03 | 3.32 (1.36–8.08) |
| Lipids | | | | |
| Palmitoyl-sphingomyelin | 98 | 95 | 2.1E-06 | 13.6 (3.9–47.41) |
| Conjugated linoleate-18-2N7 | 96 | 100 | 2.3E-04 | 0.16 (0.06–0.47) |
| 3-Dehydrocarnitine | 92 | 98 | 1.6E-03 | 0.35 (0.18–0.7) |
| Amino acids | | | | |
| Histidine | 100 | 99 | 2.7E-05 | 7.46 (2.56–21.8) |
| Cis-Urocanate | 100 | 99 | 6.6E-04 | 5.19 (1.88–14.37) |
| Peptides | | | | |
| Tryptophyl-glycine | 100 | 97 | 1.7E-06 | 14.34 (4.09–50.35) |
| Leucyl-tryptophan | 100 | 94 | 5.6E-05 | 7.96 (2.58–24.56) |
| Alanyl-histidine | 60 | 28 | 4.1E-05 | 5.28 (2.29–12.16) |
| Histidyl-glycine | 67 | 32 | 3.7E-05 | 4.94 (2.23–10.94) |
| Tyrosyl-glutamine | 96 | 91 | 5.7E-04 | 4.97 (1.84–13.45) |
| Histidyl-alanine | 92 | 82 | 3.3E-04 | 6.87 (2.22–21.28) |
| Valyl-aspartate | 100 | 99 | 3.6E-04 | 7.10 (2.18–23.12) |
| Pyro-glutamyl-glycine | 19 | 5 | 1.8E-03 | 7.12 (1.98–25.68) |
| Alanyl-leucine | 100 | 99 | 2.9E-03 | 4.94 (1.63–14.95) |
| Alanyl-tryptophan | 100 | 94 | 3.4E-03 | 4.09 (1.51–11.08) |
| Histidyl-phenylalanine | 96 | 92 | 1.7E-03 | 5.33 (1.76–16.19) |
| Leucyl-glutamate | 100 | 99 | 2.5E-03 | 5.12 (1.67–15.65) |
| Leucyl-serine | 100 | 99 | 2.1E-03 | 5.34 (1.7–16.79) |
| α-Glutamyl-valine | 100 | 99 | 4.6E-03 | 3.39 (1.39–8.3) |
| Prolyl-alanine | 100 | 99 | 2.3E-03 | 3.21 (1.45–7.08) |
| Valyl-histidine | 90 | 79 | 3.0E-03 | 4.43 (1.61–12.22) |
| Uncharacterized molecules | | | | |
| X_19558 | 92 | 84 | 9.3E-05 | 9.11 (2.76–30.11) |
| X_16343 | 96 | 98 | 4.6E-03 | 3.77 (1.44–9.86) |
| X_17749 | 98 | 100 | 7.0E-04 | 0.22 (0.09–0.57) |
| X_19232 | 96 | 100 | 4.9E-04 | 0.22 (0.09–0.55) |
| X_19136 | 77 | 86 | 1.8E-03 | 0.19 (0.06–0.55) |

aFraction above the limit of detection.

bORs, corresponding confidence intervals (CI) and P values were calculated by logistic regression, adjusted for continuous age, body mass index, sex, race and hospital. ORs compared 90th versus 10th percentiles for common metabolites (prevalence ≥80%), else presence versus absence for less common metabolites.

As shown in Table I, CRC risk was reduced with four acetaminophen metabolites. Pairwise correlations of the four acetaminophen metabolites ranged from R = 0.29 to 0.82 (median R = 0.40). CRC risk also was low with 2-hydroxyhippurate (salicylurate), which did not meet the FDR threshold (OR = 0.27, P = 0.004), but not with salicylate, salicyluric-glucuronide, ibuprofen or hydroxy-ibuprofen (P = 0.023–0.081, data not presented).

Eighteen fecal peptides/amino acids were associated with increased CRC risk, and none was associated with decreased risk. These 18 peptides/amino acids were highly correlated with each other (median R = 0.51, interquartile range R = 0.33–0.67). Tryptophyl-glycine had very strong correlations (R = 0.73–0.88) with eight other dipeptides. One uncharacterized molecule (X_16343) had very strong correlations (R = 0.70–0.85) with tryptophyl-glycine and five other dipeptides. The strongest inverse correlation observed was between tryptophyl-glycine and sitostanol (R = −0.45).

The 11 other CRC-associated molecules, which may functionally contribute to the disease, included pterin, 2 tocopherols, 5 xenobiotics and 3 lipids. Of these 11, eight were associated with lower CRC risk and three with higher risk (Table I). Figure 2 presents the mean levels of the eight reduced-risk (green) and three increased-risk (red) metabolites. Figure 3 presents pairwise correlations of these metabolites. Nearly all correlations were positive (arrows). In both cases and controls, the metabolite network was centered around p-aminobenzoate (PABA). Cases had lower levels (Figure 2) but many more and stronger correlations (Figure 3) of reduced-risk metabolites compared with controls.

Fig. 2.


Mean levels, by case–control status, of 11 CRC-associated metabolites in feces. These metabolites’ pairwise correlations and associations with CRC risk are presented in Figure 3.

Fig. 3.


Pairwise correlations of 11 CRC-associated metabolites in feces of CRC cases (above) and matched controls (below). Double-headed arrows (↔) indicate direct (positive) correlations; blocked lines (|―|) indicate inverse (negative) correlations. Green indicates inverse correlation with CRC; red indicates direct correlation with CRC. Line weight indicates strength of correlation coefficient. Cases’ metabolites are generally more correlated than controls’ metabolites. Mean normalized levels of these metabolites are presented in Figure 2.

Among potential candidate molecules (23), ursodeoxycholate was marginally associated with lower CRC risk (OR = 0.32, P = 0.03), but six other fecal bile acids were unrelated (P = 0.30–0.98, data not presented). CRC was not associated with 3-aminobutyrate, 3-aminoisobutyrate or 3-methyl-2-oxobutyrate (P = 0.03–0.74). The five uncharacterized CRC-associated molecules included a pair (X_19232 and X_17749) associated with very low CRC risk that had the strongest direct correlation that was observed (R = 0.93).

None of the metabolites that were very strongly associated with CRC differed significantly by tumor site (Supplementary Table 4A, available at Carcinogenesis Online). Two of them differed by stage of the tumor (Supplementary Table 4B, available at Carcinogenesis Online). Prevalence of heme was directly related to stage: 10% without invasion (Dukes’ stage A), 20% with local invasion (Dukes’ stage B), 47% with metastases (Dukes’ stage C/D, P trend = 0.03). Palmitoyl-sphingomyelin level above median had a similar trend: 60% without invasion, 80% with local invasion, 94% with metastases (P trend = 0.03).

Statistical power of fecal metabolomics

With the observed technical, within-subject and between-subject variances, we estimated study power to detect RRs of 1.5, 2.0, 5.0 and 10.0 in prospective, nested case–control studies using a Bonferroni-adjusted α-level of 0.05/579. As shown in Figure 4, a study with 250 cases and 250 controls (500 participants) is expected to detect 0.2, 1.2, 45 and 82% of the metabolites with true RRs of 1.5, 2.0, 5.0 and 10.0, respectively. In a study of 1000 participants, the proportions of metabolites detected with these RRs increase to 0.6, 6.5, 84 and 96%, respectively. With a sample size of 5000, ~100% of metabolites with RRs of 5.0 and 10.0 are expected to be detected (Figure 4).

Fig. 4.


Proportion of metabolites expected to be detected in a case–control study as a function of effect size according to different sample size [n of 500 (dashed), 1000 (solid) and 5000 (dotted)] under Bonferroni-adjusted α-levels (0.05/579). Effect size is defined by the true RR (on the x-axis) of disease comparing individuals in the top and bottom deciles of the ‘usual’ metabolite level. A study of n = 500 (dashed line), would detect 0.2, 1.2, 45 and 82% of metabolites that have true RRs of 1.5, 2.0, 5.0 and 10.0, respectively. In a study of n = 5000 (dotted line), the same RRs would be detected with 1.9, 8.0, 99 and 99% of the metabolites.

Table II illustrates the impact of instability over time (within-subject ICC) on study power, using as examples the 11 potentially functional metabolites associated with CRC in our case–control study. In a study of 500 participants, power to detect a true RR = 5.0 (or the converse, RR = 0.20) would be good (≥0.89) for six metabolites: 3-dehydrocarnitine, PABA, α-tocopherol, γ-tocopherol, pterin and N-2-furoyl-glycine. However, even for these metabolites, power would be low for a true RR = 2.5 (or RR = 0.40). In a larger study of 1000 participants, power for a true RR = 2.5 would be good only for 3-dehydrocarnitine and PABA. At RR = 5.0, power would be excellent for nine metabolites and good for palmitoyl-sphingomyelin (0.84) among 1000 participants. For none of these metabolites could a true RR = 1.5 be reliably detected, even among 1000 participants.

Table II.

Statistical power to detect 11 potentially functional CRC-associated metabolites in prospective studies of 500 or 1000 participants, at prespecified RRa

| Metabolite | Within-subject ICCb | N = 500, RR = 1.5 | N = 500, RR = 2.5 | N = 500, RR = 5.0 | N = 1000, RR = 1.5 | N = 1000, RR = 2.5 | N = 1000, RR = 5.0 |
|---|---|---|---|---|---|---|---|
| 3-Dehydrocarnitine | 0.68 | 0.01 | 0.39 | 0.99 | 0.05 | 0.89 | 1.00 |
| PABA | 0.67 | 0.01 | 0.38 | 0.99 | 0.04 | 0.88 | 1.00 |
| α-Tocopherol | 0.53 | 0.01 | 0.24 | 0.95 | 0.03 | 0.73 | 1.00 |
| γ-Tocopherol | 0.50 | 0.01 | 0.21 | 0.92 | 0.02 | 0.68 | 1.00 |
| Pterin | 0.48 | <0.01 | 0.19 | 0.91 | 0.02 | 0.65 | 1.00 |
| N-2-Furoyl-glycine | 0.46 | <0.01 | 0.18 | 0.89 | 0.02 | 0.62 | 1.00 |
| p-Hydroxybenzaldehyde | 0.32 | <0.01 | 0.08 | 0.64 | 0.01 | 0.34 | 0.98 |
| Sitostanol | 0.31 | <0.01 | 0.07 | 0.61 | 0.01 | 0.32 | 0.98 |
| Conjugated linoleate-18-2N7 | 0.25 | <0.01 | 0.04 | 0.44 | 0.01 | 0.21 | 0.93 |
| Palmitoyl-sphingomyelin | 0.21 | <0.01 | 0.03 | 0.33 | <0.01 | 0.14 | 0.84 |
| Mandelate | 0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 |

aAlpha is Bonferroni-corrected (0.05/579 detectable metabolites).

bICC measure of within-subject variance over 6 months, adjusting for technical and between-subject variances.

Discussion

This study found that HPLC-GC/MS-MS technology had very high technical reproducibility for fecal metabolite profiles, but many of the fecal metabolites varied greatly in paired specimens that were collected 6 months apart. Expanding and largely replicating previously reported CRC associations with fecal amino acids and a few other molecules that used older technologies (12–16), our technology’s greater sensitivity revealed that 41 (3.9%) of 1043 fecal metabolites differed significantly between CRC cases and matched controls in our pilot study. Our calculations of within-subject variance over time and statistical power provide guidance for the sizes of the future, prospective studies that are needed to replicate and extend our observations.

The 41 CRC-associated fecal molecules in our pilot fell into four categories. First, providing proof of the technology, heme-related molecules were strongly associated with case status. With our study’s pooled 2-day lyophilized stool samples and HPLC-GC/MS-MS method, heme itself was detected in 2% of controls, 10% of non-invasive cases, 20% of locally invasive cases and 47% of metastatic cases. Better performance, particularly higher sensitivity in preinvasive cases, would be expected with fecal immunochemical tests that are licensed and currently in widespread use (24) and perhaps with detection of methylated or mutated CRC-associated oncogenes in feces (25). We observed higher sensitivity, as well as good specificity, with our two partially characterized, heme-related molecules X_19549 and especially X_18565, suggesting that further investigations with them are warranted.

Second, four acetaminophen-related molecules in feces were associated with a very low risk of CRC. This probably reflects confounding, with frequent use of acetaminophen among controls (30.4% with 4-acetaminophen sulfate in feces, versus 12.5% in cases), approximately half of whom were awaiting orthopedic surgery (17). Detection of salicylate and ibuprofen did not differ between cases and controls, despite prior evidence in favor of risk reduction (26).

Third, CRC risk was positively associated with 18 fecal peptides/amino acids; there were no inverse associations with peptides/amino acids. These metabolites included histidine, a metabolite of histidine (cis-Urocanate), and five histidine-containing dipeptides. This is consistent with previous reports that amino acids were more abundant in CRC tumor tissue compared with normal colonic mucosa (27–30). Paradoxically, two of our dipeptides, histidyl-alanine and histidyl-glycine, were reported to inhibit the growth of three different cancer cell lines (31). Moreover, histidine was not among the 11 amino acids reported as higher in previous CRC case–control analyses of feces, although histidine may not have been readily detected with the older methods that were used (12–15). In our study, tryptophyl-glycine had the highest risk and strong correlations with other CRC-associated dipeptides, and it was reported to have moderately high activity in the Salmonella typhimurium mutagenesis assay (32). It is likely, however, that the peptide/amino acid associations are not etiologic. Rather, the breadth of CRC-associated amino acids and dipeptides may reflect shedding into cases’ feces due to high levels of cell division, cell death and protein degradation.

Fourth, unlike heme, acetaminophen and peptides/amino acids, the 11 other CRC-associated molecules cannot be readily dismissed as the result of fecal shedding or confounding. Their pairwise correlations are presented in Figure 3. Compared with controls, the correlations in cases were stronger, more numerous, centered around PABA and included strong inverse associations of p-hydroxy-benzaldehyde with linoleate-18-2N7 and N-2-furoyl-glycine. Linoleate-18-2N7, a lipid that can be generated by bacteria (33), was low in our CRC cases and was reported to inhibit cancer in animal models by modulating inflammation (34,35). Obstruction of the protective effect of linoleate by p-hydroxy-benzaldehyde is suggested by the inverse correlation that we observed (R = −0.46 in cases versus R = −0.21 in controls). N-2-furoyl-glycine, also low in our cases, is a xenobiotic found in cigarette smoke and especially in coffee (36). Coffee consumption is inversely associated with CRC risk (37), and our data suggest the same association with fecal N-2-furoyl-glycine, possibly reduced by p-hydroxy-benzaldehyde (Figure 3).

PABA can be processed by certain bacteria into acetyl coenzyme A for citric acid and fatty acid metabolism (38). Along with pterin, PABA also is centrally related to folate-mediated one-carbon metabolism (39). Folate was not detected in our stool specimens. However, CRC risk was reported to be increased with altered metabolism of folate (40,41), inconsistently associated with level of folate in plasma (42), and perhaps slightly increased with folate-deficient diet (43). Animal models suggest that folate repletion or supplementation upregulates inflammation pathways (43). The reports that CRC tissue has reduced levels of PABA, lipids and glucose (27–29) complement our findings.

Palmitoyl-sphingomyelin was strongly associated with a >10-fold increased risk of CRC. An independent effect is suggested by its inverse correlation with p-hydroxy-benzaldehyde in the cases (Figure 3). Sphingomyelinase signaling produces ceramide, a messenger that modulates both cell proliferation and apoptosis through critical pathways, including the mitochondrial and c-Jun N-terminal kinase systems that can generate reactive oxygen species (44). Sphingolipid alterations are common in cancer and may contribute to CRC risk through the WNT/β-catenin pathway (45). Moreover, sphingomyelinase activity in stool has been proposed as a screening test for CRC (46).

Finally, carnitine, sitostanol and the two tocopherols were low in cases and correlated with each other. Sitostanol is a vegetable oil derivative that may reduce absorption of cholesterol and perhaps tocopherols from the gut (47–49). Tocopherols, which are vitamin E, have antioxidant properties and may reduce the formation of nitrosamines from dietary nitrites (50). In a double blind, placebo-controlled trial, supplementation with α-tocopherol was associated with a 20% lower incidence of CRC, although this was not statistically significant (51). Carnitine appears to reduce colonic neoplasia by reducing inflammation, inhibiting proliferation and increasing apoptosis (52–54).

Strengths of our study include careful processing and preservation of the fecal specimens, and our quantification of within-subject ICC, from which we could estimate statistical power with our cutting-edge fecal metabolomics platform. Our platform had high sensitivity and technical reproducibility, but it has limited ability to detect some volatile and larger molecules, such as calprotectin, which has been associated with colonic inflammation and cancer (12–15,55). Our pilot study’s major limitations are its small size and cross-sectional, hospital-based case–control design. It provided no assessment of temporality and could only detect very strong associations with CRC. Because large prospective studies have not collected fecal samples, we conducted the pilot study to generate hypotheses that can be tested in the future. Our calculation of statistical power for larger prospective studies to detect 2.5- to 5-fold RRs across metabolites with a range of ICCs is an additional strength.

In summary, complementing CRC-associated differences in fecal microbial diversity and composition that we found in the same cases and controls (56), herein we found that CRC was associated with differences in a diverse array of small molecules in feces. Some of these, such as heme and peptides, are likely shed from the tumor and might facilitate earlier diagnosis. Others appear to represent differences in diet, medications, microbes (56,57) and the host, with PABA modulated by p-hydroxy-benzaldehyde playing a central role. Fecal metabolomics may prove to be a useful new tool for studies of CRC and other diseases. However, instability of metabolomic profiles over time mandates large, prospective studies with prediagnostic specimens to clearly identify targets for prevention, diagnosis and intervention.

Supplementary material

Supplementary Tables 1–4 and Figure 1 can be found at http://carcin.oxfordjournals.org/

Funding

Intramural Research Program of the National Cancer Institute, National Institutes of Health (Z01-CP-010214).

Acknowledgements

We thank Dr M. Schiffman for his visionary study that made this project possible, as well as the patients who participated.

Conflict of Interest Statement: None declared.

Glossary

Abbreviations:

CRC: colorectal cancer

FDR: false discovery rate

HPLC-GC/MS-MS: high-performance liquid phase chromatography and gas chromatography coupled with tandem mass spectrometry

ICC: intraclass correlation coefficient

OR: odds ratio

PABA: p-aminobenzoate

RR: relative risk

References

  • 1. Derry M.M., et al. (2013). Identifying molecular targets of lifestyle modifications in colon cancer prevention. Front. Oncol., 3, 119.
  • 2. Martin F.P., et al. (2010). Dietary modulation of gut functional ecology studied by fecal metabonomics. J. Proteome Res., 9, 5284–5295.
  • 3. Schwabe R.F., et al. (2013). The microbiome and cancer. Nat. Rev. Cancer, 13, 800–812.
  • 4. Topping D.L., et al. (2001). Short-chain fatty acids and human colonic function: roles of resistant starch and nonstarch polysaccharides. Physiol. Rev., 81, 1031–1064.
  • 5. Zheng X., et al. (2011). The footprints of gut microbial-mammalian co-metabolism. J. Proteome Res., 10, 5512–5522.
  • 6. Diggs D.L., et al. (2011). Polycyclic aromatic hydrocarbons and digestive tract cancers: a perspective. J. Environ. Sci. Health. C. Environ. Carcinog. Ecotoxicol. Rev., 29, 324–357.
  • 7. Cross A.J., et al. (2004). Meat-related mutagens/carcinogens in the etiology of colorectal cancer. Environ. Mol. Mutagen., 44, 44–55.
  • 8. World Cancer Research Fund/American Institute for Cancer Research. (2011). Continuous Update Project. Food, Nutrition, Physical Activity, and the Prevention of Colorectal Cancer, Washington, DC.
  • 9. Cross A.J., et al. (2010). A large prospective study of meat consumption and colorectal cancer risk: an investigation of potential mechanisms underlying this association. Cancer Res., 70, 2406–2414.
  • 10. Fedirko V., et al. (2011). Alcohol drinking and colorectal cancer risk: an overall and dose-response meta-analysis of published studies. Ann. Oncol., 22, 1958–1972.
  • 11. Stella C., et al. (2006). Susceptibility of human metabolic phenotypes to dietary modulation. J. Proteome Res., 5, 2780–2788.
  • 12. Weir T.L., et al. (2013). Stool microbiome and metabolome differences between colorectal cancer patients and healthy adults. PLoS One, 8, e70803.
  • 13. Monleón D., et al. (2009). Metabolite profiling of fecal water extracts from human colorectal cancer. NMR Biomed., 22, 342–348.
  • 14. Bezabeh T., et al. (2009). MR metabolomics of fecal extracts: applications in the study of bowel diseases. Magn. Reson. Chem., 47 (suppl. 1), S54–S61.
  • 15. Bezabeh T., et al. (2009). Detecting colorectal cancer by 1H magnetic resonance spectroscopy of fecal extracts. NMR Biomed., 22, 593–600.
  • 16. Phua L.C., et al. (2014). Non-invasive fecal metabonomic detection of colorectal cancer. Cancer Biol. Ther., 15, 389–397.
  • 17. Schiffman M.H., et al. (1989). Case-control study of colorectal cancer and fecapentaene excretion. Cancer Res., 49, 1322–1326.
  • 18. Schiffman M.H., et al. (1989). Case-control study of colorectal cancer and fecal mutagenicity. Cancer Res., 49, 3420–3424.
  • 19. Evans A.M., et al. (2009). Integrated, nontargeted ultrahigh performance liquid chromatography/electrospray ionization tandem mass spectrometry platform for the identification and relative quantification of the small-molecule complement of biological systems. Anal. Chem., 81, 6656–6667.
  • 20. Sha W., et al. (2010). Metabolomic profiling can predict which humans will develop liver dysfunction when deprived of dietary choline. FASEB J., 24, 2962–2975.
  • 21. Sampson J.N., et al. (2013). Metabolomics in epidemiology: sources of variability in metabolite measurements and implications. Cancer Epidemiol. Biomarkers Prev., 22, 631–640.
  • 22. Hummel M., et al. (2008). GlobalANCOVA: exploration and assessment of gene group effects. Bioinformatics, 24, 78–85.
  • 23. Sears C.L., et al. (2014). Microbes, microbiota, and colon cancer. Cell Host Microbe, 15, 317–328.
  • 24. Allison J.E., et al. (2007). Screening for colorectal neoplasms with new fecal occult blood tests: update on performance characteristics. J. Natl Cancer Inst., 99, 1462–1470.
  • 25. Imperiale T.F., et al. (2014). Multitarget stool DNA testing for colorectal-cancer screening. N. Engl. J. Med., 370, 1287–1297.
  • 26. Harris R.E., et al. (2005). Aspirin, ibuprofen, and other non-steroidal anti-inflammatory drugs in cancer prevention: a critical review of non-selective COX-2 blockade (review). Oncol. Rep., 13, 559–583.
  • 27. Hirayama A., et al. (2009). Quantitative metabolome profiling of colon and stomach cancer microenvironment by capillary electrophoresis time-of-flight mass spectrometry. Cancer Res., 69, 4918–4925.
  • 28. Denkert C., et al. (2008). Metabolite profiling of human colon carcinoma–deregulation of TCA cycle and amino acid turnover. Mol. Cancer, 7, 72.
  • 29. Chan E.C., et al. (2009). Metabolic profiling of human colorectal cancer using high-resolution magic angle spinning nuclear magnetic resonance (HR-MAS NMR) spectroscopy and gas chromatography mass spectrometry (GC/MS). J. Proteome Res., 8, 352–361.
  • 30. Qiu Y., et al. (2014). A distinct metabolic signature of human colorectal cancer with prognostic potential. Clin. Cancer Res., 20, 2136–2146.
  • 31. Lucietto F.R., et al. (2006). The biological activity of the histidine-containing diketopiperazines cyclo(His-Ala) and cyclo(His-Gly). Peptides, 27, 2706–2714.
  • 32. Ochiai M., et al. (1986). Mutagenicities of indole and 30 derivatives after nitrite treatment. Mutat. Res., 172, 189–197.
  • 33. Storey A., et al. (2007). Conjugated linoleic acids modulate UVR-induced IL-8 and PGE2 in human skin cells: potential of CLA isomers in nutritional photoprotection. Carcinogenesis, 28, 1329–1333.
  • 34. Zulet M.A., et al. (2005). Inflammation and conjugated linoleic acid: mechanisms of action and implications for human health. J. Physiol. Biochem., 61, 483–494.
  • 35. Belury M.A. (1995). Conjugated dienoic linoleate: a polyunsaturated fatty acid with unique chemoprotective properties. Nutr. Rev., 53(4 Pt 1), 83–89.
  • 36. Kowalski S. (2013). Changes of antioxidant activity and formation of 5-hydroxymethylfurfural in honey during thermal and microwave processing. Food Chem., 141, 1378–1382.
  • 37. Sinha R., et al. (2012). Caffeinated and decaffeinated coffee and tea intakes and risk of colorectal cancer in a large prospective study. Am. J. Clin. Nutr., 96, 374–381.
  • 38. Butler J.E., et al. (2007). Genomic and microarray analysis of aromatics degradation in Geobacter metallireducens and comparison to a Geobacter isolate from a contaminated field site. BMC Genomics, 8, 180.
  • 39. Hanson A.D., et al. (2011). Folate biosynthesis, turnover, and transport in plants. Annu. Rev. Plant Biol., 62, 105–125.
  • 40. Ma J., et al. (1997). Methylenetetrahydrofolate reductase polymorphism, dietary interactions, and risk of colorectal cancer. Cancer Res., 57, 1098–1102.
  • 41. Weinstein S.J., et al. (2008). One-carbon metabolism biomarkers and risk of colon and rectal cancers. Cancer Epidemiol. Biomarkers Prev., 17, 3233–3240.
  • 42. Le Marchand L., et al. (2009). Plasma levels of B vitamins and colorectal cancer risk: the multiethnic cohort study. Cancer Epidemiol. Biomarkers Prev., 18, 2195–2201.
  • 43. Protiva P., et al. (2011). Altered folate availability modifies the molecular environment of the human colorectum: implications for colorectal carcinogenesis. Cancer Prev. Res. (Phila)., 4, 530–543.
  • 44. Mullen T.D., et al. (2012). Ceramide and apoptosis: exploring the enigmatic connections between sphingolipid metabolism and programmed cell death. Anticancer. Agents Med. Chem., 12, 340–363.
  • 45. García-Barros M., et al. (2014). Sphingolipids in colon cancer. Biochim. Biophys. Acta, 1841, 773–782.
  • 46. Di Marzio L., et al. (2005). Detection of alkaline sphingomyelinase activity in human stool: proposed role as a new diagnostic and prognostic marker of colorectal cancer. Cancer Epidemiol. Biomarkers Prev., 14, 856–862.
  • 47. Miettinen T.A., et al. (1995). Reduction of serum cholesterol with sitostanol-ester margarine in a mildly hypercholesterolemic population. N. Engl. J. Med., 333, 1308–1312.
  • 48. Ostlund R.E., Jr, et al. (1999). Sitostanol administered in lecithin micelles potently reduces cholesterol absorption in humans. Am. J. Clin. Nutr., 70, 826–831.
  • 49. Gylling H., et al. (1999). Retinol, vitamin D, carotenes and alpha-tocopherol in serum of a moderately hypercholesterolemic population consuming sitostanol ester margarine. Atherosclerosis, 145, 279–285.
  • 50. Mirvish S.S. (1996). Inhibition by vitamins C and E of in vivo nitrosation and vitamin C occurrence in the stomach. Eur. J. Cancer Prev., 5 (suppl. 1), 131–136.
  • 51. Virtamo J., et al.; ATBC Study Group. (2003). Incidence of cancer and mortality following alpha-tocopherol and beta-carotene supplementation: a postintervention follow-up. JAMA, 290, 476–485.
  • 52. Roy M.J., et al. (2009). In vitro studies on the inhibition of colon cancer by butyrate and carnitine. Nutrition, 25, 1193–1201.
  • 53. Dionne S., et al. (2012). Studies on the chemopreventive effect of carnitine on tumorigenesis in vivo, using two experimental murine models of colon cancer. Nutr. Cancer, 64, 1279–1287.
  • 54. Park S.J., et al. (2012). Carnitine sensitizes TRAIL-resistant cancer cells to TRAIL-induced apoptotic cell death through the up-regulation of Bax. Biochem. Biophys. Res. Commun., 428, 185–190.
  • 55. Johne B., et al. (2001). A new fecal calprotectin test for colorectal neoplasia. Clinical results and comparison with previous method. Scand. J. Gastroenterol., 36, 291–296.
  • 56. Ahn J., et al. (2013). Human gut microbiome and risk for colorectal cancer. J. Natl Cancer Inst., 105, 1907–1911.
  • 57. Hamer H.M., et al. (2012). Functional analysis of colonic bacterial metabolism: relevant to health? Am. J. Physiol. Gastrointest. Liver Physiol., 302, G1–G9.

Articles from Carcinogenesis are provided here courtesy of Oxford University Press