a,b, Dimensionality reduction of the human whole transcriptome by UMAP for patients with UC and CD, as well as healthy (control) participants. c, MA plots showing the differential gene expression results, expressed as log2(fold change) between the indicated comparisons (M) as a function of log2(average gene expression) (A). Red dots represent genes differentially expressed with high statistical significance (false discovery rate (FDR) < 1 × 10⁻¹⁰). The number of differentially expressed genes and their trends are indicated in red. d, Box plots showing normalized expression of tumor necrosis factor (TNF) in ileum, colon and rectum from patients with UC and CD and from healthy (control) participants. e,f, Box plots showing normalized expression of the genes encoding calprotectin (S100A8 and S100A9) (e) and S100A12 (f) in ileum, colon and rectum from patients with UC and CD and from healthy (control) participants. g, Bar plots showing the relative abundance of the indicated bacterial phyla in stools, ileum and colon from patients with UC and CD, and healthy (control) participants. h–j, Violin plots showing Shannon diversity indices for CD, healthy (control) and UC in stools (h), colon (i) and ileum (j). k, Box plots showing the relative abundance of the order Caudovirales in colon, ileum and stools from healthy (control) participants and patients with CD and UC. All box plots show the median, minimum, maximum, and first and third quartiles of the sample distribution. Outliers are defined as values lying more than 1.5 times the interquartile range beyond the first or third quartile. Statistical differences between groups were calculated by analysis of variance with Tukey’s honestly significant difference post hoc test for multiple comparisons. Differences with adjusted P ≤ 0.05 were considered significant. For complete statistics, see Supplementary Table 2.
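As an illustration of two quantities named in this legend, the following minimal Python sketch computes a Shannon diversity index from taxon counts and the 1.5 × interquartile-range fences commonly used to flag box-plot outliers. It is not the authors' pipeline: the function names `shannon_index` and `tukey_fences` are hypothetical, the natural logarithm is assumed for the Shannon index, and the "inclusive" quartile method is assumed for the fences; the original analysis may have used different conventions.

```python
import math
from statistics import quantiles


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with non-zero counts.

    Assumption: natural logarithm; some tools use log2 instead.
    """
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences; values outside them count as outliers.

    Assumption: quartiles computed with the 'inclusive' method.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


if __name__ == "__main__":
    # Hypothetical taxon counts for a single sample (not data from the study).
    print(shannon_index([10, 10, 10, 10]))
    # Hypothetical normalized expression values (not data from the study).
    print(tukey_fences([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

A perfectly even community of four taxa gives H = ln(4), the maximum for four taxa, while a sample dominated by a single taxon gives a value close to zero.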