2024 Feb 12;9(3):595–613. doi: 10.1038/s41564-023-01580-y

© The Author(s) 2024

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Fig. 6

a, Cross-validation errors of multi-omic datasets. 16S and 18S rRNA gene data were collapsed to SILVA taxonomic levels 7 (L7) and 12 (L12). Boxplots represent the average prediction mean absolute error (MAE), in accumulated degree days (ADD), of individual bodies during nested cross-validation of the 36-body dataset. 16S rRNA soil face, soil hip, skin face and skin hip datasets contain n = 600, 616, 588 and 500 biologically independent samples, respectively. 18S rRNA soil face, soil hip, skin face and skin hip datasets contain n = 939, 944, 837 and 871 biologically independent samples, respectively. Paired 16S rRNA+18S rRNA soil face, soil hip, skin face and skin hip datasets contain n = 440, 450, 428 and 356 biologically independent samples, respectively. MAG datasets contain n = 569 biologically independent samples. Metabolite soil hip and skin hip datasets contain n = 746 and 748 biologically independent samples, respectively.

b, Mean absolute prediction errors are lowest when high-resolution taxonomic data are used for model training and prediction. Data represented contain the same biologically independent samples as in a.

In the boxplots in a and b, the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles); the upper and lower whiskers extend from the hinges to the largest and smallest values no further than 1.5× the interquartile range (IQR) from the hinge; the centre lines represent the median; the diamond symbol represents the mean.

c, Linear regressions of predicted to true ADDs, used to assess model prediction accuracy, show that all sampling locations significantly predict ADD. Data represented contain the same biologically independent samples as in a. Data are presented as mean ± 95% CI. Black dashed lines represent a 1:1 ratio of predicted to real ADD. The coloured solid lines represent the linear model calculated from the difference between the predicted and real ADD.

d, The most important SILVA L7 taxa driving model accuracy from the best-performing model, derived from 16S rRNA gene amplicon data sampled from the skin of the face.

e, Comparison of abundance changes of the most important taxon, Helcococcus seattlensis, in skin reveals that low-abundance taxa provide predictive responses. Data are plotted with loess regression and represent the same biologically independent samples as in a. Data are presented as mean ± 95% CI.

Bact., bacterial; Avg., average; Marg., marginal.
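
The quantities named in the caption for panels a and b can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the example values and the quantile convention (the "inclusive" method of Python's `statistics.quantiles`) are assumptions made only for illustration. It computes a mean absolute error between true and predicted ADD, and the boxplot elements as the caption defines them (hinges at the first and third quartiles, whiskers reaching the most extreme values within 1.5× IQR of the hinges, median line, mean marker).

```python
import statistics


def mean_absolute_error(true_values, predicted_values):
    """Average of |predicted - true| over paired observations."""
    if len(true_values) != len(predicted_values):
        raise ValueError("inputs must have the same length")
    total = sum(abs(p - t) for t, p in zip(true_values, predicted_values))
    return total / len(true_values)


def boxplot_summary(values):
    """Boxplot elements following the convention stated in the caption.

    The quantile method ("inclusive") is an assumption; other conventions
    give slightly different hinges on small samples.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "lower_hinge": q1,
        "median": median,
        "upper_hinge": q3,
        # Whiskers stop at the most extreme observations inside the fences.
        "lower_whisker": min(v for v in data if v >= lower_fence),
        "upper_whisker": max(v for v in data if v <= upper_fence),
        "mean": statistics.fmean(data),
    }


# Hypothetical ADD values, not taken from the study.
true_add = [100.0, 200.0, 300.0]
predicted_add = [110.0, 190.0, 330.0]
mae = mean_absolute_error(true_add, predicted_add)

# Hypothetical per-body errors with one outlier beyond the upper fence.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

In the second example the value 100 lies beyond the upper fence, so the upper whisker stops at the largest value inside it, while the mean is still pulled upward by the outlier. This is why the caption reports the median line and the mean diamond separately.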