Figure - PMC



letter

2022 Oct 28;72(10):2004–2006. doi: 10.1136/gutjnl-2022-328216


© Author(s) (or their employer(s)) 2023. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.


Data distribution, assessment workflow and framework evaluation. (A) Data distribution: the number of samples for each disease. The seven diseases marked in red were assessed: metabolic syndrome, gastritis, kidney stones, T2D, rheumatoid arthritis, constipation and COPD. (B) Assessment workflow. The genus abundance profiles of samples from each city were randomly divided into a training subset (80%) and a testing subset (20%). The assessment workflow for each of the three models is marked in a different colour. The testing subset of city B was used to test all three models. (C) Framework evaluation: comparison of the AUROC of the three models. Boxplots in the left panel show the AUROC of the three models for diagnosing the seven diseases using samples from each city, and the right panel shows these values collectively. *, p<0.05; **, p<0.01; ***, p<0.005; Mann-Whitney-Wilcoxon test. (D) The relationship between sample size and the performance of the three models. Boxplots show the AUROC of the three models for diagnosing three diseases (COPD, rheumatoid arthritis and T2D). The lines show the change in average AUROC of the three models as sample size increases. The dashed line shows the average AUROC of cross-regional diagnosis of T2D using a random forest model.⁵ For all boxplots, boxes represent the IQR between the first and third quartiles and the line inside represents the median. Whiskers denote the lowest and highest values within 1.5×IQR of the first and third quartiles, respectively. AUROC, area under the receiver operating characteristic curve; COPD, chronic obstructive pulmonary disease; T2D, type 2 diabetes.
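The quantities named in the legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`split_80_20`, `auroc`, `box_stats`) are hypothetical, the three models themselves are not reproduced, and the quartile convention (the "inclusive" method of the standard library) is an assumption, since the legend does not state which one was used. The sketch shows a random 80%/20% split, AUROC computed as the fraction of positive/negative pairs ranked correctly, and boxplot whiskers taken as the most extreme values within 1.5×IQR of the quartiles.

```python
import random
import statistics


def split_80_20(samples, seed=0):
    """Randomly divide samples into a training (80%) and a testing (20%) subset."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties count as one half. This pairwise form costs O(n_pos * n_neg),
    which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def box_stats(values):
    """Quartiles, median and whiskers (extreme values within 1.5 x IQR)."""
    # Assumption: 'inclusive' quartiles; other conventions shift q1/q3 slightly.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    whisker_low = min(v for v in values if v >= q1 - 1.5 * iqr)
    whisker_high = max(v for v in values if v <= q3 + 1.5 * iqr)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": whisker_low,
        "whisker_high": whisker_high,
    }
```

The pairwise count in `auroc` is the Mann-Whitney U statistic divided by the number of positive/negative pairs, which is why AUROC and the rank-based test cited in panel (C) are closely related. In `box_stats`, a value beyond the 1.5×IQR fences is excluded from the whiskers and would be drawn as an outlier.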