2023 Dec 13;624(7992):602–610. doi: 10.1038/s41586-023-06842-7

Extended Data Fig. 1. Genomic library characteristics across different groups.

(a) Bar chart showing the proportion of non-human reads in sequencing libraries derived from blood or saliva samples. (b) Bar chart showing the proportion of mapped (green) and unmapped (orange) non-human reads in sequencing libraries derived from blood or saliva samples. (c) Boxplot showing the average depth of coverage per individual, grouped by community (NCIGP1 = pink, NCIGP2 = purple, NCIGP3 = blue, NCIGP4 = green and non-NCIG = orange). The horizontal dashed line indicates the average coverage across all libraries in the cohort. (d) Boxplot showing the N50 distribution of individual libraries in the different communities. The horizontal dashed line indicates the average N50 across all libraries in the cohort. (e) Boxplot showing the distribution of the DNA Integrity Number (DIN), which indicates the level of fragmentation of a genomic DNA sample, for individual libraries across the different communities. (f) Boxplot showing the distribution of the number of high-quality (PASS = orange) and low-quality (q5 = green) structural variants (SVs) per individual, grouped by community, after quality filtering (Quality ≥ 5). The horizontal dashed lines indicate the average number of high-quality (top) and low-quality (bottom) SVs across all libraries in the cohort. A total of n = 141 individuals (NCIGP1 = 41, NCIGP2 = 32, NCIGP3 = 9, NCIGP4 = 39 and non-NCIG = 20) were examined from independent sequencing experiments in panels c–f. In the boxplots, the middle line is the median, the box represents the interquartile range (IQR), the whiskers extend up to 1.5 times the IQR from the hinge, and any data points beyond the whiskers are shown individually.
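
The boxplot convention described in the legend can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the plotting code used for the figure: the function name `boxplot_stats` is hypothetical, the hinges are computed as Tukey hinges (plotting libraries may use slightly different quartile definitions), the whiskers are assumed to stop at the most extreme data point lying within 1.5 × IQR of each hinge, and the input values are made-up numbers rather than data from the study.

```python
from statistics import median


def boxplot_stats(values):
    """Summarise one group the way the legend describes a boxplot.

    Hypothetical helper: hinges are Tukey hinges, and whiskers are assumed
    to end at the most extreme data point within 1.5 * IQR of each hinge.
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")

    # Tukey hinges: medians of the lower and upper halves of the sorted data
    # (the overall median is included in both halves when n is odd).
    half = (n + 1) // 2
    lower_hinge = median(data[:half])
    upper_hinge = median(data[n - half:])
    iqr = upper_hinge - lower_hinge

    # Fences sit 1.5 * IQR beyond each hinge.
    low_fence = lower_hinge - 1.5 * iqr
    high_fence = upper_hinge + 1.5 * iqr

    # Whiskers end at the most extreme points inside the fences;
    # everything outside the fences is drawn as an individual point.
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]

    return {
        "median": median(data),
        "lower_hinge": lower_hinge,
        "upper_hinge": upper_hinge,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Made-up example values (not taken from the study).
example = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
for name, value in boxplot_stats(example).items():
    print(f"{name}: {value}")
```

With this convention, a single extreme value such as 30 in the example is reported as an individual point instead of stretching the upper whisker.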