2022 Sep 15;13:5412. doi: 10.1038/s41467-022-33073-7

Fig. 1. The composition of GCPAN.

a GCPAN contains two parts: the human reference genome GRCh38 and unaligned sequences. The 24 chromosomes represent GRCh38, and the sequences on the right side represent unaligned sequences, including fully unaligned sequences and partially unaligned sequences. The latter are shown as orange bars on the chromosomes on the left side (only sequences longer than 2000 bp are marked on the chromosomes).

b Distribution of distributed genes and sequences on the 22 autosomes and two sex chromosomes. The heights of the bars represent the sequence lengths. Gaps, gap closures, and gap extensions are shown in the outermost circle.

c There was no significant difference in gene numbers between normal mucosae and cancer tissues (paired t-test). Normal: normal mucosae; Tumor: cancer tissues.

d Comparison of the mapping ratios of sequencing data from 185 samples using the pan-genome versus GRCh38. Read mapping rates were significantly higher with pan-genome mapping than with GRCh38 mapping (paired t-test). The center lines of the box plots show the median values, the hinges show the first and third quartiles, and the whiskers show the maxima and minima within 1.5 times the interquartile range. ***P ≤ 0.001.

e Diagram of GCPAN from 185 individuals. The numbers of core genes and distributed genes are presented in the GRCh38 box, while the numbers of core genes and distributed genes for predicted new sequences are shown in the novel gene box. The total numbers of genes shared by cancer and normal mucosa are shown on the right side.
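The statistical conventions named in panels c and d (a paired t-test and box plots with whiskers limited to 1.5 times the interquartile range) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names `paired_t_statistic` and `box_plot_summary` are hypothetical, the quartile convention (`method="inclusive"`) is an assumption because plotting packages differ on it, and any numbers passed in are placeholders rather than values from the study.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired t-test on matched samples.

    Hypothetical helper: each position in `before` and `after` must refer
    to the same sample, e.g. one sample mapped to two different references.
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


def box_plot_summary(values):
    """Return the median, hinges, and whisker ends of a box plot.

    Whiskers end at the most extreme data points that still lie within
    1.5 times the interquartile range of the hinges.
    """
    # Assumption: "inclusive" quartiles; other conventions shift the hinges.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }
```

The t statistic still has to be converted to a P value using Student's t distribution with the returned degrees of freedom; that step is left out here because the Python standard library does not provide that distribution.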