a) The difference in percent identity for full-length Iso-Seq transcripts (n=4,718) from human-specific duplications (HSDs) mapped to both GRCh38 and the SDA contigs for CHM13. The red dotted line represents equal mapping identity to the two, whereas points to the right represent improved mapping to the SDA contigs. Six HSD gene families showed significantly (p < 0.05, two-sided Wilcoxon signed-rank test) improved mapping to the SDA-resolved contigs, with the largest difference occurring for GPRIN2. The boxes indicate the range between the first and third quartiles, with the bold line specifying the median. The whiskers extend from the first and third quartiles to the minimum and maximum values within 1.5 times the interquartile range.
b) GPRIN2 SDA contigs compared (Miropeats) to the human reference assembly (GRCh38) with gene and SD annotation. The SDA contigs close a gap (red) in GRCh38, which contains a duplicate copy of GPRIN2A denoted here as GPRIN2B. Mapping of individual Iso-Seq transcripts from the brain (inset) shows that both loci are transcribed but that GPRIN2B has several coding differences, including a 3-amino-acid insertion at position 239 in GPRIN2B compared to GPRIN2A, the ancestral copy (Figure S10, Table S7).
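The box-plot conventions described for panel a can be sketched as follows. This is a minimal illustration, not the analysis code used for the figure: the function names `identity_differential` and `box_stats` are hypothetical, the differential is assumed to be the SDA identity minus the GRCh38 identity (so that positive values fall to the right of the dotted line), and the quartile interpolation method is an assumption that may differ from the plotting software actually used.

```python
from statistics import median, quantiles


def identity_differential(pairs):
    """Per-transcript difference in percent identity.

    `pairs` holds (grch38_identity, sda_identity) tuples.
    Assumption: the differential is SDA minus GRCh38, so a positive
    value means the transcript maps better to the SDA contigs.
    """
    return [sda - ref for ref, sda in pairs]


def box_stats(values):
    """Return (lower_whisker, q1, median, q3, upper_whisker).

    Whiskers are the most extreme observations that still lie within
    1.5 times the interquartile range of the first and third quartiles.
    Requires at least two values.
    """
    data = sorted(values)
    # Linear interpolation between observations (an assumed convention).
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    lower_whisker = min(v for v in data if v >= low_fence)
    upper_whisker = max(v for v in data if v <= high_fence)
    return lower_whisker, q1, median(data), q3, upper_whisker
```

Values beyond the whiskers are treated as outliers and do not affect the whisker positions. The per-family significance test named in the legend (two-sided Wilcoxon signed-rank) would be applied to the same paired identities and is not reproduced here.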