2023 Jan 2;7(2):264–278. doi: 10.1038/s41559-022-01925-6

Fig. 4. Nuclear export is a selectively constrained boundary differentiating mRNAs from lncRNAs.

a–c, Estimated features of ancestral lncRNAs for the de novo genes identified using their macaque orthologues encoding lncRNAs. The distributions of GC content (a), N/C ratio (b) and splice efficiency (c) of these lncRNAs are summarized in box plots and compared with those of the genome-wide lncRNAs or protein-coding genes in macaques. Statistics for a: n = 25,211 for macaque protein-coding genes; n = 71 for macaque orthologues of human de novo genes; n = 4,370 for macaque lncRNAs; one-sided, unpaired Wilcoxon test, P = 2.5 × 10⁻⁷ and P = 3.0 × 10⁻¹³, respectively; statistics for b: n = 11,703 for macaque protein-coding genes; n = 34 for macaque orthologues of human de novo genes; n = 2,719 for rhesus macaque lncRNAs; one-sided, unpaired Wilcoxon test, P = 2.6 × 10⁻⁵ and P = 2.6 × 10⁻²; statistics for c: n = 13,463 for macaque protein-coding genes; n = 19 for macaque orthologues of human de novo genes; n = 3,702 for macaque lncRNAs; one-sided, unpaired Wilcoxon test, P = 8.0 × 10⁻⁶ and P = 7.4 × 10⁻². d, pN/pS scores in the human population are summarized in box plots for de novo genes, the macaque orthologues of these de novo genes, and the protein-coding genes conserved in human and macaque as a control. n = 13,131 for conserved genes between human and macaque; n = 41 for de novo genes; n = 60 for macaque orthologues of human de novo genes; one-sided, unpaired Wilcoxon test, P = 4.7 × 10⁻² and P = 8.4 × 10⁻⁴. The boxes represent the interquartile range, with the line across each box indicating the median. The whiskers extend to the lowest and highest values in the dataset. e,f, Derived allele frequency spectra for the human polymorphic sites that led to stronger (Strengthen, n = 78 single-nucleotide polymorphisms (SNPs)) or weaker U1 sites (Weaken, n = 175 SNPs) in de novo genes (e), as well as the macaque polymorphic sites that led to stronger (Strengthen, n = 119 SNPs) or weaker U1 sites (Weaken, n = 307 SNPs) in macaque orthologues of these de novo genes (f).
g,h, Derived allele frequency spectra for the human polymorphic sites that introduce stronger (Strengthen, n = 10 SNPs) or weaker splice sites (Weaken, n = 28 SNPs) in de novo genes (g), as well as the macaque polymorphic sites that introduce stronger (Strengthen, n = 28 SNPs) or weaker splice sites (Weaken, n = 44 SNPs) in the macaque orthologues of de novo genes (h). Derived allele frequency spectra for synonymous sites in human and macaque (Synonymous, n = 130 SNPs and 268 SNPs, respectively) are also shown as neutral controls; error bars, standard deviations (s.d.) estimated by 1,000 bootstrap replicates. Data are presented as mean ± s.d. *P ≤ 0.05; ***P ≤ 0.001; NS, not significant. i, A ‘successful stowaway’ model for the origin of functional de novo genes from lncRNA loci.
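The two statistical procedures named in this legend, a one-sided, unpaired Wilcoxon (rank-sum) test and a bootstrap estimate of the s.d. with 1,000 replicates, can be illustrated with a minimal Python sketch. The legend does not say which software the authors used, so everything here is an assumption made for illustration: the function names `rank_sum_greater` and `bootstrap_mean_sd` are hypothetical, the test uses a normal approximation with tie correction rather than an exact P value, the bootstrap is assumed to resample SNPs with replacement, and the input values are toy numbers, not data from the study.

```python
import random
import statistics
from math import erfc, sqrt


def rank_sum_greater(x, y):
    """One-sided, unpaired Wilcoxon rank-sum test of 'x tends to exceed y'.

    Returns (U statistic for x, approximate P value). Uses a normal
    approximation with tie correction and no continuity correction.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1..j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t**3 - t
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        raise ValueError("all observations are tied; the test is undefined")
    z = (u_x - mean_u) / sqrt(var_u)
    p_value = 0.5 * erfc(z / sqrt(2))  # upper-tail probability P(Z >= z)
    return u_x, p_value


def bootstrap_mean_sd(values, n_boot=1000, seed=0):
    """Return (observed mean, s.d. of the mean across bootstrap replicates)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    boot_means = [
        statistics.fmean(rng.choices(values, k=n))  # resample with replacement
        for _ in range(n_boot)
    ]
    return statistics.fmean(values), statistics.stdev(boot_means)


if __name__ == "__main__":
    # Toy values only (hypothetical), not measurements from the paper.
    group_a = [0.52, 0.55, 0.49, 0.61, 0.58]
    group_b = [0.41, 0.44, 0.47, 0.39, 0.50]
    print(rank_sum_greater(group_a, group_b))

    toy_daf = [0.02, 0.10, 0.35, 0.05, 0.80, 0.15]
    print(bootstrap_mean_sd(toy_daf))
```

With small samples such as n = 19 or n = 34, an exact test may give a somewhat different P value from the normal approximation used in this sketch.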