Table 1.
| Features Compared | Human (a), N-mt Genes (%) | Human, Whole Genome Gene Set (%) | Human, Inference | Drosophila melanogaster (a), N-mt Genes (%) | D. melanogaster, Whole Genome Gene Set (%) | D. melanogaster, Inference |
|---|---|---|---|---|---|---|
| Gene duplication | | | | | | |
| Genes in gene families | 298 (18.2) | 8005 (40.5) | N-mt genes have been duplicated less often than nuclear genes (P < 2.2e-16). | 132 (22.6) | 2335 (16.8) | N-mt genes have been duplicated more often than nuclear genes (P = 0.0004). |
| Duplication events (b) | 167 | 5375 | | 75 | 1504 | |
| Retrogenes | 12 (7.2) | 96 (1.8) | RNA-mediated duplications are more prevalent for N-mt genes (P = 9.03e-05). | 26 (34.7) | 92 (6.1) | RNA-mediated duplications are more prevalent for N-mt genes (P = 2.613e-12). |
| Duplication age (c) | | | Duplications in the whole genome are significantly younger than N-mt duplications (P = 5.201e-07). | | | Duplications in the whole genome are significantly younger than N-mt duplications (P = 0.0006). |
| Older | 86 (87.8) | 1124 (64.2) | | 29 (78.4) | 430 (48.9) | |
| Younger | 12 (12.2) | 626 (35.8) | | 8 (21.6) | 449 (51.1) | |
| Relocation pattern (d) | | | N-mt duplicates were not significantly more relocated than nuclear gene duplicates (P = 0.4933). | | | N-mt duplicates have been significantly more relocated than nuclear gene duplicates (P = 4.119e-05). |
| Same chromosome | 40 (33.6) | 686 (37.0) | | 17 (45.9) | 972 (77.5) | |
| Different chromosomes | 79 (66.4) | 1167 (63.0) | | 20 (54.1) | 282 (22.5) | |
Note: All P values are based on Fisher’s exact tests. All of the D. melanogaster inferences are consistent with previous observations (Gallach et al. 2010).
(a) The human genome contains 1,640 N-mt genes out of 19,766 genes in the genome version used (Ensembl Genes 80, GRCh38.p3). The D. melanogaster genome contains 583 N-mt genes out of 13,900 genes in the genome version used (Ensembl Genes 80, BDGP6).
(b) Duplication events were inferred as the number of events needed to explain the number of genes in each gene family. For example, two genes in a gene family require only one duplication event, whereas three genes require two duplication events.
(c) An arbitrary 104.7 My cutoff (i.e., mammalian duplications vs. older duplication events) was used for the human genome analyses. For Drosophila, a 63 My cutoff (the time of Drosophila genus diversification) was used. Ages are from the GenTree database (http://gentree.ioz.ac.cn, last accessed March 17, 2017; Zhang et al. 2010).
(d) Only genes with an inferred child and parent (see Materials and Methods for more details) were used here. Because we could not assign child–parent relationships to tandem duplications, we did not consider them in this analysis.
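
The event-counting rule and the Fisher’s exact tests described in the notes can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the 2 × 2 layout (N-mt counts in one row, whole-genome counts in the other, taken directly from the relocation rows of Table 1) is an assumption about how the tests were set up, and the two-sided P value is computed by summing hypergeometric probabilities no larger than that of the observed table.

```python
from math import comb


def duplication_events(family_sizes):
    """Minimum duplication events: a family of n genes needs n - 1 events."""
    return sum(n - 1 for n in family_sizes if n > 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of every table sharing the observed margins.
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / denom
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Sum the tables that are at least as unlikely as the observed one.
    p = sum(q for q in probs.values() if q <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


# Assumed layout: (same chromosome, different chromosomes) for N-mt
# duplicates, then for whole-genome duplicates, as listed in Table 1.
human_p = fisher_exact_two_sided(40, 79, 686, 1167)
fly_p = fisher_exact_two_sided(17, 20, 972, 282)
print(f"human: P = {human_p:.4g}; D. melanogaster: P = {fly_p:.4g}")
```

If the original tests used a different table layout or a different two-sided convention, the P values printed here may differ slightly from those reported in the table.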