Fig. 5.—
Functional enrichments in differentially conserved small and large UIR orthologs. (A) Significantly enriched Gene Ontology (GO) terms (P < 0.01 by hypergeometric testing after Bonferroni correction) for genes with conserved UIR size (upper or lower quintile) across the taxonomic ranks of subdivision Pezizomycotina (14 species), class Eurotiomycetes (12 species), order Eurotiales (8 species), and genus Aspergillus (6 species). Heatmap color scale represents both term significance (log10 P value after subtraction of significance for nonconserved UIR quintiles) and UIR size: positive values (red) denote large UIR orthologs and negative values (blue) denote small UIR orthologs. Term dendrogram is based on hierarchical clustering (average linkage, Euclidean distance); selected annotations are shown at right. (B, C) Pfam protein domain co-occurrence networks for genes of small (lower quintile, B) and large (upper quintile, C) UIR size maintained across the Aspergilli, arranged by sub-network connectivity. Two nodes (domains) are connected by an edge if they occur within the same protein. Node size and color are scaled by domain frequency within each conserved UIR group (repeats in a single protein are counted once); edge thickness is scaled by the number of co-occurrences for each domain pair within each conserved UIR group. Common motifs in small UIR genes are WD40, HAT (half a TPR), HEAT, TPR_2, and Helicase_C domains, which are predominantly involved in protein–protein and protein–RNA interaction and RNA processing. Relative to large UIR orthologs, motifs are not highly connected within small UIR orthologs, indicating functional stability (clustering coefficient = 0.35, average number of neighbors = 2.12). Common motifs in large UIR genes are C2H2 and Zn2C6 zinc finger DNA-binding domains, protein kinase, LRR_1 (leucine-rich repeat), WD40, PUF (Pumilio-family RNA-binding repeat) and RRM_1 repeat motifs, C2 (calcium binding), and MFS and ABC transporter domains. 
Although the Zn2C6 zinc finger domain is the dominant DNA-binding domain in fungi (found on average 4.2 times more frequently than C2H2 domains in the six Aspergilli), C2H2 domains are far more common among large UIR genes. Relative to small UIR genes, common motifs in large UIR genes are typically highly connected locally and globally (clustering coefficient = 4.27, average number of neighbors = 2.55), indicative of greater functional complexity and versatility (Yang et al. 2012).
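
The network construction described for panels B and C can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy protein annotations, and the choice of statistic (the average local clustering coefficient over connected nodes, with nodes of degree below 2 scored as 0) are assumptions made for illustration only. Domains that never co-occur with another domain have no edges and are left out of the statistics here, which is also an assumption.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(proteins):
    """Return (node_freq, edge_weight) for a domain co-occurrence network.

    proteins maps a protein ID to its list of Pfam domain names.
    Repeats of a domain within one protein are counted once.
    """
    node_freq = Counter()    # domain -> number of proteins containing it
    edge_weight = Counter()  # (domain_a, domain_b) -> number of co-occurrences
    for domains in proteins.values():
        unique = sorted(set(domains))  # collapse repeats within a protein
        node_freq.update(unique)
        for a, b in combinations(unique, 2):
            edge_weight[(a, b)] += 1
    return node_freq, edge_weight


def network_stats(edge_weight):
    """Return (average clustering coefficient, average number of neighbors).

    Assumption: nodes with fewer than two neighbors get a local
    clustering coefficient of 0; nodes without edges are ignored.
    """
    neighbors = {}
    for a, b in edge_weight:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if not neighbors:
        return 0.0, 0.0
    local = []
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            local.append(0.0)
            continue
        # Count edges that exist among this node's neighbors.
        links = sum(1 for u, v in combinations(sorted(nbrs), 2)
                    if v in neighbors[u])
        local.append(2 * links / (k * (k - 1)))
    avg_clustering = sum(local) / len(local)
    avg_neighbors = sum(len(n) for n in neighbors.values()) / len(neighbors)
    return avg_clustering, avg_neighbors


# Hypothetical toy annotations, not data from the study.
toy_proteins = {
    "p1": ["WD40", "WD40", "HEAT"],
    "p2": ["WD40", "TPR_2"],
    "p3": ["HEAT", "TPR_2"],
    "p4": ["Helicase_C", "WD40"],
}
node_freq, edge_weight = build_cooccurrence_network(toy_proteins)
avg_clustering, avg_neighbors = network_stats(edge_weight)
```

In this sketch, `node_freq` corresponds to the quantity that scales node size and color, and `edge_weight` to the quantity that scales edge thickness.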