Table 1.
Information on the 52 Gene Properties Used in This Study
Property | Name | Description |
---|---|---|
Sequence-based | Aln_quality | Average of column confidence scores (calculated using GUIDANCE2 from Landan and Graur [2008]; Sela et al. [2015]) |
Sequence-based | AlnLen | Alignment length |
Sequence-based | AlnLen_nogaps | Alignment length after exclusion of all sites containing gaps |
Sequence-based | CAM | Number of sites containing RGC-CAM substitutions (as defined by Rogozin et al. [2007]; Polzin and Rokas [2014]) |
Sequence-based | CAM_pct | Percentage of CAM substitutions |
Sequence-based | Gap_pct_mean | Mean percentage of sites containing gaps across taxa |
Sequence-based | Gap_pct_var | Variance of percentage of sites containing gaps across taxa |
Sequence-based | GC_pct_mean | Mean GC content percentage of all sites across taxa |
Sequence-based | GC_pct_var | Variance of GC content percentage of all sites across taxa |
Sequence-based | GC1_pct_mean | Mean GC content percentage of first codon positions across taxa |
Sequence-based | GC1_pct_var | Variance of GC content percentage of first codon positions across taxa |
Sequence-based | GC2_pct_mean | Mean GC content percentage of second codon positions across taxa |
Sequence-based | GC2_pct_var | Variance of GC content percentage of second codon positions across taxa |
Sequence-based | GC3_pct_mean | Mean GC content percentage of third codon positions across taxa |
Sequence-based | GC3_pct_var | Variance of GC content percentage of third codon positions across taxa |
Sequence-based | nonCAM | Number of sites containing RGC_non-CAM substitutions (as defined by Rogozin et al. [2007]; Polzin and Rokas [2014]) |
Sequence-based | nonCAM_pct | Percentage of non-CAM substitutions |
Sequence-based | PI_pct_mean | Mean percentage of pairwise identity across taxa |
Sequence-based | PI_pct_var | Variance of percentage of pairwise identity across taxa |
Sequence-based | PI_sites | Number of parsimony-informative sites |
Sequence-based | PI_sites_pct | Percentage of parsimony-informative sites |
Sequence-based | RCV | Relative nucleotide composition variability (as defined by Phillips and Penny [2003]) |
Sequence-based | Varsites | Number of variable sites |
Sequence-based | Varsites_pct | Percentage of variable sites |
Function-based | CAI | Codon adaptation index for a S. cerevisiae or H. sapiens gene (calculated using codonw 1.4.2 from Peden [1999]) |
Function-based | CBI | Codon bias index for a S. cerevisiae or H. sapiens gene (calculated using codonw 1.4.2 from Peden [1999]) |
Function-based | CC_regions | Number of coiled-coil regions for a S. cerevisiae or H. sapiens gene (identified by Paircoil2 from McDonnell et al. [2006]) |
Function-based | Cen_distance | Physical distance between gene and centromere divided by chromosome length for a S. cerevisiae or H. sapiens gene |
Function-based | Exons | Number of exons in a S. cerevisiae or H. sapiens gene |
Function-based | Gen_interactions | Number of genetic interactions for a S. cerevisiae or H. sapiens gene (calculated using the BioGRID database from Chatr-Aryamontri et al. [2015]) |
Function-based | Gene_expression | Number of mapped reads per kilobase of a given gene per million mapped reads (calculated using 2-replicate RNA-Seq data of S. cerevisiae from Busby et al. [2011] or H. sapiens RNA-Seq data across 122 samples from Uhlén et al. [2015]) |
Function-based | GO_numbers | Number of Gene Ontology terms for a S. cerevisiae or H. sapiens gene |
Function-based | InterPros | Number of unique domains for a S. cerevisiae or H. sapiens gene |
Function-based | Paralogs | Number of paralogs of a S. cerevisiae or H. sapiens gene |
Function-based | Phy_interactions | Number of physical interactions for a S. cerevisiae or H. sapiens gene (calculated using the BioGRID database from Chatr-Aryamontri et al. [2015]) |
Function-based | Prot2Tran | Number of protein isoforms divided by number of transcripts for a S. cerevisiae or H. sapiens gene |
Function-based | Protein_abundance | Protein abundance levels for a S. cerevisiae or H. sapiens gene (calculated using the PaxDb database from Wang et al. [2012]) |
Function-based | Proteins | Number of protein isoforms for a S. cerevisiae or H. sapiens gene |
Function-based | Rel_distance | Physical position of a S. cerevisiae or H. sapiens gene divided by the length of the chromosome on which it resides |
Function-based | Repeats | Number of repeat elements for a S. cerevisiae or H. sapiens gene (identified by RepeatMasker [http://www.repeatmasker.org/; last accessed on March 21, 2016]) |
Function-based | Syn_codons_fre | Frequency of synonymous codons for a S. cerevisiae or H. sapiens gene (calculated using codonw 1.4.2 from Peden [1999]) |
Function-based | TFs | Number of transcription factors targeting a given gene (calculated using the Yeastract database of S. cerevisiae from Teixeira et al. [2014] or the ITFP database of H. sapiens from Zheng et al. [2008]) |
Function-based | Transcripts | Number of transcripts for a S. cerevisiae or H. sapiens gene |
Tree-based | Inter_len_mean | Average length of internal branches across the maximum likelihood tree of a given alignment |
Tree-based | Inter_len_var | Variance of lengths of internal branches across the maximum likelihood tree of a given alignment |
Tree-based | Leaf_len_mean | Average length of external branches across the maximum likelihood tree of a given alignment |
Tree-based | Leaf_len_var | Variance of lengths of external branches across the maximum likelihood tree of a given alignment |
Tree-based | Leaf2node_mean | Average of the sum of all branch lengths that are between the outgroup node and each ingroup node across the maximum likelihood tree of a given alignment |
Tree-based | Leaf2node_var | Variance of the sum of all branch lengths that are between the outgroup node and each ingroup node across the maximum likelihood tree of a given alignment |
Tree-based | Total_treelen | Sum of all branch lengths across the maximum likelihood tree of a given alignment |
Tree-based | Treeness | Proportion of sum of internal branch lengths over sum of all branch lengths across the maximum likelihood tree of a given alignment (as defined by Phillips and Penny [2003]) |
Tree-based | Treeness/RCV | Treeness divided by RCV (as defined by Phillips and Penny [2003]) |
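
To make a few of the definitions in Table 1 concrete, the following Python sketch computes GC_pct_mean, GC_pct_var, Varsites, PI_sites, RCV, Treeness, and Treeness/RCV for a toy nucleotide alignment. It is an illustration under stated assumptions, not the pipeline used in this study: the function names and the toy data are hypothetical, gaps and ambiguous characters are simply ignored, the variance is taken as the population variance, a parsimony-informative site is taken to be a column with at least two states that each occur in at least two taxa, and RCV follows one common reading of Phillips and Penny [2003] (summed absolute deviation of each taxon's nucleotide counts from the across-taxon mean, divided by the number of taxa times the number of sites). Branch lengths are passed in as plain lists rather than parsed from a tree file.

```python
from collections import Counter
from statistics import mean, pvariance

NUCLEOTIDES = "ACGT"


def gc_pct(seq):
    """GC percentage of one aligned sequence (gaps/ambiguity codes ignored)."""
    bases = [b for b in seq.upper() if b in NUCLEOTIDES]
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def site_counts(alignment):
    """Return (variable sites, parsimony-informative sites) for an alignment."""
    variable = informative = 0
    for column in zip(*alignment):
        counts = Counter(b for b in "".join(column).upper() if b in NUCLEOTIDES)
        if len(counts) >= 2:  # at least two different states
            variable += 1
        if sum(n >= 2 for n in counts.values()) >= 2:  # two states seen twice or more
            informative += 1
    return variable, informative


def rcv(alignment):
    """Relative composition variability (assumed form, see lead-in)."""
    taxa, sites = len(alignment), len(alignment[0])
    per_taxon = [Counter(b for b in s.upper() if b in NUCLEOTIDES) for s in alignment]
    total = 0.0
    for base in NUCLEOTIDES:
        avg = mean(c[base] for c in per_taxon)  # mean count of this base across taxa
        total += sum(abs(c[base] - avg) for c in per_taxon)
    return total / (taxa * sites)


def treeness(internal_lengths, external_lengths):
    """Sum of internal branch lengths over the sum of all branch lengths."""
    internal = sum(internal_lengths)
    return internal / (internal + sum(external_lengths))


# Hypothetical toy data: four taxa, six aligned sites, made-up branch lengths.
alignment = ["ACGTAC", "ACGTTC", "ACCTTG", "ACCAAG"]
gc_values = [gc_pct(seq) for seq in alignment]
n_variable, n_informative = site_counts(alignment)
tree_score = treeness([0.1, 0.2], [0.3, 0.4, 0.5, 0.5])

print("GC_pct_mean:", mean(gc_values))
print("GC_pct_var:", pvariance(gc_values))
print("Varsites:", n_variable, "PI_sites:", n_informative)
print("RCV:", rcv(alignment))
print("Treeness:", tree_score)
print("Treeness/RCV:", tree_score / rcv(alignment))
```

The codon-position variants (GC1, GC2, GC3) would follow the same pattern after slicing each in-frame sequence with a step of three, and the percentage columns (Varsites_pct, PI_sites_pct) by dividing the counts by the alignment length.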