2017 Nov 14;9(11):3162–3178. doi: 10.1093/gbe/evx238

Table 1.

Sequence Similarity Within the Gclust Clusters and the Alignments for Trees

Columns: Tree Name, Figure #, Gclust2012 (Cluster #, # of seq, E-value, Identity), Gclust2017R6 (Cluster #, # of seq, E-value, Identity), Merge (Identity)
ATS1 3B 7661 26 1.00E-80 0.452 ± 0.131 2193 17 1.00E-45 0.360 ± 0.103 0.405 ± 0.117
PlsX 3A 1241 117 1.00E-31 0.405 ± 0.131 not used n/a
PlsY 3C 937 141 1.00E-22 0.420 ± 0.112 not used n/a
ATS2 & PlsC 4 488 190 1.00E-10 0.250 ± 0.100 1090 26 1.00E-50 0.319 ± 0.098 0.270 ± 0.098
5855 7 1.00E-50 0.487 ± 0.084
4242 42 1.00E-28 0.420 ± 0.113 7159 6 1.00E-45 0.427 ± 0.082
LPP 6 1284 113 1.00E-60 0.453 ± 0.175 1174 25 1.00E-12 0.252 ± 0.070 0.231 ± 0.115
2041 76 1.00E-16 0.260 ± 0.104
7867 25 1.00E-35 0.433 ± 0.177
7879 25 1.00E-80 0.545 ± 0.190
10402 19 1.00E-50 0.348 ± 0.143
10903 18 1.00E-50 0.519 ± 0.186
32991 5 1.00E-45 0.339 ± 0.137
49241 3 1.00E-60 0.545 ± 0.046
GPT4 S8 5749 33 1.00E-60 0.388 ± 0.184 838 30 1.00E-50 0.246 ± 0.091 0.260 ± 0.113
6695 29 1.00E-45 0.247 ± 0.129
16298 12 1.00E-25 0.639 ± 0.314
LPAAT7 5 4244 42 1.00E-99 0.405 ± 0.207 1066 26 1.00E-70 0.243 ± 0.063 0.302 ± 0.132
6202 31 1.00E-50 0.392 ± 0.181
LPAAT9 S7 5766 33 1.00E-70 0.400 ± 0.195 1492 21 1.00E-70 0.355 ± 0.106 0.333 ± 0.162
7815 25 1.00E-45 0.334 ± 0.171
Reference clusters for comparison (acyltransferases)
NCBI CPDF cd07984 100 n/a 0.176 ± 0.043
cd07989 100 n/a 0.173 ± 0.050
InterPro IPR004552 100 n/a 0.371 ± 0.255
100 n/a 0.264 ± 0.122
1000 n/a 0.278 ± 0.108
Pfam PF01553 100 n/a 0.156 ± 0.056
100 n/a 0.150 ± 0.050

Note.—The original clusters from Gclust2012 and Gclust2017R6 were aligned individually. The average identity ± standard deviation was calculated over all sequence pairs within each cluster. “Merge” gives the values for the actual sequence alignments used to construct the phylogenetic trees. Comparable data for homologous sequence families were obtained from NCBI Cluster of Protein Domain, InterPro, and Pfam, and the average identity was calculated in the same way. “E-value” is the E-value threshold. “n/a” indicates not applicable. For the reference clusters, 100 or 1000 sequences were arbitrarily taken from the original data and analyzed.
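
The identity statistic described in the note can be illustrated with a minimal Python sketch. It is not the authors' code, and the note does not specify several details, so the following are assumptions: identity is counted only over alignment columns where neither sequence has a gap, the standard deviation is the population form, and the function names `pairwise_identity` and `cluster_identity` are hypothetical. The toy sequences are made up and are not data from the table.

```python
from itertools import combinations
from statistics import mean, pstdev

GAP = "-"


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical residues between two aligned sequences.

    Assumption: columns with a gap in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == GAP or y == GAP:
            continue  # ignore gapped columns (assumed convention)
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def cluster_identity(aligned):
    """Mean and standard deviation of identity over all sequence pairs."""
    scores = [pairwise_identity(a, b) for a, b in combinations(aligned, 2)]
    if not scores:
        raise ValueError("need at least two aligned sequences")
    # Assumption: population standard deviation; the note does not say which form was used.
    return mean(scores), pstdev(scores)


# Toy alignment (invented sequences, for illustration only)
toy = ["MKV-LA", "MKVALA", "MRV-LG"]
avg, sd = cluster_identity(toy)
print(f"{avg:.3f} ± {sd:.3f}")
```

Because every pair contributes one score, a cluster of n sequences yields n(n-1)/2 values. This is one reason sampling 100 or 1000 sequences from a large reference family keeps the calculation tractable.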