Table 1.
Tree Name | Figure # | Gclust2012: Cluster # | Gclust2012: # of seq | Gclust2012: E-value | Gclust2012: Identity | Gclust2017R6: Cluster # | Gclust2017R6: # of seq | Gclust2017R6: E-value | Gclust2017R6: Identity | Merge: Identity |
---|---|---|---|---|---|---|---|---|---|---|
ATS1 | 3B | 7661 | 26 | 1.00E-80 | 0.452 ± 0.131 | 2193 | 17 | 1.00E-45 | 0.360 ± 0.103 | 0.405 ± 0.117 |
PlsX | 3A | 1241 | 117 | 1.00E-31 | 0.405 ± 0.131 | not used | n/a | |||
PlsY | 3C | 937 | 141 | 1.00E-22 | 0.420 ± 0.112 | not used | n/a | |||
ATS2 & PlsC | 4 | 488 | 190 | 1.00E-10 | 0.250 ± 0.100 | 1090 | 26 | 1.00E-50 | 0.319 ± 0.098 | 0.270 ± 0.098 |
5855 | 7 | 1.00E-50 | 0.487 ± 0.084 | |||||||
4242 | 42 | 1.00E-28 | 0.420 ± 0.113 | 7159 | 6 | 1.00E-45 | 0.427 ± 0.082 | |||
LPP | 6 | 1284 | 113 | 1.00E-60 | 0.453 ± 0.175 | 1174 | 25 | 1.00E-12 | 0.252 ± 0.070 | 0.231 ± 0.115 |
2041 | 76 | 1.00E-16 | 0.260 ± 0.104 | |||||||
7867 | 25 | 1.00E-35 | 0.433 ± 0.177 | |||||||
7879 | 25 | 1.00E-80 | 0.545 ± 0.190 | |||||||
10402 | 19 | 1.00E-50 | 0.348 ± 0.143 | |||||||
10903 | 18 | 1.00E-50 | 0.519 ± 0.186 | |||||||
32991 | 5 | 1.00E-45 | 0.339 ± 0.137 | |||||||
49241 | 3 | 1.00E-60 | 0.545 ± 0.046 | |||||||
GPT4 | S8 | 5749 | 33 | 1.00E-60 | 0.388 ± 0.184 | 838 | 30 | 1.00E-50 | 0.246 ± 0.091 | 0.260 ± 0.113 |
6695 | 29 | 1.00E-45 | 0.247 ± 0.129 | |||||||
16298 | 12 | 1.00E-25 | 0.639 ± 0.314 | |||||||
LPAAT7 | 5 | 4244 | 42 | 1.00E-99 | 0.405 ± 0.207 | 1066 | 26 | 1.00E-70 | 0.243 ± 0.063 | 0.302 ± 0.132 |
6202 | 31 | 1.00E-50 | 0.392 ± 0.181 | |||||||
LPAAT9 | S7 | 5766 | 33 | 1.00E-70 | 0.400 ± 0.195 | 1492 | 21 | 1.00E-70 | 0.355 ± 0.106 | 0.333 ± 0.162 |
7815 | 25 | 1.00E-45 | 0.334 ± 0.171 | |||||||
Reference clusters for comparison (acyltransferases) | ||||||||||
NCBI CPDF | cd07984 | 100 | n/a | 0.176 ± 0.043 | ||||||
NCBI CPDF | cd07989 | 100 | n/a | 0.173 ± 0.050 | ||||||
InterPro | IPR004552 | 100 | n/a | 0.371 ± 0.255 | ||||||
InterPro | IPR004552 | 100 | n/a | 0.264 ± 0.122 | ||||||
InterPro | IPR004552 | 1000 | n/a | 0.278 ± 0.108 | ||||||
Pfam | PF01553 | 100 | n/a | 0.156 ± 0.056 | ||||||
Pfam | PF01553 | 100 | n/a | 0.150 ± 0.050 | ||||||
Note: The original clusters from Gclust2012 and Gclust2017R6 were aligned individually. The average identity ± standard deviation was calculated over all sequence pairs within each cluster. “Merge” gives the values for the actual sequence alignments used for phylogenetic tree construction. Comparable data for homologous sequence families were obtained from the NCBI Cluster of Protein Domain, InterPro, and Pfam, and the average identity score was calculated in a similar way. “E-value” is the E-value threshold. “n/a” indicates not applicable. For the reference clusters, 100 or 1000 sequences were arbitrarily taken from the original data and analyzed.
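
The "average identity ± standard deviation over all sequence pairs" statistic described in the note can be sketched as follows. This is only an illustration, not the procedure actually used for the table: the function names are hypothetical, identity is assumed to be the fraction of identical residues over columns where neither sequence has a gap, and the sample standard deviation is assumed. The table does not state which conventions were applied.

```python
from itertools import combinations
from statistics import mean, stdev


def pairwise_identity(a, b, gap="-"):
    """Fraction of identical residues between two aligned sequences.

    Assumption: columns where either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def cluster_identity(aligned):
    """Mean and sample standard deviation of identity over all pairs."""
    scores = [pairwise_identity(a, b) for a, b in combinations(aligned, 2)]
    if not scores:
        raise ValueError("need at least two sequences")
    sd = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), sd


# Toy alignment (made-up sequences, for illustration only).
toy = ["MKV-LA", "MKVALA", "MRV-LG"]
avg, sd = cluster_identity(toy)
print(f"{avg:.3f} ± {sd:.3f}")
```

For a large family, a fixed-size subset could be drawn first (for example with `random.sample`) before calling `cluster_identity`, in the spirit of the 100 or 1000 sequences taken for the reference clusters. How those subsets were actually chosen is not specified beyond "arbitrarily".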