Template-based protein structure modeling using TASSER^VMT

Hongyi Zhou; Jeffrey Skolnick

doi:10.1002/prot.23183

Author manuscript; available in PMC: 2013 Mar 14.

Published in final edited form as: Proteins. 2011 Nov 22;80(2):352–361. doi: 10.1002/prot.23183

PMCID: PMC3291807 NIHMSID: NIHMS325345 PMID: 22105797

Abstract

Template-based protein structure modeling is commonly used for protein structure prediction. Based on the observation that multiple template-based methods often perform better than single template-based methods, we further explore the use of a variable number of multiple templates for a given target in the latest variant of TASSER, TASSER^VMT. We first develop an algorithm that improves the target-template alignment for a given template. The improved alignment, called the SP³ alternative alignment, is generated by a parametric alignment method coupled with short TASSER refinement on models selected using knowledge-based scores. The refined top model is then structurally aligned to the template to produce the SP³ alternative alignment. Templates identified using SP³ threading are combined with the SP³ alternative and HHSEARCH alignments to provide target alignments to each template. These template models are then grouped into sets containing a variable number of template/alignment combinations. For each set, we run short TASSER simulations to build full-length models. Then, the models from all sets of templates are pooled, and the top 20–50 models are selected using the FTCOM ranking method. These models are then subjected to a single longer TASSER refinement run for the final prediction. We benchmarked our method by comparison with our previously developed approach, pro-sp3-TASSER, on a set with 874 Easy and 318 Hard targets. The average GDT-TS score improvements for the first model are 3.5% and 4.3% for Easy and Hard targets, respectively. When tested on the 112 CASP9 targets, our method improves the average GDT-TS scores as compared to pro-sp3-TASSER by 8.2% and 9.3% for the 80 Easy and 32 Hard targets, respectively. It also shows slightly better results than the top ranked CASP9 Zhang-Server, QUARK and HHpredA methods. The program is available for download at http://cssb.biology.gatech.edu/.

Keywords: template-based modeling, threading, alignment, SP³, TASSER

INTRODUCTION

Despite considerable efforts to develop accurate template-free approaches, template-based protein structure modeling is still the only reliable protein structure modeling method¹. Template-based modeling involves (1) template identification, usually by threading; (2) construction of the target-template alignment; (3) model generation based on this alignment; and (4) refinement of the template models with the goal of generating structures closer to the target's native state than those provided by the template. The accuracy of the final model depends on the alignment accuracy between the template and target as well as the quality of the subsequent refinement. An ideal threading method should identify the best templates, viz. those with the best structural similarity to the native structure. In practice, this is not always the case². Furthermore, the target-template alignment may not be optimal, especially for Hard targets³. Hard targets are those with poor quality threading alignments, whether their structural alignments are of good or poor quality. Advances have been made in increasing the accuracy of template identification and threading alignment by going from single sequence alignments⁴ to profile-profile alignments^5–12, to machine learning^13,14 and metathreading^6,15,16. Recent studies show that using multiple templates yields better models than those obtained from a single best template approach^17–22.

Results from the latest Critical Assessment of Protein Structure Prediction (CASP9)²³ show that the top ranked automatic servers all used multiple template information in template-based modeling²³. The Zhang-Server²⁴, which performed well across all levels of target difficulty, employed a locally installed metaserver, LOMETS²⁵, that uses more than 8 individual threading methods for template identification/target alignment and TASSER¹⁷ for refinement. HHpredA^26,27, which performed well for Easy targets but only moderately well for Hard targets, used an updated method for hidden Markov profile generation and a neural network trained to filter distance restraints from multiple templates²³. The final models are built by MODELLER²⁸ using filtered distance restraints. RaptorX^29,30, on the other hand, performed well for Hard targets and slightly worse for Easy targets. RaptorX^29,30 and its variants RaptorX-MSA and RaptorX-Boost employed an outstanding alignment method that uses a boost tree to fit the alignment matching scores and a new method of generating multiple template-sequence alignments²⁰. Most methods used a fixed number of multiple templates for a given target to produce the intermediate or final models. For example, standard TASSER¹⁷ uses 20, 30, or 50 templates as input for Easy (those targets with good quality threading alignments), or Hard targets (those having templates with poor quality threading alignments but often having good structural alignments). Pro-sp3-TASSER²², which performed fairly well in CASP8³¹ and CASP9²³, uses a few different sets with a fixed number of templates for generating TASSER inputs.

Obvious lessons learned from the top performing approaches in recent CASPs are that the use of multiple templates is an effective way of improving template-based modeling methods and that further improvement can be achieved by improving target-template alignments and filtering or selecting those templates that are optimally compatible with the target. For example, the outstanding performance of Zhang-Server and the template-based part of QUARK could be attributed partly to the consensus information from many threading methods that serve as a filter to remove poor quality templates and bad alignments, the inclusion of template fragments by segment threading³², the predicted contacts by SVMSEQ³³, and finally TASSER refinement. The use of better alignments and a neural-network trained filter for distance restraints in HHpredA makes it stand out among other multiple template-based methods, especially for Easy targets.

In this work, we follow a scheme similar to that of our previous pro-sp3-TASSER³¹ approach and use multiple sets of templates for a given target. In contrast to pro-sp3-TASSER, which uses five different component threading scores, our new approach TASSER^VMT (TASSER with Variable number of Multiple Templates) uses SP³ threading³⁴ for template identification. To build better models, we developed an algorithm to improve the target-template alignment for a given template. Alignments generated by HHSEARCH²⁶ are also included, since our tests show that they are slightly better than the default SP³ threading alignments for Easy targets and could provide complementary sampling of alignment space. Then, short TASSER simulations build a pool of structures derived from different combinations of templates and their associated target alignments. The rationale for using subsets of a given multiple-template set is to explore different combinations of templates and thereby sample distance restraints or contacts, in order to find the subset that is optimally compatible with the target. FTCOM serves as an effective filter, equivalent to the neural-network trained distance restraint filter in HHpredA, to select the top 20–50 models that are then subjected to longer TASSER refinement. Table 1 summarizes the major methodological differences between the current TASSER^VMT and the previous pro-sp3-TASSER²² methods. We benchmarked TASSER^VMT against pro-sp3-TASSER²² on a set of 874 Easy and 318 Hard targets. We also tested TASSER^VMT on the 112 CASP9 targets and compared the results to the top performing CASP9 methods, Zhang-Server, QUARK, HHpredA, RaptorX, Seok-server, MULTICOM_CLUSTER and BAKER-ROSETTASERVER. We show that TASSER^VMT represents a promising approach to template-based modeling.

Table 1.

Summary of major differences between TASSER^VMT and pro-sp3-TASSER.

	TASSER^VMT	pro-sp3-TASSER
Template identification	SP³ threading	SP³ & four other threading scores derived from PROSPECTOR_3⁴⁷
Alignments	SP³ alternative & HHSEARCH for all targets	SP³ default for Easy targets; SP³ default & SP³ alternative for Hard targets
Alternative alignment generation procedure	SP³ parametric; FTCOM & DFIRE selection; short TASSER refinement	SP³ parametric; TASSER-QA selection
Number of template sets	120 for Easy targets; 80 for Hard targets	1 for Easy targets; 11 for Hard targets
Model selection from structure pool for final refinement	FTCOM	TASSER-QA


METHOD

The overall flowchart of TASSER^VMT is shown in Figure 1(a). Figure 1(b) shows the SP³ alternative alignment generation procedure that is part of Figure 1(a). In what follows, we shall describe the approach in more detail.

SP³ alternative alignment generation

For a given template, our goal is to improve the target-template alignment over that provided by SP³ threading. Targets are classified as Easy if the SP³ threading³⁴ Z-score of the top template is ≥6.0, as Medium if 4.5 < Z-score < 6.0, and as Hard if Z-score ≤4.5. We refer to them as SP³ Easy, Medium and Hard. In practice, we lump the Medium and Hard targets into the same category, Hard, for the prediction protocol. This classification is for prediction purposes: depending on the predicted target difficulty, it allows for the use of different parameters in TASSER simulations and different template choices. (In contrast, in the Results section, Easy and Hard targets are defined for assessment purposes using the actual model quality of the predicted models.) In pro-sp3-TASSER²², we generate alternative alignments only for SP³ Hard targets using a parametric alignment generation method³⁵ and select the top alignment simply by TASSER-QA³⁶ without any refinement. Here, we also generate alternative alignments for Easy targets and employ short TASSER refinement.
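
The Z-score thresholds above can be written out as a short sketch. The function name and the `lump_medium` flag are hypothetical and introduced only for illustration; they are not part of the TASSER^VMT distribution.

```python
def classify_sp3_target(z_score: float, lump_medium: bool = True) -> str:
    """Classify a target from the SP3 Z-score of its top-ranked template.

    Thresholds follow the text: Easy if Z >= 6.0, Hard if Z <= 4.5,
    Medium in between. With lump_medium=True (the prediction protocol
    described in the text), Medium targets are treated as Hard.
    """
    if z_score >= 6.0:
        return "Easy"
    if z_score <= 4.5:
        return "Hard"
    return "Hard" if lump_medium else "Medium"
```

With the default setting, a Z-score of 5.2 is treated as Hard; with `lump_medium=False` it is reported as Medium.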

Given a target sequence and a template structure, we utilize a parametric alignment method^35,37 to generate an ensemble of alignments. This is realized by a grid search on the five dimensional parameter space (w₀, w₁, w_2ndary, w_struc, S_shift) of the SP³ threading score³⁴, where (w₀, w₁) are gap penalties, w_2ndary is the secondary structure score weight, w_struc is the weight of the structurally derived profile of the template, and S_shift is a shift parameter. Each parameter is sampled with 0.0, 0.5, 1.0, 1.5, and 2.0 times the original value, except for the gap opening penalty, w₀, which is sampled with 0.5, 1.0, 1.5, 2.0, and 2.5 times the original value. Because a gap opening penalty of zero could lead to unrealistic, highly gapped alignments, this value of w₀ is not sampled. Alignments are ranked according to their coverage (fraction of target residues aligned), with the top 500 distinct alignments kept. We then employ MODELLER²⁸ to build the full length models and add the side chains, and apply FTCOM ranking³⁸ and the DFIRE statistical potential³⁹ to select the top 21 models from these 500 models. The reason for using FTCOM and DFIRE is that they are not consensus-based scores, so they could potentially select a few good models. We apply the two independent components of FTCOM³⁸, fragment comparison and template comparison, separately; each component selects 7 models; we then use DFIRE to select another 7 models. A short TASSER refinement run (limited to 5 hours) is conducted using the selected 21 models as input, and the top cluster centroid obtained by SPICKER⁴⁰ clustering of the TASSER trajectories is selected as the top model. We then use TM-align⁴¹ to generate the structural alignment of this model to the template. The resulting alignment is the new target-template alignment, which we term the “SP³ alternative alignment” of the corresponding target-template pair.
We also employed the HHSEARCH alignment as an independent alignment to the SP³ alternative alignment for each template because it is slightly better than the SP³ default alignment for Easy targets, and it provides alignments that could potentially be complementary to the SP³ alternative alignments for both Easy and Hard targets.
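
The parametric grid and the coverage-based filtering described above can be sketched as follows. This is a minimal illustration under stated assumptions: the default parameter values in `HYPOTHETICAL_DEFAULTS` are placeholders (the actual SP³ values are not given here), an alignment is represented as a tuple of (target index, template index) pairs, and ties in coverage are broken arbitrarily but deterministically. The alignment step itself (running SP³ with each parameter set) is not shown.

```python
from itertools import product

PARAM_NAMES = ("w0", "w1", "w_2ndary", "w_struc", "s_shift")
STANDARD_MULTIPLIERS = (0.0, 0.5, 1.0, 1.5, 2.0)
# The gap opening penalty w0 is never scaled to zero.
GAP_OPEN_MULTIPLIERS = (0.5, 1.0, 1.5, 2.0, 2.5)

# Placeholder defaults (assumption): real SP3 values are not stated in the text.
HYPOTHETICAL_DEFAULTS = {"w0": 10.0, "w1": 1.0, "w_2ndary": 1.0,
                         "w_struc": 1.0, "s_shift": 1.0}


def parameter_grid(defaults):
    """Yield every parameter combination of the five-dimensional grid."""
    axes = []
    for name in PARAM_NAMES:
        multipliers = GAP_OPEN_MULTIPLIERS if name == "w0" else STANDARD_MULTIPLIERS
        axes.append([defaults[name] * m for m in multipliers])
    for values in product(*axes):
        yield dict(zip(PARAM_NAMES, values))


def top_by_coverage(alignments, target_length, keep=500):
    """Keep the `keep` distinct alignments with the highest coverage.

    Coverage is the fraction of target residues aligned; each alignment
    is a tuple of (target_index, template_index) pairs.
    """
    distinct = set(alignments)
    ranked = sorted(distinct, key=lambda a: (-len(a) / target_length, a))
    return ranked[:keep]
```

Five values per parameter over five parameters give 5⁵ = 3125 parameter sets per target-template pair, from which at most 500 distinct alignments are retained.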

Template identification and construction of a variable number of multiple template sets

In this work, we use SP³ threading³⁴ to identify templates. To explore different regions of the combinatorial space of templates, we rank templates by four different output scores: (1) the default SP³ rank score, i.e. the raw threading score minus the reverse threading score; (2) the raw threading score; (3) the raw threading score/alignment length; (4) the raw threading score/target length. For each target, four sets of ranked templates are collected from the top (10 for Easy, 20 for Hard targets) ranked templates corresponding to the above four scores. We denote them as (T₁,……, T₁₀)_i and (T₁,….., T₂₀)_i for Easy and Hard targets, respectively, with i=1,2,3,4 denoting the templates from the four scores. T_k^HHM and T_k^sp3a denote the template alignments of the kth ranked template T_k in each set i=1,2,3,4, using the HHSEARCH and the above generated SP³ alternative alignments, respectively.

To explore the combinatorial space of multiple templates, for Easy targets, the following sets of template alignments are used in subsequent model building: (1) sets of (T₁^HHM)_i, (T₁^HHM, T₂^HHM)_i, ……, (T₁^HHM, T₂^HHM, ……, T₁₀^HHM)_i; (2) sets of (T₁^sp3a)_i, (T₁^sp3a, T₂^sp3a)_i,……, (T₁^sp3a, T₂^sp3a, ……, T₁₀^sp3a)_i; (3) sets of (T₁^HHM, T₁^sp3a)_i, (T₁^HHM, T₁^sp3a, T₂^HHM, T₂^sp3a)_i, ……, (T₁^HHM, T₁^sp3a, T₂^HHM, T₂^sp3a, ……, T₁₀^HHM, T₁₀^sp3a)_i with i=1,2,3,4. For Hard targets, the template alignment sets are (T₁^HHM, T₁^sp3a)_i, (T₁^HHM, T₁^sp3a, T₂^HHM, T₂^sp3a)_i, ……, (T₁^HHM, T₁^sp3a, T₂^HHM, T₂^sp3a, ……, T₂₀^HHM, T₂₀^sp3a)_i with i=1,2,3,4. In these combinations, only templates within the same i (=1,2,3,4) are combined. The total number of template alignment sets is 120 for an Easy target and 80 for a Hard target.

It should be noted that the above sets of template alignments represent only a small portion of all possible combinations. For 10 ranked templates and a given alignment method, the 10 sets we used constitute less than 1% of all 1023 possible combinations. However, we have found that among all sets with k (≤10) templates, the one we used is the most significant one because it contains the best-ranked k templates of the 10 templates.
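
The construction of the template alignment sets can be sketched as below; the counts of 120 and 80 sets and the fraction 10/1023 follow directly from it. The tuple labels (`score index`, `rank`, `alignment type`) are a hypothetical representation chosen for illustration.

```python
def template_alignment_sets(difficulty):
    """Enumerate the nested template alignment sets for one target.

    Each set is a list of (score_index, rank, alignment_type) labels,
    where score_index i = 1..4 is the ranking score, rank k is the
    template rank, and alignment_type is "HHM" or "sp3a".
    Easy targets use the top 10 templates and three set families;
    Hard targets use the top 20 templates and only the combined family.
    """
    n_top = 10 if difficulty == "Easy" else 20
    sets = []
    for i in range(1, 5):
        for k in range(1, n_top + 1):
            hhm = [(i, rank, "HHM") for rank in range(1, k + 1)]
            sp3a = [(i, rank, "sp3a") for rank in range(1, k + 1)]
            # Interleave as (T1^HHM, T1^sp3a, T2^HHM, T2^sp3a, ...).
            combined = [label for pair in zip(hhm, sp3a) for label in pair]
            if difficulty == "Easy":
                sets.append(hhm)
                sets.append(sp3a)
            sets.append(combined)
    return sets


# Fraction of all non-empty subsets of 10 templates covered by the
# 10 nested "top-k" sets used for one score and one alignment method.
nested_fraction = 10 / (2 ** 10 - 1)
```

Here `len(template_alignment_sets("Easy"))` is 4 × 10 × 3 = 120, `len(template_alignment_sets("Hard"))` is 4 × 20 = 80, and `nested_fraction` is about 0.0098, i.e. just under 1%.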

Building models by short TASSER simulations

For each set of multiple templates (120 for Easy and 80 for Hard targets), we run short TASSER¹⁷ simulations (limited to 10 hours to control the total modeling time) to build full length models. Each TASSER simulation contributes up to five models from the top five cluster centroids by SPICKER⁴⁰ clustering on the TASSER trajectories. The total number of models built is therefore up to 600 for each Easy target and up to 400 for each Hard target. We found that for Hard targets, building more models from the template sets (T₁^HHM)_i, (T₁^HHM, T₂^HHM)_i, ……, (T₁^HHM, T₂^HHM, ……, T₁₀^HHM)_i, (T₁^sp3a)_i, (T₁^sp3a, T₂^sp3a)_i, ……, (T₁^sp3a, T₂^sp3a, ……, T₁₀^sp3a)_i with i=1,2,3,4 does not give an obvious improvement. Therefore, these sets of template alignments are not used for Hard targets but are used for Easy targets, where an increase in the number of template sets did yield some improvement. The reason could be that for Easy targets, good models dominate in the additional ensemble of models; whereas for Hard targets, bad models dominate and lead to more false positives in the next step of model selection.

TASSER model refinement

After the pool of models is generated by short TASSER simulations, the top 20/50 models for each SP³ Easy/Hard target are selected by FTCOM³⁸, a model assessment method that combines fragment comparison and template comparison scores. The selected models are subsequently input into TASSER for refinement. After around 20 hours of simulation, the TASSER trajectories are clustered with the SPICKER method⁴⁰ and the top five cluster centroids are used for the final prediction. Possible clashes are removed and the main-chain and side-chain atoms are built with Pulchra⁴². The longer TASSER refinement simulation is limited to 20 hours because of the 72-hour limit on total modeling time for servers in CASP²³. However, if total modeling time is not an issue, these time restrictions can be relaxed and possibly better results might be obtained.

RESULTS

We tested our method on a large-scale benchmark set and compared the results with our previous method pro-sp3-TASSER^22,31. For the CASP9 target set, we also compared our results to the top performing approaches in CASP9²³.

Large-scale benchmark

A data set of 1192 targets was constructed from the SP³ threading library. These targets share a sequence identity of less than 35% among themselves and to the threading template library used for template identification. All library structures were released before the targets. We compare TASSER^VMT with our previously developed method pro-sp3-TASSER using exactly the same template library. In our assessment, a target is classified as Easy if the predicted first model has a TM-score⁴³ to native > 0.5; otherwise, it is defined as a Hard target. The benchmark set has 874 Easy and 318 Hard targets. In accordance with convention, the first model’s GDT-TS score⁴⁴ is employed for model quality assessment. In Table 2, we compare the performance of TASSER^VMT and pro-sp3-TASSER as assessed by the cumulative GDT-TS scores of the first model as well as the foldability of Hard targets. Foldability is defined as the number of targets whose first model has a TM-score to native ≥ 0.4. Two structures having a TM-score ≥ 0.4 are considered to have significant structural similarity^43,45. On average, TASSER^VMT improves the GDT-TS scores by 3.5% and 4.3% for Easy and Hard targets, respectively. These improvements are statistically significant as shown by their respective p-values. For the 318 Hard targets, foldability improves from 61 to 85 targets. Figure 2 shows the scatter plot of the comparison of GDT-TS scores for the two approaches. While the majority (765) of targets are improved by TASSER^VMT, there are also targets (397) that get worse relative to pro-sp3-TASSER. Figures 3(a) and (b) show the cumulative GDT-TS score histograms for TASSER^VMT and pro-sp3-TASSER for Easy and Hard targets, respectively. TASSER^VMT improves over pro-sp3-TASSER models for all GDT-TS score cutoffs, with the maximal improvements around GDT-TS score cutoffs of 0.7 and 0.3 for Easy and Hard targets, respectively.

Table 2.

Comparison of TASSER^VMT with pro-sp3-TASSER on the 1192 protein benchmark set.

	874 Easy targets^#		318 Hard targets
	Cumulative GDT-TS	p-value^*	Cumulative GDT-TS^§	p-value
TASSER^VMT	615.27	-	100.36 (85)	-
pro-sp3-TASSER	594.62	7.7×10⁻²⁰	96.22 (61)	1.9×10⁻⁴


^# A target is defined as Easy if the TM-score to native of the first model by at least one of the two methods is higher than 0.5; otherwise, it is classified as Hard.

^* Two-sided p-value of the Student's t-test between TASSER^VMT and pro-sp3-TASSER. A p-value of ≤0.05 is considered statistically significant.

^§ Numbers in brackets are the foldabilities defined as the number of targets having a first model TM-score to native ≥0.4.
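
The summary quantities in Table 2 can be reproduced with a few lines of arithmetic. The sketch below uses hypothetical helper names; only the cumulative values quoted in the table are taken from the text, and no per-target scores are assumed.

```python
def cumulative_gdt_ts(first_model_scores):
    """Cumulative GDT-TS: the sum of first-model GDT-TS scores over a target set."""
    return sum(first_model_scores)


def foldability(first_model_tm_scores, cutoff=0.4):
    """Number of targets whose first model has a TM-score to native >= cutoff."""
    return sum(1 for score in first_model_tm_scores if score >= cutoff)


def relative_improvement(new, old):
    """Fractional improvement of `new` over `old`."""
    return (new - old) / old


# Cumulative GDT-TS values from Table 2 (TASSER^VMT vs. pro-sp3-TASSER).
easy_gain = relative_improvement(615.27, 594.62)
hard_gain = relative_improvement(100.36, 96.22)
```

Rounded to one decimal place, `100 * easy_gain` and `100 * hard_gain` give the 3.5% and 4.3% improvements quoted in the text.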

Figure 2. Scatter plot comparison of the first model’s GDT-TS scores by pro-sp3-TASSER and TASSER^VMT on the 1192 protein benchmark target set. The number of targets for which TASSER^VMT is better/worse than pro-sp3-TASSER is 765/397.

Figure 3. Histogram comparison of first model GDT-TS scores by pro-sp3-TASSER and TASSER^VMT on the 1192 target set. (a) Easy set; (b) Hard set. The y-axis is the number of targets having a first model GDT-TS score > x.

We next analyze the alignment accuracy of our alternative alignment generation method in comparison with HHSEARCH and the structural alignment method TM-align⁴¹ using native structures. We compare the model quality, as measured by GDT-TS scores to native, of models built from the alignments using MODELLER²⁸, and we also compare the alignments directly with the structural alignment from TM-align⁴¹. In Table 3, we show the average model GDT-TS score changes relative to using SP³ alignments and the average percentage of exact, ±1 and ±2 shifted residue matches to the structural alignment as provided by TM-align for the top 10 SP³ templates. Both the SP³ alternative and HHSEARCH alignments are 1–3% better than SP³ alignments in terms of model GDT-TS scores to native for the Easy targets. However, for Hard targets, HHSEARCH is worse than SP³, whereas SP³ alternative alignments are still ~3% better than SP³. All methods are much worse than the best possible structural alignment generated by TM-align. For the exact match to the structural alignment given by TM-align, the SP³ alternative alignment is around 2% (absolute value) or 3% (relative value) better than the SP³ alignment for the Easy targets. For the Hard targets in the benchmark set, the average exact match to the structural alignment is quite low because many target-template alignments have zero matches. The relative improvement of the SP³ alternative alignment (13.09%) over the SP³ alignment (12.87%) for the Hard targets is 1.7%, which is less than the GDT-TS score improvement of 2.6%. However, if the match is allowed to shift within ±1 or ±2 residues, then the relative improvement of the SP³ alternative alignment over that of SP³ for the Hard targets is 3.0% or 3.5%, respectively. These improvements are consistent with the GDT-TS score result. Overall, the SP³ alternative alignment is consistently better than the SP³ alignment. A similar trend is also found for the CASP9 set described in the next subsection. Since HHSEARCH provides an independent alignment for a given target-template pair, we include it in our method to explore the alignment space more effectively.

Table 3.

Alignment accuracy on the top 10 SP³ templates for the 1192 benchmark set.

GDT-TS score to native^‡
	Benchmark set		CASP9 set
	874 Easy	318 Hard	80 Easy	32 Hard
	Change relative to SP³^*	Change relative to SP³	Change relative to SP³	Change relative to SP³
SP³ alternative alignment	2.9%	2.6%	2.6%	3.2%
HHSEARCH alignment	0.7%	−2.9%	1.4%	−2.8%
TM-align alignment	15.3%	40.7%	16.7%	38.8%
Comparison with TM-align
	Correct match^#	Correct match	Correct match	Correct match
SP³ alignment	58.95/68.09/72.17	12.87/19.04/23.16	58.58/67.21/71.02	19.80/25.84/29.73
SP³ alternative alignment	60.77/70.62/74.47	13.09/19.61/23.96	60.19/69.08/72.75	20.75/27.96/32.23
HHSEARCH alignment	59.70/68.67/72.56	12.01/17.97/21.88	60.04/68.57/72.50	18.61/24.63/28.84


^‡ GDT-TS scores are calculated using full length models built from alignments with MODELLER²⁸.

^* Defined as (model GDT-TS of this method – model GDT-TS of SP³)/(model GDT-TS of SP³).

^# A correct match is counted when the threading and structural alignments place the target residue on template residues that are identical (first number), within ±1 (second number), or within ±2 (third number) residues of each other. The number of matches in a given target is normalized by the alignment length of TM-align⁴¹. Presented numbers are the average per target per template, in percentages.
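
The correct-match measure in the table footnote can be sketched as follows. This is an illustrative implementation, not the authors' code: alignments are assumed to be dictionaries mapping a target residue index to a template residue index, and template residues are assumed to be numbered consecutively so that a shift can be measured as an index difference.

```python
def correct_match_percentages(threading, structural, shifts=(0, 1, 2)):
    """Percentage of residue matches between two alignments.

    threading, structural: dicts {target_index: template_index}.
    For each allowed shift s, a target residue counts as matched when
    it is aligned in both alignments and the two template indices
    differ by at most s. Counts are normalized by the length of the
    structural (TM-align) alignment.
    """
    n_structural = len(structural)
    if n_structural == 0:
        return tuple(0.0 for _ in shifts)
    percentages = []
    for shift in shifts:
        hits = sum(
            1
            for target_idx, template_idx in threading.items()
            if target_idx in structural
            and abs(template_idx - structural[target_idx]) <= shift
        )
        percentages.append(100.0 * hits / n_structural)
    return tuple(percentages)
```

The three returned values correspond to the exact, ±1 and ±2 entries reported per alignment method in Table 3, which are further averaged per target and per template.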

In Figure 4, we present some examples of TASSER^VMT improvements over pro-sp3-TASSER. Target 2qsv_A domain 1 is a 101 residue Hard target for the threading method SP³. The predicted first model’s GDT-TS score by pro-sp3-TASSER is only 0.19, whereas for TASSER^VMT, it is 0.61. The best model in the structure pool of pro-sp3-TASSER generated by short TASSER runs has a GDT-TS score of 0.60, while that of TASSER^VMT is 0.70 due to better alignments and more template combinations. The final prediction is dictated by the quality of the models selected for the TASSER refinement procedure. The GDT-TS score of the top model selected by FTCOM³⁸ in TASSER^VMT is 0.62, while that of the model selected by TASSER-QA³⁶ in pro-sp3-TASSER is 0.24. Therefore, the main reason for the improvement for this target is better model selection. Another example is 3bci_A, a 165 residue Easy target for SP³, whose GDT-TS improves from 0.62 to 0.72. Since it is an Easy target, the model selection process alone does not make a big difference. The difference comes from the fact that the best model in the structure pool in pro-sp3-TASSER has a GDT-TS score of 0.63 while that from TASSER^VMT has a GDT-TS score of 0.73 due to the better alignments employed. For comparison, TASSER models are also shown in Figure 4. Although on average pro-sp3-TASSER is better than TASSER²², for the target 2qsv_A it is slightly worse.

Figure 4. Examples of TASSER^VMT improvement over pro-sp3-TASSER models. Numbers in parentheses are GDT-TS scores to native. TASSER models are also shown for comparison.

CASP9 targets

To compare TASSER^VMT with other state-of-the-art template-based methods, we tested TASSER^VMT on all 112 CASP9 targets. A target is defined as Easy if the average TM-score⁴³ of the first models generated by the best 50% of all CASP9 servers is higher than 0.5; otherwise, it is defined as Hard. The list of Easy/Hard targets can be found at http://zhanglab.ccmb.med.umich.edu/casp9/. This definition is equivalent to that in the above benchmark test where two methods are involved: TASSER^VMT and pro-sp3-TASSER. In what follows, CASP servers with small variations will be represented by one server. For example, HHpredA represents all three HHpred servers, RaptorX represents all RaptorX servers, and MULTICOM_CLUSTER represents all MULTICOM⁴⁶ servers.
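
The assessment-time Easy/Hard definition for CASP9 targets can be sketched as below. Two details are assumptions made for illustration: the "best 50%" of servers is taken per target by ranking first-model TM-scores, and for an odd number of servers the count is rounded down (with at least one server kept). The authoritative target list is the one at the URL given above.

```python
def casp_assessment_class(first_model_tm_scores, cutoff=0.5):
    """Classify a CASP target as Easy or Hard for assessment purposes.

    first_model_tm_scores: TM-scores to native of the first models
    submitted by all servers for one target. The target is Easy if the
    mean TM-score over the best half of the servers exceeds `cutoff`.
    """
    if not first_model_tm_scores:
        raise ValueError("at least one server score is required")
    ranked = sorted(first_model_tm_scores, reverse=True)
    n_best = max(1, len(ranked) // 2)  # assumption: round down for odd counts
    mean_best = sum(ranked[:n_best]) / n_best
    return "Easy" if mean_best > cutoff else "Hard"
```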

To ensure a fair comparison, all necessary inputs (profiles, templates, fragments, secondary structure predictions, etc.) were generated using information that was available at the time of CASP9. The results are compiled in Table 4. TASSER^VMT outperforms all other methods for the 80 Easy targets, and the differences between TASSER^VMT and the other methods are all statistically significant. TASSER^VMT is only slightly worse than Zhang-Server and QUARK for the 32 Hard targets, and the differences are not statistically significant. This could be due to the fact that the Zhang-Server and QUARK used template-free modeling for some Hard targets. Compared to our own method, pro-sp3-TASSER, TASSER^VMT improves the results by 8.2% and 9.3% for Easy and Hard targets, respectively, and both improvements are statistically significant. The percentages of improvement are larger than those for the large benchmark set, but the corresponding p-values are less significant. The difference is mainly due to the different sample sizes (112 vs. 1192). For a small sample set such as the CASP9 set, a small number of outliers can dramatically influence the results.

Table 4.

Comparison of TASSER^VMT with top CASP9 servers on the 112 targets.

	80 Easy targets^#		32 Hard targets
	Cumulative GDT-TS	p-value^*	Cumulative GDT-TS	p-value
TASSER^VMT	54.28	-	10.57	-
Zhang-Server	53.50	0.04	10.91	0.20
QUARK	53.45	0.03	10.73	0.36
HHpredA	52.63	0.007	9.23	0.03
RaptorX	52.97	0.007	10.29	0.32
Seok-server	51.95	1.0×10⁻⁴	9.91	0.12
MULTICOM_CLUSTER	51.72	2.9×10⁻⁵	9.63	0.04
BAKER-ROSETTASERVER^§	-	-	9.79	0.11
chunk-TASSER	50.43	2.0×10⁻¹¹	9.96	0.05
pro-sp3-TASSER	50.16	1.2×10⁻¹³	9.67	0.01


^# A target is defined as Easy if the average TM-score of the first models by the best 50% of all CASP9 servers is higher than 0.5; otherwise, it is classified as Hard. The list of targets can be found at http://zhanglab.ccmb.med.umich.edu/casp9/.

^* Two-sided p-value of the Student's t-test between TASSER^VMT and the given method. A p-value of ≤0.05 is considered statistically significant.

^§ Because of missing targets, we do not report results for Easy targets here.

To analyze the effects of using a variable number of template sets, the SP³ alternative alignment and the HHSEARCH alignment on this relatively small set of CASP targets, we tested several different modeling protocols: (A) running TASSER refinements using a single set of top (10 for Easy, 20 for Hard targets, respectively) template models from SP³ without employing the Variable Multiple Template protocol; (B)–(E) protocols employing the Variable Multiple Template protocol with different combinations of SP³ default, SP³ alternative and HHSEARCH alignments. In protocol (A), there is no intermediate stage of generating a structure pool. The results are compiled in Table 5. All other protocols are significantly better than protocol (A) due to the use of the Variable Multiple Template protocol. Thus, the Variable Multiple Template protocol contributes substantially to TASSER^VMT performance. By comparing TASSER^VMT without SP³ alternative alignments (D in Table 5, replacing SP³ alternative with SP³ default in TASSER^VMT) with the full TASSER^VMT protocol, we see the effect of the alternative alignments (in combination with the HHSEARCH alignment). For Easy targets, the SP³ alternative alignments contribute around 2% of the increase in GDT-TS score; whereas for Hard targets, this contribution is around 6%, which is higher than the average alignment accuracy increase given in Table 3. This could be because the model selection procedure by FTCOM filtered out some worse alignments, while keeping the better ones.

Table 5.

Comparison of TASSER^VMT with different protocols on the 112 targets.

Protocol	SP³ default alignment^#	SP³ alternative alignment^#	HHSEARCH alignment^#	Variable Multiple Templates^#	80 Easy targets (Cumulative GDT-TS)	32 Hard targets (Cumulative GDT-TS)
A^*	+	−	−	−	49.34	9.27
B	+	−	−	+	51.59	9.95
C	−	+	−	+	52.06	10.17
D	+	−	+	+	53.12	9.96
E	+	+	−	+	53.48	10.45
TASSER^VMT	−	+	+	+	54.28	10.57


^# +/− indicates the component is included/excluded in the protocol.

^* In this protocol, a single set of the top 10/20 SP³ templates was used for Easy/Hard targets.

The effect of the SP³ alternative alignment without HHSEARCH can be obtained by comparing protocol (C), which uses only SP³ alternative alignments, to (B), which uses only SP³ default alignments. Protocol (C) is better than (B) by 0.9%/2.2% for Easy/Hard targets. Protocol (E) is the protocol of TASSER^VMT without HHSEARCH but with the combination of SP³ default and SP³ alternative alignments. It is equivalent to replacing the HHSEARCH alignment in TASSER^VMT with the SP³ default alignment. Protocol (E) is better than (B) by 3.6%/5.0% and better than (C) by 2.7%/2.8% for Easy/Hard targets. Thus, there is also a combinatorial effect of the alignments.

The direct effect of HHSEARCH can be obtained by comparing protocol (E) with TASSER^VMT. Protocol (E) is slightly worse than TASSER^VMT by 1.5%/1.1% for Easy/Hard targets. Therefore, the HHSEARCH alignment improves Easy targets more than Hard targets. HHSEARCH still improves Hard targets even though on average the HHSEARCH alignment is worse than SP³ default for Hard targets (see Table 3). This could be due to the fact that HHSEARCH is independent of SP³ alternative alignments and provides some complementary alignments, whereas SP³ default alignments and SP³ alternative alignments share common alignment profiles.
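
The protocol comparisons above are relative differences between rows of Table 5, which the following sketch tabulates. The dictionary simply transcribes the cumulative GDT-TS values from the table; the function name is hypothetical.

```python
# Cumulative GDT-TS from Table 5: protocol -> (80 Easy targets, 32 Hard targets).
TABLE5 = {
    "A": (49.34, 9.27),
    "B": (51.59, 9.95),
    "C": (52.06, 10.17),
    "D": (53.12, 9.96),
    "E": (53.48, 10.45),
    "TASSER^VMT": (54.28, 10.57),
}


def percent_gain(better, worse):
    """Percent gain of protocol `better` over `worse` for (Easy, Hard) targets."""
    return tuple(
        100.0 * (b - w) / w for b, w in zip(TABLE5[better], TABLE5[worse])
    )


if __name__ == "__main__":
    for pair in [("C", "B"), ("E", "B"), ("E", "C"), ("TASSER^VMT", "E"),
                 ("TASSER^VMT", "D")]:
        easy, hard = percent_gain(*pair)
        print(f"{pair[0]} vs {pair[1]}: Easy {easy:.1f}%, Hard {hard:.1f}%")
```

Because the table entries are rounded to two decimals, values recomputed this way may differ from the percentages quoted in the text in the last digit.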

We now examine how good the SP³ parametric alignment is for alternative alignment generation. In Table 6, we compare the actual quality of models built by MODELLER from the SP³ default alignment and from SP³ parametric alignments for the top 10 SP³ templates of the 112 CASP9 targets. For each target-template pair, we obtain the average, standard deviation and best GDT-TS score of the models, as well as the percentage of models having better GDT-TS scores than that of the SP³ default alignment. The average model quality of the parametric alignments is slightly worse than that of the SP³ default alignment, which is understandable because the parameters of the SP³ default alignment are optimized against structure alignments. The standard deviation of the GDT-TS scores of the parametric alignments is 11%/14% of the GDT-TS score of the default alignment for Easy/Hard targets. Around 30% of the alignments are better than the default ones. The best alignment is on average 9%/18% better than the SP³ default alignment for Easy/Hard targets. Thus, SP³ parametric alignment can generate models much better than the SP³ default alignment. Even though the best models are still far from the models generated by TM-align (see Table 3, 16.7%/38.8% better than SP³ default), they are much better than those from the SP³ alternative alignment (see Table 3, 2.6%/3.2% better than SP³ default). If a better model selection method could be used for alternative alignment generation, then the improvement would be more significant.

Table 6.

Comparison of SP³ default and SP³ parametric alignments on the 112 targets^#.

Target set	SP³ default: GDT-TS	SP³ parametric: average GDT-TS	SP³ parametric: standard deviation of GDT-TS	SP³ parametric: best GDT-TS	SP³ parametric: percentage better than default
80 Easy targets	0.461	0.422	0.051	0.501	27.4%
32 Hard targets	0.236	0.210	0.032	0.279	31.2%

^# Numbers are averages per target per template over the top 10 SP³ templates. Models are built from the given alignment using MODELLER. The average, standard deviation, best score, and percentage better than default are computed over the models of the same target-template pair generated with varied SP³ parameters.
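The per-pair statistics described in the table note can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the use of the population standard deviation, and the example scores are all assumptions made here for clarity.

```python
from statistics import mean, pstdev


def pair_statistics(default_score, parametric_scores):
    """Summarize the models of one target-template pair.

    default_score: GDT-TS of the model built from the default alignment.
    parametric_scores: GDT-TS of models built from alignments generated
    with varied alignment parameters.
    """
    better = sum(1 for s in parametric_scores if s > default_score)
    return {
        "average": mean(parametric_scores),
        # Assumption: population standard deviation; the source does not say which.
        "std": pstdev(parametric_scores),
        "best": max(parametric_scores),
        "pct_better": 100.0 * better / len(parametric_scores),
    }


def table_row(pairs):
    """Average each statistic over all (default_score, parametric_scores) pairs."""
    stats = [pair_statistics(d, p) for d, p in pairs]
    row = {key: mean(s[key] for s in stats) for key in stats[0]}
    row["default"] = mean(d for d, _ in pairs)
    return row


# Made-up scores for two target-template pairs, for illustration only.
example_pairs = [
    (0.50, [0.40, 0.45, 0.55, 0.50]),
    (0.30, [0.25, 0.35, 0.30, 0.20]),
]
row = table_row(example_pairs)
print(row)
```

Each statistic is computed within a pair first and only then averaged across pairs, which is why the "best" column can exceed the default column even when the "average" column does not.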

DISCUSSION

We have significantly improved our automated template-based modeling approach, pro-sp3-TASSER, by improving alignment accuracy, sampling more multiple-template combinations, and using a better model selection method, FTCOM. The resulting method, TASSER^VMT, performs better than the best servers in CASP9 for Easy targets. We note that, in contrast to pro-sp3-TASSER and many other CASP servers, TASSER^VMT does not separately consider any unaligned domains. However, based on our analysis of the two main components of our method, alignment and model selection, there is still much room for further improvement. As shown in Table 3, current state-of-the-art alignment methods such as HHSEARCH and the method developed in this work are far from optimal. Moreover, model selection could also be improved. For example, on average, the best model as measured by GDT-TS score in the structure pool generated by short TASSER simulations is 4.6% and 28% better than our predictions for Easy and Hard targets, respectively. A better model quality assessment method could further improve these results, especially for Hard targets. In summary, for Easy/Hard targets, improving the template alignment to coincide with the structural alignment would result in a ~15%/40% improvement, while improving model selection would provide another ~5%/30% improvement. We are currently investigating the possibility of pushing alignment and model selection further towards their upper limits. In the current implementation of TASSER^VMT, we have explored only a relatively small number of multiple templates and generated alternative alignments by a simple parametric approach. It is possible that a more effective alternative alignment generation method that samples more near-native alignments will result in further improvement. We are currently exploring a variety of approaches designed to achieve this goal.

Acknowledgments

This work is supported by NIH grants GM-48835 and GM-37408. The authors thank Dr. Bartosz Ilkowski for managing the cluster on which this work was conducted.

References

1. Zhang Y. Progress and challenges in protein structure prediction. Curr Opin Struct Biol. 2008;18(3):342–348. doi: 10.1016/j.sbi.2008.02.004.
2. Venclovas C, Margelevicius M. Comparative modeling in CASP6 using consensus approach to template selection, sequence-structure alignment, and structure assessment. Proteins. 2005;(Suppl 7):99–105. doi: 10.1002/prot.20725.
3. Xu J. Fold recognition by predicted alignment accuracy. IEEE/ACM Trans Comput Biol Bioinform. 2005;2(2):157–165. doi: 10.1109/TCBB.2005.24.
4. Fischer D, Eisenberg D. Protein fold recognition using sequence-derived predictions. Protein Sci. 1992;5:947–955. doi: 10.1002/pro.5560050516.
5. Jones DT. GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. J Mol Biol. 1999;287:797–815. doi: 10.1006/jmbi.1999.2583.
6. Fischer D. In: Hybrid fold recognition: combining sequence derived properties with evolutionary information. Altman RB, Dunker AK, Hunter L, Lauderdale K, Klein TE, editors. Hawaii: World Scientific; 2000. pp. 119–130.
7. Jaroszewski L, Rychlewski L, Li W, Godzik A. Comparison of sequence profiles. Strategies for structural predictions using sequence information. Protein Sci. 2000;9:232–241. doi: 10.1110/ps.9.2.232.
8. Ginalski K, Pas J, Wyrwicz LS, von Grotthuss M, Bujnicki JM, Rychlewski L. ORFeus: Detection of distant homology using sequence profiles and predicted secondary structure. Nucleic Acids Res. 2003;31:3804–3807. doi: 10.1093/nar/gkg504.
9. Yona G, Levitt M. Within the twilight zone: a sensitive profile-profile comparison tool based on information theory. J Mol Biol. 2002;315:1257–1275. doi: 10.1006/jmbi.2001.5293.
10. Karplus K, Barrett C, Hughey R. Hidden Markov models for detecting remote protein homologies. Bioinformatics. 1998;14:846–856. doi: 10.1093/bioinformatics/14.10.846.
11. Wu S, Zhang Y. MUSTER: Improving protein sequence profile-profile alignments by using multiple sources of structure information. Proteins. 2008;72:547–556. doi: 10.1002/prot.21945.
12. Zhou H, Zhou Y. SPARKS2 and SP3 servers in CASP6. Proteins. 2005;(Suppl 7):152–156. doi: 10.1002/prot.20732.
13. Lundström J, Rychlewski L, Bujnicki J, Elofsson A. Pcons: a neural-network-based consensus predictor that improves fold recognition. Protein Sci. 2001;10:2354–2362. doi: 10.1110/ps.08501.
14. Xu Y, Xu D, Olman V. A practical method for interpretation of threading scores: an application of neural networks. Statistica Sinica Special Issue on Bioinformatics. 2002;12:159–177.
15. Fischer D. 3D-SHOTGUN: a novel, cooperative, fold-recognition meta-predictor. Proteins. 2003;51:434–441. doi: 10.1002/prot.10357.
16. Ginalski K, Elofsson A, Fischer D, Rychlewski L. 3D-jury: a simple approach to improve protein structure predictions. Bioinformatics. 2003;19:1015–1018. doi: 10.1093/bioinformatics/btg124.
17. Zhang Y, Skolnick J. Automated structure prediction of weakly homologous proteins on genomic scale. Proc Natl Acad Sci USA. 2004;101:7594–7599. doi: 10.1073/pnas.0305695101.
18. Zhang Y, Arakaki A, Skolnick J. TASSER: An automated method for the prediction of protein tertiary structures in CASP6. Proteins. 2005;(Suppl 7):91–98. doi: 10.1002/prot.20724.
19. Al-Lazikani B, Sheinerman FB, Honig B. Combining multiple structure and sequence alignments to improve sequence detection and alignment: Application to the SH2 domains of Janus kinases. Proc Natl Acad Sci USA. 2001;98:14796–14801. doi: 10.1073/pnas.011577898.
20. Peng J, Xu J. A multiple-template approach to protein threading. Proteins. 2011; in press. doi: 10.1002/prot.23016.
21. Cheng J. A multi-template combination algorithm for protein comparative modeling. BMC Struct Biol. 2008;8:18. doi: 10.1186/1472-6807-8-18.
22. Zhou H, Skolnick J. Protein structure prediction by pro-sp3-TASSER. Biophys J. 2009;96:2119–2127. doi: 10.1016/j.bpj.2008.12.3898.
23. Moult J, Fidelis K, Kryshtafovych A, Tramontano A. 9th Critical Assessment of Techniques for Protein Structure Prediction. 2010.
24. Zhang Y. Template-based modeling and free modeling by I-TASSER in CASP7. Proteins. 2007;69(Suppl 8):108–117. doi: 10.1002/prot.21702.
25. Wu S, Zhang Y. LOMETS: A local meta-threading-server for protein structure prediction. Nucleic Acids Res. 2007;35:3375–3382. doi: 10.1093/nar/gkm251.
26. Söding J. Protein homology detection by HMM-HMM comparison. Bioinformatics. 2005;21:951–960. doi: 10.1093/bioinformatics/bti125.
27. Hildebrand A, Remmert M, Biegert A, Söding J. Fast and accurate automatic structure prediction with HHpred. Proteins. 2009;77(Suppl 9):128–132. doi: 10.1002/prot.22499.
28. Sali A, Blundell TL. Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol. 1993;234:779–815. doi: 10.1006/jmbi.1993.1626.
29. Peng J, Xu J. Boosting Protein Threading Accuracy. Lect Notes Comput Sci. 2009;31:5541. doi: 10.1007/978-3-642-02008-7_3.
30. Peng J, Xu J. Low-homology protein threading. Bioinformatics. 2010;26(12):i294–i300. doi: 10.1093/bioinformatics/btq192.
31. Zhou H, Pandit S, Skolnick J. Performance of the Pro-sp3-TASSER server in CASP8. Proteins. 2009;77:123–127. doi: 10.1002/prot.22501.
32. Wu S, Zhang Y. SEGMER: identifying protein sub-structural similarity by segmental threading. Structure. 2010;18:858–867. doi: 10.1016/j.str.2010.04.007.
33. Wu S, Zhang Y. A comprehensive assessment of sequence-based and template-based methods for protein contact prediction. Bioinformatics. 2008;24:924–931. doi: 10.1093/bioinformatics/btn069.
34. Zhou H, Zhou Y. Fold recognition by combining sequence profiles derived from evolution and from depth-dependent structural alignment of fragments. Proteins. 2005;58:321–328. doi: 10.1002/prot.20308.
35. Chivian D, Baker D. Homology modeling using parametric alignment ensemble generation with consensus and energy-based model selection. Nucleic Acids Res. 2006;34:e112. doi: 10.1093/nar/gkl480.
36. Zhou H, Skolnick J. Protein model quality assessment prediction by combining fragment comparisons and a consensus Cα contact potential. Proteins. 2007;71:1211–1218. doi: 10.1002/prot.21813.
37. Waterman MS, Eggert M, Lander E. Parametric sequence comparisons. Proc Natl Acad Sci USA. 1992;89:6090–6093. doi: 10.1073/pnas.89.13.6090.
38. Zhou H, Skolnick J. Improving threading algorithms for remote homology modeling by combining fragment and template comparisons. Proteins. 2010;78(9):2041–2048. doi: 10.1002/prot.22717.
39. Zhou H, Zhou Y. Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction. Protein Sci. 2002;11:2714–2726. doi: 10.1110/ps.0217002.
40. Zhang Y, Skolnick J. SPICKER: a clustering approach to identify near-native protein fold. J Comput Chem. 2004;25:865–871. doi: 10.1002/jcc.20011.
41. Zhang Y, Skolnick J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 2005;33:2302–2309. doi: 10.1093/nar/gki524.
42. Rotkiewicz P, Skolnick J. Fast procedure for reconstruction of full-atom protein models from reduced representations. J Comput Chem. 2008;29:1460–1465. doi: 10.1002/jcc.20906.
43. Zhang Y, Skolnick J. A scoring function for the automated assessment of protein structure template quality. Proteins. 2004;57:702–710. doi: 10.1002/prot.20264.
44. Zemla A, Venclovas C, Moult J, Fidelis K. Processing and analysis of CASP3 protein structure predictions. Proteins. 1999;(Suppl 3):22–29. doi: 10.1002/(sici)1097-0134(1999)37:3+<22::aid-prot5>3.3.co;2-n.
45. Zhang Y, Hubner I, Arakaki A, Shakhnovich E, Skolnick J. On the origin and highly likely completeness of single-domain protein structures. Proc Natl Acad Sci USA. 2006;103:2605–2610. doi: 10.1073/pnas.0509379103.
46. Cheng J, Wang Z, Tegge A, Eickholt J. Prediction of global and local quality of CASP8 models by MULTICOM series. Proteins. 2009;77(Suppl 9):181–184. doi: 10.1002/prot.22487.
