Molecular Identification of Enterovirus by Analyzing a Partial VP1 Genomic Region with Different Methods

G Palacios; I Casas; A Tenorio; C Freire

doi:10.1128/JCM.40.1.182-192.2002

2002 Jan;40(1):182–192.


PMCID: PMC120085 PMID: 11773114

Abstract

VP1 is the most suitable region for use in the identification of enterovirus. Although VP1 sequencing methods may vary, it is necessary to agree on a common strategy of sequence analysis. Identification of a strain type may be achieved by three different approaches: pairwise sequence alignment, multiple-sequence alignment, and phylogenetic inference. Other methods are also available, but they are not simple enough to be performed at a virology laboratory. The performances of these methods were evaluated with nucleotide and protein sequences obtained from 32 original samples, 8 enterovirus isolates, and 64 GenBank sequences. Pairwise sequence alignment methods had very different results. The DNASTAR package identified only 28.8% of enterovirus strains, while the Genetics Computer Group package identified 50.0 or 72.1% of enterovirus strains when nucleotide or amino acid sequences were analyzed, respectively. Multiple-sequence alignment methods identified 94.2% (Clustal W program) or 92.3% (Pileup program) of the enterovirus strains, while the phylogenetic method increased this rate to 99.0%. Comparative evaluation of these analysis methods showed that the Clustal W program (version 1.81), a freely available multiple-sequence alignment program, presented one of the best performances when used with the correct criteria. Other commercial and expensive programs did not achieve the same performances, making them less suitable for molecular typing of enteroviruses. Finally, although phylogenetic inference is the most demanding method in terms of knowledge of the user, it remained the best option analyzed.

Enteroviruses (EVs) are a large genus belonging to the Picornaviridae family, and 64 immunologically distinct serotypes are known to cause infections in humans. They are the etiological agents responsible for several diseases (poliomyelitis, acute myocarditis, and aseptic meningitis) and play an important role in common chronic diseases, including dilated cardiomyopathy and insulin-dependent diabetes mellitus. Identification of EVs is essential for epidemiological surveillance, identification of polioviruses, the study of correlations between EV subtypes and diseases, identification of new EV types, and adequate treatment of EV infections in neonates and immunodeficient patients (for a complete review, see reference 15). Although the serotype identity can be determined by neutralization of infectivity with serotype-specific antisera, individual typing of all 64 serotypes by neutralization is clearly impractical. To overcome this problem, intersecting pools of individual antisera which allow the identification of a serotype were developed. Despite that, the method is still time-consuming, labor-intensive, and costly; and the supply of antisera is limited. Moreover, there are frequent problems related to untypeable EVs that have been associated with mixtures of EVs, the existence of certain EV serotypes that cannot be identified with intersecting pools, the formation of aggregates (9), and the existence of antigenic variants of recognized EVs (13). Finally, some unrecognized serotypes are also untypeable by this standard method.

Because of these problems, several methods were developed for the molecular characterization of the genus (1, 5–7, 17). The coupling of reverse transcription and amplification of the enteroviral RNA by PCR, followed by direct sequencing of the amplified products, is the general approach. Some of these methods analyzed the 5′ noncoding, the VP2, or the 3D region of the EV genome; but the sequence did not always correlate with the corresponding serotype (1, 10, 18). The VP1 region was the most suitable target due to the high correlation between serotypes and sequences and the availability of a large database of EV sequences (17). Two methods were developed for the typing of EV strains by partial sequencing of VP1 from cell cultures (17) or clinical samples (5). These two approaches were based on the sequencing of amplified products and comparison of the sequences with the VP1 sequences in a database of EV reference strains by pairwise local alignment (17). However, problems were found with a handful of field strains, suggesting that this method of analysis could be inconsistent (5).

Although methods for the sequencing of VP1 may vary, there must be widespread agreement as to the method of analysis so that it can be universally applied. Several methods of analysis are quick and standardized: pairwise sequence alignment, multiple-sequence alignment, and phylogenetic reconstruction. Although there are several choices, users generally apply the available methods instead of first performing an evaluation of them. Nevertheless, the choice of the analysis method should be an essential part of the design process, since each method of analysis is based on different assumptions.

Previously proposed sequence analysis methods were based on the similarity index of the pairwise local alignment as a measure of identity of the strain sequence (17). The Gap program (Genetics Computer Group [GCG] software package) uses the algorithm of Needleman and Wunsch (16). The same algorithm is implemented by other commercial packages (e.g., the MegAlign program of the DNASTAR package [3]).

Multiple-sequence alignment is an extension of pairwise sequence alignment but is considered a more thorough method of analysis. Two programs, a freeware version of the Clustal W program (version 1.81) (23) and the Pileup program, another part of the GCG package, were evaluated.

Phylogenetic reconstruction of gene trees from sequence data is considered the “gold standard” method of molecular analysis. The best results are obtained when the proper method is chosen and the quality control measures are followed. However, there are several available methods based on distinct assumptions.

In the present study, we compared the efficiencies of analysis methods over a range of values for the various parameters. The methods are simple enough to be performed in virology reference laboratories. A completely characterized database of EV strain sequences from different origins was used for comparison.

MATERIALS AND METHODS

Clinical specimens.

Clinical specimens (8 EV isolates from 3 stool samples and 5 pharyngeal swabs or nasopharyngeal aspirates and 32 original samples [28 cerebrospinal fluid and 4 stool samples]) were collected at both the Diagnostic Microbiology Service (Centro Nacional de Microbiología, Instituto de Salud Carlos III, Madrid, Spain) and the Neurovirosis Division (Administracion Nacional de Laboratorio e Institutos de Salud, “Dr. Carlos G. Malbrán,” Buenos Aires, Argentina). Sequences from the original samples (see Table 1) were directly obtained from clinical specimens.

TABLE 1.

Description of EVs studied^a

Clinical sample^b	GenBank accession no.	Neutralization serotypes^c	Phylogenetic genotype	Bootstrap value	Clustal homology test index (%)^d	Genotype determined by Clustal homology test^e	Pileup homology test index (%)^e	Genotype determined by Pileup homology test^e	Highest score NW-GCG similarity test index (%)^f	Second highest score NW-GCG similarity test index (%)^g	Genotype determined by NW-GCG similarity test^e
TX92-1647	AF081634	CAV14	CAV14	100	87.7	CAV14	87.5	CAV14	87.2	63.5	CAV14
GA95-2095	AF081613	CAV16	CAV16	86	78.9	CAV16	78.9	CAV16	78.1	70.1	CAV16
PA94-5753	AF081628	CAV16	CAV16	86	78.9	CAV16	78.9	CAV16	78.1	69.3	CAV16
TX95-2147	AF081635	CAV16	CAV16	86	77.7	CAV16	77.7	CAV16	76.8	71.2	CAV16
PA89-9262	AF152288	CAV16	CAV16	86	78.9	CAV16	78.9	CAV16	78.1	70.2	CAV16
TAI84-5839	AF152293	CAV16	CAV16	86	79.1	CAV16	79.1	CAV16	78.3	70.2	CAV16
707F99	AF290902	CAV16	CAV16	86	81	CAV16	81.5	CAV16	80.8	64.9	CAV16
TN88-8321	AF152294	CAV17	CAV17	99	79.5	CAV17	79.6	CAV17	79.3	70.2	CAV17
MOR83-6282	AF152273	CAV20	CAV20	100	81.1	CAV20	81.1	CAV20	80.9	73.1	CAV20
GUT88-8020	AF152258	CAV21	CAV21	100	78.5	CAV21	78.5	CAV21	78.3	66.3	CAV21
GUT88-8438	AF152259	CAV21	CAV21	100	78.5	CAV21	78.5	CAV21	78.3	65.6	CAV21
MD86-7277	AF152265	CAV21	CAV21	100	93	CAV21	93	CAV21	92.9	67.2	CAV21
WA89-9165	AF152300	CAV21	CAV21	100	93.3	CAV21	93.3	CAV21	93.2	67.2	CAV21
DOR93-1657	AF081603	CAV24	CAV24	100	88	CAV24	88.2	CAV24	88.1	67.4	CAV24
2106NE98	AF252189	CAV4	CAV4	100	85.7	CAV4	85.7	CAV4	85.1	65	CAV4
1734O99	AF290899	CAV4	CAV4	100	86.6	CAV4	86.7	CAV4	86.1	66.2	CAV4
727F99	AF290903	CAV6	CAV6	100	84	CAV6	84.3	CAV6	83.6	67	CAV6
R15797	AF252183	CAV9	CAV9	100	85.1	CAV9	85.1	CAV9	84.1	70.2	CAV9
1712NE99	AF290898	CAV9	CAV9	100	82.3	CAV9	82.5	CAV9	81.3	68.6	CAV9
R7596	AF252169	CBV1	CBV1	94	79.9	CBV1	80.1	CBV1	79	71.7	CBV1
R6696	AF252170	CBV1	CBV1	94	79.7	CBV1	80.3	CBV1	78.7	70.9	CBV1
HON84-6016	AF152260	CBV2	CBV2	100	86.2	CBV2	86.2	CBV2	85.2	69.8	CBV2
MD84-5914	AF152263	CBV2	CBV2	100	87.4	CBV2	87.4	CBV2	86.5	68	CBV2
NH97-2342	AF081622	CBV3	CBV3	100	80.8	CBV3	80.8	CBV3	79.4	69.6	CBV3
BRA98-9169	AF152249	CBV3	CBV3	100	84.8	CBV3	85	CBV3	84	70.1	CBV3
BRA98-9171	AF152250	CBV3	CBV3	100	82.9	CBV3	82.9	CBV3	81.7	69	CBV3
BRA88-9172	AF152251	CBV3	CBV3	100	84.8	CBV3	85	CBV3	84	70.1	CBV3
BRA88-9173	AF152252	CBV3	CBV3	100	85.1	CBV3	85.3	CBV3	84.3	70.4	CBV3
PER89-9426	AF152291	CBV3	CBV3	100	81.8	CBV3	81.6	CBV3	80.4	70.4	CBV3
R8797	AF252182	CBV4	CBV4	100	86.2	CBV4	86.2	CBV4	85.2	68.3	CBV4
MEX88-8931	AF152270	CBV5	CBV5	100	87.2	CBV5	87.2	CBV5	86.3	70.3	CBV5
PA88-8885	AF152287	CBV5	CBV5	100	85.1	CBV5	85	CBV5	84	70.3	CBV5
R3597	AF252177	CBV5	CBV5	100	93.1	CBV5	93.1	CBV5	92.6	70.9	CBV5
R8697	AF252178	CBV5	CBV5	100	81.5	CBV5	61.5	CBV5	80.2	67.3	CBV5
R9497	AF252179	CBV5	CBV5	100	93.1	CBV5	79.3	CBV5	77.9	67.8	CBV5
1800NE99	AF290900	CBV5	CBV5	100	92.4	CBV5	92.4	CBV5	91.9	70.4	CBV5
CB6IS2	AF225470	CBV6	CBV6	96	80.2	CBV6	80.2	CBV6	78.6	72.6	CBV6
GA92-1616	AF081606	EV11	EV11	99	81.6	EV11	82.2	EV11	81.1	72.3	EV11
WA92-1516	AF081642	EV11	EV11	99	80.5	EV11	81.1	EV11	79.9	73.7	EV11
834NE99	AF290904	EV11	EV11	99	83.4	EV11	83.6	EV11	82.6	72.5	EV11
2257NE99	AF290908	EV11	EV11	99	85.7	EV11	83.8	EV11	84.9	72.1	EV11
ELS88-8236	AF152256	EV12	EV12	100	82.2	EV12	82.1	EV12	81	71.2	EV12
GV34	AF252184	EV12	EV12	100	80.7	EV12	80.9	EV12	79.5	71.8	EV12
M250	AF252185	EV12	EV12	100	77.9	EV12	77.9	EV12	76.4	71.4	EV12
VA86-6776	AF152299	EV13	ND (ENV69-EV13)	96	74.2	EV13	74.4	EV13	72.7	NT	EV13
47-98	AF290901	EV14	EV14	100	81.1	EV14	81	EV14	79.8	69.6	EV14
CT92-1465	AF081599	EV16	EV16	100	84.5	EV16	84.5	EV16	83.4	72.6	EV16
875NE99	AF290905	EV17	EV17	100	82.9	EV17	83.1	EV17	82	70.1	EV17
1068NE99	AF290907	EV17	EV17	100	84.1	EV17	84.3	EV17	83.2	71.3	EV17
CT96-2182	AF081601	EV18	EV18	100	78	EV18	78.2	EV18	76.8	70.2	EV18
MD88-8208	AF152269	EV18	EV18	100	79.1	EV18	79.3	EV18	78	70.2	EV18
OK89-9448	AF152282	EV18	EV18	100	78.9	EV18	79.1	EV18	77.8	70.2	EV18
OR85-6323	AF152284	EV18	EV18	100	80.3	EV18	80.2	EV18	78.9	70.2	EV18
SC87-7477	AF152292	EV18	EV18	100	82.1	EV18	82	EV18	80.9	70.2	EV18
R100	AF252187	EV18	EV18	100	76.4	EV18	76.9	EV18	75.4	68	EV18
270N97	AF252188	EV18	EV18	100	79.4	EV18	79.6	EV18	78.2	67.5	EV18
M256	AF290906	EV2	EV2	100	77.9	EV2	82.6	EV2	81.5	71.4	EV2
RI94-1959	AF081633	EV21	EV21	100	79.8	EV21	79.8	EV21	78.5	70.8	EV21
NC83-5515	AF152275	EV24	EV24	100	78	EV24	78.1	EV24	76.7	69.9	EV24
NC84-5530	AF152276	EV24	EV24	100	78.2	EV24	78.3	EV24	77	69.6	EV24
MD92-1649	AF081615	EV25	EV25	100	78.5	EV25	78.9	EV25	77.5	69	EV25
MN94-1828	AF081618	EV25	EV25	100	78.3	EV25	78.7	EV25	77.3	68.7	EV25
MO93-1808	AF081619	EV25	EV25	100	78.9	EV25	79.4	EV25	78	68.3	EV25
OR93-1817	AF081627	EV25	EV25	100	79.1	EV25	79.6	EV25	78.2	69	EV25
HON86-6843	AF152261	EV25	EV25	100	80.9	EV25	80.9	EV25	79.7	69.8	EV25
NC84-5531	AF152277	EV25	EV25	100	79.4	EV25	79.6	EV25	78.2	70.6	EV25
1498N98	AF252181	EV25	EV25	100	81.8	EV25	81.8	EV25	80.6	68.9	EV25
1573NE99	AF290909	EV25	EV25	100	81.8	EV25	81.8	EV25	80.3	69.1	EV25
T185	AF290896	EV29	EV29	100	79.1	EV29	70.4	EV6	81.1	70.1	EV29
MD88-8157	AF152266	EV3	EV3	100	77.1	EV3	77.5	EV3	76	69.1	EV3
MT87-7421	AF152274	EV3	EV3	100	76	EV3	78.3	EV3	77	69.1	EV3
1431N98	AF252171	EV30	EV30	100	87.8	EV30	88.5	EV30	87.1	69.9	EV30
1432N98	AF252172	EV30	EV30	100	87.8	EV30	88.5	EV30	87.7	69.9	EV30
HC56	AF252173	EV30	EV30	100	86.7	EV30	87.3	EV30	86	69.9	EV30
127N88	AF252174	EV30	EV30	100	88.7	EV30	88.9	EV30	88.2	70.1	EV30
PER98-2558	AF081632	EV33	EV33	100	80.6	EV33	80.6	EV33	79.2	70.4	EV33
PA88-8412	AF152286	EV4	EV4	100	83.3	EV4	83.3	EV4	82.1	75.6	EV4
T99	AF252180	EV4	EV4	100	82.1	EV4	82	EV4	80.8	73	EV4
CT96-2181	AF081602	EV5	EV5	100	87.4	EV5	87.3	EV5	86.5	72.5	EV5
NM95-2070	AF081625	EV6	EV6	100	80.1	EV6	80.5	EV6	79.2	69.7	EV6
1201C99	AF290897	EV6	EV6	100	81	EV6	80.4	EV6	80.2	71.7	EV6
1351N98	AF252186	EV7	EV7	100	82.3	EV7	82.2	EV7	81.1	69.8	EV7
AR95-2139	AF081596	EV9	EV9	100	83.3	EV9	83.3	EV9	82.2	69.2	EV9
NC92-1612	AF081620	EV9	EV9	100	84.9	EV9	84.8	EV9	83.8	69.9	EV9
WI95-2151	AF081645	EV9	EV9	100	84	EV9	83.9	EV9	78.2	69	EV9
SE74	AF252166	EV9	EV9	100	83.7	EV9	83.9	EV9	82.3	68.5	EV9
MP211	AF252167	EV9	EV9	100	83.3	EV9	83.5	EV9	82.4	68.8	EV9
T94	AF252168	EV9	EV9	100	82.6	EV9	82.8	EV9	81.7	68.5	EV9
T22	AF252175	EV9	EV9	100	86.4	EV9	86.4	EV9	85.5	69.4	EV9
R1192	AF252176	EV9	EV9	100	84.2	EV9	84.2	EV9	83.1	69.7	EV9
M8/72	D17604	ENV70	ENV70	100	98.7	ENV70	96.7	ENV70	98.6	68.9	ENV70
V1250/81	D17611	ENV70	ENV70	100	93.2	ENV70	93.4	ENV70	93.1	69.5	ENV70
AL88-8149	AF152248	ENV71	ENV71	100	99.1	ENV71	99	ENV71	99	70.7	ENV71
MD87-9256	AF152267	ENV71	ENV71	100	98.6	ENV71	98.6	ENV71	98.5	70.2	ENV71
NM90-9873	AF152278	ENV71	ENV71	100	83.6	ENV71	83.6	ENV71	83	66	ENV71
OK89-9243	AF152279	ENV71	ENV71	100	84.4	ENV71	84.2	ENV71	83.7	66.2	ENV71
OK89-9452	AF152283	ENV71	ENV71	100	83.4	ENV71	83.3	ENV71	82.7	65.9	ENV71
TX89-9166	AF152297	ENV71	ENV71	100	83.6	ENV71	83.6	ENV71	83	65.8	ENV71
T100	AF252190	ENV71	ENV71	100	84	ENV71	84	ENV71	83.4	66.7	ENV71
OK85-6388	AF152280	NT	UNT-GenB	100	72.3	CBV6	70.8	EV1	67.5	67.5	EV4
VA86-6765	AF152298	NT	UNT-GenB	100	73.5	ENV69	72.1	EV1	70.3	68.8	EV1
CT87-7122	AF152254	NT	UNT-GenB	100	74.1	CBV6	69.7	EV8	67.7	67.7	EV3
CT87-7123	AF152255	NT	UNT-GenB	100	74.1	CBV6	69.7	EV8	67.5	67.3	EV4
1827N98	AF252165	PV1	PV1	100	100	PV1	100	PV1	100	71.2	PV1


^a Abbreviations: CAV, coxsackie A virus; CBV, coxsackie B virus; PV, poliovirus; NA, isolate not available for neutralization test (EV was directly detected and sequenced from the clinical sample); ND, not identified (bootstrap value, <65%); UNT, no reference strain was grouped into same cluster; GenB, strain located in Cluster B of EV whole genome classification; NT, untypeable enterovirus (19).

^b The field clinical samples added in this work are in boldface.

^c The neutralization serotypes of the 64 sequences obtained from GenBank are as in the corresponding publications (17, 19).

^d Highest value obtained after the application of equation 1 for the chosen alignment.

^e Results that differ from the neutralization serotype are in boldface.

^f Highest value obtained by the NW-GCG similarity test with GOP equal to 50 and GEP equal to 3. Values <75% are shaded in boldface since they do not meet the criterion for identification.

^g Values higher than 70% are in boldface since they do not achieve the criterion for identification. Values higher than 75% are shaded in boldface and italic.

Virus isolation and neutralization.

Isolation of EV in cell cultures was attempted for every clinical specimen. The specimens were inoculated in human embryonic fibroblast, buffalo green monkey kidney, human rhabdomyosarcoma, and human lung carcinoma (A549) cell lines. The isolates were typed by neutralization with a panel of antiserum pools (Lim-Benyesh-Melnick) and by evaluation of virus growth inhibition.

Extraction, amplification, and sequencing.

Nucleic acids from clinical samples and isolates were extracted as described previously (4). The dried pellet was resuspended in 15 μl of RNase-free sterile water (Sigma Chemical Co., St. Louis, Mo.) and used immediately. A reverse transcription-nested PCR was used to amplify the VP1 genome region, and a 656-bp product (5) which was located between nucleotide 2874 and nucleotide 3529 of echovirus type 9 strain Barty (GenBank accession no. X92886) was obtained.

Cycle sequencing reactions were performed with the Big Dye terminator kit (Perkin-Elmer Applied Biosystems) by using the same primers used for the nested PCR. Raw sequence data were first analyzed with CHROMAS software (version 1.3; C. McCarthy, 1996, Griffith University, Brisbane, Queensland, Australia), and the forward and reverse sequence data for each sample were aligned by using the MegAlign program (DNASTAR Inc. Software, Madison, Wis.) to obtain the final consensus sequence.

EV sequences used.

The sequences of 40 original EV strains were used. Another 64 EV sequences were obtained from the GenBank database, and their accession numbers are included in Table 1. VP1 reference strains from a previously described database (19) were included in every analysis method. The 2A region and the primer sequences were excised from the sequences before analysis.

Pairwise alignment analysis. (i) Martinez-Needleman and Wunsch algorithm.

The Martinez (11)-Needleman and Wunsch (16) algorithm (M-NW similarity test) is implemented by the MegAlign program of the DNASTAR commercial package. A similarity score between each pair of sequences was obtained manually after sequential pairwise alignment (M-NW similarity test) was performed. The quality of the alignment could not be measured, and the gap opening penalty (GOP) and the gap extension penalty (GEP) were left at their default values.

(ii) Needleman and Wunsch algorithm.

The Needleman and Wunsch (16) algorithm (NW similarity test) is implemented by the Gap program of the GCG package (NW-GCG similarity test), and it was performed with both nucleotide and amino acid sequences.
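As an illustration of the global alignment idea behind the NW similarity test, the following is a minimal sketch and not the Gap program itself. It assumes a linear gap penalty (a single `gap` value rather than separate GOP and GEP values), toy match and mismatch scores, and a percent identity computed over the full alignment length; the function names and all of these scoring choices are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of a and b with a linear gap penalty (simplified sketch)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def percent_identity(aln_a, aln_b):
    """Identical non-gap columns as a percentage of the alignment length (assumed definition)."""
    matches = sum(x == y and x != "-" for x, y in zip(aln_a, aln_b))
    return 100.0 * matches / len(aln_a)
```

For example, aligning the toy strings `"GATTACA"` and `"GATACA"` inserts a single gap in the shorter sequence and gives six identical columns out of seven, or about 85.7% under this definition.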

Given that strains of the same serotype rarely have gaps in their alignments, increasing the GOP and GEP parameters will not modify the similarity score for closely related sequences but will decrease it for more divergent sequences. Therefore, to allow the separation of the peak of the similarity score for strains of the same serotype from the peak of the similarity score for strains of related serotypes, GOP options were set as default, double, and quadruple (50, 100, and 200, respectively) with GEP options of 3, 10, and 20, respectively. The randomization option was set to 100 to assess the quality of the alignment.

The quality of the alignment was recorded as the difference between the quality of the best alignment (V) and the mean of the Monte Carlo randomized alignments (m) divided by the standard deviation (SD) of the Monte Carlo simulation ([V − m]/SD), as established previously (2). Significance scores above 15 SDs indicated a nearly ideal alignment, while scores above 5 SDs suggested a good alignment and scores below 5 SDs were regarded with caution.
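The significance score described above, (V − m)/SD, can be sketched as follows. This is only an illustration: the function names are hypothetical, the use of the sample standard deviation is an assumption (the exact estimator used by the Gap program is not stated here), and a score of exactly 5 is treated as "regard with caution" because the text only defines the ranges above and below 5.

```python
import statistics


def alignment_significance(best_quality, randomized_qualities):
    """Return (V - m) / SD for one alignment.

    best_quality: quality V of the real alignment.
    randomized_qualities: qualities of the Monte Carlo randomized alignments.
    """
    m = statistics.mean(randomized_qualities)
    sd = statistics.stdev(randomized_qualities)  # assumption: sample SD
    return (best_quality - m) / sd


def interpret_significance(z):
    """Map a significance score (in SDs) to the categories used in the text."""
    if z > 15:
        return "nearly ideal"
    if z > 5:
        return "good"
    return "regard with caution"  # assumption: includes a score of exactly 5
```

With randomized qualities of 10, 12, and 14 (mean 12, sample SD 2), a real alignment of quality 52 scores 20 SDs and would be read as nearly ideal, while one of quality 20 scores 4 SDs and would be regarded with caution.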

The conditions used to identify the sequences in both nucleotide schemes were the same as those described previously (19). Even though these conditions had been determined for the GCG program, the same criteria were used for the DNASTAR program since both programs are based on the same algorithm.

A sequence identity of ≥75% for any EV prototype strain indicated a homologous serotype, provided that the second-highest score was <70% (next closest serotype). A highest score between 70 and 75% or a second-highest score greater than 70% indicated a tentative identification, while a highest score lower than 70% indicated a nonmatching sequence.
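The nucleotide criteria above amount to a small decision rule, sketched here under stated assumptions. The function name is hypothetical, and a second-highest score of exactly 70% is treated as tentative, since the text only specifies the cases <70% and >70%.

```python
def classify_nucleotide_match(highest, second_highest):
    """Apply the nucleotide identification criteria to two similarity scores (%)."""
    if highest < 70.0:
        return "nonmatching"
    if highest >= 75.0 and second_highest < 70.0:
        return "homologous serotype"
    # Highest score between 70 and 75%, or second-highest score too close.
    return "tentative"


# Scores taken from Table 1 (NW-GCG similarity test).
print(classify_nucleotide_match(87.2, 63.5))  # TX92-1647
print(classify_nucleotide_match(78.1, 70.1))  # GA95-2095
print(classify_nucleotide_match(67.5, 67.5))  # OK85-6388
```

Under this rule TX92-1647 is assigned to a homologous serotype, GA95-2095 is only tentatively identified because its second-highest score exceeds 70%, and OK85-6388 is nonmatching.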

Amino acid analysis conditions were set on the basis of a highest score of 90% and a second-highest score of 85%. These cutoff scores were inferred from a frequency distribution graphic of the pairwise sequence analysis scores when the intraserotype peak differentiated (25). The same kind of analysis had been used before for potyvirus classification (25).

Multiple-sequence alignment analysis.

Two widely used multiple-sequence alignment programs, the Clustal W program (version 1.81) and the Pileup program, were compared (23). The Pileup program (which is part of the GCG package) is a progressive pairwise alignment program which implements a simplified version of the Feng and Doolittle algorithm (8a). Sequence similarity is used to cluster sequences, and then a dendrogram is constructed by the unweighted pair group method with arithmetic means, which orders the subsequent pairwise alignments based on the method of Needleman and Wunsch (16).

The Clustal W program (the slow, accurate option) is also based on the Feng and Doolittle algorithm but is improved through dynamic assignment of penalties. The Clustal W program varies gap penalties in relation to sites along the sequence and the relative evolutionary distance among sequences (and subsets of sequences), and the dendrogram is constructed by the neighbor joining method. GOP and GEP costs can be specified separately for the pairwise sequence and multiple-sequence alignments, while the delay divergent sequences option delays the inclusion of sequences with less than the specified sequence identity until the most similar sequences are aligned.

The key parameters for both programs are the cost assigned to the opening of the gap (gap cost), the cost for extension of a gap (gap-length cost) (total gap cost = gap cost + [gap-length cost × gap length]), and the delay divergent sequence option, which is available only in the Clustal W program. These parameters were modified to evaluate the performances of the programs. The alignments of the deduced amino acid sequences of all strains were first compared. The methods were then tested by varying the GOP value between 5 and 100 and by using different GEP values (1, 5, and 10 for the Pileup program and 6.66, 20, 100, and 200 for the Clustal W program). A similar analysis was performed for the nucleotide sequences with the Pileup program and a GEP value of 0.3. The Clustal W program delay divergent option was also evaluated. This value was set at 80 since the limit value of the identity score for two strains of the same serotype was estimated to be 75%.
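The total gap cost formula given above can be written as a one-line function. This is a sketch for illustration only: the function name and the gap length of 4 are hypothetical, and the parameter pairs are taken from the settings listed in Table 2.

```python
def total_gap_cost(gap_cost, gap_length_cost, gap_length):
    """total gap cost = gap cost + (gap-length cost x gap length)."""
    return gap_cost + gap_length_cost * gap_length


# Cost of one gap of length 4 under two of the tested settings.
pileup_cost = total_gap_cost(5, 1, 4)        # Pileup: GOP = 5, GEP = 1
clustal_cost = total_gap_cost(100, 100, 4)   # Clustal W: GOP = 100, GEP = 100
print(pileup_cost, clustal_cost)
```

The same gap costs 9 under the Pileup setting and 500 under the Clustal W setting, which shows how strongly the choice of GOP and GEP can discourage gap insertion.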

The quality of the alignments should also be assessed in order to evaluate and compare alignment programs. Two different scores were calculated to estimate the quality of the alignment. The sum-of-pairs score (SPS) increases with the number of sequences correctly aligned and is used to determine the abilities of the programs to align some or all sequences within an alignment. The column score (CS) is a binary score which tests the abilities of the programs to align all the sequences correctly. Both scores were previously developed and tested to evaluate multiple-sequence alignment programs (24), but no reference alignment is considered to normalize the corresponding results in the present study.
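Because no reference alignment is used here, the two scores can only be illustrated in a reference-free form. The sketch below is one possible reading and not necessarily the calculation used in the study: the column score is taken as the fraction of columns in which every sequence carries the same non-gap residue, and the sum-of-pairs score as the fraction of sequence pairs, over all columns, that carry identical non-gap residues. Both definitions and both function names are assumptions.

```python
from itertools import combinations


def column_score(alignment):
    """Fraction of columns that are identical and gap-free in all sequences."""
    columns = list(zip(*alignment))
    conserved = sum(1 for col in columns if "-" not in col and len(set(col)) == 1)
    return conserved / len(columns)


def sum_of_pairs_score(alignment):
    """Fraction of residue pairs, over all columns, that are identical and non-gap."""
    columns = list(zip(*alignment))
    n_pairs = len(alignment) * (len(alignment) - 1) // 2
    identical = sum(
        1
        for col in columns
        for x, y in combinations(col, 2)
        if x == y and x != "-"
    )
    return identical / (n_pairs * len(columns))


toy_alignment = ["ACGT", "ACGA", "AC-T"]
print(column_score(toy_alignment), sum_of_pairs_score(toy_alignment))
```

In the toy alignment, two of the four columns are fully conserved (column score 0.5), while 8 of the 12 possible residue pairs agree (sum-of-pairs score of about 0.67), which illustrates why the sum-of-pairs score rewards partially correct columns and the column score does not.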

Finally, the most consistent alignment was chosen to calculate the homology index to identify the serotype. For multiple-sequence alignments, this score for each pair of sequences was calculated as follows:

A homology index higher than 75% assigned a strain to a serotype.

Phylogenetic approach.

The most consistent alignment was used as the input for the phylogenetic methods. Phylogenetic relationships were inferred with the DNAPARS program (as a character state method) and the DNADIST or PROTDIST program (as a distance method) and then with the NEIGHBOR and KITSCH programs (PHYLIP, version 3.57) (8). The PROTPARS, DNAML, and FITCH programs were not evaluated due to the excessive analysis time needed for large numbers of sequences with a common personal computer system. The statistical significance of phylogenies was estimated by using the SEQBOOT program by bootstrap analysis with 100 pseudoreplicate data sets. Only groupings with bootstrap values higher than 65 were considered significant. The consensus tree obtained with CONSENSE was plotted by TREEVIEW (version 1.2; R. Page, 1995, Glasgow, United Kingdom).
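The bootstrap step can be illustrated with a short sketch of column resampling, which is the general idea behind pseudoreplicate data sets. This is not the SEQBOOT program: the function names, the fixed random seed, and the toy alignment are hypothetical, and tree building on each pseudoreplicate is left out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(alignment[0])
    picked = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[c] for c in picked) for seq in alignment]


def pseudoreplicates(alignment, n=100, seed=0):
    """Build n pseudoreplicate alignments (100 were used in the text)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n)]


def is_significant(bootstrap_value, cutoff=65):
    """A grouping counts as significant only if its bootstrap value exceeds the cutoff."""
    return bootstrap_value > cutoff


replicates = pseudoreplicates(["ACGT", "ACGA"], n=100)
print(len(replicates), is_significant(86), is_significant(65))
```

Each pseudoreplicate has the same number of sequences and columns as the input, and every column is drawn from the original alignment, so the bootstrap value of a grouping is simply the number of pseudoreplicate trees (out of 100) in which that grouping appears.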

Nucleotide sequence accession numbers.

The original sequences detected in the present study were submitted to the GenBank sequence database under accession numbers AF252165 to AF252190 and AF290896 to AF290909.

RESULTS

Table 1 describes the viruses analyzed and the values obtained for their sequences by each analysis method.

Pairwise sequence alignment methods. (i) Quality analysis.

The quality analysis showed that each program obtained different results. Since the DNASTAR package did not provide an option to perform the quality analysis, the results were compared with those obtained with the GCG package on a pairwise basis to test its quality. One of the most divergent results was observed when isolate R100, identified by seroneutralization as EV type 18 (EV18), was aligned with the reference strain, EV18 Metcalf. The overall length of the Gap alignment was 419 bp, while the MegAlign alignment had an overall length of 531 bp. No gaps were inserted by the Gap program, and the similarity score was 75.3%. On the other hand, the MegAlign program inserted 224 gaps, resulting in a similarity score of 46.3%. The same discordant insertion of gaps was found frequently in other pairs of alignments (data not shown).

The randomization option of the Gap program was used to evaluate the quality of the result by performing a Monte Carlo simulation. The more similar or related the sequences were, the higher the quality of the alignment was (data not shown).

The parameters selected for the analysis of nucleotide sequences were GOP equal to 50 and GEP equal to 3, considering the fact that the lowest-quality scores were 4.99, 2.74, and 1.09 with GOP equal to 50, 100, and 200, respectively (Table 2). By using the selected options, only one alignment resulted in a score lower than 5 (4.99 [1 of 8,216 comparisons; 0.01%]), while 170 (2.1%) and 946 (11.5%) of the pairwise alignments had scores lower than 5 by using GOP values of 100 and 200, respectively.

TABLE 2.

Summary of results^a

Method and parameter values	Maximum quality (bootstrapping)	Minimum quality (bootstrapping)	No. of samples with quality <5 (bootstrapping <65)	No. of strains with highest scores <75% nt or <90% aa	No. of strains with second-highest score >70% nt or >85% aa	No. of strains with highest scores not corresponding to their serotype	No. of strains with second-highest scores >75%	No. of strains affected by multiple-reference-strain factor (n = 28)	Failure to identify EV13 strain	Failure to identify new serotype	No. of strains correctly identified^b	% Correctly identified
Pairwise sequence alignment
DNASTAR, nucleotides (default GOP and GEP values)	NA	NA	NA	24	50	4	4	13	Yes	Yes	30	28.85
GCG
Nucleotides
GOP = 50, GEP = 3	77.97	4.99	1	5	46	0	1	1	Yes	Yes	52	50.00
GOP = 100, GEP = 10	77.24	2.75	170	5	45	0	1	1	Yes	Yes	51	49.03
GOP = 200, GEP = 20	70.03	1.09	946	5	45	0	1	1	Yes	Yes	51	49.03
Proteins (GOP = 8, GEP = 2)	145	24.67	0	4	25	0	0	0	No	Yes	75	72.10
Multiple-sequence alignment
Pileup, nucleotides (GOP = 5, GEP = 1)	NA	NA	NA	6	NA	0	2	0	Yes	Yes	96	92.30
Clustal W, nucleotides (GOP = 100, GEP = 100)	NA	NA	NA	5	NA	0	1	0	Yes	Yes	98	94.20
Phylogenetic approach (PHYLIP package with nucleotides)	100	86	0	NA	NA	NA	NA	0	Yes	No	103	99.04


^a Abbreviations: NA, not applicable; nt, nucleotides; aa, amino acids.

^b Number of strains among 104 strains tested that fulfilled all established criteria.

Finally, selection of parameters for amino acid alignments was not decisive since all similarity scores had high-quality values (lowest score, 24.67).

(ii) Similarity test as method for EV identification.

The values obtained by the NW-GCG similarity test of nucleotide sequences from clinical samples are shown in Table 1. Only the EV13 strain and four other field strains (untypeable strains OK85-6388, VA86-6765, CT87-7122, and CT87-7123) resulted in similarity scores <75%, as reported previously (19).

However, the second-highest scores were higher than 70% in more than 60 cases under any condition, resulting in more than 45 different untypeable EV serotypes (Table 2). This was also true for isolate PA88-8412, whose second-highest score was also higher than 75% (the score for strain EV8 Bryson was 75.6%, which was independent of the parameter chosen).

On the other hand, deduced amino acid sequence analysis grouped all tested strains except the untypeable strains with their respective reference strains (including the EV13 field strain). Again, 25 different strains had second-highest scores higher than 85% (data not shown).

The similarity values obtained by the M-NW similarity test were lower than the values by the NW-GCG similarity test (mean score by the NW-GCG similarity test of 81.5% versus mean score by the M-NW similarity test of 78.8%). A total of 19.2% of the strains, including the untypeable strains, could not be typed because of low similarity scores or because the score attributed the strain to a serotype different from the serotype established by the other methods (Table 3). In conclusion, the M-NW similarity test appears to be less suitable for EV identification.

TABLE 3.

Analysis of discordant M-NW similarity test values^a

Clinical sample	Neutralization serotype	Highest M-NW similarity index (%) with respective reference strain^b	Highest M-NW similarity index (%)^c	Genotype determined by M-NW similarity index^d
GV34	EV12	71.8	71.8	EV12
M250	EV12	73.7	73.7	EV12
270N97	EV18	57.4	66.5	EV6 Burgess
R100	EV18	46.2	67.4	EV15 CH96
MD92-1649	EV25	74.9	74.9	EV25
MN94-1828	EV25	74.1	74.1	EV25
PA94-5753	CAV16	52.3	65.9	ENV71 BrCr
GA95-2095	CAV16	51.8	66.7	ENV71 BrCr
TX95-2147	CAV16	74.5	74.5	CAV16
NC83-5515	EV24	75	75	EV24
TAI84-5839	CAV16	73.8	73.8	CAV16
OK85-6388	UNT	NA	66.4	EV7
VA86-6765	UNT	NA	67.6	EV24
VA86-6776	EV13	71.5	71.5	EV13
CT87-7122	UNT	NA	68.2	EV15 CH96
CT87-7123	UNT	NA	68.2	EV15 CH96
MT87-7421	EV3	74.1	74.1	EV3
GUT88-8020	CAV21	74	74	CAV21
MD88-8157	EV3	73	73	EV3
GUT88-8436	CAV21	74.3	74.3	CAV21
PA89-9262	CAV16	74.8	74.8	CAV16
OK89-9448	EV18	73.9	73.9	EV18
RI94-1959	EV21	74.4	74.4	EV21
CB6IS2	CBV6	73.6	73.6	CBV6


^a Abbreviations: UNT, “untypeable” EV. See footnote a of Table 1 for definitions of the other abbreviations.

^b Values lower than 75% are in boldface.

^c Results higher than those obtained for the corresponding reference strain are in boldface.

^d Serotype determined by the highest score obtained elsewhere (17). Results that differed from the neutralization serotype are in boldface.

(iii) Analysis of test results over multiple reference strain groups.

Samples corresponding to EV serotypes represented by several reference strains were compared by both similarity tests. The M-NW similarity test could not correctly correlate the field strain to another reference strain of the same serotype in 13 (46.4%) cases. For example, the sequences of EV30 strains 127N88, 1431N98, 1432N98, and HC56 resulted in scores of 72.3, 58.3, 58.3, and 74.1%, respectively, with EV30 reference strain PR17. Strains 127N88 and HC56 also had similarity scores of 74.1 and 74.3%, respectively, with reference strain EV30 Giles. On the other hand, only one field strain (strain 127N88) had a value lower than 75% (74.9%) when its sequence was compared with the sequence of EV30 PR17 by the NW-GCG nucleotide similarity test with any combination of parameters. Moreover, the NW-GCG amino acid similarity test was not affected by the multiple reference strain factor since all the strains in this group correlated with their corresponding reference strains.

Multiple-sequence alignment methods. (i) Quality analysis. (a) Deduced amino acid sequences.

The Pileup program generated multiple-sequence alignments whose lengths and qualities varied widely with the parameters chosen. In contrast, the Clustal W program generated alignments whose lengths and qualities were more conserved across the different parameters. The performances, measured as the overall similarity (SPS/length) and the CSs, for the different methods of analysis with different GEP and GOP parameters are shown in Fig. 1a and b.

FIG. 1. — Results from each test comparing the performances of the programs. The program tested the combinations of GEP and GOP parameters shown for each indicator with each alignment test. The program and the GEP parameters used are shown in the rows. Columns show the GOP parameter for each alignment. The box to the right of each comparison describes the symbols used for the scores. (a) Column score result for deduced amino acid alignments. (b) Percent overall similarity result for each amino acid alignment. (c) Column score result for nucleotide alignments. (d) Percent overall similarity result for each nucleotide alignment. (e) Variation in overall similarity of the alignment considering different delay divergent values with the Clustal W program. The GOP and GEP scores were set equal to 100. The default delay divergent value was increased by 5 units up to a value of 95.

The Clustal W program generated the most conserved and consistent alignment when a GOP value of 75 and GEP value of 6.66 were used. The resulting alignment had an overall similarity of 43.52% and a length of 151 amino acids.

(b) Nucleotide sequences.

For the Clustal W program, the alignment of the divergent sequences was delayed with the delay divergent option (which was varied between 30 and 95) until all the remaining sequences were aligned. The effect of this optional parameter in the alignment quality is presented in Fig. 1e, showing that the best alignments were obtained with values higher than 85. Thus, the conditions mentioned above were tested by varying only the values of GOP and GEP. Again, the Clustal W program generated multiple-sequence alignments with more conserved lengths and qualities than the Pileup program for the different parameters tested (Fig. 1c and d).

The Clustal W program generated the most conserved and consistent alignment with GOP equal to 100, GEP equal to 100, and a delay divergent option of 90%. The alignment had an overall similarity of 54.78% and was 453 nucleotides long. With the Pileup program, in contrast, GOP equal to 5 and GEP equal to 0.3 or 1 generated the best alignment, with an overall similarity of 54.61% and a length of 452 nucleotides. Other combinations of parameter values generated very different results.

(ii) Homology index as method of EV identification.

The results obtained by use of the homology index are shown in Table 1. The homology test with the Clustal W program identified every serotype but an EV13 strain (homology indices with prototype strains EV13 Del Carmen and ENV69 Tolucal, 74.2 and 73.5%, respectively) and the untypeable strains with values higher than 75%. The PA88-8412 sequence presented the same problem encountered with the pairwise sequence analysis, showing a second-highest score higher than 75%.

The homology test with the Pileup program also identified every serotype but an EV13 strain, the untypeable strains, and field strain T185. Strain T185, typed as EV29, was correctly identified by the other methods studied (Table 1), while the score obtained with the Pileup program with its respective reference strain was 30.6%. Again, the PA88-8412 sequence resulted in a second-highest score higher than 75% (the score for EV8 Bryson was 77.5%).

(iii) Analysis of test results over multiple reference strain groups.

Neither the homology index obtained with the Clustal W program nor the homology index obtained with the Pileup program was affected by the multiple reference strain group factor since all the field strains tested correlated with their corresponding reference strains.

Phylogenetic methods.

All the phylogenetic methods generated the same tree topology with different bootstrap values, and the tree with the highest bootstrap values was obtained with the KITSCH program (Fig. 2). Bootstrap values for each serotype cluster are shown in Table 1. Four separate groups supported by high bootstrap values clustered similarly to those in previously published studies (genogroups A, B, C, and D) (15, 20–22). The method identified all serotypes except an EV13 isolate (VA86-6776). This isolate was correctly grouped with its reference strain but with a nonsignificant bootstrap value (42%), while it was grouped with ENV69 and EV13 reference strains with higher values. An untypeable EV serotype that could not be identified by other methods of analysis (19) was correctly grouped into genogroup B of the EV genus with a high bootstrap value.

FIG. 2. — Consensus phylogenetic tree constructed with the PHYLIP package for 104 EV sequences. Alignments were obtained with the Clustal W program. The statistical significance of the phylogenies constructed was estimated with the SEQBOOT program. The 100 pseudoreplicate data sets obtained were analyzed with the DNADIST program with the parameters of the Kimura 2 model of nucleotide substitution. The observed nucleotide distance matrix was then processed with the KITSCH program, and the tree generated was treated with the CONSENSE program. The tree was displayed with the TREEVIEW program. The numbers at the nodes represent the percentage of 100 bootstrap pseudoreplicates that contained the cluster distal to the node.

Finally, the remaining serotypes, including PA88-8412, clustered with very high bootstrap values.
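The bootstrap percentages reported at the nodes can be illustrated with a minimal sketch. It assumes that each pseudoreplicate tree has already been reduced to the set of clusters (clades) it contains; the taxon names (`field_1`, `ref_A`, and so on) and the four toy replicates are hypothetical and are not taken from the data set. The sketch only shows how a support value is counted and does not reproduce the SEQBOOT, DNADIST, KITSCH, and CONSENSE pipeline.

```python
def bootstrap_support(cluster, replicate_trees):
    """Return the percentage of replicate trees that contain `cluster`.

    Each replicate tree is represented as a set of frozensets, one
    frozenset of taxon names per cluster found in that tree.
    """
    if not replicate_trees:
        raise ValueError("at least one replicate tree is required")
    cluster = frozenset(cluster)
    hits = sum(1 for clades in replicate_trees if cluster in clades)
    return 100.0 * hits / len(replicate_trees)


# Hypothetical toy data: four pseudoreplicate trees.
replicates = [
    {frozenset({"field_1", "ref_A"}), frozenset({"field_1", "ref_A", "ref_B"})},
    {frozenset({"field_1", "ref_A"})},
    {frozenset({"field_1", "ref_B"})},
    {frozenset({"field_1", "ref_A"}), frozenset({"ref_B", "ref_C"})},
]

# The cluster {field_1, ref_A} appears in 3 of the 4 replicates.
print(bootstrap_support({"field_1", "ref_A"}, replicates))  # 75.0
```

With 100 pseudoreplicates, as used here, the value returned is simply the number of replicate trees in which the cluster was recovered.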

DISCUSSION

Analysis of sequence information is becoming the major approach to the study and identification of human EVs. Three different levels of analysis of sequence information are discussed in this report: a pairwise sequence alignment of homologues, multiple-sequence alignment of homologues, and phylogenetic reconstruction of trees. The results showed that each analysis method has different EV identification results and that the accuracies of the results depend on the parameters chosen (Table 2).

The complexity of this genus makes the correct identification of EVs difficult, and problems associated with the seroneutralization method or analysis of untypeable EV strains appear frequently. Nevertheless, although identification of EVs through analysis of the VP1 genome region has been recognized as the best choice, it is also necessary to achieve consensus on a common system of sequence analysis.

The alignment of two sequences is critical for all sequence analysis methods. Differences in alignments and accuracies of methods have previously been reported with large sets of protein sequences with several programs (12). However, the biological validities of nucleotide sequence alignments have been studied less. This is because DNA sequences (which have 4 character states) are more difficult to align than amino acid sequences (which have 20 character states). It was also demonstrated for rRNA data and protein-encoding sequences that alignment accuracy depends on the alignment program and the parameters chosen (14). As was expected, the more insertion or deletion events that are incorporated, the greater the differences among sequence alignments are.

Like other steps in phylogenetic reconstruction, automated sequence alignment requires selection of the most appropriate parameters. Therefore, published studies should describe the parameter values used to create the alignments if they are to be repeatable. In addition, each method of analysis should provide an assessment of quality.

The results of the pairwise nucleotide sequence alignment methods were very different. Although both programs use the same algorithm, the results obtained with the DNASTAR program were very poor.

Pairwise sequence alignment methods provide a simple way to assess the alignment quality by the Monte Carlo approach. The software developer indicates that the algorithm used in the MegAlign program is a slight modification (11) of the algorithm originally designed by Needleman and Wunsch (16) for proteins. This allows the user to align two nucleotide sequences, first by an approach described by Martinez (11), which identifies regions with perfect matches, and then by the method of Needleman and Wunsch (16), which optimizes the fit between perfect matches. The MegAlign program does not provide a quality assessment option, and the results obtained were different from those obtained with the Gap program. For example, isolate R100 was correctly identified by the Gap program, while the MegAlign program showed a very low similarity index (46.2%) in tests with the same reference strain. In addition, the M-NW similarity test had the worst performance, and only 32.7% of the strains fulfilled the criteria for identification. Therefore, the MegAlign program is useless for the identification of EVs by use of this genome region.

In contrast, the GCG package provides an option that assesses the quality of the alignment obtained with the Gap program. The Monte Carlo simulation determined that the best pair of parameter values was GOP equal to 50 and GEP equal to 3. Other conditions resulted in scores for quality controls lower than the proposed 5 SDs for several pairs of alignments.
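The idea behind a Monte Carlo quality control of this kind can be sketched as follows. This is a generic illustration and not the GCG implementation: it assumes a plain Needleman-Wunsch global alignment with a linear gap penalty, and the scoring values (`match`, `mismatch`, `gap`), the number of shuffles, and the function names are hypothetical choices that are not on the scale of the GOP and GEP values reported above. The score of the real alignment is compared with the scores obtained after one sequence is shuffled repeatedly, and the difference is expressed in SDs of the shuffled scores.

```python
import random
import statistics


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch) with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def monte_carlo_sds(a, b, n_shuffles=100, seed=0):
    """Number of SDs by which the real score exceeds the shuffled-score mean."""
    rng = random.Random(seed)
    real = nw_score(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # keeps composition, destroys the order
        shuffled_scores.append(nw_score(a, "".join(letters)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.stdev(shuffled_scores)
    return (real - mean) / sd if sd > 0 else float("inf")


# Hypothetical 60-nucleotide sequence compared with itself.
seq = "ATGGCGTACGTTAGCCTAGGATCCGATTACGGCATCGTAGCTAGGCTAACGTTAGCATCG"
print(monte_carlo_sds(seq, seq) > 5)
```

An alignment whose score lies fewer than 5 SDs above the shuffled mean would, under the criterion cited above, be treated as not clearly better than chance.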

By use of these parameters, the highest scores were higher than 75% for 99 of 104 (94.6%) clinical strains. However, 46 clinical isolates could not be identified since they did not fulfill the second-highest-score criterion. In addition, isolate PA88-8412 had a score higher than 75% for two different serotypes, including the correct one (EV4). Scores higher than 70% slightly decreased as the GOP increased (data not shown), showing that these pairwise sequence alignments were relating divergent pairs of sequences, as suggested previously (17). Although the parameters were varied to increase the penalty to the similarity score for divergent sequences, the distances between similarity scores were not large enough to increase the number of identifiable isolates.

In summary, this pairwise sequence alignment method did not allow the identification of 52 of 104 (50%) strains mainly due to problems related to the proposed criteria. It seems that the criterion of a second-highest score lower than 70% established for identification (19) is very stringent, since all the strains that did not fulfill this criterion had highest scores greater than 75% and the correct serotype.
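The effect of the second-highest-score criterion can be made concrete with a short sketch. It assumes that the best similarity score of the query against each reference serotype has already been computed and collapsed to one value per serotype; the function name, the dictionary layout, and the example scores are hypothetical. The thresholds are those discussed above (a highest score of at least 75% and a second-highest score lower than 70%).

```python
def identify_serotype(scores, min_top=75.0, max_second=70.0, use_second=True):
    """Return the serotype assigned to a query, or None if it is not identified.

    scores: dict mapping reference serotype -> percent similarity with the query.
    use_second: apply the second-highest-score criterion when True.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return None
    top_type, top_score = ranked[0]
    if top_score < min_top:
        return None  # highest score below the identification threshold
    if use_second and len(ranked) > 1 and ranked[1][1] >= max_second:
        return None  # second-highest score too close to the highest one
    return top_type


# Hypothetical scores for one field strain.
example = {"EV30": 78.2, "EV21": 71.5, "EV25": 64.0}
print(identify_serotype(example))                    # None
print(identify_serotype(example, use_second=False))  # EV30
```

In this toy case the strain has the correct highest score but is rejected only because of the second-highest-score criterion, which is the situation described for the strains that could not be identified.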

An even more disappointing aspect was that the pattern of gaps in multiple-sequence alignments was often inconsistent with that in pairwise sequence alignments; thus, the optimal alignment between the two closest sequences was often altered in the presence of a third or a fourth sequence.

Analysis of deduced protein sequences with the Gap program improved the results. Unfortunately, the four untypeable isolates still could not be identified, but the EV13 field strain was correctly identified. The multiple-reference-strain effect was not observed, and the quality analysis had higher values. In this case, the second-highest-score criterion should also be discarded, since 25 field strains would have been correctly identified if this criterion had not been considered. The analysis of deduced amino acid sequences could constitute a better approach than the analysis of nucleotide sequences.

The greater precision and more consistent performance of the Clustal W program compared with the Pileup program are due to more sophisticated features, such as neighbor joining, that guide the alignments. For the same reasons considered for pairwise sequence alignment, two different methods were used to evaluate the accuracies of the alignments obtained.

Since length is one of the parameters that allows the user to test the quality of the alignment at first sight, the deduced amino acid sequences were evaluated by considering that if the overall length of a protein alignment is known, the nucleotide sequence alignment length could be calculated (overall amino acid length times 3). The results were different for the two methods. The alignments obtained with the Clustal W program were more consistent and conserved, and better methods to improve quality with the delay divergent sequence option were provided. Moreover, both methods of quality control, SPS and CS, were more consistent for the Clustal W program than for the Pileup program. The alignments of the nucleotide sequences obtained with the Clustal W program were also more reliable than those obtained with the Pileup program. Although the Clustal W program generated the exact 453-bp alignment, the Pileup program was not able to generate an alignment with the predicted overall length. Despite that, the most important features of the Clustal W program are its reproducibility and reliability, given that the resulting alignments were similar under all conditions selected.
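The length check described above amounts to simple arithmetic, which the following sketch makes explicit. The function names are hypothetical; the lengths used are those reported in this study (151 amino acids, 453 nucleotides for the Clustal W program, and 452 nucleotides for the Pileup program).

```python
def expected_nt_length(aa_alignment_length):
    """Nucleotide alignment length predicted from the protein alignment length."""
    return 3 * aa_alignment_length  # one codon (3 nucleotides) per amino acid


def length_is_consistent(aa_alignment_length, nt_alignment_length):
    """True when the nucleotide alignment has exactly the predicted length."""
    return nt_alignment_length == expected_nt_length(aa_alignment_length)


print(expected_nt_length(151))         # 453
print(length_is_consistent(151, 453))  # True  (Clustal W nucleotide alignment)
print(length_is_consistent(151, 452))  # False (Pileup nucleotide alignment)
```

A nucleotide alignment that is not a multiple of 3 in length, such as the 452-nucleotide Pileup alignment, cannot correspond codon by codon to any protein alignment.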

The Clustal W program homology index identified all except six clinical strains, while the Pileup program homology index did not identify seven strains. The second-highest-score criterion should not be considered for these tests since their alignments are clearly more reliable than pairwise sequence alignments. In summary, the Clustal W program homology index was able to identify 98 of 104 (94.2%) field strains. In addition, the Clustal W program is freely available on the Internet and can be run on a wide variety of computer platforms or on a common personal computer in less than half an hour, while the GCG program is an expensive commercial package that must be run on a mainframe computer system and requires some Unix programming knowledge to perform the process automatically. This should be considered when proposing a method that is easily transferable to all reference laboratories.

In any phylogenetic study, alignment of a less conserved region and regions with variable lengths is often problematic, even when sequence divergence is moderate. The alignment process is critical in phylogenetic analyses since inclusion of ambiguously aligned regions could result in erroneous patterns of branching, while removal of such regions could reduce the resolution. To avoid this reduction in the resolution, the terminal regions of the VP1 sequences were included, because it is known that the VP1 region is very divergent and generates a variable-length alignment.

Phylogenetic reconstruction allowed not only the correct clustering of all serotypes with a high bootstrap value (except for the EV13 field strain) but also the prediction and clustering of the new serotypes. This is shown for the cluster of untypeable isolates from clinical samples. Despite the location of the untypeable serotype near the EV20 JV1 reference strain, the cluster is clearly composed of strains of a new serotype since the observed nucleotide sequence distance among strains of the untypeable serotype is lower than 0.20 and the distance between strains of the untypeable serotype and EV20 strains is higher than 0.45 (data not shown).
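The nucleotide distances referred to above were computed with the Kimura 2 model in the DNADIST program. As an illustration only, the sketch below uses the classical closed-form Kimura two-parameter estimator, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions between two aligned sequences. The function name and the toy sequences are hypothetical, and the computation in DNADIST may differ in detail.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P estimator")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Hypothetical 10-nucleotide sequences differing by one transition (A -> G).
d = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
print(round(d, 4))  # 0.1116
```

Under the values given above, pairs of strains of the untypeable serotype had distances lower than 0.20, whereas pairs formed with EV20 strains had distances higher than 0.45.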

Identification of the EV13 field strain is a special case since it was clustered with both EV13 and enterovirus type 69 (ENV69) reference strains with high bootstrap values. It was also the only strain that could not be correctly identified by all the methods tested. This could mean that the three strains are of the same serotype, but this hypothesis should be further evaluated with more sequence data from field strains of the EV13 or ENV69 serotype.

In addition, 16 serotypes of EV were not represented in the data. Field isolates of these serotypes are not available in GenBank or in the set of samples sequenced so far. The results should be evaluated again when field samples of these serotypes are available.

The results of the present study indicate that phylogenetic inference is the best method. Multiple-sequence alignment programs or pairwise sequence alignment of deduced amino acid sequences appears to be the second choice, although adoption of universal criteria for selection of parameters and assessment of the quality of the alignment is needed before multiple-sequence alignment programs can be used. The parameters selected should also be reported so that the experiments can be repeated and compared.

Finally, the present study shows that accurate identification of EVs can easily be achieved directly from clinical samples with freely available software.

Acknowledgments

We thank Jorge Gomez for continuous support during this work and for helpful discussions and critical reading of the manuscript.

The work was supported by institute funds from INEI-ANLIS “Dr. Carlos G. Malbrán” and by Fondo de Investigaciones Sanitarias grant FIS98/0229 from the Spanish Ministry of Health. I.C. is a postdoctoral fellow funded by the Instituto de Salud Carlos III, Becas de Perfeccionamiento.

REFERENCES

1. Arola, A., J. Santti, O. Ruuskanen, P. Halonen, and T. Hyypia. 1996. Identification of enteroviruses in clinical specimens by competitive PCR followed by genetic typing using sequence analysis. J. Clin. Microbiol. 34:313–318.
2. Barton, G. 1990. Protein multiple sequence alignment and flexible pattern matching. Methods Enzymol. 183:403–428.
3. Burland, T. G. 2000. DNASTAR’s Lasergene sequence analysis software. Methods Mol. Biol. 132:71–91.
4. Casas, I., P. E. Klapper, G. M. Cleator, J. E. Echevarría, A. Tenorio, and J. M. Echevarría. 1995. Two different PCR assays to detect enteroviral RNA in CSF samples from patients with acute aseptic meningitis. J. Med. Virol. 47:378–385.
5. Casas, I. P., G. Trallero, D. Cisterna, M. Freire, and A. Tenorio. 2001. Molecular characterization of human enteroviruses in clinical samples: comparison between VP2, VP1, and RNA polymerase regions using RT nested PCR assays and direct sequencing of products. J. Med. Virol. 65:138–148.
6. Drebot, M. A., J. J. Campbell, and S. H. Lee. 1999. A genotypic characterization of enteroviral antigenic variants isolated in eastern Canada. Virus Res. 59:131–140.
7. Drebot, M. A., C. Y. Nguan, J. J. Campbell, S. H. Lee, and K. R. Forward. 1994. Molecular epidemiology of enterovirus outbreaks in Canada during 1991–1992: identification of echovirus 30 and coxsackievirus B1 strains by amplicon sequencing. J. Med. Virol. 44:340–347.
8. Felsenstein, J. 1993. PHYLIP: phylogeny inference package, version 3.57. University of Washington, Seattle.
8a. Feng, D. F., and R. F. Doolittle. 1996. Progressive alignment of amino acid sequences and construction of phylogenetic trees from them. Methods Enzymol. 266:368–382.
9. Kapsenberg, J. G., A. Ras, and J. Korte. 1980. Improvement of enterovirus neutralization by treatment with sodium deoxycholate or chloroform. Intervirology 12:329–334.
10. Kopecka, H., B. Brown, and M. Pallansch. 1995. Genotypic variation in coxsackievirus B5 isolates from three different outbreaks in the United States. Virus Res. 38:125–136.
11. Martinez, H. 1983. An efficient method for finding repeats in molecular sequences. Nucleic Acids Res. 11:4629–4634.
12. McClure, M. A., T. K. Vasi, and W. M. Fitch. 1994. Comparative analysis of multiple protein-sequence alignment methods. Mol. Biol. Evol. 11:571–592. (Erratum, 11:811.)
13. Melnick, J. L. 1996. Enteroviruses: polioviruses, coxsackieviruses, echoviruses, and newer enteroviruses, 3rd ed. Lippincott-Raven, Philadelphia, Pa.
14. Morrison, D. A., and J. T. Ellis. 1997. Effects of nucleotide sequence alignment on phylogeny estimation: a case study of 18S rDNAs of apicomplexa. Mol. Biol. Evol. 14:428–441.
15. Muir, P., U. Kammerer, K. Korn, M. N. Mulders, T. Poyry, B. Weissbrich, R. Kandolf, G. M. Cleator, and A. M. van Loon for The European Union Concerted Action on Virus Meningitis and Encephalitis. 1998. Molecular typing of enteroviruses: current status and future requirements. Clin. Microbiol. Rev. 11:202–227.
16. Needleman, S. B., and C. D. Wunsch. 1970. A general method applicable to the search for similarities in the amino acid sequences of two proteins. J. Mol. Biol. 48:444–453.
17. Oberste, M. S., K. Maher, D. R. Kilpatrick, M. R. Flemister, B. A. Brown, and M. A. Pallansch. 1999. Typing of human enteroviruses by partial sequencing of VP1. J. Clin. Microbiol. 37:1288–1293.
18. Oberste, M. S., K. Maher, and M. A. Pallansch. 1998. Molecular phylogeny of all human enterovirus serotypes based on comparison of sequences at the 5′ end of the region encoding VP2. Virus Res. 58:35–43.
19. Oberste, M. S., K. Maher, M. R. Flemister, G. Marchetti, D. R. Kilpatrick, and M. A. Pallansch. 2000. Comparison of classic and molecular approaches for the identification of untypeable enterovirus. J. Clin. Microbiol. 38:1170–1174.
20. Poyry, T., T. Hyypia, C. Horsnell, L. Kinnunen, T. Hovi, and G. Stanway. 1994. Molecular analysis of coxsackievirus A16 reveals a new genetic group of enteroviruses. Virology 202:982–987.
21. Pöyry, T., L. Kinnunen, T. Hyypiä, B. Brown, C. Horsnell, T. Hovi, and G. Stanway. 1996. Genetic and phylogenetic clustering of enteroviruses. J. Gen. Virol. 77(Pt 8):1699–1717.
22. Pulli, T., P. Koskimies, and T. Hyypiä. 1995. Molecular comparison of coxsackie A virus serotypes. Virology 212:30–38.
23. Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.
24. Thompson, J. D., F. Plewniak, and O. Poch. 1999. A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res. 27:2682–2690.
25. Ward, C. W., N. M. McKern, M. J. Frenkel, and D. D. Shuckla. 1992. Sequence data as the major criterion for potyvirus classification. Arch. Virol. 5(Suppl.):283–297.

