Figure 3.

Multiple sequence alignment of the p195/p210 regions of coronavirus replicase polyproteins. An initial draft of this alignment was generated using the Dialign2 program (35) and subsequently improved with the ClustalX program (34). The alignment was further checked and corrected using the results of a Macaw-mediated (36) analysis that involved all coronaviruses except MHVJ, which was excluded because of its closeness to MHVA. Five domains were recognized in the alignment, and their positions are indicated with ><. The borders of the domains are tentative. The alignments of the PL1pro and PL2pro regions were based on the results of our previous analysis (32). For the two regions located between domains X and PL2pro and between PL2pro and Y, no consistent alignments have been produced; therefore, only the sizes of these regions are indicated. The pp1a position of the rightmost residue in each alignment row is indicated at the right. Individual residues are shaded according to a four-level conservation scheme: black background with white letters, gray background with white letters, and gray background with black letters indicate residues that are conserved in 100, 80, and 60% of the sequences, respectively. Groups of conserved amino acids are as follows: IVLM; FYW; KRH; DNQE; ST; AG. According to Macaw, four blocks, which are labeled with the letters A to D above the alignment and are discussed in the main text, are statistically significant for the entire pp1a search space: A, p = 1.1e−002; B, p = 2.1e−002; C, p = 4.2e−006; and D, p = 3.9e−015. Two hydrophobic regions predicted to be trans-membrane domains (40) are marked with dashed lines and denoted TM1 and TM2, respectively. Other highlights are as follows: +, catalytic Cys and His residues of PLpros; #, postulated metal-chelating Cys and His residues of the PLpro zinc fingers; @, conserved Cys and His residues of domain Y; ‖, cleavage sites of PLpros.
MHVA and MHVJ, MHV strains A59 and JHM, respectively. National Center for Biotechnology Information sequence IDs: IBV, 138147; HCoV, 464694; TGEV, 872319; MHVA, 453423 (nucleotide); MHVJ, 266958 (corrected according to Ref. 57). All sequences are for proteins unless otherwise specified.
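
The shading rule described in the legend can be illustrated with a minimal Python sketch. It is not the procedure used to produce the figure; it only shows one plausible reading of the rule. The function name `shading_level` is hypothetical, and the code assumes that residues in the same conservation group count as conserved, that residues outside the listed groups (for example Cys or Pro) match only themselves, that gaps count toward the number of sequences but never toward conservation, and that the fourth level means no shading.

```python
from collections import Counter

# Conservation groups exactly as listed in the legend.
GROUPS = ["IVLM", "FYW", "KRH", "DNQE", "ST", "AG"]
GROUP_OF = {aa: group for group in GROUPS for aa in group}


def shading_level(column: str) -> int:
    """Return 100, 80, 60, or 0 (unshaded) for one alignment column.

    `column` holds one character per sequence; '-' marks a gap.
    Assumption: thresholds are read as "at least N% of the sequences".
    """
    n = len(column)
    if n == 0:
        return 0
    # Map each residue to its group; ungrouped residues stand for themselves.
    classes = [GROUP_OF.get(aa, aa) for aa in column.upper() if aa != "-"]
    if not classes:
        return 0
    top = Counter(classes).most_common(1)[0][1]
    # Integer arithmetic avoids floating-point comparisons at the thresholds.
    for level in (100, 80, 60):
        if top * 100 >= level * n:
            return level
    return 0


if __name__ == "__main__":
    # Five sequences, as in the alignment (IBV, HCoV, TGEV, MHVA, MHVJ).
    for col in ("IVLMI", "IVLMK", "IVLKD", "IKDSA"):
        print(col, shading_level(col))
```

With five sequences in the alignment, the three shaded levels correspond to five, four, and three sequences sharing a residue group in a column.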