Complete Genome Sequences of the SARS-CoV: the BJ Group (Isolates BJ01-BJ04)

Shengli Bi; E’de Qin; Zuyuan Xu; Wei Li; Jing Wang; Yongwu Hu; Yong Liu; Shumin Duan; Jianfei Hu; Yujun Han; Jing Xu; Yan Li; Yao Yi; Yongdong Zhou; Wei Lin; Jie Wen; Hong Xu; Ruan Li; Zizhang Zhang; Haiyan Sun; Jingui Zhu; Man Yu; Baochang Fan; Qingfa Wu; Wei Lin; Lin Tang; Bao’an Yang; Guoqing Li; Wenming Peng; Wenjie Li; Tao Jiang; Yajun Deng; Bohua Liu; Jianping Shi; Yongqiang Deng; Wei Wei; Hong Liu; Zongzhong Tong; Feng Zhang; Yu Zhang; Cui’e Wang; Yuquan Li; Jia Ye; Yonghua Gan; Jia Ji; Xiaoyu Li; Xiangjun Tian; Fushuang Lu; Gang Tan; Ruifu Yang; Bin Liu; Siqi Liu; Songgang Li; Jun Wang; Jian Wang; Wuchun Cao; Jun Yu; Xiaoping Dong; Huanming Yang

Geno., Prot. & Bioinfo. 2016 Nov 28;1(3):180–192. doi: 10.1016/S1672-0229(03)01023-4

PMCID: PMC5172409 PMID: 15629030

Abstract

Beijing has been one of the epicenters attacked most severely by the SARS-CoV (severe acute respiratory syndrome-associated coronavirus) since the first patient was diagnosed in one of the city’s hospitals. We now report complete genome sequences of the BJ Group, including four isolates (Isolates BJ01, BJ02, BJ03, and BJ04) of the SARS-CoV. It is remarkable that all members of the BJ Group share a common haplotype, consisting of seven loci that differentiate the group from other isolates published to date. Among 42 substitutions uniquely identified from the BJ group, 32 are non-synonymous changes at the amino acid level. Rooted phylogenetic trees, proposed on the basis of haplotypes and other sequence variations of SARS-CoV isolates from Canada, USA, Singapore, and China, gave rise to different paradigms but positioned the BJ Group, together with the newly discovered GD01 (GD-Ins29) in the same clade, followed by the H-U Group (from Hong Kong to USA) and the H-T Group (from Hong Kong to Toronto), leaving the SP Group (Singapore) more distant. This result appears to suggest a possible transmission path from Guangdong to Beijing/Hong Kong, then to other countries and regions.

Key words: SARS, SARS-CoV, haplotype, substitution, phylogeny

China is the prime victim of the sudden outbreak of SARS (severe acute respiratory syndrome), caused by a newly identified coronavirus, the SARS-CoV (1). The Beijing metropolitan area is one of the epicenters that have been severely attacked by the virus since its first patient was identified on March 1, 2003. Up to June 10, 2003, Beijing alone accounted for 29.92% of the cases and 23.57% of the deaths of SARS in the entire world (Figure 1). We have been sequencing SARS-CoV genomes from clinical isolates since early April of this year and have released our data as soon as we acquired them. We now report our detailed comparative analyses of a group of four complete genome sequences (Isolates BJ01, BJ02, BJ03, and BJ04), named the BJ Group, with reference to the thirteen SARS-CoV genome sequences in the public databases (2, 3, 4).

Fig. 1 — Accumulated number of probable cases and deaths of SARS in Beijing, China, and the world. Data sources: http://www.moh.gov.cn/zhgl/yqfb/index.htm; http://www.who.int/csr/sars/country/en/.

SARS Patients, SARS-CoV Isolates and Genome Sequencing

SARS patients were clinically diagnosed in March 2003 according to World Health Organization (WHO) guidelines (http://www.who.int/csr/sars/guidelines/en/). SARS-CoV isolates were maintained in Vero-6 cell cultures that were inoculated with autopsy and biopsy samples from deceased or recovered SARS patients (Table 1). Viral RNA was purified from virions prepared from the cultures and subjected to cDNA synthesis. A set of primers covering the entire viral genome was used for reverse transcription and PCR amplification, with product sizes of 400–800 bp. PCR-amplified fragments were then cloned into amplicon libraries, and two dozen or more clones were sequenced for each PCR-amplified fragment to ensure sequence quality and to avoid errors introduced by RT-PCR and cloning. Only consensus sequences supported by an absolute majority of reads were used for genome assembly and gene annotation, although every read was assembled and some sequence variations were clearly visible. For comparative genomic analyses, 13 other full-length SARS-CoV genome sequences were downloaded from GenBank (Table 2; http://www.ncbi.nlm.nih.gov). The nucleotide positions of Isolate BJ01 were used as the reference for analyses.
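The majority-vote step described above can be illustrated with a minimal sketch. It assumes that the clone reads for one fragment are already aligned to equal length; the function name `majority_consensus`, the `min_fraction` threshold, and the toy reads are hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter


def majority_consensus(reads, min_fraction=0.5):
    """Column-wise consensus of pre-aligned, equal-length reads.

    A base is called only when its share of the column exceeds
    min_fraction (an absolute majority by default); otherwise 'N'.
    """
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must be aligned to the same length")

    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) > min_fraction else "N")
    return "".join(consensus)


# Hypothetical clone reads for a single amplicon (illustration only).
clone_reads = ["ACGTAC", "ACGTAC", "ACTTAC", "ACGTAC", "ACGAAC"]
print(majority_consensus(clone_reads))  # ACGTAC
```

Positions without an absolute majority are reported as `N` here rather than resolved arbitrarily, which keeps minority variants visible for later inspection.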

Table 1.

Samples and Clinical Data for the BJ Group (Isolates BJ01-BJ04)

Isolate	GenBank accession number	Tissue	Sample	Clinical outcome
BJ01	AY278488	Lung	Autopsy	Deceased
BJ02	AY278487	Nose & throat	Swabs (mixed patients infected by BJ01)	Recovered
BJ03	AY278490	Liver & lymph nodes	Autopsy (same as BJ01)	Deceased
BJ04	AY279354	Lung	Autopsy	Deceased

Table 2.

Information Summary of the Complete Genome Sequences of the BJ Group and Other 13 Isolates of SARS-CoV^*

Isolate	Genome size (nt)	Accession number	Modification date
BJ01	29,725	AY278488.2	1-May-03
BJ02	29,745	AY278487	5-Jun-03
BJ03	29,740	AY278490	5-Jun-03
BJ04	29,732	AY279354	29-May-03
GD01	29,757	AY278489	29-May-03
ZJ01	29,715	AY297028.1	19-May-03
TW1	29,729	AY291451.1	14-May-03
CUHK-W1	29,736	AY278554.2	14-May-03
CUHK-Su10	29,736	AY282752.1	7-May-03
Urbani	29,727	AY278741.1	21-Apr-03
HKU-39849	29,742	AY278491.2	18-Apr-03
TOR2	29,751	NC_004718.3	22-May-03
SIN2500	29,711	AY283794.1	9-May-03
SIN2677	29,705	AY283795.1	9-May-03
SIN2679	29,711	AY283796.1	9-May-03
SIN2748	29,706	AY283797.1	9-May-03
SIN2774	29,711	AY283798.1	9-May-03

^* Data were retrieved from GenBank: http://www.ncbi.nlm.nih.gov.

Isolates BJ01 and BJ03 were derived from the autopsied lung tissue (BJ01) and liver/lymph nodes (BJ03) of the same patient, the father of the first patient (the Index Case of Beijing) diagnosed on March 1, 2003 in Beijing. He was infected by his daughter in the last week of February in Shanxi Province, after she had traveled to Guangdong Province from February 18th to 23rd. He was 53 years old and had no detectable symptoms of hepatitis, AIDS, or heart disease. As the second SARS patient hospitalized in Beijing (March 5th), he died two days after the diagnosis. His daughter, however, recovered completely from the ordeal after medical treatment. The sequence differences between Isolates BJ01 and BJ03 are expected to reflect the sequence variations within the same patient after viral infection. Isolate BJ02 was derived from pooled nose/throat swabs from seven patients, all of whom were shown to have been infected by the Index Case of Beijing when she was hospitalized in the first week of March at the same hospital. All of these patients recovered after effective treatment. The sequence differences between BJ01 and BJ02 should be regarded as the variation from the first circle, or round, of the infection. Isolate BJ04 was derived from the autopsied lung tissue of a single deceased patient who had no direct contact with the Index Case of Beijing. The sequence differences between BJ04 and the other BJ isolates should reflect variations arising during viral infections outside the first circle. As a whole, the BJ Group represents the early rounds of disease transmission in Beijing.

Another relevant case concerns Isolate GD01 (previously named GZ01). It was isolated from the autopsied lung tissue of a single deceased patient, a 54-year-old female. She was suspected of having been infected during her hospitalization through indirect contact with one of the “superspreaders”, who stayed in the same hospital. She was one of the SARS cases with a known transmission path connecting to the “Index Cases” identified in Guangdong Province. The GD01 genome sequence serves as a root, or anchor point, for the BJ Group. In addition, the genome sequence of GD01 harbors a 29-nt insertion, a rare genotype so far found only in samples from Guangdong Province (5).

The four complete genome sequences range in size from 29,725 to 29,745 nucleotides (nt). The difference is primarily due to the extension of the 5′-end sequence, which does not significantly change the overall genome structure or gene sizes. In the genomes of these isolates, the predicted number of ORFs (open reading frames) remains 13, in contrast to the genome of Isolate GD01, where a 29-nt insertion was discovered. Sequence evidence from another SARS-CoV isolate collected in Guangdong has confirmed this significant finding (BGI unpublished data). Referring to the BJ01 genome annotation, we have previously reported that, in the GD01 genome, two additional BGI-PUPs (BGI-Postulated Uncharacterized Proteins), BGI-PUP5 (nt positions 27,763-27,882) and BGI-PUP6 (nt positions 27,848-28,102), were predicted in the largest (478 nt) intergenic region between the ORFs for BGI-PUP4 and the N protein (nucleoprotein); the ORF previously named PUP5 in BJ01 is now renamed BGI-PUP7. Putative leader sequences were determined in these genomes, based on alignments of the leader sequence at the 5′-end of the genome (5′-UCUCUAAACGAACUUUAAAAUCUG) to the sequences upstream of each ORF. The organization and general features of viral proteins, including the R (replicase) protein and the structural proteins (the spike or S, envelope or E, membrane or M, and N proteins), are nearly identical among all the isolates, except for the 29-nt insertion in the GD isolates and a limited number of nucleotide substitutions.

A Novel Haplotype Found Unique to the BJ Group

A salient discovery in our study is that all members of the BJ Group share a common haplotype of 7 loci, C/t-T-G-C-A-C-G-C-T-C (uppercase letters indicate the major allele at a locus and lowercase letters the minor allele; Table 3); a new mutation that occurred in BJ04 at Locus 9,385 is responsible for the minor allele t. Isolate BJ04, which was harbored by a patient who did not have any direct contact with the other BJ cases, is clustered with most members of the other groups. If GD01 is included in the group, the haplotype remains C/t-T/c-G-C-A-C-G-C-T-C, with another minor allele c introduced at Locus 9,835. Even if we exclude either BJ01 or BJ03, the two isolates from a single patient, the haplotype still stands out; it differentiates the group from all other isolates (T/c-C-T/g-C/t-A/g-C/t-A-T/c-C-T/c) identified so far. This BJ Group-specific haplotype represents the first population of SARS-CoV responsible for the acute outbreak in the metropolitan area of Beijing.
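One way to relate the ten-position haplotype strings above to a count of seven group-defining loci is to compare the major alleles position by position. The sketch below does only that. The function names are hypothetical, and reading the first letter of each field as the major allele follows the notation described in the text; whether this is how the seven loci were originally counted is an assumption.

```python
def major_alleles(haplotype):
    """Return the major allele (first letter of each field) per locus."""
    return [field.split("/")[0].upper() for field in haplotype.split("-")]


def differing_major_loci(hap_a, hap_b):
    """1-based positions at which two haplotypes differ in major allele."""
    majors_a, majors_b = major_alleles(hap_a), major_alleles(hap_b)
    if len(majors_a) != len(majors_b):
        raise ValueError("haplotypes must cover the same loci")
    return [
        position
        for position, (a, b) in enumerate(zip(majors_a, majors_b), start=1)
        if a != b
    ]


# Haplotype strings as given in the text.
BJ_GROUP = "C/t-T-G-C-A-C-G-C-T-C"
OTHER_ISOLATES = "T/c-C-T/g-C/t-A/g-C/t-A-T/c-C-T/c"

positions = differing_major_loci(BJ_GROUP, OTHER_ISOLATES)
print(positions)       # [1, 2, 3, 7, 8, 9, 10]
print(len(positions))  # 7
```

Under this reading, the BJ Group differs from the pooled haplotype of the other isolates at seven of the ten listed positions.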

Table 3.

The Group-Specific Haplotypes of SARS-CoV

The H-U Group (from Hong Kong to USA) has a different haplotype, T/c-C-T/g-T/c-G/a-C-A-T/c-C-T/c. The major allele T at Locus 17,827, shared by CUHK-W1 and CUHK-Su10, together with the major allele G at Locus 19,045, shared by CUHK-W1 and Urbani (named after Dr. Carlo Urbani, the first WHO officer to identify the outbreak of this new disease, in an American businessman who had been admitted to a hospital in Hanoi), establishes the internal association within the H-U Group. The fact that the H-U Group overlaps with the BJ Group through Isolate CUHK-W1 (c-C-g-T-G-C-A-c-C-c) suggests a possible link or path of transmission between the BJ and H-U Groups. The H-U Group appears to be a mosaic genotype of the BJ and other groups. The result suggests the existence of an intermediate state in the transmission path connecting China to the outside. In addition to the haplotype T-C-T-C-A-T-A-T-C-T, which is identical to that of the H-T Group (from Hong Kong to Toronto) as well as Isolates ZJ01 (Zhejiang) and TW1 (Taiwan), the SP Group (Singapore) possesses a monolocus marker T, as opposed to C for all others, at Locus 19,065, thus establishing the internal phylogenetic relationship within the group and differentiating it from all others. GD01 shares an almost identical haplotype with the BJ Group at the defined loci, except for the minor allele c at Locus 9,835, which distances it from the BJ Group but links it to all the other groups.
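The mosaic character of CUHK-W1 can be made concrete by labeling each of its ten alleles according to which reference haplotype it matches. This is an illustrative sketch, not the authors' analysis: the function `classify_loci` is hypothetical, the BJ reference uses only the major alleles of the BJ Group haplotype, and the lowercase letters of the CUHK-W1 string are uppercased so that alleles are compared regardless of their major/minor status.

```python
from collections import Counter

# Haplotypes quoted in the text, split into one allele per locus.
BJ_MAJOR = "C-T-G-C-A-C-G-C-T-C".split("-")   # major alleles of the BJ Group
H_T_GROUP = "T-C-T-C-A-T-A-T-C-T".split("-")  # H-T Group haplotype
CUHK_W1 = "c-C-g-T-G-C-A-c-C-c".upper().split("-")


def classify_loci(query, ref_a, ref_b, name_a, name_b):
    """Label each locus of `query` by the reference allele(s) it matches."""
    labels = []
    for allele, a, b in zip(query, ref_a, ref_b):
        if allele == a and allele == b:
            labels.append("both")
        elif allele == a:
            labels.append(name_a)
        elif allele == b:
            labels.append(name_b)
        else:
            labels.append("neither")
    return labels


labels = classify_loci(CUHK_W1, BJ_MAJOR, H_T_GROUP, "BJ", "H-T")
print(labels)
print(Counter(labels))
```

With these inputs, five loci match only the BJ haplotype, three match only the H-T haplotype, and the two that match neither are the fourth and fifth positions, where the text places the H-U Group's own major alleles T and G.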

Sequence Variations in the BJ Group in Comparison with other Isolates

A total of 137 sequence variations (142 if counted independently for each ORF) were defined by comparing the sequences of the BJ Group with those of other isolates, including 3 (GD01, ZJ01, and TW1) identified in China and 10 others elsewhere (Figure 2, Table 4). Of the total, 42 were contributed by the BJ Group alone, amounting to nearly 30% of the grand total. These variations were confirmed in over a dozen high-quality sequence reads obtained by directly re-sequencing the RT-PCR products and clones from the corresponding amplicon-derived libraries. Although sequence variations acquired during the limited generations in the host body and in viral culture, as well as sequencing errors arising from RT-PCR and amplification/cloning, could not be easily excluded, the likelihood that all BJ isolates mutated in a similar way is rather slim, since the sequencing was done systematically with overlapping segments and high quality. The quality of complete genome sequences from other contributors was assessed by manually checking the submitted sequence traces when publicly available. Of the substitutions identified in the BJ Group, 76% are non-synonymous, slightly higher than the average (70%, 100/142) calculated from all 17 isolates.

As a benchmark, we have summarized a few commonly used parameters for viral evolution studies together with those of two other RNA viruses, HIV-1 and influenza (Table 5). The SARS-CoV is in many ways quite different from these two viruses, except that their genomes are all RNA. In particular, HIV lives with the host for a long time, so constant escape from the host immune system is essential for its survival until it overwhelms the system (6). The influenza virus, which has an 85-year recorded history of infecting humans, causes recurrent annual epidemics until a novel virus arises to cause a major worldwide pandemic (7). Our results did not lead to any strong conclusions because of insufficient data points compared to both HIV and influenza, but it is necessary to compare the SARS-CoV data with those of the two prevalent viruses from time to time. Aside from mutation rate calculations, a popular test for selection on a particular protein is the ratio Ka/Ks (8). Generally, most non-synonymous SNPs (single nucleotide polymorphisms) are believed to be deleterious and are rapidly removed from a given population of viruses by selection, leaving Ka/Ks less than one. Conversely, Ka/Ks greater than one is a strong indicator of positive selection. Although we have seen a high ratio in the case of PUP2, the meaning of such a number for an uncharacterized putative transcript remains to be elucidated. At the present time, limited by the amount of experimental data, the ratio of Ka/Ks in the SARS-CoV data is higher than that of most genes in HIV and influenza viruses, in which selection and adaptation both play significant roles in changing the rate of non-synonymous substitutions, especially for the structural proteins that are the targets of the host immune system.
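The Ka/Ks test described above can be sketched with the Ka and Ks values listed for SARS-CoV in Table 5, where Ka is the non-synonymous substitution rate per non-synonymous site and Ks the synonymous rate per synonymous site. The function names and the wording of the interpretation labels are assumptions made for illustration. Because the tabulated Ka and Ks are rounded, a recomputed ratio may differ from the tabulated one in the last digit.

```python
# (Ka, Ks) pairs copied from the SARS-CoV rows of Table 5.
KA_KS = {
    "R": (0.075, 0.111),
    "S": (0.108, 0.188),
    "PUP1": (0.171, 0.340),
    "PUP2": (0.217, 0.152),
    "E": (0.093, 0.000),
    "N": (0.049, 0.054),
}


def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    return None if ks == 0 else ka / ks


def interpret(ratio):
    """Map a Ka/Ks ratio to the conventional reading used in the text."""
    if ratio is None:
        return "undefined (Ks = 0)"
    if ratio > 1:
        return "consistent with positive selection"
    if ratio < 1:
        return "consistent with purifying selection"
    return "consistent with neutrality"


for orf, (ka, ks) in KA_KS.items():
    ratio = ka_ks_ratio(ka, ks)
    shown = "n/a" if ratio is None else f"{ratio:.2f}"
    print(f"{orf:5s} Ka/Ks = {shown:>4s}  {interpret(ratio)}")
```

Among the ORFs listed here, only PUP2 gives a ratio above one, and ORFs with no synonymous substitutions (such as E) have no defined ratio, which matches the blank cells in Table 5.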

Fig. 2 — Distribution of the non-synonymous substitutions in the SARS-CoV genome from the known cases. The tall vertical bars represent the substitutions detected from the BJ Group and the low bars denote those from non-BJ Groups. The scale marks the nucleotide positions in reference to BJ01.

Table 4.

Summarized Substitutions Identified in the BJ Group and other 13 Isolates of SARS-CoV

ORF	Size (nt)	No. of S^* (BJ Group)	No. of S^* (All)	Percentage of substitutions (%) (BJ Group)	Percentage of substitutions (%) (All)	No. of N-Syn^* (BJ Group)	No. of N-Syn^* (All)	Percentage of N-Syn (%) (BJ Group)	Percentage of N-Syn (%) (All)
R	21,222	25	92	0.12	0.43	21	65	84	71
S	3,768	9	22	0.24	0.58	6	13	67	59
BGI-PUP1	825	4	9	0.48	1.09	2	6	50	67
BGI-PUP2	465	2	5	0.43	1.08	2	4	100	80
E	231	0	1	0	0.43	0	1		100
M	666	0	4	0	0.60	0	4		100
BGI-PUP3	192	0	2	0	1.04	0	2		100
BGI-PUP5	120	0	1	0	0.83	0	1		100
BGI-PUP6	255	0	1	0	0.39	0	1		100
N	1,269	1	4	0.08	0.32	1	3	100	75
Non-ORF		1	1
Total	29,725	42^†	142^†	0.13	0.46	32	100	76	70

^* S and N-Syn stand for substitutions and non-synonymous substitutions, respectively.

^† A single substitution at the same position in a region overlapping with two ORFs was counted as 2. The total number is 137 when such a substitution event was counted only once, so the total number of substitutions contributed by the BJ Group is reduced to 40.
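The percentage columns of Table 4 follow directly from the counts and ORF sizes, as the following sketch shows for the ORFs in which the BJ Group has substitutions. The function names are hypothetical; the numbers are copied from the table.

```python
# ORF: (size in nt, substitutions in BJ Group, substitutions in all isolates,
#       non-synonymous in BJ Group, non-synonymous in all isolates)
TABLE4 = {
    "R": (21222, 25, 92, 21, 65),
    "S": (3768, 9, 22, 6, 13),
    "BGI-PUP1": (825, 4, 9, 2, 6),
    "BGI-PUP2": (465, 2, 5, 2, 4),
    "N": (1269, 1, 4, 1, 3),
}


def substitution_rate(substitutions, size_nt):
    """Substitutions per nucleotide, as a percentage with two decimals."""
    return round(100 * substitutions / size_nt, 2)


def nonsyn_share(nonsyn, substitutions):
    """Non-synonymous substitutions as a whole-number percentage."""
    return round(100 * nonsyn / substitutions)


for orf, (size, s_bj, s_all, n_bj, n_all) in TABLE4.items():
    print(
        orf,
        substitution_rate(s_bj, size),
        substitution_rate(s_all, size),
        nonsyn_share(n_bj, s_bj),
        nonsyn_share(n_all, s_all),
    )
```

The percentages in the Total row (0.13 and 0.46) are reproduced when overlapping events are counted once, that is, with 40 and 137 substitutions over 29,725 nt, whereas the non-synonymous shares (76 and 70) correspond to 32/42 and 100/142.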

Table 5.

Comparison of the Mutation Rates in SARS-CoV, Influenza Virus, and HIV^*

Virus	ORF	Size (nt)	No. of substitutions	Substitution rate (%)	No. of non-synonymous substitutions	Non-synonymous substitution rate (%)	dN/dS	Ka	Ks	Ka/Ks
SARS-CoV	R	21,222	92	0.43	65	0.31	2.38	0.075	0.111	0.67
	S	3,768	22	0.58	13	0.35	2.00	0.108	0.188	0.57
	PUP1	825	9	1.09	6	0.73	1.77	0.171	0.340	0.50
	PUP2	465	5	1.08	4	0.86	4.88	0.217	0.152	1.43
	E	231	1	0.43	1	0.43		0.093	0.000
	M	666	4	0.60	4	0.60		0.126	0.000
	PUP3	192	2	1.04	2	1.04		0.492	0.000
	PUP4	369	0	0.00	0	0.00		0.000	0.000
	N	1,269	4	0.32	3	0.24	3.00	0.049	0.054	0.91
	PUP5	297	0	0.00	0	0.00		0.000	0.000
	Total	29,725	137(142)	0.46	98	0.33	2.52	0.085	0.119	0.72

Influenza Virus A	HA	1,701	698	41.03	323	18.99	0.62	64.5	382.8	0.17
	M2	294	98	33.33	60	20.41	0.83	178.0	815.9	0.22
	M1	759	243	32.02	79	10.41	0.24	79.9	1,112	0.07
	NA	1,413	418	29.58	181	12.81	0.67	12.3	69.4	0.18
	NP	1,497	573	38.28	186	12.42	0.31	107.0	1,231	0.09
	NS	366	128	34.97	56	15.30	0.34	90.6	1,087	0.08
	PA	2,151	689	32.03	181	8.41	0.20	31.4	596.4	0.05
	PB1	2,274	851	37.42	204	8.97	0.13	27.5	805.7	0.03
	PB2	2,280	805	35.31	203	8.90	0.18	36.2	719.3	0.05
	Total	13,638	4,784(4,803)	35.08	1,473	10.80	0.27	48.1	659.1	0.07

HIV-1	Gag	1,476	979	66.33	676	45.80	0.88	5,917	25,290	0.23
	Pol	3,000	1,894	63.13	1,232	41.07	0.61	3,974	24,510	0.16
	Vif	579	392	67.70	300	51.81	1.42	7,613	21,014	0.36
	Vpr	291	202	69.42	140	48.11	0.76	5,924	28,054	0.21
	Tat	306	216	70.59	172	56.21	2.64	11,101	16,108	0.69
	Rev	303	237	78.22	189	62.38	1.70	10,107	19,080	0.53
	Vpu	249	215	86.35	182	73.09	2.23	14,947	25,620	0.58
	Env	2,574	1,992	77.39	1,531	59.48	1.70	10,765	23,342	0.46
	Nef	624	473	75.80	370	59.29	1.12	8,632	29,346	0.29
	Total	9,680	6,536(7,141)	67.52	4,792	49.50	1.15	7,445	24,094	0.31

^* SARS-CoV data are from 17 isolates (see the notes of Table 2). Influenza virus data are from 50 strains of HA segment, 100 strains of M segment, 24 strains of NA segment, 88 strains of NP segment, 89 strains of NS segment, 53 strains of PA segment, 58 strains of PB1 segment, and 58 strains of PB2 segment, downloaded from http://www.flu.lanl.gov. HIV data are from 405 strains, downloaded from http://hiv-web.lanl.gov/. Only those minor alleles that are present in at least two sequences were considered as real substitutions.

Instead, we have inspected these substitutions individually with respect to their nucleotide positions within a codon. It is not surprising to find that the three nucleotide positions within a codon are mutated at similar frequencies, since the newly emerged virus has yet to undergo sufficient evolutionary processes. Among all the 139 substitutions (including 2 counted in 2 ORFs), 45 are at the first nucleotide position, 47 at the second, and 47 at the third. This implies that selective pressure has yet to act on these mutations, which were most likely generated in the initial viral population by replication errors of the viral machinery.

Many interesting non-synonymous substitutions were found in the group. A common A/C transversion (Locus 26,031; C in both BJ01 and BJ03, A in all other isolates, including two of the BJ Group, BJ02 and BJ04), leading to a Gln (CAA)-to-Pro (CCA) change, was noticed among the BJ Group-specific variations. Since it resides in the PUP2 coding sequence, the functional significance of such a mutation remains to be revealed experimentally. A G/A transition (Locus 25,280; A in BJ03 and BJ02, G in all others) leading to Gly/Pro (from polar to non-polar) was also identified. These changes are not as drastic in terms of the biochemical characteristics of the amino acids as the one found within the BJ01/BJ03 patient, indicating that there might be selection on the latter case. At the loci mentioned above, BJ04 carries the same alleles as all the other isolates, confirming its relationship with isolates outside the BJ Group. GD01 shares the same alleles at three loci with either BJ01 or BJ02, which are different from all the others, suggesting that these mutations are early replication errors that occurred before the viral invasion into Beijing and providing a link between the BJ Group and the GD Group in Guangdong, where the first major epidemic of SARS occurred.
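The near-uniform codon-position counts reported above (45, 47, and 47) can be checked with a chi-square goodness-of-fit test against equal frequencies. This test is added here as an illustration and is not an analysis reported in the paper; the function names are hypothetical. With three categories there are two degrees of freedom, for which the chi-square survival function has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def chi_square_uniform(counts):
    """Chi-square statistic for observed counts against a uniform expectation."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


def p_value_df2(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2)


codon_position_counts = [45, 47, 47]  # first, second, third codon position
statistic = chi_square_uniform(codon_position_counts)
p_value = p_value_df2(statistic)
print(f"chi-square = {statistic:.4f}, p = {p_value:.3f}")
```

The statistic is 8/139, about 0.058, far below the 5% critical value of 5.99 for two degrees of freedom, so these counts give no evidence against equal mutation frequencies at the three codon positions.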

The sequence variation between BJ01 and BJ03 was expected to reflect the replication error rates arising from replication cycles and from different tissues inside a single host. The sequence variation between BJ01/03 and BJ02 would reflect the replication errors between the first and the second round of infection among the hosts, as well as the selective pressure from the host or possible advantages gained by the viruses. Somewhat to our surprise, we noticed a "square-root rule" for the non-synonymous substitutions among the member isolates of the BJ Group (Figure 3). There are approximately 15 to 16 non-synonymous substitutions in the first round of transmission (15 between BJ01 and BJ03, 16 between BJ01 and BJ02, 16 between BJ01 and BJ04), approximately equal to 4² or 2⁴. In the second round of infection, there are 25 to 26 non-synonymous substitutions (25 between BJ02 and BJ04, 25 between BJ02 and BJ03, 25 between BJ03 and BJ04), close to 5², or to 2⁵ when both non-synonymous and synonymous substitutions are counted. Such a "square-root rule" implies that the mutations occur freely, without constraint, from patient to patient, perhaps because selection pressure and adaptation are lacking during early transmission or because there has not been enough time for them to become obvious, even though there might be a slight reduction in numbers in the second round of infection.
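The arithmetic behind the "square-root rule" can be restated with the pairwise non-synonymous counts quoted above. The sketch only checks how close the square root of the mean count in each round is to 4 and 5; the grouping into two dictionaries and the function name are assumptions made for illustration, and the synonymous counts of Figure 3 are not included because they are not given in the text.

```python
import math

# Pairwise non-synonymous substitution counts quoted in the text.
FIRST_ROUND = {
    ("BJ01", "BJ03"): 15,
    ("BJ01", "BJ02"): 16,
    ("BJ01", "BJ04"): 16,
}
SECOND_ROUND = {
    ("BJ02", "BJ04"): 25,
    ("BJ02", "BJ03"): 25,
    ("BJ03", "BJ04"): 25,
}


def root_of_mean(pair_counts):
    """Square root of the mean pairwise substitution count."""
    values = list(pair_counts.values())
    return math.sqrt(sum(values) / len(values))


print(f"first round:  {root_of_mean(FIRST_ROUND):.2f}")   # 3.96
print(f"second round: {root_of_mean(SECOND_ROUND):.2f}")  # 5.00
```

The first-round counts average 15.67, whose square root is about 3.96, and the second-round counts average exactly 25, consistent with the values 4² and 5² named in the text.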

Fig. 3 — Two-scale substitution rates of the SARS-CoV in the first and second round of the transmission of the BJ Group. The numbers above the lines that connect each isolate are nonsynonymous substitution counts believed as results of the first round and second round of the transmission. The numbers in the parentheses are synonymous substitution counts between the connected isolates.

We summarized the non-synonymous substitutions according to their subregions, defined by a combination of structural and/or functional properties in the corresponding ORFs (Table 6). Data were classified according to the computationally predictable changes in physicochemical features and/or secondary structure that the substitutions would make in the corresponding subregions. For example, the substitutions in the M protein are predicted to lead to an increased pI (isoelectric point) of the N-terminal exterior region, decreased or increased hydrophobicity in the TM (transmembrane) domains, and decreased hydrophilicity in the C-terminal interior region. These predicted changes suggest that the virus might benefit from the substitutions, which may help it to evade the host immune system or to drift to a new state ready for a return. No single-base insertion or deletion has been found so far in any of the sequences published to date, which attests to the fidelity of the viral replication machinery.

Table 6.

Predicted Subregional Changes by the Non-Synonymous Substitutions in the 17 SARS-CoV Genomes

The BJ Group as a Subset of SARS-CoV Isolates Represents a Discrete Viral Transmission Path

Rooted phylogenetic trees (Fig. 4, Fig. 5), proposed on the basis of the haplotypes from each group, synonymous or non-synonymous substitutions, and the sum of all substitutions of the 17 genome sequences of SARS-CoV isolates from patients identified in Canada, USA, Singapore, and China (Beijing, Zhejiang, Guangdong, Hong Kong, and Taiwan), gave different paradigms but positioned the BJ Group, together with the newly discovered GD01 (GD-Ins29) in the same clade, followed by the H-U Group, then the H-T Group, leaving the SP Group (Singapore) more distant. This paradigm suggests a possible transmission path from Guangdong to Beijing/Hong Kong, then to other countries and regions. It appears consistent with the epidemiological data presently available, and would suggest a possible transmission path among Guangdong, Hong Kong, Beijing, and USA.

Fig. 4, Fig. 5 — Proposed rooted phylogenetic trees of the 17 isolates of the SARS-CoV based on all substitutions (A), haplotypes (B), non-synonymous (C), and synonymous (D) substitutions.

We are clearly at an early stage of exploiting the information from the genomes of SARS-CoV isolates from patients in different countries and regions. The final picture of the infection route and the mutation spectra will be revealed in due time, as long as we keep sequencing the many clinical isolates of the virus accurately and consistently. We have been doing so since the epidemic started, finding first the unique insertion variant in the samples from Guangdong and now the haplotypes, and we will keep doing so until the next round of the infection, if it does come this fall.

Acknowledgements

We thank the Ministry of Science and Technology of China, the Chinese Academy of Sciences, and the National Natural Science Foundation of China for financial support. We are indebted to collaborators and clinicians from Peking Union Medical College Hospital, the National Center of Disease Control of China, and the Municipal Governments of Beijing and Hangzhou.

Contributor Information

Xiaoping Dong, Email: dongxp@public.fhnet.cn.net.

Huanming Yang, Email: yanghm@genomics.org.cn.

References

1. Lee N. A major outbreak of severe acute respiratory syndrome in Hong Kong. N. Engl. J. Med. 2003;348:1986–1994. doi: 10.1056/NEJMoa030685.
2. Rota P.A. Characterization of a novel coronavirus associated with severe acute respiratory syndrome. Science. 2003;300:1394–1399. doi: 10.1126/science.1085952.
3. Marra M.A. The genome sequence of the SARS-associated coronavirus. Science. 2003;300:1399–1404. doi: 10.1126/science.1085953.
4. Qin E.D. A complete sequence and comparative analysis of a SARS-associated virus (Isolate BJ01). Chin. Sci. Bull. 2003;48:941–948. doi: 10.1007/BF03184203.
5. Qin E.D. A genome sequence of novel SARS-CoV isolates: the genotype, GD-Ins29, leads to a hypothesis of viral transmission in South China. Geno., Prot. & Bioinfo. 2003;1:101–107. doi: 10.1016/S1672-0229(03)01014-3.
6. Williamson S. Adaptation in the env gene of HIV-1 and evolutionary theories of disease progression. Mol. Biol. Evol. 2003;20:1318–1325. doi: 10.1093/molbev/msg144.
7. Hay A.J. The evolution of human influenza viruses. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 2001;356:1861–1870. doi: 10.1098/rstb.2001.0999.
8. Graur D., Li W.H., editors. Fundamentals of molecular evolution. Sinauer Press; Sunderland, USA: 2000.