Abstract
The genome of Lelystad virus (LV), the causative agent of porcine epidemic abortion and respiratory syndrome (previously known as mystery swine disease), was shown to be a polyadenylated RNA molecule. The nucleotide sequence of the LV genome was determined from a set of overlapping cDNA clones. A consecutive sequence of 15,088 nucleotides was obtained. Eight open reading frames (ORFs) that might encode virus-specific proteins were identified. ORF1a and ORF1b are predicted to encode the viral RNA polymerase because the amino acid sequence contains sequence elements that are conserved in RNA polymerases of the torovirus Berne virus (BEV), equine arteritis virus (EAV), lactate dehydrogenase-elevating virus (LDV), the coronaviruses, and other positive-strand RNA viruses. A heptanucleotide slippery sequence (UUUAAAC) and a putative pseudoknot structure, which are both required for efficient ribosomal frameshifting during translation of the RNA polymerase ORF1b of BEV, EAV, and the coronaviruses, were identified in the overlapping region of ORF1a and ORF1b of LV. ORFs 2 to 6 probably encode viral membrane-associated proteins, whereas ORF7 is predicted to encode the nucleocapsid protein. Comparison of the amino acid sequences of the ORFs identified in the genomes of LV, LDV, and EAV indicated that LV and LDV are more closely related than LV and EAV. A 3′ nested set of six subgenomic RNAs was detected in LV-infected cells. These subgenomic RNAs contain a common leader sequence that is derived from the 5′ end of the genomic RNA and that is joined to the 3′ terminal body sequence. Our results indicate that LV is closely related evolutionarily to LDV and EAV, both members of a recently proposed family of positive-strand RNA viruses, the Arteriviridae.
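As an illustration of how a heptanucleotide slippery sequence such as UUUAAAC can be located in an RNA sequence, the following is a minimal sketch and not the method used in this study. The function name `find_slippery_sites` and the toy sequence are hypothetical; the sequence is invented for the example and is not taken from the LV genome.

```python
def find_slippery_sites(rna, motif="UUUAAAC"):
    """Return the 0-based start position of every occurrence of motif in rna.

    Hypothetical helper for illustration only; it performs a plain
    substring scan and does not assess pseudoknot structure.
    """
    # Accept cDNA-style input (T instead of U) and lowercase letters.
    rna = rna.upper().replace("T", "U")
    sites = []
    start = rna.find(motif)
    while start != -1:
        sites.append(start)
        # Continue one position further so overlapping hits are not missed.
        start = rna.find(motif, start + 1)
    return sites


# Hypothetical toy sequence, NOT taken from the LV genome.
toy = "AUGGCUUUAAACUGGGCC"
print(find_slippery_sites(toy))  # prints [5]
```

A motif match alone does not demonstrate frameshifting; as stated above, efficient frameshifting also requires a downstream pseudoknot structure, which this sketch does not examine.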