Proceedings of the National Academy of Sciences of the United States of America. 2021 Mar 22;118(13):e2020969118. doi: 10.1073/pnas.2020969118

Rescue of codon-pair deoptimized respiratory syncytial virus by the emergence of genomes with very large internal deletions that complemented replication

Cyril Le Nouën a,1, Thomas McCarty a, Lijuan Yang a, Michael Brown b, Eckard Wimmer c,1, Peter L Collins a, Ursula J Buchholz a
PMCID: PMC8020775  PMID: 33753491

Significance

Synonymous recoding of viral genomes by codon-pair deoptimization (CPD) provides live-attenuated vaccine candidates expected to be genetically stable. However, their actual genetic stability is largely unknown. Under selective pressure, a highly attenuated human respiratory syncytial virus bearing CPD G and F ORFs (called Min B) generated abundant subpopulations of genomes containing large internal deletions (LD genomes). These LD genomes contained the CPD F gene in the first or second promoter-proximal position. This provided increased expression of the F protein, which rescued Min B replication. Thus, defective genomes were specifically selected to complement and rescue the replication of a CPD RNA virus. This study describes a previously unknown mechanism of adaptation of negative-strand RNA viruses.

Keywords: negative-strand RNA virus, respiratory syncytial virus, codon-pair deoptimization, genetic stability, defective genomes

Abstract

Recoding viral genomes by introducing numerous synonymous but suboptimal codon pairs—called codon-pair deoptimization (CPD)—provides new types of live-attenuated vaccine candidates. The large number of nucleotide changes resulting from CPD should provide genetic stability to the attenuating phenotype, but this has not been rigorously tested. Human respiratory syncytial virus in which the G and F surface glycoprotein ORFs were CPD (called Min B) was temperature-sensitive and highly restricted in vitro. When subjected to selective pressure by serial passage at increasing temperatures, Min B substantially regained expression of F and replication fitness. Whole-genome deep sequencing showed many point mutations scattered across the genome, including one combination of six linked point mutations. However, their reintroduction into Min B provided minimal rescue. Further analysis revealed viral genomes bearing very large internal deletions (LD genomes) that accumulated after only a few passages. The deletions relocated the CPD F gene to the first or second promoter-proximal gene position. LD genomes amplified de novo in Min B–infected cells were encapsidated, expressed high levels of F, and complemented Min B replication in trans. This study provides insight on a variation of the adaptability of a debilitated negative-strand RNA virus, namely the generation of defective minihelper viruses to overcome its restriction. This is in contrast to the common “defective interfering particles” that interfere with the replication of the virus from which they originated. To our knowledge, defective genomes that promote rather than inhibit replication have not been reported before in RNA viruses.


Codon-pair deoptimization (CPD) involves large-scale recoding of ORFs, such as in a virus, to increase the frequency of underrepresented codon pairs without changing codon usage or amino acid sequences. Underrepresented codon pairs are thought to function suboptimally. CPD leads to poor ORF expression and attenuation, providing a new path to vaccine candidates (1). CPD ORFs may contain up to thousands of nucleotide mutations. This should confer stability against substantial deattenuation because, in principle, any single-site reversion should yield only a small amount of deattenuation (2, 3). However, before advancing CPD vaccine candidates to clinical studies, it is important to study in depth the stability of the attenuation, and to identify possible mutations and mechanisms involved in deattenuation. A number of previous genetic stability studies of deoptimized viruses suggested that deattenuation was minimal (1, 4–10). However, an important limitation was that the deoptimized viruses generally were not subjected to strong selective pressure that would favor the outgrowth of viruses with deattenuating mutations.
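The constraints that define CPD (same protein, same codon usage, different codon pairs) can be illustrated with a minimal sketch. The two 12-nt sequences below are toy examples invented for illustration and are not taken from Min B; the function names are likewise hypothetical. The sketch only checks the three invariants and does not compute a codon-pair score, which would require reference codon-pair frequencies.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(orf):
    """Split an ORF (DNA, positive sense) into complete codons."""
    usable = len(orf) - len(orf) % 3
    return [orf[i:i + 3] for i in range(0, usable, 3)]


def translate(orf):
    """Translate an ORF into a one-letter amino acid string."""
    return "".join(CODON_TABLE[c] for c in codons(orf))


def codon_pairs(orf):
    """Count adjacent codon pairs along the ORF."""
    cs = codons(orf)
    return Counter(zip(cs, cs[1:]))


def is_codon_pair_recoding(original, recoded):
    """True if protein and codon usage are unchanged but codon pairs differ."""
    same_protein = translate(original) == translate(recoded)
    same_usage = Counter(codons(original)) == Counter(codons(recoded))
    pairs_changed = codon_pairs(original) != codon_pairs(recoded)
    return same_protein and same_usage and pairs_changed


# Toy sequences (hypothetical): both encode Ala-Glu-Ala-Glu with the same codons.
wt = "GCTGAAGCCGAG"
recoded = "GCCGAGGCTGAA"
print(translate(wt), translate(recoded))
print(is_codon_pair_recoding(wt, recoded))
```

Shuffling existing synonymous codons, as in this toy case, is what keeps codon usage constant while the pair context of each codon changes.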

Human respiratory syncytial virus (RSV) is a negative-strand RNA virus of the Pneumoviridae family of the Mononegavirales order (11). It is the most important viral agent of severe pediatric respiratory tract disease worldwide and also a serious threat to older adults. Despite active ongoing research, RSV lacks a licensed vaccine or antiviral drug suitable for routine use. The RSV genome consists of a single-stranded negative-sense RNA of 15.2 kb and has 10 genes in the order 3′-NS1-NS2-N-P-M-SH-G-F-M2-L-5′ (Fig. 1A) preceded by a leader region and followed by a trailer region. Each gene encodes a single protein of the same name except for M2, which encodes two proteins (M2-1 and M2-2) from overlapping ORFs. The genes begin and end with short conserved gene-start and gene-end transcription signals, respectively, and are transcribed as individual mRNAs by sequential transcription. Transcription initiates at a single promoter in the leader region, and there is a polar gradient of decreasing gene transcription due to polymerase fall-off at the various gene junctions.
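The consequence of the polar transcription gradient for a gene's position can be sketched with a simple geometric model. This is an illustrative assumption, not a model from this study: the per-junction fall-off value of 0.2 is hypothetical and chosen only to show the trend, and real attenuation differs between junctions.

```python
# Gene order from the 3' (promoter-proximal) end, as given in the text.
GENE_ORDER = ["NS1", "NS2", "N", "P", "M", "SH", "G", "F", "M2", "L"]


def relative_transcription(position, falloff=0.2):
    """Relative transcription of the gene at a 1-based position.

    Assumes (hypothetically) that the same fraction of polymerases
    is lost at every gene junction.
    """
    if position < 1:
        raise ValueError("position is 1-based")
    return (1.0 - falloff) ** (position - 1)


f_position = GENE_ORDER.index("F") + 1  # F is the eighth gene
for pos in (f_position, 2, 1):
    print(pos, round(relative_transcription(pos), 3))
```

Under any uniform fall-off greater than zero, a gene in the first or second position is transcribed more than the same gene in the eighth position.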

Fig. 1.


Phenotypic instability of Min B during a temperature stress test. (A) Gene map of Min B showing ORFs that are WT (gray) or CPD (black). The shut-off temperature (TSH) is the lowest restrictive temperature at which the reduction compared to 32 °C is 100-fold or greater than that observed for WT RSV at the two temperatures (27). (B and C) Temperature stress test. Seven replicate T-25 flasks of Vero cells were infected with Min B virus at an MOI of 0.1 pfu per cell: (B) Two of these were subjected to parallel passage at 32 °C for a total of 18 passages, and (C) the remaining five were subjected to parallel passage at temperatures that began at 32 °C, increased 1 °C every second passage, and ended at 40 °C at passage P18 (for details, see SI Appendix). The x axis shows passage number, and the y axis shows the virus titer of the initial inoculum and passage harvests. (D) Representative images (200× magnification) of Vero cell monolayers infected at an MOI of 0.01 with the original Min B virus (Left) or with lineage #5, P16 (Right). The monolayers were incubated at 37 °C for 6 d and visualized by light microscopy. Examples of syncytia are indicated by red arrows. Images for other lineages are shown in SI Appendix, Fig. S1A. (E) Plaque characterization of lineage #5 P16 (red), the original Min B virus (green), and WT RSV (black). Vero cells in 24-well plates were inoculated with 20 to 30 pfu per well, incubated for 12 d under 0.8% methylcellulose at 32 °C, fixed, and immunostained for RSV F expression with a mixture of three mAbs to RSV F (28). Plaque size (area in mm²), F protein MFI, and the total F protein expression were determined using a Celigo imager (SI Appendix) for an average of 756 plaques per virus. These three parameters were compared between virus preparations for statistical significance using the ANOVA test (****P ≤ 0.0001). Plaque characterization for other lineages is shown in SI Appendix, Fig. S1B.

We previously recoded various RSV ORFs by CPD (12). We generated four RSVs with up to nine CPD ORFs. Replication of each of the four CPD RSV viruses was strikingly temperature sensitive, a phenotype also observed in other CPD viruses (13). These viruses exhibited a range of increasing attenuation in vitro and in vivo (12), providing new live-attenuated RSV vaccine candidates. A CPD RSV in which only G and F were recoded (called Min B) was the most restricted in replication, suggesting that efficient expression of G or F is critical and limiting for RSV replication (12).

The temperature-sensitive phenotype provides a means to apply selective pressure to CPD RSVs by a so-called temperature stress test, in which the virus of interest is subjected to serial passage in vitro at incrementally increasing physiologic temperatures (14). This creates conditions that favor the outgrowth and detection of viruses bearing mutations that reduce temperature sensitivity and likely are deattenuating, providing evaluation of genetic stability. Since a temperature gradient occurs along the human respiratory tract, and since respiratory virus infections progress from the upper to the lower respiratory tract, temperature stress is a relevant model to test the genetic stability of live-attenuated respiratory virus vaccine candidates.

In the present study, we applied a temperature stress test to RSV Min B to investigate genetic stability and mutations that might occur in response to selective pressure. We found that the selective pressure induced the rapid and abundant generation of RSV genomes bearing large internal deletions (LD genomes) that contained the CPD F gene at the first or second promoter-proximal gene position. Expression of the F gene from the LD genomes complemented and rescued the replication of the debilitated CPD RSV. The RSV LD genomes described here represent a fundamentally different type of defective genome that promoted rather than inhibited replication of Min B.

Results

Temperature Stress Test of RSV Min B.

RSV Min B (12) contains CPD G and F ORFs bearing 197 and 422 silent mutations, respectively, for a total of 619 (Fig. 1A). Min B has a TSH (shut-off temperature; the lowest restrictive temperature at which the reduction compared to 32 °C is 100-fold or greater than that observed for WT RSV at the two temperatures) of 38 °C, but even at 32 °C it has a small-plaque phenotype and is strongly restricted in replication (12).
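The TSH definition can be read as a comparison of log10 titer reductions. The following is a minimal sketch under that reading; the titer values are invented placeholders (chosen so that the toy mutant has a TSH of 38 °C), not measurements from this study, and the function name is hypothetical.

```python
def shutoff_temperature(mutant_log10, wt_log10, permissive=32, threshold_log10=2.0):
    """Return the lowest temperature at which the mutant's titer reduction
    (relative to the permissive temperature) exceeds the WT reduction by
    at least threshold_log10 (2.0 log10 = 100-fold), or None if never.

    Both arguments map temperature (deg C) to titer in log10 pfu/mL.
    """
    for temp in sorted(t for t in mutant_log10 if t != permissive):
        mutant_drop = mutant_log10[permissive] - mutant_log10[temp]
        wt_drop = wt_log10[permissive] - wt_log10[temp]
        if mutant_drop - wt_drop >= threshold_log10:
            return temp
    return None


# Hypothetical titers (log10 pfu/mL) for illustration only.
wt_titers = {32: 7.0, 35: 7.0, 36: 7.0, 37: 6.9, 38: 6.8, 39: 6.7, 40: 6.5}
mutant_titers = {32: 6.0, 35: 5.6, 36: 5.2, 37: 4.6, 38: 3.5, 39: 2.0, 40: 1.0}
print(shutoff_temperature(mutant_titers, wt_titers))
```

Subtracting the WT reduction normalizes for any temperature effect that is not specific to the attenuated virus.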

We subjected Min B to a temperature stress test (14). Five replicate flasks of Vero cells were infected at a multiplicity of infection (MOI) of 0.1 plaque-forming units (pfu) per cell and serially passaged, in independent parallel lineages #1 to #5, at progressively increasing temperature (starting at 32 °C, increasing 1 °C every second passage [P], to a final of 40 °C, for a total of 18 passages, representing 5 mo of continuous culture) (Fig. 1C). Two additional replicate flasks were infected and passaged in parallel at the more-permissive temperature of 32 °C as controls (Ct#1 and Ct#2) (Fig. 1B).
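The passage schedule described above (32 °C start, +1 °C every second passage, 40 °C at P18) can be written as a small helper; the function name and parameter names are illustrative only.

```python
def passage_temperature(passage, start=32, step_every=2, final=40):
    """Incubation temperature (deg C) for a 1-based passage number
    in the stress-test lineages."""
    if passage < 1:
        raise ValueError("passage numbers start at 1")
    return min(start + (passage - 1) // step_every, final)


schedule = {p: passage_temperature(p) for p in range(1, 19)}
print(schedule)
```

This reproduces the passage and temperature pairings quoted in the text, for example P7 at 35 °C, P10 at 36 °C, and P16 at 39 °C.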

In the two control lineages (Ct#1 and Ct#2), viral titers increased 300-fold during P1-P3 and thereafter mostly remained between 10^5.5 and 10^6.5 pfu/mL (Fig. 1B). At those titers, each passage involved an MOI of ∼0.5 to 1.0 pfu per cell. In the five replicates incubated at increasing temperatures, titers increased during the early passages at 32 °C and 33 °C, reaching 10^6.3 pfu/mL at P4 (Fig. 1C). Titers then decreased by ∼30-fold in all lineages from P5 (34 °C) to P7 (35 °C), and then increased in all lineages from P8 (35 °C) to P10 (36 °C) to ∼10^6 pfu/mL. From P10 to P16 (39 °C), the titers varied among the lineages, ranging between 10^4 and 10^6 pfu/mL. From P16 to P18 (39 °C to 40 °C), viral titers decreased, showing that each of the lineages still maintained some temperature sensitivity.

To evaluate effects on cytopathology, monolayers of Vero cells were infected at an MOI of 0.01 with unpassaged Min B or with P16 of each lineage and incubated at 37 °C for 6 d. Results for Min B and lineage #5 are shown in Fig. 1D, and for the other lineages in SI Appendix, Fig. S1A. Min B did not form evident syncytia, whereas P16 of each lineage, including the controls passaged at 32 °C, formed extensive syncytia. This suggested that, during passage, each lineage of virus had regained substantial expression of functional RSV F protein.

Additional Vero cells were infected with 20 to 30 pfu per well of unpassaged Min B or P16 of each lineage, incubated for 12 d under methylcellulose, and analyzed for plaque size and level of F expression. Results for Min B and lineage #5 are shown in Fig. 1E, and for the other lineages in SI Appendix, Fig. S1B. Plaque sizes for all of the lineages, including those passaged at 32 °C, were increased compared to that of unpassaged Min B, although none of the lineages reached the size of WT RSV (Fig. 1E and SI Appendix, Fig. S1B). Quantification of RSV F immunostaining for plaques from P16 showed that, compared to the original Min B, all of the lineages exhibited substantially increased F median fluorescent intensity (MFI), as well as increased total amount of F protein per plaque, although less than WT RSV (Fig. 1E and SI Appendix, Fig. S1B). Thus, serial passage of Min B, either at 32 °C or at increasing temperature, yielded progeny with increased replication and expression of RSV F protein.

Serial Passage of Min B Promoted the Emergence of Sets of Mutations Unique to Each Lineage.

Whole-genome Ion Torrent deep sequencing was performed for RNA from P18 for each of the lineages (#1 to #5 and Ct#1 and Ct#2) passaged in the stress test in Fig. 1 B and C. This identified a total of 35 prominent point mutations (defined as being present in ≥50% of the sequencing reads) (Table 1) among the seven lineages. Thirty-one of these were unique to a particular lineage, and only two were the same in two lineages (for details see SI Appendix, Table S1). There was considerable variability in the number and locations of mutations between the lineages (Table 1). One of the control lineages (Ct#2) had no prominent mutations, and the other (Ct#1) had five. The other lineages had between two and nine prominent mutations. Thirty-two mutations were distributed among seven ORFs (NS1, P, M, G, F, M2, and L); the remaining three mutations were in nontranslated regions. Only nine mutations (26%) were in the G and F ORFs, the only ORFs that had been subjected to CPD. Of the 32 mutations identified in ORFs, 26 (81%) were nonsynonymous mutations, suggesting a bias toward amino acid change.

Table 1.

Numbers and locations of point mutations present in >50% of reads from full-genome deep-sequencing analysis of Min B RSV RNA from P18 of lineages #1 to #5 and Ct#1 and Ct#2

Lineage
Genome region 1 2 3 4 5 Ct#1 Ct#2
3′ leader (1)
NS1 1
P 1 1 1+ (1) 1+ (1)
M 2 1 1 1
M/SH IG 1
G* 2
F 1+ (1) 2 1 2
M2-1 1+ (1)
M2-1/M2-2 1
M2-2 2 1+ (2)
L GS 1
L 1 2
No. of mutations in the lineage 2 9 6 6 7 5 0

The passage scheme is shown in Fig. 1 B and C. Mutations are nonsynonymous except for those in parentheses, which are synonymous. The details of sequence positions and sequence changes are shown in SI Appendix, Table S1.

*G and F were the only ORFs in Min B that had been subjected to CPD.

The two mutations in the F ORF of lineage 2 and Ct#1 are the only mutations shown that are identical between lineages.

Using a lower cutoff of ≥10% of the sequencing reads from the same experiment, an additional 47 subprominent (<50% of reads) point mutations were identified in P18 of the seven lineages (SI Appendix, Table S1). All were unique except that a subprominent synonymous mutation in the leader region of lineage #1 was the same as a prominent mutation in lineage #2, and a subprominent nonsynonymous mutation in the F ORF of lineage #5 was the same as a prominent mutation in lineage #2 and Ct#1. Of the 45 unique subprominent mutations, 30 (67%) were nonsynonymous (SI Appendix, Table S1). Taken together, the 35 prominent and 47 subprominent mutations involved every ORF except that of the N gene. Only 18 (22%) occurred in CPD G and F (SI Appendix, Table S1). Of these, seven (39%) involved a codon that had been modified during CPD, with four resulting in an amino acid change. Four involved a nucleotide that had been modified during CPD and two restored the WT nucleotide assignment. The two prominent F mutations shared between lineages #2 and Ct#1 did not involve codons or nucleotides modified by CPD.
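The two frequency cutoffs used for P18 (≥50% of reads for prominent, ≥10% but <50% for subprominent) amount to a simple classification rule, sketched below. The variant labels and frequencies in the example are hypothetical and serve only to exercise the thresholds.

```python
def classify_mutation(read_fraction, prominent_cutoff=0.50, reporting_cutoff=0.10):
    """Classify a point mutation by the fraction of reads carrying it."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    if read_fraction >= prominent_cutoff:
        return "prominent"
    if read_fraction >= reporting_cutoff:
        return "subprominent"
    return "below cutoff"


# Hypothetical variant frequencies for illustration.
example = {"variant_a": 0.82, "variant_b": 0.50, "variant_c": 0.23, "variant_d": 0.04}
for name, fraction in example.items():
    print(name, classify_mutation(fraction))
```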

To investigate the dynamics of the accumulation of mutations over the passages, we performed Ion Torrent whole-genome deep sequencing of each of the seven lineages at P4, P7, P10, P14, and P16, in addition to that described above for P18. Mutations present at a frequency of ≥5% are shown in SI Appendix, Table S2. Overall, most mutations emerged by P7, P10, and P14. Some of these mutations, including both prominent and subprominent, soon disappeared, while others remained and often increased in frequency by P18.

Accumulation of Mutations in the M2-1/M2-2 Overlap Region in Lineage #5.

We examined lineage #5 in greater detail because it exhibited a set of six mutations (SI Appendix, Table S1, shaded gray) localized in or near the region of the M2 gene where the upstream ORF encoding the M2-1 protein overlaps with the downstream ORF encoding the M2-2 protein (SI Appendix, Fig. S2A). The M2-1 protein is an essential elongation/antitermination factor needed for sequential transcription (11). The M2-2 protein is a nonessential regulatory factor whose deletion results in attenuation, a global increase in viral mRNA and protein expression, and decreased RNA replication (11, 15).

Three of these changes were nonsynonymous, changing amino acid assignments near the N terminus of M2-2 but not in M2-1 (t8176c, I6T; t8182c, I8T; and t8196c, Y13H). These six M2-overlap mutations appeared en bloc in P10 and increased through P18; all were present throughout these passages at ≥50% except for 8176, which increased to 40% by P18 (SI Appendix, Tables S1 and S2). The five mutations present at ≥50% were designated M2[5m], and these five mutations plus the less abundant 8176 mutation were designated M2[6m].

A number of other mutations in seven other genes were present during passage of lineage #5, but these were mostly of low abundance and in a number of cases disappeared during passage (SI Appendix, Table S2). Only two other prominent mutations were observed, both of which appeared only in P18 (t7673c, F23L in M2-1; and c3519a, S86R in M).

Thus, the most prominent and consistent mutations in lineage #5 were the six M2[6m] mutations. Because they were located in the overlap between M2-1 and M2-2 and included three amino acid substitutions in M2-2, these mutations might affect the expression and functions of M2-1 and M2-2, which in turn might affect global viral RNA and protein synthesis.

Effects of Introducing M2-Overlap Mutations into WT RSV and Min B.

We introduced M2-overlap mutations from lineage #5 into WT RSV and Min B as follows (focusing in particular on the nonsynonymous mutations t8182c and t8196c): 1) t8182c, 2) t8196c, 3) t8182c and t8196c, and 4) all five mutations (M2[5m]). The recovered WT-M2 and Min-B-M2 mutant viruses were sequenced completely, confirming the correct sequences.

The four WT-M2 mutants were analyzed in Vero cells in parallel with LID/∆M2-2, a vaccine candidate with the same viral backbone and deletion of most of the M2-2 ORF (16) (SI Appendix, Fig. S2). Briefly, during multicycle replication over 6 d at 32 °C or 37 °C, the WT-M2 mutants replicated with efficiencies similar to that of WT RSV and LID/∆M2-2 (SI Appendix, Fig. S2B), but induced more rapid and extensive syncytium formation. Increased syncytium formation due to increased protein expression is characteristic of deletion of the M2-2 ORF (15). In the same experiment, evaluation of the expression of RSV F and G proteins during the 6 d by flow cytometry (SI Appendix, Fig. S2 C and D) showed that expression by the various viruses increased in the order: WT RSV < t8182c = LID/∆M2-2 < t8196c < t8182c-t8196c < M2[5m], which also was the order of increased syncytium formation. Thus, introducing the M2[5m] mutations into WT RSV yielded a phenotype similar to that of a ∆M2-2 mutant except that the increase in protein expression was greater.

The four Min B-M2 viruses similarly were evaluated for multicycle replication in Vero cells at 32 °C and 37 °C (SI Appendix, Fig. S3A). At 32 °C, all four Min B-M2 mutants replicated slightly better than Min B, reaching about 10-fold higher titers by day 7. At 37 °C, the M2 mutations had little effect except that mutation t8182c conferred a modest increase in replication to Min B. Thus, the M2 mutations partly rescued Min B replication at 32 °C, with a single mutation having a marginal effect at 37 °C.

The Min B-M2[5m] mutant was compared to virus from P10 and P14 of lineage #5 (SI Appendix, Fig. S3 B–D). Each of the three virus preparations was inoculated on Vero cells at MOIs of 0.4 and 0.15 pfu per cell, which correspond to the inocula used for P10 and P14, respectively, of the temperature stress test in Fig. 1 B and C. Syncytium formation was evident with P10 and P14 lineage #5, but no syncytia were detected with Min B-M2[5m]. Measurement of virus production on day 5 postinfection (SI Appendix, Fig. S3C) showed that P10 and P14 of lineage #5 replicated to 20- and 50-fold higher titers, respectively, than Min B-M2[5m]. This experiment was repeated using a uniform MOI of 0.01 pfu per cell for Min B-M2[5m], P10, and P14, to ensure that the differences in viral yield were not due to differences in MOI. In this experiment, P10 and P14 replicated to 214- and 590-fold higher titers than Min B-M2[5m], respectively (SI Appendix, Fig. S3D).

Next, we characterized the plaques induced by P10 and P14 of lineage #5 and Min B-M2[5m] at 32 °C (SI Appendix, Fig. S3E). Compared to Min B, Min B-M2[5m] plaques were slightly larger in size, but the expression of F protein was not significantly increased. The plaque sizes and the total F expression per plaque induced by P10 and P14 of lineage #5 were significantly higher than for Min B and Min B-M2[5m].

As noted above, the t8176c mutation (causing a I6T substitution in M2-2) was less abundant than the M2[5m] mutations but appeared with them en bloc during the stress test. We constructed and recovered a version of Min B containing the t8176c mutation combined with the M2[5m] mutations (Min B-M2[6m] virus), and found that this did not further increase expression of RSV F protein (see, for example, Fig. 3 A, B, D, and E).

Fig. 3.


The LD genomes efficiently expressed RSV F protein. Expression of RSV F protein from cDNA-encoded versions of LD RNA #1, LD RNA #1-M2[6m], and LD RNA #2 in a transient reverse genetic system. Plasmids encoding positive-sense copies of the 10,144-nt LD RNA #1 (Fig. 2E) with and without the six M2[6m] mutations, and the 4,884-nt LD RNA #2 (Fig. 2F), under the control of the RNA T7 promoter were transfected into BSR T7/5 cells that constitutively express the T7 RNA polymerase together with four RSV support plasmids expressing the N, P, M2-1, and L ORFs (except where omission is indicated). Cells were incubated at (A–D) 32 °C or (E) 37 °C. Protein expression was evaluated by Western blot and expressed as fluorescence intensity (FI). (A) Time course of expression by the indicated plasmid-encoded genomes. (B) Representative Western blot of F protein expression at 48 hpt by the indicated plasmid-encoded genomes. (C) Lack of F protein expression by LD RNA #2 in the absence of the L support plasmid. (D) Quantification of F protein expression at 48 hpt at 32 °C from seven independent experiments (with the exception of WT-M2[6m] and WT, for which n = 6). (E) Quantification of F protein expression at 48 hpt at 37 °C from five independent experiments. In D and E, box plots show the median (horizontal line), flanked by the second and third quartile. The outer bars show the range of values. Data were evaluated for statistical significance using the nonparametric Kruskal–Wallis test with Dunn’s multiple comparisons posttest (*P ≤ 0.05, **P ≤ 0.01).

Thus, the introduction of the M2[6m] mutations into Min B, representing six of the eight prominent mutations in P18 of lineage #5, did not significantly rescue the expression of RSV F and viral replication, and did not recreate a virus with a phenotype similar to that of P10 or P14 of lineage #5 (note that the higher passages in the stress test could not be used in every experiment due to limitations on material).

Analysis of Lineage #5 by Long-Read Single-Molecule Real-Time Deep Sequencing of Long RT-PCR Amplicons.

We investigated whether other combinations of mutations in lineage #5 (SI Appendix, Tables S1 and S2) might be linked in subpopulations of genomes, which might in aggregate rescue Min B replication and expression of F protein. First, we generated, from P10, P14, and P18 of lineage #5, two RT-PCR amplicons of approximately 8 kb (amplicon A, nucleotide positions 251 to 8,356) and 7.4 kb (amplicon B, nucleotide positions 7,788 to 15,223) that overlapped by 568 nt and represented the entire genome except for the first 250 nt (SI Appendix, Fig. S4A). The M2-1/M2-2 overlap region (nucleotide positions ∼8,147 to ∼8,207) (SI Appendix, Fig. S2A), in which the M2[6m] mutations were located, was present in the overlap region of both amplicons. We then performed long-read single-molecule real-time (SMRT) deep sequencing to obtain complete single-molecule reads of the 8-kb and 7.4-kb amplicons (SI Appendix, Fig. S4).

SMRT sequencing of amplicon A showed that the original Min B genome diminished in abundance with increasing passage and was replaced by subpopulations bearing different combinations of point mutations. Each passage contained four to five major subpopulations that increased or diminished or gained further mutations during further passage (SI Appendix, Fig. S4B). At P18, three populations accounted for about 70% of the reads; the most abundant of these accounted for ∼40% of the reads and contained five nonsynonymous mutations: NS2[Y88H], P[K27E], M[A45V], F[A74T], and F[H159Y] (SI Appendix, Fig. S4B, asterisk) that collectively were called the [#5m-P18] mutations. The effect of these mutations on Min B is detailed later. Thus, the RSV genomes in lineage #5 indeed contained several subpopulations with different combinations of point mutations, although none of the subpopulations became prominent (i.e., ≥50% of reads) or involved prominent mutations. Unexpectedly, however, none of the long-range reads of amplicon A (SI Appendix, Fig. S4B) from any passage level contained any of the six M2[6m] mutations.

In contrast, SMRT sequencing of amplicon B from P18 detected the set of M2[6m] mutations in 80% of the long-range reads (SI Appendix, Fig. S4C). The M2[6m] mutations coemerged by P10 and remained through P18, consistent with the Ion Torrent sequencing in SI Appendix, Table S2. A few further subpopulations emerged containing the M2[6m] mutations linked to one or several other mutations, but these were minor (SI Appendix, Fig. S4C). Thus, while the M2[6m] mutations involve sequence positions represented in both amplicon A and B, these mutations were detected only in sequence analysis of amplicon B.

Identification of a Min B–Derived Genome Containing an LD.

The presence of the M2[6m] mutations in amplicon B but not amplicon A of lineage #5 RNA suggested the possibility that these amplicons might represent different subpopulations of Min B–derived genomes. Specifically, the M2[6m] mutations might be present in a subpopulation of genomes in which the region containing the binding site for the forward primer used to amplify amplicon A (beginning at genome nucleotide position 251) had been deleted. If so, the [251–8356] primer pair used for amplicon A would be unable to amplify this subpopulation, but could amplify any additional subpopulations in which the primer binding site was present.

We probed for a possible deletion in the left-hand half of the viral genome by RT-PCR of viral RNA from P17 of lineage #5 using a series of 10 different forward primers complementary to the RSV genome beginning at nucleotide position 251 from the left-hand end of the genome and moving progressively rightward along the genome to nucleotide 6,592, together with reverse primers specific for nucleotide position 8,356, 9,578, or 10,317 (Fig. 2A). The various PCR amplicons of the expected sizes were generated, and these were sequenced to detect the presence or absence of the M2[6m] mutations (Fig. 2A, indicated in the right-hand column). None of the PCR amplicons generated with forward primers that bound at nucleotide 251, 1,120, 2,850, 4,676, or 5,132 contained the M2[6m] mutations, whereas PCR amplicons generated with a forward primer binding at nucleotide position 5,379, 5,560, 5,727, 5,919, or 6,592 did contain the M2[6m] mutations (Fig. 2A). The choice of reverse primer made no difference. This indicated that the M2[6m] mutations were present on a subpopulation of genomes containing a deletion whose right-hand end was located between nucleotides 5,132 and 5,379 and included deletion of positions 251, 1,120, 2,850, and 4,676.
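The inference drawn from the primer walk can be sketched as follows. The primer positions and outcomes are those listed in the paragraph above; the function name and data layout are illustrative assumptions. A forward primer whose amplicon lacks the M2[6m] mutations is taken to bind within the deletion, and one whose amplicon carries them is taken to bind outside it.

```python
def deletion_right_boundary(primer_results):
    """Bracket the right-hand end of a deletion.

    primer_results maps forward-primer position -> True if the amplicon
    carried the marker mutations (primer site retained in the deleted
    genome), False otherwise (primer site inside the deletion).
    Returns (last_position_inside, first_position_outside).
    """
    inside = [pos for pos, has_marker in primer_results.items() if not has_marker]
    outside = [pos for pos, has_marker in primer_results.items() if has_marker]
    if not inside or not outside:
        return None
    lower, upper = max(inside), min(outside)
    if lower >= upper:
        raise ValueError("results are not consistent with a single deletion end")
    return lower, upper


primer_results = {
    251: False, 1120: False, 2850: False, 4676: False, 5132: False,
    5379: True, 5560: True, 5727: True, 5919: True, 6592: True,
}
print(deletion_right_boundary(primer_results))
```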

Fig. 2.


Identification and structures of three LD Min B–derived genomes identified in lineage #5. (A) Mapping of a deletion in the left-hand side of Min B genomes from P17. The Min B genome is shown at the top (CPD genes in black), and amplicons A and B (from the long-read SMRT deep sequencing) (SI Appendix, Fig. S4) are shown for reference. RNA was analyzed by RT-PCR using primer pairs identified in the left-hand column by their nucleotide positions. The PCR products were completely sequenced by Sanger sequencing, and the presence or absence of the set of six M2[6m] mutations is indicated on the right. The dashed lines show the deduced location of the right-hand end of the deletion between nucleotides 5,132 and 5,379 (primers complementary to sites within the deletion are shown in red). (B–D) Images of ethidium bromide-stained 0.8% agarose gels showing RT-PCR products generated from RNA purified from the unpassaged Min B inoculum or from the indicated passage levels of lineage #5 in the temperature stress test, amplified using the indicated primers identified by their nucleotide positions. The left-hand lane of each gel contained molecular length markers (labeled by kilobase). (B) Analysis with primer pair [1–9578] flanking the deletion shows the emergence of an LD genome, indicated by the transition from the larger to the smaller PCR product (marked “E”). (C) Analysis with primer pair [251–8356], of which primer 251 is located within the deletion, fails to amplify a smaller product and generates a larger amplicon representing a nondeleted subpopulation (asterisk). (D) Analysis with primer pair [1–15223] representing the two ends of the genome reveals several amplicons representing several putative LD genomes (“E,” “F,” and “G” indicate amplicons representing LD genomes in E, F, and G, below). (E–G) Structures of three LD genomes identified in lineage #5. In each panel, full-length Min B genome (Upper structure) is annotated to show the locations of deletions and point mutations (in particular, the six M2[6m] mutations), and the LD genome and its nucleotide length are shown underneath. LD RNA#1-M2[6m] (E) was sequenced from amplicons marked “E” in B and D, respectively. LD RNA#2 (F) was sequenced from amplicons from P9 RNA corresponding to amplicon “F” in D. LD RNA#3 (G) was sequenced from amplicons from P17 RNA corresponding to amplicon “G” in D.

To map the left-hand end of the proposed deletion, we performed RT-PCR of RNA from the unpassaged Min B inoculum, as well as P4, P7, P11, P14, and P18 of lineage #5 using a forward primer specific to nucleotides 1 to 25 (the left-hand end of the genome), and a reverse primer complementary to nucleotide 9,578 (primer pair [1–9578]) (Fig. 2B). Agarose gel electrophoresis of the RT-PCR products revealed that the inoculum and P4 yielded a single amplicon of approximately 10 kb, which would be the predicted product from a genome that is intact across this region (Fig. 2B). At P7, a second major amplicon of ∼4 kb appeared; from P7 to P18, the 10-kb amplicon decreased sharply in abundance while the 4-kb fragment increased in abundance. This change in size suggested that a large deletion (LD) genome with a 5-kb deletion had emerged at around P4 and subsequently accumulated to become predominant. An amplicon representing this LD genome was not detected with a control primer pair [251–8356], since the binding site of primer 251 was within the deletion; this primer pair yielded primarily a single large amplicon of the appropriate size to represent genome with no deletions across this region (Fig. 2C).

The short PCR amplicon in Fig. 2B was gel-purified and sequenced completely by Sanger sequencing. This revealed a 4,967-nt deletion from nucleotide 213 (codon 39 in the NS1 ORF) to nucleotide 5,291 (codon 201 in the CPD G ORF) inclusive, and confirmed that this deletion-containing genome also contained the M2[6m] mutations (Fig. 2E). Combining these sequences with those obtained from the SMRT long-read deep sequencing of amplicon B, we assembled a complete 10,144-nt sequence of this LD viral RNA, called LD RNA#1-M2[6m] (Fig. 2E).

We then sought to confirm the presence of the complete 10,144-nt LD RNA in lineage #5 by RT-PCR using primers specific for the first and last nucleotides of the RSV genome (primer pair [1–15223]), which would amplify complete LD genomes. Analysis of RNA from P4, P7, P9, P10, P14, P17, and P18 confirmed the presence of a 10-kb amplicon (band E in Fig. 2D) in the RT-PCR products from each passage after P4, as well as other bands. The complete sequence was confirmed from the gel-purified 10-kb amplicon from P9. Partial sequencing of the gel-purified 10-kb amplicon from the other passages provided further confirmation. The only differences noted were that, from P10 to P18, two “T” to “A” substitutions appeared at nucleotide positions 115 and 145 that resulted in the introduction of stop codons at NS1 codons 6 and 16, respectively. The RT-PCR products obtained with the [1–15223] primer pair contained additional amplicons (e.g., bands F and G in Fig. 2D) suggestive of additional LD genomes, which are investigated below.

Identification of Additional Min B–Derived LD Genomes.

We then investigated whether lineage #5 might contain LD genomes with a deletion in the 5′ (right-hand) half of the genome. We performed a series of RT-PCR reactions with RNA from P17 and a series of forward and reverse primers spanning the genome from nucleotides 5,919 to 15,223 (SI Appendix, Fig. S5). This provided evidence of genomes containing a deletion with a left-hand end between nucleotide 6,592 and nucleotide 7,788 and a right-hand end between nucleotide 11,159 and nucleotide 12,756 (SI Appendix, Fig. S5).

We also gel-purified additional amplicons from gels such as shown in Fig. 2D obtained with the [1–15223] primer pair, specifically 4.9-kb and 5.8-kb bands from P9 and P17 (bands F and G in Fig. 2D, respectively) and determined their complete sequences by Sanger sequencing. This identified two additional presumptive LD genomes that we called LD RNA#2 and LD RNA#3 (Fig. 2 F and G).

LD RNA#2 (Fig. 2F) was 4,884 nt in length and contained two deletions: One was a 4,736-nt deletion from nucleotide 7,596 (2 nt upstream of the M2 GS signal) to nucleotide 12,331 (codon 1,278 in the L ORF), which is consistent with the deletion mapping described immediately above. LD RNA#2 also contained a second deletion of 5,492 nt from nucleotide 56 (the 13th nucleotide in the upstream untranslated region of the NS1 gene) through 5,659 (the 12th nucleotide in the upstream untranslated region of the F gene) (Fig. 2F). This deletion placed the complete CPD F ORF downstream of the NS1 GS signal and thus in the first gene position.

LD RNA#3 (Fig. 2G) was 5,846 nt in length and also contained two deletions. One was the same 4,736-nt deletion from nucleotide 7,596 to nucleotide 12,331 as in LD RNA#2. The second deletion was 4,531 nt in length and was from nucleotide 361 (codon 88 in the NS1 ORF) to nucleotide 5,003 (codon 105 in the CPD G ORF) (Fig. 2G). This created a short NS1/G chimeric gene as the first gene followed by the intact CPD F gene as the second gene. In addition, we identified the insertion of an additional “A” nucleotide in an “A” homopolymer at nucleotides 173 to 178 (NS1 codons 25 to 27). This nucleotide insertion resulted in the introduction of a stop codon at codon 32 of NS1.

LD Genomes Were Detected in Each Lineage.

To investigate the presence of LD genomes in the other lineages, viral RNA from P10 of each lineage was subjected to RT-PCR using primer pairs [1–9578] and [6592–13915] (SI Appendix, Fig. S6 A and B, respectively). Each primer pair yielded one or more amplicons indicative of one or more LD genomes from each lineage.

Primer pair [1–9578] yielded amplicons of ∼4 to 5 kb, depending on the lineage (SI Appendix, Fig. S6A), which were much smaller than the full-length product of 9.5 kb generated from unpassaged Min B (Fig. 2B). Amplicons were gel-purified from each lineage and sequenced completely by Sanger sequencing. Lineages #1, #4, and Ct#2 yielded a major amplicon of 4.5 kb containing a deletion from nucleotides 138 to 5,196 (total of 4,947 nt) (SI Appendix, Fig. S6C). Lineage #2 yielded several nonabundant amplicons of 4 to 5 kb: a 4-kb band contained a deletion from nucleotide 56 to nucleotide 5,659 (total of 5,492 nt) (SI Appendix, Fig. S6D). This deletion was identical to that detected in LD RNA#2 from lineage #5 (Fig. 2F). This LD genome also contained an insert of 7 nt (TGAATGA) located 5 nt upstream of the M2 GS signal. Lineage #3 yielded a 4.5-kb amplicon with a deletion from nucleotides 213 to 5,291 (total of 4,967 nt) (SI Appendix, Fig. S6E) that was identical to that identified in LD RNA#1-M2[6m] from lineage #5 (Fig. 2E), except that it did not contain the M2[6m] mutations. Lineage #5 yielded a 4.5-kb amplicon that was previously shown in Fig. 2E. Lineage Ct#1 yielded a 4.8-kb product with a deletion from nucleotides 495 to 5,289 (total of 4,683 nt) (SI Appendix, Fig. S6F). Three of these four deletions (SI Appendix, Fig. S6 C, E, and F) fused the ORFs of the NS1 and G genes, so that the first gene in each genome was a short chimeric NS1/G gene that was followed by the complete CPD F gene. The remaining deletion moved the complete CPD F gene into the first gene position (SI Appendix, Fig. S6D).

Primer pair [6592–13915] yielded primarily a 2.8-kb amplicon instead of the expected 7.3-kb amplicon representing an intact genome, except in the case of lineage #4 in which the larger product was predominant and the smaller product was less abundant (SI Appendix, Fig. S6B). Sanger sequencing of the gel-purified smaller amplicon from each lineage showed that it contained a 4,736-nt deletion from nucleotide 7,596 to nucleotide 12,331 that was identical to that previously identified in LD RNA#2 (Fig. 2F) and LD RNA#3 (Fig. 2G) of lineage #5.
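
The deletion coordinates and lengths reported above can be cross-checked with simple arithmetic. The following is a minimal sketch, using only the numbers quoted in the text, that compares each inclusive coordinate span with the reported deletion length; the labels and helper names are illustrative and not taken from the study. For the five deletions in the left-hand half of the genome, the coordinate span exceeds the reported length by the same 112 nt, whereas the two values agree for the M2/L deletion. One possible reading, which is an assumption and is not stated in this section, is that nucleotide positions follow a reference numbering that is 112 nt longer than the Min B backbone within the region shared by the left-hand deletions.

```python
# Deletions as reported in the text:
# (label, first deleted nt, last deleted nt, reported length in nt).
DELETIONS = [
    ("LD RNA#1 (lineages #3 and #5)", 213, 5291, 4967),
    ("lineages #1, #4, and Ct#2", 138, 5196, 4947),
    ("LD RNA#2 and lineage #2", 56, 5659, 5492),
    ("LD RNA#3", 361, 5003, 4531),
    ("lineage Ct#1", 495, 5289, 4683),
    ("M2/L deletion (all lineages)", 7596, 12331, 4736),
]


def coordinate_span(first: int, last: int) -> int:
    """Number of positions from first to last, both inclusive."""
    return last - first + 1


def offsets(deletions):
    """Map each label to (inclusive coordinate span - reported length)."""
    return {
        label: coordinate_span(first, last) - reported
        for label, first, last, reported in deletions
    }


if __name__ == "__main__":
    for label, diff in offsets(DELETIONS).items():
        print(f"{label}: span exceeds reported length by {diff} nt")
```

A constant difference across independently reported deletions suggests a numbering convention rather than a transcription error in any single value.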

The LD Min B Genomes Are Functional and Efficiently Express the F Protein.

Each of the LD genomes described above had sustained a deletion that moved the intact CPD F ORF into the first or second gene position, which would be predicted to increase the expression of F protein. To investigate this, we used commercial DNA synthesis to produce cDNAs encoding positive-sense copies of three LD RNAs: 1) LD RNA#1-M2[6m] (the largest LD RNA, 10,144 nt in size) (Fig. 2E); 2) a version of LD RNA#1-M2[6m] without the six M2 mutations (called LD RNA#1); and 3) LD RNA#2 (4,884 nt) (Fig. 2F). Each cDNA was cloned into a plasmid backbone under the control of the promoter for T7 RNA polymerase, and was followed by a self-cleaving hepatitis delta ribozyme. T7 transcription would yield the positive-sense antigenome, as with constructs used to recover all of the recombinant RSVs in this study.
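
The prediction that a promoter-proximal position increases F expression can be illustrated with a toy model of the polar transcription gradient. The sketch below assumes that a fixed fraction of transcribing polymerases is lost at each upstream gene junction; the `attenuation` value of 0.2, the function name, and the simplified gene orders for the LD genomes are hypothetical choices for illustration only and were not measured or used in this study.

```python
# RSV gene order, reading from the promoter (3') end of the genome.
WT_ORDER = ["NS1", "NS2", "N", "P", "M", "SH", "G", "F", "M2", "L"]
# Simplified gene orders implied by the deletions described for two LD genomes.
LD1_ORDER = ["NS1/G", "F", "M2", "L"]  # LD RNA#1: chimeric NS1/G gene, then F
LD2_ORDER = ["F", "L"]                 # LD RNA#2: F first, then the L remnant


def relative_expression(order, gene, attenuation=0.2):
    """Toy gradient: each upstream gene junction removes a fixed fraction
    (`attenuation`, a hypothetical value) of the transcribing polymerases."""
    if not 0.0 <= attenuation < 1.0:
        raise ValueError("attenuation must be in [0, 1)")
    upstream_genes = order.index(gene)  # 0 when the gene is first
    return (1.0 - attenuation) ** upstream_genes


if __name__ == "__main__":
    for name, order in [
        ("full-length", WT_ORDER),
        ("LD RNA#1", LD1_ORDER),
        ("LD RNA#2", LD2_ORDER),
    ]:
        level = relative_expression(order, "F")
        print(f"{name}: relative F transcription = {level:.2f}")
```

The model captures only gene position; it ignores genome copy number and the effects of M2-2, both of which the experiments below suggest are also relevant.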

The ability of these LD genomes to replicate and express F protein was tested in a transient reverse genetic system (Fig. 3). BSR T7/5 cells, which constitutively express the T7 RNA polymerase, were transfected with plasmids encoding the indicated LD RNA genomes or control full-length genomes, together with four T7 support plasmids encoding the N, P, M2-1, and L proteins necessary to reconstitute RSV genome replication and gene transcription.

Western blot analysis of F expression at 32 °C revealed substantial levels of expression from the LD genomes that peaked at 48 h posttransfection (hpt) (Fig. 3A), whereas there was little or no detectable expression from full-length WT, Min B, Min B-M2[6m], and WT-M2[6m] genomes (Fig. 3 A and B). Omission of the RSV L support plasmid ablated the expression of F protein (Fig. 3C), confirming that the RSV F protein was mostly expressed from mRNA generated by the RSV polymerase from the support plasmids rather than expressed from the positive-sense RNA generated from the antigenome plasmid by T7 polymerase. Data from six to seven independent experiments at 32 °C (Fig. 3D) showed that LD RNA#2 was the most efficient in expression of RSV F, and was approximately twofold more efficient than LD RNA#1-M2[6m]. LD RNA#1-M2[6m] in turn was three- to fourfold more efficient than LD RNA#1 that lacked the M2[6m] mutations but otherwise was identical, suggesting that these M2 mutations indeed increased LD genome replication and F expression. Comparable results were obtained when the transfected cells were incubated at 37 °C (Fig. 3E). Thus, the LD genomes indeed expressed increased levels of RSV F protein from the CPD F ORF.

The LD RNAs Complemented Min B Replication.

We investigated whether the LD RNAs increased replication and F protein expression by Min B in trans. To do so, we used reverse genetics to recreate mixed viral populations containing the most abundant full-length genome and various LD genomes detected during the passages of lineage #5. The most abundant full-length genome detected in P18 by SMRT sequencing (∼40% of reads) (SI Appendix, Fig. S4) contained the five nonsynonymous mutations NS2[Y88H], P[K27E], M[A45V], F[A74T], and F[H159Y] (SI Appendix, Fig. S4B, asterisk). We introduced these five mutations into the Min B antigenome cDNA, yielding Min B-[#5m-P18].

Next, we recreated (see SI Appendix, Supplemental Materials and Methods for details) six virus populations (Fig. 4A) containing this Min B-[#5m-P18] virus on its own or in combinations with one or two LD genomes from the three used in the minigenome experiments in the previous section, namely: 1) LD RNA#1-M2[6m], which was a major species in lineage #5; 2) LD RNA#1; and 3) LD RNA#2, which also was a major species in lineage #5. The presence of the appropriate genomes in these recreated virus populations was confirmed by RT-PCR (SI Appendix, Fig. S7). Then, we evaluated the replication in Vero cells of these recreated virus populations in comparison to several passages of lineage #5 (P4, P9, P12, and P15), as well as Min B and WT RSV, at both 37 °C (Fig. 4B) and 32 °C (SI Appendix, Fig. S8) in a multicycle replication experiment.

Fig. 4.

The LD Min B genomes complemented Min B replication in trans. Evaluation of RSV F expression and replication of recreated virus populations containing combinations of Min B–derived genomes identified in lineage #5. (A) Six virus populations were recreated by reverse genetics to contain Min B-[#5m-P18] (i.e., Min B bearing the five [#5m-P18] mutations) alone (population “a,” labeled at the left) or combined with one or two LD RNAs, as shown. (B) Multicycle replication of the recreated virus populations compared to P4, P9, P12, and P15 of lineage #5. Min B and WT RSV were included as controls. Vero cells in six-well plates were infected in duplicate at an MOI of 0.01 pfu per cell with the indicated viruses and incubated at 37 °C (B) or 32 °C (SI Appendix, Fig. S8). From days 0 to 7, viruses were collected by scraping infected cells into media followed by vortexing for 30 s, and clarification of the supernatant by centrifugation. Virus inocula and the daily aliquots were snap frozen and stored at −80 °C. Virus titers were determined by plaque assay at 32 °C. Note that Min B replication was assessed only from one well and every other day due to the low titer of Min B virus stock. The virus populations are listed in the legend in the top-to-bottom order in which they appear in the graph. (C) Representative photomicrographs (100× magnification) obtained from the experiment described in B at day 3 postinfection showing the presence of syncytia in the wells infected with lineage #5, P15 (Upper, Right) and recreated Min B[#5m-P18] + LD RNA #2 (Lower, Right), but not with Min B (Upper, Left) or Min B[#5m-P18] without an LD RNA (Lower, Left). Representative syncytia are indicated by red arrows. Images for the other infections are shown in SI Appendix, Fig. S9A. (D) Plaque characterization. 
Vero cells in 24-well plates were infected with 20 to 30 pfu per well of the indicated viruses, incubated under methylcellulose for 12 d at 32 °C, immunostained with RSV F mAbs, and analyzed as described in Fig. 1E. Plaque size, F protein MFI, and total F expression were measured for 1,061 plaques on average per virus. The three parameters were compared between virus preparations for statistical significance using the ANOVA test (*P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, ****P ≤ 0.0001). Data for the other viruses are shown in SI Appendix, Fig. S9B.

At 37 °C, Min B (green triangle in Fig. 4B) exhibited little or no replication, as expected. The titers of lineage #5 P4, P9, P12, and P15 (red, violet, yellow, and blue squares, respectively, in Fig. 4B) increased with increased passage number, from 10^4 to 10^6.6 pfu/mL at day 4 postinfection. For comparison, WT RSV (black circle in Fig. 4B) replicated to 10^7.1 pfu/mL, as typically observed.

Min B-[#5m-P18] alone (white triangle in Fig. 4B) replicated substantially more efficiently than Min B and also somewhat more than lineage #5 P4 (red square in Fig. 4B), and its titer reached 10^4.6 pfu/mL at day 4. This suggested that the five introduced mutations improved virus growth modestly. This effect was greater at 37 °C (Fig. 4B) than 32 °C (SI Appendix, Fig. S8), suggestive of a partial reversal of the temperature-sensitive phenotype. However, the highest titer for Min B-[#5m-P18] was 100-fold below that of the P15 virus.

Min B-[#5m-P18] + LD RNA#1 (blue circle in Fig. 4B) and Min B-[#5m-P18] + LD RNA#1-M2[6m] (orange circle in Fig. 4B) exhibited an approximately fivefold increase in virus replication (10^5.3 and 10^5.4 pfu/mL, respectively) compared to Min B-[#5m-P18] alone (white triangle in Fig. 4B), and were similar to lineage #5 P9 (violet square in Fig. 4B). Thus, these larger LD RNAs made a further contribution to improving Min B replication. However, the presence of the M2[6m] mutations in the LD RNA made no difference.

Min B-[#5m-P18] + LD RNA#2 (red circle in Fig. 4B) (10^6.2 pfu/mL), Min B-[#5m-P18] + LD RNA#1 + LD RNA#2 (green circle in Fig. 4B) (10^6.3 pfu/mL), and Min B-[#5m-P18] + LD RNA#1-M2[6m] + LD RNA#2 (brown circle, 10^6.2 pfu/mL) were indistinguishable in replication and similar to lineage #5 P12 and P15 (yellow and blue squares in Fig. 4B) (10^6.0 and 10^6.6 pfu/mL, respectively). Thus, the presence of the shortest LD genome (LD RNA#2), in combination with the presence of the five mutations [#5m-P18] in Min B, restored Min B replication substantially. The further addition of LD RNA#1-M2[6m] or LD RNA#1 to LD RNA#2 did not make a further contribution, nor did the presence of the M2[6m] mutations in the LD genome. Comparable results were obtained when the experiment was performed at 32 °C, although the differences were smaller (SI Appendix, Fig. S8).
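
Because the titers above are reported as powers of ten, fold differences follow directly from differences in the exponents. The sketch below tabulates the day-4 titers at 37 °C quoted in the text (as log10 pfu/mL) and computes each one relative to Min B-[#5m-P18] alone; the dictionary keys and the function name are illustrative, and no values beyond those quoted in the text are used.

```python
# Day-4 titers at 37 degrees C, as log10(pfu/mL), taken from the text.
LOG10_TITERS = {
    "Min B-[#5m-P18] alone": 4.6,
    "+ LD RNA#1": 5.3,
    "+ LD RNA#1-M2[6m]": 5.4,
    "+ LD RNA#2": 6.2,
    "+ LD RNA#1 + LD RNA#2": 6.3,
    "+ LD RNA#1-M2[6m] + LD RNA#2": 6.2,
    "lineage #5 P15": 6.6,
    "WT RSV": 7.1,
}


def fold_difference(log10_a: float, log10_b: float) -> float:
    """Fold difference between two titers given as log10 values."""
    return 10 ** (log10_a - log10_b)


if __name__ == "__main__":
    baseline = LOG10_TITERS["Min B-[#5m-P18] alone"]
    for name, log10_titer in LOG10_TITERS.items():
        fold = fold_difference(log10_titer, baseline)
        print(f"{name}: {fold:.1f}-fold relative to Min B-[#5m-P18] alone")
```

Small differences in the exponent correspond to modest fold changes, so comparisons between populations that differ by 0.1 log10 units should be interpreted cautiously.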

We also evaluated syncytium formation in the same multicycle replication experiment (Fig. 4C and SI Appendix, Fig. S9A). Min B and Min B-[#5m-P18] did not induce any cytopathic effect when inspected on day 3 (Fig. 4 C, Left, Upper and Lower, respectively), but Min B-[#5m-P18] + LD RNA#2 restored syncytium formation to the level of lineage #5 P15 (Fig. 4 C, Right, Lower and Upper, respectively). Complementation of Min B-[#5m-P18] with LD RNA#1 or LD RNA#1-M2[6m] restored a lower level of syncytium formation (SI Appendix, Fig. S9A).

We analyzed plaque phenotypes (Fig. 4D and SI Appendix, Fig. S9B). As already noted, the plaque size exhibited by lineage #5 P15 was increased compared to Min B (Fig. 4 D, Top, Left). The median level of F expression and total F expression of those plaques were also significantly increased (Fig. 4 D, Top, Center and Right, respectively). Min B-[#5m-P18] alone exhibited a modest nonsignificant increase of the plaque size and F expression compared to Min B (Fig. 4 D, Center). Complementation of Min B-[#5m-P18] with the small LD RNA#2 was sufficient to induce a plaque size and level of F expression that was comparable to lineage #5 P15 (Fig. 4D, compare Top and Bottom). Complementation of Min B-[#5m-P18] with the larger LD RNA#1 or LD RNA#1-M2[6m] induced a modest but significant increase in plaque size and total level of F expression (SI Appendix, Fig. S9B). Thus, the smallest of the LD genomes, which effectively expressed RSV F from the first promoter-proximal position, had the strongest compensatory effect on Min B.

The LD Min B Genomes Were Also Detected from a Newly Rescued Min B-[#5m-P18] Virus.

We also investigated the ability of newly rescued Min B-[#5m-P18] to generate LD RNAs. The P2 stock of Min B-[#5m-P18] virus prepared in the previous section was passaged once more on Vero cells at 32 °C using an MOI of 1 pfu per cell, and the viral RNA was isolated from the harvested virus and subjected to RT-PCR (SI Appendix, Fig. S10). Gel-band purification and Sanger sequencing revealed that Min B-[#5m-P18] generated an LD RNA, named LD RNA#4, that was 4,795-nt long and contained two large deletions of 5,230 nt and 5,055 nt. This LD RNA contained a truncated NS1 gene fused to the four last nucleotides of the G end sequence, followed by the intact CPD F gene, followed by a truncated L gene. The NS1 sequence also contained a dinucleotide “AA” insertion at nucleotide 177 resulting in the introduction of a stop codon, and the remnant of the L gene contained three small deletions of 1, 28, and 4 nt (SI Appendix, Fig. S10). This showed that LD genomes with the CPD F ORF were generated very quickly from newly transfected Min B-[#5m-P18], that relocation of the F gene to the left-hand end of the genome was common, that the presence of the [#5m-P18] mutations did not remove the selective pressure to generate these LD genomes, and that additional mutations were selected that would reduce expression from an upstream ORF.
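
The reported length of LD RNA#4 can be checked against its listed deletions and insertion. The sketch below is a bookkeeping exercise under two stated assumptions: that the full-length genome is 15,111 nt, a value inferred here by adding the 10,144-nt length of LD RNA#1-M2[6m] to its single 4,967-nt deletion, and that LD RNA#4 carries no insertions or deletions other than those listed in the text. The function and variable names are illustrative.

```python
# Values quoted in the text for LD RNA#1-M2[6m].
LD1_LENGTH = 10_144
LD1_DELETION = 4_967
# Assumption: full-length genome size inferred from LD RNA#1-M2[6m].
GENOME_LENGTH = LD1_LENGTH + LD1_DELETION  # 15,111 nt


def ld_length(genome_length, deletions, insertions=()):
    """Length of a genome after removing deletions and adding insertions."""
    return genome_length - sum(deletions) + sum(insertions)


# LD RNA#4: two large deletions, three small deletions in the L remnant,
# and a dinucleotide "AA" insertion in NS1, all as listed in the text.
LD4_DELETIONS = [5_230, 5_055, 1, 28, 4]
LD4_INSERTIONS = [2]

if __name__ == "__main__":
    predicted = ld_length(GENOME_LENGTH, LD4_DELETIONS, LD4_INSERTIONS)
    print(f"Predicted LD RNA#4 length: {predicted} nt (reported: 4,795 nt)")
```

Under these assumptions the predicted and reported lengths agree, which supports the internal consistency of the listed changes.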

Discussion

CPD is being used to generate new live attenuated vaccine candidates for a number of viruses, including RSV. In the present study, we took advantage of the temperature sensitivity of RSV Min B, bearing CPD G and F ORFs, to evaluate its genetic stability in a temperature stress test. We previously showed that Min B was moderately temperature sensitive and was strongly restricted for replication in vitro even at the typically permissive temperature of 32 °C (12). The replication efficiency of Min B increased during passage, similar to our previous studies of Min L (14). Min B from the different parallel lineages, including the two control lineages, showed compensatory increases in RSV F expression and syncytium formation associated with the increased replication. Full-genome deep sequencing of the lineages at different passage levels identified numerous point mutations that, surprisingly, were mostly nonsynonymous, were mostly outside the CPD G and F ORFs, and mostly differed between lineages. When prominent mutations were introduced into Min B, their effect on virus replication and F expression was minimal. Thus, these point mutations did not appear to account for most of the increase in RSV F expression and viral replication during serial passage.

Unexpectedly, long-range PCR amplification and long-read deep sequencing revealed genomes containing very large internal deletions of 4 to 5 kb or more (LD genomes) that appeared early and amplified robustly during serial passage. Each lineage contained one or more LD genomes as predominant species. Coexpression of several synthetic LD genomes with Min B-[#5m-P18] virus confirmed that they complemented and rescued F protein expression and replication to levels comparable to later passages of lineage #5. Thus, trans-complementation by LD genomes appeared to be the major contributor to rescuing Min B.

Sequence analysis of a number of these LD genomes showed that, in each case, they contained the intact CPD F gene as the first gene, or as the second gene following a truncated chimeric gene composed of NS1 and G gene sequences. When a promoter-proximal NS1/G gene occurred, it frequently acquired point mutations creating stop codons. This suggests that the LD genomes were under continuous selective pressure for increased expression of the CPD F gene and deletion or reduced expression of any upstream ORF. The proximity of the CPD F ORF to the promoter, together with the deletion/silencing of any upstream ORF and the efficient replication of these LD genomes, resulted in efficient expression of the RSV F protein that complemented and rescued Min B replication in trans.

We did not identify any LD RNAs with the intact CPD G ORF relocated to a promoter-proximal position. This likely is because RSV G is not required for efficient RSV replication in Vero cells (17). However, the G protein usually is required for efficient replication in cells other than Vero, and in particular in cells expressing the CX3CR1 receptor to which the G protein binds. In those cells, CPD G likely also would contribute to Min B attenuation, which likely could be complemented by LD genomes in which the adjoining G and F genes were relocated en bloc to be promoter-proximal.

We previously showed that Min B produced virus particles in similar abundance to that of WT RSV but with infectivity reduced by ∼13,900-fold (12). We hypothesize that, when an LD genome coinfects with Min B (or is generated de novo), it is efficiently amplified due to its shorter size, and provides increased expression of F for incorporation into Min B particles, increasing their infectivity. Because of this increased infectivity, these genomes would efficiently spread through the culture and become predominant. RSV defective-interfering (DI) genomes are packaged in particles that are “much smaller and less dense” than those containing predominantly full-length genomes (18), and presumably the same is true for LD genomes. Full-size RSV virions are pleomorphic and the majority contain RNA equivalent to more than one, and up to nine, full-length genomes (19). This raises the possibility that LD genomes also might be packaged into full-size virions together with full-size genomes, particularly given their much greater relative abundance. Consistent with this, preparations of Sendai virus were shown to contain small particles bearing apparent DI genomes and larger particles containing up to six genomes and including apparent DI genomes (20). The ability of a single particle to deliver both full-length and LD genomes to individual cells would increase the efficiency of complementation. The increased expression of F also increased syncytium formation, which likely increases the number of cells containing both Min B and LD genomes.

Although all of the LD genomes that became predominant contained the CPD F gene in the first or second gene position, we hypothesize that these were selectively amplified from a very diverse background of LD genomes with deletions throughout the genome. This presumably is a general feature of RSV replication. From this diverse background population, those LD genomes that could complement Min B became predominant as described above. The rapid emergence of LD genomes suggests that their generation and selection were very robust. All lineages contained one particular 4.7-kb deletion involving M2 and part of L (SI Appendix, Fig. S6B), suggesting that LD genomes containing this deletion may have been present in the inoculum at a low frequency, and thus emerged quickly after the virus was made by reverse genetics.

The large internal deletions in the LD genomes presumably occurred by polymerase jumping, as with DI genomes of nonsegmented negative-strand viruses in general. Specifically, polymerase complexes that initiate the synthesis of antigenomes from the 3′ end of genomes, or the synthesis of progeny genomes from the 3′ end of antigenomes, can disassociate from the template and reinitiate synthesis at a different location on the same template or a different template, resulting in the loss of any intervening sequence (21). This process of intra- or intermolecular recombination results in the generation of genomes that bear one or more internal deletions, but retain conserved genome termini that allow for amplification and incorporation into nucleocapsids and virus particles.
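
The polymerase-jumping mechanism described above can be sketched as a toy string operation. The example below is a deliberately simplified model: it treats the template as a plain string, ignores strand complementarity and encapsidation, and uses a made-up template in which `LEADER` and `TRAILER` stand in for the conserved genome termini. The function name and its parameters are hypothetical.

```python
def copy_with_jump(template: str, detach_after: int, resume_at: int) -> str:
    """Copy positions 1..detach_after, then resume_at..end (1-based, inclusive).

    Positions detach_after+1 .. resume_at-1 are skipped, which models the
    internal deletion left when the polymerase jumps along the same template.
    """
    if not 1 <= detach_after < resume_at <= len(template):
        raise ValueError("need 1 <= detach_after < resume_at <= len(template)")
    return template[:detach_after] + template[resume_at - 1:]


if __name__ == "__main__":
    # Made-up template: termini flanking three placeholder internal regions.
    template = "LEADER" + "aaaa" + "bbbb" + "cccc" + "TRAILER"
    product = copy_with_jump(template, detach_after=10, resume_at=15)
    print(product)                       # internal 'bbbb' region is lost
    print(len(template) - len(product))  # number of positions deleted
```

In this model both termini are retained regardless of where the jump occurs, which mirrors the requirement that deleted genomes keep the terminal sequences needed for amplification and packaging.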

The RSV LD genomes in the present study quickly became predominant while the full-length genomes concurrently became reduced. As with DI genomes in general, interference with the replication of full-length genomes likely involves competition for viral proteins involved in viral transcription and RNA replication. DI genomes also can increase the activation of host innate immunity and thereby reduce infection (22–24). In the case of the RSV LD genomes, complementation of Min B infectivity by the expression of F protein evidently outweighed inhibition of Min B replication. However, the level of virus replication of Min B did not reach that of WT RSV, which probably is due to interference with RNA replication and the less-efficient nature of trans-complementation.

If the generation of a broad background of internal-deletion genomes is a fundamental property of RSV RNA replication, it presumably also occurs in vivo. If an LD genome has a selective advantage, such as by complementing Min B, it presumably would amplify. This would depend upon full-length and LD genomes being delivered to the same cell. If they are present in separate particles, they would have to coinfect. The efficiency of coinfection would depend on local viral titers (the MOI during the serial passages was ∼1.0 pfu per cell or less), and coinfection would be less likely in semipermissive experimental animals than in humans. Delivery of LD and full-length genomes to the same cell would be facilitated if they are packaged in the same particle.

For most of the LD genomes that we sequenced, we found no obvious evidence of homologous recombination: that is, where recombination was directed by base-pairing between two molecules. However, we identified two LD Min B–derived genomes (LD RNA#2 in Fig. 2F and the LD genome from lineage #2 in SI Appendix, Fig. S6D) in which the CPD F ORF was perfectly positioned at the first promoter-proximal position, precisely replacing the NS1 gene. Since the NS1 and F GS signal sequences are identical, it might be that this specific deletion was the product of an intra- or intermolecular base-pairing between these two homologous GS sequences. In a previous study, we found suggestive evidence of both homologous and nonhomologous recombination by RSV (25).

The overlapping M2-1 and M2-2 ORFs were frequent targets for point mutations and deletions during the temperature stress test. The six M2[6m] mutations in lineage #5, and 21 other mutations in the M2-2 and L genes in lineage #1, all involved T to C substitutions (SI Appendix, Table S2), suggesting that cellular deaminases might have introduced these mutations. As noted, deletion of M2-2 results in a global increase in mRNA and protein expression and decreased genomic RNA replication (15). In LD RNA#1-M2[6m] from lineage #5, the six M2[6m] mutations in the M2-1/M2-2 overlapping region (which include three nonsynonymous mutations in M2-2) had a substantial effect on restoring RSV F expression, presumably through reducing the expression and function of M2-2, and thereby increasing expression of the remaining RSV genes, including F. Thus, one effect of the M2 mutations was inactivation of M2-2 to achieve increased gene expression including the F gene. A second effect may have been to counteract up-regulated expression of M2-2 resulting from being relocated closer to the promoter by gene deletions. Specifically, the repositioning of the CPD F gene to the first or second promoter-proximal position resulted in the M2 gene—which immediately follows F in the RSV gene order—being repositioned to the second or third position. Moving the M2-2 ORF to a more promoter-proximal location was previously shown to inhibit RSV replication due to increased expression of M2-2 protein (26). Therefore, deletion of the M2 gene or the introduction of mutations that inhibited M2-2 expression and functions from the LD genomes was likely the result of the selective pressure to reverse overexpression of M2-2.

In conclusion, under selective pressure during cell culture passage, LD genomes were selected that rescued rather than inhibited the replication of a single-stranded negative-strand RNA virus attenuated by CPD. Such a mechanism of compensation was previously unknown for RNA viruses and suggests that the accumulation of LD genomes has to be carefully investigated during the generation and evaluation of live-attenuated vaccine candidates.

Materials and Methods

Details of materials and methods regarding the viruses and virus populations created by reverse genetics, the temperature stress of Min B in cell culture, and the virus characterization in cell culture or by immunoplaque assay can be found in SI Appendix. In addition, methods detailing the Ion Torrent whole-genome deep sequencing, the site-directed mutagenesis of virus-encoding cDNAs, the generation of long PCR amplicons, the PacBio long-read single-molecule deep sequencing, and the evaluation of viral protein expression by flow cytometry and Western blot can also be found in SI Appendix.

Acknowledgments

We thank Isabelle Zhou for her outstanding technical assistance. This research was supported by the Intramural Research Program of the National Institute of Allergy and Infectious Diseases, NIH (C.L.N., T.M., L.Y., P.L.C., and U.J.B.).

Footnotes

Competing interest statement: C.L.N., E.W., P.L.C., and U.J.B. are coinventors on a patent application for the development of respiratory syncytial virus vaccines by codon-pair deoptimization.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2020969118/-/DCSupplemental.

Data Availability

All study data are included in the article and SI Appendix.

References

  • 1. Coleman J. R., et al., Virus attenuation by genome-scale changes in codon pair bias. Science 320, 1784–1787 (2008).
  • 2. Martinez M. A., Jordan-Paiz A., Franco S., Nevot M., Synonymous virus genome recoding as a tool to impact viral fitness. Trends Microbiol. 24, 134–147 (2016).
  • 3. Le Nouën C., Collins P. L., Buchholz U. J., Attenuation of human respiratory viruses by synonymous genome recoding. Front. Immunol. 10, 1250 (2019).
  • 4. Mueller S., Papamichail D., Coleman J. R., Skiena S., Wimmer E., Reduction of the rate of poliovirus protein synthesis through large-scale codon deoptimization causes attenuation of viral virulence by lowering specific infectivity. J. Virol. 80, 9687–9696 (2006).
  • 5. Bull J. J., Molineux I. J., Wilke C. O., Slow fitness recovery in a codon-modified viral genome. Mol. Biol. Evol. 29, 2997–3004 (2012).
  • 6. Nougairede A., et al., Random codon re-encoding induces stable reduction of replicative fitness of Chikungunya virus in primate and mosquito cells. PLoS Pathog. 9, e1003172 (2013).
  • 7. Vabret N., et al., Large-scale nucleotide optimization of simian immunodeficiency virus reduces its capacity to stimulate type I interferon in vitro. J. Virol. 88, 4161–4172 (2014).
  • 8. Meng J., Lee S., Hotard A. L., Moore M. L., Refining the balance of attenuation and immunogenicity of respiratory syncytial virus by targeted codon deoptimization of virulence genes. mBio 5, e01704-14 (2014).
  • 9. Ni Y. Y., et al., Computer-aided codon-pairs deoptimization of the major envelope GP5 gene attenuates porcine reproductive and respiratory syndrome virus. Virology 450-451, 132–139 (2014).
  • 10. Cheng B. Y., Ortiz-Riaño E., Nogales A., de la Torre J. C., Martínez-Sobrido L., Development of live-attenuated arenavirus vaccines based on codon deoptimization. J. Virol. 89, 3523–3533 (2015).
  • 11. Collins P. L., Karron R. A., “Respiratory Syncytial virus and Metapneumovirus” in Fields Virology, Knipe D. M., Howley P. M., Eds. (Lippincott, Williams and Wilkins, Philadelphia, PA, 2013), vol. 1, pp. 1086–1123.
  • 12. Le Nouën C., et al., Attenuation of human respiratory syncytial virus by genome-scale codon-pair deoptimization. Proc. Natl. Acad. Sci. U.S.A. 111, 13169–13174 (2014).
  • 13. Stauft C. B., et al., Extensive recoding of dengue virus type 2 specifically reduces replication in primate cells without gain-of-function in Aedes aegypti mosquitoes. PLoS One 13, e0198303 (2018).
  • 14. Le Nouën C., et al., Genetic stability of genome-scale deoptimized RNA virus vaccine candidates under selective pressure. Proc. Natl. Acad. Sci. U.S.A. 114, E386–E395 (2017).
  • 15. Bermingham A., Collins P. L., The M2-2 protein of human respiratory syncytial virus is a regulatory factor involved in the balance between RNA replication and transcription. Proc. Natl. Acad. Sci. U.S.A. 96, 11259–11264 (1999).
  • 16. McFarland E. J., et al.; International Maternal Pediatric Adolescent AIDS Clinical Trials (IMPAACT) 2000 Study Team, Live-attenuated respiratory syncytial virus vaccine candidate with deletion of RNA synthesis regulatory protein M2-2 is highly immunogenic in children. J. Infect. Dis. 217, 1347–1355 (2018).
  • 17.Teng M. N., Whitehead S. S., Collins P. L., Contribution of the respiratory syncytial virus G glycoprotein and its secreted and membrane-bound forms to virus replication in vitro and in vivo. Virology 289, 283–296 (2001). [DOI] [PubMed] [Google Scholar]
  • 18.Sun Y., López C. B., Preparation of respiratory syncytial virus with high or low content of defective viral particles and their purification from viral stocks. Bio Protoc. 6, e1820 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Kiss G., et al., Structural analysis of respiratory syncytial virus reveals the position of M2-1 between the matrix protein and the ribonucleoprotein complex. J. Virol. 88, 7602–7617 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Loney C., Mottet-Osman G., Roux L., Bhella D., Paramyxovirus ultrastructure and genome packaging: Cryo-electron tomography of Sendai virus. J. Virol. 83, 8191–8197 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Simon-Loriere E., Holmes E. C., Why do RNA viruses recombine? Nat. Rev. Microbiol. 9, 617–626 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Sun Y., et al., Immunostimulatory defective viral genomes from respiratory syncytial virus promote a strong innate antiviral response during infection in mice and humans. PLoS Pathog. 11, e1005122 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Tapia K., et al., Defective viral genomes arising in vivo provide critical danger signals for the triggering of lung antiviral immunity. PLoS Pathog. 9, e1003703 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Vasilijevic J., et al., Reduced accumulation of defective viral genomes contributes to severe outcome in influenza virus infected patients. PLoS Pathog. 13, e1006650 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Spann K. M., Collins P. L., Teng M. N., Genetic recombination during coinfection of two mutants of human respiratory syncytial virus. J. Virol. 77, 11201–11211 (2003). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Cheng X., Park H., Zhou H., Jin H., Overexpression of the M2-2 protein of respiratory syncytial virus inhibits viral replication. J. Virol. 79, 13943–13952 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.McAuliffe J. M., et al., Codon substitution mutations at two positions in the L polymerase protein of human parainfluenza virus type 1 yield viruses with a spectrum of attenuation in vivo and increased phenotypic stability in vitro. J. Virol. 78, 2029–2036 (2004). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Murphy B. R., Sotnikov A. V., Lawrence L. A., Banks S. M., Prince G. A., Enhanced pulmonary histopathology is observed in cotton rats immunized with formalin-inactivated respiratory syncytial virus (RSV) or purified F glycoprotein and challenged with RSV 3-6 months after immunization. Vaccine 8, 497–502 (1990). [DOI] [PubMed] [Google Scholar]

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
