Abstract
SARS-CoV-2 Spike protein is critical for virus infection via engagement of ACE2, and amino acid variation in Spike is increasingly appreciated. Given both vaccines and therapeutics are designed around Wuhan-1 Spike, this raises the theoretical possibility of virus escape, particularly in immunocompromised individuals where prolonged viral replication occurs. Here we report chronic SARS-CoV-2 with reduced sensitivity to neutralising antibodies in an immune suppressed individual treated with convalescent plasma, generating whole genome ultradeep sequences by both short and long read technologies over 23 time points spanning 101 days. Although little change was observed in the overall viral population structure following two courses of remdesivir over the first 57 days, N501Y in Spike was transiently detected at day 55 and V157L in RdRp emerged. However, following convalescent plasma we observed large, dynamic virus population shifts, with the emergence of a dominant viral strain bearing D796H in S2 and ΔH69/ΔV70 in the S1 N-terminal domain (NTD) of the Spike protein. As passively transferred serum antibodies diminished, viruses with the escape genotype diminished in frequency, before returning during a final, unsuccessful course of convalescent plasma. In vitro, the Spike escape double mutant bearing ΔH69/ΔV70 and D796H conferred decreased sensitivity to convalescent plasma, whilst maintaining infectivity similar to wild type. D796H appeared to be the main contributor to decreased susceptibility, but incurred an infectivity defect. The ΔH69/ΔV70 single mutant had two-fold higher infectivity compared to wild type and appeared to compensate for the reduced infectivity of D796H. Consistent with the observed mutations being outside the RBD, monoclonal antibodies targeting the RBD were not impacted by either or both mutations, but a non-RBD-binding monoclonal antibody was less potent against ΔH69/ΔV70 and the double mutant.
These data reveal strong selection on SARS-CoV-2 during convalescent plasma therapy associated with emergence of viral variants with reduced susceptibility to neutralising antibodies.
Keywords: SARS-CoV-2, COVID-19, antibody escape, Convalescent plasma, neutralising antibodies, mutation, evasion, resistance, immune suppression
Introduction
SARS-CoV-2 is an RNA betacoronavirus, with closely related viruses identified in pangolins and bats1,2. RNA viruses have inherently higher rates of mutation than DNA viruses such as Herpesviridae3. However, amongst RNA viruses coronaviruses have a relatively modest mutation rate at around 23 nucleotide substitutions per year4, likely due to proof reading capability of coronavirus RNA dependent RNA polymerase5. The capacity for successful adaptation is exemplified by the Spike D614G mutation, that arose in China and rapidly spread worldwide6, now accounting for more than 90% of infections. The mutation appears to increase infectivity and transmissibility in animal models7. Although the SARS-CoV-2 Spike protein is critical for virus infection via engagement of ACE2, substantial Spike amino acid variation has been observed in circulating viruses8. Mutations in the receptor binding domain (RBD) of Spike are of particular concern because the RBD is an important target of neutralising antibodies and therapeutic monoclonal antibodies.
Deletions in the N-terminal domain (NTD) of Spike S1 have also been increasingly recognised, both within hosts9 and across individuals10. The evolutionary basis for the emergence of deletions is unclear at present but could be related to escape from immunity or to enhanced fitness/transmission. The most notable deletion in terms of frequency is ΔH69/ΔV70. This double deletion has been detected in multiple unrelated lineages, including the recent ‘Cluster 5’ mink related strain in the North Jutland region of Denmark (https://files.ssi.dk/Mink-cluster-5-short-report_AFO2). There it was associated with the RBD mutation Y453F in almost 200 individuals. Another European cluster in GISAID includes ΔH69/ΔV70 along with the RBD mutation N439K.
Although ΔH69/ΔV70 has been detected multiple times, within-host emergence remains undocumented and the reasons for its selection are unknown. Here we document real-time emergence of SARS-CoV-2 ΔH69/ΔV70 in combination with the S2 mutation D796H following convalescent plasma therapy in an immunocompromised human host, demonstrating selection and reduced phenotypic susceptibility of viruses bearing the selected mutations.
Results
Clinical case history of SARS-CoV-2 infection in setting of immune-compromised host
Clinical data are available from the corresponding author.
Given the history of B cell depletion therapy and hypogammaglobulinemia we measured serum SARS-CoV-2 specific antibodies over the course of the admission. Total serum antibodies to SARS-CoV-2 were tested at days 44 and 50 by S protein immunoassay (Siemens). Results were negative. Three units (200mL each) of convalescent plasma (CP) from three independent donors were obtained through the NHS Blood and Transplant Clearance Registry and administered on compassionate named patient basis. These had been assayed for SARS-CoV-2 IgG antibody titres (Supplementary figure 4). Patient serum was subsequently positive for SARS-CoV-2 specific antibodies by S protein immunoassay (Siemens) in the hospital diagnostic laboratory on days 68, 90 and 101.
Virus genomic comparative analysis of 23 sequential respiratory samples over 101 days
The majority of samples were respiratory samples from nose and throat or endotracheal aspirates during the period of intubation. Ct values ranged from 16 to 34 and all 23 respiratory samples were successfully sequenced by a standard single molecule sequencing approach as per the ARTIC protocol implemented by COG-UK; of these, 20 additionally underwent short-read deep sequencing using the Illumina platform (Supplementary Table 3). There was generally good agreement between the methods (Supplementary Figure 5, Supplementary Table 4). Additionally, single genome amplification and sequencing of Spike using extracted RNA from respiratory samples was used as an independent method to detect the mutations observed (Supplementary Table 5). Finally, we detected no evidence of recombination, based on two independent methods.
Maximum likelihood analysis of patient-derived whole genome consensus sequences demonstrated clustering with other local sequences from the same region (Figure 2A). The infecting strain was assigned to lineage 20B bearing the D614G Spike variant. Environmental sampling showed evidence of virus on surfaces such as the telephone and call bell. Sequencing of these surface viruses showed clustering with those derived from the respiratory tract (Figure 2B). All samples were consistent with having arisen from a single viral population. In our phylogenetic analysis, we included sequential sequences from three other local patients identified with persistent viral RNA shedding over a period of 4 weeks or more, as well as two long term immunosuppressed SARS-CoV-2 ‘shedders’ recently reported9,11 (Figure 2B and Supplementary Table 2). While the sequences from the three local patients as well as from Avanzato et al11 showed little divergence with no amino acid changes in Spike over time, the case patient showed significant diversification. The Choi et al9 report showed a similar degree of diversification to the case patient. Further investigation of the sequence data suggested the existence of an underlying structure to the viral population in our patient, with samples collected at days 93 and 95 being rooted within, but significantly divergent from, the original population (Figure 2B and 3). The relationship of the divergent samples to those at earlier time points argues against superinfection.
SARS-CoV-2 antibodies in serum, and convalescent plasma associated changes in viral diversity
We measured longitudinal serum and convalescent plasma SARS-CoV-2 IgG to SARS-CoV-2 trimeric Spike (S), Spike receptor binding domain (RBD) and Nucleocapsid protein (N) using a Luminex based assay. Levels of antibodies were very low at day 39, consistent with the standard lab based assay result above. Following CP1 at day 63 and CP2 at day 65, antibody levels were significantly increased against all three protein targets. As expected antibody titres fell over time before increasing after the third unit of CP on day 93.
All samples tested positive by RT-PCR and there was no sustained change in Ct values throughout the 101 days following the first two courses of remdesivir (days 41 and 54), or the first two units of convalescent plasma (days 63 and 65). Consensus sequences from short read deep sequence Illumina data revealed dynamic population changes after day 65, as shown by a highlighter plot (Figure 3). In addition, we were able to follow the dynamics of virus populations down to low frequencies during the entire period (Figure 4). Following remdesivir at day 41, the low frequency variant analysis allowed us to observe transient amino acid changes in populations at below 50% abundance in ORF1b, ORF3a and Spike, with a T39I mutation in ORF7a reaching 77% on day 45 (Figure 4). At day 66 we noted that I513T in NSP2 and V157L in RdRp had emerged from undetectable at day 54 to 100% frequency (Figure 4, orange line), with the polymerase being the more plausible candidate for driving this sweep. Notably, Spike variant N501Y, which can increase the ACE2 receptor affinity12, and which is present in the new UK B1.1.7 lineage13, was observed on day 55 at 33% frequency, but was eliminated by the sweep of the NSP2/RdRp variant.
In contrast to the early period of infection, between days 66 and 82, following the first two administrations of convalescent sera, a dramatic shift in the virus population was observed, with a variant bearing D796H in S2 and ΔH69/ΔV70 in the S1 N-terminal domain (NTD) becoming the dominant population at day 82. This was identified in a nose and throat swab sample with high viral load as indicated by Ct of 23 (Figure 5). The deletion was not detected at any point prior to the day 82 sample, even as minority variants by short read deep sequencing.
On days 86 and 89, viruses obtained from upper respiratory tract samples were characterised by the Spike mutations Y200H and T240I, with the deletion/mutation pair observed on day 82 having fallen to frequencies of 10% or less (Figure 4 and 5). The Spike mutations Y200H and T240I were accompanied at high frequency by three other non-synonymous variants with similar allele frequencies, coding for I513T in NSP2, V157L in RdRp and N177S in NSP15 (Figure 4). Of these, the first two were previously observed at 100% frequency in the sample on day 66 (Figure 4, orange line), arguing that this new lineage emerged out of a previously existing population.
Sequencing of a nose and throat swab sample at day 93 identified viruses characterised by Spike mutations P330S at the edge of the RBD and W64G in S1 NTD at close to 100% abundance, with D796H along with ΔH69/ΔV70 at <1% abundance and the variants Y200H and T240I at frequencies of <2%. Viruses with the P330S variant were detected in two independent samples from different sampling sites, arguing against the possibility of contamination. The divergence of these samples from the remainder of the population (Figure 4, 5B and Supplementary Figure 6) suggests the possibility that they represent the emergence of a previously unobserved subpopulation.
Patterns in the variant frequencies suggest competition between virus populations carrying different mutations: viruses with the D796H + ΔH69/ΔV70 deletion/mutation pair rose to high frequency during CP therapy, and were then outcompeted by another population in the absence of therapy. Specifically, these data are consistent with a lineage of viruses with the NSP2 I513T and RdRp V157L variants, dominant on day 66, being outcompeted during therapy by the mutation/deletion variant. With the lapse in therapy, the original strain, having acquired the NSP15 N177S variant and the Spike mutations Y200H and T240I, regained dominance, followed by the emergence of a separate population with the W64G and P330S mutations.
In a final attempt to reduce the viral load, a third course of remdesivir (day 93) and third CP (day 95) were administered. We observed a re-emergence of the D796H + ΔH69/ΔV70 viral population. The inferred linkage of D796H and ΔH69/ΔV70 was maintained as evidenced by the highly similar frequencies of the two variants, suggesting that the third unit of CP led to the re-emergence of this population under renewed positive selection. In further support of our proposed idea of competition, noted above, frequencies of these two variants appeared to mirror changes in the NSP2 I513T mutation (Figure 4), suggesting these as markers of opposing clades in the viral population. Ct values remained low throughout this period with hyperinflammation, eventually leading to multi-organ failure and death at day 102. The repeated increase in frequency of the viral population with CP therapy strongly supports the hypothesis that the deletion/mutation combination conferred selective advantage.
Spike mutants emerging post convalescent plasma impair neutralising antibodies
Using lentiviral pseudotyping we generated wild type, ΔH69/ΔV70 + D796H and single mutant Spike proteins in enveloped virions in order to measure neutralisation activity of CP against these viruses (Figure 6). This system has been shown to give generally similar results to replication competent virus14,15. Spike protein from each mutant was detected in pelleted virions (Figure 6A). We also probed with an HIV-1 p24 antibody to monitor levels of lentiviral particle production. We then measured infectivity of the pseudoviruses, correcting for virus input, and found that ΔH69/ΔV70 appeared to have two-fold higher infectivity over a single round of infection compared to wild type (Figure 6B, supplementary figure 7). By contrast, the D796H single mutant had significantly lower infectivity as compared to wild type and the double mutant had similar infectivity to wild type (Figure 6B, supplementary figure 7).
We found that D796H alone and the D796H + ΔH69/ΔV70 double mutant were less sensitive to neutralisation by convalescent plasma samples (Figure 6C–E). By contrast the ΔH69/ΔV70 single mutant did not impact neutralisation. In addition, patient derived serum from days 64 and 66 (one day either side of CP2 infusion) similarly showed lower potency against the D796H + ΔH69/ΔV70 mutants (Figure 6F, G).
A panel of nineteen monoclonal antibodies (mAbs) isolated from three donors was previously identified to neutralize SARS-CoV-2. To establish if the mutations occurring in vivo (D796H and ΔH69/ΔV70) resulted in a global change in neutralization sensitivity we tested neutralising mAbs targeting the seven major epitope clusters previously described (excluding non-neutralising clusters II, V and small [n ≤ 2] neutralising clusters IV, X). The seven RBD-specific mAbs (Supplementary table 6) exhibited no major change in neutralisation potency, whereas the non-RBD specific COVA1–21 showed a 3–5 fold reduction in potency against ΔH69/ΔV70+D796H and ΔH69/ΔV70, but not D796H alone15 (Figure 7). For the RBD-specific mAbs we observed no differences in neutralisation between single/double mutants and wild type, suggesting that the mechanism of escape was likely outside these epitopes in the RBD. These data confirm the specificity of the findings from convalescent plasma and suggest that mutations observed are related to antibodies targeting regions outside the RBD.
In order to understand the mechanisms that might confer resistance to antibodies we examined a published Spike structure and annotated it with our residues of interest (Figure 8). This analysis showed that ΔH69/ΔV70 is in a disordered, glycosylated loop at the very tip of the NTD, and therefore could alter binding of antibodies. ΔH69/ΔV70 is close to the binding site of the polyclonal antibodies derived from COV57 plasma16,17. D796H is in an exposed loop in S2 (Figure 8) and appears to be in a region frequently targeted by antibodies18, despite mutations at position 796 being rare (Supplementary table 7).
Discussion
Here we have documented a repeated evolutionary response by SARS-CoV-2 in the presence of antibody therapy during the course of a persistent infection in an immunocompromised host. The observation of potential selection for specific variants coinciding with the presence of antibodies from convalescent plasma is supported by the experimental finding of two-fold reduced susceptibility of these viruses to plasma. Further, we were able to document real-time emergence of a variant ΔH69/ΔV70 in the NTD of Spike that has been increasing in frequency in Europe and is present in the new UK variant B1.1.719. In this case the emergence of the variant was not the primary reason for treatment failure. However, given that both vaccines and therapeutics are aimed at Spike, our study raises the possibility of virus evasion, particularly in immune suppressed individuals where prolonged viral replication occurs.
Our observations represent a very rare insight, made possible by poor T cell responses and a lack of antibodies in the individual, together with an intensive sampling course. Persistent viral replication and the failure of antiviral therapy allowed us to define the viral response to convalescent plasma. Our findings follow those of Choi et al9, who reported persistent infection in an immune suppressed individual; they noted significant virus evolution, including NTD deletions and RBD mutations in the absence of SARS-CoV-2 specific antibody therapy. A second paper reported asymptomatic long term shedding with four sequences over 105 days11 and demonstrated similarly dramatic shifts in genetic composition of the viral population without phenotypic impact. A common finding with our study was the very low neutralisation activity in serum post transfusion of CP, with waning over time as expected. Apart from the difference in the outcome of infection (severe, fatal disease versus asymptomatic disease and clearance), critically important differences in our study include: 1. The intensity of sampling and use of both long and short read sequencing to verify variant calls, thereby providing a unique scientific resource for longitudinal population genetic analysis. 2. The close alignment between the genetic composition of the viral population and CP administration, with an experimentally verified variant with reduced susceptibility emerging, falling to low frequency, and then rising again under CP selection. 3. Real time detection of emergence of a variant, ΔH69/ΔV70, that is increasing in frequency in Europe, and present in the new UK multiply mutated variant B1.1.7.
An interesting observation is that in the two cases of chronic infection highlighted, the Avanzato case where CP was used for asymptomatic shedding11 exhibited lower diversification of virus as compared to the fatal Choi et al case where CP was not used9. There are clearly a number of factors that could account for these differences, though this highlights the fact that use of CP does not necessarily lead to rapid adaptation. Intriguingly, deletions spanning amino acids 139–145 emerged in both these cases, in contrast to the ΔH69/ΔV70 observed in the present study. Both deletion of amino acid 144 and ΔH69/ΔV70 are observed in the UK lineage B1.1.719, and are therefore of concern.
We have noted in our analysis the potential influence of compartmentalised viral replication upon the sequences recovered in upper respiratory tract samples. Both population genetic and small animal studies have shown a lack of reassortment between influenza viruses within a single host during an infection, suggesting that acute respiratory viral infection may be characterised by spatially distinct viral populations20,21. In the analysis of data, it is important to distinguish genetic changes which occur in the primary viral population from apparent changes that arise from the stochastic observation of spatially distinct subpopulations in the host. While the samples we observe on days 93 and 95 of infection are genetically distinct from the others, the remaining samples are consistent with arising from a consistent viral population. We note that Choi et al reported the detection in post-mortem tissue of viral RNA not only in lung tissue, but also in the spleen, liver, and heart9. Mixing of virus from different compartments, for example via blood, or movement of secretions from lower to upper respiratory tract, could lead to fluctuations in viral populations at particular sampling sites. Experiments in animal models with sampling of different replication sites could allow a better understanding of SARS-CoV-2 population genetics and enable prediction of escape variants following antibody based therapies.
This is a single case report and therefore limited conclusions can be drawn about generalisability.
In addition to documenting the emergence of SARS-CoV-2 Spike ΔH69/ΔV70 in vivo, we show that this mutation increases infectivity of the Spike protein in a pseudotyping assay. The deletion was observed contemporaneously with the rare S2 mutation D796H after two separate courses of CP, with other viral populations emerging. D796H, but not ΔH69/ΔV70, conferred reduction in susceptibility to polyclonal antibodies in the units of CP administered, though we cannot speculate as to their individual impacts on sera from other individuals. Importantly, neither of these mutations alone or in combination affected susceptibility of virus to a set of RBD-targeting monoclonal antibodies. The reduced sensitivity of ΔH69/ΔV70 to a non RBD binding antibody may hint at a role in antigen recognition, for example to polyclonal sera from COV-57.
The effects of CP on virus evolution seen here are unlikely to apply in immune competent hosts where viral diversity is likely to be lower due to better immune control. Our data highlight that infection control measures may need to be tailored to the needs of immunocompromised patients, and also urge caution in the interpretation of CDC guidelines that recommend 20 days as the upper limit of infection prevention precautions in immune compromised patients who are afebrile22. Due to the difficulty with culturing clinical isolates, use of surrogates is warranted23. However, where detection of ongoing viral evolution is possible, this serves as a clear proxy for the existence of infectious virus. In our case we detected environmental contamination whilst the patient was in a single occupancy room, and the patient was moved to a negative-pressure high air-change infectious disease isolation room.
Clinical efficacy of convalescent plasma in severe COVID-19 has not been demonstrated24, and its use in different stages of infection and disease remains experimental; as such, we suggest that it should be reserved for use within clinical trials, with rigorous monitoring of clinical and virological parameters. The data from this single case report might warrant caution in use of convalescent plasma in patients with immune suppression of both T cell and B cell arms; in such cases, the antibodies administered have little support from cytotoxic T cells, thereby reducing chances of clearance and theoretically raising the potential for escape mutations. Whilst we await further data, where clinical trial enrolment is not possible, convalescent plasma administered for clinical need in immune suppression should ideally only be considered as part of observational studies, undertaken preferably in single occupancy rooms with enhanced infection control precautions, including SARS-CoV-2 environmental sampling and real-time sequencing. Understanding of viral dynamics and characterisation of viral evolution in response to different selection pressures in the immunocompromised host is necessary not only for improved patient management but also for public health benefit.
Ethics
The study was approved by the East of England - Cambridge Central Research Ethics Committee (17/EE/0025). Written informed consent was obtained from both the patient and family. Additional controls with COVID-19 were enrolled to the NIHR BioResource Centre Cambridge under ethics review board (17/EE/0025).
Methods
Clinical Sample Collection and Next generation sequencing
Serial samples were collected from the patient periodically from the lower respiratory tract (sputum or endotracheal aspirate), upper respiratory tract (throat and nasal swab), and from stool. Nucleic acid extraction was done from 500μl of sample with a dilution of MS2 bacteriophage to act as an internal control, using the easyMAG platform (Biomerieux, Marcy-l’Étoile) according to the manufacturer’s instructions. All samples were tested for presence of SARS-CoV-2 with a validated one-step RT q-PCR assay developed in conjunction with the Public Health England Clinical Microbiology25. Amplification reactions were all performed on a Rotorgene™ PCR instrument. Samples which generated a CT of ≤36 were considered to be positive.
Sera from recovered patients in the COVIDx study26 were used for testing of neutralisation activity by SARS-CoV-2 mutants.
SARS-CoV-2 serology by multiplex particle-based flow cytometry (Luminex):
Recombinant SARS-CoV-2 N, S and RBD were covalently coupled to distinct carboxylated bead sets (Luminex; Netherlands) to form a 3-plex and analyzed as previously described (Xiong et al. 2020). Specific binding was reported as mean fluorescence intensities (MFI).
Whole blood T cell and innate stimulation assay
Whole blood was diluted 1:5 in RPMI into 96-well F plates (Corning) and activated by single stimulation with phytohemagglutinin (PHA; 10 μg/ml; Sigma-Aldrich) or LPS (1 μg/ml; List Biochemicals), or by co-stimulating with anti-CD3 (MEM57, Abcam, 200 ng/ml) and IL-2 (Immunotools, 1430 U/ml). Supernatants were taken after 24 hours. Levels (pg/ml) are shown for IFNg, IL17, IL2, TNFa, IL6, IL1b and IL10. Cytokines were measured by multiplexed particle based flow cytometry on a Luminex analyzer (Bio-Plex, Bio-Rad, UK) using an R&D Systems custom kit (R&D Systems, UK).
For viral genomic sequencing, total RNA was extracted from samples as described. Samples were sequenced using MinION flow cells version 9.4.1 (Oxford Nanopore Technologies) following the ARTICnetwork V3 protocol (https://dx.doi.org/10.17504/protocols.io.bbmuik6w) and BAM files assembled using the ARTICnetwork assembly pipeline (https://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html). A representative set of 10 sequences were selected and also sequenced using the Illumina MiSeq platform. Amplicons were diluted to 2 ng/μl and 25 μl (50 ng) were used as input for each library preparation reaction. The library preparation used the KAPA Hyper Prep kit (Roche) according to the manufacturer’s instructions. Briefly, amplicons were end-repaired and had an A-overhang added; these were then ligated with 15mM of NEXTflex DNA Barcodes (Bio Scientific, Texas, USA). Post-ligation products were cleaned using AMPure beads and eluted in 25 μl. Then, 20 μl were used for library amplification by 5 cycles of PCR. For the negative controls, 1ng was used for ligation-based library preparation. All libraries were assayed using TapeStation (Agilent Technologies, California, USA) to assess fragment size and quantified by qPCR. All libraries were then pooled in equimolar amounts. Libraries were loaded at 15nM, spiked with 5% PhiX (Illumina, California, USA) and sequenced on one MiSeq 500-cycle run using a MiSeq Nano v2 kit with 2 × 250 bp paired-end sequencing. A minimum of ten reads were required for a variant call.
Bioinformatics Processes
For long-read sequencing, genomes were assembled with reference-based assembly and a curated bioinformatics pipeline with 20x minimum coverage across the whole genome27. For short-read sequencing, FASTQs were downloaded, poor-quality reads were identified and removed, and both Illumina and PhiX adapters were removed using TrimGalore v0.6.628. Trimmed paired-end reads were mapped to the National Center for Biotechnology Information SARS-CoV-2 reference sequence MN908947.3 using minimap2 (v2.17)29 with the arguments -ax sr. BAM files were then sorted and indexed with samtools v1.11, and PCR and optical duplicates were removed using Picard (http://broadinstitute.github.io/picard). Consensus sequences of nucleic acids with a minimum whole-genome coverage of at least 20× were generated with BCFtools using a 0% majority threshold.
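The consensus step can be illustrated with a minimal Python sketch. It is not the BCFtools implementation: it assumes that per-position base counts have already been extracted from the sorted, de-duplicated BAM file, and the function name, the masking of low-coverage positions with N, and the handling of the threshold parameter are all hypothetical choices made for illustration.

```python
MIN_COVERAGE = 20  # minimum depth quoted in the text


def call_consensus(pileup, min_coverage=MIN_COVERAGE, majority=0.0):
    """Build a consensus string from per-position base counts.

    pileup: list of dicts mapping a base ("A", "C", "G", "T") to its read count.
    Positions with depth below min_coverage are written as "N" (an assumption).
    The most frequent base is called if its fraction exceeds `majority`.
    """
    consensus = []
    for counts in pileup:
        depth = sum(counts.values())
        if depth < min_coverage:
            consensus.append("N")
            continue
        # Pick the base with the highest count at this position.
        base, n = max(counts.items(), key=lambda kv: kv[1])
        consensus.append(base if n / depth > majority else "N")
    return "".join(consensus)
```

Raising the hypothetical `majority` argument makes the call more conservative, so that mixed positions are masked rather than resolved to the most frequent base.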
Single Genome Amplification and sequencing
Viral RNA extracts were reverse transcribed from each sample to sufficiently capture the diversity of the viral population without introducing resampling bias. SuperScript IV (Thermofisher Scientific) and the gene specific primers were used for reverse transcription. Template RNA was degraded with RNAse H (Thermofisher Scientific). All primers used were ‘in-house’ primers designed using the multiple sequence alignment of the patient’s consensus NGS sequences. Partial Spike (amino acids 21–800) was amplified as one continuous length of DNA (Spike ~ 1.8 kb) by nested PCR. Terminally diluted cDNA was PCR-amplified using Platinum® Taq DNA Polymerase High Fidelity (Invitrogen, Carlsbad, CA) so that 30% of reactions were positive30. By Poisson statistics, sequences were deemed ≥80% likely to be derived from single genomes. We obtained between 20–60 single genomes at each sample time point to achieve 90% confidence of detecting variants present at ≥8% of the viral population in vivo31,32. Partial Spike amplicons obtained from terminal dilution PCR amplification were Sanger sequenced to form a contiguous sequence using another set of 8 in-house primers. Sanger sequencing was provided by Genewiz UK and manual sequence editing was performed using DNA Dynamo software (Blue Tractor Software Ltd, UK).
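The Poisson reasoning behind terminal dilution can be sketched as follows. This is an illustrative calculation rather than the authors' own code: it assumes that the number of templates per reaction is Poisson distributed and that sampled genomes are independent draws from the viral population; the function names are hypothetical.

```python
import math


def prob_single_template(fraction_positive):
    """P(exactly one template | reaction is positive), assuming Poisson loading."""
    # A reaction is negative when it receives zero templates: exp(-lam) = 1 - f.
    lam = -math.log(1.0 - fraction_positive)
    p_exactly_one = lam * math.exp(-lam)
    return p_exactly_one / fraction_positive


def detection_probability(n_genomes, variant_frequency):
    """Probability that at least one of n sampled genomes carries the variant."""
    return 1.0 - (1.0 - variant_frequency) ** n_genomes


def genomes_needed(variant_frequency, confidence):
    """Smallest number of genomes giving at least the requested detection probability."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - variant_frequency))


# With 30% of reactions positive, roughly 83% of positives hold a single template.
p_single = prob_single_template(0.30)
```

Under these assumptions, `detection_probability` and `genomes_needed` relate the number of single genomes sequenced at a time point to the lowest variant frequency that can be detected with a given confidence.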
Phylogenetic Analysis
All available full-genome SARS-CoV-2 sequences were downloaded from the GISAID database (http://gisaid.org/)33 on 16th December. Duplicate and low-quality sequences (>5% N regions) were removed, leaving a dataset of 212,297 sequences with a length of >29,000bp. All sequences were sorted by name and only sequences with United Kingdom / England identifiers were retained. From this dataset, sequences were de-duplicated and, where background sequences were required in figures, randomly subsampled using seqtk (https://github.com/lh3/seqtk). All sequences were aligned to the SARS-CoV-2 reference strain MN908947.3, using MAFFT v7.475 with automatic flavour selection34. Major SARS-CoV-2 clade memberships were assigned to all sequences using both the Nextclade server v0.9 (https://clades.nextstrain.org/) and Phylogenetic Assignment Of Named Global Outbreak Lineages (pangolin)35.
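The quality filtering and de-duplication described above can be sketched in Python. This is a minimal illustration under stated assumptions: sequences are held in memory as a name-to-sequence mapping, duplicates are defined as identical sequence strings, and the function names are hypothetical rather than taken from the pipeline actually used.

```python
MAX_N_FRACTION = 0.05   # sequences with >5% N are treated as low quality
MIN_LENGTH = 29_000     # retained sequences are longer than 29,000 bp


def passes_quality(seq, max_n_fraction=MAX_N_FRACTION, min_length=MIN_LENGTH):
    """Return True if the sequence is long enough and has few ambiguous bases."""
    seq = seq.upper()
    if len(seq) <= min_length:
        return False
    return seq.count("N") / len(seq) <= max_n_fraction


def filter_sequences(records):
    """Keep the first copy of each distinct sequence that passes the quality checks.

    records: dict mapping sequence name to nucleotide string.
    """
    seen = set()
    kept = {}
    for name, seq in records.items():
        key = seq.upper()
        if key in seen or not passes_quality(key):
            continue
        seen.add(key)
        kept[name] = seq
    return kept
```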
Maximum likelihood phylogenetic trees were produced using the above curated dataset using IQ-TREE v2.1.236. Evolutionary model selection for trees was inferred using ModelFinder37 and trees were estimated using the GTR+F+I model with 1000 ultrafast bootstrap replicates38. All trees were visualised with Figtree v.1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/), rooted on the SARS-CoV-2 reference sequence and nodes arranged in descending order. Nodes with bootstrap values of <50 were collapsed using an in-house script.
In-depth allele frequency variant calling
The SAMFIRE package39 was used to call allele frequency trajectories from BAM file data. Reads were included in this analysis if they had a median PHRED score of at least 30, trimming the ends of reads to achieve this if necessary. Nucleotides were then filtered to have a PHRED score of at least 30; reads with fewer than 30 such nucleotides were discarded. Distances between sequences, accounting for low-frequency variant information, were also calculated using SAMFIRE. The sequence distance metric, described in an earlier paper40, combines allele frequencies across the whole genome. Where L is the length of the genome, we define q(t) as a 4 × L element vector describing the frequencies of each of the nucleotides A, C, G, and T at each locus in the viral genome sampled at time t. For any given locus i in the genome we calculate the change in allele frequencies between the times t1 and t2 via a generalisation of the Hamming distance
$$d\big(q_i(t_1), q_i(t_2)\big) = \frac{1}{2} \sum_{a \in \{A,C,G,T\}} \left| q_i^a(t_1) - q_i^a(t_2) \right|$$
where q_i^a(t) is the frequency of nucleotide a at locus i at time t, and the vertical lines indicate the absolute value of the difference. These statistics were then combined across the genome to generate the pairwise sequence distance metric
$$D\big(q(t_1), q(t_2)\big) = \sum_{i} d\big(q_i(t_1), q_i(t_2)\big)$$
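The distance calculation can be illustrated with a short Python sketch. It assumes that the per-locus statistic is half the sum of absolute frequency differences over A, C, G and T, so that a complete substitution at one locus contributes a distance of 1, as in the ordinary Hamming distance. The data layout (one dictionary of nucleotide frequencies per locus) and the function names are assumptions for the example and do not reflect the SAMFIRE implementation.

```python
def locus_distance(q1, q2):
    """Half the summed absolute frequency change over A, C, G, T at one locus."""
    return 0.5 * sum(abs(q1.get(b, 0.0) - q2.get(b, 0.0)) for b in "ACGT")


def sequence_distance(q_t1, q_t2):
    """Sum the per-locus distances across the genome.

    Each argument is a list with one {nucleotide: frequency} dict per locus.
    """
    if len(q_t1) != len(q_t2):
        raise ValueError("both samples must cover the same loci")
    return sum(locus_distance(a, b) for a, b in zip(q_t1, q_t2))
```

Under this definition a fixed A to G change gives a distance of 1 at that locus, while a minor variant rising from 0% to 25% contributes 0.25, which is how low-frequency variant information enters the metric.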
The Mathematica software package was used to conduct a regression analysis of pairwise sequence distances against time, leading to an estimate of a mean rate of within-host sequence evolution. In contrast to the phylogenetic analysis, this approach assumed the samples collected on days 93 and 95 to arise via stochastic emission from a spatially separated subpopulation within the host, leading to a lower inferred rate of viral evolution for the bulk of the viral population.
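The regression itself was run in Mathematica and its exact specification is not given here. As a hedged illustration, an ordinary least-squares fit of pairwise distance against the time separating each pair of samples could look as follows in Python. The function names and the matrix layout are assumptions, and the choice of which samples to include (for example, how the day 93 and 95 samples are treated) is left to the caller.

```python
from itertools import combinations


def pairwise_points(times, dist):
    """Return (time gap, distance) pairs for every pair of samples.

    `times` holds sampling days; `dist[i][j]` is the distance between
    samples i and j.
    """
    return [(abs(times[j] - times[i]), dist[i][j])
            for i, j in combinations(range(len(times)), 2)]


def least_squares(points):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

The slope is then read as a mean rate of change in sequence distance per day; dividing by the genome length would express it per site.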
Western blot analysis
Forty-eight hours after transfection of cells with plasmid preparations, the culture supernatant was harvested and passed through a 0.45-μm-pore-size filter to remove cellular debris. The filtrate was centrifuged at 15,000 rpm for 120 min to pellet virions. The pelleted virions were lysed in Laemmli reducing buffer (1 M Tris-HCl [pH 6.8], SDS, 100% glycerol, β-mercaptoethanol, and bromophenol blue). Pelleted virions were subjected to electrophoresis on SDS–4 to 12% bis-Tris protein gels (Thermo Fisher Scientific) under reducing conditions. This was followed by electroblotting onto polyvinylidene difluoride (PVDF) membranes. The SARS-CoV-2 Spike proteins were visualized by a ChemiDoc® MP imaging system (Bio-Rad) using anti-Spike S2 (Invitrogen) and anti-p24 Gag antibodies.
Recombination Detection
All sequences were tested for potential recombination, as this would affect evolutionary estimates. Potential recombination events were explored with nine algorithms (RDP, MaxChi, SisScan, GeneConv, Bootscan, PhylPro, Chimera, LARD and 3SEQ), implemented in RDP5 with default settings41. To corroborate any findings, ClonalFrameML v1.1242 was also used to infer recombination breakpoints. Neither program indicated evidence of recombination in our data.
Structural Viewing
The PyMOL Molecular Graphics System v2.4.0 (https://github.com/schrodinger/pymol-open-source/releases) was used to map the location of the four spike mutations of interest onto a SARS-CoV-2 spike structure visualised by Wrobel et al (PDB: 6ZGE)43.
Testing of convalescent plasma for antibody titres
The Anti-SARS-CoV-2 ELISA (IgG) assay used to test CP for antibody titres was from Euroimmun Medizinische Labordiagnostika AG. This indirect ELISA-based assay uses a recombinant structural spike 1 (S1) protein of SARS-CoV-2 expressed in the human cell line HEK 293 for the detection of SARS-CoV-2 IgG.
Generation of Spike mutants
Amino acid substitutions were introduced into the D614G pCDNA_SARS-CoV-2_Spike plasmid as previously described44 using the QuikChange Lightning Site-Directed Mutagenesis kit, following the manufacturer’s instructions (Agilent Technologies, Inc., Santa Clara, CA).
Pseudotype virus preparation
Viral vectors were prepared by transfection of 293T cells using Fugene HD transfection reagent (Promega). 293T cells were transfected with a mixture of 11μl of Fugene HD, 1μg of pCDNAΔ19Spike-HA, 1μg of p8.91 HIV-1 gag-pol expression vector45,46, and 1.5μg of pCSFLW (expressing the firefly luciferase reporter gene with the HIV-1 packaging signal). Viral supernatant was collected at 48 and 72h after transfection, filtered through a 0.45μm filter and stored at −80°C. The 50% tissue culture infectious dose (TCID50) of SARS-CoV-2 pseudovirus was determined using the Steady-Glo Luciferase assay system (Promega).
Standardisation of virus input by SYBR Green-based product-enhanced PCR assay (SG-PERT)
The reverse transcriptase activity of virus preparations was determined by qPCR using a SYBR Green-based product-enhanced PCR assay (SG-PERT) as previously described47.
Briefly, 10-fold dilutions of virus supernatant were lysed in a 1:1 ratio in a 2x lysis solution (made up of 40% glycerol v/v, 0.25% Triton X-100 v/v, 100mM KCl, RNase inhibitor 0.8 U/ml, and Tris-HCl 100mM, buffered to pH7.4) for 10 minutes at room temperature.
12μl of each sample lysate was added to 13μl of a SYBR Green master mix (containing 0.5μM of MS2-RNA Fwd and Rev primers, 3.5pmol/ml of MS2-RNA, and 0.125U/μl of Ribolock RNAse inhibitor) and cycled in a QuantStudio. Relative amounts of reverse transcriptase activity were determined as the rate of transcription of bacteriophage MS2 RNA, with absolute RT activity calculated by comparing the relative amounts of RT to an RT standard of known activity.
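The conversion from a qPCR readout to absolute RT activity can be sketched as a standard-curve interpolation. The example below assumes a conventional log-linear relationship between threshold cycle (Ct) and the activity of a dilution series of the RT standard. The use of Ct values, the function names and the numbers in the usage note are assumptions for illustration and are not taken from the SG-PERT protocol as run in this study.

```python
import math


def fit_standard_curve(activities, cts):
    """Fit Ct = slope * log10(activity) + intercept by least squares."""
    xs = [math.log10(a) for a in activities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rt_activity(ct, slope, intercept, dilution_factor=1.0):
    """Interpolate RT activity from a Ct value and undo the sample dilution.

    The result is in the same units as the standard used for the curve.
    """
    return 10 ** ((ct - intercept) / slope) * dilution_factor
```

For example, with a hypothetical standard series fitted by `fit_standard_curve([1e3, 1e4, 1e5], [21.0, 18.0, 15.0])`, a sample measured at a 10-fold dilution would be converted with `rt_activity(ct, slope, intercept, dilution_factor=10)`, and virus input could then be normalised to equal RT activity across preparations.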
Serum/plasma pseudotype neutralization assay
Spike pseudotype assays have been shown to have similar characteristics to neutralisation testing using fully infectious wild type SARS-CoV-214. Virus neutralisation assays were performed on 293T cells transiently transfected with ACE2 and TMPRSS2 using SARS-CoV-2 Spike pseudotyped virus expressing luciferase48. Pseudotyped virus was incubated with serial dilutions of heat inactivated human serum samples or convalescent plasma in duplicate for 1h at 37°C. Virus and cell only controls were also included. Then, freshly trypsinized 293T ACE2/TMPRSS2 expressing cells were added to each well. Following 48h incubation in a 5% CO2 environment at 37°C, the luminescence was measured using the Steady-Glo Luciferase assay system (Promega).
mAb pseudotype neutralisation assay
Virus neutralisation assays were performed on HeLa cells stably expressing ACE2 and using SARS-CoV-2 Spike pseudotyped virus expressing luciferase as previously described49. Pseudotyped virus was incubated with serial dilutions of purified mAbs15 in duplicate for 1h at 37°C. Then, freshly trypsinized HeLa ACE2-expressing cells were added to each well. Following 48h incubation in a 5% CO2 environment at 37°C, the luminescence was measured using the Bright-Glo Luciferase assay system (Promega) and neutralization calculated relative to virus only controls. IC50 values were calculated in GraphPad Prism.
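To make the calculation concrete, the sketch below expresses luminescence as percent neutralisation relative to the virus only control and then estimates an IC50 by log-linear interpolation between the two concentrations that bracket 50%. The interpolation is a simple stand-in chosen for this example; it is not the curve-fitting procedure performed in GraphPad Prism, and the function names and the optional cell only background subtraction are assumptions.

```python
import math


def percent_neutralisation(rlu, virus_only, cell_only=0.0):
    """Percent reduction in luminescence relative to the virus only control."""
    return 100.0 * (1.0 - (rlu - cell_only) / (virus_only - cell_only))


def ic50_by_interpolation(concs, neut):
    """Estimate the concentration giving 50% neutralisation.

    `concs` are antibody concentrations (any positive units) and `neut` the
    matching percent neutralisation values. Returns None if 50% is never
    crossed within the tested range.
    """
    pairs = sorted(zip(concs, neut))
    for (c_lo, n_lo), (c_hi, n_hi) in zip(pairs, pairs[1:]):
        crosses = (n_lo - 50.0) * (n_hi - 50.0) <= 0
        if crosses and n_lo != n_hi:
            frac = (50.0 - n_lo) / (n_hi - n_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None
```

Interpolating on the log concentration scale matches the geometric spacing of a serial dilution. A fitted sigmoidal dose-response model uses all points on the curve and would generally be preferred for reporting.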
Data Availability
Long-read sequencing data that support the findings of this study have been deposited in the NCBI SRA database with the accession codes SAMN16976824 - SAMN16976846 under BioProject PRJNA682013 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA682013). Short reads and data used to construct figures were deposited at https://github.com/Steven-Kemp/sequence_files.
Supplementary Material
Acknowledgements
We are immensely grateful to the patient and his family. We would also like to thank the staff at CUH and the NIHR Cambridge Clinical Research Facility. We would like to thank Dr Ruthiran Kugathasan and Professor Wendy Barclay for helpful discussions and Dr Martin Curran, Dr William Hamilton, and Dr Dominic Sparkes. We would like to thank Prof Andres Floto and Prof Ferdia Gallagher. We thank Dr James Voss for the kind gift of HeLa cells stably expressing ACE2. COG-UK is supported by funding from the Medical Research Council (MRC) part of UK Research & Innovation (UKRI), the National Institute of Health Research (NIHR) and Genome Research Limited, operating as the Wellcome Sanger Institute. RKG is supported by a Wellcome Trust Senior Fellowship in Clinical Science (WT108082AIA). LEM is supported by a Medical Research Council Career Development Award (MR/R008698/1). SAK is supported by the Bill and Melinda Gates Foundation via PANGEA grant: OPP1175094. DAC is supported by a Wellcome Trust Clinical PhD Research Fellowship. CJRI acknowledges MRC funding (ref: MC_UU_00002/11). This research was supported by the National Institute for Health Research (NIHR) Cambridge Biomedical Research Centre, the Cambridge Clinical Trials Unit (CCTU) and by the UCL Coronavirus Response Fund and made possible through generous donations from UCL’s supporters, alumni, and friends (LEM). JAGB is supported by the Medical Research Council (MC_UP_1201/16). IG is a Wellcome Senior Fellow and supported by the Wellcome Trust (207498/Z/17/Z). DDP is supported by NIH GM083127.
References
- 1. Zhang T., Wu Q. & Zhang Z. Probable pangolin origin of SARS-CoV-2 associated with the COVID-19 outbreak. Current Biology (2020).
- 2. Xiao K. et al. Isolation of SARS-CoV-2-related coronavirus from Malayan pangolins. Nature 583, 286–289, doi: 10.1038/s41586-020-2313-x (2020).
- 3. Sanjuán R. & Domingo-Calap P. Mechanisms of viral mutation. Cell Mol Life Sci 73, 4433–4448, doi: 10.1007/s00018-016-2299-6 (2016).
- 4. Hadfield J. et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics 34, 4121–4123 (2018).
- 5. Robson F. et al. Coronavirus RNA proofreading: molecular basis and therapeutic targeting. Molecular cell (2020).
- 6. Korber B. et al. Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus. Cell, doi: 10.1016/j.cell.2020.06.043 (2020).
- 7. Yurkovetskiy L. et al. Structural and Functional Analysis of the D614G SARS-CoV-2 Spike Protein Variant. Cell 183, 739–751 e738, doi: 10.1016/j.cell.2020.09.032 (2020).
- 8. Korber B. et al. Tracking changes in SARS-CoV-2 Spike: evidence that D614G increases infectivity of the COVID-19 virus. Cell 182, 812–827. e819 (2020).
- 9. Choi B. et al. Persistence and Evolution of SARS-CoV-2 in an Immunocompromised Host. N Engl J Med, doi: 10.1056/NEJMc2031364 (2020).
- 10. McCarthy K. R. et al. Natural deletions in the SARS-CoV-2 spike glycoprotein drive antibody escape. bioRxiv, 2020.2011.2019.389916, doi: 10.1101/2020.11.19.389916 (2020).
- 11. Avanzato V. A. et al. Case Study: Prolonged infectious SARS-CoV-2 shedding from an asymptomatic immunocompromised cancer patient. Cell (2020).
- 12. Starr T. N. et al. Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding. Cell 182, 1295–1310.e1220, doi: 10.1016/j.cell.2020.08.012 (2020).
- 13. Rambaut A., L. N., Pybus O, Barclay W, Carabelli A. C., Connor T., Peacock T., Robertson D. L., Volz E., on behalf of COVID-19 Genomics Consortium UK (CoG-UK). Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations, <https://virological.org/t/preliminary-genomic-characterisation-of-an-emergent-sars-cov-2-lineage-in-the-uk-defined-by-a-novel-set-of-spike-mutations/563> (2020).
- 14. Schmidt F. et al. Measuring SARS-CoV-2 neutralizing antibody activity using pseudotyped and chimeric viruses. bioRxiv, 2020.2006.2008.140871, doi: 10.1101/2020.06.08.140871 (2020).
- 15. Brouwer P. J. M. et al. Potent neutralizing antibodies from COVID-19 patients define multiple targets of vulnerability. Science 369, 643–650, doi: 10.1126/science.abc5902 (2020).
- 16. Robbiani D. F. et al. Convergent antibody responses to SARS-CoV-2 in convalescent individuals. Nature 584, 437–442, doi: 10.1038/s41586-020-2456-9 (2020).
- 17. Barnes C. O. et al. Structures of Human Antibodies Bound to SARS-CoV-2 Spike Reveal Common Epitopes and Recurrent Features of Antibodies. Cell 182, 828–842 e816, doi: 10.1016/j.cell.2020.06.025 (2020).
- 18. Shrock E. et al. Viral epitope profiling of COVID-19 patients reveals cross-reactivity and correlates of severity. Science, doi: 10.1126/science.abd4250 (2020).
- 19. Kemp S. et al. Recurrent emergence and transmission of a SARS-CoV-2 Spike deletion ΔH69/V70. bioRxiv, 2020.2012.2014.422555, doi: 10.1101/2020.12.14.422555 (2020).
- 20. Sobel Leonard A. et al. The effective rate of influenza reassortment is limited during human infection. PLoS pathogens 13, e1006203, doi: 10.1371/journal.ppat.1006203 (2017).
- 21. Richard M., Herfst S., Tao H., Jacobs N. T. & Lowen A. C. Influenza A Virus Reassortment Is Limited by Anatomical Compartmentalization following Coinfection via Distinct Routes. Journal of virology 92, doi: 10.1128/JVI.02063-17 (2018).
- 22. CDC. Discontinuation of Transmission-Based Precautions and Disposition of Patients with COVID-19 in Healthcare Settings (Interim Guidance), <https://www.cdc.gov/coronavirus/2019-ncov/hcp/disposition-hospitalized-patients.html> (2020).
- 23. Boshier F. A. T. et al. Remdesivir induced viral RNA and subgenomic RNA suppression, and evolution of viral variants in SARS-CoV-2 infected patients. medRxiv, 2020.2011.2018.20230599, doi: 10.1101/2020.11.18.20230599 (2020).
- 24. Simonovich V. A. et al. A Randomized Trial of Convalescent Plasma in Covid-19 Severe Pneumonia. N Engl J Med, doi: 10.1056/NEJMoa2031304 (2020).
- 25. Meredith L. W. et al. Rapid implementation of SARS-CoV-2 sequencing to investigate cases of health-care associated COVID-19: a prospective genomic surveillance study. The Lancet Infectious Diseases 20, 1263–1272, doi: 10.1016/S1473-3099(20)30562-4 (2020).
- 26. Collier D. A. et al. Point of Care Nucleic Acid Testing for SARS-CoV-2 in Hospitalized Patients: A Clinical Validation Trial and Implementation Study. Cell Rep Med, 100062, doi: 10.1016/j.xcrm.2020.100062 (2020).
- 27. Loman N., Rowe W. & Rambaut A. (v1, 2020).
- 28. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. journal 17, 10–12 (2011).
- 29. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics (Oxford, England) 34, 3094–3100, doi: 10.1093/bioinformatics/bty191 (2018).
- 30. Jordan M. R. et al. Comparison of standard PCR/cloning to single genome sequencing for analysis of HIV-1 populations. J Virol Methods 168, 114–120, doi: 10.1016/j.jviromet.2010.04.030 (2010).
- 31. Palmer S. et al. Multiple, linked human immunodeficiency virus type 1 drug resistance mutations in treatment-experienced patients are missed by standard genotype analysis. Journal of clinical microbiology 43, 406–413, doi: 10.1128/JCM.43.1.406-413.2005 (2005).
- 32. Keele B. F. et al. Identification and characterization of transmitted and early founder virus envelopes in primary HIV-1 infection. Proceedings of the National Academy of Sciences of the United States of America 105, 7552–7557 (2008).
- 33. Shu Y. & McCauley J. GISAID: Global initiative on sharing all influenza data - from vision to reality. Euro surveillance: bulletin Europeen sur les maladies transmissibles = European communicable disease bulletin 22, 30494, doi: 10.2807/1560-7917.ES.2017.22.13.30494 (2017).
- 34. Katoh K. & Standley D. M. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Molecular Biology and Evolution 30, 772–780, doi: 10.1093/molbev/mst010 (2013).
- 35. Rambaut A. et al. A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology. Nature Microbiology 5, 1403–1407, doi: 10.1038/s41564-020-0770-5 (2020).
- 36. Minh B. Q. et al. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Molecular Biology and Evolution 37, 1530–1534, doi: 10.1093/molbev/msaa015 (2020).
- 37. Kalyaanamoorthy S., Minh B. Q., Wong T. K. F., von Haeseler A. & Jermiin L. S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nature Methods 14, 587–589, doi: 10.1038/nmeth.4285 (2017).
- 38. Minh B. Q., Nguyen M. A. T. & von Haeseler A. Ultrafast Approximation for Phylogenetic Bootstrap. Molecular Biology and Evolution 30, 1188–1195, doi: 10.1093/molbev/mst024 (2013).
- 39. Illingworth C. J. SAMFIRE: multi-locus variant calling for time-resolved sequence data. Bioinformatics 32, 2208–2209, doi: 10.1093/bioinformatics/btw205 (2016).
- 40. Lumby C. K., Zhao L., Breuer J. & Illingworth C. J. A large effective population size for established within-host influenza virus infection. Elife 9, doi: 10.7554/eLife.56915 (2020).
- 41. Martin D. P., Murrell B., Golden M., Khoosal A. & Muhire B. RDP4: Detection and analysis of recombination patterns in virus genomes. Virus evolution 1 (2015).
- 42. Didelot X. & Wilson D. J. ClonalFrameML: efficient inference of recombination in whole bacterial genomes. PLoS Comput Biol 11, e1004041 (2015).
- 43. Wrobel A. G. et al. SARS-CoV-2 and bat RaTG13 spike glycoprotein structures inform on virus evolution and furin-cleavage effects. Nature Structural & Molecular Biology 27, 763–767, doi: 10.1038/s41594-020-0468-7 (2020).
- 44. Gregson J. et al. HIV-1 viral load is elevated in individuals with reverse transcriptase mutation M184V/I during virological failure of first line antiretroviral therapy and is associated with compensatory mutation L74I. The Journal of infectious diseases, doi: 10.1093/infdis/jiz631 (2019).
- 45. Naldini L., Blomer U., Gage F. H., Trono D. & Verma I. M. Efficient transfer, integration, and sustained long-term expression of the transgene in adult rat brains injected with a lentiviral vector. Proceedings of the National Academy of Sciences of the United States of America 93, 11382–11388 (1996).
- 46. Gupta R. K. et al. Full-length HIV-1 Gag determines protease inhibitor susceptibility within in vitro assays. Aids 24, 1651–1655, doi: 10.1097/QAD.0b013e3283398216 (2010).
- 47. Vermeire J. et al. Quantification of reverse transcriptase activity by real-time PCR as a fast and accurate method for titration of HIV, lenti- and retroviral vectors. PloS one 7, e50859–e50859, doi: 10.1371/journal.pone.0050859 (2012).
- 48. Mlcochova P. et al. Combined point of care nucleic acid and antibody testing for SARS-CoV-2 following emergence of D614G Spike Variant. Cell Rep Med, 100099, doi: 10.1016/j.xcrm.2020.100099 (2020).
- 49. Seow J. et al. Longitudinal observation and decline of neutralizing antibody responses in the three months following SARS-CoV-2 infection in humans. Nat Microbiol 5, 1598–1607, doi: 10.1038/s41564-020-00813-8 (2020).