Abstract
Whiteflies from the Bemisia tabaci species complex have the ability to transmit a large number of plant viruses and are some of the most detrimental pests in agriculture. Although whiteflies are known to transmit both DNA and RNA viruses, most of the diversity has been recorded for the former, specifically for the Begomovirus genus. This study investigated the total diversity of DNA and RNA viruses found in whiteflies collected from a single site in Florida to evaluate if there are additional, previously undetected viral types within the B. tabaci vector. Metagenomic analysis of viral DNA extracted from the whiteflies only resulted in the detection of begomoviruses. In contrast, whiteflies contained sequences similar to RNA viruses from divergent groups, with a diversity that extends beyond currently described viruses. The metagenomic analysis of whiteflies also led to the first report of a whitefly-transmitted RNA virus similar to Cowpea mild mottle virus (CpMMV Florida) (genus Carlavirus) in North America. Further investigation resulted in the detection of CpMMV Florida in native and cultivated plants growing near the original field site of whitefly collection and determination of its experimental host range. Analysis of complete CpMMV Florida genomes recovered from whiteflies and plants suggests that the current classification criteria for carlaviruses need to be reevaluated. Overall, metagenomic analysis supports that DNA plant viruses carried by B. tabaci are dominated by begomoviruses, whereas significantly less is known about RNA viruses present in this damaging insect vector.
Introduction
The majority of vectored plant viruses are transmitted by hemipteran insects, whose piercing-sucking mouthparts allow efficient transmission [1]. Whiteflies (Aleyrodidae), in particular the Bemisia tabaci species complex, are among the most detrimental insect vectors causing considerable economic losses to multiple agricultural industries [2], [3]. Whiteflies damage crops directly through feeding, which can weaken plants and elicit undesirable plant responses [4], and through depositing excreta that favors sooty mold production. In addition, whiteflies indirectly damage crops by transmitting pathogenic viruses [2], [3], [5]. Viruses are responsible for almost half of the emerging diseases affecting plants and whitefly-transmitted viruses are some of the most devastating agents affecting cash crops [6].
Among the large diversity of viral types known to infect plants, only DNA viruses belonging to the genus Begomovirus (Geminiviridae) and a small diversity of RNA viruses have been associated with whiteflies. Whiteflies are known to transmit more than 280 begomovirus species [5], [7] and, to our knowledge, no other DNA viral groups have been detected in whiteflies. The emergence of begomoviruses as important pathogens is closely associated with the increased prevalence of highly polyphagous whitefly species [5]. Whiteflies feed on a large number of cultivated and native plant species and thus may provide the opportunity to transmit viruses among a variety of hosts, including wild and cultivated vegetation [3], [8], [9]. The ability of whiteflies to transmit begomoviruses into diverse hosts, as well as the high potential for co-infection and recombination opportunities, may have contributed to the emergence of the genus Begomovirus as the group of plant viruses with the largest number of recognized species [5], [7]. In contrast to the species-rich, genus-specific association of whiteflies with begomoviruses, a low species diversity of RNA viruses is known to be vectored by whiteflies. There are four genera of RNA viruses (each with fewer than 15 species) known to be transmitted by whiteflies, namely: Crinivirus (Closteroviridae; 12 species), Carlavirus (Betaflexiviridae; 1 species), Ipomovirus (Potyviridae; 4 species), and Torradovirus (Secoviridae; 4 species) [5].
It is possible that the diversity of whitefly-transmitted viruses reported to date does not accurately represent the total complement of viruses in this insect vector. Viral types known to be carried by B. tabaci may instead reflect methodological limitations that are only capable of detecting close relatives of known vector-transmitted viruses (e.g., PCR with degenerate primers designed based on known sequences). In addition, agricultural surveillance efforts typically focus on viral species that negatively affect economically important crops. Therefore it is likely that viruses present in native vegetation that do not show any impact on agricultural crops in the area will be overlooked. Since viruses found in native vegetation may emerge as serious pathogens for crops and asymptomatic hosts may facilitate virus spread by serving as reservoirs [10]–[12], there is a critical need to investigate the total community of DNA and RNA viruses associated with whiteflies in a given area. This endeavor can be accomplished by applying the vector-enabled metagenomics approach (VEM; where viruses are purified and sequenced directly from insect vectors) using whiteflies. The main advantage of VEM is that it allows the detection of viruses carried by insect vectors without a priori knowledge of the plant pathogens present in a given area [13]. Moreover, the VEM approach does not depend on the collection of foliar tissue exhibiting virus-like infection symptoms to detect viruses present in an area, circumventing limitations associated with sampling individual plants and enabling the identification of asymptomatic infections. VEM using a small sequencing effort has been successfully implemented to identify whitefly-transmitted begomoviruses infecting both commercial crops and native vegetation [13].
In an effort to shed light on the total diversity of viruses carried by whiteflies, this study incorporated high-throughput sequencing into the VEM approach to detect DNA and RNA viruses in B. tabaci specimens collected from an experimental field site in Florida. Begomoviruses were the only DNA plant viruses detected, whereas known and novel RNA viruses from different families were found in whiteflies from this single field site. Furthermore, sequencing efforts resulted in the detection and first report of a whitefly-transmitted carlavirus most similar to Cowpea mild mottle virus (CpMMV) in North America. Although the CpMMV Florida isolate was originally detected in whiteflies, it was subsequently identified in wild and cultivated plants from the same area and its host range was experimentally determined. Analysis of the CpMMV Florida genome suggests that the current classification criteria for carlaviruses need to be reevaluated.
Materials and Methods
Whiteflies: Sample collection, processing, and metagenomic sequencing
The B. tabaci specimens used for viral metagenomics were collected in an experimental field in Citra, Florida (29°24′N 82°06′W) in August 2007 as previously described [13]. Briefly, adult whiteflies were collected from soybean and volunteer watermelon plants using a battery-operated vacuum. The whiteflies were manually inspected using a Nikon model C-DSD115 stereoscope, and debris and other insects were removed before storing at −80°C. A subset of the whiteflies was used in a pilot study investigating DNA viruses [13], while the remainder were processed for the present study as described below.
Virus particles were partially purified from the whiteflies before nucleic acid extraction and sequencing. For this purpose, approximately 250 whiteflies were homogenized in SM Buffer (50 mM Tris·HCl, 10 mM MgSO4, 0.1 M NaCl, pH 7.5) using a bead-beater (BioSpec) with 1.0 mm glass beads (Research Products International) for 1 min. Cells were removed from homogenates by centrifuging at 10,000 × g for 10 min and filtering the supernatant through a 0.22 µm Sterivex filter (Millipore). Virus particles present in the filtrate were treated with 0.2 volumes of chloroform, followed by DNase I (2.5 U/µl) and RNase A (0.25 U/µl) treatment at 37°C for 3 hrs to remove non-encapsidated nucleic acids.
DNA and RNA were simultaneously extracted from purified virus particles using the AllPrep DNA/RNA Mini Kit (Qiagen) following the manufacturer's instructions and sequenced individually. For the RNA fraction, the ‘on-column DNase digestion’ step was used to minimize DNA carryover. The extracted DNA fraction was amplified using the GenomiPhi V2 DNA Amplification Kit (GE Healthcare) followed by further amplification and fragmentation using the GenomePlex Whole Genome Amplification (WGA) Kit (Sigma-Aldrich). The RNA fraction was amplified using the TransPlex Whole Transcriptome Amplification (WTA) Kit (Sigma-Aldrich). WGA- and WTA-amplified nucleic acids were multiplexed and sequenced on a single lane of a Genome Analyzer IIx System (Illumina).
Metagenomic data analysis
WGA and WTA adapter sequences as well as multiplexing barcodes were removed from the DNA and RNA sequence libraries, respectively, using the TagCleaner server (http://edwards.sdsu.edu/cgi-bin/tagcleaner/tc.cgi) [14]. Trimmed sequences from both DNA and RNA libraries are publicly available on the Metavir website (http://metavir-meb.univ-bpclermont.fr/) under the project ‘Whiteflies_Citra_2007’. Sequences (1.4 million from the DNA library and 2.1 million from the RNA library) were then assembled with a minimum identity of 95% over 25 bp using the Geneious software package (Biomatters). Contigs over 80 bp in length were compared against the GenBank non-redundant database in June 2011 using either BLASTn (DNA library) or BLASTx (RNA library) with an e-value cut-off of E<0.001 [15]. BLAST results were summarized and inspected using the Metagenome Analyzer (MEGAN4) software [16] to identify viral sequences. The top viral match for each contig was accepted only if the score for the top virus hit was at least 10% higher than the next best hit; otherwise, the contig was annotated as “unassigned”. In most cases where the BLAST scores were within 10% of each other, the viral matches belonged to the same genus and thus the genus was identified.
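The acceptance rule for the top viral match can be illustrated with a minimal Python sketch. This is not the MEGAN4 procedure itself: the function name `assign_contig`, the layout of the hit list (one entry per virus), and the example scores are hypothetical, and “at least 10% higher” is interpreted here as a top score of at least 1.10 times the runner-up score.

```python
def assign_contig(hits, margin=0.10):
    """Return the accepted virus name for one contig, or "unassigned".

    hits: list of (virus_name, score) pairs, one entry per virus
          (hypothetical layout; scores could be BLAST bit scores).
    margin: required relative lead of the top hit over the next best hit.
    """
    if not hits:
        return "unassigned"
    # Rank hits from highest to lowest score.
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0][0]
    top, runner_up = ranked[0], ranked[1]
    # Accept the top hit only if it leads the runner-up by the margin.
    if top[1] >= runner_up[1] * (1.0 + margin):
        return top[0]
    return "unassigned"


# Hypothetical example scores, for illustration only.
clear_case = [("Virus A", 120.0), ("Virus B", 100.0)]
close_case = [("Virus A", 105.0), ("Virus B", 100.0)]
print(assign_contig(clear_case))
print(assign_contig(close_case))
```

In this sketch a contig whose two best hits are close is simply labelled “unassigned”; assigning such a contig to a shared genus, as described above, would need taxonomy information that is not modelled here.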
CpMMV Florida isolate genome completion
The majority of contigs from the RNA library with significant matches to viral sequences were similar to the carlavirus CpMMV. To sequence the full genome of this virus, contigs with similarities to CpMMV were organized based on the genomic position sharing similarity with a CpMMV reference genome from Africa (NC014730). Primer pairs were designed to bridge the gaps between contigs and primer pairs that spanned the entire genome were used to complete the genome (Table S1). cDNA for PCR reactions was produced from RNA extracted from purified virus particles using a SuperScript III First-Strand Synthesis System kit (Invitrogen). All PCR reactions contained 1 µl cDNA, 1 U Apex Red Taq Polymerase (Genesee), 1× NH4 buffer, 1.5 mM MgCl2, and 0.5 µM of each primer. Amplification was performed with an initial denaturation at 94°C for 5 min followed by 35 cycles of 94°C for 45 sec, 50°C for 45 sec (incrementally decreasing the temperature by 0.1°C each cycle), 72°C for 1.5 min, followed by a final extension at 72°C for 8 min. The 5′ end of the genome was completed using gene-specific primers with the 5′ RACE System Kit (Invitrogen) according to manufacturer's instructions (Table S1). All PCR products were cloned using the TOPO TA system (Invitrogen) and Sanger sequenced. PCR product sequences were assembled using Sequencher 4.7 (Gene Codes) and the complete genome was annotated using SeqBuilder (DNASTAR). Each region of the genome had at least 4× sequence coverage.
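The annealing step of this program drops by 0.1°C per cycle, so the temperature in any given cycle follows from simple arithmetic. The Python sketch below tabulates it under the assumption (not stated in the protocol) that the first cycle anneals at 50°C and each subsequent cycle is 0.1°C lower; the function name is hypothetical.

```python
def annealing_temperatures(start_c=50.0, step_c=0.1, cycles=35):
    """List the annealing temperature (°C) for each cycle.

    Assumes cycle 1 anneals at start_c and every later cycle
    is step_c lower than the one before it.
    """
    return [round(start_c - step_c * i, 1) for i in range(cycles)]


temps = annealing_temperatures()
print(temps[0], temps[-1])  # first and last annealing temperatures
```

Under this assumption the annealing temperature falls by 3.4°C in total over the 35 cycles.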
Survey and isolation of CpMMV from wild and cultivated vegetation
To investigate the presence of the carlavirus CpMMV in the vegetation, 90 plants were surveyed in the summer of 2011 from an area around the field site in Citra, FL where whiteflies were originally collected. Leaves exhibiting a variety of viral infection symptoms (e.g., mottling, leaf curling) were collected from plants belonging to the Fabaceae family, which are known hosts of CpMMV, namely peanuts (Arachis hypogaea L.; n = 71), hairy indigo (Indigofera hirsuta L.; n = 14), and dixie ticktrefoil (Desmodium tortuosum (Sw.) DC.; n = 5). All plant tissues were tested for the presence of CpMMV using a CpMMV-specific ELISA Reagent Set (Neogen Europe Ltd) in accordance with manufacturer's protocols. Samples were considered positive when their absorbance values were greater than the mean of the negative controls plus three standard deviations. Positive samples were verified through a degenerate carlavirus RT-PCR assay targeting part of the capsid protein (CP) and 3′ end poly-A tail of these RNA genomes [17]. Briefly, RNA was extracted from plant tissues using TRI Reagent following manufacturer's protocols (Ambion Inc.). Reverse transcription was performed using ImProm-II™ Reverse Transcriptase (Promega) with the oligo-d(T21) primer according to manufacturer's protocols. The cDNA was used for PCR with the Carla-CP (5′GGBYTNGGBGTNCCNACNGA3′) and oligo-dT (21) primers under the following conditions: 0.5 µl cDNA, Taq DNA Polymerase (New England Biolabs), 1× standard Taq (Mg-free) buffer, 3.0 mM MgCl2, and 1 µM spermidine. Amplification was performed with an initial denaturation at 94°C for 5 min, followed by 35 cycles of 94°C for 1 min, 50°C for 1 min, and 72°C for 1 min ending with a final extension at 72°C for 5 min.
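The ELISA positivity criterion (absorbance greater than the mean of the negative controls plus three standard deviations) can be written as a short calculation. The Python sketch below is illustrative only: the function names and absorbance readings are hypothetical, and the sample standard deviation is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def elisa_cutoff(negative_controls, n_sd=3.0):
    """Cut-off = mean of negative controls + n_sd standard deviations.

    Uses the sample standard deviation (an assumption).
    """
    return mean(negative_controls) + n_sd * stdev(negative_controls)


def is_positive(absorbance, negative_controls):
    """A sample is positive when its absorbance exceeds the cut-off."""
    return absorbance > elisa_cutoff(negative_controls)


# Hypothetical absorbance readings, for illustration only.
negatives = [0.10, 0.12, 0.11, 0.09]
print(round(elisa_cutoff(negatives), 3))
print(is_positive(0.50, negatives), is_positive(0.12, negatives))
```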
One D. tortuosum plant sample that tested positive for CpMMV by ELISA was used to establish a culture and obtain inoculum for transmission and host range determination experiments. The sample of D. tortuosum was mechanically inoculated to Chenopodium quinoa L., which is an established local lesion host for CpMMV [18], using a 1∶5 dilution of tissue to phosphate buffer (100 mM K2HPO4, 100 mM Na2HPO4, 10 mM Na2SO3, pH 7.4). Eight days later, chlorotic local lesions were observed on inoculated leaves of C. quinoa. Four of these lesions were removed and individually mechanically inoculated to primary leaves of common bean (Phaseolus vulgaris L. ‘Topcrop’) which is a known systemic host of some CpMMV isolates [19]–[22]. Three bean plants exhibited virus-like symptoms from this inoculation and tested positive for CpMMV by ELISA and RT-PCR.
Whitefly transmission of isolated CpMMV
Viral isolates from each of the three infected bean plants were transmitted to new bean plants using B. tabaci (Mediterranean/Asia Minor/Africa clade, formerly known as B. tabaci Biotype B). For this purpose, infected beans were placed in separate cages in different growth rooms for acquisition and transmission. Transmissions were performed at different times throughout the day to prevent contamination through whitefly carryover between rooms. Non-viruliferous whiteflies were placed on each infected bean and given an acquisition access period of 20 min. Whiteflies were then transferred to three healthy beans and given an inoculation access period of 4 hrs. Transmission was terminated using insecticidal soap (20 ml/L Safer Soap®) and Imidacloprid (0.2% active ingredient formulation, applied as a 30 ml per plant drench). The presence of CpMMV was confirmed in all three whitefly-inoculated beans by RT-PCR. The CpMMV genome was sequenced from each of these bean plants through PCR using the same primers used to sequence the CpMMV genome from whiteflies (Table S1).
Experimental CpMMV Florida host range
A variety of hosts were selected for experimental infectivity assays based on previously reported hosts for isolates of CpMMV (Table 1) [20]. Bean leaf tissue infected with an isolate of CpMMV from D. tortuosum was collected 19 days post inoculation, frozen and used as the inoculum source for all inoculations. Three to four experimental host species were tested at a time. Five to twenty plants of each species were mechanically inoculated at the first true leaf stage. At the same time, three to five plants of each test species were mock-inoculated to serve as negative controls and three to five common bean plants were inoculated to serve as positive controls for the quality of the inoculum. Plants were visually assessed daily and systemic symptoms were recorded at 14 days post inoculation. Plants were then sampled and tested for the presence of CpMMV by ELISA. Inconclusive results based on ELISA were further tested by RT-PCR.
Table 1. Responses observed in a range of selected host plants mechanically inoculated with the Cowpea mild mottle virus Florida isolate.
Family | Species | Cultivar | No. Infected/No.Inoculated1 | Local Symptoms/Systemic Symptoms2 |
Amaranthaceae | Gomphrena globosa L. | Strawberry Fields | 0/10 | NS/NS |
Chenopodiaceae | Chenopodium giganteum D. Don | n/a | 0/5 | NS/NS |
Chenopodiaceae | Chenopodium quinoa Willd. | n/a | 8/10 | CLL/NS |
Cucurbitaceae | Cucumis sativus L. | Straight Eight | 0/10 | NS/NS |
Fabaceae | Arachis hypogaea L. | GA Green | 10/10 | NS/NS |
Fabaceae | Glycine max (L.) Merr. | Round-up Ready | 10/10 | NS/VC, LD, Mo |
Fabaceae | Lens culinaris Medik. | Green Lentil | 7/10 | NS/NS |
Fabaceae | Phaseolus coccineus L. | Scarlet Runner | 10/10 | NS/NS |
Fabaceae | Phaseolus lunatus L. | Fordhook No. 242 | 10/10 | NS/NS |
Fabaceae | Phaseolus vulgaris L. | Topcrop | 9/10 | NS/Mo, R |
Fabaceae | Pisum sativum L. | Lincoln | 8/10 | NS/NS |
Fabaceae | Vigna radiata (L.) R. Wilczek | Mung bean | 10/10 | NS/NS |
Fabaceae | Vigna unguiculata (L.) Walp. | CA Blackeye No. 5 | 10/10 | RV, RLL/LD, mMo |
Solanaceae | Capsicum annuum L. | California Wonder | 0/10 | NS/NS |
Solanaceae | Datura stramonium L. | n/a | 0/10 | NS/NS |
Solanaceae | Nicotiana glutinosa L. | n/a | 0/10 | NS/NS |
Solanaceae | Solanum lycopersicum L. | FL 7316 | 0/20 | NS/NS |
Solanaceae | Solanum lycopersicum L. | Sweetheart | 0/20 | NS/NS |
1 Number of infected plants/number of plants inoculated. The number of infected plants was determined by ELISA.
2 Abbreviations used: NS - no symptoms, Mo – mottle, R – rugose, LD – leaf deformation, mMo – mild mottle, VC – veinal chlorosis, RV – red vein, RLL – red local lesions, CLL – chlorotic local lesions.
Infected species as determined by visible symptoms and/or ELISA are highlighted in boldface.
CpMMV Florida genome, pairwise comparisons, and phylogenetic analysis
The CpMMV Florida genomes sequenced from whiteflies and bean plants as well as their predicted protein sequences were compared against known members of the Carlavirus genus. Predicted protein sequences were compared against the Pfam database [23] to identify conserved motifs. For all pairwise comparisons, alignments were performed using the MUSCLE algorithm [24] implemented in MEGA5 [25]. Pairwise distances were calculated in MEGA5 using p-distance and pairwise deletion of gaps. For phylogenetic analysis of the capsid protein, alignments were optimized using the PRALINE server [26] with default settings. A maximum likelihood tree was constructed using the PhyML online server [27] with the (LG+I+G+F) model chosen as the best-fit substitution model according to ProtTest [28]. The approximate likelihood ratio test (aLRT) was used to assess branch support [29].
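For readers unfamiliar with the distance measure, the p-distance with pairwise deletion is the proportion of differing sites among the aligned positions where neither of the two sequences has a gap. The Python sketch below illustrates the calculation; it is not MEGA5's implementation, and the function name, the gap characters, and the example sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, gap_chars="-?"):
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a site is skipped only if one of these two
    sequences has a gap (or missing-data symbol) at that position.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # ignore this site for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical aligned fragments: 5 comparable sites, 1 difference.
d = p_distance("ACGT-A", "ACCTTA")
print(d, 1 - d)  # distance and the corresponding pairwise identity
```

Pairwise identity, as reported in the Results, corresponds to one minus this distance.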
Results
Viruses identified in whiteflies
VEM revealed a diversity of DNA and RNA plant viruses present in whiteflies collected from a single site in Citra, Florida. Viral sequences in the DNA library displayed high levels of similarity to previously described viruses, enabling their identification through BLASTn searches, while RNA sequences had to be identified through BLASTx due to limited similarities to known sequences. Viral contigs (n = 259) in the DNA library were dominated by begomoviruses (Geminiviridae; 97.3%), the majority of which shared >88% nucleotide identity with their top match in the database (Table S2). Note that short reads hinder any definitive classification of begomovirus species or strains; therefore, Table S2 only provides an overview of potential begomovirus types detected in whiteflies. In addition to begomoviruses, five contigs were most similar to novel begomovirus-associated satellites, Whitefly VEM Satellites, discovered in a nearby field [13]. Only two contigs were not related to plant viruses, including a single-stranded DNA bacteriophage and a human virus.
Although fewer viral contigs were recovered from the RNA library (n = 64) compared to the DNA library, the RNA sequences encompassed a broader viral diversity at the family level. The viral sequences identified in the RNA library had similarities to viruses from at least five different families (Betaflexiviridae, Closteroviridae, Bunyaviridae, Bromoviridae, Virgaviridae), three of which (Bunyaviridae, Bromoviridae, Virgaviridae) have not been detected in whiteflies previously (Table 2). Most of the identified RNA viruses were similar to plant viruses and contigs similar to the carlavirus CpMMV dominated the viral sequences. Viral sequences similar to plant viruses known to be transmitted by whiteflies, including criniviruses and CpMMV, had high amino acid identities (up to 100%) with their top match in the database. In contrast to the DNA library, many of the RNA viral contigs (33%) were highly divergent from known species since they shared less than 45% amino acid identity with their top match in the database. Several contigs had low identities to double-stranded RNA viruses, Circulifer tenellus virus 1 and Spissistilus festinus virus 1, recently discovered in plant-feeding hemipteran pests; however, it remains unknown whether these viruses replicate in insect cells or those of associated microorganisms [30]. Only three contigs had similarities to viruses that infect hosts other than plants or insects, including diatoms (Rhizosolenia setigera RNA virus) and humans (Uukuniemi virus and Armero virus). However, due to low amino acid identities, it is possible that these sequences represent novel plant or whitefly viruses.
Table 2. Plant or insect RNA viruses identified in whiteflies and amino acid (aa) identity ranges.
Virus match* | Hits | Contig length (nt) | AA identity range (%) | Genus | Family | Significance† | Ref. |
Cowpea mild mottle virus | 27 | 86–355 | 50–100 | Carlavirus | Betaflexiviridae | N | |
Unassigned carlavirus | 9 | 84–284 | 64–100 | Carlavirus | Betaflexiviridae | N | |
Lettuce chlorosis virus | 6 | 83–239 | 93–100 | Crinivirus | Closteroviridae | N | |
Circulifer tenellus virus 1 | 4 | 286–933 | 30–35 | Unclassified | Unclassified | U | |
Rice grassy stunt virus | 2 | 391–579 | 23 | Tenuivirus | Unclassified | U | |
Spissistilus festinus virus 1 | 2 | 189–404 | 43 | Unclassified | Unclassified | U | |
Unassigned phlebovirus | 2 | 685–931 | 24–43 | Phlebovirus | Bunyaviridae | U | |
Unassigned plant RNA virus | 1 | 620 | 38–40 | - | - | U | |
Armero virus** | 1 | 233 | 35 | Tentative Phlebovirus | Bunyaviridae | U | |
Unassigned bromovirus | 1 | 299 | 37 | Bromovirus | Bromoviridae | U | |
Unassigned Ilarvirus subgroup 1 | 1 | 317 | 43 | Ilarvirus | Bromoviridae | U | |
Rhizosolenia setigera RNA virus** | 1 | 328 | 39 | Bacillariornavirus | Unclassified | U | |
Rice stripe virus | 1 | 734 | 32 | Tenuivirus | Unassigned | U | |
Tobacco mild green mosaic virus | 1 | 749 | 35 | Tobamovirus | Virgaviridae | U | |
Tomato chlorosis virus | 1 | 89 | 100 | Crinivirus | Closteroviridae | K | [40] |
Unassigned cilevirus | 2 | 261–1159 | 22–38 | Cilevirus | Unclassified | U | |
Unassigned tobamovirus | 1 | 384 | 33 | Tobamovirus | Virgaviridae | U | |
Uukuniemi virus** | 1 | 313 | 28 | Phlebovirus | Bunyaviridae | U | |
* Unassigned virus groups refer to contigs that had significant matches to more than one virus with BLASTx scores within 10% of each other. Multiple matches make it impossible to assign these partial sequences. Viruses found in hosts other than plants or insects are identified (**). Viruses highlighted in boldface are known to be whitefly-transmitted.
† K = known virus, and known to occur in Florida at this site; N = known virus, but not previously reported from Florida; U = most likely a new, uncharacterized virus.
Isolation and experimental host range determination of CpMMV Florida
Since CpMMV-like sequences were abundant in whiteflies and this virus had never been reported in North America, a survey of 90 symptomatic plants from three different species using ELISA with a CpMMV antibody was conducted in the same crop field four years after the whitefly collection. Thirty-eight percent of A. hypogaea (n = 71), 36% of I. hirsuta (n = 14), and 100% of D. tortuosum (n = 5) plants tested positive for CpMMV. Note that infection symptoms observed in the field may not have been caused by CpMMV. Since all of the samples of D. tortuosum tested positive for CpMMV, an infected seedling from this species was used to establish a culture and obtain virus inoculum. This plant was used to mechanically infect C. quinoa and local lesions were subsequently inoculated into common bean plants (P. vulgaris). To confirm infection by CpMMV, the common bean plants were tested by ELISA and a degenerate carlavirus RT-PCR assay. Once the presence of CpMMV was confirmed, the CpMMV Florida isolate was successfully transmitted to three common bean plants using whiteflies. All three plants exposed to CpMMV-bearing whiteflies exhibited mild mottling symptoms and were verified as infected with CpMMV based on ELISA and RT-PCR.
To determine host range, a CpMMV isolate from a common bean plant infected through whitefly transmission was used to inoculate 18 species of experimental hosts belonging to five different families (Amaranthaceae, Chenopodiaceae, Cucurbitaceae, Fabaceae, Solanaceae). Ten of the 18 species tested were successfully infected by the CpMMV Florida isolate, all of which belonged to the Chenopodiaceae and Fabaceae families (Table 1). Six of the ten infected species did not show any visible symptoms of infection and had the same appearance as the negative controls; only four species exhibited local or systemic symptoms of infection. C. quinoa showed local chlorotic lesions on inoculated leaves, Vigna unguiculata exhibited both local and systemic symptoms, whereas Glycine max and Phaseolus vulgaris only displayed systemic symptoms (Table 1).
CpMMV Florida genome
PCR primers (Table S1) were used to obtain and sequence the entire CpMMV Florida genome from the field-collected whiteflies and each of the three bean plants experimentally infected with CpMMV Florida using whiteflies. The CpMMV genomes sequenced from bean plants, CpMMV Florida [Beans_2011] (Accession no. KC774020), are 100% identical to each other and share 99% genome-wide nucleotide identity with the genome retrieved from whiteflies collected four years earlier, CpMMV Florida [Whiteflies_2007] (Accession no. KC774019). The CpMMV Florida genomes exhibit organizations identical to members of the Carlavirus genus, including six open reading frames (ORFs) encoding the following proteins from 5′ to 3′: replication polyprotein, movement proteins [i.e., triple gene block (TGB)], capsid protein (CP) and nucleic acid binding (NB) protein (Fig. 1). Among the carlaviruses, the CpMMV Florida genomes are most closely related to the only other complete CpMMV genome sequence that was available at the time the analysis was performed, an isolate from Ghana (NC014730) [31], with which they share 67.5% genome-wide pairwise identity. Although the average amino acid pairwise identity for the complete protein complement of these two viral genomes is ∼62%, the CP exhibits 95% identity (Fig. 1). Phylogenetic analysis of the CP of different carlavirus species also supports identification of the Florida carlavirus as an isolate of CpMMV (Fig. 2). Based on pairwise distances among available CpMMV CP and NB sequences (Tables S3 and S4), the CpMMV Florida isolate may be more closely related to isolates from South America (Brazil) and the Caribbean (Puerto Rico) than to the Ghana isolate, since it shares 98–99% amino acid identity with these isolates.
Searches in the Pfam database using predicted amino acid sequences for each of the six ORFs present in the CpMMV Florida genomes revealed significant matches (e-value≪0.001) to conserved motifs observed in carlaviruses. The replication polyprotein contains four different domains characterized by viral methyltransferase [32], carlavirus endopeptidase (family C23 peptidase) [33], superfamily one helicase [34], and supergroup three RNA-dependent RNA polymerase [core motif: TGX3TX3NTX22GDD, where ‘X’ represents any amino acid residue] [35] motifs. Downstream from the replication polyprotein, there is a triple gene block involved in cell-to-cell movement with characteristic motifs of filamentous viruses, specifically the ‘potex-like’ class [36]. The first gene of the block contains NTPase/helicase sequence domains belonging to superfamily one helicases. The second and third genes contain the signature sequences GDX6GGXYXDG and CX5GX8C, respectively. Similar to several carlavirus species, the third gene in the TGB of the CpMMV Florida isolate lacks a standard start codon. The capsid protein exhibits both carlavirus- and potexvirus-specific domains and the carlavirus capsid signature sequence ‘GLGVPTE’ [17]. The putative NB protein encoded at the 3′ end of the CpMMV Florida genome, whose presence distinguishes virus species in the Carlavirus genus from other members of the Betaflexiviridae with similar genome organization (i.e., foveaviruses) [37], exhibits four characteristic cysteine residues in the pattern CX2CX12CX4C [38].
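The motif shorthand used above (a residue letter, with ‘X’ for any residue and a trailing number for a repeat count) maps directly onto regular expressions, which offers a simple way to locate such signatures in a predicted protein sequence. The Python sketch below is illustrative and is not the method used for the Pfam searches; the function name and the test sequence are hypothetical.

```python
import re


def motif_to_regex(motif):
    """Convert shorthand such as 'CX2CX12CX4C' to a compiled regex.

    'X' stands for any residue; a number after a letter is a repeat
    count (e.g. 'X12' = any 12 residues).
    """
    parts = []
    for letter, count in re.findall(r"([A-Z])(\d*)", motif):
        token = "." if letter == "X" else letter
        if count:
            token += "{" + count + "}"
        parts.append(token)
    return re.compile("".join(parts))


# Hypothetical sequence built to contain the CX2CX12CX4C pattern.
sequence = "AAC" + "AA" + "C" + "A" * 12 + "C" + "AAAA" + "C" + "GG"
pattern = motif_to_regex("CX2CX12CX4C")
match = pattern.search(sequence)
print(pattern.pattern)
print(match.start() if match else None)
```

The same helper applies to the other signatures quoted in the text, for example `motif_to_regex("GDX6GGXYXDG")` or `motif_to_regex("TGX3TX3NTX22GDD")`.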
Discussion
Diversity of viruses identified in whiteflies
The VEM approach has been introduced as a strategy to survey viruses carried by insect vectors in a given region without a priori knowledge of the viral types present [13], [39]. Here VEM was used to detect both DNA and RNA plant viruses found in whiteflies collected from an experimental field station in Citra, FL. Strikingly, all of the DNA plant viruses identified with this deep Solexa sequencing effort were limited to the well-established whitefly-transmitted genus Begomovirus with high nucleotide identities (>88%) to known viral species. Although the short reads (maximum fragment size of 71 nt) hindered our ability to conclusively identify these begomoviruses to the species level even after assembly, results indicated that DNA viruses present in whiteflies from this site are dominated by members of a single genus. On the other hand, a diversity of RNA viral sequences from various families was detected and the VEM approach ultimately led to the discovery of the whitefly-transmitted carlavirus CpMMV in Florida, USA.
Since the VEM approach does not rely on sequence-specific primers/probes, this method should have recovered any type of DNA virus present in the whiteflies whose virions are resistant to chloroform and nuclease treatment. However, the use of a 0.22 µm filter during virus particle purification may have excluded larger non-plant DNA viruses which are commonly associated with insects (e.g., baculoviruses). The fact that only begomoviruses were identified suggests that this group indeed dominates the whitefly-transmitted DNA plant viruses, and perhaps exclusively occupies this niche. Future studies investigating plant DNA viruses in whiteflies from different locations using sequence-independent methods are needed to confirm whether or not whiteflies have evolved an exclusive relationship with begomoviruses.
In contrast to DNA viruses, the RNA library indicated that whiteflies from a single site can carry RNA viruses from disparate families. Currently, whiteflies are known to transmit four different groups of RNA viruses, including filamentous viruses from the Ipomovirus (Potyviridae), Crinivirus (Closteroviridae) and Carlavirus (Betaflexiviridae) genera, and icosahedral viruses from the Torradovirus genus (Secoviridae) [5]. Two of these groups were identified in the RNA metagenomic library, with high amino acid identities to known viruses including two criniviruses (Lettuce chlorosis virus (LCV) and Tomato chlorosis virus (TCV)) and a carlavirus (CpMMV). TCV has previously been reported from tomato in Florida [40]. However, this is the first evidence documenting the presence of CpMMV in the United States and LCV in the eastern United States. The remaining virus-like sequences identified in the RNA metagenomic library have low amino acid identities (<45%) with their top matches in the database. These novel viral sequences are most similar to groups that are not known to be carried by whiteflies and encompass divergent species, including viruses classified in three different families, as well as unclassified viruses (Table 2). It remains to be determined if these RNA viral sequences indeed represent novel whitefly-transmitted plant viruses, viruses infecting the whiteflies themselves, or simply transient viruses picked up by the whiteflies through feeding. Nevertheless, the detection of novel RNA viral sequences with weak similarities to known plant pathogens suggests that there are RNA plant viruses that have not yet been described.
Discovery of the carlavirus CpMMV in Florida
The VEM approach led to the detection of the first whitefly-transmitted carlavirus (CpMMV Florida) in North America. This virus has been reported in a wide range of geographical areas including Africa, India, Asia, the Middle East, and South America, where it can infect and negatively impact important food crops [19]–[22], [41]–[43]. The host range of the CpMMV Florida isolate includes members of the Chenopodiaceae and Fabaceae families that have been previously reported as either natural or experimental hosts for CpMMV isolates from different regions. Many of the susceptible hosts did not show any visible signs of infection, which is similar to other CpMMV isolates [18], [21], [22], [44]. Asymptomatic infection by CpMMV may contribute to the high prevalence and transmission of this virus in some crop fields [18]. Although CpMMV Florida infects hosts that have been previously reported for CpMMV isolates from other regions, there are differences in the host ranges of isolates from different locations and crops. For example, CpMMV isolates from Israel and Ghana are able to infect representative members of the Solanaceae family [18], [45], whereas CpMMV Florida and isolates from Brazil, Thailand, and Southern Iran did not infect any members of this family [22], [41], [44]. Despite these differences, CpMMV isolates cannot be distinguished by electron microscopy or serologically, and no stringent comparative tests have been performed to determine if host range differences are sufficient to distinguish strains, pathotypes, or species [20].
A recent report published during the review process of this manuscript described six novel CpMMV isolates from Brazil that are closely related to CpMMV Florida (see below) [46]. Based on experimental hosts tested in both studies, most of the Brazilian isolates shared a similar host range with CpMMV Florida, mainly infecting members of the Fabaceae. However, two of the isolates (CpMMV∶BR∶BA∶02 and CpMMV∶BR∶GO∶01∶1) were able to infect a member of the Solanaceae (Nicotiana glutinosa), which CpMMV Florida failed to infect. CpMMV∶BR∶BA∶02 and CpMMV∶BR∶GO∶01∶1 share 98% and 93% genome-wide pairwise identity with CpMMV Florida, respectively. Therefore, CpMMV isolates may exhibit different host ranges despite high nucleotide identities. Further research examining both the full genomes and experimental host range of all available CpMMV isolates will provide insight into which genetic differences explain differences in host range.
The CpMMV genome recovered from whiteflies collected in 2007 is 99% identical to the genome isolated from vegetation in the same field four years later. The genome organization of the CpMMV Florida isolate is similar to other carlaviruses. Full genome comparisons between CpMMV isolates from Florida and Ghana, which were the only genomes available at the time when the analyses were performed, clearly show that the CP shares a much greater degree of identity (95%) than non-structural proteins in the genomes (∼62%) (Fig. 1). This conservation of structural proteins but high variability in non-structural genes has been noted by other authors investigating CpMMV partial sequences [21], [31] and a recent report investigating six new CpMMV genomes from Brazil which share 93–99% genome-wide pairwise identity with the Florida isolate [46]. Despite the higher genetic distance among non-structural genes, CpMMV Florida exhibits all the core and functional domains that have been identified for these proteins. According to the ICTV classification criteria for members of Betaflexiviridae (formerly known as Flexiviridae), distinct species share <72% nucleotide or <80% amino acid identity between the entire CP or replication genes [47]. Due to the difference between the CP and non-structural identities, these criteria present a problem for properly classifying CpMMV isolates. Based on the replication gene, CpMMV Florida represents a novel species, whereas CP identities suggest it does not.
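The conflict between the two demarcation criteria can be made concrete with a short sketch. The Python below is only an illustration: it assumes the identities quoted above (95% for the CP, roughly 62% for the non-structural proteins) are amino acid identities, and the function names `is_distinct_species` and `classify_by_gene` are hypothetical helpers, not part of any published tool. Because 95% lies above and 62% lies below both the 72% and the 80% thresholds, the outcome does not depend on which threshold is applied.

```python
# Species demarcation thresholds for Betaflexiviridae as quoted in the text [47].
NT_THRESHOLD = 72.0  # percent nucleotide identity
AA_THRESHOLD = 80.0  # percent amino acid identity


def is_distinct_species(identity_percent, threshold):
    """Return True when the identity falls below the demarcation threshold."""
    return identity_percent < threshold


def classify_by_gene(identities, threshold=AA_THRESHOLD):
    """Map each gene name to the verdict its identity value implies."""
    return {
        gene: "distinct species" if is_distinct_species(pct, threshold) else "same species"
        for gene, pct in identities.items()
    }


# Approximate Florida vs. Ghana identities taken from the text (Fig. 1);
# treating them as amino acid identities is an assumption of this sketch.
florida_vs_ghana = {"CP": 95.0, "non_structural": 62.0}

verdicts = classify_by_gene(florida_vs_ghana)
for gene, verdict in verdicts.items():
    print(f"{gene}: {verdict}")
```

The two genes return opposite verdicts for the same pair of isolates, which is the classification problem described above.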
Unfortunately, most of the available CpMMV sequences only encompass the 3′ end of the genomes, containing the CP and/or NB since available carlavirus-specific degenerate PCR assays target this region [17], [48]. Most studies have based their classifications on ELISA, microscopy, and/or degenerate PCR targeting the coat protein and, thus, many viruses previously identified as CpMMV may actually represent different strains or even species. CP identities among available sequences only range from 88–99% whereas NB identities range from 56–99% (Tables S3 and S4). Furthermore, full genome comparisons between the Florida and Ghana isolates suggest that NB identities reflect identities for non-structural proteins (Fig. 1). Therefore, our analysis suggests that the NB may be more representative of overall genomic similarities than the CP for classification purposes at the strain level. Based on this observation, it was expected that the CpMMV Florida isolate would be closely related to isolates from Brazil and Puerto Rico. This was confirmed with the recently released CpMMV genome sequences from Brazil which seem to belong to the same viral strain as CpMMV Florida. Due to the scarcity of genomic data regarding currently classified CpMMV isolates and strong CP similarities, we have named the Florida isolate CpMMV; however, this classification may need to be revised as more genomic and infectivity data become available. It was also concluded that the Brazilian CpMMV isolates represented a new strain belonging to the same viral species as CpMMV based on their close phylogenetic relationship with the CpMMV isolate from Ghana. Interestingly, recombination analyses of CpMMV genomes from Brazil and Ghana suggested that low pairwise identities in the RdRP compared to the rest of the genome may be partly due to recombination events in this ORF [46]. Due to the occurrence of recombination events, it may be necessary to use full genomic sequences for classification of carlavirus strains.
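The CP and NB identity ranges discussed above are pairwise values. As a rough illustration of how such a value can be obtained from two sequences that have already been aligned, the following Python sketch counts matching positions and skips any column containing a gap. This is a generic calculation, not necessarily the procedure used for Tables S3 and S4; the function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Columns in which either sequence has a gap are excluded from the count.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned peptides (hypothetical, not real CpMMV data):
# six comparable columns, five of them identical.
toy_identity = percent_identity("MKT-LLA", "MKTALIA")
print(round(toy_identity, 1))
```

How gapped columns are treated is a design choice that changes the result, so identity values from different studies are only comparable when the same convention is used.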
The biological significance of the highly conserved CP in CpMMV isolates is largely unknown; however, it may result from selective pressure of transmission driven by the whitefly vector. CpMMV isolates have been reported to be transmitted both non-persistently and semi-persistently, as there are no latent periods for virus transmission in whiteflies but retention times vary from minutes to hours [20]. Regardless, even non-persistent transmission in other vector-virus systems can depend on specific interactions between vector and virus [49]. Therefore, the diversity of CpMMV populations may be constrained by the need to retain specific interactions between the CP and the whitefly vector [8].
Concluding remarks
The results of this study demonstrate that current understanding of RNA viruses found in B. tabaci whiteflies is not nearly as complete as that of DNA viruses, which appear to be restricted to the genus Begomovirus. Our findings indicate that the range of RNA viruses found in whiteflies may not be limited to the four groups that have been described since viral sequences with low amino acid identities likely represent novel groups. In addition to expanding current knowledge regarding viruses that can be captured by whiteflies, the VEM approach allowed us to expand the geographical range of CpMMV by documenting its presence in North America. Genomic comparisons among CpMMV genomes suggest that the classification criteria for carlaviruses need to be reevaluated, especially when considering variants that cannot be serologically distinguished. Future studies need to establish criteria to classify CpMMV variants and pathotypes by comparing genomic features, symptoms, infectivity and host range.
Supporting Information
Acknowledgments
We thank the staff at the NGS Core at the Scripps Research Institute for providing DNA sequencing support.
Funding Statement
This study was funded through the National Science Foundation Biodiversity Inventories program (DEB-1025915) and the USDA – Tropical-Subtropical Agriculture Research (T-STAR) program. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Hogenhout SA, Ammar ED, Whitfield AE, Redinbaugh MG (2008) Insect vector interactions with persistently transmitted viruses. Annual Review of Phytopathology 46: 327–359.
- 2. De Barro PJ, Liu SS, Boykin LM, Dinsdale AB (2011) Bemisia tabaci: a statement of species status. Annual Review of Entomology 56: 1–19.
- 3. Oliveira MRV, Henneberry TJ, Anderson P (2001) History, current status, and collaborative research projects for Bemisia tabaci. Crop Protection 20: 709–723.
- 4. Zarate SI, Kempema LA, Walling LL (2007) Silverleaf whitefly induces salicylic acid defenses and suppresses effectual jasmonic acid defenses. Plant Physiology 143: 866–875.
- 6. Anderson PK, Cunningham AA, Patel NG, Morales FJ, Epstein PR, et al. (2004) Emerging infectious diseases of plants: pathogen pollution, climate change and agrotechnology drivers. Trends in Ecology & Evolution 19: 535–544.
- 7. King AMQ, Lefkowitz E, Adams MJ, Carstens EB, editors (2012) Virus Taxonomy: Ninth Report of the International Committee on Taxonomy of Viruses. San Diego: Academic Press.
- 8. Power AG (2000) Insect transmission of plant viruses: a constraint on virus variability. Current Opinion in Plant Biology 3: 336–340.
- 9. Harrison BD, Robinson DJ (1999) Natural genomic and antigenic variation in whitefly-transmitted geminiviruses (Begomoviruses). Annual Review of Phytopathology 37: 369–398.
- 10. Jones RAC (2009) Plant virus emergence and evolution: Origins, new encounter scenarios, factors driving emergence, effects of changing world conditions, and prospects for control. Virus Research 141: 113–130.
- 11. Polston JE, Cohen L, Sherwood TA, Ben-Joseph R, Lapidot M (2006) Capsicum species: Symptomless hosts and reservoirs of Tomato yellow leaf curl virus. Phytopathology 96: 447–452.
- 12. Seal S, van den Bosch F, Jeger M (2006) Factors influencing begomovirus evolution and their increasing global significance: Implications for sustainable control. Critical Reviews in Plant Sciences 25: 23–46.
- 13. Ng TFF, Duffy S, Polston JE, Bixby E, Vallad GE, et al. (2011) Exploring the diversity of plant DNA viruses and their satellites using vector-enabled metagenomics on whiteflies. PLoS ONE 6.
- 14. Schmieder R, Lim Y, Rohwer F, Edwards R (2010) TagCleaner: Identification and removal of tag sequences from genomic and metagenomic datasets. BMC Bioinformatics 11: 341.
- 15. Altschul SF, Madden TL, Schaffer AA, Zhang JH, Zhang Z, et al. (1997) Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Research 25: 3389–3402.
- 16. Huson DH, Mitra S, Ruscheweyh HJ, Weber N, Schuster SC (2011) Integrative analysis of environmental sequences using MEGAN4. Genome Research 21: 1552–1560.
- 17. Gaspar JO, Belintani P, Almeida AMR, Kitajima EW (2008) A degenerate primer allows amplification of part of the 3′-terminus of three distinct carlavirus species. Journal of Virological Methods 148: 283–285.
- 18. Brunt AA, Kenten RH (1973) Cowpea mild mottle, a newly recognized virus infecting cowpeas (Vigna-Unguiculata) in Ghana. Annals of Applied Biology 74: 67–74.
- 19. Brito M, Fernández-Rodríguez T, Garrido M, Mejías A, Romano M, et al. (2012) First report of Cowpea mild mottle carlavirus on yardlong bean (Vigna unguiculata subsp. sesquipedalis) in Venezuela. Viruses 4: 3804–3811.
- 20. Jeyanandarajah P, Brunt AA (1993) The natural occurrence, transmission, properties and possible affinities of Cowpea mild mottle virus. Journal of Phytopathology-Phytopathologische Zeitschrift 137: 148–156.
- 21. Naidu RA, Gowda S, Satyanarayana T, Boyko V, Reddy AS, et al. (1998) Evidence that whitefly-transmitted cowpea mild mottle virus belongs to the genus Carlavirus. Archives of Virology 143: 769–780.
- 22. Tavasoli M, Shahraeen N, Ghorbani SH (2009) Serological and RT-PCR detection of Cowpea mild mottle carlavirus infecting soybean. Journal of General and Molecular Virology 1: 007–011.
- 23. Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, et al. (2012) The Pfam protein families database. Nucleic Acids Research 40: D290–D301.
- 24. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32: 1792–1797.
- 25. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, et al. (2011) MEGA5: Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular Biology and Evolution 28: 2731–2739.
- 26. Simossis VA, Heringa J (2005) PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information. Nucleic Acids Research 33: W289–W294.
- 27. Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, et al. (2010) New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Systematic Biology 59: 307–321.
- 28. Abascal F, Zardoya R, Posada D (2005) ProtTest: Selection of best-fit models of protein evolution. Bioinformatics 21: 2104–2105.
- 29. Anisimova M, Gascuel O (2006) Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Systematic Biology 55: 539–552.
- 30. Spear A, Sisterson MS, Yokomi R, Stenger DC (2010) Plant-feeding insects harbor double-stranded RNA viruses encoding a novel proline-alanine rich protein and a polymerase distantly related to that of fungal viruses. Virology 404: 304–311.
- 31. Menzel W, Winter S, Vetten HJ (2010) Complete nucleotide sequence of the type isolate of Cowpea mild mottle virus from Ghana. Archives of Virology 155: 2069–2073.
- 32. Rozanov MN, Koonin EV, Gorbalenya AE (1992) Conservation of the putative methyltransferase domain: a hallmark of the ‘Sindbis-like’ supergroup of positive-strand RNA viruses. Journal of General Virology 73: 2129–2134.
- 33. Lawrence DM, Rozanov MN, Hillman BI (1995) Autocatalytic processing of the 223-kDa protein of blueberry scorch carlavirus by a papain-like proteinase. Virology 207: 127–135.
- 34. Gorbalenya AE, Koonin EV (1989) Viral proteins containing the purine NTP-binding sequence pattern. Nucleic Acids Research 17: 8413–8438.
- 35. Koonin EV (1991) The phylogeny of RNA-dependent RNA polymerases of positive-strand RNA viruses. Journal of General Virology 72: 2197–2206.
- 36. Morozov SY, Solovyev AG (2003) Triple gene block: modular design of a multifunctional machine for plant virus movement. Journal of General Virology 84: 1351–1366.
- 37. Martelli GP, Adams MJ, Kreuze JF, Dolja VV (2007) Family Flexiviridae: a case study in virion and genome plasticity. Annual Review of Phytopathology 45: 73–100.
- 38. Foster GD (1992) The structure and expression of the genome of carlaviruses. Research in Virology 143: 103–112.
- 39. Ng TFF, Willner DL, Lim YW, Schmieder R, Chau B, et al. (2011) Broad surveys of DNA viral diversity obtained through viral metagenomics of mosquitoes. PLoS ONE 6: e20579.
- 40. Wisler GC, Li RH, Liu HY, Lowry DS, Duffus JE (1998) Tomato chlorosis virus: a new whitefly-transmitted, Phloem-limited, bipartite closterovirus of tomato. Phytopathology 88: 402–409.
- 41. Almeida AMR, Piuga FF, Marin SRR, Kitajima EW, Gaspar JO, et al. (2005) Detection and partial characterization of a carlavirus causing stem necrosis of soybean in Brazil. Fitopatologia Brasileira 30: 191–194.
- 42. Chang C-A, Chien L-Y, Tsai C-F, Cheng YH, Lin Y-Y (2013) First report of Cowpea mild mottle virus in cowpea and French bean in Taiwan. Plant Disease 97: 1001.
- 43. Pardina PER, Arneodo JD, Truol GA, Herrera PS, Laguna IG (2004) First record of Cowpea mild mottle virus in bean crops in Argentina. Australasian Plant Pathology 33: 129–130.
- 44. Iwaki M, Thongmeearkom P, Prommin M, Honda Y, Hibi T (1982) Whitefly transmission and some properties of Cowpea mild mottle virus on soybean in Thailand. Plant Disease 66: 365–368.
- 45. Antignus Y, Cohen S (1987) Purification and some properties of a new strain of Cowpea mild mottle virus in Israel. Annals of Applied Biology 110: 563–569.
- 46. Zanardo L, Silva F, Bicalho A, Urquiza G, Lima A, et al. (2013) Molecular and biological characterization of Cowpea mild mottle virus isolates infecting soybean in Brazil and evidence of recombination. Plant Pathology DOI: 10.1111/ppa.12092
- 47. Adams MJ, Antoniw JF, Bar-Joseph M, Brunt AA, Candresse T, et al. (2004) The new plant virus family Flexiviridae and assessment of molecular criteria for species demarcation. Archives of Virology 149: 1045–1060.
- 48. Badge J, Brunt A, Carson R, Dagless E, Karamagioli M, et al. (1996) A carlavirus-specific PCR primer and partial nucleotide sequence provides further evidence for the recognition of cowpea mild mottle virus as a whitefly-transmitted carlavirus. European Journal of Plant Pathology 102: 305–310.
- 49. Gray SM, Banerjee N (1999) Mechanisms of arthropod transmission of plant and animal viruses. Microbiology and Molecular Biology Reviews 63: 128–148.