F1000Res. 2017 Oct 13;6:1835. [Version 1] doi: 10.12688/f1000research.12858.1

Updated mtCOI reference dataset for the Bemisia tabaci species complex

Laura M Boykin 1,a, Anders Savill 1, Paul De Barro 2
PMCID: PMC5680533  PMID: 29167738

Abstract

Members of the whitefly Bemisia tabaci species complex cause millions of dollars of damage globally and are considered among the world’s most invasive species. They are capable of causing extensive damage to major vegetable, grain legume and fiber crops. All members of the species complex are morphologically identical; therefore, data from the partial mitochondrial cytochrome oxidase subunit I (mtCOI) gene sequence have been used to identify the various species. The reference dataset currently in wide use is hosted on the CSIRO data portal. However, that reference set does not include newly added sequences (2013-2017), so an updated reference dataset is needed. All mtCOI data for the Bemisia tabaci species complex were downloaded from GenBank on 22 May 2017 and, after quality checking, a dataset of 1,071 unique sequences, 696 base pairs in length, was generated (https://doi.org/10.6084/m9.figshare.5437420.v1).

Keywords: species identification, whitefly, insect vector, mitochondrial cytochrome oxidase, DNA barcoding

Introduction

Members of the Bemisia tabaci (whitefly) species complex are among the world’s most devastating insect pests and cause billions of dollars (US) of damage each year, leaving farmers in the developing world food insecure (De Barro et al., 2011). Because the complex has at least 34 members, identification relies on the 657 bp portion at the 3’ end of the mitochondrial COI (mtCOI) gene (Boykin et al., 2012; Boykin et al., 2013). Identifying members of the complex correctly requires a curated reference dataset. In 2012, a reference mtCOI dataset was made available on the CSIRO data portal (De Barro & Boykin, 2012). Errors in the dataset were subsequently identified, so the dataset was updated on 15 May 2017 (http://doi.org/10.4225/08/591a4018dfca8) (De Barro & Boykin, 2017), but that update did not include sequences added to GenBank after 2012. Therefore, the dataset described herein represents the most up-to-date reference resource for members of the complex.

Methods

The CSIRO dataset (http://doi.org/10.4225/08/591a4018dfca8), updated on 15 May 2017, was used as the starting point. The existing records were updated to include host plant data. New records added after 2012 were then downloaded directly from GenBank on 22 May 2017. All downloaded data were treated as follows:

1) Data were classified with BLAST using the new CSIRO reference dataset.

2) Sequences that caused gaps in the alignment were removed.

3) Sequences that contained stop codons were removed.

4) Clustal Omega (Sievers & Higgins, 2014) was used for preliminary alignment, and the alignment was fine-tuned with MAFFT (Katoh & Standley, 2013).

5) Duplicate sequences were then removed using BBMap Dedupe (Bushnell, 2017).
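The stop-codon and duplicate-removal steps listed above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes ungapped sequences, a reading frame supplied by the caller, and the invertebrate mitochondrial genetic code (NCBI translation table 5), in which only TAA and TAG are stop codons. The deduplication shown here handles exact matches only, which is simpler than what BBMap Dedupe does. All function names are hypothetical.

```python
from typing import List, Tuple

Record = Tuple[str, str]  # (sequence name, ungapped nucleotide sequence)

# Assumption: invertebrate mitochondrial code (NCBI table 5),
# where TAA and TAG terminate translation and TGA encodes Trp.
INVERT_MITO_STOPS = frozenset({"TAA", "TAG"})


def has_stop_codon(seq: str, frame: int = 0) -> bool:
    """Return True if any complete codon in the given frame is a stop codon.

    The sequence is expected to be ungapped; the reading frame (0, 1 or 2)
    is an assumption the caller must supply.
    """
    bases = seq.upper()
    for i in range(frame, len(bases) - 2, 3):
        if bases[i:i + 3] in INVERT_MITO_STOPS:
            return True
    return False


def drop_exact_duplicates(records: List[Record]) -> List[Record]:
    """Keep the first record for each distinct sequence (case-insensitive)."""
    seen = set()
    unique = []
    for name, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((name, seq))
    return unique


def filter_records(records: List[Record], frame: int = 0) -> List[Record]:
    """Remove sequences with stop codons, then remove exact duplicates."""
    kept = [(n, s) for n, s in records if not has_stop_codon(s, frame)]
    return drop_exact_duplicates(kept)


# Toy input with made-up names and sequences, for illustration only.
example = [
    ("a", "ATGGCATTT"),  # no stop codon: kept
    ("b", "ATGTAAGCA"),  # contains TAA in frame 0: removed
    ("c", "atggcattt"),  # same sequence as "a": removed as a duplicate
    ("d", "ATGTGAGCA"),  # TGA is not a stop codon under table 5: kept
]
print(filter_records(example))
# prints [('a', 'ATGGCATTT'), ('d', 'ATGTGAGCA')]
```

Filtering on stop codons before deduplication keeps the second pass smaller, but the two passes are independent and could be run in either order.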

In addition, all MEAM2 sequences were removed as they have now been confirmed to be pseudogenes (Tay et al., 2017).
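As a rough illustration of how BLAST-based classification and the removal of one group could fit together, the sketch below parses BLAST+ tabular output (the 12 default columns of `-outfmt 6`), keeps the highest-scoring hit per query, and drops queries whose best hit belongs to an excluded group. The reference ID scheme (`Species|accession`), the function names, and the example rows are all assumptions made for illustration; the article does not describe how the authors labelled or filtered sequences.

```python
from typing import Dict, Iterable, List, Set, Tuple


def best_hits(tabular_lines: Iterable[str]) -> Dict[str, Tuple[str, float]]:
    """Map each query ID to (subject ID, bit score) of its best hit.

    Expects the default 12 tab-separated columns of BLAST+ -outfmt 6:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def species_of(subject_id: str) -> str:
    """Hypothetical naming scheme: reference IDs look like 'Species|accession'."""
    return subject_id.split("|", 1)[0]


def keep_queries(hits: Dict[str, Tuple[str, float]], excluded: Set[str]) -> List[str]:
    """Return query IDs whose best-hit species is not in `excluded`."""
    return [q for q, (subject, _) in hits.items() if species_of(subject) not in excluded]


# Made-up rows in -outfmt 6 layout; IDs and scores are placeholders.
rows = [
    "q1\tMEAM1|ref1\t99.5\t657\t3\t0\t1\t657\t1\t657\t0.0\t1190",
    "q1\tMED|ref2\t92.0\t657\t52\t0\t1\t657\t1\t657\t0.0\t950",
    "q2\tMEAM2|ref3\t99.8\t657\t1\t0\t1\t657\t1\t657\t0.0\t1205",
]
hits = best_hits(rows)
print(keep_queries(hits, {"MEAM2"}))
# prints ['q1']
```

Ranking by bit score is one common choice; ranking by percent identity over a minimum alignment length would be an equally reasonable alternative.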

Data availability

The data referenced by this article are under copyright with the following copyright statement: Copyright: © 2017 Boykin LM et al.

Figshare: Dataset 1. mtCOI reference data for species ID of Bemisia tabaci. DOI: 10.6084/m9.figshare.5437420 (Boykin et al., 2017)
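A downloaded copy of the dataset can be sanity-checked against the figures reported in the abstract. The sketch below assumes the file is distributed as an aligned FASTA file, which the article does not state, and the function names are hypothetical. It counts records, counts distinct sequences, and lists the sequence lengths present.

```python
from typing import Dict, Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA-formatted lines into (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)  # sequence may be wrapped over several lines
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def summarize(records: List[Tuple[str, str]]) -> Dict[str, object]:
    """Report record count, distinct-sequence count and observed lengths."""
    seqs = [seq.upper() for _, seq in records]
    return {
        "n_records": len(seqs),
        "n_unique": len(set(seqs)),
        "lengths": sorted({len(s) for s in seqs}),
    }


# Toy FASTA text with made-up headers and sequences.
toy = """>seq1
ACGT
ACGT
>seq2
ACGTACGA
>seq3
acgtacgt
""".splitlines()
print(summarize(parse_fasta(toy)))
# prints {'n_records': 3, 'n_unique': 2, 'lengths': [8]}
```

If the assumption about the file format holds, applying `summarize` to the published dataset should give 1,071 for both `n_records` and `n_unique` and a single length of 696, matching the numbers in the abstract.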

Funding Statement

The author(s) declared that no grants were involved in supporting this work.

[version 1; referees: 2 approved]

References

  1. Boykin LM, Armstrong KF, Kubatko L, et al.: Species delimitation and global biosecurity. Evol Bioinform Online. 2012;8:1–37. 10.4137/EBO.S8532
  2. Boykin LM, Bell CD, Evans G, et al.: Is agriculture driving the diversification of the Bemisia tabaci species complex (Hemiptera: Sternorrhyncha: Aleyrodidae)?: Dating, diversification and biogeographic evidence revealed. BMC Evol Biol. 2013;13:228. 10.1186/1471-2148-13-228
  3. Boykin L, Savill A, De Barro P: mtCOI reference data for species ID of Bemisia tabaci. figshare. 2017. 10.6084/m9.figshare.5437420
  4. Bushnell B: BBMap. 2017.
  5. De Barro P, Boykin LM: Global Bemisia dataset release version 31 Dec 2012. CSIRO. 2012. 10.4225/08/50EB54B6F1042
  6. De Barro P, Boykin LM: Global Bemisia dataset release version 15 May 2017. CSIRO. 2017. 10.4225/08/591a4018dfca8
  7. De Barro PJ, Liu SS, Boykin LM, et al.: Bemisia tabaci: a statement of species status. Annu Rev Entomol. 2011;56:1–19. 10.1146/annurev-ento-112408-085504
  8. Katoh K, Standley DM: MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–780. 10.1093/molbev/mst010
  9. Sievers F, Higgins DG: Clustal Omega, accurate alignment of very large numbers of sequences. Methods Mol Biol. 2014;1079:105–116. 10.1007/978-1-62703-646-7_6
  10. Tay WT, Elfekih S, Court LN, et al.: The trouble with MEAM2: Implications of pseudogenes on species delimitation in the globally invasive Bemisia tabaci (Hemiptera: Aleyrodidae) cryptic species complex. Genome Biol Evol. 2017. 10.1093/gbe/evx173
F1000Res. 2017 Nov 8. doi: 10.5256/f1000research.13935.r26987

Referee response for version 1

Sharad Saurabh 1, Manisha Mishra 2

Whitefly (Bemisia tabaci) is becoming a global hazard for crop and ornamental plants. Identifying the correct species always helps in implementing the best control strategy. In this regard, the effort made by Boykin et al. towards speedy and accurate identification of the B. tabaci species complex is very significant. Additionally, the quality-enriched dataset is also very useful for whitefly biologists working in the areas of evolution, mitochondrial genomics and crop management.

All the bioinformatics tools used to generate this refined dataset are well suited to such an analysis.

We have read this submission. We believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

F1000Res. 2017 Oct 23. doi: 10.5256/f1000research.13935.r26988

Referee response for version 1

Renate Krause Sakate 1

The updated mtCOI reference dataset for the Bemisia tabaci species complex will make a valuable contribution to research on the fast and accurate identification of members of the B. tabaci species complex based on the partial mitochondrial COI gene.

The high-quality data are easily accessible for download and cover whiteflies collected globally. A reliable and updated reference dataset is an essential tool that will help the scientific community to identify and classify this pest correctly, the first step in crop management against whiteflies.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

    Articles from F1000Research are provided here courtesy of F1000 Research Ltd
