Data in Brief. 2024 Apr 17;54:110441. doi: 10.1016/j.dib.2024.110441

Characterization of complete mitogenome data of two flies (Diptera) as orchid pollinators from China

Jinrui He, Xiong Zhang, Xiaoyan Liu, Qingqing Li, Yinling Luo, Yan Luo
PMCID: PMC11067367  PMID: 38708295

Abstract

Diptera have important ecological functions. Many plants rely on Diptera for pollination, and these insects play an important role in co-evolution with plants. We describe the detailed characteristics of the complete mitogenome sequences of Desmometopa sabroskyi Brake, 2003 (Diptera: Milichiidae) and an unidentified species of Gampsocera (Diptera: Chloropidae), both of which are pollinators of orchid species. Sequences were assembled and annotated using the reference genomes of Phyllomyza sp. (OP612805) and Elachiptera insignis (OP612812) available in GenBank. The complete mitogenomes of D. sabroskyi and Gampsocera sp. are 15,841 bp and 16,036 bp in length, respectively. Both mitogenomes contain 37 genes, comprising 13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs), and two ribosomal RNA genes (rRNAs), together with one noncoding region (NCR). The mitogenome data will contribute to species identification, taxonomy, phylogenetics, and evolutionary analysis of Diptera.

Keywords: Mitochondrial DNA, Milichiidae, Desmometopa sabroskyi, Chloropidae, Gampsocera sp.


Specifications Table

Subject: Biological Sciences: Biodiversity; Entomology and Insect Science
Specific subject area: Diptera, Chloropidae, Milichiidae, Mitogenomics
Type of data:
  Table: gene annotations, base composition
  Figure: two flies, mitogenomic circular map, AT-rich region sequence, phylogenetic tree
  Fasta: mitogenome data
  Fastq: DNA sequence reads
  Data format: raw and analyzed
Data collection:
  DNA extraction and sequencing: Total DNA was extracted using the DNeasy Kit. The Illumina paired-end DNA library was constructed and sequenced on the Illumina NovaSeq 6000 in 150 bp paired-end mode (PE150).
  Mitogenome assembly and annotation: The raw data were filtered using Trimmomatic v0.30. The resulting high-quality data were uploaded to a supercomputing platform and assembled using GetOrganelle, and the results were examined using Bandage. Protein-coding genes (PCGs) and rRNA genes were predicted with the MITOS tools, and tRNA genes were identified with the tRNAscan-SE web server. A mitogenome map was drawn using the OGDRAW web server.
  Phylogenetic analyses: IQ-TREE and MrBayes were used to construct the phylogenetic trees using maximum-likelihood (ML) and Bayesian inference (BI) methods.
Data source location: Adult flies were collected in 2023 in Xishuangbanna, Yunnan, China. Desmometopa sabroskyi was collected at 100°33′5.41″ E, 22°5′7.56″ N, and Gampsocera sp. was collected at 101°38′ E, 21°8′ N. Specimens were deposited at the Center for Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences (contact: Yan Luo, luoyan@xtbg.org.cn) with voucher numbers HB20230225 and SM20230621.
Data accessibility:
  Repository name: NCBI BioProject
  Data identification number: PRJNA1045338 and PRJNA1070834
  Direct URL to data: https://www.ncbi.nlm.nih.gov/bioproject/1045338 and https://www.ncbi.nlm.nih.gov/bioproject/1070834
  Repository name: NCBI BioSample
  Data identification number: SAMN38441133 and SAMN39663932
  Direct URL to data: https://www.ncbi.nlm.nih.gov/biosample/38441133 and https://www.ncbi.nlm.nih.gov/biosample/39663932
  Repository name: NCBI SRA
  Data identification number: SRR27065045 and SRR27833488
  Direct URL to data: https://www.ncbi.nlm.nih.gov/sra/SRX22754512 and https://www.ncbi.nlm.nih.gov/sra/SRX23496836
  Repository name: NCBI GenBank
  Data identification number: OR854638 and PP232099
  Direct URL to data: https://www.ncbi.nlm.nih.gov/nuccore/OR854638 and https://www.ncbi.nlm.nih.gov/nuccore/PP232099

1. Value of Data

  • Diptera is one of the most diverse insect groups in the world [1]. As an important group of pollinators, Diptera play a significant role in the diversification of flowering plants [2]. Taxonomic research on Diptera is very challenging, and morphological characteristics alone are inadequate for resolving species delimitation and phylogenetic relationships [3,4]. Genetic information on Diptera is also scarce, which makes their diversity difficult to understand and hinders the study of flowering plant diversification and pollinator-plant interactions.

  • The data represent the first complete mitogenomes of Desmometopa sabroskyi (Milichiidae) and Gampsocera sp. (Chloropidae), which serve as effective pollinators of Bulbophyllum orchids in Yunnan, southwestern China.

  • The complete mitochondrial references for the Diptera species can provide accurate identification of the pollinator at the molecular level, clarify its taxonomic status, and increase knowledge of the diversity of pollinators for orchid species.

  • The complete mitogenomes may serve as a valuable resource for phylogenetic and evolutionary analysis of Milichiidae and Chloropidae (Diptera).

2. Background

As an important group of pollinators, Diptera play a significant role in the diversification of flowering plants [5]. It is estimated that over 70 families of Diptera are pollinators of at least 555 flowering plant species [6,7]. While observing the pollinators of Bulbophyllum orchids, we found two flies pollinating Bulbophyllum nigripetalum (Fig. 1A) and B. andersonii (Fig. 1B), respectively. Based on critical examination of morphological characters and evaluation of the literature, the fly pollinating B. nigripetalum was identified as Desmometopa sabroskyi Brake, 2003 (Diptera: Milichiidae) (Fig. 1C), and the fly pollinating B. andersonii as an undescribed species of the genus Gampsocera (Diptera: Chloropidae) (Fig. 1D).

Fig. 1.

Desmometopa sabroskyi pollinates Bulbophyllum nigripetalum and Gampsocera sp. pollinates Bulbophyllum andersonii. A: Desmometopa sabroskyi (white arrow) pollinating a Bulbophyllum nigripetalum flower; B: Gampsocera sp. (white arrow) pollinating a Bulbophyllum andersonii flower; C: Desmometopa sabroskyi carrying pollinia of Bulbophyllum nigripetalum; D: Gampsocera sp. carrying pollinia of Bulbophyllum andersonii.

3. Description of Data

The complete mitogenomes of D. sabroskyi and Gampsocera sp. are 15,841 bp and 16,036 bp in length, respectively (Table 1). The nucleotide composition of the D. sabroskyi mitogenome is 40.9 % A, 36.7 % T, 9 % G, and 13.3 % C, giving an AT content of 77.6 % (Table 1). Its tRNAs range from 60 to 72 bp in length, the 16S rRNA is 1301 bp long, and the 12S rRNA is 786 bp long. The control region (CR) is 999 bp long and AT-rich (90.7 %). The sequence shows a weakly positive AT skew (0.0545) and a negative GC skew (−0.1936). The nucleotide composition of the Gampsocera sp. mitogenome is 40.9 % A, 38.3 % T, 8.6 % G, and 12.2 % C, giving an AT content of 79.2 % (Table 1). Its tRNAs range from 62 to 72 bp in length, the 16S rRNA is 1302 bp long, and the 12S rRNA is 790 bp long. The CR is 1148 bp long and AT-rich (89.5 %; Table 1). The sequence shows a weakly positive AT skew (0.0333) and a negative GC skew (−0.1684).

Table 1.

Base composition and skewness of mitogenomes of Desmometopa sabroskyi and Gampsocera sp.

Species                Region      Size (bp)  A%    G%    T%    C%    A+T%  AT skew  GC skew
Desmometopa sabroskyi  Mitogenome  15,841     40.9  9     36.7  13.3  77.6  0.0545   −0.1936
                       PCGs        11,202     32.3  12.5  43.3  11.8  75.6  −0.1460  0.0289
                       tRNAs       1453       39    12.9  38    10.2  77    0.0126   0.1164
                       rRNAs       2087       37.8  12    44.2  6     82    −0.0789  0.3297
                       CR          999        48.6  3.6   42.1  5.6   90.7  0.0716   −0.2173

Gampsocera sp.         Mitogenome  16,036     40.9  8.6   38.3  12.2  79.2  0.0333   −0.1684
                       PCGs        11,227     33.8  11.8  43.7  10.7  77.5  −0.1282  0.0479
                       tRNAs       1478       39    12.7  38.5  9.8   77.5  0.0069   0.1265
                       rRNAs       2092       39.8  11.4  43.1  5.7   82.9  −0.0392  0.3351
                       CR          1148       49.1  3.8   40.4  6.6   89.5  0.0972   −0.2666
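
The skew values in Table 1 can be approximated from the percentage columns. The sketch below assumes the conventional definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C); the text does not state the formula, and the function names are illustrative only. Because the percentages are rounded to one decimal place, the results differ slightly from the tabulated skews, which were presumably computed from raw base counts.

```python
def at_skew(a: float, t: float) -> float:
    """AT skew under the assumed definition (A - T) / (A + T)."""
    return (a - t) / (a + t)


def gc_skew(g: float, c: float) -> float:
    """GC skew under the assumed definition (G - C) / (G + C)."""
    return (g - c) / (g + c)


# Whole-mitogenome base percentages copied from Table 1.
COMPOSITION = {
    "Desmometopa sabroskyi": {"A": 40.9, "G": 9.0, "T": 36.7, "C": 13.3},
    "Gampsocera sp.": {"A": 40.9, "G": 8.6, "T": 38.3, "C": 12.2},
}


def summarize(comp: dict) -> dict:
    """Return AT content and both skews for one base-composition record."""
    return {
        "at_content": round(comp["A"] + comp["T"], 1),
        "at_skew": at_skew(comp["A"], comp["T"]),
        "gc_skew": gc_skew(comp["G"], comp["C"]),
    }


if __name__ == "__main__":
    for species, comp in COMPOSITION.items():
        s = summarize(comp)
        print(f"{species}: A+T = {s['at_content']} %, "
              f"AT skew = {s['at_skew']:.4f}, GC skew = {s['gc_skew']:.4f}")
```

A positive AT skew means A is more frequent than T on the reported strand, and a negative GC skew means C is more frequent than G, which matches the signs reported for both whole mitogenomes.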

Both mitogenomes contain 37 genes, comprising 13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs), and two ribosomal RNA genes (rRNAs), together with a control region (CR) (Table 2, Fig. 2). In D. sabroskyi, 16 intergenic spacer regions were identified, with an average size of 7 bp and a total length of 115 bp; in Gampsocera sp., 17 intergenic spacer regions were identified, with an average size of 7 bp and a total length of 122 bp (Table 2).

Table 2.

Mitogenome organization of Desmometopa sabroskyi and Gampsocera sp.

Position and Size are in bp. Int: intergenic nucleotides following the gene (negative values indicate overlap with the next gene); Start and Stop: start and stop codons; Anti: anticodon.

                        Desmometopa sabroskyi                        Gampsocera sp.
Strand  Gene            Position       Size  Int  Start  Stop  Anti  Position       Size  Int  Start  Stop  Anti
H       tRNA-Ile        1–65           65    −3                GAT   1–66           66    6                 GAT
L       tRNA-Gln        63–131         69    10                TTG   73–141         69    −1                TTG
H       tRNA-Met        142–210        69    0                 CAT   141–209        69    0                 CAT
H       ND2             211–1233       1023  −2   ATT    TAA         210–1247       1038  7    ATT    TAG
H       tRNA-Trp        1232–1299      68    −8                TCA   1255–1323      69    −8                TCA
L       tRNA-Cys        1292–1354      63    0                 GCA   1316–1380      65    3                 GCA
L       tRNA-Tyr        1355–1419      65    1                 GTA   1384–1448      65    −2                GTA
H       COX1            1421–2956      1536  −5   CGA    TAA         1447–2980      1534  −1   TCG    T–
H       tRNA-Leu        2952–3017      66    0                 TAA   2980–3047      68    −1                TAG
H       COX2            3018–3705      688   0    ATG    T–          3047–3734      688   −1   ATG    T–
H       tRNA-Lys        3706–3776      71    −1                CTT   3734–3805      72    1                 CTT
H       tRNA-Asp        3776–3843      68    0                 GTC   3807–3872      66    0                 GTC
H       ATP8            3844–4005      162   −7   ATC    TAA         3873–4034      162   −7   ATT    TAA
H       ATP6            3999–4676      678   7    ATG    TAA         4028–4705      678   −1   ATG    TAA
H       COX3            4684–5472      789   12   ATG    TAA         4705–5496      792   20   ATG    TAA
H       tRNA-Gly        5485–5550      66    0                 TCC   5517–5582      66    0                 TCC
H       ND3             5551–5904      354   7    ATT    TAA         5583–5936      354   4    ATT    TAA
H       tRNA-Ala        5912–5976      65    −1                TGC   5941–6007      67    4                 TGC
H       tRNA-Arg        5976–6039      64    7                 CGG   6012–6073      62    7                 TCG
H       tRNA-Asn        6047–6110      64    0                 GTT   6081–6146      66    0                 GTT
H       tRNA-Ser        6111–6176      66    10                GCT   6147–6213      67    0                 TGA
H       tRNA-Glu        6187–6251      65    18                TTC   6214–6279      66    17                TTC
L       tRNA-Phe        6270–6335      66    0                 GAA   6297–6365      69    −2                GAA
L       ND5             6336–8070      1735  0    ATT    T–          6364–8099      1736  0    ATC    TA–
L       tRNA-His        8071–8134      64    −1                GTG   8100–8165      66    0                 GTG
L       ND4             8134–9474      1341  −7   ATG    TAA         8166–9504      1339  −1   ATG    T–
L       ND4L            9468–9764      297   2    ATG    TAA         9504–9794      291   2    ATG    TAA
H       tRNA-Thr        9767–9831      65    0                 TGT   9797–9862      66    0                 TGT
L       tRNA-Pro        9832–9896      65    2                 TGG   9863–9930      68    2                 TGG
H       ND6             9899–10,423    525   3    ATT    TAA         9933–10,457    525   3    ATT    TAA
H       CYTB            10,427–11,563  1137  1    ATG    TAA         10,461–11,597  1137  −1   ATG    TAA
H       tRNA-Ser        11,565–11,631  67    6                 TGA   11,597–11,664  68    16                GCT
L       ND1             11,638–12,595  958   10   TTG    T–          11,681–12,628  948   1    TTG    TAA
L       tRNA-Leu        12,606–12,665  60    10                TAG   12,630–12,696  67    5                 TAG
L       16S rRNA        12,676–13,976  1301  9                       12,702–14,003  1302  24
L       tRNA-Val        13,986–14,057  72    −1                TAC   14,028–14,098  71    0                 TAC
L       12S rRNA        14,057–14,842  786   0                       14,099–14,888  790   0
H       Control region  14,843–15,841  999   0                       14,889–16,036  1148  0
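
The coordinate columns of Table 2 can be cross-checked with simple arithmetic. The sketch below assumes two conventions that are not stated in the text but are consistent with the listed rows: size = end − start + 1, and the Int value gives the gap (or, if negative, the overlap) between a gene and the next one, so that the next start equals end + Int + 1. The `Feature` type and `check_features` helper are hypothetical names introduced for illustration; the five rows are the first five D. sabroskyi entries.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    name: str
    start: int
    end: int
    size: int
    intergenic: int  # "Int" column: spacer (+) or overlap (-) before the next gene


# First five D. sabroskyi rows of Table 2.
FEATURES = [
    Feature("tRNA-Ile", 1, 65, 65, -3),
    Feature("tRNA-Gln", 63, 131, 69, 10),
    Feature("tRNA-Met", 142, 210, 69, 0),
    Feature("ND2", 211, 1233, 1023, -2),
    Feature("tRNA-Trp", 1232, 1299, 68, -8),
]


def check_features(features: List[Feature]) -> List[str]:
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    for i, f in enumerate(features):
        # Assumed convention: coordinates are 1-based and inclusive.
        if f.end - f.start + 1 != f.size:
            problems.append(f"{f.name}: size {f.size} != {f.end - f.start + 1}")
        if i + 1 < len(features):
            expected_start = f.end + f.intergenic + 1
            nxt = features[i + 1]
            if nxt.start != expected_start:
                problems.append(
                    f"{nxt.name}: start {nxt.start} != expected {expected_start}"
                )
    return problems


def total_spacer_length(features: List[Feature]) -> int:
    """Sum of positive Int values, i.e. true intergenic spacers only."""
    return sum(f.intergenic for f in features if f.intergenic > 0)


if __name__ == "__main__":
    print(check_features(FEATURES) or "all listed rows are consistent")
    print("spacer bp in these rows:", total_spacer_length(FEATURES))
```

Running the same check over a full annotation is a quick way to catch transcription slips between the position, size, and intergenic columns.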

Fig. 2.

Circular maps of the assembled Gampsocera sp. and Desmometopa sabroskyi mitogenomes, each consisting of 13 protein-coding genes, 22 transfer RNA genes, and two ribosomal RNA genes. Different colors indicate different gene families, and the outer and inner rings represent the heavy and light strands, respectively. The darker and lighter gray areas in the inner circle represent the GC and AT contents, respectively.

Phylogenetic analysis showed that the taxa of Milichiidae and Chloropidae formed a well-supported clade. D. sabroskyi formed an independent lineage with strong support, and Gampsocera sp. was closely related to Cadrema minor within Chloropidae, also with strong support (Fig. 3). The mitogenome data indicate that Milichiidae is paraphyletic, as suggested by previous researchers [8].

Fig. 3.

Molecular phylogeny based on the concatenated nucleotide sequences of the protein-coding genes (PCGs) of 19 species in Ephydroidea or Carnoidea and 2 outgroups. The phylogenetic tree was inferred using Maximum Likelihood (ML) and Bayesian Inference (BI) methods, and the numbers above branches represent the bootstrap percentage (BP) of ML / posterior probability (PP) of BI. GenBank accession numbers are given after each species' scientific name. The positions of Gampsocera sp. and Desmometopa sabroskyi are marked with a solid square.

4. Experimental Design, Materials and Methods

4.1. Sample collection

Adult flies were collected in June 2023 at two sites in Xishuangbanna, Yunnan, China (100°33′5.41″ E, 22°5′7.56″ N; 101°38′ E, 21°8′ N). Specimens were deposited at the Center for Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences (contact: Yan Luo, luoyan@xtbg.org.cn) with voucher numbers HB20230225 and SM20230621.

4.2. DNA extraction and sequencing

Total DNA was extracted using the DNeasy Kit (Qiagen, Germany) according to the manufacturer's instructions. Illumina paired-end DNA libraries were constructed with the Illumina TruSeq DNA PCR-Free Prep kit (San Diego, CA, USA), following the manufacturer's standard library preparation procedure. The DNA libraries were sequenced on the Illumina NovaSeq 6000 high-throughput platform at Shanghai Personal Biotechnology Company (Shanghai, China).

4.3. Mitogenome assembly and annotation

The sequencing runs produced 28,787,148 reads (total bases: 4.3 G, size: 2.2 GB) for D. sabroskyi, with Q20 = 97.24 % and Q30 = 92.51 %, and 25,863,920 reads (total bases: 3.9 G, size: 1.7 GB) for Gampsocera sp., with Q20 = 97.09 % and Q30 = 94.75 %. All Q20 and Q30 values exceeded 90 %.

The raw data were filtered using Trimmomatic v0.30 [9], yielding high-quality data of 28,722,578 reads (total bases: 4.2 G, size: 1.9 GB) for D. sabroskyi and 25,602,128 reads (total bases: 3.7 G, size: 1.5 GB) for Gampsocera sp. These data were uploaded to a supercomputing platform and assembled using the GetOrganelle pipeline [10], and the results were examined using Bandage. Protein-coding genes (PCGs) and rRNA genes were predicted with the MITOS tools (http://mitos2.bioinf.uni-leipzig.de/index.py) [11], and tRNA genes were identified with the tRNAscan-SE web server [12]. A mitogenome map was drawn using the OGDRAW web server (https://chlorobox.mpimp-golm.mpg.de/OGDraw.html).
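
The effect of quality filtering can be summarized as the fraction of reads retained. The short sketch below uses only the read counts reported above; the `retention` helper and the dictionary layout are illustrative assumptions, not part of the authors' pipeline.

```python
# (raw reads, reads after filtering), as reported in the text.
READ_COUNTS = {
    "Desmometopa sabroskyi": (28_787_148, 28_722_578),
    "Gampsocera sp.": (25_863_920, 25_602_128),
}


def retention(raw: int, filtered: int) -> float:
    """Fraction of raw reads that survive filtering."""
    if raw <= 0:
        raise ValueError("raw read count must be positive")
    return filtered / raw


if __name__ == "__main__":
    for species, (raw, filtered) in READ_COUNTS.items():
        kept = retention(raw, filtered)
        print(f"{species}: {raw - filtered:,} reads removed, "
              f"{kept:.2%} retained")
```

For both samples roughly 99 % or more of the reads pass the filter, which is in line with the high Q20 and Q30 values of the raw data.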

4.4. Phylogenetic analysis

To validate the phylogenetic position of these flies, the PCG sequences of the mitogenomes of 23 Diptera species were used, comprising 11 species of Carnoidea, 10 species of Ephydroidea, and Chironomus tepperi Skuse, 1889 and Dixella aestivalis (Meigen, 1818) as outgroups. Phylogenetic trees were constructed using maximum-likelihood (ML) and Bayesian inference (BI) analyses. The ML tree was inferred with 1000 bootstrap replicates using IQ-TREE [13]. The BI tree was generated using MrBayes 3.2.4 [14], with the Markov chain Monte Carlo (MCMC) algorithm run for 2 × 10^6 generations and sampled once every 100 generations. The first 25 % of all generations were discarded, and the remaining samples were used to generate the majority-rule consensus tree and estimate the posterior probabilities.
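
The MCMC settings above determine how many sampled trees contribute to the consensus. The sketch below works through that arithmetic for a single chain; `mcmc_sample_counts` is a hypothetical helper, and the count ignores the initial state (generation 0), which some programs also record.

```python
from typing import Tuple


def mcmc_sample_counts(
    generations: int, sample_every: int, burnin_fraction: float
) -> Tuple[int, int, int]:
    """Return (total samples, discarded as burn-in, retained) for one chain."""
    if not 0.0 <= burnin_fraction < 1.0:
        raise ValueError("burnin_fraction must be in [0, 1)")
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


if __name__ == "__main__":
    # Settings reported in the text: 2 x 10^6 generations, sampled every 100,
    # with the first 25 % discarded.
    total, discarded, retained = mcmc_sample_counts(2_000_000, 100, 0.25)
    print(f"sampled: {total}, burn-in: {discarded}, retained: {retained}")
```

Under these settings each chain yields 20,000 samples, of which 5,000 are discarded and 15,000 are retained for the consensus tree and posterior probabilities.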

Limitations

Not applicable.

Ethics Statement

This study did not involve humans or animals. No approval or permission was necessary in this study for the sample collection.

CRediT authorship contribution statement

Jinrui He: Investigation, Data curation, Writing – original draft. Xiong Zhang: Investigation, Data curation, Writing – original draft. Xiaoyan Liu: Methodology, Writing – review & editing. Qingqing Li: Data curation, Writing – review & editing. Yinling Luo: Conceptualization, Methodology, Supervision, Writing – review & editing. Yan Luo: Conceptualization, Methodology, Supervision, Writing – review & editing.

Acknowledgments

We are grateful to Meng-Kai Li and Peng-Yue Ma from Xishuangbanna Botanical Garden, Chinese Academy of Sciences for their help in fieldwork and sampling.

This research was supported by the National Natural Science Foundation of China (grant numbers 32270225 and 32360306) and the 14th Five-Year Plan of Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences.

Declaration of Competing Interest

No potential conflict of interest was reported by the authors.

Contributor Information

Yinling Luo, Email: luoyinling@peu.edu.cn.

Yan Luo, Email: luoyan@xtbg.org.cn.

References

  • 1. Courtney G.W., Pape T., Skevington J.H., Sinclair B.J. Biodiversity of Diptera: science and society. 2017. doi: 10.1002/9781118945568.ch9.
  • 2. Raguso R.A. Don't forget the flies: dipteran diversity and its consequences for floral ecology and evolution. Appl. Entomol. Zool. 2020;55:1–7. doi: 10.1007/s13355-020-00668-9.
  • 3. Kanmiya K. A systematic study of the Japanese Chloropidae (Diptera). Mem. Entomol. Soc. Washington. 1983;11:1–377.
  • 4. Riccardi P.R., Amorim D.D.S. Phylogenetic relationships and classification of the Chloropinae of the world (Diptera: Chloropidae). Zool. J. Linn. Soc. 2020;190:889–941. doi: 10.1093/zoolinnean/zlaa007.
  • 5. Ssymank A., Kearns C.A., Pape T., Thompson F.C. Pollinating flies (Diptera): a major contribution to plant diversity and agricultural production. Biodiversity. 2008;9:86–89. doi: 10.1080/14888386.2008.9712892.
  • 6. Larson B.M.H., Kevan P.G., Inouye D.W. Flies and flowers: I. The taxonomic diversity of anthophiles and pollinators. Can. Entomol. 2001;133:439–465. doi: 10.4039/Ent133439-4.
  • 7. Christensen D.E. Fly pollination in the Orchidaceae. In: Arditti J., editor. Orchid Biology: Reviews and Perspectives VI. John Wiley & Sons; New York: 1994. pp. 415–454.
  • 8. Song N., Xi Y.Q., Yin X.M. Phylogenetic relationships of Brachycera (Insecta: Diptera) inferred from mitochondrial genome sequences. Zool. J. Linn. Soc. 2022;196:720–739. doi: 10.1093/zoolinnean/zlab125.
  • 9. Bolger A.M., Lohse M., Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
  • 10. Jin J.J., Yu W.B., Yang J.B., Song Y., dePamphilis C.W., Yi T.S., Li D.Z. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 2020;21:241. doi: 10.1186/s13059-020-02154-5.
  • 11. Bernt M., Donath A., Juhling F., Externbrink F., Florentz C., Fritzsch G., Putz J., Middendorf M., Stadler P.F. MITOS: improved de novo metazoan mitochondrial genome annotation. Mol. Phylogenet. Evol. 2013;69:313–319. doi: 10.1016/j.ympev.2012.08.023.
  • 12. Lowe T.M., Chan P.P. tRNAscan-SE On-line: integrating search and context for analysis of transfer RNA genes. Nucleic Acids Res. 2016;44:W54–W57. doi: 10.1093/nar/gkw413.
  • 13. Trifinopoulos J., Nguyen L., Haeseler A., Minh B. W-IQ-TREE: a fast online phylogenetic tool for maximum likelihood analysis. Nucleic Acids Res. 2016;44:W232–W235. doi: 10.1093/nar/gkw256.
  • 14. Ronquist F., Teslenko M., van der Mark P., et al. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst. Biol. 2012;61:539–542. doi: 10.1093/sysbio/sys029.

Articles from Data in Brief are provided here courtesy of Elsevier
