Abstract
Bracon brevicornis is an ectoparasitoid of a wide range of larval-stage Lepidopterans, including several pests of important crops, such as the corn borer, Ostrinia nubilalis. It is also one of the earliest documented cases of complementary sex determination in Hymenoptera. Here, we present the linked-read-based genome of B. brevicornis, complete with an ab initio-derived annotation and protein comparisons with fellow braconids, Fopius arisanus and Diachasma alloeum. We demonstrate the potential of linked-read assemblies in exploring regions of heterozygosity and search for structural and homology-derived evidence of the complementary sex determiner gene (csd).
Keywords: 10X Genomics, Braconinae, complementary sex determination, microsynteny, feminizer, parasitoid, parasitic wasp
1. Introduction
Bracon brevicornis (Wesmael) is a gregarious ectoparasitoid of various Lepidoptera larvae, including many important pests, and is considered a cosmopolitan species [1,2]. In the past, B. brevicornis has been classified under the genus Habrobracon [3], Microbracon [4], or classified as one species with Habrobracon/Bracon hebetor [5]; however, recent research shows that B. brevicornis and B. hebetor are genetically two distinct species [6]. In the field, B. brevicornis has shown potential as a biological control agent against important pest species in stored corn stalks, such as Ostrinia nubilalis and Sesamia cretica [7], or against the coconut moth, Opisinia arenosella [2]. In the laboratory, B. brevicornis attacks a wide range of larval host such as Ephesthia kuehniella, Galleria spp., and Spodoptera spp. [8].
Work on B. brevicornis has included both laboratory and semi-field set-ups to determine both its efficacy as a biological control agent as well as its suitability as a study system. There are several studies on the biology of B. brevicornis, including on population growth potential [9], their host range [8], interspecific competition [2], clutch size and fitness [10], mate choice [11,12], diet [1], and efficacy [7].
Within a phylogenetic perspective, B. brevicornis falls within the subfamily Braconinae, the largest of the cyclostome-forming braconid wasps [13]. The presence of a cyclostome (round mouthpart) is a defining feature within braconid wasps, as it represents an unresolved evolutionary and systematic question: is the cyclostome a derived trait within certain branches, or an ancestral trait that has been lost in others [13]? Within the Braconinae, there have been multiple switches from ectoparasitism to endoparasitism and vice versa, and this combination of cyclostome and endoparasitism has been described as a “controversial topic” by braconid researchers and taxonomists [13]. These systematic issues are far from being resolved, and more genomic data would be useful for future phylogenetic analyses [13]. Yet, a representative genome for the Braconinae is currently lacking. As previously stated, B. brevicornis is an ectoparasitoid, and its position within a family that contains both types of parasitism lifestyles holds promise for further phylogenetic comparisons.
As a member of the order Hymenoptera, B. brevicornis has a haplodiploid sex determination system where males develop from unfertilized eggs and females develop from fertilized eggs [14,15,16]. From a genetic perspective, B. brevicornis belongs to a genus where sex determination and diploid male production have been widely studied (B. hebetor [17], B. brevicornis [3], B. serinopae [18], reviewed in [19]; B. spec. near hebetor [20]; B. variator [21]). Indeed, the first description of the complementary sex determination (CSD) mechanism was provided for B. hebetor (= B. juglandis [22], reviewed in [23]), and recent work on B. brevicornis and polyploidy studies include diploid male fitness as well as ploidy-dependent mate choice behavior [11].
While straightforward to detect phenotypically through the formation of diploid males following inbreeding [14,19], the molecular mechanism underlying CSD has thus far only been resolved to a small level of detail in the honeybee Apis mellifera (L.) (Hymenoptera: Apidae), with the identification of the complementary sex determiner (csd) gene [24,25]. Heterozygosity at this gene leads to female development, while hemi- and homozygous individuals develop into haploid and diploid males, respectively. Therefore, inbreeding often leads to diploid male production in species with a CSD mechanism as it increases homozygosity. Csd is a duplication of feminizer (fem), a transformer (tra) ortholog [26] that is conserved across many insect orders as part of the sex determination cascade [27]. When heterozygous, csd initiates the female-specific splicing of fem, which then autoregulates its own female-specific splicing, ultimately resulting in female development [26,28]. Within the Hymenoptera and specifically the bees and ants, more duplications of tra/fem have been identified in species that are presumed to have CSD [27], but these tra/fem duplications have not been analyzed for potential heterozygosity. There has been some debate on the origin of tra/fem duplications. One theory states that the duplication of fem to produce csd occurred before the divergence of ants and bees, and that the paralogs evolved in concerted evolution [29,30], while the other theory states that all fem duplications are independent events [31,32]. This shows that additional hymenopteran genomes are necessary to understand the evolutionary history of tra/fem duplications and identify the genes underlying CSD. However, an assembled genome is usually haploid as areas of heterozygosity are collapsed in the final stages of assembly, diminishing the potential to find the gene(s) underlying CSD by its most notable feature: heterozygosity. Yet recent advances in sequencing and analysis gave us the ability to view heterozygous regions—known as “phases” in diploid assemblies—within a genome, which allow us to investigate the genes potentially involved in CSD.
Here, we report on the whole-genome sequencing of a pool of females from an isolated B. brevicornis strain using 10X Genomics technology that relies on linked-read sequencing (10X Genomics Inc., Pleasanton, CA, USA). Due to their long history of genetic isolation during laboratory rearing, individuals in this strain are assumed to have a high level of homozygosity, whereas a csd locus would retain its heterozygosity in the females. The 10X Genomics technology allows for generating phased data in which allelic variants can be identified after assembly. High-molecular weight DNA is partitioned into small droplets containing a unique barcode and adapter in such a way that only a few DNA molecules are present within each droplet.
Within each droplet (Gel Bead-in-Emulsion, “GEM”) the DNA is used as template for random priming and elongation resulting in many short DNA fragments sharing the same unique barcode. These DNA fragments are converted into a library that can be sequenced on an Illumina sequence platform. In the assembly step, the reads originating from the same fragment are organized by barcode and put together into synthetic linked-read fragments. Importantly, it is nearly impossible that two fragments with opposing allelic variants are together in the same droplet [33], though smaller genomes are a challenge on their own. This technique therefore allowed us to identify potential csd candidates in the female-derived B. brevicornis genome after sequencing by studying the phased data containing the different haplotypes. Moreover, as B. brevicornis is a potential biological control agent of several pests, the availability of a full genome may provide effective ways to study and improve this species to grow it into an established biological control agent for Lepidopteran pests.
2. Materials and Methods
2.1. Species Description and General Rearing
Individuals of B. brevicornis were taken from the laboratory colony L06. The colony was initiated in 2006 from naturally parasitized O. nubilalis larvae collected in maize fields near Leipzig, Germany. Species identification was first carried out by Matthias Schöller and Cornelis van Achterberg based on morphological characteristics [34]. Since collection, parasitoids have been reared on late instar larvae of the Mediterranean flour moth, E. kuehniella [11]. The species identity of strain L06 was recently revalidated based on molecular data and is entirely separate from its congeneric B. hebetor [6].
2.2. DNA Extraction
Immediately following emergence, 100 to 120 female wasps were flash frozen in liquid nitrogen and ground with a mortar and pestle. Genomic DNA was extracted using a protocol modified from Chang, Puryear, and Cairney [35]. Modifications include adding 300 μL β-mercaptoethanol (BME) to extraction buffer just before use. Instead of 10M LiCl, 0.7 volume isopropanol (100%) was added to the initial supernatant, after which it was divided into 1.5 mL Eppendorf tubes as 1 mL aliquots for subsequent extractions. The initial centrifugation step occurred at a slower rate and for a longer period of time to adjust for machine availability. Final pellets were dissolved in 50 μL autoclaved MQ H2O and recombined at the end of the extraction process (1.0 mL). DNA concentration was measured with an Invitrogen Qubit 2.0 fluorometer using the dsDNA HS Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA) with final assessments for DNA quality, amount, and fragment size confirmed via BioAnalyzer 2100 (Agilent, Santa Clara, CA, USA).
2.3. 10XGenomics Library Preparation and Sequencing
As the genome of B. brevicornis is relatively small for the scale of the 10X platform, there is a higher risk of overlapping fragments within single GEMs. In order to reduce this risk, genomic DNA of a larger and previously analyzed genome (Tomato, Solanum lycopersicon (L.) (Solanaceae), commercial variety Heinz 1607) [29] was used as “carrier DNA”. DNA extraction of S. lycopersicon followed the protocol of Hosmani et al. [36]. The DNA of both B. brevicornis and S. lycopersicon was pooled in a 1:4 molar ratio.
One nanogram of this pooled DNA was used for 10X Genomics linked-read library preparation following the Chromium Genome Reagent Kits Version 1 User Guide (CG-00022) (10XGenomics, Pleasanton, CA, USA). Barcoded linked-read DNA fragments were recovered for final Illumina library construction (Illumina, San Diego, CA, USA). The library was used for 2 × 150 bp paired-end sequencing on one lane of an Illumina HiSeq 2500 at the business unit Bioscience of Wageningen University and Research (Wageningen, The Netherlands). Sequencing data were then used for basecalling and subsequent demultiplexing using Longranger (v2.2.2) (10X Genomics) (command—mkfastq), yielding 212,910,509 paired-end reads with a read length of 150 bp.
2.4. Assembly
To filter sequence data from tomato carrier DNA, 23 bp (16 bp GEM + 7 bp spacer) was removed from forward reads and all reads were subsequently mapped to an in-house high quality reference assembly of the Heinz genome using BWA-MEM v0.7.17 [37]. Using samtools v1.9 [38], all unaligned read pairs (-F = 12) were extracted and labelled non-Heinz. The assembly of the non-Heinz labelled read set was performed with Supernova assembler v2.1.0 (10X Genomics), using default settings including commands for both pseudohap (--style = pseudohap) and pseudohap2 (--style = pseudohap2) outputs [33]. These commands determine the output from Supernova, the first being the final scaffold output (pseudohap), while the second is the so-called “parallel pseudohaplotype” (pseudohap2) scaffolds that represent areas of divergence or phases [33]. Phasing is flattened in the pseudohap output by selecting the region with higher mapping coverage, whereas in the pseudohap2 output is differentiated by “.1” and “.2” at the end of each scaffold name to denote phasing, though not all scaffolds are phased at this point due to lack of divergence during assembly.
To verify whether there were no Heinz leftovers in the assembly, minimap2 v2.17-r941 [39] was used to align the assembly against the Heinz reference assembly. Further examination on presence of possible non-B. brevicornis scaffolds, i.e., bacterial scaffolds from sample microbiome, was performed with BlobTools (v1.0) [40], relying on megaBLAST against the NCBI NT-NR database [41] (2018-11-19) (max_target_seqs = 1, max_hsps = 1, evalue = 1 × 10−25) for taxonomical classification and BWA-MEM mapping of reads against scaffolds for coverage statistics. Reads mapping only against “Arthropoda” classified scaffolds were then extracted and used for a final k-mer analysis using jellyfish v2.1.1 (-C m = 21 –s = 2,000,000,000) [42] and GenomeScope [43] to infer heterozygosity.
Assembly completeness was determined using BUSCO (v3.0.2) with the insect_odb9 ortholog set and the fly training parameter [44] while assembly statistics were determined using QUAST [45]. The aforementioned pseudohap2 scaffolds were used in csd analysis, while the pseudohap scaffolds are now the assembly used for annotation.
2.5. Ab Initio Gene Finding and Protein Comparison
The coding sequences of two additional braconids (members of the subfamily Opiinae, and similar to the Braconinae belonging to the cyclostome subgroup [13,46]) were used for gene prediction and protein comparisons: Fopius arisanus (Sonan) (Hymenoptera: Braconidae) and Diachasma alloeum (Muesebeck) (Hymenoptera: Braconidae). Both sets of coding sequences were retrieved from the NCBI Assembly Database, version ASM8063v1 for F. arisanus and version Dall2.0 for D. alloeum [41,47,48].
For gene prediction, Augustus (v2.5.5) was first used to predict genes from the B. brevicornis assembly [49]. Using BLAST, coding sequences of F. arisanus were set as a query to the genome of B. brevicornis using default parameters (except minIdentity = 50) [50]. The result was converted into a hints file that was used to predict the genes of B. brevicornis using Nasonia vitripennis (Walker) (Hymenoptera: Pteromalidae) as the species parameter in Augustus (--species = nasonia –extrinsiccCfgFile = extrinsic.E.cfg).
After prediction, the protein sequences were retrieved and compared to both F. arisanus and D. alloeum (version Dall2.0) using Proteinortho (v6.0, -p = blastp, -e = 0.001) [51]. From the orthology grouping generated by Proteinortho, gene names could be allocated to the predicted genes. Lengths of both of these B. brevicornis genes and the orthologs of F. arisanus and D. alloeum were retrieved using samtools for comparison [38]. The mean relative length of genes was retrieved by dividing the lengths of all aorthologous amino acid sequences between the two relevant species and then calculating the mean over these. Errors within the annotation related to genome submission and validation were corrected with manual annotation of exons (three cases) and removal of two predicted genes that were more than 50% ambiguous nucleotides.
2.6. In Silico Identification of Feminizer as a Putative csd Locus
The pseudohap2 files were deduplicated using the dedupe tool within BBTools (sourceforge.net/projects/bbmap/) (ac = f) to remove all parallel pseudohaplotypes that were complete duplicates as these scaffolds were not heterozygous. The remainder of the set contained both scaffolds that previously had a duplicate, as well as solitary scaffolds that did not have a partner scaffold. These unique scaffolds were removed using the “filter by name” tool in BBTools, leaving 258 scaffolds, or 129 pairs of pseudohap2 scaffolds. Pairs were pairwise aligned in CLC Genomics Workbench 12 (Qiagen, Hilden, Germany) using default settings (gap open cost = 10, gap extension cost = 1, end gap cost = free, alignment = very accurate).
A local tBLASTn search against the entire B. brevicornis assembly was performed using the Apis mellifera Feminizer protein (NP_001128300) as query in Geneious Prime 2019.1.3 (http://www.geneious.com, [52]). The protein of gene “g7607” (locus tag = BBRV_LOCUS33129) was used in an NCBI BLASTp against the nr database with default settings [41,50]. Next, a region stretching from ~10 Kbp upstream and downstream of the first and last tBLASTn hit in scaffold 12, respectively, was annotated using a hidden Markov model (HMM) gene finder plus similar protein-based gene prediction (FGENESH+, Softberry, http://www.softberry.com/) with Nasonia vitripennis tra (NP_001128299) and N. vitripennis for the specific gene-finding parameters [53]. Only this combination of settings resulted in a full-length annotation from TSS to poly-A with seven exons. The resulting protein prediction was used in a BLASTp search with default settings against the nr database. To annotate the potential fem duplication, a stretch of ~10 Kbp directly upstream of the annotated putative fem was again annotated using FGENESH+ (Softberry) with Nasonia vitripennis tra (NP_001128299) and N. vitripennis for the specific gene-finding parameters [53]. The predicted annotation contained five exons but lacked the last coding segment with stop codon. A protein alignment was made in Geneious Prime 2019.1.3 with A. mellifera csd (ABU68670) and fem (NP_001128300); N. vitripennis tra (XP_001604794) and B. brevicornis putative fem and B. brevicornis putative fem duplicate (fem1), using MAFFT v7.450 with the following settings: Algorithm = auto, Scoring matrix = BLOSUM62, Gap open penalty = 1.53, Offset value = 0.123 [54,55].
2.7. Microsynteny Analysis
A microsynteny analysis was achieved by comparing the arrangement of a set of homologous genes directly upstream and downstream of tra or fem in A. mellifera and N. vitripennis using a combination of the online tool SimpleSynteny [56] and tBLASTn searches using default settings in Geneious Prime. The scaffolds containing fem (A. mellifera, scaffold CM000059.5, 13.2 Mbp in length), tra (N. vitripennis, scaffold NW_001820638.3, 3.7 Mbp in length) or the putative fem (B. brevicornis, scaffold 12, 4.5 Mbp in length) were extracted from their respective genomes (Apis: GCA_000002195.1_Amel_4.5_genomic, Nasonia: nvi_ref_Nvit_2.1, Bracon: B. brevicornis assembly from this study) and searched with protein sequence from the following genes: tra (GeneID: 00121203), LOC100121225, LOC100678616, LOC100680007 originating from N. vitripennis; fem (GeneID:724970), csd (GeneID:406074), LOC408733, LOC551408, LOC724886 originating from A. mellifera. The advanced settings for SimpleSynteny were as follows: BLAST E-value Threshold = 0.01, BLAST Alignment type = Gapped, Minimum Query Coverage Cutoff = 1%, Circular Genome Mode = Off. If the gene was not found within the extracted scaffold, it was searched for in the full genome assembly. For the image settings, Gene Display Mode = Project Full-Length Gene. This generated image was used together with results from the tBLASTn searches as template to draw the final figure. The final figure that we present in the Results and Discussion section depicts ~0.9 Mbp of genomic region for all three species.
2.8. Data Availability
Raw sequence data for B. brevicornis after removal of carrier DNA and contamination, as well as the assembly, can be found in the EMBL-EBI European Nucleotide Archive (ENA) under BioProject PRJEB35412. Both the assembly file (.fasta) (https://doi.org/10.6084/m9.figshare.12674189.v2) and the complete annotation file (.gff) (https://doi.org/10.6084/m9.figshare.12073911.v2) are available in a separate repository. Contaminated pseudohap scaffolds are available for download alongside the two pseudohap2 FASTA files, more details are provided at https://doi.org/10.17026/dans-xn6-pjm8.
3. Results
3.1. Genome Assembly
A total of 172 ng of B. brevicornis DNA was extracted, which was then reduced to 1 ng/μL for library preparation. Sequencing of the Heinz diluted library resulted in a total yield of 54 Gbp of data (corrected for 10X Genomics 23 bp segment of forward reads). Mapping against the Heinz genome assembly showed a mapping percentage of 84.8%. There was a total of 30,278,915 unmapped pairs, comprising ~8.39 Gbp of data. This corresponds to the 4:1 ratio between Heinz and B. brevicornis DNA in the library. Further scaffold decontamination with BlobTools resulted in a separation of the assembly into B. brevicornis scaffolds and microbiome scaffolds. The final assembled genome is ~123 Mbp in size, comprised of 353 scaffolds. This is similar to the projected physical genome size of 133 Mbp ([57], flow cytometry). The contig N50 of the assembly is 6,121,327 bp, while k-mer analysis of the B. brevicornis-only read set showed an expected haploid genome length of ~115 Mbp (105 Mbp unique, 10 Mbp repeat) and a heterozygosity of ~0.54%. Peak coverage was 27x. The disparity in projected physical genome size and assembled genome size may be due to the decontamination process potentially removing extremely repetitive regions, as ~10.9 Mbp of the removed scaffolds returned either no hits or were classified as undefined at the taxa level. These scaffolds were clearly separated in the blobplot due to their high coverage. The potential effects are negligible, as the BUSCO analysis indicates a completeness of 98.7% (single orthologs 97.0%, duplicate orthologs 1.7%).
3.2. Ab Initio Gene Finding and Protein Comparison
In total, 12,686 genes were predicted, with an average coding sequence length of 529.86 amino acids. The number of genes correspond well to those found in F. arisanus (11,775) and D. alloeum (13,273), the two closest relatives of B. brevicornis for which public data are available. Proteinortho analysis resulted in 7660 three-way orthology groups (7830 B. brevicornis genes), while 362 orthology groups contained proteins of B. brevicornis and F. arisanus (382 B. brevicornis genes), and 451 groups contained B. brevicornis and D. alloeum genes (479 B. brevicornis genes). A large number of orthology groups (2492) had no B. brevicornis genes, while 3995 predicted genes remain ungrouped.
Compared to F. arisanus, the mean relative length of predicted B. brevicornis genes was 1.016, while the mean relative length for the two- and three-way orthology groups was 0.996. Similar results were obtained for comparisons to D. alloeum, where mean relative length for B. brevicornis genes was 1.011, and 0.988 for the two- and three-way orthology groups. Furthermore, the pairwise lengths of all these proteins are fairly similar (Figure 1).
3.3. Identification of a Putative Feminizer Ortholog and Duplication Event
After deduplicating the similar parallel pseudohaplotype files, 6706 scaffolds in total, the remainder of the set contained 3420 scaffolds, of which 3286 scaffolds were solitary and did not have a counterpart pseudohap2 for comparison. Some had a previous duplicate removed in the deduplication, while others never had a partner scaffold in the first place. These unique scaffolds were removed, leaving 258 scaffolds, or 129 pairs of pseudohap2 scaffolds. These putatively heterozygous scaffolds were good candidates to search for potential csd loci as these are presumed to be heterozygous in females.
So far, a csd gene has been sequenced only in species of bees of the genus Apis, and it is highly polymorphic, even within subspecies [58]. It is located adjacent to the more conserved feminizer (fem) [26], and we therefore started with localizing feminizer in the genome. As feminizer (or its ortholog transformer, tra) was not identified in the ab-initio annotation, we used a local tBLASTn search to find fem in the assembly. Four hits with E-value from 5.86 × 10−4 to 8.59 × 10−8 were found in scaffold 12. Searching the annotation using part of the tBLASTn result shows that it is annotated as “g7607” (locus tag = BBRV_LOCUS33129) which gave a first hit with protein O-glucosyltransferase 2 (Diachasma alloeum) after a BLASTp search, and no fem or tra hits were found. A closer inspection showed that “g7607” is annotated as fusion protein with the N-terminal part resembling fem and the C-terminal part putatively encoding O-glucosyltransferase 2. Next, we used FGENESH+ to re-annotate the genomic region, resulting in a full-length putative B. brevicornis feminizer (Bbfem) ortholog containing seven exons (Figure 2). We found that the two fem/tra signature domains in Hymenoptera, the Hymenoptera domain [59] and CAM domain (putative autoregulatory domain) [60], are present in the putative fem ortholog, but are also duplicated upstream of putative Bbfem. A second manual re-annotation step showed that a partial fem-duplicate is encoded directly upstream of putative Bbfem containing five exons (Figure 2), which we denote here as Bbfem1 as suggested by Koch et al. [31]. The entire region encoding both Bbfem and Bbfem1 has 64.5% pairwise identity (Figure 2). The coding region of Bbfem has 83.2%, and Bbfem1 has 79.3% pairwise identity.
A protein alignment showed that the full-length putative Bbfem, as well as Bbfem1, are highly similar to each other and both contain all known fem/tra domains (Figure 3).
Bbfem1 lacks a notably long first Arginine/Serine (RS)-rich region which is present only in Bbfem (124–153aa), but it otherwise appears to encode for a full-length protein. The csd-specific hypervariable domain (Figure 3, purple text; [24]) is not present in Bbfem nor in Bbfem1. Therefore, the gene name has been updated as “g7607 putative Bbfem-Bbfem1 csd” in the official annotation.
3.4. Synteny Analysis of Putative Fem Encoding Region
We compared the orthologous gene arrangement of a number of genes up- and downstream of N. vitripennis tra and A. mellifera fem and csd, with the genomic organization of the Bbfem region (Figure 4).
N. vitripennis LOC100680007 is present in the tra/fem containing scaffolds of all three genomes, while A. mellifera LOC408733 has both translocated closer to Nasonia tra and to a different scaffold in B. brevicornis. N. vitripennis LOC100121225 and LOC100678616 are encoded in opposing directions in both A. mellifera and N. vitripennis but are both downstream of tra in N. vitripennis and upstream of fem and csd in A. mellifera. There is no match for both genes in B. brevicornis. A. mellifera LOC724886 and LOC551408 are encoded in opposing directions with the same orientation in both N. vitripennis and A. mellifera but are reversed in B. brevicornis and downstream of Bbfem and Bbfem1 while they are upstream of csd and fem in A. mellifera. In N. vitripennis, both genes are not located in the tra containing scaffold but in another scaffold indicating that this region has undergone chromosomal rearrangements.
4. Discussion
Here, we present the genome of the braconid wasp Bracon brevicornis, a parasitoid wasp that not only has biological control applications, but also offers potential as a study system for future analyses into braconid phylogenetics and gene evolution. With no previous genomes available for the subfamily Braconinae, the most specious of the braconid wasps, the resources and investigations presented here fill this gap. Our linked-read library, assisted by carrier DNA of S. lycopersicon, has resulted in a highly contiguous, very complete assembly, comprised of just 353 scaffolds and 12,686 genes. This gene count is similar to related species, and in further protein length comparisons, the proteins are highly similar. This indicates that the predicted genes are highly complete, a necessary feature for any future phylogenetic comparisons between species or families.
We utilized the 10X Genomics linked-read approach to obtain pseudohaploid information that would allow us to search for potential csd loci in silico. As a substantial number of scaffolds were putatively heterozygous, we used the notion that in A. mellifera, csd is located adjacent to fem [26] to limit our search for csd candidates. We manually annotated a putative B. brevicornis fem and a partial Bbfem duplicate that is highly similar, and both genes encode all known tra/fem protein domains (Figure S1) [59]. Both genes are in a small region that is highly heterozygous, especially when compared to the remainder of the scaffold, which would suggest true heterozygosity and not assembly error, but also when compared to the level of heterozygosity in the other pseudohap2 scaffolds. While the 10X Genomics platform has been discontinued for de novo genome assembly, recent approaches may offer a similar solution (see [61]).
Our synteny analysis showed low synteny between B. brevicornis, A. mellifera, and N. vitripennis with the translocation of LOC408733 (A. mellifera) and the absence of LOC100121225 and LOC100678616 (N. vitripennis) in the B. brevicornis genome region. It is known that genomic regions encoding sex determination genes are dynamic in nature, showing both duplications and translocations [62]. Tra/fem duplications have been shown in CSD systems before, most notably in A. mellifera where a fem gene duplication event resulted in it becoming a csd locus [26,28], but also in ants and bumblebees, where many independent duplications of fem with varying orientations are observed [30]. In addition, in non-CSD systems, tra duplications have been observed [27,63,64]. Although there is some debate on whether fem paralogs originated due to a single duplication event and functions as csd [29,30], or evolved multiple times independently and may have other functions [31,32], we suggest that the Bbfem paralog, Bbfem1, is a promising csd gene candidate in B. brevicornis. However, in-depth analyses are required to verify this, including in-depth sequencing of individual females, gene expression analyses, and further phylogenetic studies with other CSD Hymenopterans, all of which are now possible with an assembled, annotated, and published genome.
Acknowledgments
We would like to acknowledge Jetske de Boer for information on B. brevicornis genome size, Martin Hasselmann for discussion on honeybee csd, and Elzemiek Geuverink for discussions on tra/fem duplicates in Hymenoptera. We acknowledge the helpful comments of three anonymous reviewers.
Author Contributions
Conceptualization, K.B.F., B.A.P., A.T., B.J.Z. and E.C.V.; methodology, K.B.F., A.C., J.v.d.H., R.N., F.F.M.B., E.S., A.T. and E.C.V.; project administration, B.A.P.; supervision, B.A.P., A.T., B.J.Z. and E.C.V.; visualization, J.v.d.H. and E.C.V.; writing—original draft, K.B.F., A.C., J.v.d.H., R.N. and E.C.V.; writing—review and editing, K.B.F., B.A.P., J.v.d.H., R.N., E.S., A.T., B.J.Z. and E.C.V. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement no. 641456.
Conflicts of Interest
The authors declare no conflict of interest.
Footnotes
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Temerak S.A. Longevity of Bracon brevicornis [Hym: Braconidae] adults as influenced by nourishment on artificial and natural foods. Entomophaga. 1983;28:145–150. doi: 10.1007/BF02372138. [DOI] [Google Scholar]
- 2.Venkatesan T., Jalali S.K., Srinivasamurthy K. Competitive interactions between Goniozus nephantidis and Bracon brevicornis, parasitoids of the coconut pest Opisina arenosella. Int. J. Pest. Manag. 2009;55:257–263. doi: 10.1080/09670870902914155. [DOI] [Google Scholar]
- 3.Speicher B.R., Speicher K.G. The Occurrence of Diploid Males in Habrobracon brevicornis. Am. Nat. 1940;74:379–382. doi: 10.1086/280904. [DOI] [Google Scholar]
- 4.Narayanan E.S., Angalet G.W., Subba Rao B.R., D’Souza G.I. Effect of refrigeration of the pupæ of microbracon brevicornis wesm. on the pigmentation of the adult. Nature. 1954;173:503–504. doi: 10.1038/173503b0. [DOI] [Google Scholar]
- 5.Puttarudriah M., Basavanna G.P.C. A study on the identity of bracon hebetor say and bracon brevicornis wesmael (hymenoptera: Braconidae) Bull. Entomol. Res. 1956;47:183–191. doi: 10.1017/S0007485300046617. [DOI] [Google Scholar]
- 6.Kittel R.N., Maeto K. Revalidation of Habrobracon brevicornis stat. rest. (Hymenoptera: Braconidae) Based on the CO1, 16S, and 28S Gene Fragments. J. Econ. Entomol. 2019;112:906–911. doi: 10.1093/jee/toy368. [DOI] [PubMed] [Google Scholar]
- 7.Kares E.A., El-Sappagh I.A., Ebaid G.H., Sabra I.M. Efficacy of releasing Bracon brevicornis wesm. (Hymenoptera: Braconidae) for controlling hibernated Ostrinia nubilalis (Hübner) and Sesamia cretica led. larvae in stored corn stalks. Egypt. J. Biol. Pest Control. 2010;20:155–159. [Google Scholar]
- 8.Temerak S.A. Host preferences of the parasitoid Bracon brevicornis Wesmael (Hym., Braconidae) and host sensitivity to its venom. Z. Angew. Entomol. 1983;96:37–41. doi: 10.1111/j.1439-0418.1983.tb03638.x. [DOI] [Google Scholar]
- 9.Srinivasan T. Chandrikamohan Population growth potential of Bracon brevicornis Wesmael (Braconidae: Hymenoptera): A life table analysis. Acta Phytopathol. Entomol. Hung. 2017;52:123–130. doi: 10.1556/038.52.2017.010. [DOI] [Google Scholar]
- 10.Villacañas de Castro C., Thiel A. Resource-Dependent Clutch Size Decisions and Size-Fitness Relationships in a Gregarious Ectoparasitoid Wasp, Bracon brevicornis. J. Insect Behav. 2017;30:454–469. doi: 10.1007/s10905-017-9632-2. [DOI] [Google Scholar]
- 11.Thiel A., Weeda A.C. Haploid, diploid, and triploid—Discrimination ability against polyploid mating partner in the Parasitic Wasp, Bracon brevicornis (Hymenoptera: Braconidae) J. Insect Sci. 2014;14:1–7. doi: 10.1093/jisesa/ieu153. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Thiel A., Weeda A.C., de Boer J.G., Hoffmeister T.S. Genetic incompatibility drives mate choice in a parasitic wasp. Front. Zool. 2013;10:43. doi: 10.1186/1742-9994-10-43. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Chen X., van Achterberg C. Systematics, Phylogeny, and Evolution of Braconid Wasps: 30 Years of Progress. Annu. Rev. Entomol. 2019;64:335–358. doi: 10.1146/annurev-ento-011118-111856. [DOI] [PubMed] [Google Scholar]
- 14.Cook J.M., Crozier R.H. Sex determination and population biology in the hymenoptera. Trends Ecol. Evol. 1995;10:281–286. doi: 10.1016/0169-5347(95)90011-X. [DOI] [PubMed] [Google Scholar]
- 15.Heimpel G.E., de Boer J.G. Sex Determination in the Hymenoptera. Annu. Rev. Entomol. 2008;53:209–230. doi: 10.1146/annurev.ento.53.103106.093441. [DOI] [PubMed] [Google Scholar]
- 16.Cook J.M. Sex determination in the Hymenoptera: A review of models and evidence. Heredity. 1993;71:421–435. doi: 10.1038/hdy.1993.157. [DOI] [Google Scholar]
- 17.Whiting P.W., Whiting A.E. Diploid males from fertilized eggs in hymenoptera. Science (80-) 1925;62:437. doi: 10.1126/science.62.1611.437. [DOI] [PubMed] [Google Scholar]
- 18.Clark A.M., Bertrand H.A., Smith R.E. Life Span Differences between Haploid and Diploid Males of Habrobracon serinopae after Exposure as Adults to X Rays. Am. Nat. 1963;97:203–208. doi: 10.1086/282271. [DOI] [Google Scholar]
- 19.Van Wilgenburg E., Driessen G., Beukeboom L.W. Single locus complementary sex determination in Hymenoptera: An “unintelligent” design? Front. Zool. 2006;3:1. doi: 10.1186/1742-9994-3-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Holloway A.K., Heimpel G.E., Strand M.R., Antolin M.F. Survival of diploid males in Bracon sp. near hebetor (Hymenoptera: Braconidae) Ann. Entomol. Soc. Am. 1999;92:110–116. doi: 10.1093/aesa/92.1.110. [DOI] [Google Scholar]
- 21.Thiel A. (University of Bremen, FB02, Institute of Ecology, Population and Evolutionary Ecology Group, Bremen, Germany) Personal communication. 2019.
- 22.Whiting P.W. Multiple alleles in sex determination of Habrobracon. J. Morphol. 1940;66:323–355. doi: 10.1002/jmor.1050660208. [DOI] [Google Scholar]
- 23.Antolin M.F., Ode P.J., Heimpel G.E., O’Hara R.B., Strand M.R. Population structure, mating system, and sex-determining allele diversity of the parasitoid wasp Habrobracon hebetor. Heredity. 2003;91:373–381. doi: 10.1038/sj.hdy.6800337. [DOI] [PubMed] [Google Scholar]
- 24.Beye M., Hasselmann M., Fondrk M.K., Page R.E., Omholt S.W. The gene csd is the primary signal for sexual development in the honeybee and encodes an SR-type protein. Cell. 2003;114:419–429. doi: 10.1016/S0092-8674(03)00606-8. [DOI] [PubMed] [Google Scholar]
- 25.Hasselmann M., Beye M. Signatures of selection among sex-determining alleles of the honey bee. Proc. Natl. Acad. Sci. USA. 2004;101:4888–4893. doi: 10.1073/pnas.0307147101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Hasselmann M., Gempe T., Schiøtt M., Nunes-Silva C.G., Otte M., Beye M. Evidence for the evolutionary nascence of a novel sex determination pathway in honeybees. Nature. 2008;454:519–522. doi: 10.1038/nature07052. [DOI] [PubMed] [Google Scholar]
- 27.Geuverink E., Beukeboom L.W. Phylogenetic Distribution and Evolutionary Dynamics of the Sex Determination Genes doublesex and transformer in Insects. Sex. Dev. 2014;8:38–49. doi: 10.1159/000357056. [DOI] [PubMed] [Google Scholar]
- 28.Gempe T., Hasselmann M., Schiøtt M., Hause G., Otte M., Beye M. Sex determination in honeybees: Two separate mechanisms induce and maintain the female pathway. PLoS Biol. 2009;7:e1000222. doi: 10.1371/journal.pbio.1000222. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Schmieder S., Colinet D., Poirié M. Tracing back the nascence of a new sex-determination pathway to the ancestor of bees and ants. Nat. Commun. 2012;3:895. doi: 10.1038/ncomms1898. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Privman E., Wurm Y., Keller L. Duplication and concerted evolution in a master sex determiner under balancing selection. Proc. R. Soc. B Biol. Sci. 2013;280:20122968. doi: 10.1098/rspb.2012.2968. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Koch V., Nissen I., Schmitt B.D., Beye M. Independent evolutionary origin of femparalogous genes and complementary sex determination in hymenopteran insects. PLoS ONE. 2014;9:e91883. doi: 10.1371/journal.pone.0091883. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Biewer M., Schlesinger F., Hasselmann M. The evolutionary dynamics of major regulators for sexual development among Hymenoptera species. Front. Genet. 2015;6:124. doi: 10.3389/fgene.2015.00124. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Weisenfeld N.I., Kumar V., Shah P., Church D.M., Jaffe D.B. Direct determination of diploid genome sequences. Genome Res. 2017;27:757–767. doi: 10.1101/gr.214874.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Wührer B. ((AMW Nützlinge GmbH, Pfungstadt, Germany)). Personal communication. 2019.
- 35.Chang S., Puryear J., Cairney J. A simple and efficient method for isolating RNA from pine trees. Plant Mol. Biol. Rep. 1993;11:113–116. doi: 10.1007/BF02670468. [DOI] [Google Scholar]
- 36.Hosmani P.S., Flores-Gonzalez M., van de Geest H., Maumus F., Bakker L.V., Schijlen E., van Haarst J., Cordewener J., Sanchez-Perez G., Peters S., et al. An improved de novo assembly and annotation of the tomato reference genome using single-molecule sequencing, Hi-C proximity ligation and optical maps. bioRxiv. 2019:767764. doi: 10.1101/767764. [DOI] [Google Scholar]
- 37.Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv. 20131303.3997 [Google Scholar]
- 38.Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Li H. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–3100. doi: 10.1093/bioinformatics/bty191. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Laetsch D.R., Blaxter M.L. BlobTools: Interrogation of genome assemblies [version 1; peer review: 2 approved with reservations] F1000Research. 2017;6:1287. doi: 10.12688/f1000research.12232.1. [DOI] [Google Scholar]
- 41.Acland A., Agarwala R., Barrett T., Beck J., Benson D.A., Bollin C., Bolton E., Bryant S.H., Canese K., Church D.M., et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2014;42:8–13. doi: 10.1093/nar/gkt1146. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Marçais G., Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27:764–770. doi: 10.1093/bioinformatics/btr011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Vurture G.W., Sedlazeck F.J., Nattestad M., Schatz M.C., Gurtowski J., Underwood C.J., Vurture G.W., Fang H. GenomeScope: Fast reference-free genome profiling from short reads. Bioinformatics. 2017;33:2202–2204. doi: 10.1093/bioinformatics/btx153. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Simão F.A., Waterhouse R.M., Ioannidis P., Kriventseva E.V., Zdobnov E.M. BUSCO: Assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31:3210–3212. doi: 10.1093/bioinformatics/btv351. [DOI] [PubMed] [Google Scholar]
- 45.Gurevich A., Saveliev V., Vyahhi N., Tesler G. QUAST: Quality assessment tool for genome assemblies. Bioinformatics. 2013;29:1072–1075. doi: 10.1093/bioinformatics/btt086. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Li X.Y., van Achterberg C., Tan J.C. Revision of the subfamily Opiinae (Hymenoptera, Braconidae) from Hunan (China), including thirty-six new species and two new genera. Zookeys. 2013;268:1–186. doi: 10.3897/zookeys.326.5911. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Geib S.M., Liang G.H., Murphy T.D., Sim S.B. Whole Genome Sequencing of the Braconid Parasitoid Wasp Fopius arisanus, an Important Biocontrol Agent of Pest Tepritid Fruit Flies. G3 Genes Genomes Genet. 2017;7:2407–2411. doi: 10.1534/g3.117.040741. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Tvedte E.S., Walden K.K.O., McElroy K.E., Werren J.H., Forbes A.A., Hood G.R., Logsdon J.M., Feder J.L., Robertson H.M. Genome of the parasitoid wasp diachasma alloeum, an emerging model for ecological speciation and transitions to asexual reproduction. Genome Biol. Evol. 2019;11:2767–2773. doi: 10.1093/gbe/evz205. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Stanke M., Morgenstern B. AUGUSTUS: A web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Res. 2005;33:W465–W467. doi: 10.1093/nar/gki458. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T.L. BLAST+: Architecture and applications. BMC Bioinform. 2009;10:421. doi: 10.1186/1471-2105-10-421. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Lechner M., Findeiß S., Steiner L., Marz M., Stadler P.F., Prohaska S.J. Proteinortho: Detection of (Co-)orthologs in large-scale analysis. BMC Bioinform. 2011;12:124. doi: 10.1186/1471-2105-12-124. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Kearse M., Moir R., Wilson A., Stones-Havas S., Cheung M., Sturrock S., Buxton S., Cooper A., Markowitz S., Duran C., et al. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28:1647–1649. doi: 10.1093/bioinformatics/bts199. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Solovyev V. Statistical Approaches in Eukaryotic Gene Prediction. In: Balding D., Cannings C., Bishop M., editors. Handbook of Statistical Genetics. Wiley-Interscience; Hoboken, NJ, USA: 2007. pp. 97–159. [Google Scholar]
- 54.Katoh K. MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30:3059–3066. doi: 10.1093/nar/gkf436. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Katoh K., Standley D.M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 2013;30:772–780. doi: 10.1093/molbev/mst010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Veltri D., Wight M.M., Crouch J.A. SimpleSynteny: A web-based tool for visualization of microsynteny across multiple species. Nucleic Acids Res. 2016;44:W41–W45. doi: 10.1093/nar/gkw330. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.De Boer J.G. ((Department of Terrestrial Ecology, Netherlands Institute of Ecology, Wageningen, The Netherlands)). Personal communication. 2019.
- 58.Wang Z., Liu Z., Wu X., Yan W., Zeng Z. Polymorphism analysis of csd gene in six Apis mellifera subspecies. Mol. Biol. Rep. 2012;39:3067–3071. doi: 10.1007/s11033-011-1069-7. [DOI] [PubMed] [Google Scholar]
- 59.Verhulst E.C., van de Zande L., Beukeboom L.W. Insect sex determination: It all evolves around transformer. Curr. Opin. Genet. Dev. 2010;20:376–383. doi: 10.1016/j.gde.2010.05.001. [DOI] [PubMed] [Google Scholar]
- 60.Hediger M., Henggeler C., Meier N., Perez R., Saccone G., Bopp D. Molecular Characterization of the Key Switch F Provides a Basis for Understanding the Rapid Divergence of the Sex-Determining Pathway in the Housefly. Genetics. 2010;184:155–170. doi: 10.1534/genetics.109.109249. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Meier J.I., Salazar P.A., Kučka M., Davies R.W., Dréau A., Aldás I., Power O.B., Nadeau N.J., Bridle J.R., Rolian C., et al. Haplotype tagging reveals parallel formation of hybrid races in two butterfly species. bioRxiv. 2020 doi: 10.1101/2020.05.25.113688. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Dechaud C., Volff J.N., Schartl M., Naville M. Sex and the TEs: Transposable elements in sexual development and function in animals. Mob. DNA. 2019;10:42. doi: 10.1186/s13100-019-0185-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Geuverink E., Kraaijeveld K., van Leussen M., Chen F., Pijpe J., Linskens M.H.K., Beukeboom L.W., van de Zande L. Evidence for involvement of a transformer paralogue in sex determination of the wasp Leptopilina clavipes. Insect Mol. Biol. 2018;27:780–795. doi: 10.1111/imb.12522. [DOI] [PubMed] [Google Scholar]
- 64.Jia L.-Y., Xiao J.-H., Xiong T.-L., Niu L.-M., Huang D.-W. The transformer genes in the fig wasp C eratosolen solmsi provide new evidence for duplications independent of complementary sex determination. Insect Mol. Biol. 2016;25:191–201. doi: 10.1111/imb.12210. [DOI] [PubMed] [Google Scholar]