Nucleic Acids Research. 2023 Nov 6;52(D1):D1024–D1032. doi: 10.1093/nar/gkad956

SilkMeta: a comprehensive platform for sharing and exploiting pan-genomic and multi-omic silkworm data

Kunpeng Lu, Yifei Pan, Jianghong Shen, Lin Yang, Chengyu Zhan, Shubo Liang, Shuaishuai Tai, Linrong Wan, Tian Li, Tingcai Cheng, Bi Ma, Guoqing Pan, Ningjia He, Cheng Lu, Eric Westhof, Zhonghuai Xiang, Min-Jin Han, Xiaoling Tong, Fangyin Dai
PMCID: PMC10767832  PMID: 37941143

Abstract

The silkworm Bombyx mori is a domesticated insect that serves as an animal model for research and agriculture. The silkworm super-pan-genome dataset, which we published last year, is a unique resource for the study of global genomic diversity and phenotype-genotype association. Here we present SilkMeta (http://silkmeta.org.cn), a comprehensive database covering the available silkworm pan-genome and multi-omics data. The database contains 1082 short-read genomes, 546 long-read assembled genomes, 1168 transcriptomes, 294 phenotype characterizations (phenome), tens of millions of variations (variome), 7253 long non-coding RNAs (lncRNAs), 18 717 full length transcripts and a set of population statistics. We have compiled publications on functional genomics research and genetic stock deciphering (mutant map). A range of bioinformatics tools is also provided for data visualization and retrieval. The large batch of omics data and tools were integrated in twelve functional modules that provide useful strategies and data for comparative and functional genomics research. The interactive bioinformatics platform SilkMeta will benefit not only the silkworm but also the insect biology communities.

Graphical Abstract

Introduction

The silkworm Bombyx mori is a domesticated insect derived from the wild silkworm, Bombyx mandarina. As a supplier of silk and a model for biological research, B. mori plays multiple roles in scientific research and agriculture. Like Drosophila, the silkworm is a favored representative for several insect taxa, notably Lepidoptera. Research on the silkworm over more than a century has produced remarkable results, such as the first discovery of Mendel's genetic principles in animals (1), the discovery, isolation and study of insect hormones (e.g. juvenile hormone) (2), and the discovery and use of heterosis (3,4). Over the past twenty years, silkworm genomic information has greatly facilitated progress in research on insect adaptation (5–7), domestication (7–9), genetic engineering (10,11) and silkworm breeding.

The silkworm reference genome was first published in 2004 and updated twice, in 2008 and 2019 (12–14). Several databases (SilkDB, KAIKObase, Silkbase, SGID) centered on a single genome have subsequently been developed to facilitate the acquisition of genomic information (15–18). Recently, our group completed the 1K Silkworm Genome Project (1KSGP) and built a silkworm super-pan-genome, which contains 1082 short-read genomes, 545 assembled long-read genomes, over 50 million variations (SNPs, InDels, SVs), etc. (7). To our knowledge, the silkworm pan-genome comprises the largest number of assembled long-read genomes in animals and plants to date, representing almost the entire genomic diversity of the silkworm. The publication of the silkworm pan-genome dataset provides a valuable resource for the research community. However, exploring and using the pan-genome dataset requires computing and bioinformatics skills, creating a gap between the resources and researchers. It is therefore essential to have an interactive, user-friendly platform for convenient investigation of genomes and genomic variations.

In addition to 1KSGP, a large amount of multi-omics data (transcriptomes, single-cell transcriptomes, etc.) has been published and can be searched in public databases such as NCBI (https://www.ncbi.nlm.nih.gov/) and China National Center for Bioinformation (CNCB, https://www.cncb.ac.cn/). Here, we present the SilkMeta database, which integrates pan-genomic data, multi-omic data and bioinformatics tools to create a bridge for revealing associations between genes (or their variations) and phenotypes, with the aim of accelerating research on silkworm and insect biology.

Data resources

The majority of SilkMeta data comes from our pan-genomic project, including a reference pan-genome, 294 germplasm phenotypic features (phenome), 1082 short-read sequencing datasets (DNBSEQ, BGI), 545 long-read sequencing datasets (PromethION, Oxford Nanopore), 126 transcriptomes, 545 de novo assembled genomes (100 of which are annotated), 43 012 261 SNPs, 9 344 375 InDels (<50 bp), 3 432 266 SVs, 1 640 256 protein-coding genes, 19 411 intra-species orthogroups, 7216 inter-species orthogroups and population genetics data. Most of the 294 phenotypic features (e.g. cocoon colour, egg colour, larval pigmentation, moltinism, voltinism) are qualitative traits, which are fixed in one or more silkworm strains. These features are recorded in a table of the ‘sample information’ module (the ‘background/phenotype’ column) using gene symbols. The data volumes of the short- and long-read sequencing reads are 31.52 Tb and 24.06 Tb, with average coverage depths of 65× and 97×, respectively. The average completeness and contig N50 size of the 545 genomes are 98% (BUSCO evaluation value) and 7.6 Mb. More details about the silkworm pan-genome dataset are presented in our previous publication (7).

In addition, we collected 1042 transcriptomes, 92 chromatin immunoprecipitation sequencing (ChIP-seq) datasets, 14 single-cell transcriptomes, 7253 long non-coding RNAs (lncRNAs), 18 717 full length transcripts and the Dazao reference genome from NCBI, the CNCB or previous publications (14,19–21). We integrated the piRNA, small RNA (including circRNA and miRNA), epigenomics (including ChIP-seq and methylome), transcription factor (TF), repeats (containing transposable elements) and Hi-C data from Silkbase, SGID and SilkDB3.0 into SilkMeta. We collected the results of silkworm functional genomics research (protein-coding genes and relevant publications) and integrated them into the ‘mutant map’ and ‘functional research’ modules. Links to the Hi-C, eFP (electronic fluorescent pictograph), co-expression, 3D (three-dimensional protein structure) and synteny plots in SilkDB 3.0 are presented on the gene information page of SilkMeta. All the data incorporated into SilkMeta are listed in Table 1.

Table 1. Data summary of SilkMeta

| Type of data | Numbers | Type of data | Numbers |
|---|---|---|---|
| Reference pan-genome | 1 | Long non-coding RNAs | 7253 |
| Phenotype description | 294 traits | Full length transcripts | 18 717 |
| Short-read sequencing data | 1082 strains | Transposable elements | 432 055 |
| Long-read sequencing data | 545 strains | SNPs | 43 012 261 |
| Transcriptome data | 1168 (samples), 122 (projects) | InDels | 9 344 375 |
| Genomes | 546 (including the Dazao genome) | SVs | 3 432 266 |
| Annotated genomes | 101 (including the Dazao genome) | Single-cell transcriptomes | 14 |
| Protein-coding genes | 1 640 256 | Mutant map | 251 loci (mutants) in genetic map with 68 deciphered ones |
| Orthogroups | 19 411 (intra-species), 7216 (inter-species) | Functional research library | 320 articles involving 395 protein-coding genes |
| piRNA samples | 80 | Transcription factors | 704 |
| Small RNA samples | 29 | ChIP-seq samples | 92 |
| circRNA samples | 4 | Methylome samples | 3 |
| miRNA samples | 10 | Hi-C samples | 18 |

The pipelines for processing raw sequencing data, genome assembly, genome annotation, variant calling, variant annotation and transcriptome analysis were described in a previous publication (7). New data for these elements were processed as described in (7). Annotations of protein-coding genes based on Gene Ontology (GO) (22), Kyoto Encyclopedia of Genes and Genomes (KEGG) (23), Pfam (24), InterPro (IPR) (25) and the Non-Redundant Protein Sequence Database (NR) from NCBI were performed using the BLAST program (v. 2.9.0) (26). Intra-species orthogroups were clustered across the 101 silkworm genomes, and inter-species orthogroups were clustered across silkworm (B. mori), Drosophila (Drosophila melanogaster), mouse (Mus musculus) and human (Homo sapiens), using OrthoFinder (v. 2.3.7) (27). Population genetics statistics such as Tajima's D, π (nucleotide diversity), FST (population divergence) and the reduction of diversity (ROD, $1 - \pi_{\mathrm{descendant}}/\pi_{\mathrm{ancestor}}$) were calculated following previously published formulas (28–30) with a sliding window of 5000 bp and a step size of 500 bp. For ChIP-seq data, we used bowtie2 (v2.5.1) (31) to map sequencing reads onto the reference genome (14) and filtered PCR duplicates with samtools (v1.17) (32). The bamCoverage tool in deepTools (v3.5.1) (33) was used to convert BAM files to bigWig files.
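The windowed statistics described above can be illustrated with a minimal Python sketch. It assumes that per-site diversity values are already available for a region; the input layout, function names and toy values are hypothetical and are not taken from the SilkMeta pipeline. Only the 5000 bp window, the 500 bp step and the ROD definition come from the text.

```python
from typing import Dict, List, Tuple

WINDOW = 5000  # window size in bp, as stated in the text
STEP = 500     # step size in bp, as stated in the text


def windowed_pi(site_pi: Dict[int, float], region_len: int,
                window: int = WINDOW, step: int = STEP) -> List[Tuple[int, int, float]]:
    """Average per-site diversity over sliding windows.

    site_pi maps a 1-based position to its per-site diversity. Positions
    that are absent are treated as invariant (diversity 0). This input
    layout is an assumption made for illustration.
    """
    windows = []
    start = 1
    while start <= region_len:
        end = min(start + window - 1, region_len)
        total = sum(v for pos, v in site_pi.items() if start <= pos <= end)
        windows.append((start, end, total / (end - start + 1)))
        if end == region_len:
            break  # the last window has reached the end of the region
        start += step
    return windows


def rod(pi_ancestor: float, pi_descendant: float) -> float:
    """Reduction of diversity: 1 - pi_descendant / pi_ancestor."""
    if pi_ancestor == 0:
        return float("nan")  # undefined when the ancestor has no diversity
    return 1.0 - pi_descendant / pi_ancestor
```

In this sketch a ROD close to 1 means that the descendant population has lost most of the diversity present in the ancestral population within that window, which is the kind of signal the ‘selective sweep’ module plots.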

SilkMeta implementation

SilkMeta is a web platform built with the Vue (v. 2.0) JavaScript framework and Java (JDK), running on an nginx web server (v. 1.13.7). The database and operating system are MySQL (v. 8.0) and CentOS (v. 7.9). JBrowse2 (34) was used to visualize genomes, variations and gene structures. Primer3 (35) was implemented in SilkMeta for primer design. The BLAST tool (v. 2.14) (26) was installed for genome, gene, protein, lncRNA, transposable element (TE), transcription factor (TF) and full length transcript sequence alignments.

Visualizations and analysis modules

We have developed SilkMeta to accommodate 12 modules for convenient visualization and exploration of the huge omics dataset: ‘sample information’, ‘population structure’, ‘gene search’, ‘variation search’, ‘variations viewer’, ‘genome browser’, ‘expression’, ‘selective sweep’, ‘functional research’, ‘mutant map’, ‘tools’ and ‘download’. These modules are interactive interfaces that enable phenome, genome, transcriptome and variome data to be deposited and analyzed for comparative and functional genomics research (Figure 1). We describe below the functionalities available and the workflow of each module.

Figure 1.

SilkMeta key features and data. We have divided the 12 functional modules of SilkMeta into four categories: phenome, multi-omics, variome, comparative genomics and functional genomics. (A) Sample information and silkworm population structure in SilkMeta. (B) The phenotypes of Bombyx mandarina (top) differ from those of Bombyx mori (bottom) at egg, larva, pupa (cocoon) and adult stages. B. mandarina mimics the morphology of bird droppings in the third and fourth larval stages, while it mimics mulberry branches in the final larval stage. The mimicry characteristics of B. mandarina were lost in B. mori during domestication (from wild silkworm to local population). Bar = 1 cm. (C) Representative economic characteristics of the silkworm, cocoon size. Bar = 1 cm. (D) Genetic stocks (mutants) show numerous phenotypic mutations at egg, larva, pupa (cocoon) and adult stages. We have recorded 294 of these traits in SilkMeta. (E) The ‘gene search’ module and the main features of the gene information page. (F) The ‘expression’ module displays gene expression levels (FPKM values) as a line graph, heatmap, box plot or data sheet by defining tissues, development stages or project identifiers of interest. (G) The ‘variation search’ module and the main features of the variation information page. (H) The genome browser's main page for visualizing variations. (I) Visualization of gene structure and SNP heatmap in a genomic region (chr1:20.18–20.28 Mb) in multiple samples (groups) using the ‘variations viewer’ module. Users can select SNPs, InDels or SVs for plotting. Red (1/1) and blue (0/1) vertical lines represent homozygous and heterozygous variations. (J) Line graphs of selective sweep signals (FST, π, Tajima's D and ROD statistics) in a genomic region (chr1: 0.5–2 kb) between the local and improved population within the ‘selective sweep’ module. (K) Genetic locus and relevant article of tub mutant in the ‘mutant map’ module. (L) Gene symbol, gene ID and method (RNAi, Knockdown, Knockout or Overexpression) are optional key words for searching in the ‘functional research’ module. (M) Characteristics and main data available in BLAST tool. (N) Main features of the ‘sequence downloader’ tool.

Sample information and population structure

SilkMeta records the population structure, basic information and phenotype descriptions of 1078 silkworm strains, including 205 local strains, 194 improved accessions, 632 genetic stocks (mutants) and 47 wild silkworms (B. mandarina). The population structure is represented by a phylogenetic tree and two-dimensional graphs of principal components (Figure 1A). In the ‘sample information’ module of SilkMeta, we present three tables: sample information; ONT sequencing, genome assembly and SVs; and genome assembly and pan-gene family. The sample information table holds basic information such as sample identifier (ID), common name, classification, origin, gender, typical phenotype and data volume of NGS. Below the table is a map showing the geographic locations of the silkworms. Sample IDs of the 1082 genomes have been named according to their relationship in the phylogenetic tree of the ‘population structure’ module. For example, BomL1, BomL2 and BomL3 are neighbors on the tree. In most cases, users can infer relationships between samples from their sample IDs. The second and third tables show information and graphs about ONT sequencing, genome assembly, SVs and pan-genes, which are useful for viewing the silkworm pan-genome. A search function allows users to retrieve information of interest.

In the sample information table, each silkworm has been assigned to the wild, local, improved or genetic stock population. Local silkworms were domesticated from wild ancestors, then bred into improved accessions. During the domestication process (from wild to local population), phenotypes associated with adaptation, such as the number of eggs laid, hatching, larval pigmentation and adult flight ability, were modified (Figure 1B). At the breeding stage (from local to improved population), economic traits such as cocoon size and weight were greatly increased (Figure 1C). In particular, we tagged 294 traits appearing at the egg, larval, pupal or adult stage with gene symbols (background/phenotype column in the sample information table), including egg pigmentation, larval stripe, cocoon colour, moth pigmentation, etc. (Figure 1D). Variations and genes associated with these phenotypes can be narrowed down using comparative genomics analysis (in the ‘variations viewer’ module) and selective sweep analysis.

Gene search and basic information

In the ‘gene search’ module, users can search for genes using gene identifier (ID), gene name, gene function or a genome region within a selected genome (Figure 1E). In SilkMeta, we collected 546 genomes, 1 640 256 protein-coding genes from 101 genomes, 19 411 intra-species orthogroups and 7216 inter-species orthogroups (Table 1). Each of these genes has a unique gene ID and orthogroup identifier (OG ID). The target genes requested using the ‘gene search’ module are listed in the results table below the gene search box. Gene IDs in this table can be clicked to access a page containing basic gene information (e.g. OG ID, gene ID, gene name, locus, expression, functional research, etc.), sequences, gene structure, annotation information (GO, KEGG, IPR, Pfam), homologs, relevant publications and gene classification (Figure 1E). By clicking on the ‘jbrowse’, ‘variations viewer’, ‘expression’ and ‘functional’ buttons (green colour) to the right of the gene ID, locus, expression and functional research items, users can switch to interfaces displaying the gene structure (JBrowse2), variations viewer tool, spatial expression profile and functional research module. At the bottom of the basic information box, we provide a link to SilkDB 3.0 for viewing the eFP, co-expression, 3D, synteny and Hi-C plots. On the gene classification page, genes are divided into ‘core’ (present in all genomes), ‘softcore’ (present in >90% of genomes but not all), ‘dispensable’ (present in more than one but <90% of genomes) and ‘private’ (present in only one genome) categories. The phylogenetic tree in the middle visually highlights the genomes containing the observed gene, while the one on the left presents the classifications of silkworms (Figure 1E). The histogram and table show the frequencies of gene presence in the wild silkworm subgroup, local accessions and improved strains in China (CHN-I) and Japan (JPN-I) (Figure 1E).
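The four gene categories can be expressed as a small decision rule. The following Python sketch is illustrative only: the function name is hypothetical, and the handling of a gene present in exactly 90% of genomes (treated here as ‘dispensable’) is an assumption, since the text does not specify that boundary.

```python
def classify_gene(n_present: int, n_genomes: int) -> str:
    """Assign a pan-genome category from the number of genomes carrying a gene.

    Thresholds follow the definitions in the text. A gene present in
    exactly 90% of genomes is treated as 'dispensable' (an assumption).
    """
    if not 1 <= n_present <= n_genomes:
        raise ValueError("n_present must be between 1 and n_genomes")
    if n_present == n_genomes:
        return "core"          # present in all genomes
    if n_present == 1:
        return "private"       # present in only one genome
    # Integer arithmetic avoids floating-point issues at the 90% boundary.
    if n_present * 10 > n_genomes * 9:
        return "softcore"      # present in >90% of genomes but not all
    return "dispensable"       # present in more than one genome, up to 90%
```

With the 101 annotated genomes mentioned above, a gene found in 91 to 100 genomes would be ‘softcore’ under this rule, and a gene found in 2 to 90 genomes would be ‘dispensable’.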

Expression

The mRNA expression profile is an important indicator of gene function. We collected 1168 transcriptomes (involving 122 bioprojects, 43 tissues and 48 development stages) and calculated the FPKM values (Fragments Per Kilobase of exon model per Million mapped fragments) of these samples (Table 1). In the ‘expression’ module, users can obtain gene expression levels (FPKM values) in the form of a line graph, heatmap, box plot or data sheet by searching for gene IDs and selecting tissues, development stages or project identifiers (IDs) of interest (Figure 1F). For general gene expression analysis, we recommend that users select only the project named ‘PRJNA559726: Spatiotemporal expression’ to investigate spatiotemporal expression patterns.
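For readers unfamiliar with the unit, the FPKM definition quoted above corresponds to the following calculation. This is a minimal Python sketch of the standard formula, not the code used to build SilkMeta. The function name and the example numbers are hypothetical.

```python
def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments Per Kilobase of exon model per Million mapped fragments.

    fragments: fragments assigned to the gene
    exon_length_bp: summed exon length of the gene model, in bp
    total_mapped_fragments: all mapped fragments in the sample
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Dividing by (length / 1e3) and (library size / 1e6) gives the 1e9 factor.
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Toy example: 500 fragments on a 2 kb gene model in a library of 20 million fragments.
example_value = fpkm(500, 2000, 20_000_000)
```

Because the value is normalized by both gene length and library size, FPKM values from different genes and samples within one project can be displayed on a common scale, as in the heatmaps and line graphs of the ‘expression’ module.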

Variation search and basic information

SNPs, InDels and SVs are available by searching for gene ID or genomic region in the ‘variation search’ module (Figure 1G). All variations and corresponding genomic coordinates in the target region are plotted and listed in the results table below the variation search box. Users can access the variation information page by clicking on the variation IDs. This page shows the coordinates, genotype, annotation information and allele frequencies of the variations (Figure 1G). In the population frequencies section, we have highlighted the samples containing the selected variation in the middle phylogenetic tree (the colored tree on the left shows the silkworm classification) (Figure 1G). The allelic frequencies of the variations in global population and subgroups (wild, local, CHN-I and JPN-I) are presented in the histogram and table (Figure 1G).
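The subgroup allele frequencies shown on the variation information page can be understood through a short calculation. The Python sketch below assumes diploid, biallelic genotypes written in the 0/0, 0/1, 1/1 notation used by the platform. The sample names, the input layout and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict


def alt_allele_frequencies(genotypes: Dict[str, str],
                           groups: Dict[str, str]) -> Dict[str, float]:
    """Alternative-allele frequency overall ('all') and per subgroup.

    genotypes: sample -> genotype string such as '0/0', '0/1', '1/1' or './.'
    groups: sample -> subgroup label (e.g. 'wild', 'local')
    Missing alleles ('.') are ignored when counting.
    """
    alt = defaultdict(int)
    total = defaultdict(int)
    for sample, gt in genotypes.items():
        alleles = [a for a in gt.replace("|", "/").split("/") if a in ("0", "1")]
        for label in ("all", groups.get(sample, "unassigned")):
            alt[label] += alleles.count("1")
            total[label] += len(alleles)
    # Groups without any called allele are omitted from the result.
    return {label: alt[label] / n for label, n in total.items() if n > 0}


# Toy input with hypothetical sample names.
toy_gt = {"W1": "1/1", "W2": "0/1", "L1": "0/0", "L2": "./."}
toy_groups = {"W1": "wild", "W2": "wild", "L1": "local", "L2": "local"}
toy_freq = alt_allele_frequencies(toy_gt, toy_groups)
```

In the toy input the alternative allele is frequent in the wild samples and absent from the called local sample, the kind of contrast that the histogram and table on the variation information page make visible.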

Genome browser

In the ‘genome browser’ module, users can consult 101 genomes and their corresponding gene models. The default interface for this module shows the genome and gene models of the Dazao strain (14). On the Dazao genome page, SNPs and InDels from 1082 samples, together with SVs from 545 silkworms, are loaded (Figure 1H). The user can select one or more samples on the right-hand side of the window to view the variations. When the user clicks on one of these variations in the genome browser, detailed information on the variant (e.g. position, genotype, allele frequency, potential influence on adjacent genes) is displayed in the right-hand part of the window. The full length RNAs from a prior publication (19) and repeats from SGID are presented in the genome browser. In addition, piRNA, small RNA, methylome (bisulfite-seq) and ChIP-seq data are integrated into the genome browser as optional tracks.

Variations viewer

Alignments and visualizations of variations between multiple samples or groups are essential for identifying genes and variations associated with traits of interest. In SilkMeta, we have developed a tool called ‘variations viewer’, which helps users visualize and compare variations between samples in the form of a heatmap (Figure 1I). In this module, users can enter one or more sample identifiers (they can also divide these samples into different groups) and define a genomic region of interest. The gene model and heatmap of variations (SNPs, InDels and SVs) in the samples and genomic region entered are presented below the search box. The heatmap shows the collinear coordinates and genotypes (0/1 and 1/1 represent heterozygous and homozygous loci, respectively) of SNPs, InDels and SVs in the samples provided (Figure 1I). Users can go to the variation information page by clicking on a vertical line (variation).

Selective sweep

As the only insect entirely domesticated by humans, the silkworm is a good model for research on insect domestication and evolution. Domestication and breeding are the two main stages in the artificial selection of the silkworm. Domestication has produced local varieties in native breeding grounds, while breeding has enabled the cultivation of improved accessions with enhanced commercial characteristics. Exploring the genes that may play a role in the domestication and breeding processes is important for evolutionary biologists and silkworm breeders. In SilkMeta, the ‘selective sweep’ module provides an interactive interface enabling users to visualize line graphs of selection signals (FST, Tajima's D, π and ROD statistics) and the gene model in each genomic region or chromosome (Figure 1J). Users can define a pair of populations for comparison. When the ancestral group (group1) is defined as the wild population, the descendant population (group2) can be the local or improved population. When the ancestral group is defined as the local population, the descendant population can be the improved, JPN-I or CHN-I population.
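The population pairs allowed in this module, and the idea behind a divergence statistic such as FST, can be sketched in Python as follows. The pair table is taken from the description above. The per-site FST function uses a simple textbook heterozygosity-based estimator with equal population weights, which is an assumption made for illustration: the formulas actually used for SilkMeta are those of the cited references and are not reproduced here. All names are hypothetical.

```python
import math

# Ancestor -> permitted descendant populations, as described in the text.
ALLOWED_PAIRS = {
    "wild": {"local", "improved"},
    "local": {"improved", "JPN-I", "CHN-I"},
}


def is_valid_pair(ancestor: str, descendant: str) -> bool:
    """True if the (group1, group2) combination is offered by the module."""
    return descendant in ALLOWED_PAIRS.get(ancestor, set())


def fst_site(p_ancestor: float, p_descendant: float) -> float:
    """Per-site FST = (H_T - H_S) / H_T for two equally weighted populations.

    p_ancestor and p_descendant are alternative-allele frequencies.
    This simple estimator is an illustrative assumption, not the
    SilkMeta implementation.
    """
    p_bar = (p_ancestor + p_descendant) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return math.nan  # site is monomorphic across both populations
    h_within = (2 * p_ancestor * (1 - p_ancestor)
                + 2 * p_descendant * (1 - p_descendant)) / 2
    return (h_total - h_within) / h_total
```

Under this estimator a site fixed for different alleles in the two populations gives a value of 1, and a site with identical frequencies gives 0.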

Mutant map and functional research

Over the long history of silkworm breeding, around 500 phenotypic mutations have been discovered and preserved in silkworm conservation institutes around the world (2). These mutants display a variety of phenotypes visible at the embryo (egg), larva, pupa (cocoon) or adult stage, providing valuable material and a shortcut for exploring questions in insect science and biology. To date, 251 mutants have been mapped on the silkworm genetic map (36). In addition, the specific genomic locations and responsible genes have been determined for 68 mutants. The ‘mutant map’ module has been developed to present the genetic map, which includes the genetic loci of the 251 mutants, as well as the physical location, responsible genes, publications and photos of the phenotypes of the 68 mutants (gene symbol in red). Users can click on the gene symbols in the genetic map or search in the table below to check whether the gene responsible for a mutant has been discovered and to follow the publications on silkworm mutants (Figure 1K).

In addition to deciphering natural mutants, biologists have knocked out, knocked down or overexpressed genes in vivo or in vitro using gene editing, RNA interference or transgenic technology to study gene function. We collected 320 publications involving 395 independent gene function investigations and summarized the relevant genes, methods, publications and phenotypes in the ‘functional research’ module of SilkMeta, forming a library of articles on silkworm functional genomics research. Users can search for genes of interest in this module to understand gene function in silkworms (Figure 1L).

Other tools

SilkMeta provides other bioinformatics tools such as BLAST, the sequence downloader, primer design and gRNA design. In the BLAST tool, 546 genomes, gene and protein sequences from 101 annotated genomes, and libraries of TEs and TFs can be selected as the target database (Figure 1M). For the Dazao strain, we provide libraries of genome, gene, protein, full-length RNA and lncRNA sequences. In the sequence downloader tool, users can download nucleotide sequences from a given genomic region (546 optional genomes) or for requested genes (101 optional genomes). Three types of sequences can be downloaded when the user provides a gene ID: the gene region, the extended gene region (±2 kb) and the CDS (coding sequence) (Figure 1N). Users can also design primers and gRNAs using the primer and gRNA design tools. The ‘Help’ module contains a ‘User Manual’ detailing the operating procedures.
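The three sequence types offered by the sequence downloader differ only in which coordinates are extracted. The Python sketch below illustrates this under stated assumptions: coordinates are 1-based and inclusive, the extended region is clipped at the chromosome ends, and the CDS of a minus-strand gene is reverse-complemented. These conventions and all function names are hypothetical and are not documented behaviour of SilkMeta.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extended_region(start: int, end: int, chrom_length: int,
                    flank: int = 2000) -> Tuple[int, int]:
    """Gene region extended by `flank` bp on each side (2 kb in the text)."""
    if not 1 <= start <= end <= chrom_length:
        raise ValueError("invalid gene coordinates")
    # Clip to the chromosome so the interval never runs off either end.
    return max(1, start - flank), min(chrom_length, end + flank)


def extract_cds(chrom_seq: str, cds_intervals: List[Tuple[int, int]],
                strand: str) -> str:
    """Concatenate CDS intervals (1-based, inclusive) in genomic order.

    For a minus-strand gene the joined sequence is reverse-complemented.
    """
    seq = "".join(chrom_seq[s - 1:e] for s, e in sorted(cds_intervals))
    if strand == "-":
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq
```

For example, under these conventions a gene at positions 10 000 to 12 000 would yield an extended region of 8000 to 14 000.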

Examples of data mining using SilkMeta

Using the phenotypes, multi-omics data and tools of SilkMeta, users can identify the variations and genes associated with phenotypes. Here we present two well-known genes, CPH24 (37) and Sp1 (38), as examples of how to use SilkMeta to mine genes and variations. Detailed analysis steps can be found in the user manual of the ‘Help’ module.

First, the Bo mutant (BomM412) is a phenotypic mutant with a bamboo-like body shape (Figure 2A) that is sensitive to ultraviolet irradiation (39). In the ‘mutant map’ module, we found that the Bo locus is positioned at 28.8 centimorgans (cM) on chromosome 11 (11–28.8 cM), close to the reported mod mutation (genetic locus: 11–27.4 cM; physical location: 11–12.3 Mb) (Figure 2B) (36,40). We examined a 400 kb genomic region on the right flank of mod in the ‘gene search’ module and identified 39 protein-coding genes as Bo candidates. By examining the spatial expression profiles of these genes using the ‘expression’ module, we found that three of them, KWMTBOM06673 (CPH34), KWMTBOM06674 (CPH25) and KWMTBOM06675 (CPH24), are specifically expressed in the epidermis of silkworm larvae (Figure 2C–E). The Bo mutant displays an epidermal abnormality, implying potential roles for all three genes in determining the Bo phenotype. Using the ‘variations viewer’ and ‘variation search’ modules, we performed a multi-sample comparison of variations in the genomic regions of CPH34, CPH25 and CPH24 and found a Bo-specific frameshift deletion (bomindel3645341) in the second exon of CPH24 (Figure 2F–H). A previous report demonstrated that CPH24 and bomindel3645341 are the causal gene and variation of the Bo mutant (37).

Figure 2.

Extraction of the gene and variation associated with the Bo (bamboo-like) mutant using SilkMeta. (A) Phenotype of wild-type control and Bo mutant. bar = 1 cm. (B) Bo locus in the silkworm genetic map. (C) Heatmap of the spatial expression profile of the 39 genes in the Bo candidate genomic region. The red box marks genes that are specifically expressed in the epidermis. (D, E) KWMTBOM06673 (CPH34), KWMTBOM06674 (CPH25) and KWMTBOM06675 (CPH24) are specifically expressed in the epidermis of silkworm larvae. (F) An InDel (bomindel3645341) was found in the second exon of KWMTBOM06675 (CPH24) of the Bo mutant (Bom412). (G) The variation information page shows bomindel3645341, a 5 bp deletion in the Bo mutant that causes a frame shift in CPH24 translation. (H) Bomindel3645341 is present only in Bom412 (Bo).
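The multi-sample comparison used in the Bo example amounts to looking for variants carried by the mutant and by no other sample. A minimal Python sketch of that filter is given below. It assumes biallelic genotypes in 0/0, 0/1, 1/1 notation exported for a region of interest. The variant and sample identifiers in the toy input are hypothetical placeholders, not SilkMeta records.

```python
from typing import Dict, List, Set


def group_specific_variants(genotypes: Dict[str, Dict[str, str]],
                            target_samples: Set[str]) -> List[str]:
    """Return variants whose carriers are exactly the target samples.

    genotypes: variant ID -> {sample -> genotype string}
    A sample is a carrier if its genotype contains the alternative
    allele '1' (heterozygous 0/1 or homozygous 1/1).
    """
    hits = []
    for variant_id, calls in genotypes.items():
        carriers = {s for s, gt in calls.items() if "1" in gt}
        if carriers == target_samples:
            hits.append(variant_id)
    return hits


# Toy input: 'mutant' is the strain of interest, 'ctrl1' and 'ctrl2' are controls.
toy_calls = {
    "var_a": {"mutant": "1/1", "ctrl1": "0/0", "ctrl2": "0/0"},  # mutant only
    "var_b": {"mutant": "0/1", "ctrl1": "0/1", "ctrl2": "0/0"},  # shared
    "var_c": {"mutant": "0/0", "ctrl1": "1/1", "ctrl2": "0/0"},  # control only
}
toy_hits = group_specific_variants(toy_calls, {"mutant"})
```

A filter of this kind only shortlists candidates. As in the Bo example, the shortlisted variants still need to be checked against annotation (for instance, whether a deletion causes a frameshift) and against expression evidence.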

Secondly, the domestication-associated gene Sp1 influences silkworm egg hatchability (38). Egg hatching is a domestication trait, which is weaker in B. mandarina than in B. mori. In SilkMeta, we can examine the artificial selection signal in the ‘selective sweep’ module. We found a significant selection signal at the Sp1 locus (KWMTBOMO13992) (Figure 3A). By searching for the egg (or embryo, Oogamete) expression profile in the ‘expression’ module, we found that Sp1 expression in B. mandarina is lower than in B. mori (Figure 3B). Furthermore, by comparing variations in wild and local silkworms using the ‘variations viewer’, we identified seven SNPs and two SVs that are divergent between wild and local populations (Figure 3C). Among them, two exonic SNPs cause a synonymous substitution and a missense mutation, respectively, while the other variations are in the intron, 5′ UTR (untranslated region), upstream or downstream region of Sp1 (Figure 3D). The upstream SV (bomsv3405859) is a 9718 bp insertion at a site 60 bp from the transcriptional start site (Figure 3D). The downstream SV (bomsv3405860) is a 255 bp deletion at a site 1308 bp from the transcriptional termination site, which shows the largest frequency divergence between the wild population and local silkworms (Figure 3D). These variations may influence Sp1 expression and merit experimental validation.

Figure 3.

Data mining of Sp1, a domestication-associated gene influencing egg hatchability of the silkworm. (A) Selective sweep signal (FST, π, Tajima's D and ROD statistics) of the Sp1 locus and flanking region. (B) Expression of the Sp1 gene is lower in B. mandarina (wild silkworm) than in B. mori (domesticated silkworm). (C) Alignments of the SNPs and SVs in the genomic (and flanking 2 kb) region of Sp1 between wild and local populations. Red (1/1) and blue (0/1) vertical lines represent homozygous and heterozygous variations. Yellow and red circles correspond to the circles in (D), representing the SNPs and SVs that are differentially distributed in wild and local silkworms. (D) Seven SNPs and two SVs were found in the exon, intron, upstream and downstream regions of Sp1. Among them, the allele frequency of bomsv3405860 was the highest in the wild silkworm population. Red lines in the phylogenetic tree represent samples carrying the variation, while grey lines represent samples without the variation. Wild represents wild silkworm, local represents local silkworm, CHN-I represents Chinese improved silkworm, JPN-I represents Japanese improved silkworm. W freq. and L freq. in the table denote the allelic frequencies of variants in wild and local populations, respectively.

Discussion

The vast silkworm pan-genomic dataset has facilitated silkworm genome analysis, which has moved from research on a single reference genome to a comparison of multiple (population) genomes. Assessment of genomic diversity has expanded from SNP (and InDel) detection based on short reads from a limited number of silkworm strains to exploration of global genomic diversity (including SNPs, InDels and SVs) based on short and long reads from a large silkworm population. This is an essential step towards understanding genomic diversity and deciphering the genotype-phenotype relationship. Compared to previous databases, such as KAIKObase (15), SGID (16), SilkDB3.0 (17) and Silkbase (18), which are centered on a single reference genome, SilkMeta is a comprehensive platform with several unique features: (i) hundreds of genomes; (ii) tens of millions of variations; (iii) records of features or phenotypes of silkworms; (iv) functional research library and genetic map; (v) abundant expression profiles; (vi) multi-omics data collection site. The 12 functional modules in SilkMeta link these features and provide tools for data mining, forming an interactive platform for the acquisition and analysis of biological information on silkworms.

We will regularly update SilkMeta with new data, newly assembled genomes, transcriptomes and advances in silkworm functional genomics research. We also plan to integrate tools and functionalities suitable for collinearity analysis and visualization, 3D protein structure prediction and visualization, phenotypic values of quantitative trait loci (QTL), as well as data relating to the exploration of regulatory elements (e.g. ChIP-seq, ATAC-seq). In short, we aim to maintain and continually improve SilkMeta to facilitate research in silkworm science, insect biology and the life sciences in general.

Acknowledgements

We thank Dr Qi Liu and his team from Wuhan Onemore-tech Co., Ltd for technical support on database construction.

Contributor Information

Kunpeng Lu, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China; Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China.

Yifei Pan, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Jianghong Shen, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Lin Yang, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Chengyu Zhan, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Shubo Liang, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Shuaishuai Tai, BGI Research, Sanya 572025, China.

Linrong Wan, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Tian Li, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Tingcai Cheng, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Bi Ma, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Guoqing Pan, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Ningjia He, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Cheng Lu, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Eric Westhof, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China; Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire, UPR9002 CNRS, Université de Strasbourg, Strasbourg 67084, France.

Zhonghuai Xiang, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China.

Min-Jin Han, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China; Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China.

Xiaoling Tong, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China; Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China.

Fangyin Dai, State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China; Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China.

Data availability

All data in SilkMeta are available and accessible at http://silkmeta.org.cn.

Funding

National Natural Science Foundation of China [31830094, U20A2058, 32202746]; China Agriculture Research System of MOF and MARA [CARS-18-ZJ0102, CARS-18-ZJ0103]; Natural Science Foundation of Chongqing, China [cstc2021jcyj-cxtt0005, cstc2021jcyj-bshX0014]; Special Funding for Postdoctoral Research of Chongqing, China [2022CQBSHTB3066]. Funding for open access charge: National Natural Science Foundation of China [31830094, U20A2058, 32202746]; China Agriculture Research System of MOF and MARA [CARS-18-ZJ0102, CARS-18-ZJ0103]; Natural Science Foundation of Chongqing, China [cstc2021jcyj-cxtt0005, cstc2021jcyj-bshX0014]; Special Funding for Postdoctoral Research of Chongqing, China [2022CQBSHTB3066].

Conflict of interest statement. None declared.

References

  • 1. Kametaro T. Studies on the hybridology of insects. I. On some silkworm crosses, with special reference to Mendel's law of heredity. Bull. Coll. Agric. Tokyo Imperial Univ. 1906; 7:259–353.
  • 2. Banno Y., Shimada T., Kajiura Z., Sezutsu H. The silkworm-an attractive BioResource supplied by Japan. Exp. Anim. 2010; 59:139–146.
  • 3. Kametaro T. Breeding methods of silkworm. Sangyo Shimpo. 1906; 158:282–286.
  • 4. Nagaraju J.U., Raje Datta R.K. Crossbreeding and heterosis in the silkworm, Bombyx mori: a review. Sericologia. 1996; 36:1–26.
  • 5. Daimon T., Koyama T., Yamamoto G., Sezutsu H., Mirth C.K., Shinoda T. The number of larval molts is controlled by hox in caterpillars. Curr. Biol. 2021; 31:884–891.
  • 6. Yamaguchi J., Banno Y., Mita K., Yamamoto K., Ando T., Fujiwara H. Periodic Wnt1 expression in response to ecdysteroid generates twin-spot markings on caterpillars. Nat. Commun. 2013; 4:1857.
  • 7. Tong X.L., Han M.J., Lu K.P., Tai S.S., Liang S.B., Liu Y.C., Hu H., Shen J.H., Long A.X., Zhan C.Y. et al. High-resolution silkworm pan-genome provides genetic insights into artificial selection and ecological adaptation. Nat. Commun. 2022; 13:5619.
  • 8. Xiang H., Liu X., Li M., Zhu Y., Wang L., Cui Y., Liu L., Fang G., Qian H., Xu A. et al. The evolutionary road from wild moth to domestic silkworm. Nat. Ecol. Evol. 2018; 2:1268–1279.
  • 9. Xia Q., Guo Y., Zhang Z., Li D., Xuan Z., Li Z., Dai F., Li Y., Cheng D., Li R. et al. Complete resequencing of 40 genomes reveals domestication events and genes in silkworm (Bombyx). Science. 2009; 326:433–436.
  • 10. Wilkinson M.E., Frangieh C.J., Macrae R.K., Zhang F. Structure of the R2 non-LTR retrotransposon initiating target-primed reverse transcription. Science. 2023; 380:301–308.
  • 11. Ma S.Y., Smagghe G., Xia Q.Y. Genome editing in Bombyx mori: new opportunities for silkworm functional genomics and the sericulture industry. Insect Sci. 2019; 26:964–972.
  • 12. Xia Q., Zhou Z., Lu C., Cheng D., Dai F., Li B., Zhao P., Zha X., Cheng T., Chai C. et al. A draft sequence for the genome of the domesticated silkworm (Bombyx mori). Science. 2004; 306:1937–1940.
  • 13. International Silkworm Genome Consortium. The genome of a lepidopteran model insect, the silkworm Bombyx mori. Insect Biochem. Mol. Biol. 2008; 38:1036–1045.
  • 14. Kawamoto M., Jouraku A., Toyoda A., Yokoi K., Minakuchi Y., Katsuma S., Fujiyama A., Kiuchi T., Yamamoto K., Shimada T. High-quality genome assembly of the silkworm, Bombyx mori. Insect Biochem. Mol. Biol. 2019; 107:53–62.
  • 15. Shimomura M., Minami H., Suetsugu Y., Ohyanagi H., Satoh C., Antonio B., Nagamura Y., Kadono-Okuda K., Kajiwara H., Sezutsu H. et al. KAIKObase: an integrated silkworm genome database and data mining tool. BMC Genomics. 2009; 10:486.
  • 16. Zhu Z., Guan Z., Liu G., Wang Y., Zhang Z. SGID: a comprehensive and interactive database of the silkworm. Database (Oxford). 2019; 2019:baz134.
  • 17. Lu F., Wei Z., Luo Y., Guo H., Zhang G., Xia Q., Wang Y. SilkDB 3.0: visualizing and exploring multiple levels of data for silkworm. Nucleic Acids Res. 2020; 48:D749–D755.
  • 18. Kawamoto M., Kiuchi T., Katsuma S. SilkBase: an integrated transcriptomic and genomic database for Bombyx mori and related species. Database (Oxford). 2022; 2022:baac040.
  • 19. Dai Z.R., Ren J.Y., Tong X.L., Hu H., Lu K.P., Dai F.Y., Han M.J. The landscapes of full-length transcripts and splice isoforms as well as transposons exonization in the lepidopteran model system Bombyx mori. Front. Genet. 2021; 12:704162.
  • 20. Feng M., Xia J.M., Fei S.G., Peng R.X., Wang X., Zhou Y.H., Wang P.W., Swevers L., Sun J.C. Identification of silkworm hemocyte subsets and analysis of their response to baculovirus infection based on single-cell RNA sequencing. Front. Immunol. 2021; 12:645359.
  • 21. Ma Y., Zeng W.H., Ba Y.B., Luo Q., Ou Y., Liu R.P., Ma J.W., Tang Y.Y., Hu J., Wang H.M. et al. A single-cell transcriptomic atlas characterizes the silk-producing organ in the silkworm. Nat. Commun. 2022; 13:3316.
  • 22. Gene Ontology Consortium, Aleksander S.A., Balhoff J., Carbon S., Cherry J.M., Drabkin H.J., Ebert D., Feuermann M., Gaudet P., Harris N.L. et al. The Gene Ontology knowledgebase in 2023. Genetics. 2023; 224:iyad031.
  • 23. Kanehisa M., Furumichi M., Sato Y., Ishiguro-Watanabe M., Tanabe M. KEGG: integrating viruses and cellular organisms. Nucleic Acids Res. 2021; 49:D545–D551.
  • 24. Mistry J., Chuguransky S., Williams L., Qureshi M., Salazar G.A., Sonnhammer E.L.L., Tosatto S.C.E., Paladin L., Raj S., Richardson L.J. et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 2021; 49:D412–D419.
  • 25. Paysan-Lafosse T., Blum M., Chuguransky S., Grego T., Pinto B.L., Salazar G.A., Bileschi M.L., Bork P., Bridge A., Colwell L. et al. InterPro in 2022. Nucleic Acids Res. 2023; 51:D418–D427.
  • 26. Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T.L. BLAST+: architecture and applications. BMC Bioinf. 2009; 10:421.
  • 27. Emms D.M., Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019; 20:238.
  • 28. Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989; 123:585–595.
  • 29. Hudson R.R., Slatkin M., Maddison W.P. Estimation of levels of gene flow from DNA sequence data. Genetics. 1992; 132:583–589.
  • 30. Chen H., Patterson N., Reich D. Population differentiation as a test for selective sweeps. Genome Res. 2010; 20:393–402.
  • 31. Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012; 9:357–359.
  • 32. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009; 25:2078–2079.
  • 33. Ramirez F., Ryan D.P., Gruning B., Bhardwaj V., Kilpert F., Richter A.S., Heyne S., Dundar F., Manke T. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016; 44:W160–W165.
  • 34. Diesh C., Stevens G.J., Xie P.T., Martinez T.D., Hershberg E.A., Leung A., Guo E., Dider S., Zhang J.J., Bridge C. et al. JBrowse 2: a modular genome browser with views of synteny and structural variation. Genome Biol. 2023; 24:74.
  • 35. Koressaar T., Remm M. Enhancements and modifications of primer design program Primer3. Bioinformatics. 2007; 23:1289–1291.
  • 36. Dai F.Y., Tong X.L., Li C.L., Hu H. The genetics of the silkworm. Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Jiangsu University of Science and Technology: The sericultural science in China. 2020; 1st edn. Shanghai, China: Shanghai Scientific & Technical Publishers; 265–309.
  • 37. Xiong G., Tong X., Gai T., Li C., Qiao L., Monteiro A., Hu H., Han M., Ding X., Wu S. et al. Body shape and coloration of silkworm larvae are influenced by a novel cuticular protein. Genetics. 2017; 207:1053–1066.
  • 38. Zhu Y.N., Wang L.Z., Li C.C., Cui Y., Wang M., Lin Y.J., Zhao R.P., Wang W., Xiang H. Artificial selection on storage protein 1 possibly contributes to increase of hatchability during silkworm domestication. PLoS Genet. 2019; 15:e1007616.
  • 39. Xiong G., Tong X.L., Yan Z.W., Hu H., Duan X.H., Li C.L., Han M.J., Lu C., Dai F.Y. Cuticular protein defective Bamboo mutant of Bombyx mori is sensitive to environmental stresses. Pestic. Biochem. Phys. 2018; 148:111–115.
  • 40. Daimon T., Kozaki T., Niwa R., Kobayashi I., Furuta K., Namiki T., Uchino K., Banno Y., Katsuma S., Tamura T. et al. Precocious metamorphosis in the juvenile hormone-deficient mutant of the silkworm, Bombyx mori. PLoS Genet. 2012; 8:e1002486.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
