Abstract
We have developed a linkage map for the silkworm Bombyx mori based on single nucleotide polymorphisms (SNPs) between strains p50T and C108T initially found on regions corresponding to the end sequences of bacterial artificial chromosome (BAC) clones. Using 190 segregants from a backcross of a p50T female × an F1 (p50T × C108T) male, we analyzed segregation patterns of 534 SNPs between p50T and C108T, detected among 3840 PCR amplicons, each associated with a p50T BAC end sequence. This enabled us to construct a linkage map composed of 534 SNP markers spanning 1305 cM in total length distributed over the expected 28 linkage groups. Of the 534 BACs whose ends harbored the SNPs used to construct the linkage map, 89 were associated with 107 different ESTs. Since each of the SNP markers is directly linked to a specific genomic BAC clone and to whole-genome sequence data, and some of them are also linked to EST data, the SNP linkage map will be a powerful tool for investigating silkworm genome properties, mutation mapping, and map-based cloning of genes of industrial and agricultural interest.
THE silkworm, Bombyx mori, is an agriculturally important insect that has been domesticated for an estimated 5000 years and used extensively for silk production. In addition, it is a key model of the Lepidoptera, the second most numerous group of holometabolous insects, which include many beneficials but also the most destructive agricultural pests. The strength of the silkworm as a model arises in part from its value as a genetic resource. Spontaneous mutations found while practicing sericulture were used to help establish principles of classical genetics, including sex linkage and the discovery of no crossing over in lepidopteran females (Tanaka 1913; Sturtevant 1915), maternal mutations (Toyama 1912), homeotic mutants and complex loci (Ichikawa 1943; Tsujita 1953), and the construction of early linkage maps (Fujii et al. 1998; Yasukochi et al. 2005). With the establishment of stable transformation (Yamao et al. 1999; Tamura et al. 2000), the silkworm has shown the potential to produce pharmaceutically important proteins in high yield (Tomita et al. 2003), opening up new applications for sericulture in medical, agricultural, and industrial fields.
Analysis of the silkworm genome began a few years ago because of its importance for breeding and genetic studies, for isolating valuable genes and promoters, and for comparative genomics (Goldsmith et al. 2005). Our group first initiated intensive sequencing of the silkworm genome using expressed sequence tags (ESTs; Mita et al. 2003). Recently, our group (Mita et al. 2004) and a second group (Xia et al. 2004) reported the results of whole-genome shotgun sequencing and provided public access to the assembled silkworm genome data (http://www.dna.affrc.go.jp/genome/; Wang et al. 2005; http://silkworm.genomics.org.cn/).
At present, several silkworm linkage maps based on molecular markers are available, including randomly amplified polymorphic DNA (RAPD) (Promboon et al. 1995; Yasukochi 1998), simple sequence repeat and RAPDs (Nagaraja et al. 2005), amplified fragment length polymorphisms (AFLP) (Tan et al. 2001), restriction fragment length polymorphisms (Shi et al. 1995; Kadono-Okuda et al. 2002; Nguu et al. 2005), and microsatellites (Prasad et al. 2005; Miao et al. 2005). Recently, karyotyping of the chromosomes using bacterial artificial chromosome (BAC)–FISH based on Yasukochi's RAPD map was also reported (Yoshido et al. 2005). However, no linkage map composed of single-nucleotide polymorphisms (SNPs) has been developed, although a characterization of SNPs located in coding regions between different individuals and tissues has been reported (Cheng et al. 2004).
We report here construction of a linkage map of the silkworm genome comprising 534 SNPs. The SNPs were detected between two parental mapping strains, p50T and C108T, and their F1 hybrid by PCR amplification and resequencing of the end-sequence regions of BAC clones that are part of an independent p50T BAC end-sequence project (K. Yamamoto, Y. Suetsugu, J. Narukawa, H. Minami, S. Sasanuma, M. Sasanuma, J. Nohata, K. Kadono-Okuda, M. Shimomura and K. Mita, unpublished results). We constructed the map by scoring discovered SNPs in 190 segregants from a male informative backcross. The SNP maps were assembled into 28 linkage groups, which were further assigned to the standard genetic linkage groups based on morphological markers. In addition, we colocalized 107 ESTs to the same BACs on which the SNPs were discovered. The future localization of SNPs and ESTs on a robust genetic and physical map based on BAC clones will provide a useful tool not only for investigation of genome properties of silkworm and other lepidopteran species, but also for mapping mutations of interest and cloning them by map-based procedures.
MATERIALS AND METHODS
Silkworm strains and crosses:
The inbred silkworm strains p50T and C108T, maintained at the University of Tokyo, were used as parent strains for the mapping panel. For linkage map construction, 190 segregants of a single-pair backcross (BC1) between a p50T female and an F1 male (p50T female × C108T male) were used.
Genomic DNA extraction:
Genomic DNA of parental strains and F1 individuals was isolated from whole bodies of fifth instar larvae after removing midguts and hemolymph as described in a previous report (Nguu et al. 2005). Genomic DNA of individual BC1 segregants was isolated from whole pupae using DNAzol (Invitrogen, San Diego) after freezing in liquid nitrogen and homogenization with stainless steel beads.
BAC libraries:
Two BAC libraries constructed from strain p50T genomic DNA were used for end sequencing of individual clones. One, designated as RPCI-96, was constructed by Pieter de Jong's group (Children's Hospital Oakland Research Institute, Oakland, CA) with genomic DNA partially digested with EcoRI (Koike et al. 2003); the other was constructed with genomic DNA partially digested with BamHI (C. Wu and H. Zhang, personal communication; GeneFinder Genomic Resources, Texas A&M University, College Station, TX).
Purification of BAC clones:
Escherichia coli cells harboring single BAC clones were inoculated into individual wells of a 96-deep-well plate filled with 1.25 ml of 2× LB medium (2% tryptone peptone, 1% yeast extract, and 1% sodium chloride) containing 20 μg/ml of chloramphenicol and cultivated with shaking for 18–20 hr at 37°. The BAC DNA was prepared using a PI-1100 automatic DNA isolation system (KURABO) according to the manufacturer's instructions.
BAC end sequencing:
Sequencing reactions were performed using a reaction mixture composed of 3 μl Big Dye terminator (Applied Biosystems, Foster City, CA), 1.0 μl of 5× sequencing buffer, 0.5–1.0 μg of template DNA, 10 pmol of primer, and 4 mM MgCl2. The thermal cycling reactions were conducted under the following conditions: 96° for 5 min; 99 cycles of 96° for 30 sec, 55° for 10 sec, and 60° for 4 min, followed by 4° on hold. Custom-made T7 and SP6 sequencing primers were used. The DNA was recovered by multiscreen 384SEQ (Millipore, Bedford, MA). Sequence trimming was conducted by processing the traces using the base-calling software PHRED (Ewing and Green 1998; Ewing et al. 1998). Altogether, the sequences of 73,728 ends from the EcoRI-digested library and 42,240 ends from the BamHI-digested library were determined (K. Yamamoto, Y. Suetsugu, J. Narukawa, H. Minami, S. Sasanuma, M. Sasanuma, J. Nohata, K. Kadono-Okuda, M. Shimomura and K. Mita, unpublished results).
TABLE 1.
Sequences retained at each stage of the SNP survey
Stage | No. of sequences |
---|---|
Used for initial SNP survey | 3840 |
Quality sufficient for mapping^a | 3005 |
Containing SNPs | 781 |
Assigned to linkage groups | 534 |
^a Among these sequences, 2109 had no SNPs between the parents, and 115 had SNPs but the F1 did not show a heterozygous sequence peak.
Survey of the SNPs between p50T and C108T:
For the linkage map construction, SNPs, including small base insertions and deletions (indels), were identified in a large number of PCR amplicons designed from the sequence data obtained from the BAC end sequencing. We searched for nonredundant sequences using the BLAST algorithm (Altschul et al. 1990) with an E-value of 1e-50 as a threshold, and then we randomly selected 3840 nonredundant BAC end sequences for the SNP survey. For each end sequence, we designed a PCR primer pair using Primer3 (Rozen and Skaletsky 2000) and performed PCR amplification of the genomic DNA of the parental (p50T and C108T) and F1 strain with ExTaq (TaKaRa) using the manufacturer's instructions. We detected the presence of SNPs in these amplicons by sequencing the 3840 amplicons for all three genotypes and analyzing the resulting traces using PolyPhred (Nickerson et al. 1997).
Linkage map construction:
For the detection of polymorphisms, we used both the direct sequencing of the PCR amplicons from BAC end regions and a fluorescence polarization dye terminator SNP-detection assay (Nasu et al. 2002) in parallel. Details of the primers used for the amplification of each marker are given in supplemental Table S1 at http://www.genetics.org/supplemental/. For each of the SNPs detected between parent strains, the BAC end regions of 190 BC1 segregants were amplified and sequenced, and the polymorphisms of the segregants were determined. The SNPs among the PCR amplicons were also detected by using the AcycloPrime FP SNP detection kit (PerkinElmer Life Science) with a Wallac 1420 ARVOMX instrument (PerkinElmer Life Science). In this system, following enzymatic removal of excess primers and nucleotides, the first PCR amplicons are used as templates for a second PCR reaction based on primers designed to terminate immediately upstream of the polymorphic site to incorporate one of two fluorescent terminators, representing the allelic SNP nucleotides, into products. The SNPs are detected by differential fluorescence polarization of the two terminator dyes.
Segregation patterns were analyzed using Mapmaker/exp (version 3.0; Lander et al. 1987) with the Kosambi mapping function (Kosambi 1944). The “GROUP” command (Linkage Groups at min LOD 3.00, max Distance 37.2) was used to cluster all informative markers into linkage groups. The “COMPARE” commands were used to construct a framework (draft) of each linkage group, and additional marker positions were assigned by using the “TRY” command. Sequence and typing errors were detected with the “ERROR DETECTION” option.
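The relationship between recombination fraction, Kosambi map distance, and LOD score referred to above can be illustrated with a short sketch. It is not part of Mapmaker/exp and does not reproduce its multipoint maximum-likelihood analysis; the function names (`kosambi_cm`, `kosambi_r`, `haldane_cm`, `two_point_lod`) are hypothetical, and the LOD calculation assumes a backcross with known phase and no missing genotypes.

```python
import math


def kosambi_cm(r):
    """Kosambi map distance (cM) for a recombination fraction r in [0, 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(distance_cm):
    """Inverse Kosambi function: recombination fraction for a distance in cM."""
    return 0.5 * math.tanh(distance_cm / 50.0)


def haldane_cm(r):
    """Haldane map distance (cM), shown only for comparison with Kosambi."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def two_point_lod(recombinants, total):
    """Two-point LOD for linkage in a backcross (phase known, no missing data).

    Compares the likelihood at the observed recombination fraction with the
    likelihood under free recombination (r = 0.5).
    """
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    r = recombinants / total
    if r >= 0.5:
        return 0.0  # no evidence for linkage
    log_l = (total - recombinants) * math.log10(1.0 - r)
    if recombinants:
        log_l += recombinants * math.log10(r)
    return log_l - total * math.log10(0.5)


if __name__ == "__main__":
    print(f"r = 0.10 -> {kosambi_cm(0.10):.1f} cM (Kosambi)")
    print(f"37.2 cM (Kosambi) -> r = {kosambi_r(37.2):.3f}")
    print(f"LOD for 0 recombinants in 190: {two_point_lod(0, 190):.1f}")
```

Under these assumptions, two markers with no recombinants among 190 segregants give a LOD of about 57, and a Kosambi distance of 37.2 cM corresponds to a recombination fraction of about 0.32, which is close to 50 cM under the Haldane function.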
Mapping of the p locus:
p50T larvae have dark body pigments (+p phenotype), but C108T larvae are unmarked (p plain phenotype). Thus, BC1 segregants with the same intense dark pigment as p50T were scored as homozygous (+p/+p) phenotype, and those with light-colored pigment, compared to p50T, were scored as heterozygous (+p/p). The segregation pattern obtained was analyzed together with SNP markers by Mapmaker/exp as described above.
Localization of ESTs on BAC clones:
A p50T EcoRI–BAC library consisting of 36,864 clones with an average insert size of 168 kb was arrayed in duplicate in specific patterns onto two nylon membranes to make BAC high-density replica (HDR) filters (copies of BAC library and HDR filters are available through BACPAC Resources at the Children's Hospital Oakland Research Institute; http://www.chori.org/bacpac). HDR filters were hybridized with PCR-amplified inserts of individual cDNA clones representing random, nonredundant EST sequences from a large-scale sequencing project (Mita et al. 2003; K. Mita, K. Kadono-Okuda, K. Yamamoto, M. Shimomura, Y. Nagamura, J. Nohata, S. Sasanuma, M. Sasanuma, M. R. Goldsmith and T. Shimada, unpublished results). Labeling, hybridization, and detection were performed using the ECL direct nucleic acid labeling and detection system kit (Amersham Pharmacia Biotech) according to the manufacturer's instructions (Koike et al. 2003).
RESULTS
Preliminary analysis of SNP frequency and characteristics:
Since minimal information was available on the frequency and characteristics of SNPs in the silkworm genome (Cheng et al. 2004), we carried out a survey using the end-sequence data of clones from two silkworm genomic BAC libraries constructed from strain p50T with partial EcoRI-digested or BamHI-digested genomic DNA in conjunction with a whole-genome physical mapping project (K. Yamamoto, Y. Suetsugu, J. Narukawa, H. Minami, S. Sasanuma, M. Sasanuma, J. Nohata, K. Kadono-Okuda, M. Shimomura and K. Mita, unpublished results). We eliminated redundant sequences from the complete data set of 115,968 BAC ends using the BLAST algorithm and randomly selected 3840 nonredundant sequences (3072 from the EcoRI- and 768 from the BamHI-digested library) for the SNP survey. In an initial characterization, we randomly chose 95 BAC end sequences from this subset, designed specific primer pairs for the respective end-sequence regions, and amplified the regions from genomic DNA of p50T and C108T by PCR. The amplicons from the two strains were then resequenced. We found SNPs in 31 BAC ends and detected 133 SNPs within a total sequence length of 54,586 bp. This indicated that the frequency of SNPs for our data set was one per 410 bp, sufficient for linkage mapping. Among these SNPs, transitions (A/G and C/T substitutions) accounted for 49%, transversions accounted for 46% [21, 17, and 8% for A/T, A/C (G/T), and G/C substitutions, respectively], and small base indels (Hayashi et al. 2004) accounted for 5%. Among the transversions, A/T substitutions were the most frequent, whereas G/C substitutions were the least frequent. The ratio of transitions to transversions was 1.08.
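The SNP frequency and substitution-class figures above follow from simple arithmetic, which the sketch below reproduces from the values reported in the text. The variable and function names are hypothetical, and because only rounded percentages are given, the transition/transversion ratio computed here (about 1.07) may differ slightly from the reported 1.08, presumably because of rounding.

```python
def bp_per_snp(total_bp, snp_count):
    """Average number of base pairs per SNP in the resequenced regions."""
    return total_bp / snp_count


# Values reported in the text for the preliminary 95-amplicon survey.
TOTAL_BP = 54_586
SNP_COUNT = 133

# Rounded percentages of each polymorphism class, as reported.
CLASS_PERCENT = {
    "transition": 49,   # A/G and C/T
    "A/T": 21,          # transversion
    "A/C or G/T": 17,   # transversion
    "G/C": 8,           # transversion
    "indel": 5,
}

spacing = bp_per_snp(TOTAL_BP, SNP_COUNT)
transversion_percent = sum(
    CLASS_PERCENT[name] for name in ("A/T", "A/C or G/T", "G/C")
)
ts_tv_from_percent = CLASS_PERCENT["transition"] / transversion_percent

if __name__ == "__main__":
    print(f"One SNP per {spacing:.0f} bp")
    print(f"Transversions: {transversion_percent}%")
    print(f"Ts/Tv from rounded percentages: {ts_tv_from_percent:.2f}")
```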
SNP survey:
We designed primers for each of the 3840 BAC end sequences and used them to amplify genomic DNA from p50T and C108T, two standard strains that have been used routinely for large-scale molecular linkage map construction, as well as their F1 hybrid. We then resequenced the amplicons and examined the traces for evidence of SNPs. The results of this survey, including small nucleotide indels, are summarized in Table 1. Among the 3840 sequences analyzed, 3005 gave sequence data of sufficient quality for further analysis. Among them, 2109 did not have SNPs and 115 had SNPs between the parents but the F1 did not show a heterozygous sequence peak at the expected position. Consequently, we initially identified SNPs in the remaining 781 sequences, which could be used for a genetic analysis of polymorphism. However, we did not obtain sequence of sufficient quality for 247 of these SNPs in the backcross mapping panel, giving a final yield of 534 SNPs for the linkage analysis.
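The attrition from 3840 surveyed sequences to 534 mapped SNPs can be checked with the following bookkeeping sketch. The counts are those reported in the text and in Table 1; the variable names and the percentage summary are illustrative additions, not part of the original analysis.

```python
# Counts reported in the SNP survey.
SURVEYED = 3840        # BAC end sequences with designed primer pairs
USABLE = 3005          # amplicons with sequence quality sufficient for analysis
NO_SNP = 2109          # usable amplicons without SNPs between the parents
F1_NOT_HET = 115       # parental SNP present, but no heterozygous F1 peak
LOST_IN_PANEL = 247    # SNPs with insufficient quality in the BC1 panel

with_snp = USABLE - NO_SNP - F1_NOT_HET
mapped = with_snp - LOST_IN_PANEL


def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


if __name__ == "__main__":
    stages = [
        ("Surveyed", SURVEYED),
        ("Usable sequence", USABLE),
        ("Containing SNPs", with_snp),
        ("Mapped", mapped),
    ]
    for label, count in stages:
        print(f"{label:<18}{count:>6}{percent(count, SURVEYED):>8.1f}%")
```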
Linkage map construction:
To construct the linkage map, using the same PCR amplification and resequencing procedure in parallel with a fluorescence polarization dye terminator SNP-detection assay (Nasu et al. 2002) (AcycloPrime FP method), we surveyed the segregation patterns of the SNPs of 190 BC1 individuals from a single pair mating between a p50T female and an F1 male (p50T female × C108T male). On the basis of an analysis of the data using Mapmaker/exp (version 3.0; LOD score 3.0), we successfully positioned 534 SNPs on the linkage map. The results are summarized in Table 2. The SNP markers segregated into 28 linkage groups, with a total recombination length of 1305 cM.
TABLE 2.
Summary of linkage groups in SNP linkage map
Linkage group^a | Morphological mutants^b | No. of markers | Recombination length (cM) |
---|---|---|---|
1 | — | 23 | 45 |
2 | p | 11^c | 32 |
3 | Ze, lem | 13 | 35 |
4 | L | 26 | 51 |
5 | oc | 27 | 60 |
6 | E | 24 | 43 |
7 | q | 21 | 46 |
8 | st | 27 | 54 |
9 | Ia | 23 | 34 |
10 | w-2 | 32 | 44 |
11 | K | 30 | 64 |
12 | C | 21 | 47 |
13 | ch | 26 | 47 |
14 | U | 8 | 52 |
15 | bl | 25 | 42 |
16 | cts | 14 | 51 |
17 | bts | 16 | 46 |
18 | mln | 21 | 48 |
19 | nb | 17 | 49 |
20 | oh | 8 | 27 |
21 | Lan | 19 | 45 |
22 | or | 21 | 59 |
23 | tub | 27 | 53 |
24 | Ym, sel | 10 | 48 |
25 | oy | 19 | 39 |
26 | so | 7 | 48 |
A | — | 7 | 42 |
B | — | 12 | 54 |
Total | | 535 | 1305 |
A total of 534 SNP markers and one morphological marker segregated into 28 linkage groups.
^a The numbering of the linkage groups corresponds to that of the standard silkworm linkage map (Fujii et al. 1998).
^b The linkage group numbers except groups 1, A, and B were determined by using 27 morphological mutants specific to previously assigned groups (Fujii et al. 1998).
^c The number of markers includes the p locus.
Assignment of linkage groups:
We assigned one of the SNP linkage groups to the Z chromosome by exploiting the fact that in the silkworm the female is heterogametic (ZW) and the male is homogametic (ZZ). Consequently, a BC1 female is predicted to have a sex chromosome pair of either Z-p50T/W-p50T or Z-C108T/W-p50T, whereas autosomes (A) will be A-p50T/A-p50T or A-C108T/A-p50T. Since Z and W chromosomes have virtually no correspondence, Z-linked markers are hemizygous in females. Therefore, we assigned to the Z chromosome the linkage group whose SNPs showed either the p50T or the C108T pattern in BC1 females, in contrast to the autosomal groups, which displayed homozygous p50T or heterozygous p50T/C108T genotypes.
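The reasoning used to recognize Z-linked markers can be expressed as a simple decision rule over the genotype calls of BC1 females. The sketch below is only an illustration of that logic: the single-letter genotype codes ("P" for p50T only, "C" for C108T only, "H" for heterozygous) and the function name `classify_marker` are assumptions introduced here, not the scoring scheme actually used.

```python
def classify_marker(female_calls):
    """Classify a marker from genotype calls in BC1 females.

    Each call is "P" (only the p50T allele), "C" (only the C108T allele),
    or "H" (heterozygous p50T/C108T). In this backcross, autosomal markers
    should appear as P or H, whereas Z-linked markers are hemizygous in
    females and should appear as P or C, never H.
    """
    calls = set(female_calls)
    if not calls <= {"P", "C", "H"}:
        raise ValueError(f"unexpected genotype codes: {sorted(calls)}")
    if "C" in calls and "H" not in calls:
        return "Z-linked"
    if "H" in calls and "C" not in calls:
        return "autosomal"
    # Only P observed, or a mixture of C and H (possible scoring error).
    return "ambiguous"


if __name__ == "__main__":
    print(classify_marker(["P", "C", "C", "P"]))  # hemizygous pattern
    print(classify_marker(["P", "H", "H", "P"]))  # autosomal pattern
```

A marker for which all females carry only the p50T allele is uninformative under this rule, so in practice several markers per linkage group would need to be examined.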
We assigned 25 of the autosomal SNP linkage groups to standard silkworm linkage groups 2–26 summarized in Fujii et al. (1998) and Banno et al. (2005). For the assignment, we used one or two morphological mutations specific to each of linkage groups 2–26 (27 mutations total). The mutations used for each group are listed in Table 2.
Marker strains with morphological mutations specific to individual linkage groups of the standard maps were crossed with a normal (wild-type) strain, and the F1 female was backcrossed to either the recessive homozygous mutant male or to a normal (wild-type) male as required to score segregants. The procedure relies on lack of crossing over in females to give complete linkage for markers on a given chromosome (Yasukochi et al. 2005). We selected a SNP marker from each of the linkage groups of our map, examined whether or not the markers cosegregated with the morphological mutations, and determined the SNPs corresponding to the mutations. Twenty-three backcrosses were carried out to marker stocks, and the cosegregations were scored in 6–22 individuals for each cross. By the analysis, we obtained 27 SNPs corresponding to 27 morphological mutations and assigned the linkage groups of our map to the standard silkworm linkage groups 2–26. We assigned the remaining two linkage groups provisionally to groups A and B, which do not have well-defined morphological markers.
The SNP linkage map is illustrated in Figure 1. The number of markers per linkage group varies from 7 (groups 26 and A) to 32 (group 10), and the recombination length for each linkage group ranges from 27 cM (group 20) to 64 cM (group 11). The average distance between the markers is 2.5 cM. The markers are not evenly distributed throughout the linkage map, and so they may be dense or sparse, depending on the region. For example, there are 14 gaps with lengths exceeding 10 cM, including large gaps on the proximal end of linkage group B (20.2 cM) and on the distal part of linkage group 14 (21.3 cM). Conversely, several regions with relatively high marker density also exist (e.g., the middle portion of linkage group 23, from T013C09 to T603H10, with an average distance of 0.78 cM). Details of each marker, including BAC accession number, are described in supplemental Table S1 at http://www.genetics.org/supplemental/.
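Marker spacing and gaps of the kind described above can be computed directly from ordered map positions. The following sketch assumes marker positions are available as a list of cM values per linkage group; the function name `marker_gaps` and the example positions are hypothetical and are not taken from the actual map.

```python
def marker_gaps(positions_cm, threshold_cm=10.0):
    """Return the mean interval and the intervals larger than a threshold.

    positions_cm: map positions (cM) of the markers in one linkage group.
    The result is (mean interval in cM, list of (start, end) pairs whose
    length exceeds threshold_cm).
    """
    ordered = sorted(positions_cm)
    pairs = list(zip(ordered, ordered[1:]))
    intervals = [end - start for start, end in pairs]
    mean_interval = sum(intervals) / len(intervals) if intervals else 0.0
    large = [(start, end) for start, end in pairs if end - start > threshold_cm]
    return mean_interval, large


# Genome-wide average spacing from the totals reported in the text.
TOTAL_CM = 1305
SNP_MARKERS = 534
average_spacing = TOTAL_CM / SNP_MARKERS

if __name__ == "__main__":
    # Hypothetical positions for one linkage group, for illustration only.
    example = [0.0, 2.1, 4.0, 25.3, 26.0]
    mean_interval, large = marker_gaps(example)
    print(f"mean interval: {mean_interval:.1f} cM, gaps > 10 cM: {large}")
    print(f"genome-wide average: {average_spacing:.2f} cM per marker")
```

Dividing the total length by the number of SNP markers gives about 2.4 cM, consistent with the rounded average of 2.5 cM quoted above.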
Figure 1.
B. mori SNP linkage map based on 534 SNP markers segregated into 28 linkage groups, represented by vertical lines. The BAC clones corresponding to the mapped SNP markers are shown at the right of the vertical lines; the recombination distances between the markers are indicated at the left. The p locus was mapped on linkage group 2 at the position of 10.3 cM from the proximal end.
Mapping of the p locus on the linkage map:
Alleles for one morphological marker, the p locus, which affects larval body pigmentation, segregated in the initial SNP mapping panel. p50T carries a semidominant allele that confers dark pigmentation, whereas C108T is unpigmented or “plain.” We were able to discriminate homozygotes and heterozygotes by scoring pigment intensity in fifth instar larvae. We mapped the p locus to a position 10.3 cM from the proximal end of one SNP linkage group; on the basis of the presence of this marker we assigned that group to standard linkage group 2.
Mapping of EST markers onto the linkage map:
In parallel, as part of an independent project, we have been proceeding with the construction of BAC contigs with the “overlapping” method, whereby BAC clones are arrayed on HDR filters and subjected to large-scale screening by hybridization with individual, nonredundant cDNAs representing ESTs (Mita et al. 2003). Contigs are constructed by the presence of one or more common ESTs on different BAC clones. So far, we have carried out ∼6000 hybridizations (data not shown). Upon inspection, we found that a number of BAC clones containing nonredundant ESTs corresponded to ones mapped by the SNP analysis reported here. We designed specific primer sets to amplify expected ESTs on selected BACs by PCR and, finally, confirmed the presence of 107 ESTs on 89 of the mapped SNP-containing BACs (Table 3). Forty-nine of these ESTs (46%) were found to have deduced amino acid sequences with significant homology to known proteins.
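The idea of joining BAC clones into contigs whenever they share an EST can be sketched as a connected-components problem. The code below is a minimal illustration using a union-find structure; it is not the procedure actually used in the contig project, and the function name `build_contigs` and the clone and EST identifiers in the example are hypothetical.

```python
from collections import defaultdict


def build_contigs(est_hits):
    """Group BAC clones that are connected through shared ESTs.

    est_hits maps an EST identifier to the BAC clones it hybridized to.
    Returns a sorted list of contigs, each a sorted list of BAC identifiers.
    """
    parent = {}

    def find(bac):
        # Locate the representative clone, compressing the path as we go.
        parent.setdefault(bac, bac)
        while parent[bac] != bac:
            parent[bac] = parent[parent[bac]]
            bac = parent[bac]
        return bac

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for bacs in est_hits.values():
        bacs = list(bacs)
        for bac in bacs:
            find(bac)  # register clones that share no EST with others
        for other in bacs[1:]:
            union(bacs[0], other)

    groups = defaultdict(set)
    for bac in list(parent):
        groups[find(bac)].add(bac)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    # Hypothetical hybridization results, for illustration only.
    hits = {
        "est1": ["bacA", "bacB"],
        "est2": ["bacB", "bacC"],
        "est3": ["bacD"],
    }
    print(build_contigs(hits))
```

In this toy example, bacA and bacC never share an EST directly but fall into the same contig because both overlap bacB.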
TABLE 3.
Mapped ESTs
Linkage group | Position (cM) | BAC ID | EST ID^a | EST accession no. | Description^b | Score (bits) | E-value |
---|---|---|---|---|---|---|---|
1 | 19.5 | 083M08 | wv40151 | AU005825 | ref\|XP_396476.1\| similar to ENSANGP00000009569 (Apis mellifera) | 190 | 2e-47 |
1 | 23.7 | 092G15 | wdS30172 | AU005039 | ref\|XP_315683.1\| ENSANGP00000021837 (Anopheles gambiae) | 139 | 6e-32 |
2 | 17.1 | 084K13 | heS30665 | AV402881 | ref\|XP_308839.1\| ENSANGP00000021738 (A. gambiae) | 96 | 1e-26 |
3 | 17.5 | 031A16 | heS30235 | AV402629 | | | |
3 | 17.5 | 031A16 | prgv0160 | AV404557 | | | |
3 | 17.5 | 008I09 | wdS00806 | AU003984 | | | |
4 | 18.0 | 024J19 | wdS30342 | AU005170 | | | |
4 | 26.4 | 085A12 | wdS30999 | AU005669 | | | |
4 | 26.4 | 047G12 | heS30087 | AV402533 | | | |
4 | 26.4 | 047G12 | n0006 | AU002482 | ref\|XP_320836.1\| ENSANGP00000017965 (A. gambiae) | 188 | 4e-47 |
4 | 26.4 | 047G12 | heS00144 | AV401774 | ref\|XP_319975.1\| ENSANGP00000016783 (A. gambiae) | 106 | 5e-22 |
4 | 26.9 | 074F19 | heS30428 | AV402746 | gb\|AAM50732.1\| GM29503p (Drosophila melanogaster) | 164 | 6e-40 |
4 | 47.1 | 007M07 | NV021016 | AV398042 | | | |
5 | 14.7 | 038K12 | e96h0575 | AV401340 | ref\|XP_307965.1\| ENSANGP00000013477 (A. gambiae) | 231 | 1e-59 |
5 | 14.7 | 035M24 | heS30184 | AV402596 | | | |
5 | 27.5 | 013O10 | n0671 | AU002914 | ref\|XP_309541.1\| ENSANGP00000012655 (A. gambiae) | 171 | 6e-42 |
5 | 29.6 | 014B16 | NV060186 | AV398700 | ref\|XP_397415.1\| similar to CG3204-PA (A. mellifera) | 73 | 3e-12 |
5 | 29.6 | 014B16 | fbpv0469 | BP125169 | | | |
5 | 33.8 | 009D18 | wdS00230 | AU003516 | | | |
5 | 38.0 | 001P02 | n0242 | AU002614 | | | |
5 | 39.6 | 030F01 | e96h0213 | AV401089 | ref\|NP_001002720.1\| zgc:86649 (Danio rerio) | 65 | 6e-16 |
5 | 45.4 | 010C18 | heS30303 | AV402666 | | | |
5 | 45.4 | 035H24 | e96h0691 | AV401418 | ref\|XP_321293.1\| ENSANGP00000018447 (A. gambiae) | 84 | 2e-15 |
5 | 45.4 | 035H24 | wdS00334 | AU003610 | ref\|XP_394153.1\| similar to RIKEN cDNA 4921516M08 (A. mellifera) | 189 | 4e-47 |
6 | 6.3 | 015A16 | msgV0729 | AV403591 | | | |
6 | 6.3 | 006P19 | heS00306 | AV401906 | gb\|AAL76013.1\| putative carboxylesterase (Aedes aegypti) | 207 | 1e-52 |
6 | 14.1 | 008L21 | n0666 | AU002911 | ref\|XP_311717.1\| ENSANGP00000014281 (A. gambiae) | 94 | 4e-18 |
6 | 14.1 | 023I12 | wdS00704 | AU003886 | gb\|AAH73002.1\| MGC82583 protein (Xenopus laevis) | 117 | 3e-25 |
6 | 14.1 | 023I12 | NV060169 | AV398683 | | | |
7 | 20.6 | 067B20 | heS30988 | AV403051 | | | |
8 | 31.6 | 026N03 | NV060214 | AV398725 | | | |
8 | 33.6 | 020I11 | wdS00070 | AU003368 | gb\|AAD38624.1\| BcDNA.GH08388 (D. melanogaster) | 178 | 1e-43 |
8 | 48.6 | 054G16 | heS30160 | AV402579 | ref\|XP_394434.1\| similar to CG1782-PA (A. mellifera) | 169 | 4e-41 |
9 | 13.8 | 011L18 | wdS00098 | AU003395 | | | |
10 | 8.5 | 072H16 | heS00167 | AV401792 | | | |
10 | 9.0 | 025H16 | wdS20879 | AU004814 | | | |
10 | 9.0 | 025H16 | wdS00081 | AU003379 | ref\|XP_393498.1\| similar to CG2182-PA (A. mellifera) | 201 | 1e-50 |
10 | 12.2 | 003B12 | wv40101 | AU005784 | | | |
10 | 12.2 | 032C12 | wdS30748 | AU005485 | ref\|XP_315790.1\| ENSANGP00000018745 (A. gambiae) | 149 | 7e-35 |
10 | 29.7 | 002C12 | heS30901 | AV403016 | | | |
10 | 30.2 | 057N02 | wv41052 | AU006456 | | | |
11 | 0.0 | 028M09 | heS30831 | AV402984 | ref\|XP_392005.1\| similar to ENSANGP00000019468 (A. mellifera) | 207 | 2e-52 |
11 | 9.6 | 047C08 | msgV0004 | AV403097 | dbj\|BAD00700.1\| sericin 1B (B. mori) | 75 | 4e-13 |
11 | 13.7 | 038J22 | wdS20766 | AU004713 | ref\|XP_308558.1\| ENSANGP00000016021 (A. gambiae) | 367 | e-100 |
11 | 13.7 | 038J22 | NV021864X | AV398486 | gb\|AAH47966.1\| Ruvbl2-prov protein (X. laevis) | 226 | 2e-58 |
11 | 19.6 | 067D01 | heS30263 | AV402645 | ref\|XP_310759.1\| ENSANGP00000023871 (A. gambiae) | 131 | 8e-31 |
11 | 24.8 | 027L05 | wdS00999 | AU004165 | gb\|AAL90438.1\| SD10213p (D. melanogaster) | 265 | 8e-70 |
11 | 24.8 | 027L05 | msgV0666 | AV403540 | | | |
11 | 26.9 | 004A06 | wdS30801 | AU005526 | | | |
11 | 26.9 | 004A06 | prgv0990 | AV405259 | ref\|XP_308810.1\| ENSANGP00000009906 (A. gambiae) | 150 | 2e-35 |
11 | 30.6 | 042B14 | wv41029 | AU006435 | | | |
11 | 35.8 | 044F05 | msgV0402 | AV403384 | | | |
11 | 48.8 | 034O20 | heS00552 | AV402088 | ref\|NP_608909.1\| CG8680-PA (D. melanogaster) | 139 | 3e-32 |
11 | 63.2 | 002C09 | e40h269 | AU000228 | ref\|XP_393536.1\| similar to WD repeat protein Bub3 (A. mellifera) | 187 | 1e-46 |
12 | 27.0 | 016I05 | wdS30491 | AU005281 | | | |
12 | 36.9 | 055M23 | wdS30367 | AU005188 | gb\|AAH02240.1\| 2810410M20Rik protein (Mus musculus) | 100 | 3e-20 |
13 | 18.1 | 031E16 | e96h0224 | AV401096 | | | |
13 | 18.1 | 031E16 | fbpv0854 | BP125451 | | | |
13 | 22.8 | 015F01 | wv40022 | AU005726 | | | |
13 | 22.8 | 038L08 | prgv0635 | AV404953 | | | |
13 | 22.8 | 044F21 | fbpv0831 | BP125432 | ref\|NP_648779.1\| CG16979-PA (D. melanogaster) | 117 | 2e-25 |
13 | 41.7 | 001D09 | e40h431 | AU000345 | | | |
14 | 6.4 | 054J20 | wv40057 | AU005752 | | | |
15 | 0.0 | 022M13 | wdS00110 | AU003407 | | | |
15 | 0.0 | 022M13 | wdS30866 | AU005569 | ref\|XP_397214.1\| similar to hypothetical protein FLJ11171 (A. mellifera) | 241 | 1e-62 |
15 | 3.7 | 056C15 | wdS30669 | AU005430 | ref\|XP_309182.1\| ENSANGP00000018770 (A. gambiae) | 169 | 2e-41 |
15 | 9.5 | 045C06 | wdS20853 | AU004791 | | | |
15 | 19.0 | 046N24 | wdS30871 | AU005573 | | | |
15 | 19.0 | 046N24 | fbpv0538 | BP125209 | | | |
15 | 19.0 | 089J12 | wdS30521 | AU005309 | | | |
15 | 19.5 | 046D04 | wdS00064 | AU003362 | | | |
15 | 22.0 | 022A01 | prgv0415 | AV404765 | | | |
15 | 25.7 | 044H18 | n1053 | AU003230 | emb\|CAD35493.1\| acidic ribosomal protein P1 (B. mori) | 140 | 9e-33 |
15 | 25.7 | 003G01 | n0090 | AU002534 | gb\|AAO38522.1\| pumilio RBD (Schistocerca americana) | 233 | 8e-61 |
16 | 2.3 | 042H15 | e40h872 | AU000684 | gb\|AAL39863.1\| LP02196p (D. melanogaster) | 182 | 5e-45 |
16 | 23.1 | 007P07 | wdS30025 | AU004944 | | | |
16 | 23.1 | 019G02 | wdS00972 | AU004141 | | | |
16 | 23.1 | 019G02 | wv40171 | AU005842 | | | |
16 | 33.1 | 037B10 | msgV0270 | AV403279 | ref\|XP_214554.2\| similar to asparaginyl-tRNA synthetase, cytoplasmic (Rattus norvegicus) | 171 | 8e-42 |
17 | 11.4 | 041G05 | e40h320 | AU000267 | ref\|XP_393352.1\| similar to ENSANGP00000010223 (A. mellifera) | 177 | 9e-44 |
17 | 15.4 | 065K03 | wv40116 | AU005796 | gb\|AAL26577.1\| ribosomal protein L29 (Spodoptera frugiperda) | 72 | 2e-11 |
17 | 30.6 | 022L03 | fbpv0625 | BP125270 | | | |
17 | 38.0 | 013J08 | wdS30444 | AU005242 | | | |
17 | 38.0 | 013J08 | wdS30864 | AU005567 | | | |
18 | 23.1 | 046A02 | wv41050 | AU006454 | | | |
18 | 23.1 | 046A02 | e96h0001 | AV400935 | | | |
18 | 31.0 | 004E17 | wdS30171 | AU005038 | ref\|XP_318676.1\| ENSANGP00000022240 (A. gambiae) | 114 | 2e-24 |
18 | 38.9 | 018F05 | wdS00027 | AU003326 | | | |
18 | 38.9 | 018F05 | e40h958 | AU000761 | emb\|CAG09542.1\| unnamed protein product (Tetraodon nigroviridis) | 114 | 2e-24 |
19 | 32.5 | 018I09 | NV060108 | AV398630 | ref\|NP_724756.1\| CG8068-PE (D. melanogaster) | 286 | 3e-76 |
19 | 33.0 | 035K05 | e40h854 | AU000668 | ref\|XP_315645.1\| ENSANGP00000021747 (A. gambiae) | 278 | 3e-74 |
20 | 10.5 | 009O17 | prgv0351 | AV404708 | | | |
20 | 10.5 | 009O17 | wdS00362 | AU003633 | ref\|XP_317592.1\| ENSANGP00000010195 (A. gambiae) | 228 | 8e-59 |
20 | 21.9 | 032I03 | e40h739 | AU000586 | | | |
21 | 6.3 | 059O10 | heS30277 | AV402652 | | | |
22 | 26.4 | 018J08 | wdS00826 | AU004002 | ref\|XP_316177.1\| ENSANGP00000019739 (A. gambiae) | 104 | 5e-22 |
22 | 26.4 | 016C06 | wdS00218 | AU003507 | | | |
22 | 40.5 | 042G05 | NV021119X | AV398091 | ref\|XP_320303.1\| ENSANGP00000016464 (A. gambiae) | 99 | 7e-20 |
23 | 26.5 | 018N04 | prgv0356 | AV404712 | ref\|XP_311443.1\| ENSANGP00000018481 (A. gambiae) | 139 | 6e-32 |
23 | 28.0 | 032M24 | prgv0316 | AV404674 | | | |
23 | 31.2 | 015J24 | wdS00876 | AU004049 | ref\|XP_394735.1\| similar to ENSANGP00000013413 (A. mellifera) | 115 | 3e-26 |
23 | 36.9 | 015C02 | wdS30898 | AU005593 | | | |
25 | 8.5 | 038H21 | mg0587 | AU002179 | | | |
25 | 22.1 | 035A16 | e96h0349 | AV401191 | | | |
25 | 23.7 | 091F16 | wdS30954 | AU005638 | | | |
A | 19.6 | 016G20 | NV021794X | AV398424 | ref\|XP_319486.1\| ENSANGP00000020215 (A. gambiae) | 190 | 1e-47 |
B | 23.7 | 011M16 | wdS00408 | AU003662 | ref\|XP_393101.1\| similar to ENSANGP00000001418 (A. mellifera) | 86 | 6e-16 |
^a For EST IDs, refer to Mita et al. (2003).
^b Obtained by deduced amino acid sequence homology search against the public protein database NCBI–NR using BLASTX 2.2.10. An E-value of 1e-10 was used as a threshold.
DISCUSSION
Recently, the importance of insects, especially the silkworm, as genetic and bio-material resources has increased. In Japan, the genetic data on the silkworm have accumulated throughout the long history of sericultural study. The silkworm is also important as a model insect for Lepidoptera, which include the most highly destructive agricultural pests. Due to these industrial and agricultural interests, the genome analysis of the silkworm is urgently required. In this study, we have carried out the first linkage analysis of this insect based on SNP information within the BAC end-sequence regions. We also assigned the SNP linkage groups to those of previous linkage maps using morphological mutants.
The silkworm genome is expected to contain a high frequency of repetitive sequences, such as transposable elements (Mita et al. 2004; Xia et al. 2004). We made an effort to eliminate as many SNPs as possible that originated from repetitive sequences by selecting nonredundant sequences from the initial BAC end-sequence population using the BLAST algorithm. Of these, we then randomly selected 3840 sequences for the SNP survey. Finally, we used the SNPs detected in this subset for detailed characterization and map construction.
The SNP density of silkworm BAC end sequences was nearly twofold higher than that reported previously by Cheng et al. (2004) (1/410 bp vs. 1/775 bp). This seems reasonable, since we characterized SNPs between two different strains, p50T and C108T, whereas Cheng et al. (2004) detected those between individuals or tissues of the same strain. Furthermore, the previous analysis was performed on SNPs located in coding regions (EST sequences). In contrast, we used BAC end sequences, which contained both coding and noncoding regions. Although we sequenced in total >45 Mbp of BAC ends, the annotation of the genome is still incomplete at present and so it was difficult to estimate the extent of coding regions in our data set. Noncoding, intronic, and intergenic regions are expected to have a higher SNP frequency than regions coding for amino acids, and thus the noncoding portion of the end sequences is most likely to have contributed to the observed increase in SNP density.
We also mapped the p locus onto standard linkage group 2 at a position in good agreement with previous linkage maps (Banno et al. 2005). The nearest marker to the p locus in our SNP map is T068P23, which lies 1.6 cM away. Mapping additional SNP markers (i.e., BAC clone markers) closer to the p locus would greatly help in isolating this gene, which has many allelic variants affecting larval pigment patterns and is thus of considerable interest for understanding the molecular basis of a key lepidopteran trait.
The recombination lengths of our linkage groups varied from 27 to 64 cM, shorter than those of previously reported linkage maps. For example, the Z chromosomes of the RAPD and AFLP maps constructed by Yasukochi (1998) and Tan et al. (2001) have recombination lengths of ∼94 and 417.8 cM, respectively. In contrast, the Z chromosome of our SNP map was 45 cM, around one-ninth to one-half the length of the others. The maps were composed of similar numbers of markers, so marker density is an unlikely explanation for the length differences. It is possible that our map did not reach the ends of the chromosome. To test this, we determined the positions of the markers mapped at the proximal and distal ends of the Z chromosome map, T004C11 and T027C05, by fluorescent in situ hybridization using the corresponding BACs. We found that these markers did in fact map to positions at the chromosome ends (S. Kuwazaki, J. Narukawa, K. Mita and K. Yamamoto, unpublished data). In addition, although markers were unevenly distributed over the chromosomes in all three maps, the marker distribution patterns were dissimilar, suggesting that they did not reflect true variations in physical distance. Although the reasons for these observations are not clear at present, increasing the marker density may provide the information needed to understand these phenomena.
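The length comparison above reduces to simple ratios, sketched below using only values reported in this article (the 1305 cM total map length and 28 linkage groups are taken from the abstract). The variable and function names are illustrative.

```python
# Z-chromosome map lengths in cM, as quoted in the text.
snp_map_z_cm = 45.0
other_maps_z_cm = {
    "RAPD (Yasukochi 1998)": 94.0,
    "AFLP (Tan et al. 2001)": 417.8,
}

def length_ratio(ours_cm, other_cm):
    """Fraction of the other map's length covered by our map."""
    return ours_cm / other_cm

ratios = {name: length_ratio(snp_map_z_cm, cm) for name, cm in other_maps_z_cm.items()}
for name, ratio in ratios.items():
    print(f"{name}: ratio = {ratio:.3f} (about 1/{1 / ratio:.1f})")

# Sanity check: the mean linkage-group length implied by the whole map.
total_map_cm = 1305.0
n_linkage_groups = 28
mean_group_cm = total_map_cm / n_linkage_groups
print(f"mean linkage-group length = {mean_group_cm:.1f} cM")
```

The two ratios come out near 0.48 and 0.11, matching "one-half" and "one-ninth," and the mean linkage-group length of about 46.6 cM falls within the reported 27 to 64 cM range.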
The SNP markers of this map were based on BAC end sequences and thus are directly linked to BAC clones. Consequently, integrated analyses among the genetic linkage maps, physical maps based on BAC contigs, the EST database, and whole-genome sequence contigs will become possible through the BAC end-sequence data. As a first step toward this synthesis, we mapped ESTs onto our linkage map using BAC HDR filters and thereby associated 107 EST markers with 89 of our 534 mapped BACs containing SNPs. When a SNP linkage map with a higher density of BAC clone markers is realized in the near future, much more effective integrated analyses will become possible for the investigation of silkworm genome properties. Furthermore, the SNP map will provide a reference to enable integration of previously reported silkworm linkage maps with the physical map, since many of those linkage maps also used the same pair of silkworm strains, p50T and C108T. Finally, SNP linkage analysis will be a powerful tool for gene isolation by map-based cloning methods.
Acknowledgments
We thank Toru Shimada for providing silkworm strains; Reiko Komatsuzaki, Yoko Fukusaki, Satsuki Tokoro, and Keiko Shiiba for technical assistance; and Shun-ichi Sasanuma for sequencing. We are also grateful to Hiroshi Minami and Michihiko Shimomura for data analysis and to two anonymous reviewers and the editor for helpful comments. This work was supported by funds from the Ministry of Agriculture, Forestry and Fisheries of Japan, and the Bio-oriented Technology Research Advancement Institution.
References
- Altschul, S. F., W. Gish, W. Miller, E. W. Myers and D. J. Lipman, 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403–410.
- Banno, Y., H. Fujii, Y. Kawaguchi, K. Yamamoto, K. Nishikawa et al., 2005. A guide to the silkworm mutants: 2005 gene name and gene symbol. Kyusyu University, Fukuoka, Japan, p. 29.
- Cheng, T. C., Q. Y. Xia, J. F. Qian, C. Liu, Y. Lin et al., 2004. Mining single nucleotide polymorphisms from EST data of silkworm, Bombyx mori, inbred strain Dazao. Insect Biochem. Mol. Biol. 34: 523–530.
- Ewing, B., and P. Green, 1998. Basecalling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8: 186–194.
- Ewing, B., L. Hillier, M. Wendl and P. Green, 1998. Basecalling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8: 175–185.
- Fujii, H., Y. Banno, H. Doira, H. Kihara and Y. Kawaguchi, 1998. Genetical stocks and mutations of Bombyx mori: important genetic resources. Kyusyu University, Fukuoka, Japan, p. 54.
- Goldsmith, M. R., T. Shimada and H. Abe, 2005. The genetics and genomics of the silkworm, Bombyx mori. Annu. Rev. Entomol. 50: 71–100.
- Hayashi, K., N. Hashimoto, M. Daigen and I. Ashikawa, 2004. Development of PCR-based SNP markers for rice blast resistance genes at the Piz locus. Theor. Appl. Genet. 108: 1212–1220.
- Ichikawa, N., 1943. Genetical and embryological studies of a dominant mutant, “new additional crescent,” of the silkworm, Bombyx mori L. Jpn. J. Genet. 19: 182–188.
- Jakubowski, J., and K. Kornfeld, 1999. A local, high density, single-nucleotide polymorphism map, used to clone Caenorhabditis elegans cdf-1. Genetics 153: 743–752.
- Kadono-Okuda, K., E. Kosegawa, K. Mase and W. Hara, 2002. Linkage analysis of maternal EST cDNA clones covering all twenty-eight chromosomes in the silkworm, Bombyx mori. Insect Mol. Biol. 11: 443–451.
- Koike, Y., K. Mita, M. G. Suzuki, S. Maeda, H. Abe et al., 2003. Genomic sequence of a 320-kb segment of the Z chromosome of Bombyx mori containing a kettin ortholog. Mol. Genet. Genomics 269: 137–149.
- Kosambi, D. D., 1944. The estimation of map distances from recombination values. Ann. Eugen. 12: 172–175.
- Lander, E. S., P. Green, J. Abrahamson, A. Barlow, M. J. Daly et al., 1987. MAPMAKER: an interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1: 174–181.
- Miao, X.-X., S.-J. Xu, M.-H. Li, M.-W. Li, J.-H. Huang et al., 2005. Simple sequence repeat-based consensus linkage map of Bombyx mori. Proc. Natl. Acad. Sci. USA 102: 16303–16308.
- Mita, K., M. Morimyo, K. Okano, Y. Koike, J. Nohata et al., 2003. The construction of an EST database for Bombyx mori and its application. Proc. Natl. Acad. Sci. USA 100: 14121–14126.
- Mita, K., M. Kasahara, S. Sasaki, Y. Nagayasu, T. Yamada et al., 2004. The genome sequence of silkworm, Bombyx mori. DNA Res. 11: 27–35.
- Nagaraja, G. M., G. Mahesh, V. Satish, M. Madhu, M. Muthulakshmi et al., 2005. Genetic mapping of Z chromosome and identification of W chromosome-specific markers in the silkworm, Bombyx mori. Heredity 95: 148–157.
- Nasu, S., J. Suzuki, R. Ohta, K. Hasegawa, R. Yui et al., 2002. Search for and analysis of single nucleotide polymorphisms (SNPs) in rice (Oryza sativa, Oryza rufipogon) and establishment of SNP markers. DNA Res. 9: 163–171.
- Nguu, E. K., K. Kadono-Okuda, K. Mase, E. Kosegawa and W. Hara, 2005. Molecular linkage map for the silkworm, Bombyx mori, based on restriction fragment length polymorphism of cDNA clones. J. Insect Biotechnol. Sericology 74: 5–13.
- Nickerson, D. A., V. O. Tobe and S. L. Taylor, 1997. PolyPhred: automating the detection and genotyping of single nucleotide substitution using fluorescence-based resequencing. Nucleic Acids Res. 25: 2745–2751.
- Prasad, M. D., M. Muthulakshmi, M. Madhu, S. Archak, K. Mita et al., 2005. Survey and analysis of microsatellites in the silkworm, Bombyx mori: frequency, distribution, mutations, marker potential and their conservation in heterologous species. Genetics 169: 197–214.
- Promboon, A., T. Shimada, H. Fujisawa and M. Kobayashi, 1995. Linkage map of random amplified polymorphic DNAs (RAPDs) in the silkworm, Bombyx mori. Genet. Res. 66: 1–7.
- Rozen, S., and H. Skaletsky, 2000. Primer3 on the WWW for general users and for biologist programmers. Methods Mol. Biol. 132: 365–386.
- Shi, J., D. G. Heckel and M. R. Goldsmith, 1995. A genetic linkage map for the domesticated silkworm, Bombyx mori, based on restriction fragment length polymorphisms. Genet. Res. 66: 109–126.
- Sturtevant, A. H., 1915. No crossing over in the female of the silkworm moth. Am. Nat. 49: 42–44.
- Tamura, T., C. Thibert, C. Royer, T. Kanda, E. Abraham et al., 2000. Germline transformation of the silkworm, Bombyx mori L. using a piggyBac transposon-derived vector. Nat. Biotechnol. 18: 81–84.
- Tan, Y. D., C. Wan, Y. Zhu, C. Lu, Z. Xiang et al., 2001. An amplified fragment length polymorphism map of the silkworm. Genetics 157: 1277–1284.
- Tanaka, Y., 1913. A study of Mendelian factors in the silkworm, Bombyx mori. J. Coll. Agric. Tohoku Imp. Univ. 5(4): 91–113.
- Tomita, M., H. Munetsuna, T. Sato, T. Adachi, R. Hino et al., 2003. Transgenic silkworms produce recombinant human type III procollagen in cocoons. Nat. Biotechnol. 21: 52–56.
- Toyama, K., 1912. Maternal inheritance and Mendelism. J. Genet. 2: 352–405.
- Tsujita, M., 1953. Studies on the semi-allelic E-series in the silkworm. Annu. Rep. Natl. Inst. Genet. Misima 3: 20–24.
- Wang, J., Q. Xia, X. He, M. Dai, J. Ruan et al., 2005. SilkDB: a knowledge base for silkworm biology and genomics. Nucleic Acids Res. 33: D399–D402.
- Xia, Q., Z. Zhou, C. Lu, D. Cheng, F. Dai et al., 2004. A draft sequence for the genome of the domesticated silkworm (Bombyx mori). Science 306: 937–940.
- Yamao, M., N. Katayama, H. Nakazawa, M. Yamakawa, Y. Hayashi et al., 1999. Gene targeting in the silkworm by use of a baculovirus. Genes Dev. 13: 511–516.
- Yasukochi, Y., 1998. A dense genetic map of the silkworm, Bombyx mori, covering all chromosomes based on 1018 molecular markers. Genetics 150: 1513–1525.
- Yasukochi, Y., Y. Banno, K. Yamamoto, M. R. Goldsmith and H. Fujii, 2005. Integration of molecular and classical linkage groups of the silkworm, Bombyx mori (n = 28). Genome 48: 626–629.
- Yoshido, A., H. Bando, Y. Yasukochi and K. Sahara, 2005. The Bombyx mori karyotype and the assignment of linkage groups. Genetics 170: 675–685.