Application of target capture sequencing of exons and conserved non-coding sequences to 20 inbred rat strains

Minako Yoshihara; Tetsuya Sato; Daisuke Saito; Osamu Ohara; Takashi Kuramoto; Mikita Suyama

doi:10.1016/j.gdata.2016.11.010

2016 Nov 14;10:155–157. doi: 10.1016/j.gdata.2016.11.010


PMCID: PMC5114524 PMID: 27882299

Abstract

We report sequence data obtained by applying our recently devised target capture method, TargetEC, to 20 inbred rat strains. The method targets not only all annotated exons but also highly conserved non-coding sequences shared among vertebrates. The target regions cover a total of 146.8 Mb. On average, we obtained a target coverage depth of 31.7× and identified 154,330 SNVs and 24,368 INDELs per strain. This corresponds to 470,037 unique SNVs and 68,652 unique INDELs among the 20 strains. The sequence data can be accessed at DDBJ/EMBL/GenBank under accession number PRJDB4648, and the identified variants have been deposited at http://bioinfo.sls.kyushu-u.ac.jp/rat_target_capture/20_strains.vcf.gz.

Specifications
Organism/cell line/tissue	Rattus norvegicus (BDIX/NemOda, BDIX.Cg-Tal/NemOda, BN/SsNSlc, BUF/MNa, DOB/Oda, F344/DuCrlCrlj, F344/Jcl, F344/NSlc, F344/Stm, HTX/Kyo, HWY/Slc, IS/Kyo, IS-Tlk/Kyo, KFRS3B/Kyo, LE/Stm, LEC/Tj, NIG-III/Hok, RCS/Kyo, ZF, ZFDM)
Sex	Female and male, see Table 1
Sequencer or array type	Illumina NextSeq 500
Data format	FASTQ and VCF
Experimental factors	Genomic DNA extracted from spleen
Experimental features	Target capture sequencing of exons and conserved non-coding sequences
Consent	Not applicable
Sample source location	Rat strains were provided by the National BioResource Project (NBRP)–Rat (http://www.anim.med.kyoto-u.ac.jp/nbr/).


1. Direct link to deposited data

http://www.ncbi.nlm.nih.gov/bioproject/?term=PRJDB4648

http://bioinfo.sls.kyushu-u.ac.jp/rat_target_capture/20_strains.vcf.zip

2. Experimental design, materials and methods

Rats are used as animal models of many human diseases, such as cancer and hypertension. Because of the rat's significance in biomedical research, the genome sequence of the Brown Norway strain was determined as the third complete mammalian genome [1]. The National BioResource Project–Rat (NBRP-Rat) at Kyoto University is one of the largest repositories of rat strains; more than 700 strains have currently been collected and preserved as live animals, embryos, or sperm [2]. Determining the genome sequences of these strains is important not only for understanding the genetic causes of various phenotypes but also for increasing the value of the strains as biological resources.

Whole exome sequencing is an efficient approach for characterizing only the exonic portions of a genome, which typically comprise 1%–2% of a complete mammalian genome, and it has been used successfully to identify relevant genes and their causative mutations in many human diseases. Although exome capture kits exist for some non-human species, no such capture probe set was previously available for rats. We therefore established a target capture kit designed specifically for the rat, using the SeqCap EZ Developer Library (Roche NimbleGen, Madison, WI, USA; design name 140929_RN5_MS_EZ_HX1). In designing the probe set, we included highly conserved non-coding sequences (CNSs) as target regions in addition to all annotated exons, covering a total of 146.8 Mb of the genome [3]. By applying this method, TargetEC (target capture for exons and conserved non-coding sequences), to four rat strains (WTC/Kyo, WTC-swh/Kyo, PVG/Seac, and KFRS4/Kyo), we confirmed that TargetEC performs efficiently in identifying causative mutations, including those in non-coding regions [3].

In this study, we applied TargetEC to 20 additional inbred strains preserved at NBRP-Rat to identify further variants across multiple rat strains. These 20 strains were selected from the following three categories: disease models derived from selective breeding (BDIX/NemOda, BDIX.Cg-Tal/NemOda, BUF/MNa, HTX/Kyo, HWY/Slc, KFRS3B/Kyo, RCS/Kyo, ZF, and ZFDM), strains originating from wild populations (BN/SsNSlc, DOB/Oda, IS/Kyo, IS-Tlk/Kyo, LE/Stm, LEC/Tj, and NIG-III/Hok), and representative inbred strains (F344/DuCrlCrlj, F344/Jcl, F344/NSlc, and F344/Stm). All animal experimentation protocols were approved by the Institutional Animal Care and Use Committees of Kyoto University and were conducted according to the Regulation on Animal Experimentation at Kyoto University.

Genomic DNA was extracted from spleen samples using standard protocols. Target capture was performed using the standard SeqCap EZ System protocol (Roche NimbleGen). DNA sequencing libraries were prepared using the KAPA HyperPlus Library Preparation Kit (KAPA Biosystems, London, UK) according to the manufacturer's protocol. Sequencing was performed on an Illumina NextSeq 500 platform (Illumina, San Diego, CA, USA) using the High Output Kit (2 × 150 cycles). We obtained 61–82 million reads for each strain (Table 1). Sequence reads were mapped to the rat genome assembly rn5 (RGSC 5.0, March 2012) using BWA (v0.7.4) [4] with the default parameters. SAMtools (v0.1.12a) [5], Picard tools (v1.87) (http://broadinstitute.github.io/picard/), and the Genome Analysis Toolkit (GATK; v2.5.2) [6] were used for post-processing of the mapped reads. Variants were called with the UnifiedGenotyper utility in GATK. On average, we identified 154,330 SNVs and 24,368 INDELs per strain in the target regions (Table 1). The numbers of unique SNVs and INDELs among the 20 strains were 470,037 and 68,652, respectively. The sequence data and variants identified for these strains represent valuable resources for further genetic studies in the rat.
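The per-strain and unique variant counts described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes a multi-sample VCF whose FORMAT column carries `GT` and `DP`, it treats a variant as an SNV when both REF and ALT are a single base, and it applies the depth ≥ 5 threshold per sample. The records in `EXAMPLE_VCF`, the strain names `StrainA` and `StrainB`, and the function names are hypothetical.

```python
import io

# Tiny made-up multi-sample VCF (hypothetical records, not from the deposited file).
EXAMPLE_VCF = """\
##fileformat=VCFv4.1
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tStrainA\tStrainB
chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t1/1:12\t0/0:9
chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t1/1:3\t1/1:8
chr2\t300\t.\tGA\tG\t50\tPASS\t.\tGT:DP\t0/0:20\t1/1:6
chr2\t400\t.\tT\tTCC\t50\tPASS\t.\tGT:DP\t./.:.\t0/1:15
"""


def classify(ref, alt):
    """Assumed rule: single-base REF and ALT is an SNV, anything else an INDEL."""
    return "SNV" if len(ref) == 1 and len(alt) == 1 else "INDEL"


def count_variants(handle, min_depth=5):
    """Return (per-strain counts, number of unique variants across strains)."""
    samples = []
    per_strain = {}
    unique = {"SNV": set(), "INDEL": set()}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            per_strain = {s: {"SNV": 0, "INDEL": 0} for s in samples}
            continue
        chrom, pos, _id, ref, alts = fields[:5]
        alt_list = alts.split(",")
        keys = fields[8].split(":")
        for sample, value in zip(samples, fields[9:]):
            call = dict(zip(keys, value.split(":")))
            depth = call.get("DP", ".")
            if not depth.isdigit() or int(depth) < min_depth:
                continue  # missing or too-shallow call for this strain
            alleles = call.get("GT", ".").replace("|", "/").split("/")
            alt_indexes = {int(a) for a in alleles if a.isdigit() and a != "0"}
            kinds = set()
            for i in alt_indexes:
                kind = classify(ref, alt_list[i - 1])
                unique[kind].add((chrom, pos, ref, alt_list[i - 1]))
                kinds.add(kind)
            for kind in kinds:
                per_strain[sample][kind] += 1  # count each site once per strain
    return per_strain, {kind: len(sites) for kind, sites in unique.items()}


per_strain, unique = count_variants(io.StringIO(EXAMPLE_VCF))
print(per_strain)  # {'StrainA': {'SNV': 1, 'INDEL': 0}, 'StrainB': {'SNV': 1, 'INDEL': 2}}
print(unique)      # {'SNV': 2, 'INDEL': 2}
```

In this toy input, the second record is not counted for StrainA because its depth is 3, which shows how a per-sample depth threshold can make per-strain counts differ even when the genotype is the same. To run it on a real file, one would pass an open text handle (for example from `gzip.open(path, "rt")`) instead of the in-memory string.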

Table 1.

Summary statistics for sequencing and variant calling.

Strain	Sex	Total reads	Read length	Mapped reads after post-processing (%)	Average target depth	SNV (depth ≥ 5 ×)	INDEL (depth ≥ 5 ×)
BDIX.Cg-Tal/NemOda	Unknown	77,031,192	151	62,133,380 (80.7)	33.0	161,043	25,729
BDIX/NemOda	Female	62,884,340	151	50,668,261 (80.6)	26.2	155,727	24,561
BN/SsNSlc	Male	67,363,478	151	54,385,129 (80.7)	29.4	23,060	5,533
BUF/MNa	Male	60,898,020	151	49,603,905 (81.5)	27.2	154,382	24,122
DOB/Oda	Male	68,359,820	151	61,010,641 (89.2)	31.7	196,751	30,148
F344/DuCrlCrlj	Male	73,516,660	151	59,541,186 (81.0)	27.7	152,184	23,890
F344/Jcl	Male	62,994,072	151	50,991,611 (80.9)	26.6	152,141	23,855
F344/NSlc	Male	62,838,170	151	50,726,936 (80.7)	27.5	152,546	23,930
F344/Stm	Male	64,788,908	151	52,984,127 (81.8)	29.1	151,919	23,735
HTX/Kyo	Male	72,484,640	151	64,572,821 (89.1)	33.7	154,418	24,156
HWY/Slc	Male	74,687,034	151	66,579,903 (89.1)	34.6	157,070	24,873
IS/Kyo	Male	79,430,344	151	70,744,396 (89.1)	37.4	187,300	29,120
IS-Tlk/Kyo	Male	75,990,092	151	67,761,875 (89.2)	35.8	186,648	28,902
KFRS3B/Kyo	Female	81,643,134	151	72,603,786 (88.9)	35.1	154,292	24,419
LE/Stm	Male	72,300,094	151	58,438,239 (80.8)	31.7	157,488	25,052
LEC/Tj	Unknown	78,990,272	151	70,539,682 (89.3)	37.3	167,547	26,315
NIG-III/Hok	Unknown	78,128,624	151	69,625,354 (89.1)	36.9	164,732	26,195
RCS/Kyo	Male	71,627,648	151	57,894,324 (80.8)	31.6	155,472	24,975
ZF	Male	69,986,466	151	56,655,891 (81.0)	30.2	150,778	23,815
ZFDM	Male	73,535,060	151	59,407,086 (80.8)	31.9	151,101	24,025
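
The averages quoted in the text can be checked against Table 1 with a short Python sketch. The values below are transcribed from the table; the variable and function names are only illustrative.

```python
# (strain, average target depth, SNVs, INDELs), transcribed from Table 1.
TABLE1 = [
    ("BDIX.Cg-Tal/NemOda", 33.0, 161043, 25729),
    ("BDIX/NemOda", 26.2, 155727, 24561),
    ("BN/SsNSlc", 29.4, 23060, 5533),
    ("BUF/MNa", 27.2, 154382, 24122),
    ("DOB/Oda", 31.7, 196751, 30148),
    ("F344/DuCrlCrlj", 27.7, 152184, 23890),
    ("F344/Jcl", 26.6, 152141, 23855),
    ("F344/NSlc", 27.5, 152546, 23930),
    ("F344/Stm", 29.1, 151919, 23735),
    ("HTX/Kyo", 33.7, 154418, 24156),
    ("HWY/Slc", 34.6, 157070, 24873),
    ("IS/Kyo", 37.4, 187300, 29120),
    ("IS-Tlk/Kyo", 35.8, 186648, 28902),
    ("KFRS3B/Kyo", 35.1, 154292, 24419),
    ("LE/Stm", 31.7, 157488, 25052),
    ("LEC/Tj", 37.3, 167547, 26315),
    ("NIG-III/Hok", 36.9, 164732, 26195),
    ("RCS/Kyo", 31.6, 155472, 24975),
    ("ZF", 30.2, 150778, 23815),
    ("ZFDM", 31.9, 151101, 24025),
]


def rounded_mean(values):
    """Integer mean, rounding halves up, using integer arithmetic only."""
    n = len(values)
    return (sum(values) + n // 2) // n


mean_depth = round(sum(row[1] for row in TABLE1) / len(TABLE1), 1)
mean_snv = rounded_mean([row[2] for row in TABLE1])
mean_indel = rounded_mean([row[3] for row in TABLE1])
fewest_snv = min(TABLE1, key=lambda row: row[2])

print(mean_depth, mean_snv, mean_indel)  # 31.7 154330 24368
print(fewest_snv[0])                     # BN/SsNSlc
```

The three means reproduce the figures reported in the abstract and the methods. BN/SsNSlc has by far the fewest variants, which is consistent with the rn5 reference assembly being derived from a Brown Norway rat.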


Conflict of interest

The authors declare no conflicts of interest.

Acknowledgements

We thank the National BioResource Project–Rat (http://www.anim.med.kyoto-u.ac.jp/nbr/) for providing rat strains. This work was supported in part by the Cooperative Research Project Program of the Medical Institute of Bioregulation, Kyushu University, to OO, and the Genome Information Upgrading Program of the National BioResource Project, Japan Agency for Medical Research and Development, to OO, TK, and MS.

Contributor Information

Takashi Kuramoto, Email: tkuramot@anim.med.kyoto-u.ac.jp.

Mikita Suyama, Email: mikita@bioreg.kyushu-u.ac.jp.

References

1. Gibbs R.A., Weinstock G.M., Metzker M.L., et al.; Rat Genome Sequencing Project Consortium. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature. 2004;428:493–521. doi: 10.1038/nature02426.
2. Serikawa T., Mashimo T., Takizawa A., Okajima R., Maedomari N., Kumafuji K., Tagami F., Neoda Y., Otsuki M., Nakanishi S., Yamasaki K., Voigt B., Kuramoto T. National BioResource Project-Rat and related activities. Exp. Anim. Jpn. Assoc. Lab. Anim. Sci. 2009;58:333–341. doi: 10.1538/expanim.58.333.
3. Yoshihara M., Saito D., Sato T., Ohara O., Kuramoto T., Suyama M. Design and application of a target capture sequencing of exons and conserved non-coding sequences for the rat. BMC Genomics. 2016;17:593. doi: 10.1186/s12864-016-2975-9.
4. Li H., Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
5. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R.; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
6. McKenna A., Hanna M., Banks E., Sivachenko A., Cibulskis K., Kernytsky A., Garimella K., Altshuler D., Gabriel S., Daly M., DePristo M.A. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–1303. doi: 10.1101/gr.107524.110.
