Microbiome analyses of pacific white shrimp (Litopenaeus vannamei) collected from disparate geographical locations

Arun S Seetharam; Emily Kawaler; Zhi-Qiang Du; Max F Rothschild; Andrew J Severin

Genomics Data. 2015 Aug 12;6:67–69. doi: 10.1016/j.gdata.2015.08.009

PMCID: PMC4664682 PMID: 26697337

Abstract

In this study, the tail muscle microbiota of Pacific white shrimp (Litopenaeus vannamei) sourced from five countries across Central and South America and Southeast Asia were determined and compared. Genomic DNA was sequenced at approximately 10× coverage for each geographical location and assembled de novo for comparative analysis. The assembled sequences for all locations were classified by their similarity to sequences in the public database. We found a high correlation among the microbiota of shrimp from disparate regions, as well as DNA from some bacteria known to cause food poisoning in humans. Sequencing data have been deposited in the NCBI SRA database under BioProject ID PRJNA282154.

Keywords: Shrimp, Litopenaeus vannamei, Microbiome, Next-generation sequencing, Geographical diversity

Specifications
Organism/cell line/tissue	Litopenaeus vannamei, muscle tissue, genomic DNA
Sex	N/A
Sequencer or array type	Illumina HiSeq 2000
Data format	Raw
Experimental factors	Frozen packaged shrimp imported from Indonesia, Vietnam, Thailand, Venezuela, and Honduras were acquired from US supermarkets. From each package, 6–10 shrimp were used for genomic DNA isolation and subsequent sequencing.
Experimental features	Reads from all isolates were pooled and assembled into contigs. The presence or absence of each contig in each isolate was then determined, followed by diversity analyses to characterize the microbiota.
Consent	N/A
Sample source location	N/A

1. Direct link to deposited data

http://www.ncbi.nlm.nih.gov/bioproject/PRJNA282154

2. Experimental design, materials and methods

2.1. Sample preparation

Frozen L. vannamei samples were purchased at a local grocery store. Each bag had been packaged in and imported from a different country: Indonesia, Vietnam, Thailand, Venezuela, or Honduras. Genomic DNA was isolated from 6 to 10 shrimp per location, and each isolate was sequenced separately. Tissue was sampled by peeling the shell and dissecting the tail muscle from the shrimp. The DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany) was used for genomic DNA isolation. Single-end libraries were constructed using the Illumina TruSeq DNA Sample Preparation Kit (Illumina, Inc., San Diego, CA, USA), following the manufacturer's instructions. Each library was ligated to a different index-tag adapter, and sequencing was performed using the TruSeq Unique and Universal Adaptors on a HiSeq 2000 sequencer, which produced 100 bp single-end reads [1].

2.2. Bioinformatics analyses

Sequence quality was assessed using FastQC (v 0.10.1) [2]. Before assembly, digital normalization was performed on the raw reads using Khmer (v 1.01) [3]. Khmer was run with the default settings except for the cutoff value (c), which was set to 10. About 50–70% of the reads in each sample were retained after normalization (see Table 1); these retained reads were pooled and used for the assembly. The Ray assembler (v 2.3.1) [4] was used to generate the initial assembly using the default k-mer size (33). The resulting scaffolds were classified as prokaryotic or non-prokaryotic based on NCBI-BLAST (v 2.2.30 +) [5] searches against the NR database (see Table 2). All scaffolds of prokaryotic origin were used for the diversity analyses. Bowtie2 (v 2.2.0) [6] was used to map the reads back to the indexed draft assembly and to classify each scaffold as present or absent in each isolate (Table 3).
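To make the normalization step concrete, the following is a minimal Python sketch of the idea behind digital normalization: a read is kept only while the median abundance of its k-mers, counted over the reads kept so far, is below the cutoff. This is an illustration under stated assumptions, not Khmer's implementation. Khmer uses a memory-efficient probabilistic counter, whereas this toy uses an exact in-memory counter; the function names and the k-mer size used here are hypothetical, and only the cutoff of 10 comes from the text.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Yield every substring of length k (k-mer) in a read."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def digital_normalize(reads, k=20, cutoff=10):
    """Toy digital normalization (illustrative only, not Khmer's code).

    A read is kept if the median count of its k-mers, among the reads
    kept so far, is below `cutoff`. k=20 is an arbitrary choice for
    this sketch; cutoff=10 mirrors the value reported in the text.
    """
    counts = Counter()  # exact k-mer counts over the kept reads
    kept = []
    for read in reads:
        read_kmers = list(kmers(read, k))
        if not read_kmers:
            continue  # read is shorter than k, nothing to count
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)
    return kept


# Usage: 25 copies of one read plus a single read with unrelated k-mers.
common = "ACGTTGCAAG"
rare = "GGGGCCCCTTTTAAAA"
kept = digital_normalize([common] * 25 + [rare], k=4, cutoff=10)
print(len(kept))
```

In this usage example the redundant copies of the common read stop being added once their k-mers reach the cutoff, while the rare read is still retained, which is the behavior that reduces the data volume before assembly.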

Table 1.

Total number of reads used for the assembly. The raw number indicates the actual number of reads obtained from the sequencing machine, whereas normalized read count indicates the number of reads retained after performing digital normalization. Only normalized reads were used for the assembly.

Origin	File name	Raw reads	Normalized reads	Retained (%)
Honduras	H12_TAGGCATG_L006_R1_001.fastq	2,330,575	1,381,635	59.28%
Honduras	H1_TAAGGCGA_L006_R1_001.fastq	14,712,575	8,115,248	55.16%
Honduras	H3_CGTACTAG_L006_R1_001.fastq	1,453,462	859,454	59.13%
Honduras	H5_AGGCAGAA_L006_R1_001.fastq	2,221,334	1,367,269	61.55%
Honduras	H7_TCCTGAGC_L006_R1_001.fastq	5,865,487	3,773,662	64.34%
Honduras	H8_GGACTCCT_L006_R1_001.fastq	6,583,439	3,665,752	55.68%
Indonesia	IO2_AGGCAGAA_L007_R1_001.fastq	9,177,514	5,357,801	58.38%
Indonesia	IO3_TCCTGAGC_L007_R1_001.fastq	3,197,185	2,168,825	67.84%
Indonesia	IO4_GGACTCCT_L007_R1_001.fastq	14,167,640	8,210,733	57.95%
Indonesia	IO5_TAGGCATG_L007_R1_001.fastq	11,396,545	6,795,882	59.63%
Indonesia	IO6_CTCTCTAC_L007_R1_001.fastq	10,435,103	6,227,204	59.68%
Indonesia	IO7_CAGAGAGG_L007_R1_001.fastq	7,342,116	4,524,518	61.62%
Indonesia	IO8_GCTACGCT_L007_R1_001.fastq	4,276,929	2,587,323	60.49%
Indonesia	IO9_CGAGGCTG_L007_R1_001.fastq	4,233,317	2,611,674	61.69%
Thailand	T10_CTCTCTAC_L008_R1_001.fastq	11,028,257	6,402,355	58.05%
Thailand	T12_CAGAGAGG_L008_R1_001.fastq	13,584,934	8,121,686	59.78%
Thailand	T1_TAAGGCGA_L008_R1_001.fastq	15,171,966	8,789,103	57.93%
Thailand	T3_CGTACTAG_L008_R1_001.fastq	2,504,357	1,720,428	68.70%
Thailand	T4_AGGCAGAA_L008_R1_001.fastq	7,341,971	4,321,487	58.86%
Thailand	T5_TCCTGAGC_L008_R1_001.fastq	5,802,738	3,669,116	63.23%
Thailand	T7_GGACTCCT_L008_R1_001.fastq	5,869,011	3,614,750	61.59%
Thailand	T9_TAGGCATG_L008_R1_001.fastq	23,687,158	13,906,904	58.71%
Venezuela	V10_CAGAGAGG_L005_R1_001.fastq	6,584,510	4,175,631	63.42%
Venezuela	V11_GCTACGCT_L005_R1_001.fastq	14,586,243	7,767,055	53.25%
Venezuela	V12_CGAGGCTG_L005_R1_001.fastq	15,782,536	8,900,226	56.39%
Venezuela	V1_TAAGGCGA_L005_R1_001.fastq	21,065,170	11,223,993	53.28%
Venezuela	V2_CGTACTAG_L005_R1_001.fastq	11,940,498	7,142,047	59.81%
Venezuela	V3_AGGCAGAA_L005_R1_001.fastq	17,026,545	9,454,844	55.53%
Venezuela	V4_TCCTGAGC_L005_R1_001.fastq	10,803,314	6,567,535	60.79%
Venezuela	V5_GGACTCCT_L005_R1_001.fastq	7,554,075	4,433,658	58.69%
Venezuela	V8_TAGGCATG_L005_R1_001.fastq	10,224,862	6,104,682	59.70%
Venezuela	V9_CTCTCTAC_L005_R1_001.fastq	5,872,950	3,047,919	51.90%
Vietnam	VN12_GTAGAGGA_L008_R1_001.fastq	9,564,954	4,872,018	50.93%
Vietnam	VN1_GCTACGCT_L008_R1_001.fastq	7,092,963	3,912,680	55.16%
Vietnam	VN2_CGAGGCTG_L008_R1_001.fastq	14,190,209	8,738,927	61.58%
Vietnam	VN3_AAGAGGCA_L008_R1_001.fastq	3,430,308	2,069,722	60.34%
Vietnam	VN4_CGAGGCTG_L006_R1_001.fastq	35,679,513	20,000,119	56.05%
Vietnam	VN5_AAGAGGCA_L006_R1_001.fastq	11,274,952	6,484,876	57.52%
Vietnam	VN8_AAGAGGCA_L007_R1_001.fastq	5,376,761	2,709,392	50.39%
Vietnam	VN9_GTAGAGGA_L007_R1_001.fastq	6,887,031	3,555,985	51.63%
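
The retained percentage in Table 1 is the normalized read count divided by the raw read count, times 100. The short Python sketch below recomputes it for three rows of the table as a consistency check; the function name is a hypothetical helper introduced here, not part of any tool used in the study.

```python
def retained_percent(raw_reads, normalized_reads):
    """Percent of raw reads retained after digital normalization."""
    return round(100.0 * normalized_reads / raw_reads, 2)


# (raw, normalized) read counts copied from Table 1.
table1_rows = {
    "H12_TAGGCATG_L006_R1_001.fastq": (2_330_575, 1_381_635),
    "T3_CGTACTAG_L008_R1_001.fastq": (2_504_357, 1_720_428),
    "VN8_AAGAGGCA_L007_R1_001.fastq": (5_376_761, 2_709_392),
}

for name, (raw, normalized) in table1_rows.items():
    print(name, retained_percent(raw, normalized))
```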

Table 2.

Number of scaffolds with matches in the NR database and their classification (prokaryotes/eukaryotes).

Classification	≥ 5000 nt	1000–5000 nt	≥ 1000 nt
Eukaryotes	163 (24.33%)	4661 (5.04%)	4824 (5.14%)
Non-eukaryotes	378 (56.42%)	5534 (5.98%)	5912 (6.30%)
Scaffolds with hits (NR)	541 (80.75%)	10,195 (11.02%)	10,736 (11.43%)
Scaffolds without hits	129 (19.25%)	82,360 (88.98%)	82,489 (87.85%)
Total scaffolds	670	92,555	93,895

Table 3.

Percentage of reads from each isolate that mapped to the draft assembly.

Origin	Isolate	Raw reads	Aligned reads	Percent aligned
Honduras	H12_TAGGCATG_L006_R1_001.fastq	2,330,575	135,668	5.82%
Honduras	H1_TAAGGCGA_L006_R1_001.fastq	14,712,575	816,820	5.55%
Honduras	H3_CGTACTAG_L006_R1_001.fastq	1,453,462	72,543	4.99%
Honduras	H5_AGGCAGAA_L006_R1_001.fastq	2,221,334	125,013	5.63%
Honduras	H7_TCCTGAGC_L006_R1_001.fastq	5,865,487	406,069	6.92%
Honduras	H8_GGACTCCT_L006_R1_001.fastq	6,583,439	352,841	5.36%
Indonesia	IO2_AGGCAGAA_L007_R1_001.fastq	9,177,514	584,184	6.37%
Indonesia	IO3_TCCTGAGC_L007_R1_001.fastq	3,197,185	199,306	6.23%
Indonesia	IO4_GGACTCCT_L007_R1_001.fastq	14,167,640	970,809	6.85%
Indonesia	IO5_TAGGCATG_L007_R1_001.fastq	11,396,545	791,905	6.95%
Indonesia	IO6_CTCTCTAC_L007_R1_001.fastq	10,435,103	717,542	6.88%
Indonesia	IO7_CAGAGAGG_L007_R1_001.fastq	7,342,116	517,977	7.05%
Indonesia	IO8_GCTACGCT_L007_R1_001.fastq	4,276,929	289,595	6.77%
Indonesia	IO9_CGAGGCTG_L007_R1_001.fastq	4,233,317	303,002	7.16%
Thailand	T10_CTCTCTAC_L008_R1_001.fastq	11,028,257	677,696	6.15%
Thailand	T12_CAGAGAGG_L008_R1_001.fastq	13,584,934	915,014	6.74%
Thailand	T1_TAAGGCGA_L008_R1_001.fastq	15,171,966	877,129	5.78%
Thailand	T3_CGTACTAG_L008_R1_001.fastq	2,504,357	177,428	7.08%
Thailand	T4_AGGCAGAA_L008_R1_001.fastq	7,341,971	452,370	6.16%
Thailand	T5_TCCTGAGC_L008_R1_001.fastq	5,802,738	389,395	6.71%
Thailand	T7_GGACTCCT_L008_R1_001.fastq	5,869,011	380,675	6.49%
Thailand	T9_TAGGCATG_L008_R1_001.fastq	23,687,158	1,531,907	6.47%
Venezuela	V10_CAGAGAGG_L005_R1_001.fastq	6,584,510	460,195	6.99%
Venezuela	V11_GCTACGCT_L005_R1_001.fastq	14,586,243	810,484	5.56%
Venezuela	V12_CGAGGCTG_L005_R1_001.fastq	15,782,536	964,607	6.11%
Venezuela	V1_TAAGGCGA_L005_R1_001.fastq	21,065,170	1,127,598	5.35%
Venezuela	V2_CGTACTAG_L005_R1_001.fastq	11,940,498	789,086	6.61%
Venezuela	V3_AGGCAGAA_L005_R1_001.fastq	17,026,545	1,025,844	6.02%
Venezuela	V4_TCCTGAGC_L005_R1_001.fastq	10,803,314	722,363	6.69%
Venezuela	V5_GGACTCCT_L005_R1_001.fastq	7,554,075	452,362	5.99%
Venezuela	V8_TAGGCATG_L005_R1_001.fastq	10,224,862	653,237	6.39%
Venezuela	V9_CTCTCTAC_L005_R1_001.fastq	5,872,950	278,656	4.74%
Vietnam	VN12_GTAGAGGA_L008_R1_001.fastq	9,564,954	177,669	1.86%
Vietnam	VN1_GCTACGCT_L008_R1_001.fastq	7,092,963	149,412	2.11%
Vietnam	VN2_CGAGGCTG_L008_R1_001.fastq	14,190,209	332,937	2.35%
Vietnam	VN3_AAGAGGCA_L008_R1_001.fastq	3,430,308	69,846	2.04%
Vietnam	VN4_CGAGGCTG_L006_R1_001.fastq	35,679,513	896,728	2.51%
Vietnam	VN5_AAGAGGCA_L006_R1_001.fastq	11,274,952	294,998	2.62%
Vietnam	VN8_AAGAGGCA_L007_R1_001.fastq	5,376,761	85,758	1.59%
Vietnam	VN9_GTAGAGGA_L007_R1_001.fastq	6,887,031	132,293	1.92%
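
In Table 3, the percentage is the aligned read count divided by the read total listed beside it, times 100. The Python sketch below reproduces that arithmetic and shows one way per-scaffold mapped read counts could be turned into the present/absent calls mentioned in Section 2.2. The text does not state the rule used for those calls, so the `min_reads` threshold, the function names, and the scaffold identifiers are all hypothetical assumptions.

```python
def mapping_percent(aligned_reads, total_reads):
    """Percent of an isolate's reads that aligned to the assembly."""
    return round(100.0 * aligned_reads / total_reads, 2)


def presence_absence(read_counts, min_reads=1):
    """Build a 0/1 presence matrix from mapped read counts.

    read_counts: {isolate: {scaffold: mapped_read_count}}
    min_reads:   hypothetical threshold; the source does not give one.
    Returns the sorted scaffold list and {isolate: [0/1 per scaffold]}.
    """
    scaffolds = sorted({s for counts in read_counts.values() for s in counts})
    matrix = {
        isolate: [1 if counts.get(s, 0) >= min_reads else 0 for s in scaffolds]
        for isolate, counts in read_counts.items()
    }
    return scaffolds, matrix


# Read totals for H12 are copied from Table 3; scaffold counts are made up.
print(mapping_percent(135_668, 2_330_575))
example_counts = {
    "H12": {"scaffold-1": 12, "scaffold-2": 0},
    "VN8": {"scaffold-2": 3},
}
scaffolds, matrix = presence_absence(example_counts)
print(scaffolds, matrix)
```

A binary matrix of this shape is a common input for diversity comparisons between isolates, although the specific diversity measures used in the study are not described here.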

Conflict of interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This project was supported by funding from the USDA NIFA, the NRSP8 Aquaculture Genome Coordination Program and the College of Agriculture and Life Sciences, the State of Iowa and Hatch funds. The advice and assistance provided by Dr. James Dickson is appreciated.

References

1. Kawaler E., Seetharam A.S., Du Z.-Q., Severin A.J., Rothschild M.F. 2015. A comparison of the microbiomes of Litopenaeus vannamei from disparate geographical regions. (submitted for publication)
2. FastQC: a quality control tool for high throughput sequence data. Available from: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
3. Brown C.T., Howe A., Zhang Q., Pyrkosz A.B., Brom T.H. 2012. A reference-free algorithm for computational normalization of shotgun sequencing data. (eprint arXiv:1203.4802)
4. Boisvert S., Laviolette F., Corbeil J. Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. J. Comput. Biol. 2010;17(11):1519–1533. doi: 10.1089/cmb.2009.0238.
5. Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990;215(3):403–410. doi: 10.1016/S0022-2836(05)80360-2.
6. Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9(4):357–359. doi: 10.1038/nmeth.1923.