Genomics Data. 2015 Aug 12;6:67–69. doi: 10.1016/j.gdata.2015.08.009

Microbiome analyses of pacific white shrimp (Litopenaeus vannamei) collected from disparate geographical locations

Arun S Seetharam, Emily Kawaler, Zhi-Qiang Du, Max F Rothschild, Andrew J Severin
PMCID: PMC4664682  PMID: 26697337

Abstract

In this study, the tail muscle microbiota of Pacific white shrimp (Litopenaeus vannamei) sourced from five countries across Central and South America and Southeast Asia was determined and compared. Genomic DNA was sequenced at around 10× coverage for each geographical location and assembled de novo for comparative analysis. The assembled sequences from all samples were classified based on their similarity to sequences in the public database. We found a high correlation among the microbiota of shrimp from disparate regions, as well as DNA from some bacteria known to cause food poisoning in humans. Sequencing data have been deposited in the NCBI SRA database under BioProject ID PRJNA282154.

Keywords: Shrimp, Litopenaeus vannamei, Microbiome, Next-generation sequencing, Geographical diversity


Specifications
Organism/cell line/tissue: Litopenaeus vannamei, muscle tissue, genomic DNA
Sex: N/A
Sequencer or array type: Illumina HiSeq 2000
Data format: Raw
Experimental factors: Frozen packaged shrimp imported from Indonesia, Vietnam, Thailand, Venezuela, and Honduras were acquired from US supermarkets. From each package, 6–10 shrimp were used for genomic DNA isolation and subsequent sequencing.
Experimental features: Reads from all isolates were pooled and assembled into contigs. The presence or absence of each contig in each isolate was then determined, and diversity analyses were performed to characterize the microbiota.
Consent: N/A
Sample source location: N/A

1. Direct link to deposited data

http://www.ncbi.nlm.nih.gov/bioproject/PRJNA282154

2. Experimental design, materials and methods

2.1. Sample preparation

Frozen L. vannamei samples were purchased at a local grocery store. Each bag had been packaged in and imported from a different country: Indonesia, Vietnam, Thailand, Venezuela, or Honduras. Tissue was sampled by peeling the shell and dissecting the tail muscle from each shrimp. Genomic DNA was isolated from 6 to 10 shrimp per location using the DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany), and each isolate was sequenced separately. Single-end libraries were constructed using the Illumina TruSeq DNA Sample Preparation Kit (Illumina, Inc., San Diego, CA, USA) according to the manufacturer's instructions. Each library was ligated to a different index-tag adapter, and sequencing was performed using the TruSeq Unique and Universal Adaptors on a HiSeq 2000 sequencer, which produced 100 bp single-end reads [1].

2.2. Bioinformatics analyses

Sequence quality was assessed using FastQC (v 0.10.1) [2]. Before assembly, digital normalization was performed on the raw reads using Khmer (v 1.01) [3] with the default settings, except for the cutoff value (c), which was set to 10. About 50–60% of the reads were retained after normalization (50.39–68.70% across isolates; see Table 1); the retained reads from all isolates were pooled and used for the assembly. The Ray assembler (v 2.3.1) [4] was used to generate the initial assembly with the default k-mer size (33). The resulting scaffolds were classified as prokaryotic or non-prokaryotic based on NCBI-BLAST (v 2.2.30+) [5] searches against the NR database (see Table 2). All scaffolds of prokaryotic origin were used for the diversity analyses. Bowtie2 (v 2.2.0) [6] was used to map the reads from each isolate back to the indexed draft assembly, and each scaffold was classified as present or absent in that isolate (Table 3).
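
The idea behind digital normalization can be illustrated with a minimal Python sketch. It is not the Khmer implementation: Khmer uses a memory-efficient probabilistic counting structure, whereas this sketch uses an exact in-memory counter, and the function names and the k-mer size of 5 are assumptions chosen for illustration. A read is kept only if the median count of its k-mers, among the reads kept so far, is below the cutoff.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return all overlapping k-mers of a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def digital_normalization(reads, k=5, cutoff=10):
    """Keep a read only if its median k-mer count so far is below `cutoff`.

    `k=5` is a toy value for illustration, not a Khmer default.
    """
    counts = Counter()
    kept = []
    for read in reads:
        words = kmers(read, k)
        if not words:  # read shorter than k: skipped in this sketch
            continue
        # Counter returns 0 for k-mers that have not been seen yet.
        if median(counts[w] for w in words) < cutoff:
            kept.append(read)
            counts.update(words)  # only kept reads contribute to the counts
    return kept


# Toy input: one read repeated 25 times plus one unrelated read.
toy_reads = ["ACGTTGCAAG"] * 25 + ["TTTTTGGGGG"]
kept_reads = digital_normalization(toy_reads, k=5, cutoff=10)
print(len(toy_reads), "reads in,", len(kept_reads), "reads kept")
```

With a cutoff of 10, redundant copies of a highly covered read are discarded once its k-mers have been seen 10 times, while reads from low-coverage regions are all retained.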

Table 1.

Total number of reads used for the assembly. The raw number indicates the actual number of reads obtained from the sequencing machine, whereas normalized read count indicates the number of reads retained after performing digital normalization. Only normalized reads were used for the assembly.

Origin     File name                        Raw         Normalized  Retained (%)
Honduras   H12_TAGGCATG_L006_R1_001.fastq   2,330,575   1,381,635   59.28%
Honduras   H1_TAAGGCGA_L006_R1_001.fastq    14,712,575  8,115,248   55.16%
Honduras   H3_CGTACTAG_L006_R1_001.fastq    1,453,462   859,454     59.13%
Honduras   H5_AGGCAGAA_L006_R1_001.fastq    2,221,334   1,367,269   61.55%
Honduras   H7_TCCTGAGC_L006_R1_001.fastq    5,865,487   3,773,662   64.34%
Honduras   H8_GGACTCCT_L006_R1_001.fastq    6,583,439   3,665,752   55.68%
Indonesia  IO2_AGGCAGAA_L007_R1_001.fastq   9,177,514   5,357,801   58.38%
Indonesia  IO3_TCCTGAGC_L007_R1_001.fastq   3,197,185   2,168,825   67.84%
Indonesia  IO4_GGACTCCT_L007_R1_001.fastq   14,167,640  8,210,733   57.95%
Indonesia  IO5_TAGGCATG_L007_R1_001.fastq   11,396,545  6,795,882   59.63%
Indonesia  IO6_CTCTCTAC_L007_R1_001.fastq   10,435,103  6,227,204   59.68%
Indonesia  IO7_CAGAGAGG_L007_R1_001.fastq   7,342,116   4,524,518   61.62%
Indonesia  IO8_GCTACGCT_L007_R1_001.fastq   4,276,929   2,587,323   60.49%
Indonesia  IO9_CGAGGCTG_L007_R1_001.fastq   4,233,317   2,611,674   61.69%
Thailand   T10_CTCTCTAC_L008_R1_001.fastq   11,028,257  6,402,355   58.05%
Thailand   T12_CAGAGAGG_L008_R1_001.fastq   13,584,934  8,121,686   59.78%
Thailand   T1_TAAGGCGA_L008_R1_001.fastq    15,171,966  8,789,103   57.93%
Thailand   T3_CGTACTAG_L008_R1_001.fastq    2,504,357   1,720,428   68.70%
Thailand   T4_AGGCAGAA_L008_R1_001.fastq    7,341,971   4,321,487   58.86%
Thailand   T5_TCCTGAGC_L008_R1_001.fastq    5,802,738   3,669,116   63.23%
Thailand   T7_GGACTCCT_L008_R1_001.fastq    5,869,011   3,614,750   61.59%
Thailand   T9_TAGGCATG_L008_R1_001.fastq    23,687,158  13,906,904  58.71%
Venezuela  V10_CAGAGAGG_L005_R1_001.fastq   6,584,510   4,175,631   63.42%
Venezuela  V11_GCTACGCT_L005_R1_001.fastq   14,586,243  7,767,055   53.25%
Venezuela  V12_CGAGGCTG_L005_R1_001.fastq   15,782,536  8,900,226   56.39%
Venezuela  V1_TAAGGCGA_L005_R1_001.fastq    21,065,170  11,223,993  53.28%
Venezuela  V2_CGTACTAG_L005_R1_001.fastq    11,940,498  7,142,047   59.81%
Venezuela  V3_AGGCAGAA_L005_R1_001.fastq    17,026,545  9,454,844   55.53%
Venezuela  V4_TCCTGAGC_L005_R1_001.fastq    10,803,314  6,567,535   60.79%
Venezuela  V5_GGACTCCT_L005_R1_001.fastq    7,554,075   4,433,658   58.69%
Venezuela  V8_TAGGCATG_L005_R1_001.fastq    10,224,862  6,104,682   59.70%
Venezuela  V9_CTCTCTAC_L005_R1_001.fastq    5,872,950   3,047,919   51.90%
Vietnam    VN12_GTAGAGGA_L008_R1_001.fastq  9,564,954   4,872,018   50.93%
Vietnam    VN1_GCTACGCT_L008_R1_001.fastq   7,092,963   3,912,680   55.16%
Vietnam    VN2_CGAGGCTG_L008_R1_001.fastq   14,190,209  8,738,927   61.58%
Vietnam    VN3_AAGAGGCA_L008_R1_001.fastq   3,430,308   2,069,722   60.34%
Vietnam    VN4_CGAGGCTG_L006_R1_001.fastq   35,679,513  20,000,119  56.05%
Vietnam    VN5_AAGAGGCA_L006_R1_001.fastq   11,274,952  6,484,876   57.52%
Vietnam    VN8_AAGAGGCA_L007_R1_001.fastq   5,376,761   2,709,392   50.39%
Vietnam    VN9_GTAGAGGA_L007_R1_001.fastq   6,887,031   3,555,985   51.63%
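
The "Retained (%)" column of Table 1 is the normalized read count divided by the raw read count. A short Python sketch of that arithmetic follows, using three rows copied from the table; the function and variable names are illustrative only.

```python
def retained_percent(raw_reads, normalized_reads):
    """Percentage of raw reads kept after digital normalization."""
    return round(100.0 * normalized_reads / raw_reads, 2)


# (raw, normalized) read counts copied from three rows of Table 1.
table1_rows = {
    "H12_TAGGCATG_L006_R1_001.fastq": (2330575, 1381635),
    "T3_CGTACTAG_L008_R1_001.fastq": (2504357, 1720428),
    "VN8_AAGAGGCA_L007_R1_001.fastq": (5376761, 2709392),
}

for file_name, (raw, normalized) in table1_rows.items():
    print(f"{file_name}\t{retained_percent(raw, normalized):.2f}%")
```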

Table 2.

Number of scaffolds with matches to the NR database and their classification (prokaryotes/eukaryotes).

Classification            ≥ 5000 nt       1000–5000 nt      ≥ 1000 nt
Eukaryotes                163 (24.33%)    4,661 (5.04%)     4,824 (5.14%)
Non-eukaryotes            378 (56.42%)    5,534 (5.98%)     5,912 (6.30%)
Scaffolds with hits (NR)  541 (80.75%)    10,195 (11.02%)   10,736 (11.43%)
Scaffolds without hits    129 (19.25%)    82,360 (88.98%)   82,489 (87.85%)
Total scaffolds           670             92,555            93,895
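
One way to turn BLAST results into the categories of Table 2 is sketched below in Python. It assumes the searches were saved in BLAST+ tabular format (-outfmt 6, whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the source does not state the output format that was used. The `subject_domain` lookup from subject ID to domain is hypothetical and would in practice come from taxonomy information for the NR hits.

```python
import csv
import io


def best_hits(blast_tabular):
    """Return {scaffold: (subject, bitscore)} keeping the highest bit score."""
    best = {}
    for row in csv.reader(blast_tabular, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip comments and malformed lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def classify(scaffold_ids, best, subject_domain):
    """Label each scaffold by the domain of its best hit, or 'no hit'."""
    labels = {}
    for scaffold in scaffold_ids:
        if scaffold not in best:
            labels[scaffold] = "no hit"
        else:
            labels[scaffold] = subject_domain.get(best[scaffold][0], "unknown")
    return labels


# Made-up example records; IDs and scores are placeholders, not real data.
example_rows = [
    ["scaffold_1", "subjA", "98.5", "500", "7", "0", "1", "500", "10", "509", "1e-100", "900"],
    ["scaffold_1", "subjB", "90.0", "480", "48", "0", "1", "480", "5", "484", "1e-80", "700"],
    ["scaffold_2", "subjC", "85.0", "300", "45", "0", "1", "300", "1", "300", "1e-50", "400"],
]
example_text = "\n".join("\t".join(row) for row in example_rows)
example_best = best_hits(io.StringIO(example_text))
example_labels = classify(
    ["scaffold_1", "scaffold_2", "scaffold_3"],
    example_best,
    {"subjA": "non-eukaryote", "subjC": "eukaryote"},  # hypothetical lookup
)
print(example_labels)
```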

Table 3.

Percentage of reads from each isolate that mapped to the draft assembly.

Origin     Isolate                          Raw reads   Aligned     Percent
Honduras   H12_TAGGCATG_L006_R1_001.fastq   2,330,575   135,668     5.82%
Honduras   H1_TAAGGCGA_L006_R1_001.fastq    14,712,575  816,820     5.55%
Honduras   H3_CGTACTAG_L006_R1_001.fastq    1,453,462   72,543      4.99%
Honduras   H5_AGGCAGAA_L006_R1_001.fastq    2,221,334   125,013     5.63%
Honduras   H7_TCCTGAGC_L006_R1_001.fastq    5,865,487   406,069     6.92%
Honduras   H8_GGACTCCT_L006_R1_001.fastq    6,583,439   352,841     5.36%
Indonesia  IO2_AGGCAGAA_L007_R1_001.fastq   9,177,514   584,184     6.37%
Indonesia  IO3_TCCTGAGC_L007_R1_001.fastq   3,197,185   199,306     6.23%
Indonesia  IO4_GGACTCCT_L007_R1_001.fastq   14,167,640  970,809     6.85%
Indonesia  IO5_TAGGCATG_L007_R1_001.fastq   11,396,545  791,905     6.95%
Indonesia  IO6_CTCTCTAC_L007_R1_001.fastq   10,435,103  717,542     6.88%
Indonesia  IO7_CAGAGAGG_L007_R1_001.fastq   7,342,116   517,977     7.05%
Indonesia  IO8_GCTACGCT_L007_R1_001.fastq   4,276,929   289,595     6.77%
Indonesia  IO9_CGAGGCTG_L007_R1_001.fastq   4,233,317   303,002     7.16%
Thailand   T10_CTCTCTAC_L008_R1_001.fastq   11,028,257  677,696     6.15%
Thailand   T12_CAGAGAGG_L008_R1_001.fastq   13,584,934  915,014     6.74%
Thailand   T1_TAAGGCGA_L008_R1_001.fastq    15,171,966  877,129     5.78%
Thailand   T3_CGTACTAG_L008_R1_001.fastq    2,504,357   177,428     7.08%
Thailand   T4_AGGCAGAA_L008_R1_001.fastq    7,341,971   452,370     6.16%
Thailand   T5_TCCTGAGC_L008_R1_001.fastq    5,802,738   389,395     6.71%
Thailand   T7_GGACTCCT_L008_R1_001.fastq    5,869,011   380,675     6.49%
Thailand   T9_TAGGCATG_L008_R1_001.fastq    23,687,158  1,531,907   6.47%
Venezuela  V10_CAGAGAGG_L005_R1_001.fastq   6,584,510   460,195     6.99%
Venezuela  V11_GCTACGCT_L005_R1_001.fastq   14,586,243  810,484     5.56%
Venezuela  V12_CGAGGCTG_L005_R1_001.fastq   15,782,536  964,607     6.11%
Venezuela  V1_TAAGGCGA_L005_R1_001.fastq    21,065,170  1,127,598   5.35%
Venezuela  V2_CGTACTAG_L005_R1_001.fastq    11,940,498  789,086     6.61%
Venezuela  V3_AGGCAGAA_L005_R1_001.fastq    17,026,545  1,025,844   6.02%
Venezuela  V4_TCCTGAGC_L005_R1_001.fastq    10,803,314  722,363     6.69%
Venezuela  V5_GGACTCCT_L005_R1_001.fastq    7,554,075   452,362     5.99%
Venezuela  V8_TAGGCATG_L005_R1_001.fastq    10,224,862  653,237     6.39%
Venezuela  V9_CTCTCTAC_L005_R1_001.fastq    5,872,950   278,656     4.74%
Vietnam    VN12_GTAGAGGA_L008_R1_001.fastq  9,564,954   177,669     1.86%
Vietnam    VN1_GCTACGCT_L008_R1_001.fastq   7,092,963   149,412     2.11%
Vietnam    VN2_CGAGGCTG_L008_R1_001.fastq   14,190,209  332,937     2.35%
Vietnam    VN3_AAGAGGCA_L008_R1_001.fastq   3,430,308   69,846      2.04%
Vietnam    VN4_CGAGGCTG_L006_R1_001.fastq   35,679,513  896,728     2.51%
Vietnam    VN5_AAGAGGCA_L006_R1_001.fastq   11,274,952  294,998     2.62%
Vietnam    VN8_AAGAGGCA_L007_R1_001.fastq   5,376,761   85,758      1.59%
Vietnam    VN9_GTAGAGGA_L007_R1_001.fastq   6,887,031   132,293     1.92%
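
The presence/absence classification described in Section 2.2 could be derived from per-scaffold mapped-read counts roughly as in the Python sketch below. The input layout, the scaffold and isolate labels, and the `min_reads` threshold are all assumptions made for illustration; the source does not state the criterion used to call a scaffold present.

```python
def presence_absence(read_counts, min_reads=1):
    """Build a 0/1 matrix of scaffolds per isolate from mapped-read counts.

    read_counts: {isolate: {scaffold: number of mapped reads}}
    min_reads:   hypothetical threshold for calling a scaffold present
    """
    scaffolds = sorted({s for counts in read_counts.values() for s in counts})
    matrix = {
        isolate: [1 if counts.get(s, 0) >= min_reads else 0 for s in scaffolds]
        for isolate, counts in read_counts.items()
    }
    return scaffolds, matrix


# Made-up counts for two isolates; not taken from the deposited data.
example_counts = {
    "H12": {"scaffold_1": 12, "scaffold_2": 0},
    "VN8": {"scaffold_2": 3, "scaffold_3": 1},
}
example_scaffolds, example_matrix = presence_absence(example_counts)
for isolate, row in example_matrix.items():
    print(isolate, row)
```

A matrix of this kind is a typical input for downstream diversity comparisons between isolates.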

Conflict of interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This project was supported by funding from the USDA NIFA, the NRSP8 Aquaculture Genome Coordination Program and the College of Agriculture and Life Sciences, the State of Iowa and Hatch funds. The advice and assistance provided by Dr. James Dickson is appreciated.

References

  • 1. Kawaler E., Seetharam A.S., Du Z.-Q., Severin A.J., Rothschild M.F. 2015. A comparison of the microbiomes of Litopenaeus vannamei from disparate geographical regions. (submitted for publication)
  • 2. FastQC: a quality control tool for high throughput sequence data. Available from: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
  • 3. Brown C.T., Howe A., Zhang Q., Pyrkosz A.B., Brom T.H. 2012. A reference-free algorithm for computational normalization of shotgun sequencing data. (eprint arXiv:1203.4802)
  • 4. Boisvert S., Laviolette F., Corbeil J. Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. J. Comput. Biol. 2010;17(11):1519–1533. doi: 10.1089/cmb.2009.0238.
  • 5. Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990;215(3):403–410. doi: 10.1016/S0022-2836(05)80360-2.
  • 6. Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9(4):357–359. doi: 10.1038/nmeth.1923.

Articles from Genomics Data are provided here courtesy of Elsevier
