BMC Genomic Data. 2025 Dec 15;27:12. doi: 10.1186/s12863-025-01401-7

16S rDNA sequencing of the intestinal metagenome of Wanxi White Goose (Anser cygnoides) with different egg production abilities

Renshu Huang 1, Yafei Zhang 1, Muhammad Arif 2, Cheng Song 1, Lei Yang 1
PMCID: PMC12822282  PMID: 41398218

Abstract

Objectives

The Wanxi White Goose (Anser cygnoides) is a large waterfowl of the family Anatidae and one of the most prominent medium-sized goose breeds in China. The breed is characterized by rapid early growth, robust stress resistance, substantial egg-laying performance, and high down production. Its egg-laying performance is influenced by genetic factors as well as by environment and feeding management. In this study, intestinal contents were collected from high-laying and low-laying Wanxi White Geese, and 16S amplicon sequencing was used to evaluate the relationship between the composition of the intestinal bacterial community and egg-laying ability.

Data description

16S rDNA sequencing was used to identify the microorganisms present in the duodenum, jejunum, ileum, and cecum of Wanxi White Geese with different egg-laying abilities. Four biological replicates were collected per group (high-laying and low-laying) for each of the four intestinal sections, giving a total of 32 samples for sequencing. All raw DNA sequences were uploaded to the Genome Sequence Archive (GSA) database of the National Genomics Data Center (NGDC), which is part of the China National Center for Bioinformation. The accession number assigned to the submission is CRA028174, and the BioProject number is PRJCA043467. Clean read lengths for most samples ranged from 200 to 450 bp.

Clinical trial number

Not applicable.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12863-025-01401-7.

Keywords: Egg production, Wanxi White Goose, Metagenomics, Intestinal contents

Objective

Although the Wanxi White Goose is known for good egg-laying ability and disease resistance, breeding of this goose still relies mainly on prolonged artificial selection for egg-laying rate and egg production [4]. Egg-laying performance is governed by many genes and is strongly influenced by environmental factors (e.g., light and temperature) and by diet [5]. Consequently, conventional breeding requires considerable time and labor and yields unstable egg-laying rates. For decades, molecular-assisted breeding has been used in poultry, with a gradual shift toward screening for egg-laying-related genes or single-nucleotide polymorphism (SNP) markers [6]. In this study, we used 16S sequencing to investigate how the composition of intestinal microorganisms (predominantly bacteria) relates to variation in egg-laying ability among Wanxi White Geese.

Data description

All experimental materials were obtained from the Anhui Wanxi White Goose Conservation Farm in Luan, China. A total of 115 Wanxi White Geese were raised on the farm, including 92 females and 23 males. The geese were housed in separate pens, each holding one male and four females [1]. Each pen included an outdoor activity area, allowing for increased physical activity and natural behaviors. The geese were raised in an elevated flat-net rearing system with a floor height of three meters. The design incorporated natural lighting and ventilation. The geese received a consistent diet and had unrestricted access to food and water. They were primarily fed a corn-soybean meal diet containing 12.25 MJ/kg of metabolizable energy and 16.30% crude protein. Body weight did not differ significantly among the geese, and all birds received the same vaccinations at the same time. Eggs were collected at 5:00 p.m. every day. According to continuous egg-laying records collected from December 2020 to May 2021, high-laying geese laid an average of 32 eggs, while low-laying geese laid an average of 11 eggs [1]. Four high-laying geese (HEP) and four low-laying geese (LEP) were randomly selected, slaughtered, and dissected. Before slaughter, efforts were made to keep the geese calm in order to minimize stress. They were then stunned quickly by electric shock to avoid unnecessary suffering. After slaughter and bleeding, intestinal samples were promptly collected. The contents of the duodenum, jejunum, ileum, and cecum were collected in sterile cryopreservation tubes. The tubes were sealed, labeled, and stored at -80 °C for later use.

Genomic DNA was extracted from the intestinal content samples using a DNA extraction kit. DNA quality and concentration were then checked by agarose gel electrophoresis and NanoDrop 2000 (Figure S1). For the selected sequencing region, PCR was performed using genomic DNA as the template, barcoded specific primers, and Takara's Tks Gflex DNA Polymerase to ensure amplification efficiency and accuracy. The 16S V3-V4 region (primers 343F and 798R) was selected for identifying bacterial diversity [7]. Library construction used two rounds of PCR. The first-round PCR product was checked by electrophoresis, purified using magnetic beads, and used as the template for the second round. The second-round PCR product was again checked by electrophoresis, purified using magnetic beads, and quantified using a Qubit. Equal amounts of each sample were pooled according to the concentration of the PCR products, and the pool was sequenced.
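The equal-amount pooling step can be illustrated with a small calculation. This is only a sketch under stated assumptions: the sample names, the Qubit readings, and the 50 ng target mass are hypothetical and do not come from the study, and `pooling_volumes` is an illustrative helper rather than part of any kit or pipeline.

```python
def pooling_volumes(conc_ng_per_ul, mass_ng=50.0):
    """Return the volume (in microliters) of each library that contributes mass_ng.

    conc_ng_per_ul maps a sample name to its measured concentration (ng/uL).
    mass_ng is the mass each library should contribute (hypothetical default).
    """
    volumes = {}
    for sample, conc in conc_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"non-positive concentration for {sample}")
        # Equal mass per library: volume = mass / concentration.
        volumes[sample] = mass_ng / conc
    return volumes


# Hypothetical Qubit readings in ng/uL, for illustration only.
example = {"HEP_duodenum_1": 25.0, "LEP_cecum_3": 10.0}
volumes = pooling_volumes(example, mass_ng=50.0)
for sample, vol in volumes.items():
    print(f"{sample}: {vol:.2f} uL")
```

More dilute libraries contribute a larger volume, so each sample enters the pool with the same DNA mass.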

Raw data files obtained from high-throughput sequencing were converted into raw sequence reads by base calling. The results were stored in FASTQ (.fq) format, which contains the sequence and the corresponding quality scores for each read. First, the raw reads were scanned with a sliding window using Trimmomatic (v. 0.35) [8]. When the average base quality within the window fell below the threshold, the read was cut at that window, and reads shorter than 50 bp after trimming were removed [9]. The paired-end reads that passed this step were merged using FLASH (v. 1.2.11) [10], with a maximum overlap of 200 bp, to produce complete merged sequences. The split_libraries tool in QIIME (v. 1.8.0) was used to remove sequences containing N bases, sequences with single-base repeats longer than eight, and sequences shorter than 200 bp, yielding clean tags [11]. UCHIME (v. 2.4.2) was used to remove chimeras from the clean tags and obtain valid tags for OTU clustering [12]. After quality control, the number of clean tags across the 32 samples ranged from 71,554 to 75,681, and the number of valid tags ranged from 60,299 to 68,547. The average length of valid tags ranged from 415.48 to 421.07 bp. The 16S amplicon sequencing dataset was deposited in the GSA database of the NGDC under accession number CRA028174 and project accession PRJCA043467 [13]. The submitted data can be retrieved and downloaded via https://download.cncb.ac.cn/gsa5/CRA028174 [14] (Table 1).
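The two filtering rules described above can be sketched in a few lines of Python. This is a simplified illustration, not the Trimmomatic or QIIME implementation: the window size of 4 and the quality threshold of 20 are assumptions (the text does not state them), and the function names are hypothetical. The 50 bp and 200 bp minimum lengths and the homopolymer limit of eight come from the text.

```python
import re


def sliding_window_trim(seq, quals, window=4, min_avg_q=20, min_len=50):
    """Cut a read at the first window whose mean quality is below min_avg_q.

    seq is the base string and quals the per-base Phred scores.
    Returns the trimmed sequence, or None if it is shorter than min_len.
    window and min_avg_q are assumed values, not taken from the study.
    """
    cut = len(seq)
    for start in range(0, len(seq) - window + 1):
        if sum(quals[start:start + window]) / window < min_avg_q:
            cut = start  # drop this window and everything after it
            break
    trimmed = seq[:cut]
    return trimmed if len(trimmed) >= min_len else None


# One base followed by 8 or more copies of itself, i.e. a run longer than eight.
HOMOPOLYMER = re.compile(r"(.)\1{8,}")


def is_clean_tag(seq, min_len=200):
    """Apply the three clean-tag rules: no N, no run > 8, length >= min_len."""
    if "N" in seq.upper():
        return False
    if HOMOPOLYMER.search(seq):
        return False
    return len(seq) >= min_len


# A 60 bp read whose last five bases are low quality.
read = "A" * 60
scores = [30] * 55 + [2] * 5
print(sliding_window_trim(read, scores))
print(is_clean_tag("ACGT" * 50))
```

Real pipelines should rely on the cited tools themselves; the sketch only makes the filtering logic concrete.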

Table 1.

Overview of data files

Label | Name of data file/data set | File types (file extension) | Data repository and identifier (DOI or accession number)
Data file 1 | Raw sequencing data of the duodenum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009221) [15]
Data file 2 | Raw sequencing data of the duodenum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009222) [16]
Data file 3 | Raw sequencing data of the duodenum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009223) [17]
Data file 4 | Raw sequencing data of the duodenum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009224) [18]
Data file 5 | Raw sequencing data of the duodenum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009225) [19]
Data file 6 | Raw sequencing data of the duodenum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009226) [20]
Data file 7 | Raw sequencing data of the duodenum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009227) [21]
Data file 8 | Raw sequencing data of the duodenum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009228) [22]
Data file 9 | Raw sequencing data of the jejunum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009229) [23]
Data file 10 | Raw sequencing data of the jejunum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009230) [24]
Data file 11 | Raw sequencing data of the jejunum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009231) [25]
Data file 12 | Raw sequencing data of the jejunum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009232) [26]
Data file 13 | Raw sequencing data of the jejunum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009233) [27]
Data file 14 | Raw sequencing data of the jejunum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009234) [28]
Data file 15 | Raw sequencing data of the jejunum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009235) [29]
Data file 16 | Raw sequencing data of the jejunum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009236) [30]
Data file 17 | Raw sequencing data of the ileum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009237) [31]
Data file 18 | Raw sequencing data of the ileum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009238) [32]
Data file 19 | Raw sequencing data of the ileum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009239) [33]
Data file 20 | Raw sequencing data of the ileum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009240) [34]
Data file 21 | Raw sequencing data of the ileum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009241) [35]
Data file 22 | Raw sequencing data of the ileum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009242) [36]
Data file 23 | Raw sequencing data of the ileum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009243) [37]
Data file 24 | Raw sequencing data of the ileum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009244) [38]
Data file 25 | Raw sequencing data of the cecum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009245) [39]
Data file 26 | Raw sequencing data of the cecum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009246) [40]
Data file 27 | Raw sequencing data of the cecum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009247) [41]
Data file 28 | Raw sequencing data of the cecum content of goose with high egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009248) [42]
Data file 29 | Raw sequencing data of the cecum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009249) [43]
Data file 30 | Raw sequencing data of the cecum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009250) [44]
Data file 31 | Raw sequencing data of the cecum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009251) [45]
Data file 32 | Raw sequencing data of the cecum content of goose with low egg performance | FASTQ file (.fastq) | GSA (https://ngdc.cncb.ac.cn/gsa/browse/CRA028174/CRR2009252) [46]
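Because the run accessions in Table 1 are consecutive and ordered by intestinal section, then group, a sample manifest can be generated programmatically. The following is a minimal sketch that reproduces the table's ordering. The replicate index is a hypothetical label added for convenience (the table does not number replicates), and `build_manifest` is an illustrative name, not part of any GSA tool.

```python
SECTIONS = ["duodenum", "jejunum", "ileum", "cecum"]  # order used in Table 1
GROUPS = ["HEP", "LEP"]  # high-laying, then low-laying
REPLICATES = 4
FIRST_RUN = 2009221  # numeric part of the first accession, CRR2009221
BASE_URL = "https://ngdc.cncb.ac.cn/gsa/browse/CRA028174"


def build_manifest():
    """Return one dict per run, in the same order as Table 1."""
    rows = []
    run = FIRST_RUN
    for section in SECTIONS:
        for group in GROUPS:
            for replicate in range(1, REPLICATES + 1):
                accession = f"CRR{run}"
                rows.append({
                    "accession": accession,
                    "section": section,
                    "group": group,
                    "replicate": replicate,  # hypothetical label
                    "url": f"{BASE_URL}/{accession}",
                })
                run += 1
    return rows


manifest = build_manifest()
print(len(manifest))
print(manifest[0]["accession"], manifest[-1]["accession"])
```

A manifest like this can then drive group comparisons per section, for example HEP versus LEP within the cecum.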

Limitations

Our study examined the relationship between the distribution of bacteria in different intestinal sections and the egg production of Wanxi White Geese. The dataset is limited to 16S amplicon sequencing. Additional 18S and metagenomic sequencing would allow the composition of the intestinal flora to be analyzed in more depth and would help to better explain the relationship between intestinal flora and egg production.

Supplementary Information

Below is the link to the electronic supplementary material.

12863_2025_1401_MOESM1_ESM.docx (33.7KB, docx)

Supplementary Material 1: Fig. S1. The electrophoresis of PCR products from different intestinal tissue contents

Acknowledgements

Not applicable.

Abbreviations

GSA

Genome Sequence Archive

NGDC

National Genomics Data Center

HEP

High-laying geese

LEP

Low-laying geese

Author contributions

RSH and YFZ conceived and designed the experiments, reviewed the initial draft of the manuscript, and approved the final draft submitted. LY, RSH and YFZ designed and performed the experiments, analyzed the data, prepared the materials, drafted and revised the manuscript, and approved the final draft submitted. CS and MA contributed materials and analysis tools. LY acquired the funding. All the authors approved the final manuscript.

Funding

This work was supported by Anhui Province Science and Technology Innovation Project (202423l10050055).

Data availability

The data sets are openly available in the GSA database of the NGDC (https://download.cncb.ac.cn/gsa5/CRA028174/).

Declarations

Ethics approval and consent to participate

Ethical approval for studies involving animals was granted by the Ethics Committee of West Anhui University. All experimental procedures adhered to the Basel Declaration (https://animalresearchtomorrow.org/en) and complied with the Guide for the Care and Use of Laboratory Animals published by West Anhui University. Consent was obtained from the farm handlers prior to sampling.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Cheng Song, Email: lanniao812329218@163.com.

Lei Yang, Email: 592408660@qq.com.

References

Articles from BMC Genomic Data are provided here courtesy of BMC
