Abstract
Background
The eastern and northern parts of Xinjiang are the main camel breeding areas in the region. Currently, there is a lack of systematic comparative genetic studies of local populations with significant phenotypic differences at the whole-genome level. Previous research has shown significant differences in milk production traits between Bactrian camels in the eastern and northern regions of Xinjiang. To further elucidate the genetic differences between the Bactrian camel populations in these two areas and the molecular basis of their lactation traits, this study selected 106 Bactrian camels—55 from three camel farms in the northern region and 51 from one farm in the eastern region—for whole-genome resequencing.
Results
Variant detection identified a total of 6,451,453 SNPs, which were used for genetic diversity, population structure, and selection signal analyses. The genetic diversity of the northern Bactrian camel population was slightly higher than that of the eastern population, with minimal overall difference. Principal component analysis and phylogenetic tree analysis indicated that the Bactrian camel populations in the eastern and northern regions could be clearly distinguished. GO and KEGG enrichment analyses showed that, compared with the northern population, candidate genes in the eastern population were mainly enriched in the cell membrane and thyroid hormone synthesis pathways. Based on three inter-population selection signal analysis methods (Fst, $\theta_\pi$, and XP-EHH), 459 intersecting genes were identified, among which eight were candidate genes related to lactation traits: DUOX1, DUOX2, CPQ, CGA, PLCG1, FYN, GNRHR, and TRHR.
Conclusions
These genes can serve as potential marker sites for future molecular breeding.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12864-026-12727-6.
Keywords: Bactrian camels, Whole genome resequencing, Milk production traits
Introduction
The ancestors of camels originated in Canada, North America [1], and later migrated to Central Asia due to environmental changes, where they gradually evolved humps, giving rise to dromedary and Bactrian camels [2]. Dromedary camels are found in regions such as North Africa, West Asia, and Australia, and Bactrian camels are distributed in Central Asia and Northeast Asia, including China and Mongolia [3, 4]. In the 19th century, camels were primarily used for transportation. Since then, their economic value has increasingly been reflected in milk, meat, wool, tourism, and cosmetics [5]. Camel milk is rich in nutrients such as fat, protein, minerals, unsaturated fatty acids, and amino acids, and also contains various bioactive compounds such as lactoferrin, lysozyme, immunoglobulin, lactoperoxidase, and insulin [6]. These compounds mediate immune regulation [7], antibacterial and anti-inflammatory effects [8], anticancer properties [9, 10], anti-Parkinson’s effects [11], antioxidant activity [9], neuroprotective effects [12], and therapeutic effects on diabetes. Moreover, studies have shown that camel milk does not contain β-lactoglobulin and has a high content of α-lactalbumin, which makes it similar to human milk. Therefore, camel milk can be considered an alternative to cow’s milk for infants and young children prone to allergies [13].
According to the Food and Agriculture Organization of the United Nations (FAO) statistics for 2023, there are approximately 1.4 million Bactrian camels worldwide, with the largest populations in China (634,000), Mongolia (480,567), Kazakhstan (280,522), and Uzbekistan (17,631) [14]. In 2024, China’s “National Catalogue of Livestock and Poultry Genetic Resources” included five local breeds of Bactrian camels: Alxa Bactrian camel and Sunite Bactrian camel from Inner Mongolia, Tarim Bactrian camel and Dzungarian Bactrian camel from Xinjiang, and Qinghai camel. According to data from the “National Statistical Yearbook of China” (ISBN 978-7-5230-0799-0), there were 634,000 camels in China in 2024, distributed across Xinjiang, Inner Mongolia, Gansu, Qinghai, and other regions. Among them, Xinjiang and Inner Mongolia had the largest populations, with 364,000 and 203,000 camels, respectively. Currently, the development of the Bactrian camel industry faces several prominent issues, including insufficient milk supply, significant variations in the nutrient and functional compound content of milk from different breeding regions, and unclear quality characteristics of the milk [15]. Additionally, compared to other dairy animals, there has been less research on the molecular breeding of Bactrian camels. Studies on genes related to lactation traits and their regulatory mechanisms are not yet sufficiently in-depth, and many genes remain functionally unverified [16]. Therefore, selecting camel groups with high milk yield and good milk quality for breeding will help produce high-quality camel milk and promote the healthy and sustainable development of the camel milk industry.
Xinjiang has the largest population of Bactrian camels in China, with 364,000 camels distributed in Altay, Keping County, Changji, Dabancheng, Tacheng, Ili, Hami, and other places. There are few studies on the differences in the material composition of camel milk in different regions of Xinjiang. In 2023, a study compared camel milk from different regions of Xinjiang [17]. The results showed significant differences in the fat content of camel milk between Changji and Hami, in the protein content between Fukang and Hami, and in the lactose content between Dabancheng, Fukang, Jimunai, Changji, and Hami. The total solids content also showed significant differences between Dabancheng and Hami. This indicates that there are certain differences in the conventional milk components of camel milk in northern and eastern Xinjiang. However, no comparison was made between the Ili region in northern Xinjiang and the Hami region in eastern Xinjiang. This study fills that gap. In 2025, Li et al. [18] found that camel milk from the Altay, Tacheng, and Ili regions showed significant differences in the calcium/phosphorus (Ca/P) and sodium/potassium (Na/K) ratios. This provides direction and ideas for future research on the differences in conventional milk components and mineral content of camel milk in different regions of Xinjiang. Moreover, Li et al. speculated that these regional differences might be related to environmental adaptations to local climate or soil conditions.
Screening lactation-related genes and candidate gene regions in Bactrian camels is of great significance for improving their lactation traits [19]. Currently, in camel molecular breeding, studies have shown that casein genes [11] and PICALM genes (phosphatidylinositol-binding clathrin assembly protein) [20] are associated with lactation traits in dromedary camels. Gubin et al. [21] found that two SNP sites in the intergenic regions of OSBPL8, MRPL37, SSBP3, and LOC102516351 genes in Gobi Red Camels from Inner Mongolia, China, are linked to the content of conventional nutrients (fat, protein, and lactose) in camel milk. In addition, Yao et al. [22] discovered that the GnRH (gonadotropin-releasing hormone) secretion, mTOR (mechanistic target of rapamycin), PI3K-Akt (phosphatidylinositol 3-kinase/protein kinase B), and MAPK (mitogen-activated protein kinase) signaling pathways are related to the regulation of lactation traits in Bactrian camels. Relatively few genes have been reported to regulate lactation traits in Bactrian camels, even though milk production traits are typically regulated by multiple genes. Therefore, this study employed whole-genome resequencing to analyze the population structure and identify selection signals in two Bactrian camel populations: those from Northern Xinjiang and Eastern Xinjiang. The aim of the study was to screen for lactation-related genes and regulatory signaling pathways, thereby providing important molecular markers for the genetic selection and breeding of Bactrian camels in the Xinjiang region.
Results
Milk performance of eastern and northern Xinjiang Bactrian camels
Milk from randomly selected Bactrian-camel herds in eastern and northern Xinjiang was analyzed for four key production traits. SPSS 27.0 independent-samples t tests revealed highly significant regional differences in fat, protein, and total-solid contents (p < 0.001), whereas lactose content did not differ significantly between regions (p = 0.081); full results are presented in Table 1.
Table 1.
The difference in milk performance between the Bactrian camel populations of eastern and northern Xinjiang
| Group | N | Fat % | Protein % | Lactose % | Total Solids % |
|---|---|---|---|---|---|
| Eastern Xinjiang | 40 | | | | |
| Northern Xinjiang | 55 | | | | |
| P-Value | | <0.001 | <0.001 | 0.081 | <0.001 |
Results of data preprocessing and alignment
After preprocessing, a total of 2770.3 Gb of valid sequencing data was generated from all samples, with an average data volume of 26.13 Gb per sample. The average Q30 score was 96.64%, and the average GC content was 40.77%, indicating reliable data quality. Alignment against the reference genome yielded an average read alignment rate of 99.01% and an average 1X genome coverage of 96.14%, meeting the analysis requirements. The data quality and alignment results are presented in Table 2.
Table 2.
Data quality and alignment results
| Samples | Cleanbases | Cleanrate/% | CleanQ20/% | CleanQ30/% | Depth | GC_rate/% | TotalReads | MappedReads | MappedReads Rate/% | Cov1X/% |
|---|---|---|---|---|---|---|---|---|---|---|
| HCX-1 | 23,818,023,015 | 95.02 | 98.90 | 96.61 | 23.82 | 40.65 | 161,262,758 | 159,770,639 | 99.07 | 96.18 |
| HCX-2 | 23,691,436,199 | 95.33 | 98.98 | 96.80 | 23.69 | 40.62 | 160,309,470 | 158,750,560 | 99.03 | 96.06 |
| HCX-3 | 26,696,183,861 | 94.89 | 99.04 | 97.00 | 26.70 | 40.63 | 181,014,810 | 179,656,393 | 99.25 | 96.31 |
| HCX-4 | 27,056,383,370 | 94.68 | 99.06 | 97.04 | 27.06 | 40.56 | 183,527,246 | 182,118,834 | 99.23 | 96.25 |
| HCX-5 | 26,000,863,876 | 96.18 | 98.94 | 96.65 | 26.00 | 40.52 | 175,431,442 | 173,748,324 | 99.04 | 96.24 |
| HCX-6 | 24,102,338,444 | 96.03 | 98.90 | 96.49 | 24.10 | 40.61 | 162,668,074 | 161,090,745 | 99.03 | 96.23 |
| HCX-8 | 24,996,978,982 | 95.49 | 98.98 | 96.81 | 25.00 | 40.45 | 169,001,152 | 167,554,228 | 99.14 | 96.36 |
| HCX-9 | 25,592,661,111 | 95.05 | 98.88 | 96.53 | 25.59 | 40.38 | 173,424,156 | 171,867,797 | 99.10 | 96.23 |
| HCX-10 | 30,224,428,668 | 96.68 | 98.75 | 96.17 | 30.22 | 40.34 | 203,436,646 | 201,447,488 | 99.02 | 96.23 |
| HCX-11 | 27,810,085,619 | 94.15 | 99.03 | 96.92 | 27.81 | 40.68 | 189,281,688 | 188,004,845 | 99.33 | 96.08 |
| Mean | 26,134,933,609 | 96.70 | 98.95 | 96.64 | 26.13 | 40.77 | 175,866,655 | 174,129,340 | 99.01 | 96.14 |
Samples: Sample names; Clean bases: Number of bases remaining after filtering; Clean rate: Ratio of clean reads to raw reads; Clean Q20: Percentage of bases with a quality score ≥ 20; Clean Q30: Percentage of bases with a quality score ≥ 30; Depth: Sequencing depth; GC_rate: GC content; Total Reads: Number of clean reads; Mapped Reads: Reads aligned to the reference genome; MappedReadsRate: Alignment rate of reads, calculated as Mapped Reads / Total Clean Reads; Cov1X: MappedBases(≥1X)/GenomeSize. Table 2 lists the sequencing quality and alignment results for a subset of samples (n = 10); the complete dataset is available in the Supplementary Material.
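As a quick consistency check on the footnote's definition of MappedReadsRate (Mapped Reads / Total Clean Reads), the short sketch below recomputes the rate for two rows of Table 2. The function name `mapped_rate` is illustrative and not part of the authors' pipeline.

```python
def mapped_rate(mapped_reads: int, total_reads: int) -> float:
    """Return the alignment rate in percent, rounded to two decimals."""
    return round(100.0 * mapped_reads / total_reads, 2)

# (mapped reads, total clean reads) copied from Table 2.
rows = {
    "HCX-1": (159_770_639, 161_262_758),
    "HCX-3": (179_656_393, 181_014_810),
}
rates = {sample: mapped_rate(m, t) for sample, (m, t) in rows.items()}
print(rates)
```

Both recomputed values agree with the MappedReads Rate column of the table.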
Summary of genetic variants
A total of 6,451,453 SNPs were detected across the entire genome. The distribution of these SNPs is shown in Fig. 1. The majority of variants are located in intergenic regions (3,003,030, 46.55%), followed by intronic regions (2,422,399, 37.55%). A smaller proportion of SNPs are found in the 1 kb upstream region of genes (111,624, 1.73%) and the 1 kb downstream region of genes (124,037, 1.92%), as well as in exonic regions (63,231, 0.98%). A small number of SNPs are located at splice sites (the 2 bp near the exon/intron boundary within introns) (277, 0.004%), and in regions that are both the 1 kb upstream region of one gene and the 1 kb downstream region of another gene (9,838, 0.15%). The remaining SNPs are distributed in other regions (717,017, 11.11%).
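The category percentages above follow directly from the reported counts. The sketch below recomputes them; the dictionary keys are shorthand labels chosen here for illustration, not annotation-tool output names.

```python
# SNP counts per annotation category, as reported in the text.
counts = {
    "intergenic": 3_003_030,
    "intronic": 2_422_399,
    "upstream_1kb": 111_624,
    "downstream_1kb": 124_037,
    "exonic": 63_231,
    "splicing": 277,
    "upstream_and_downstream": 9_838,
    "other": 717_017,
}

total = sum(counts.values())
percent = {name: round(100 * n / total, 2) for name, n in counts.items()}

print(total)
for name, value in percent.items():
    print(f"{name}: {value}%")
```

The category counts sum to the reported genome-wide total, so the categories are mutually exclusive.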
Fig. 1.
SNP detection and annotation results. a Distribution of SNPs across different genomic regions; b Functional annotation of exonic SNPs; c SNP density on each chromosome
Functional annotation was performed on the SNPs located in exonic regions. Most were synonymous variants (32,739, 54.16%), followed by non-synonymous variants (27,289, 45.14%). A small proportion introduced stop codons (366, 0.61%) or removed stop codons (60, 0.10%).
Genetic diversity of Xinjiang Bactrian camels
Genetic diversity was quantified in Bactrian camel populations from Eastern and Northern Xinjiang. As shown in Table 3 and Fig. 2, the Northern Xinjiang population had slightly higher nucleotide diversity ($\pi$ = 0.00117) than the Eastern Xinjiang population ($\pi$ = 0.00112), though both were at low levels with minimal absolute difference. Runs of homozygosity (ROH) analysis revealed marked differences in inbreeding levels between the two populations. The key indicators, median_TotalROH and median_NROH, showed that Eastern Xinjiang Bactrian camel genomes carried more ROH segments (56 vs. 32) with a greater total length (37,592 Kb vs. 19,358 Kb) than those of the Northern Xinjiang population, indicating higher genomic homozygosity and inbreeding levels in the Eastern Xinjiang population. The inbreeding coefficient ($F_{ROH}$) calculated from ROH confirmed this pattern: it was higher in Eastern Xinjiang (1.50%) than in Northern Xinjiang (0.77%).
Table 3.
Genetic diversity index statistics for Bactrian camel populations in eastern and northern Xinjiang
| Population Name | Nucleotide Diversity ($\pi$) | Median_TotalROH (Kb) | Median_NROH | $F_{ROH}$ (%) |
|---|---|---|---|---|
| Eastern Xinjiang | 0.00112 | 37592 | 56 | 1.50 |
| Northern Xinjiang | 0.00117 | 19358 | 32 | 0.77 |
Median_TotalROH: Median total ROH length; Median_NROH: Median number of ROH segments; $F_{ROH}$: ROH-based inbreeding coefficient
Fig. 2.
Genetic diversity in eastern vs. northern Xinjiang Bactrian camels. a Nucleotide diversity of the two populations; b Median total ROH lengths (Kb) in the two populations; c median number of ROH segments in the two populations
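An ROH-based inbreeding coefficient is commonly defined as the total ROH length divided by the length of the genome analysed. The sketch below applies that definition to the median ROH lengths in Table 3. The genome length of 2.5 Gb is an assumption made here for illustration only; the text does not state the value the authors used, and their coefficient may have been computed per individual rather than from medians.

```python
def f_roh(total_roh_kb: float, genome_kb: float) -> float:
    """Fraction of the analysed genome covered by runs of homozygosity."""
    return total_roh_kb / genome_kb

# ASSUMPTION: about 2.5 Gb of analysed genome (not given in the text).
GENOME_KB = 2_500_000

east = f_roh(37_592, GENOME_KB)   # median total ROH, Eastern Xinjiang
north = f_roh(19_358, GENOME_KB)  # median total ROH, Northern Xinjiang
print(f"east={east:.2%} north={north:.2%}")
```

Under this assumed genome length the two ratios come out close to the percentages listed in Table 3, which suggests the reported coefficients follow the same length-ratio definition.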
Population genetic structure analysis
Principal component analysis
In the Principal Component Analysis (PCA), the first three principal components, which explained 10.12%, 7.11%, and 6.71% of the total variance, were retained. These three components separated Group 3 from Northern Xinjiang and Group 4 from Eastern Xinjiang from the other populations (Groups 1 and 2). However, the three Northern Xinjiang populations showed substantial admixture and could not be clearly separated by clustering, indicating a close genetic relationship. Overall, the genetic structure differentiated Bactrian camels from Northern and Eastern Xinjiang, although there was some admixture between the two regions. The PCA results are shown in Fig. 3a.
Fig. 3.
Population structure analysis. a PCA Plot; b Phylogenetic tree
Phylogenetic tree construction
The phylogenetic tree is shown in Fig. 3b. Consistent with the PCA results, the Bactrian camels from Eastern Xinjiang and Northern Xinjiang separated into distinct clusters, indicating a relatively distant genetic relationship, although there was evidence of some gene flow between them. In contrast, the Bactrian camels from the three camel farms in Northern Xinjiang were closely related.
Detection of genomic selection signals
Single-population selection signals
The genome was divided into 10-kb windows. For single-population selection signal detection in the camels from Northern and Eastern Xinjiang, we ranked the windows by composite likelihood ratio (CLR) and selected the top 5% as candidate regions. This process identified 5,017 genes in the Northern Xinjiang population and 4,405 genes in the Eastern Xinjiang population, accounting for approximately 12.93% and 11.35% of the genome, respectively. The Manhattan plots of selection signals within the Bactrian camel populations from Eastern and Northern Xinjiang are shown in Fig. 4.
Fig. 4.
Manhattan plot for Bactrian camels from Eastern Xinjiang vs. Northern Xinjiang. a Eastern Xinjiang Bactrian camel population; b Northern Xinjiang Bactrian camel population
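The top-5% cutoff used to pick candidate windows can be sketched as a simple ranking step. The window names and scores below are hypothetical, and `top_windows` is an illustrative helper rather than a function of SweeD or any other tool.

```python
def top_windows(scores: dict[str, float], top_percent: int = 5) -> list[str]:
    """Return window IDs ranked in the top `top_percent` percent by score."""
    n_keep = max(1, len(scores) * top_percent // 100)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_keep]

# Hypothetical per-window scores for 40 consecutive 10-kb windows.
clr = {f"win{i}": float(i % 7) + i / 100 for i in range(40)}
candidates = top_windows(clr)
print(candidates)
```

Genes overlapping the retained windows would then be taken forward as candidates for enrichment analysis.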
GO analysis showed that candidate genes from the 51 Eastern Xinjiang Bactrian camels were enriched in 20 significant GO terms (Fig. 5a) spanning all three major Gene Ontology categories: Biological Process (BP), Cellular Component (CC), and Molecular Function (MF). Among them, 447 genes were enriched in protein binding. After FDR correction of the P-values, the terms signal transduction, axon, and protein binding showed highly significant enrichment. KEGG enrichment analysis identified 20 significantly enriched pathways in the Eastern Xinjiang population (Fig. 5b), with 67 genes enriched in the Calcium signaling pathway. Three pathways (Glutamatergic synapse, Long-term depression, and Calcium signaling pathway) reached highly significant P-values.
For the 55 Northern Xinjiang Bactrian camels, candidate genes were likewise enriched in 20 significant GO terms (Fig. 5c) spanning the BP, CC, and MF categories. Among them, 634 genes were enriched in the plasma membrane. After FDR correction of the P-values, the terms axon guidance, cell adhesion, and intracellular signal transduction showed highly significant enrichment. KEGG pathway enrichment analysis identified 20 significantly enriched pathways in the Northern Xinjiang population (Fig. 5d), with 73 genes enriched in the Calcium signaling pathway. The P-values for four pathways (Calcium signaling pathway, signaling pathway, Pancreatic secretion, and Axon guidance) reached highly significant levels.
Fig. 5.
GO and KEGG enrichment analysis results of SweeD candidate genes. a GO enrichment analysis results of SweeD candidate genes in the 51 Bactrian camels from Eastern Xinjiang; b KEGG enrichment analysis results of SweeD candidate genes in the 51 Bactrian camels from Eastern Xinjiang; c GO enrichment analysis results of SweeD candidate genes in the 55 Bactrian camels from Northern Xinjiang; d KEGG enrichment analysis results of SweeD candidate genes in the 55 Bactrian camels from Northern Xinjiang
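The text does not say which procedure was used to correct the enrichment P-values for multiple testing. As a minimal sketch, assuming the widely used Benjamini-Hochberg false discovery rate adjustment, the correction could look like the following; the raw P-values are hypothetical.

```python
def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw enrichment p-values for four terms.
raw = [0.001, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
print(adj)
```

A term would then be called significant when its adjusted value falls below the chosen threshold.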
Inter-population selection signals
Using three population-based selection signal analysis methods (Fst, $\theta_\pi$, and XP-EHH), we identified the top 5% of differentiated regions as candidate regions, using a 50-kb window and a 25-kb step size. These methods identified 2391, 2334, and 3982 candidate genes, respectively (Figs. S1a, b, c). We performed GO and KEGG analyses on these candidate genes. The GO enrichment analysis results (Figs. S2a, c, e) revealed significant enrichment of candidate genes across all three major Gene Ontology categories (Biological Process (BP), Cellular Component (CC), and Molecular Function (MF)). Highly significant enrichments (P < 0.01) included terms such as adherens junction, phosphatidylinositol phospholipase C activity, cysteine-type endopeptidase inhibitor activity, one-carbon metabolic process, plasma membrane tubulation, and regulation of axonogenesis. Additionally, the XP-EHH-based method identified 17 highly significant (P < 0.01) GO terms. Among these, the plasma membrane term had the highest number of enriched genes, with 522 genes, followed by the ATP-binding term, which included 135 genes. The KEGG analysis results (Fig. S2b, d, f) showed three significantly enriched pathways (P < 0.05): Axon guidance, Pathways in cancer, and Ras signaling pathway. Of these, only the Axon guidance pathway was highly significant (P < 0.01).
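The 50-kb window with a 25-kb step means that consecutive windows overlap by half. The sketch below generates such windows and averages a per-SNP statistic inside each one. The positions and values are hypothetical, and the naive mean is a simplification: dedicated tools usually combine per-SNP estimates in a weighted way.

```python
def sliding_windows(chrom_length: int, window: int = 50_000, step: int = 25_000):
    """Yield half-open (start, end) intervals covering one chromosome."""
    start = 0
    while start < chrom_length:
        yield start, min(start + window, chrom_length)
        start += step

def window_means(positions, values, chrom_length):
    """Average a per-SNP statistic within each sliding window."""
    out = []
    for start, end in sliding_windows(chrom_length):
        inside = [v for p, v in zip(positions, values) if start <= p < end]
        out.append((start, end, sum(inside) / len(inside) if inside else None))
    return out

# Hypothetical SNP positions and per-SNP values on a 100-kb chromosome.
pos = [10_000, 30_000, 60_000, 90_000]
stat = [0.02, 0.04, 0.10, 0.06]
result = window_means(pos, stat, chrom_length=100_000)
print(result)
```

Because of the overlap, each SNP away from the chromosome ends contributes to two windows.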
Joint Detection by Fst, $\theta_\pi$, and XP-EHH
Given Xinjiang’s vast land area and diverse geographical environments, long-term differences in natural environments and selective breeding practices between Eastern and Northern Xinjiang have led to significant differences in the physical characteristics and economic traits of Bactrian camels in these regions, with consequences for production efficiency. To identify candidate regions in the Bactrian camel population of Eastern Xinjiang compared to those of Northern Xinjiang, we used three different population selection signal methods (Fst, $\theta_\pi$, and XP-EHH) to detect inter-population selection signals (Fig. 6a). These three methods collectively identified 459 overlapping genes. By intersecting these genes with those related to Bactrian camel lactation, we annotated eight genes associated with lactation traits in Bactrian camels (see Table 4), namely DUOX1, DUOX2, CPQ, CGA, PLCG1, FYN, GNRHR, and TRHR. CPQ displayed the highest CLR value (0.2827) among the candidates, potentially reflecting recent positive selection. The low CLR values (<0.15) of the remaining genes imply that their differentiation is more likely driven by divergent selection than by recent selective sweeps.
Fig. 6.
Joint detection results based on Fst, $\theta_\pi$, and XP-EHH. a Overlapping genes from three population selection signal methods; b GO enrichment analysis of common candidate genes; c KEGG enrichment analysis results of common candidate genes
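The joint detection amounts to intersecting the candidate-gene sets from the three methods and then intersecting the result with a list of lactation-related genes. A minimal sketch with set operations follows; the small sets and the `GENE_*` names are hypothetical stand-ins, since the full gene lists are not given in the text.

```python
# Hypothetical candidate-gene sets, one per inter-population method.
fst_genes = {"DUOX1", "DUOX2", "CPQ", "GENE_A"}
diversity_genes = {"DUOX1", "DUOX2", "CPQ", "GENE_B"}  # diversity-based method
xpehh_genes = {"DUOX1", "CPQ", "DUOX2", "GENE_A", "GENE_C"}

# Genes supported by all three methods.
shared = fst_genes & diversity_genes & xpehh_genes

# Hypothetical reference list of lactation-related genes.
lactation_genes = {"DUOX1", "DUOX2", "TRHR"}
lactation_candidates = sorted(shared & lactation_genes)

print(sorted(shared))
print(lactation_candidates)
```

Requiring support from all three methods is conservative: a gene flagged by only one or two of them is dropped.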
Table 4.
Candidate genes list
| Serial Number | Chromosome Number | Gene Start Position | Gene End Position | Gene Name | Fst Value | CLR Value |
|---|---|---|---|---|---|---|
| 1 | 6 | 19589586 | 19625820 | DUOX1 | 0.0668 | 0.0063 |
| 2 | 6 | 19638040 | 19659885 | DUOX2 | 0.0557 | 0.0099 |
| 3 | 25 | 32700070 | 33089923 | CPQ | 0.0107 | 0.2827 |
| 4 | 8 | 23344490 | 23356944 | CGA | 0.0707 | 0.0063 |
| 5 | 19 | 42320197 | 42358748 | PLCG1 | 0.0346 | 0.0140 |
| 6 | 8 | 31721195 | 31886562 | FYN | 0.0404 | 0.1243 |
| 7 | 2 | 54411098 | 54424743 | GNRHR | 0.0445 | 0.0749 |
| 8 | 25 | 23865074 | 24343870 | TRHR | 0.0182 | 0.1264 |
To understand the signaling pathways in which the candidate genes are involved in camels, we performed GO and KEGG analyses on the 459 candidate genes. The GO enrichment analysis (Fig. 6b) returned 11 terms in BP (Biological Process), 4 in CC (Cellular Component), and 5 in MF (Molecular Function). Eight terms were highly significant, with the thyroid hormone generation term showing a particularly significant difference. Additionally, a large number of genes were enriched in the plasma membrane term. The KEGG enrichment analysis (Fig. 6c) showed that the Neuroactive Ligand-Receptor Interaction pathway had the highest number of enriched genes (15). The Axon guidance, Neuroactive Ligand-Receptor Interaction, and Thyroid Hormone Synthesis pathways showed high significance levels. Among these, the Thyroid Hormone Synthesis pathway is associated with lactation traits.
Methods
Selection of experimental population and genomic DNA extraction
The experimental population consisted of 106 Bactrian camels, including 51 from the eastern and 55 from the northern parts of Xinjiang. The northern population was sourced from three different farms, while the eastern population came from a single farm. Both populations were raised under stall-feeding conditions. During the feeding process, the feed formula (corn stalks, bran, hay, and tofu residue fed in fixed proportions) and the amount of feed provided were essentially consistent. We clarify that no Bactrian camels were euthanized or sacrificed in this study; only blood and milk samples were collected. The camel milk samples were collected by professional milkers from camels in early lactation, with milking performed twice daily using a hand-operated milking cart. All samples were individual, not pooled, and sampling took place in April, maintaining a consistent lactation phase. Information on the experimental population is provided in Table 5. Blood samples were collected from the carotid artery by licensed veterinarians between March and October 2024. Blood samples were collected in anticoagulant tubes, thoroughly mixed, transferred into cryovials, immediately placed in liquid nitrogen, and then stored at -80°C in the laboratory upon return. Mid-infrared spectroscopy was used to profile the nutrients in Bactrian-camel milk from eastern and northern Xinjiang, and the resulting compositional data were analyzed in SPSS 27.0 using an independent-samples t-test to gauge the significance of regional differences in milk composition. Genomic DNA was extracted from the 106 blood samples using the Magbead HMW DNA Kit (Kangwei Century, China). The concentration of DNA samples was measured using a Qubit fluorometer (Thermo Fisher Scientific, USA). The integrity and purity of DNA samples were assessed by 1% agarose gel electrophoresis. Only qualified samples were used for library preparation.
Table 5.
Origin and basic information of the sampled camels
| Sampling region | Research farms | N | Age (years) | Days in milk | Sampling month |
|---|---|---|---|---|---|
| North Xinjiang region | Group1 | 13 | 5–10 | 28–59 | 4 |
| North Xinjiang region | Group2 | 12 | 5–10 | 31–62 | 4 |
| North Xinjiang region | Group3 | 30 | 4–10 | 33–58 | 4 |
| East Xinjiang region | Group4 | 51 | 5–10 | 25–65 | 4 |
Whole-genome resequencing and data processing/alignment
The purified genomic DNA from Bactrian camel blood was sent to Huazhi Biotechnology Co., Ltd. for whole-genome resequencing (10×) using the MGI DNBSEQ-T7 platform. The library construction included DNA fragmentation, end repair, adapter ligation, amplification, fragment selection, and circularization to produce high-quality DNA fragments of 300–500 bp. The fragment size was verified using the Qsep400 Bioanalyzer. The raw sequencing data were processed with fastp [23] to filter out low-quality reads, remove adapters, and discard reads with excessive N bases or shorter than 100 bp, generating clean reads. The Sentieon software [24], which includes an optimized version of BWA-MEM, was used to align the clean reads to the Bactrian camel reference genome (GCA_048773025.1, NCBI: https://www.ncbi.nlm.nih.gov/).
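The read-filtering step can be illustrated by assembling a fastp command line for one paired-end sample. This is a sketch under assumptions: the file names are hypothetical, only the 100-bp length cutoff comes from the text, and other thresholds (quality, N bases) are left at fastp defaults because the authors' exact settings are not reported. The command is built but not executed.

```python
import shlex

def fastp_command(r1: str, r2: str, out_prefix: str, min_len: int = 100) -> list[str]:
    """Build a fastp call for paired-end reads; nothing is run here."""
    return [
        "fastp",
        "--in1", r1,
        "--in2", r2,
        "--out1", f"{out_prefix}_R1.clean.fq.gz",
        "--out2", f"{out_prefix}_R2.clean.fq.gz",
        "--length_required", str(min_len),  # discard reads shorter than 100 bp
        "--json", f"{out_prefix}.fastp.json",
    ]

# Hypothetical input file names for sample HCX-1.
cmd = fastp_command("HCX-1_R1.fq.gz", "HCX-1_R2.fq.gz", "HCX-1")
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run` on a machine where fastp is installed.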
Variant detection
Sentieon [24] was used to detect variant sites for each sample: per-sample gVCF files were generated and then jointly called, yielding variant results for each individual in the population. The resulting SNPs were first subjected to hard filtering on six criteria, and VCFtools (v0.1.16) was then employed for further filtering on two additional criteria, ultimately yielding the final SNP set.
Genetic diversity analysis
The nucleotide diversity ($\pi$) of each population was calculated using VCFtools v0.1.16 by scanning the filtered variant dataset across the whole genome with a 50-kb sliding window and a 10-kb step size. Additionally, we used the homozyg module of PLINK 2 (v2.00a5.10LM) to identify continuous homozygous segments in each individual and assessed inbreeding levels by counting the number and total length of ROH.
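Windowed nucleotide diversity can be understood as the average number of pairwise differences per base. The sketch below uses the standard per-site formula for biallelic SNPs and divides by the window length; it is an illustration of the quantity, not a reproduction of VCFtools' internal implementation, and the allele counts are hypothetical.

```python
def site_pi(alt_count: int, n_chrom: int) -> float:
    """Per-site diversity: chance two sampled chromosomes differ at the SNP."""
    ref_count = n_chrom - alt_count
    return 2 * alt_count * ref_count / (n_chrom * (n_chrom - 1))

def window_pi(alt_counts: list[int], n_chrom: int, window_bp: int) -> float:
    """Sum per-site diversity over the SNPs and divide by window length."""
    return sum(site_pi(c, n_chrom) for c in alt_counts) / window_bp

# Hypothetical window: 3 SNPs in 50 kb, 10 diploid animals (20 chromosomes).
pi = window_pi([10, 4, 1], n_chrom=20, window_bp=50_000)
print(f"{pi:.3e}")
```

Invariant sites contribute zero, which is why only SNPs enter the sum while the full window length enters the denominator.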
Population structure analysis
To investigate the genetic relationships and clustering patterns between the two major camel populations in Eastern and Northern Xinjiang, as well as among camel populations from different farms within the same region, we used high-confidence SNPs obtained from whole-genome sequencing. We performed Principal Component Analysis (PCA), calculated genetic distances using PLINK 2 (v2.00a5.10LM), and constructed a phylogenetic tree using the Neighbor-Joining (NJ) method.
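A pairwise genetic distance of the kind fed into a Neighbor-Joining tree can be sketched as one minus the identity-by-state similarity between two animals. The distance measure is an assumption made here for illustration, since the text does not state which one was used; the genotypes and sample names are hypothetical.

```python
from itertools import combinations

def ibs_distance(g1: list[int], g2: list[int]) -> float:
    """1 - IBS similarity for genotypes coded as 0/1/2 alt-allele counts."""
    n_sites = len(g1)
    return sum(abs(a - b) for a, b in zip(g1, g2)) / (2 * n_sites)

# Hypothetical genotypes for three animals at five SNPs.
genotypes = {
    "north_1": [0, 1, 2, 0, 1],
    "north_2": [0, 1, 2, 1, 1],
    "east_1": [2, 1, 0, 2, 0],
}
dist = {
    (a, b): ibs_distance(genotypes[a], genotypes[b])
    for a, b in combinations(genotypes, 2)
}
print(dist)
```

In this toy example the two northern animals are much closer to each other than either is to the eastern one, which is the pattern a tree-building method would turn into separate clusters.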
Detection of selection signals
We conducted four types of selection signal analyses. First, we used the Composite Likelihood Ratio (CLR) test to detect selection signals within individual populations. We treated 55 camels from the Northern Xinjiang region and 51 camels from the Eastern Xinjiang region as two separate populations and applied SweeD (v4.0.0) to each population to detect within-population selection signals. Additionally, we employed three inter-population selection signal detection methods: Fst (population differentiation index), $\theta_\pi$ (genomic heterozygosity), and XP-EHH (cross-population extended haplotype homozygosity). These methods were used to identify genomic regions associated with the adaptation of different camel populations to diverse ecological environments and the formation of phenotypes related to high levels of conventional milk components. The annotated genes were subjected to Gene Ontology (GO) enrichment analysis using DAVID (https://davidbioinformatics.nih.gov/home.jsp) and KEGG pathway analysis using KOBAS (http://bioinfo.org/kobas/).
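To make the population differentiation index concrete, the sketch below computes Fst for a single biallelic SNP from the allele frequencies in two populations, using the heterozygosity-based definition with equal population weights. This is a simplified illustration; the text does not say which estimator or software was used, and tools commonly apply sample-size-corrected estimators such as Weir and Cockerham's.

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """Heterozygosity-based Fst for one SNP, two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # site is monomorphic across both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total

print(fst_biallelic(0.5, 0.5))  # identical frequencies
print(fst_biallelic(0.0, 1.0))  # fixed for different alleles
print(fst_biallelic(0.2, 0.4))  # moderate frequency difference
```

Values near 0 indicate similar allele frequencies and values near 1 indicate near-complete differentiation at that site.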
Discussion
Differences in nutritional composition of Camel milk from east and north Xinjiang regions
Research suggests that the nutrient and functional substance content in camel milk is influenced by various factors, such as breed [25], lactation period [26], feeding methods, sampling season [27], and genetics. Additionally, the tools and methods used for milk testing and analysis [28] may also affect the measured nutrient content in the milk. In this study, samples were collected from camel farms in Hami City, East Xinjiang, as well as Yining County and Huocheng County in North Xinjiang. The camels in both regions were of the Xinjiang Dzungarian Bactrian camel breed. However, the two regions are approximately 1,500 kilometers apart, with significant differences in geography, climate, and feeding management practices. Therefore, at the outset of the study, we quantified the major milk components in the two regional Bactrian-camel herds. Lactose levels did not differ significantly, whereas fat, protein, and total-solid contents all differed markedly between eastern and northern Xinjiang (p < 0.001).
Molecular marker progress
This study identified eight genes: DUOX1, DUOX2, CPQ, CGA, PLCG1, FYN, GNRHR, and TRHR. Existing research suggests that DUOX1 and DUOX2 are primarily involved in the thyroid hormone synthesis signaling pathway. There is cross-regulation between thyroid hormones and prolactin (PRL), and hypothyroidism can reduce sensitivity to PRL. In dairy cows, thyroid hormone levels (T3/T4) within the normal range are positively correlated with the content of conventional nutrients in milk [29]. This suggests that DUOX1 and DUOX2 may play a role in lactation in Bactrian camels. Although studies on other species (such as humans and Markhor goats) have reported the DUOX1 and DUOX2 genes [30], their specific functions in Bactrian camels have not yet been systematically investigated. To our knowledge, this study is the first to suggest that the DUOX1 and DUOX2 genes may influence the lactation process in Bactrian camels by participating in the thyroid hormone synthesis pathway.
Through transcriptome sequencing and protein interaction network analysis of Bactrian camel populations with and without mastitis, the PLCG1 gene was identified as a major immune core gene in Bactrian camel mastitis [31]. Additionally, the PLC gene family has been shown to play a role in targeted therapies for breast cancer [32]. In this study, the PLCG1 gene was enriched in multiple pathways, with the axon guidance signaling pathway reaching a highly significant level (P < 0.01). In the study of lactation mechanisms in Holstein-Friesian dairy cows, it was found that the axon guidance signaling pathway influences lactation performance [33]. The significance of the PLCG1 gene in the immune core network aligns with existing literature [31]. However, this study further suggests that PLCG1 not only contributes to the immune response to mastitis but may also indirectly influence lactation traits through the axon guidance signaling pathway. This finding is consistent with research in Holstein cows [33] and indirectly supports PLCG1 as a candidate gene for lactation traits in Bactrian camels.
Bahbahani et al. [34] conducted whole-genome sequencing of dromedary camels and found that the FYN gene plays a role in fat metabolism and energy expenditure, consistent with the results of this study. The GNRHR gene, which encodes the gonadotropin-releasing hormone receptor, plays a significant role in mammalian reproduction [35] and regulates the levels of hormones such as estrogen and progesterone. These hormones significantly affect mammary gland development and the regulation of lactation in mammals. The GNRHR gene is therefore likely associated with lactation traits in Bactrian camels, which supports the plausibility of the candidate genes identified in this study. The roles of the CGA and TRHR genes in regulating camel phenotypic traits and their mechanisms of action have not yet been reported.
Pathway selection and analysis in this study
In this study, the pathways enriched for differentially selected candidate genes between the East Xinjiang and North Xinjiang Bactrian camels were mainly axon guidance, neuroactive ligand-receptor interaction, and thyroid hormone synthesis. In mammals, signaling pathways associated with lactation traits include the RAP1, Wnt, mTOR, PI3K-AKT, and MAPK pathways, which are mainly involved in mammary gland development and lactation [36–38].
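As an illustration of what pathway "enrichment" means here, the following minimal sketch computes a one-sided hypergeometric P value for the overlap between a candidate gene list and a single pathway. It is not the enrichment software used in this study. The function name and all counts in the usage example are hypothetical, except for the 459 intersecting candidate genes reported in this study; the background size, pathway size, and overlap are placeholder values chosen only for demonstration.

```python
from math import comb


def hypergeom_enrichment_p(background: int, in_pathway: int,
                           candidates: int, overlap: int) -> float:
    """One-sided hypergeometric P value, P(X >= overlap).

    background : number of annotated genes in the genome (assumed)
    in_pathway : number of background genes annotated to the pathway
    candidates : number of candidate genes under selection
    overlap    : number of candidate genes that fall in the pathway
    """
    if not (0 <= overlap <= min(in_pathway, candidates) <= background):
        raise ValueError("counts are inconsistent")
    total = comb(background, candidates)
    upper = min(in_pathway, candidates)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(in_pathway, k) * comb(background - in_pathway, candidates - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical usage: 459 candidate genes (from this study) against an
# assumed background of 20,000 genes and an assumed 75-gene pathway,
# with an assumed overlap of 8 genes.
p_value = hypergeom_enrichment_p(background=20000, in_pathway=75,
                                 candidates=459, overlap=8)
print(f"enrichment P value: {p_value:.3g}")
```

In practice, P values obtained in this way for many pathways would also need correction for multiple testing before being interpreted as significant.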
Recent omics studies on camel lactation have provided key insights. GLI1, a component of the Hedgehog pathway, is enriched in camel mammary glands, facilitating ductal and alveolar maturation. The differential expression of Wnt pathway genes and their antagonists across developmental stages suggests that both canonical and noncanonical Wnt pathways play a role in regulating epithelial branching and differentiation [15]. These findings are consistent with our study, which indicates that the lactation network in Bactrian camels involves complex neuroendocrine interactions that extend beyond the mechanisms observed in conventional dairy species.
However, compared to dairy cows and other mammals, Bactrian camels exhibit significantly different mammary gland structures [39] and lactation patterns [40]. There are also notable differences in the signaling pathways involved. The findings of this study provide support for this perspective. Additionally, based on previously measured differences in the nutrient content of milk from East and North Xinjiang Bactrian camels, the camels were categorized into two groups. The relatively small sample size of the Bactrian camels selected is a limiting factor. Nonetheless, this study offers valuable biological insights into the unique lactation mechanisms of Bactrian camels and lays the groundwork for further research on the nutritional characteristics of Bactrian camel milk.
Conclusion
This study utilized whole-genome resequencing technology to analyze the lactation-related genetic differences between Bactrian camel populations in East Xinjiang (from a single camel farm) and North Xinjiang (from three camel farms across two counties and cities), identifying a total of 6,451,453 SNPs. PCA clearly separated the two regional clusters, revealing a detectable genetic split between the populations. GO and KEGG profiling pointed to two outlier pathways—plasma-membrane trafficking and thyroid-hormone synthesis—suggesting that the regional difference lies more in hormone tuning and membrane transport than in the core milk-protein/fat machinery. Within the corresponding selection windows we pinpointed eight non-synonymous variants in DUOX1/2, CPQ, CGA, PLCG1, FYN, GNRHR and TRHR; DUOX1/2 and TRHR sit squarely on the thyroid-hormone production line. Taken together, the data imply that divergence in the thyroid–signalling axis is one of the genetic drivers behind the higher milk-fat and total-solid contents observed in the eastern Xinjiang herd. We are now expanding the sample size to test the robustness of these associations, and will build genotype–phenotype regression models to decide whether these loci merit inclusion in future breeding programmes.
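The genotype–phenotype regression mentioned above could, in its simplest form, be an additive single-locus model in which a milk trait is regressed on allele dosage (0, 1, or 2 copies of the alternative allele). The sketch below is only an illustration under that assumption and is not an analysis performed in this study. The function name and the toy dosage and milk-fat values are hypothetical. A realistic model would also need covariates such as farm, parity, and lactation stage, and a correction for relatedness among animals.

```python
from statistics import mean


def additive_effect(dosages, phenotypes):
    """Ordinary least-squares fit of phenotype on allele dosage.

    Returns (slope, intercept, r_squared), where the slope estimates the
    change in the trait per additional copy of the alternative allele.
    """
    if len(dosages) != len(phenotypes) or len(dosages) < 3:
        raise ValueError("need at least three paired observations")
    x_bar, y_bar = mean(dosages), mean(phenotypes)
    sxx = sum((x - x_bar) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("locus is monomorphic in this sample")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosages, phenotypes))
    syy = sum((y - y_bar) ** 2 for y in phenotypes)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Hypothetical toy data: allele dosage at one locus and milk-fat content (%).
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_milk_fat = [5.0, 5.2, 5.5, 5.7, 6.0, 6.2]
slope, intercept, r_squared = additive_effect(toy_dosages, toy_milk_fat)
print(f"effect per allele: {slope:.2f}, intercept: {intercept:.2f}, R^2: {r_squared:.2f}")
```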
Authors’ contributions
Conceptualization, Y. L; methodology, Y. L; software, Y. L; validation, S. L, C. L; formal analysis, Y. L; investigation, S. C; resources, S. C; data curation, H. T; writing—original draft preparation, Y. L; writing—review and editing, S. L, C. L; visualization, Y. L; supervision, W. W; project administration, W. W; funding acquisition, W. W; project administration, J. C. All authors reviewed the manuscript.
Funding
This research was funded by the Xinjiang Provincial 'Open Bidding for Selecting the Best Candidates' Project, the Xinjiang Uygur Autonomous Region Natural Science Foundation General Program [grant number 2025D01A85], the Basic Research Business Fund for Public Welfare Scientific Research Institutes in Xinjiang Uygur Autonomous Region [grant number ky202473], and the Xinjiang Uygur Autonomous Region Academy of Animal Science [grant number 2022TSYCCX0045]. The APC was funded by grant number 2025D01A85.
Data availability
The datasets generated during the current study are available in the GSA repository (GSA Accession Number: CRA034409).
Declarations
Ethics approval and consent to participate
The animal study protocol was approved by the Science and Technology Ethics Committee of Xinjiang Academy of Animal Science (LLSC0026, March 10, 2024); informed consent was also obtained from all owners of the Bactrian camel farms.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Weiwei Wu, Email: wuweiweigp@foxmail.com.
Juncheng Huang, Email: h_jc@sina.com.
References
- 1. Lapidge SJ, Eason CT, Humphrys ST. A review of chemical, biological and fertility control options for the camel in Australia. Rangel J. 2010;32(1):95–115.
- 2. Kn D. History of camel in Indian context. Asian Agri History. 1997;1(3):15–9.
- 3. Mihic T, Rainkie D, Wilby KJ, Pawluk SA. The Therapeutic Effects of Camel Milk: A Systematic Review of Animal and Human Trials. J Evid Based Complement Altern Med. 2016;21(4):Np110-26. 10.1177/2156587216658846.
- 4. Silbermayr K, Orozco-terWengel P, Charruau P, Enkhbileg D, Walzer C, Vogl C, et al. High mitochondrial differentiation levels between wild and domestic Bactrian camels: a basis for rapid detection of maternal hybridization. Anim Genet. 2010;41(3):315–8. 10.1111/j.1365-2052.2009.01993.x.
- 5. Zarrin M, Riveros JL, Ahmadpour A, de Almeida AM, Konuspayeva G, Vargas-Bello-Pérez E, et al. Camelids: new players in the international animal production context. Trop Anim Health Prod. 2020;52(3):903–13. 10.1007/s11250-019-02197-2.
- 6. Alhassani WE. Camel milk: Nutritional composition, therapeutic properties, and benefits for human health. Open Veterinary Journal. 2024;14(12):3164–80. 10.5455/OVJ.2024.v14.i12.2.
- 7. Feng M, Cui L, Liu J, Zhao L. Research progress on the nutritional value of camel milk and its application and mechanism in disease prevention. Food Sci. 2022;43(11):392–401.
- 8. Rasheed Z. Medicinal values of bioactive constituents of camel milk: A concise report. Int J Health Sci (Qassim). 2017;11(5):1–2.
- 9. Khatoon H, Najam R. Bioactive components in camel milk: their nutritional value and therapeutic application. In: Nutrients in Dairy and their Implications on Health and Disease. MA: Academic Press; 2017.
- 10. Selva Muthukumaran M, Mudgil P, Baba WN, Maqsood S. A comprehensive review on health benefits, nutritional composition and processed products of camel milk. Food Rev Intl. 2022;39(12):1–37.
- 11. Seifu E. Recent advances on camel milk: Nutritional and health benefits and processing implications–A review. AIMS Agric Food. 2022;7(4):777–804. 10.3934/agrfood.2022048.
- 12. Behrouz S, Saadat S, Memarzia A, Sarir H, Folkerts G, Boskabady MH. The Antioxidant, Anti-Inflammatory and Immunomodulatory Effects of Camel Milk. Front Immunol. 2022;13:855342. 10.3389/fimmu.2022.855342.
- 13. El-Hatmi H, Girardet JM, Gaillard JL, Yahyaoui MH, Attia H. Characterisation of whey proteins of camel (Camelus dromedarius) milk and colostrum. Small Rumin Res. 2007;70(2):267–71. 10.1016/j.smallrumres.2006.04.001.
- 14. FAOSTAT. Food and Agricultural Organisation, United Nations statistics. 2024. http://faostat.fao.org/. Accessed 3 Jan 2026.
- 15. Yao H, Liang X, Dou Z, Zhao Z, Ma W, Hao Z, et al. Transcriptome analysis to identify candidate genes related to mammary gland development of Bactrian camel (Camelus bactrianus). Front Vet Sci. 2023;10. 10.3389/fvets.2023.1196950.
- 16. Darwish AM, Abdelhafez MA, El-Metwaly HA, Khim JS, Allam AA, Ajarem JS. Genetic divergence of two casein genes and correlated milk traits in Maghrebi camels. Biologia. 2022;77(7):1889–98. 10.1007/s11756-022-01046-2.
- 17. Miao J, Xiao S, Wang J. Comparative Study of Camel Milk from Different Areas of Xinjiang Province in China. Food Sci Anim Resour. 2023;43(4):674–84. 10.5851/kosfa.2023.e27.
- 18. Yang Q, Xu L, Zheng W, Baisanbieke D, Zhu L, Yimamu M, et al. Geographical Variation in the Mineral Profiles of Camel Milk from Xinjiang: Implications for Nutritional Value and Species Identification. Agriculture. 2025;15(20). 10.3390/agriculture15202120.
- 19. Ayalew W, Wu X, Tarekegn GM, Sisay Tessema T, Naboulsi R, Van Damme R, et al. Whole Genome Scan Uncovers Candidate Genes Related to Milk Production Traits in Barka Cattle. Int J Mol Sci. 2024;25(11). 10.3390/ijms25116142.
- 20. Bahbahani H, Musa HH, Wragg D, Shuiep ES, Almathen F, Hanotte O. Genome Diversity and Signatures of Selection for Production and Performance Traits in Dromedary Camels. Front Genet. 2019;10:893. 10.3389/fgene.2019.00893.
- 21. Gubin H, Lixin M, Wusutiannan, Qimude, Lihushan, Liyi, et al. Genome wide association studies for milk nutrition traits in Gobi red Bactrian camel. J Camel Pract Res. 2023;30:273–82.
- 22. Yao H, Pan Z, Ma W, Zhao Z, Su Z, Yang J. Whole-Genome Resequencing Analysis of the Camelus bactrianus (Bactrian Camel) Genome Identifies Mutations and Genes Affecting Milk Production Traits. Int J Mol Sci. 2024;25(14). 10.3390/ijms25147836.
- 23. Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34(17):i884–90. 10.1093/bioinformatics/bty560.
- 24. Kendig KI, Baheti S, Bockol MA, Drucker TM, Hart SN, Heldenbrand JR, et al. Sentieon DNASeq Variant Calling Workflow Demonstrates Strong Computational Performance and Accuracy. Front Genet. 2019;10:736. 10.3389/fgene.2019.00736.
- 25. Alhaj OA, Lajnaf R, Jrad Z, Alshuniaber MA, Jahrami HA, Serag El-Din MF. Comparison of Ethanol Stability and Chemical Composition of Camel Milk from Five Samples. Animals. 2022;12(5):615.
- 26. Oselu S, Ebere R, Arimi JM, Amante E. Camels, Camel Milk, and Camel Milk Product Situation in Kenya in Relation to the World. Int J Food Sci. 2022;2022:1–15. 10.1155/2022/1237423.
- 27. Nagy P, Juhasz J, Reiczigel J, Csaszar G, Kocsis R, Varga L. Circannual changes in major chemical composition of bulk dromedary camel milk as determined by FT-MIR spectroscopy, and factors of variation. Food Chem. 2019;278:248–53. 10.1016/j.foodchem.2018.11.059.
- 28. Li Y, Fan Y, Gao J, Liu L, Cao L, Hu B, et al. Rapid detection and spectroscopic feature analysis of mineral content in camel milk using fourier-transform mid-infrared spectroscopy and traditional machine learning algorithms. Food Control. 2025;169. 10.1016/j.foodcont.2024.110983.
- 29. Miao SJ, Han MT, Li YJ, Luo YZ, Jiang ZD. Study on serum T4 and T3 concentration changes in mid-late lactation period of dairy cows with different milk production levels. Heilongjiang Anim Sci Vet Med. 1992;6:1–3.
- 30. Choi H, Park JY, Kim HJ, Noh M, Ueyama T, Bae Y, et al. Hydrogen peroxide generated by DUOX1 regulates the expression levels of specific differentiation markers in normal human keratinocytes. J Dermatol Sci. 2014;74(1):56–63. 10.1016/j.jdermsci.2013.11.011.
- 31. Ma W, Yao H, Zhang L, Zhang Y, Wang Y, Wang W, et al. Transcriptomics-Based Study of Immune Genes Associated with Subclinical Mastitis in Bactrian Camels. Vet Sci. 2025;12(2). 10.3390/vetsci12020121.
- 32. Asadpour O, Rahbarizadeh F. Phospholipase-Cy1 Signaling Protein Down-Regulation by Oligoclonal-VHHs based Immuno-Liposome: A Potent Metastasis Deterrent in HER2 Positive Breast Cancer Cells. Cell J. 2020;22(1):30–9. 10.22074/cellj.2020.6704.
- 33. Fang C, Zhao Z, Zhu M, Zheng W, Xie M, Qi X, et al. Resequencing screening and validation of differential genes in Holstein cows with different lactation performances. Chin J Anim Sci. 2024;60(1):176–82.
- 34. Bahbahani H, Mohammad Z, Al-Ateeqi A, Almathen F. A comprehensive map of copy number variations in dromedary camels based on whole genome sequence data. Sci Rep. 2024;14(1). 10.1038/s41598-024-77773-0.
- 35. Desaulniers AT, Cederberg RA, Lents CA, White BR. Expression and Role of Gonadotropin-Releasing Hormone 2 and Its Receptor in Mammals. Front Endocrinol. 2017;8. 10.3389/fendo.2017.00269.
- 36. Kim J, Lee JE, Lee JS, Park JS, Moon JO, Lee HG. Phenylalanine and valine differentially stimulate milk protein synthetic and energy-mediated pathway in immortalized bovine mammary epithelial cells. J Anim Sci Technol. 2020;62(2):263–75. 10.5187/jast.2020.62.2.263.
- 37. Du A, Zhao F, Liu Y, Xu L, Chen K, Sun D, et al. Genetic polymorphisms of PKLR gene and their associations with milk production traits in Chinese Holstein cows. Front Genet. 2022;13:1002706. 10.3389/fgene.2022.1002706.
- 38. Mumtaz PT, Bhat B, Ibeagha-Awemu EM, Taban Q, Wang M, Dar MA, et al. Mammary epithelial cell transcriptome reveals potential roles of lncRNAs in regulating milk synthesis pathways in Jersey and Kashmiri cattle. BMC Genomics. 2022;23(1):176. 10.1186/s12864-022-08406-x.
- 39. Annika M, Wernery U, Kinne J, Nagy P, Juhasz J, De Bont M, et al. Ultrasonographic, endoscopic and radiographic examinations of the dromedary mammary glands and teats. Trop Anim Health Prod. 2024;56(5):180. 10.1007/s11250-024-04009-8.
- 40. Musaad A, Faye B, Nikhela AA. Lactation curves of dairy camels in an intensive system. Trop Anim Health Prod. 2013;45(4):1039–46. 10.1007/s11250-012-0331-x.