Abstract
Background
Jingyuan (JY) chicken breed is one of the Chinese regional livestock and poultry genetic resources protection breeds raised in Ningxia, China. The phenotypic characteristics of JY chickens, such as feather color, skin color, and meat quality, exhibit distinct local traits, but the genetic foundation remains unclear.
Results
This study utilized whole-genome resequencing to analyze the genetic structure, genome diversity, selection signals, and genes associated with meat quality, growth and development of the Ningxia JY chicken breed. The results showed that JY chicken exhibited a relatively independent genetic structure compared to other local chicken breeds in China. Selective sweep analysis identified several genes and gene families related to growth and metabolism, such as the FGF family, ANKRD family, PPARα, and HSP family. Through machine learning analysis, seven breed-specific loci were identified in the genome of JY chickens.
Conclusion
JY chickens exhibit relatively low genetic diversity and have distinct genetic backgrounds compared to other chicken breeds studied. These results help us understand the traits and utilization potential of JY chicken germplasm resources.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12864-025-11927-w.
Keywords: Jingyuan chicken, Whole-genome resequencing, Genetic diversity, Selective sweep
Background
The significant and distinct phenotypic variations among commercial laying chickens, broiler chickens, and various local chicken breeds are shaped by natural environments and artificial selection. This phenotypic diversity is especially noticeable in China, where natural conditions vary markedly across regions. Despite the rich and unique genetic resources of local chicken breeds in China, most of these breeds have not undergone long-term scientific and effective artificial intervention and genetic improvement. Ningxia JY chicken, a dual-purpose breed used for both meat and eggs, is mainly distributed in Jingning County (or Jingyuan County), Gansu Province, Ningxia [1]. This breed has a long history of domestication, tracing back to the Han and Tang dynasties of China (https://www.cdad-is.org.cn/admin/Poul.Broiler/view2?id=14) [2]. JY chickens exhibit a medium body type and diverse feather colors [3]. Adult roosters primarily have red or red-black feathers, and adult hens exhibit a more diverse range of colors, such as yellow, black, white, and mixed patterns, with yellow and brown being the most prevalent [4]. Most of them have rose combs, whereas a few have single combs. Despite its long history of rearing, the population of JY chicken has drastically declined due to the unclear genetic background of the breed, insufficient breed development and utilization, and the absence of industrialization. Furthermore, severe inbreeding depression has pushed this breed toward extinction. Therefore, analyzing the genetic variations in JY chicken can contribute to the protection of related genetic resources and the development of superior breeds.
Whole-genome resequencing technology has been widely applied in genetic breeding efforts across various livestock and poultry breeds. Whole-genome re-sequencing data of four Chinese local chicken breeds from different temperature regions were analyzed, and 11.7 million SNPs were identified, including about 2.2 million indels and structural variations (SVs), revealing a south-north domestication trajectory [5]. In sheep genetic studies, the most comprehensive dataset on the genetic structure of global sheep populations to date was established by whole-genome sequencing (WGS) analysis of 810 wild and domesticated sheep, mining beneficial non-synonymous SNP mutations and SVs associated with local adaptive and agronomic traits [6]. Selective sweep analysis can be used to analyze and assess the genetic diversity, genome structure, and gene function of domesticated animals. This analysis helps researchers focus on loci or regions of interest, revealing genetic differences, genome structures, and functions among different individuals, populations, or breeds [7, 8]. When assessing genetic diversity in domesticated animals, selective sweep analysis can improve our understanding of their evolutionary history, adaptability, and potential production traits [9]. TSHR, EPB41L1, and AGMO were identified as major candidate genes determining chicken breed-specific traits through genome-wide selective sweep analysis of 140 chickens from seven local breeds in Shandong Province and 20 introduced recessive white feather chickens [10]. These genes determine reproductive traits, body size, and aggressive behavior, respectively. Furthermore, selective sweep analysis can identify and study functional elements and regulatory regions in the genomes of domesticated animals, aiding researchers in understanding the function of specific genes, gene regulatory networks, and associated phenotypic variations [11].
In the WGS database of Tibetan pigs, positively selected genes involved in high-altitude physiology, such as hypoxia, cardiovascular function, UV damage, and DNA repair, were identified, with three loci strongly associated with the EPAS1, CYP4F2, and THSD7A genes, which are related to hypoxia and circulation [12]. These studies demonstrate the crucial role of selective sweep analysis in animal husbandry research. Selection signature detection refers to the identification of specific genetic variants or regions under selection through the analysis of genomic data. It includes Fixation Index (Fst) analysis, Runs of Homozygosity (ROH) analysis, haplotype structure analysis, and other approaches [13]. By calculating Fst values, researchers pinpointed gene regions with strong selection signals in high-altitude populations, including genes related to oxygen transport, cardiovascular function, antioxidant defense, and metabolism, such as EPAS1, JAZF1, and SPON1 [14, 15]. ROH analysis of Yeonsan Ogye chickens using 600 K SNP arrays suggested the existence of past population bottlenecks and identified 152 annotated genes, some of which were associated with meat production traits and pigmentation in chickens [16]. These studies provide insights into high-altitude adaptations of domestic mammals and evolutionary differences among species.
Machine learning is the use of statistical and optimization methods to analyze and model a large amount of data, find rules and patterns, and realize prediction and decision-making [17]. There are no clear guidelines regarding the sample size for machine learning [18], but the sample size should be much larger than the number of features, which in this study are the breed-specific loci. Machine learning algorithms are the core engine for implementing machine learning. They enable machines to learn from empirical data and improve their performance, thereby converging on the model best suited to the researchers' needs. There are many types of machine learning algorithms. Multiple Linear Regression (MLR) is capable of performing data regression analysis when two or more variables influence the outcome, offering advantages such as fast computation speed and high accuracy [19]. Gradient Boosting Decision Tree (GBDT), as an ensemble learning method, accumulates the results of all trees as the outcome for regression prediction [20]. Support Vector Machines (SVM) perform target classification and regression analysis by mapping points in space [21]. K-Nearest Neighbors (KNN) classifies targets based on the classification of K neighbors. As the most common probability model, Naive Bayes (NB) utilises Bayes’ theorem for data separation based on simple training features [22]. Decision Trees (DT) classify target data based on relevant features, whereas Random Forest (RF) generates a large number of DTs and combines multiple DTs or models to select the best features [23]. Back Propagation Neural Network (BPNN) mimics the operation mode of human neuron activation and transmission to learn and analyze data, enabling comprehensive comparative evaluation [24]. In practical applications, selecting an appropriate algorithm requires comprehensive consideration based on the specific needs of the problem, the characteristics of the data, and computational resources.
In this study, whole-genome re-sequencing was performed on 60 Ningxia JY chickens, and the data were analyzed in conjunction with the genome sequencing data of 17 domestic and foreign standard local chicken breeds from publicly available databases. Due to long-term artificial selection and natural environmental selection, these breeds have developed specific genetic variations, leading to phenotypic differences in growth, reproduction, habits, and other aspects. These differences may be reflected in the variations in their genomes. Based on this theoretical assumption, we conducted a comprehensive analysis of the genetic structure of JY chickens, aiming to identify selective genomic variations/genes associated with muscle development in JY chickens and to obtain unique breed-specific molecular markers for this breed.
Results
Genomic variants in the Ningxia JY chicken
Whole-genome resequencing was performed on 60 individuals of the Ningxia JY chicken breed, and the average sequencing depth was approximately 10× (Figure S1A). An average of 10.72 GB of sequencing data per sample was obtained. Approximately 15.8 million autosomal SNPs were obtained. After removing SNPs with a minor allele frequency < 0.01 or a call rate < 0.9, ~ 10.6 million autosomal SNPs were ultimately retained and used in subsequent analyses. Most of the variations were annotated in introns (57.45%), exons (1.834%), intergenic regions (19.929%), upstream (10.14%), and downstream (9.746%) of genes (Table S1). Chromosome SNP density distribution analysis indicated that the SNPs were evenly distributed on each chromosome, except at the telomeres of some chromosomes. Chromosomes 1 and 2 had the highest densities of SNPs (Figure S1B).
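The filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the thresholds (minor allele frequency 0.01, call rate 0.9) come from the text, while the function name `passes_filters`, the genotype coding (0/1/2 alternate-allele counts, `None` for a missing call), and the toy SNPs are assumptions made for the example.

```python
def passes_filters(genotypes, min_maf=0.01, min_call_rate=0.9):
    """Return True if a SNP passes the call-rate and MAF thresholds.

    genotypes: alternate-allele counts per individual (0, 1 or 2),
    with None marking a missing call (hypothetical coding).
    """
    called = [g for g in genotypes if g is not None]
    if not genotypes or not called:
        return False
    call_rate = len(called) / len(genotypes)
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)  # frequency of the rarer allele
    return call_rate >= min_call_rate and maf >= min_maf


# Toy SNPs (invented for illustration only)
snps = {
    "snp_common": [0, 1, 2, 1, 0, 1, 1, 2, 0, 1],
    "snp_monomorphic": [0] * 10,                       # MAF = 0, removed
    "snp_sparse": [0, 1, None, None, 2, 1, 0, 1, 1, 0],  # call rate 0.8, removed
}
kept = [name for name, calls in snps.items() if passes_filters(calls)]
print(kept)
```

In practice this kind of filter is usually applied with a dedicated genotype-processing tool; the sketch only makes the two criteria explicit.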
Population genetic structure and genetic diversity
To quantify the genetic differentiation and population structure, principal component analysis (PCA), admixture analysis, and phylogenetic tree analysis were performed. We analyzed 296 chickens from 18 chicken breeds (Fig. 1A), which were grouped into five clusters in the neighbor-joining (NJ) phylogenetic tree. The similar contribution rates of PC1 (13.63%) and PC2 (11.32%) suggested that the genetic differentiation among chicken populations is multidimensional, reflecting both artificial selection and large-scale geographic isolation. The first cluster comprised 39 Red jungle fowl (RJFt) individuals from Thailand and India, representing the ancestor of the domestic chicken. The second cluster comprised the Chinese indigenous breeds, including Wenchang chickens (WC) (10), Hetian chickens (HT) (10), Ningdu Yellow chickens (ND) (10), and other indigenous chickens. The third and fourth clusters included the commercial layer breeds (RIR & WL: Rhode Island Red and White Leghorn) (17) and the broiler sire lines (BRA & BRB) (20), and the fifth cluster contained all 60 JY chickens (Fig. 1B). NJ analysis of population evolution found that the JY population was highly clustered (Fig. 1C). Based on the PCA results, we assumed that the number of ancestral populations of these breeds was at most 17 (the commercial broiler lines were assumed to share one ancestral population). Admixture was used to trace the population genetic structure, assuming K ancestral populations, with K ranging from 1 to 17. When K = 5, the five different colors indicated genetically differentiated populations consistent with the PCA results (Figure S2). Cross-validation (CV) error was used to select the best range of K values, and the error was lower for K in the range of 7–10 than for other K values. When K = 9, the error value was the lowest, and the genetic differences between breeds were well reflected (Fig. 1D).
The JY breed had the purest genetic background compared to other groups.
Fig. 1.
Genetic relationships and population structure between Ningxia JY chicken and other chicken breeds. A Map shows the geographic distribution of the sampling location. B PCA of 18 chicken breeds. C NJ phylogenetic tree of 18 breeds. D Admixture analysis across 16 chicken populations. Proportions of genetic ancestry for 16 chicken populations with K = 7–10 (K represents the number of inferred ancestral populations. Different colors represent assumed ancestors). JY Jingyuan chickens; GS Gushi chickens; ND Ningdu Yellow chickens; JH Jianghan chickens; WC Wenchang chickens; HY Huiyang bearded chickens; YC Guangxi Yao chickens; HL Huanglang chickens; HT Hetian chickens; HX Huaixiang chickens; HM Huaibei partridge chickens; ZSY Zhengyang sanhuang chickens; XB Xichuan black-bone chickens; BRA & BRB broiler sire lines A and B; RIR & WL layer parental lines, Rhode Island Red and White Leghorn; RJFt Red jungle fowl chickens
Genetic diversity, including heterozygosity and nucleotide diversity, was assessed in 10 individuals randomly selected from each of 16 populations representing the 18 chicken breeds. The results showed that the observed heterozygosity (Ho) of the Layer population was the lowest, followed by that of JY, and that the nucleotide diversity (π) of both populations was among the lowest (Table 1). For JY, the value of Ho was lower than that of expected heterozygosity (He). Such shifts in genotype frequencies are often caused by selection, genetic drift, migration, or nonrandom mating in the population. The low Ho and π of JY suggest genetic drift resulting from long-term rearing in small populations in different regions.
Table 1.
The genetic diversity estimates for Jingyuan chickens and other chicken breeds
| Population | Sample size | Ho | He | π |
|---|---|---|---|---|
| Broiler | 10 | 0.317192 | 0.338660 | 0.000339 |
| GS | 10 | 0.348789 | 0.341195 | 0.000341 |
| HL | 10 | 0.317416 | 0.318813 | 0.000437 |
| HM | 10 | 0.308183 | 0.301519 | 0.000413 |
| HT | 10 | 0.312825 | 0.310169 | 0.000426 |
| HX | 10 | 0.309663 | 0.312538 | 0.000428 |
| HY | 10 | 0.30234 | 0.310429 | 0.000421 |
| JH | 10 | 0.283421 | 0.297974 | 0.000407 |
| JY | 10 | 0.243739 | 0.336125 | 0.000341 |
| Layer | 10 | 0.175901 | 0.351169 | 0.000279 |
| ND | 10 | 0.304466 | 0.309704 | 0.000425 |
| RJF | 10 | 0.264832 | 0.348208 | 0.000364 |
| WC | 10 | 0.307888 | 0.315151 | 0.00043 |
| XB | 10 | 0.340781 | 0.336917 | 0.000413 |
| YC | 10 | 0.310652 | 0.312307 | 0.000429 |
| ZSY | 10 | 0.299041 | 0.298136 | 0.000409 |
Ho, observed heterozygosity; He, expected heterozygosity; π, nucleotide diversity
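For readers unfamiliar with the two heterozygosity measures in Table 1, the following sketch shows one common way to compute them from biallelic genotypes, together with the standard summary F = 1 − Ho/He. The source does not state which software or exact formula was used, so the functions `heterozygosity` and `inbreeding_coefficient`, the genotype coding, and the toy loci are assumptions for illustration; only the JY values passed in at the end are taken from Table 1.

```python
def heterozygosity(loci):
    """Mean observed and expected heterozygosity across biallelic loci.

    loci: list of per-locus genotype lists, each genotype coded as the
    alternate-allele count (0, 1 or 2). Hypothetical input format.
    """
    ho_values, he_values = [], []
    for genotypes in loci:
        n = len(genotypes)
        p = sum(genotypes) / (2 * n)            # alternate-allele frequency
        ho_values.append(sum(1 for g in genotypes if g == 1) / n)
        he_values.append(2 * p * (1 - p))       # Hardy-Weinberg expectation
    return sum(ho_values) / len(ho_values), sum(he_values) / len(he_values)


def inbreeding_coefficient(ho, he):
    """F = 1 - Ho/He; positive values indicate a heterozygote deficit."""
    return 1 - ho / he


# Toy loci (invented): two SNPs typed in four birds
toy_ho, toy_he = heterozygosity([[0, 1, 1, 2], [0, 0, 1, 1]])

# Applying the summary to the JY row of Table 1 (Ho = 0.243739, He = 0.336125)
f_jy = inbreeding_coefficient(0.243739, 0.336125)
print(toy_ho, toy_he, round(f_jy, 3))
```

A positive F for JY is in line with the heterozygote deficit noted in the text; the derived value is an arithmetic illustration and is not a result reported by the study.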
Genetic diversity and differentiation
ROH are contiguous stretches of a chromosome in which an individual is homozygous at consecutive loci. The genomic inbreeding coefficient, FROH, calculated using ROH, can reflect the history of inbreeding and the impact of selection pressure. We compiled the ROH and FROH for 18 different chicken breeds. HL chickens had the fewest and shortest ROH, while BRA and BRB had intermediate ROH values among all breeds. For example, BRA had an average ROH number of 22.5 and an average ROH length of about 750 kb, which was consistent with a low inbreeding background. In terms of the number and average length of ROH, the JY breed was in the upper middle range among all breeds (number > 50 and length > 1000 kb), implying that small-group breeding or conservation led to the accumulation of inbreeding (Fig. 2A, B). Compared to the native breeds, the RJF, layer, and broiler breeds had higher levels of FROH. Among the 13 native chicken breeds, the JY chicken had the highest FROH (~ 0.3) (Fig. 2C).
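The relationship between ROH segments and FROH can be made concrete with a short sketch. It uses the common definition of FROH as the summed ROH length divided by the length of the autosomal genome considered; the source does not spell out its exact formula, so this definition, the function names, the segment coordinates, and the genome length below are assumptions for illustration.

```python
def roh_summary(roh_segments):
    """Number and mean length (bp) of ROH segments for one individual.

    roh_segments: list of (start_bp, end_bp) tuples (hypothetical format).
    """
    lengths = [end - start for start, end in roh_segments]
    mean_length = sum(lengths) / len(lengths) if lengths else 0.0
    return len(lengths), mean_length


def froh(roh_segments, autosome_length_bp):
    """Genomic inbreeding coefficient: fraction of the autosomes inside ROH."""
    total_roh = sum(end - start for start, end in roh_segments)
    return total_roh / autosome_length_bp


# Toy individual (invented numbers): two ROH on a 1 Mb "genome"
segments = [(0, 100_000), (300_000, 500_000)]
n_roh, mean_len = roh_summary(segments)
f_roh = froh(segments, autosome_length_bp=1_000_000)
print(n_roh, mean_len, f_roh)
```

Under this definition, a breed-level FROH of about 0.3 would mean that roughly 30% of the autosomal genome lies within ROH on average.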
Fig. 2.
Runs of homozygosity and linkage disequilibrium of JY chicken and other chicken breeds. A–C Statistical for runs of homozygosity and genomic inbreeding coefficient of different breeds. D Decay of linkage disequilibrium in the 18 chicken breeds. JY Jingyuan chickens; GS Gushi chickens; ND Ningdu Yellow chickens; JH Jianghan chickens; WC Wenchang chickens; HY Huiyang bearded chickens; YC Guangxi Yao chickens; HL Huanglang chickens; HT Hetian chickens; HX Huaixiang chickens; HM Huaibei partridge chickens; ZSY Zhengyang sanhuang chickens; XB Xichuan black-bone chickens; BRA & BRB broiler sire lines A and B; RIR & WL layer parental lines, Rhode Island Red and White Leghorn; RJFt Red jungle fowl chickens
Genome-wide linkage disequilibrium (LD) in each breed was estimated as the physical genomic distance at which the genotypic association (r²) decays to less than half of its maximum value. We observed the fastest rate of LD decay in JY and the slowest rate of LD decay in ND chickens (Fig. 2D). RJF, Gushi chickens (GS), and Xichuan black-bone chickens (XB) exhibited fast to moderate rates of LD decay, and Huiyang bearded chickens (HY), HT chickens, and Jianghan chickens (JH) showed moderately slow rates (Fig. 2D). These results indicated that JY chicken, as a local chicken breed, did not undergo high-intensity artificial selection like other chicken breeds during the domestication process.
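The decay criterion stated above (the distance at which r² falls below half of its maximum) can be sketched as a simple scan over a binned LD curve. The function name `ld_half_decay_distance` and the toy curve are assumptions for illustration; real curves would come from pairwise r² values averaged within distance bins.

```python
def ld_half_decay_distance(curve):
    """Return the first distance at which mean r^2 drops below half its maximum.

    curve: list of (distance_kb, mean_r2) pairs sorted by increasing distance
    (hypothetical input format). Returns None if the threshold is never reached.
    """
    max_r2 = max(r2 for _, r2 in curve)
    for distance, r2 in curve:
        if r2 < max_r2 / 2:
            return distance
    return None


# Toy LD decay curve (invented values)
toy_curve = [(1, 0.60), (5, 0.45), (10, 0.32), (20, 0.28), (50, 0.20)]
half_decay = ld_half_decay_distance(toy_curve)
print(half_decay)
```

A shorter half-decay distance corresponds to faster LD decay, which is the pattern the text reports for JY relative to the other breeds.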
Genome-wide selective sweep signals for meat quality traits in JY
To further investigate genes involved in the selection for meat quality traits, we conducted a genome-wide selection signal analysis comparing JY chickens to broiler chickens (BRA & BRB). A total of 5000 target SNP loci were obtained through selective sweep analysis. Based on both the reduction of diversity (ROD) and Fst statistical methods, the top 5% of genome regions were considered potential selection signal regions. A total of 3977 genes were identified from the ROD analysis, and 2526 genes were identified from the Fst analysis (Fig. 3A & B, Figure S3). A total of 2149 overlapping selected genes were identified in JY by intersecting the gene sets from the two statistical methods. These genes were widely distributed across chromosomes, with many candidate genes located on chromosomes 1–9 (Fig. 3C, Table S3). Further, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses of the candidate selected genes enriched 255 and 620 genes, respectively, and 126 genes were shared between the GO and KEGG analyses. These candidate selected genes were mainly associated with structural molecule activity, neuron projection, cell morphogenesis, cancer, and the MAPK signaling pathway (Fig. 4A & B). Cluster analysis found that cancer-related pathways, the oxytocin signaling pathway, the insulin signaling pathway, and long-term depression were closely connected (Fig. 4C). Among the identified candidate genes were those from the FGF family, which stimulate or maintain specific cell functions through the fibroblast growth factor receptor (FGFR)-mediated signaling axis, crucial for metabolism, tissue homeostasis, and development. Other genes identified include IGF-1, IGF2BP3, and members of the ANKRD family, which promote muscle, fat, and bone growth.
Furthermore, several genes related to energy expenditure and metabolism were identified, such as the SLC amino acid transporter superfamily, PPARα, and DHTKD1. The immune-related CYP gene family, the HSP gene family, and the EML gene family (the last of which is involved in the regulation of microtubule interactions and cell division) were also identified in the JY chicken genome. Interestingly, the CEP128 gene, which plays a role in sperm development, was also among the candidate genes. Together, these genes play pivotal roles in regulating cellular functions and biological processes.
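The window-based scan described in this section can be sketched as follows. The 40 kb window, 10 kb step, and top 5% cutoff on both statistics are taken from the text and the Fig. 3 legend; how Fst and ROD were computed per window is not detailed in this section, so the per-window values below are invented, and the function names are hypothetical.

```python
def sliding_windows(chrom_length, window=40_000, step=10_000):
    """Start/end coordinates of full-length windows along one chromosome."""
    return [(start, start + window)
            for start in range(0, chrom_length - window + 1, step)]


def top_fraction(values, fraction=0.05):
    """Indices of the windows whose statistic lies in the top `fraction`."""
    k = max(1, round(len(values) * fraction))
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(ranked[:k])


def candidate_windows(fst, rod, fraction=0.05):
    """Windows that are outliers for both Fst and ROD."""
    return top_fraction(fst, fraction) & top_fraction(rod, fraction)


windows = sliding_windows(100_000)          # toy 100 kb chromosome

# Toy per-window statistics for 40 windows (invented values)
fst = [0.1] * 40
rod = [0.0] * 40
fst[3], fst[17] = 0.8, 0.9
rod[17], rod[30] = 0.7, 0.6
selected = candidate_windows(fst, rod)
print(len(windows), selected)
```

Requiring a window to be an outlier for both statistics is stricter than using either alone, which matches the intersection step that yielded the overlapping gene set in the text.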
Fig. 3.
Whole genome scanning of the selection signatures between JY and Broiler chickens. A, B Selection signals distribution of Fst and ROD calculated for 40 kb windows sliding in 10 kb steps. C Signals intersection by Fst and ROD. Blue points represent JY selected genomic regions with both an extremely high Z (Fst) value (top 5% level) and ROD value (top 5% level)
Fig. 4.
Functional enrichment analysis of selected genes. A GO enrichment analysis. B KEGG enrichment analysis. C Cluster analysis of enrichment pathways or biological process subclasses, colored by cluster ID, where nodes that share the same cluster ID are typically close to each other
Screening and identification of breed-identification loci for JY chickens
The top 20 loci were identified through genetic differentiation index analysis (Fig. 5A). These loci were considered important features and were extracted for model building. Eight different machine learning algorithms, including MLR, SVM, KNN, NB, DT, RF, GBDT, and BPNN, were used to model breed identification of JY chickens based on these 20 loci. As the number of SNPs increased, the accuracy of breed identification for JY chickens also improved for all algorithms except NB. Because cost-effective breed identification requires high accuracy with a minimal number of loci, and SVM gave more stable predictions than the other algorithms once the number of SNPs reached 7 (Fig. 5B), SVM was chosen as the optimal algorithm for further research. The SVM model identified seven molecular markers specific to JY chickens: chr2:88999342, chr2:137016590, chr4:22037922, chr8:13527512, chr20:9365087, chr24:150425, and chr27:6092287 (Fig. 5C).
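The idea of tracking identification accuracy as loci are added can be illustrated with a deliberately simple stand-in. The sketch below uses a 1-nearest-neighbour classifier (KNN is one of the eight algorithms listed above) with leave-one-out evaluation, chosen only because it needs no external library; it is not the SVM model selected in the study, and the genotypes, labels, and function names are invented for the example.

```python
def manhattan(a, b):
    """Sum of absolute genotype differences between two individuals."""
    return sum(abs(x - y) for x, y in zip(a, b))


def loo_accuracy(samples, n_loci):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that uses only the first n_loci loci of each genotype vector."""
    correct = 0
    for i, (label, geno) in enumerate(samples):
        others = [s for j, s in enumerate(samples) if j != i]
        nearest = min(others,
                      key=lambda s: manhattan(geno[:n_loci], s[1][:n_loci]))
        correct += nearest[0] == label
    return correct / len(samples)


# Toy genotypes (alternate-allele counts at three loci, invented)
samples = [
    ("JY", [2, 1, 0]), ("JY", [2, 2, 0]), ("JY", [1, 1, 0]), ("JY", [2, 1, 1]),
    ("N-JY", [0, 0, 2]), ("N-JY", [0, 0, 2]), ("N-JY", [1, 0, 2]), ("N-JY", [0, 0, 1]),
]
accuracies = [loo_accuracy(samples, m) for m in (1, 2, 3)]
print(accuracies)
```

On this toy data the accuracy rises as loci are added, mirroring the trend described for most algorithms in the text; the numbers themselves carry no biological meaning.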
Fig. 5.
Screening and identification of characteristic loci for JY chicken. A Manhattan diagram of genetic differentiation indices for JY chicken. B Accuracy of different algorithms for different numbers of SNPs. C Importance of different SNPs under the SVM model. D Genetic polymorphism of seven SNP loci among diverse chicken breeds. PIC, Polymorphism Information Content. E KASP genotyping results of seven SNP loci across diverse chicken breeds. F Genotype combinations of seven SNP loci in each individual. G ROC curve and AUC of SVM model for category discrimination. MLR Multiple Linear Regression; SVM Support Vector Machines; KNN K-Nearest Neighbors; NB Naive Bayes; DT Decision Trees; RF Random Forest; BPNN Back Propagation Neural Network; GBDT Gradient Boosting Decision Tree. JY Jingyuan chickens; DX Dongxiang chickens; CS Changshun chickens; XB Xichuan black-bone chickens; GS Gushi chickens; HLB Hailan brown laying chickens; RM Rohman chickens; RS308 Ross 308 broilers; HBD Hubbard broilers; Cobb Cobb broilers; AA Arbor Acres broilers; N-JY chickens that are not of the JY breed.
Subsequently, these seven loci were genotyped in 613 individuals representing various chicken breeds using Kompetitive Allele Specific PCR (KASP) genotyping. After excluding individuals with unsuccessful genotyping, genotype statistics for these seven loci across 11 breeds, including Arbor Acres broilers (AA), Cobb, Hubbard broiler (HBD), Ross 308 broiler (RS308), Roman hens (RM), Hy-Line Brown (HLB), GS, XB, Changshun chicken (CS), Dongxiang chickens (DX), and JY, were recorded in Table 2. Statistical analysis of the polymorphism of these seven SNPs across different breeds revealed that the probability of a sample belonging to the JY chicken breed was highest when SNP1 to SNP6 exhibited moderate polymorphism and SNP7 showed low polymorphism (Fig. 5D; Table 3).
Table 2.
Statistical results of genotype distribution of 7 SNPs in 613 samples
| Breed | SNP1 | SNP2 | SNP3 | SNP4 | SNP5 | SNP6 | SNP7 | Total number |
|---|---|---|---|---|---|---|---|---|
| AA | 37:35:13 | 86:0:0 | 0:0:85 | 84:0:0 | 14:53:19 | 85:0:0 | 86:0:0 | 86 |
| Cobb | 32:39:9 | 75:4:0 | 0:0:80 | 79:0:0 | 3:51:26 | 78:0:0 | 80:0:0 | 80 |
| HBD | 9:24:11 | 40:6:0 | 0:0:45 | 45:1:0 | 6:21:19 | 46:0:0 | 45:1:0 | 46 |
| GS | 44:27:1 | 57:12:3 | 13:35:24 | 67:5:0 | 38:32:2 | 34:37:1 | 54:15:3 | 72 |
| XB | 24:35:9 | 61:6:0 | 0:16:51 | 60:8:0 | 54:14:0 | 37:30:1 | 36:20:11 | 68 |
| JY | 2:26:67 | 36:37:22 | 45:34:16 | 36:44:13 | 14:38:43 | 15:50:29 | 89:5:0 | 95 |
| RM | 18:20:1 | 32:8:0 | 0:0:40 | 37:2:0 | 28:12:0 | 32:8:0 | 17:20:1 | 40 |
| HLB | 21:42:8 | 60:12:0 | 0:6:64 | 57:18:0 | 37:31:5 | 62:10:0 | 60:7:3 | 75 |
| CS | 7:10:3 | 18:2:0 | 3:1:16 | 20:0:0 | 16:3:1 | 15:15:0 | 19:1:0 | 20 |
| DX | 7:5:7 | 17:3:0 | 0:3:16 | 17:1:1 | 16:3:1 | 18:2:0 | 17:3:0 | 20 |
| RS | 3:7:1 | 10:0:0 | 0:0:11 | 11:0:0 | 1:8:2 | 11:0:0 | 10:0:0 | 11 |
The total number of genotypes at different mutation loci in the same breeds is different because individual loci were not successfully typed
Table 3.
Distribution of polymorphisms of the seven SNP sites in different breeds of chickens, expressed as Polymorphism Information Content (PIC)
| Breed | SNP1 | SNP2 | SNP3 | SNP4 | SNP5 | SNP6 | SNP7 |
|---|---|---|---|---|---|---|---|
| AA | 0.460 | 0.000 | 0.000 | 0.000 | 0.498 | 0.000 | 0.000 |
| Cobb | 0.459 | 0.049 | 0.000 | 0.000 | 0.459 | 0.000 | 0.000 |
| HBD | 0.499 | 0.122 | 0.000 | 0.022 | 0.460 | 0.000 | 0.022 |
| GS | 0.322 | 0.219 | 0.488 | 0.067 | 0.375 | 0.395 | 0.249 |
| XB | 0.476 | 0.086 | 0.210 | 0.111 | 0.185 | 0.360 | 0.430 |
| JY | 0.266 | 0.489 | 0.453 | 0.469 | 0.453 | 0.489 | 0.052 |
| RM | 0.405 | 0.180 | 0.000 | 0.050 | 0.255 | 0.180 | 0.411 |
| HLB | 0.483 | 0.153 | 0.082 | 0.211 | 0.404 | 0.129 | 0.168 |
| CS | 0.480 | 0.095 | 0.289 | 0.000 | 0.219 | 0.219 | 0.049 |
| DX | 0.500 | 0.139 | 0.145 | 0.145 | 0.219 | 0.095 | 0.139 |
| RS | 0.483 | 0.000 | 0.000 | 0.000 | 0.496 | 0.000 | 0.000 |
PIC, polymorphism information content; PIC < 0.25 was considered low polymorphism and 0.25 < PIC < 0.5 moderate polymorphism
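To show how the values in Table 3 relate to the genotype counts in Table 2, the sketch below computes a polymorphism measure from the three genotype counts at a biallelic locus. Two assumptions should be noted: the counts in Table 2 are read as homozygote : heterozygote : homozygote, and the quantity 1 − Σp² is used because it reproduces several tabulated values (for example JY SNP1 and AA SNP1); the source does not state its formula. The classical PIC of Botstein and colleagues subtracts an additional 2p²q² term and is included for comparison. Function names are hypothetical.

```python
def allele_frequencies(n_hom_a, n_het, n_hom_b):
    """Allele frequencies (p, q) from genotype counts at a biallelic locus."""
    n = n_hom_a + n_het + n_hom_b
    p = (2 * n_hom_a + n_het) / (2 * n)
    return p, 1 - p


def gene_diversity(n_hom_a, n_het, n_hom_b):
    """1 - sum of squared allele frequencies (equals 2pq for two alleles)."""
    p, q = allele_frequencies(n_hom_a, n_het, n_hom_b)
    return 1 - p**2 - q**2


def botstein_pic(n_hom_a, n_het, n_hom_b):
    """Classical PIC for a biallelic locus: 1 - p^2 - q^2 - 2 p^2 q^2."""
    p, q = allele_frequencies(n_hom_a, n_het, n_hom_b)
    return 1 - p**2 - q**2 - 2 * p**2 * q**2


# Genotype counts taken from Table 2
jy_snp1 = gene_diversity(2, 26, 67)     # JY, SNP1
aa_snp1 = gene_diversity(37, 35, 13)    # AA, SNP1
jy_snp7 = gene_diversity(89, 5, 0)      # JY, SNP7
print(round(jy_snp1, 3), round(aa_snp1, 3), round(jy_snp7, 3))
```

With these inputs the rounded results agree with the corresponding Table 3 entries (0.266, 0.460, and 0.052), which suggests, without confirming, that the tabulated values follow the 1 − Σp² form.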
Statistical analysis of the genotypes of the seven loci in JY chickens and in chickens not from the JY breed (N-JY) revealed a distinct pattern. SNP1 and SNP7 primarily exhibited two genotypes among JY chickens: mutant heterozygote and wild-type homozygote. Conversely, all three genotypes were present in N-JY chickens. SNP2 to SNP6 exhibited all three genotypes in JY chickens, whereas N-JY chickens primarily demonstrated only two genotypes (Fig. 5E). To further investigate the potential of these seven SNPs as molecular markers for breed discrimination, we analyzed the mutation combinations across all individuals of different breeds. Notably, the genotype combinations of these seven SNPs showed minimal overlap between JY chickens and other breeds. Using the SVM model, we identified four combinations of five SNP loci (1 + 2 + 4 + 6 + 7/1 + 2 + 5 + 6 + 7/1 + 3 + 4 + 6 + 7/1 + 4 + 5 + 6 + 7) that exhibited the highest breed identification probability of 99.25%. When all seven SNP loci were considered together, the probability of correctly identifying JY chickens was 98.51% (Fig. 5F, Table S4). The receiver operating characteristic (ROC) curve analysis further validated the accuracy of JY chicken breed identification, achieving a high area under the curve (AUC) of 0.9938 (Fig. 5G). This finding indicates that the combination of these seven SNP loci has the potential to serve as a reliable molecular marker for identifying JY chickens.
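Two pieces of arithmetic behind this paragraph can be sketched briefly: enumerating the possible five-locus panels drawn from seven SNPs, and computing an AUC from classifier scores. The AUC function uses the rank-based (Mann-Whitney) formulation, which is a standard equivalent of the area under the ROC curve; the scores and labels are invented, and nothing here reproduces the study's SVM outputs.

```python
from itertools import combinations


def roc_auc(labels, scores):
    """AUC as the probability that a positive sample outscores a negative one
    (ties count one half). labels: 1 = JY, 0 = N-JY (hypothetical coding)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# All five-locus panels that can be drawn from SNP1..SNP7
panels = list(combinations(range(1, 8), 5))

# The four best-performing panels reported in the text
reported = [(1, 2, 4, 6, 7), (1, 2, 5, 6, 7), (1, 3, 4, 6, 7), (1, 4, 5, 6, 7)]

# Toy classifier scores (invented)
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(len(panels), all(p in panels for p in reported), auc)
```

There are 21 possible five-locus panels, and all four reported panels include SNP1, SNP6, and SNP7.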
Discussion
JY chickens are primarily reared through free-range farming, which contributes to their robust meat quality [25]. Although JY chickens are tolerant of cold and drought conditions, this farming method has also left their genetic and evolutionary background unclear. This study is the first to conduct whole-genome re-sequencing on JY chickens. Through genetic evolution analysis, genome variation analysis, selective sweep analysis, and other analytical methods, we have comprehensively investigated the genetic background of JY chickens. We have initially identified genes associated with their growth, metabolism, and meat quality. Additionally, by utilizing machine learning models, we have identified a combination of seven SNPs that can preliminarily distinguish JY chickens. This lays the groundwork for applying breed-specific molecular markers for JY chickens.
There are numerous examples in the study of local breeds where geographical and cultural factors have contributed to breed specificity. For instance, in the comparative genomic study of indigenous and commercial chickens in Iran, some candidate genes related to adaptation to arid environments and activation of the immune response were identified, explaining why indigenous chickens in desert regions show better adaptability to hot climates [26]. Tan sheep, unique to the ecological environment of Ningxia, exhibit characteristics such as drought resistance, salt and alkali tolerance, and tolerance of coarse grains, demonstrating great adaptability to deserts, semi-deserts, and arid grasslands [27, 28]. In this study, a high-density variation map of the JY chicken genome was first constructed using 10× genome re-sequencing data. The reference genome version used in this study was Ensembl Gallus_gallus_6.0. In this version, the chicken autosomes included chromosomes 1–28 and 30–33, totaling 32 autosomes, with no chromosomes 29 or 34–38. Although chromosome 16 appears to have very few SNPs in the density plot, it contains 12,750 SNPs. This appearance is probably an artifact of the color scale and the short length of the chromosome, which make chromosome 16 look almost devoid of SNPs in the figure. Comparative genomic analysis of different breeds has revealed the unique genetic structure of JY chickens. The JY samples were highly concentrated in the PCA results, showing high homogeneity and low genetic diversity, while the RJF samples were more widely dispersed along PC1 and formed two clusters, consistent with the RJF populations from India and Thailand, respectively. LD patterns in populations reveal evolutionary forces such as genetic bottlenecks and selection signatures [29, 30]. JY chicken had the lowest LD value and the shortest decay distance, indicating that this breed did not undergo high-intensity positive selection during domestication.
This pattern also tends to be present in wild animal populations. Therefore, JY maintained, to a certain extent, a complex genetic background like that of a wild population. Their genetic background appears to have diverged significantly from that of other Chinese local chicken breeds, with minimal introgression and admixture. This unique genetic profile may be attributed to the specific geographical location of Ningxia and the local Hui culture, which have likely exerted selective pressures on the evolution and breeding practices of JY chickens [31]. Therefore, studying the unique genetic evolution patterns of animals in different regions is significant for the preservation of local genetic resources and the advancement of cultural research.
Heterozygosity refers to the proportion of individuals carrying two different alleles at a locus. He is the expected heterozygosity calculated from allele frequencies in a population under Hardy-Weinberg equilibrium, and Ho is the proportion of heterozygous genotypes actually observed in the population. Genetic diversity evaluation results indicated that the Ho of JY chickens was lower than the He and was the second lowest among the populations studied, after the Layer population (Table 1). This may be because the JY chicken breeding area is situated on a plateau where people primarily engage in grazing activities, leading to the development of several small, isolated populations with restricted genetic exchange. This limited gene flow raises the possibility of genetic drift, leading to a reduction in genetic diversity within the JY population. In addition, the adaptability of the population to the natural environment and the absence of high-intensity artificial selection breeding are also potential factors that cause significant differences in genetic background between JY chickens and other breeds, especially commercial breeds. Through genome-wide selective sweep analysis, we identified genes related to the growth and development, as well as energy metabolism, of JY chickens. These genes included the FGF gene family, IGF-1, IGF2BP3, ANKRD family, SLC superfamily, HSP family, PPARα, and DHTKD1. These results support our hypothesis that the genomic selection signatures of JY chickens in Ningxia bear traces of both geographical adaptation and artificial selection, giving the breed distinct genomic characteristics. As is well known, the FGF gene family, IGF-1, IGF2BP3, and ANKRD family all play important roles in regulating growth and development. FGF was discovered by Armelin and Gospodarowicz [32, 33] in the 1970s.
Subsequent studies have shown that it plays a crucial role in the growth, development, and injury repair of the central nervous system [34–36]. IGF-1 is a growth factor that has been studied for its role in bone remodeling processes, stimulating osteoblast replication and bone matrix synthesis [37–39]. IGF2BP3 and the ANKRD family genes not only play roles in diseases and tumors but also have significant functions in normal embryonic development and in vertebrate growth and development [40–44]. For example, ANKRD9 can participate in intracellular lipid accumulation [45]. Members of the SLC gene superfamily are involved in the transport of substances into and out of the cell, including amino acids, carbohydrates, lipids, and inorganic salt ions, and are widely present in various cells of the body [46, 47]. PPARα, a well-studied gene that regulates metabolism, can directly control the transcription of genes involved in peroxisomal and mitochondrial fatty acid oxidation (FAO) pathways, fatty acid (FA) uptake, and triglyceride (TG) catabolism to regulate intracellular lipid metabolism [48, 49]. The deletion of PPARα can slow down metabolism and exacerbate the development of renal fibrosis [50]. Similarly, many studies have found that DHTKD1 plays an important role in cellular metabolism [51, 52], and the deletion of DHTKD1 can cause metabolic disorders [53]. The HSP gene family, whose members are associated with heat shock, has been reported in several studies of breeds from tropical arid regions, such as Isfahan and Lari native chickens from Iran and Peloco and Caneluda native chickens from Brazil [26, 54]. In our study, HSPA4, HSPA14 and HSPB2 were identified in the genome of JY chickens. This result is consistent with the results of genetic analyses of indigenous chickens in Iran and Brazil. It is suggested that the genes related to reproductive performance of JY chickens also have the potential for further research.
In this study, selective sweep analysis revealed that the aforementioned genes were all under selection in JY chickens, suggesting their potential advantages in growth performance and meat quality traits. This may be attributed to Ningxia’s unique geographical environment contributing to their elevated metabolic rates.
Machine learning is an interdisciplinary field that draws on probability theory, statistics, algorithm complexity theory, and other domains, and it plays a crucial role in various applications. Among the machine learning paradigms, supervised learning is a fundamental one and consists of two main stages: training and testing. During the training stage, a model learns the mapping between inputs and outputs by observing training data and continuously adjusting its parameters. During the testing stage, the model is evaluated on unseen data to verify that it can accurately predict the output labels of new input samples. For example, a model trained on labelled handwritten digit images can then classify new digit images that were not part of the training set [55]. Apart from its applications in computer science, economics, and medicine, this paradigm has also been widely reported in biological and animal husbandry research [56–58]. Commonly used algorithms include SVM, KNN, MLR, NB, DT, RF, GBDT, and BPNN. In this study, we compared these machine learning models and found SVM to be the most suitable for discriminating the SNP locus characteristics of JY chickens. We identified seven significant candidate loci for JY chicken breed identification. We then used KASP genotyping technology to genotype different chicken breeds, including JY chickens, and validated the SVM model. The ROC curve was used to evaluate the prediction model’s performance, demonstrating a prediction accuracy of 0.99. This comprehensive analysis demonstrated that these seven SNP loci can effectively achieve breed identification of JY chickens.
Conclusion
In this study, we conducted a whole-genome sequencing analysis of the Ningxia JY chicken breed and used machine learning models to identify loci that can distinguish JY chickens. JY chickens exhibit relatively low genetic diversity and have distinct genetic backgrounds compared to the other chicken breeds studied. Several candidate genes related to growth and metabolism were found in the genome of JY chickens. The SVM model was the most suitable for the identification of JY chickens, leading to the discovery of seven SNP loci that could be applied for breed identification. Overall, our study provides valuable information on the genetic variations in the genome of Ningxia JY chickens during their long-term domestication process. It also demonstrates the efficient application of machine learning in the conservation of local breeds, which will help protect endangered species in the future.
Methods
Sample collection and sequencing
The 60 blood samples of Ningxia Jingyuan chickens were collected and provided by the Ningxia Feed Engineering Technology Research Center. DNA samples from all other chicken breeds involved in this study were obtained from the chicken DNA sample repository of the Henan Key Laboratory for Innovation and Utilization of Chicken Germplasm Resources. Genomic DNA was extracted from blood samples using the TianGen DNA Kit (DP 341, Tiangen Biochemical Technology, Beijing, China). The quantity and quality of the genomic DNA were assessed using a NanoDrop 2000 spectrophotometer (NanoDrop Inc., Wilmington, DE, USA) and agarose gel electrophoresis. The 60 samples were subjected to whole-genome resequencing on the Illumina NovaSeq 6000 platform at LC-BIO TECHNOLOGIES (HANGZHOU) CO., LTD. (Hangzhou, China). This study’s sequence data have been deposited in the National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/) with the accession code PRJNA1099329.
To analyze and evaluate the genetic background of JY chickens more objectively and comprehensively, the WGS data of 236 chickens from another 17 chicken breeds from different regions were retrieved from published datasets [59–61] (Table S5). These comprised 29 GS chickens (from central China), 10 ZSY chickens (from central China), 31 XB chickens (from central China), 10 Ningdu Yellow chickens (ND) (from southern China), 10 Jianghan chickens (JH) (from southern China), 10 Wenchang chickens (WC) (from southern China), 10 Huiyang bearded chickens (HY) (from southern China), 10 Guangxi Yao chickens (YC) (from southern China), 10 Huanglang chickens (HL) (from southern China), 10 HT chickens (from southern China), 10 HX chickens (from southern China), 10 Huaibei partridge chickens (HM) (from central China), 20 broiler sire lines (BRA & BRB), 17 layer parental lines (RIR & WL), and 39 RJFt chickens (from Thailand & India).
Whole genome data processing and variant calling
After the raw sequencing reads were obtained, adapter sequences, polyN, polyA, and other low-quality sequences were filtered out. The filtered valid reads were then aligned to the Gallus_gallus_6.0 reference genome (http://ftp.ensembl.org/pub/release-106/fasta/gallus_gallus/dna/) [62] using BWA [63], resulting in BAM-formatted alignment data. Subsequently, the alignment data were sorted using SAMtools [64] and duplicate reads were marked using Picard. Sequencing coverage statistics were generated using BEDtools (v2.29.2, Aaron R. Quinlan, USA). Variant calling was performed using GATK (3.8, Aaron McKenna, USA) software [65]. Single-nucleotide polymorphisms (SNPs) were filtered using GATK’s VariantFiltration with the criteria “QD < 2.0 || MQ < 40.0 || FS > 60.0 || SOR > 3.0 || MQRankSum < -12.5” to exclude SNPs with distorted segregation or sequencing errors. Finally, using VCFtools (v0.1.16, Adam Auton, UK), 9,861,819 SNPs were retained with a minor allele frequency > 0.05 and a max-missing threshold of 0.8 (i.e., sites genotyped in at least 80% of samples) [66]. SNP annotation information was retrieved using the Ensembl genome database and the SnpEff (v4.1, Pablo Cingolani, USA) program [67]. The related command scripts are shown in Supplementary Material 3.
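The filtering logic described above can be illustrated with a short sketch. It is not the authors' pipeline (their commands are in Supplementary Material 3); the function names, the dictionary layout of the annotations, and the treatment of a missing MQRankSum value are assumptions made only for illustration.

```python
# Illustrative sketch of the quoted hard filter and the MAF / missingness filter.
# Function names and input layouts are hypothetical, not taken from GATK or VCFtools.

def fails_hard_filter(ann):
    """Return True if a SNP's annotations match any clause of the quoted expression."""
    return (
        ann["QD"] < 2.0
        or ann["MQ"] < 40.0
        or ann["FS"] > 60.0
        or ann["SOR"] > 3.0
        # Assumption: a SNP lacking MQRankSum does not trigger this clause.
        or ann.get("MQRankSum", 0.0) < -12.5
    )


def passes_population_filter(genotypes, min_maf=0.05, min_call_rate=0.8):
    """genotypes: alternate-allele counts per sample (0, 1, 2), None if missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    call_rate = len(called) / len(genotypes)
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)  # minor allele frequency
    return maf > min_maf and call_rate >= min_call_rate
```

A SNP would be kept only if `fails_hard_filter` returns `False` and `passes_population_filter` returns `True`.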
Phylogenetic tree, PCA and admixture
Using MEGA X software (v10.1.5), a neighbor-joining phylogenetic tree was constructed with the Kimura 2-parameter model and 1000 bootstrap replications to determine the phylogenetic relationships [68]. To understand the relationship between different geographical populations, PCA was performed on all SNP loci using EIGENSOFT software (v7.2.1) [69]. ADMIXTURE (v1.3.0) software was used to infer population structure through a maximum likelihood model, quantifying the genome-wide admixture among multiple chicken populations (BRA & BRB as the broiler population and RIR & WL as the layer population) [70]. The number of genetic clusters, K, ranged from 1 to 17, with the maximum number of iterations set to 10,000. The related command scripts are shown in Supplementary Material 3.
Heterozygosity, runs of homozygosity and linkage disequilibrium
To describe genetic diversity, PLINK (v1.90, Shaun Purcell, USA) was used to calculate indicators related to genetic differentiation, including heterozygosity, ROH, and LD [71]. VCFtools v0.1.16 was used to estimate genome-wide nucleotide diversity for each breed [66]. Using the PLINK --het output, Ho was calculated as 1 - (the number of observed homozygous loci divided by the number of non-missing loci), and He was estimated as 1 - (the number of expected homozygous loci divided by the number of non-missing loci). These values were averaged across all SNPs to determine Ho and He estimates for each sample within each breed. In addition, the inbreeding coefficient of each sample was estimated based on ROH.
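The Ho and He calculation described above can be sketched in a few lines. The function names and the tuple layout are hypothetical; the three inputs correspond to the observed homozygous count, expected homozygous count, and non-missing locus count that the text refers to.

```python
def heterozygosity_from_counts(obs_hom, exp_hom, n_nonmissing):
    """Return (Ho, He) for one sample from homozygous-locus counts."""
    if n_nonmissing <= 0:
        raise ValueError("need at least one non-missing locus")
    ho = 1 - obs_hom / n_nonmissing   # observed heterozygosity
    he = 1 - exp_hom / n_nonmissing   # expected heterozygosity
    return ho, he


def breed_means(samples):
    """samples: list of (obs_hom, exp_hom, n_nonmissing), one tuple per individual."""
    pairs = [heterozygosity_from_counts(*s) for s in samples]
    n = len(pairs)
    mean_ho = sum(p[0] for p in pairs) / n
    mean_he = sum(p[1] for p in pairs) / n
    return mean_ho, mean_he
```

A breed whose mean Ho falls below its mean He, as reported for JY chickens, has fewer heterozygotes than expected under random mating.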
Long homozygous segments were scanned using PLINK (v1.90, Shaun Purcell, USA). The specific parameters used to assess homozygosity were a sliding window of 50 SNPs along the chromosome; a maximum of one heterozygous SNP and five missing SNPs allowed per sliding window; a minimum ROH length of 100 kb; a minimum density of one SNP per 50 kb; and a maximum gap between consecutive SNPs of 1000 kb. Following McQuillan et al. [72], the inbreeding coefficient (FROH) for each breed was determined using the following formula:
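$$F_{ROH} = \frac{\sum L_{ROH}}{L_{AUTO}}$$

where $\sum L_{ROH}$ is the total length of all ROH in an individual’s genome.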
LAUTO refers to the length of the autosomal genome spanned by the SNP loci, which in this study was 960,280 kb. To determine the LD among the 18 breeds, PopLDdecay [73] was used based on the squared correlation coefficient (r2) between pairs of loci. The related command scripts are shown in Supplementary Material 3.
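As a worked illustration of the ROH-based inbreeding coefficient, the sketch below divides the summed ROH length of one individual by the autosomal length quoted in the text. The function name and the input list are hypothetical; the 100 kb minimum mirrors the PLINK setting described above.

```python
L_AUTO_KB = 960_280  # autosomal length spanned by SNPs, as stated in the text


def f_roh(roh_lengths_kb, l_auto_kb=L_AUTO_KB, min_len_kb=100):
    """F_ROH for one individual: total ROH length over autosomal length."""
    kept = [length for length in roh_lengths_kb if length >= min_len_kb]
    return sum(kept) / l_auto_kb


# Example: ROH totalling 96,028 kb give F_ROH of about 0.1;
# the 50 kb segment is below the minimum length and is ignored.
example = f_roh([60_000, 36_028, 50])
```

Breed-level values can then be obtained by averaging the per-individual coefficients.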
Selective sweep analysis
A sliding-window approach (40-kb windows with 10-kb steps) was used to determine the genome-wide distribution of Fst values and θπ ratios between the 60 JY chickens and 20 commercial broiler chickens (BRA & BRB) to identify regions that may have undergone directional selection in JY chickens. BRA and BRB are commercial broilers bred by companies in the USA and France, respectively, and their sequencing data were obtained under accession number PRJEB30270 [59]. The ROD value was estimated from the ratio of π in a subpopulation to that in a control subpopulation. Fst values were Z-transformed, and θπ ratios were log2-transformed. These statistics were calculated using VCFtools [66]. Windows with top 5% values for Fst and log2 ratio were considered potential candidate regions containing strongly selected genes. Finally, the candidate genes were subjected to GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis using Metascape (https://metascape.org/gp/index.html#/main/step1; accessed on 11 November 2022). The related command scripts are shown in Supplementary Material 3.
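The window scan described above can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the π ratio is taken as reference over target (so that reduced diversity in the target gives a high value), and a window is called a candidate only when it lies in the top 5% of both statistics. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def sliding_windows(chrom_len, size=40_000, step=10_000):
    """Yield half-open (start, end) windows; trailing windows are truncated."""
    start = 0
    while start < chrom_len:
        yield start, min(start + size, chrom_len)
        start += step


def z_transform(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def top_fraction_cutoff(values, frac=0.05):
    """Smallest value that still belongs to the top `frac` of the distribution."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * frac))
    return ranked[k - 1]


def candidate_windows(fst, pi_ref, pi_target, frac=0.05):
    """Indices of windows in the top `frac` for both Z(Fst) and log2(pi ratio)."""
    zfst = z_transform(fst)
    log2_ratio = [math.log2(r / t) for r, t in zip(pi_ref, pi_target)]
    z_cut = top_fraction_cutoff(zfst, frac)
    r_cut = top_fraction_cutoff(log2_ratio, frac)
    return [i for i, (z, r) in enumerate(zip(zfst, log2_ratio))
            if z >= z_cut and r >= r_cut]
```

Genes overlapping the returned windows would then be passed to the enrichment step.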
Screening of variety molecular markers and KASP typing verification
To genetically analyze the JY chicken in the context of other chicken breeds, the resequencing data of JY chickens from this study, along with publicly available resequencing data and 600 K chip data from 3,536 chickens, were used for genetic differentiation index analysis [60, 74–76] (Table S6). VCFtools (v0.1.16) was used to calculate the genetic differentiation index between N-JY and JY chickens.
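To make the idea of ranking loci by differentiation concrete, the sketch below computes a simple per-locus Fst from allele frequencies in the two groups. It uses the basic (HT - HS) / HT form with equal group weights as a stand-in; it is not the estimator implemented in VCFtools, and the function names and example frequencies are hypothetical.

```python
def simple_fst(p1, p2):
    """Per-locus Fst from one allele's frequency in two equally weighted groups."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                      # total expected heterozygosity
    if h_t == 0:
        return 0.0                                     # locus is monomorphic overall
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-group value
    return (h_t - h_s) / h_t


def rank_loci(freqs):
    """freqs: dict of locus -> (freq in JY, freq in N-JY); most differentiated first."""
    return sorted(freqs, key=lambda locus: simple_fst(*freqs[locus]), reverse=True)
```

Loci at the top of such a ranking are natural candidates for breed-specific markers, which is the role the differentiation index plays in the screening step.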
Eight different machine learning classification methods were employed to construct training models, including MLR, SVM, KNN, NB, DT, RF, BPNN, and GBDT. The Python 3.6 programming language was used to implement these models.
KASP genotyping technology is well suited to large-scale, multi-locus genotyping. To apply this technology, specific KASP primer sequences were designed for the candidate SNP loci and synthesized by LGC Genomics. PCR amplification was performed using DNA templates from 613 individuals representing known chicken breeds. The amplified products were then analyzed at the FAM and VIC fluorescence wavelengths, enabling SNP genotyping based on the fluorescent signals (Beijing Vegetable Research Center, China). Spots where no fluorescent signal was detected were filtered out, and the success rate was calculated. The detection success rate for HL was 93.71%, and the success rate for the other breeds was more than 97%, indicating that the KASP results could be used for subsequent analysis (Table S7). The genotype distribution of each locus across different breeds was statistically analyzed to identify reliable molecular markers specific to the JY chicken. The KASP primer sequences are provided in Table S8. Polymorphism information content (PIC) was used to measure the degree of polymorphism at loci:
![]() |
Here, pi represents the frequency of the ith allele.
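For illustration, PIC can be computed from allele frequencies as sketched below. The sketch assumes the widely used Botstein-style definition, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j²; whether this exact form was used in the study is an assumption, and the function name is hypothetical.

```python
def pic(allele_freqs):
    """Polymorphism information content under the assumed Botstein-style definition."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    n = len(allele_freqs)
    homozygosity = sum(p * p for p in allele_freqs)
    cross = sum(
        2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1 - homozygosity - cross
```

For a biallelic SNP the value is largest when both alleles are equally frequent, where this definition gives 0.375.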
The genotypes of the seven SNPs in different breeds of chickens were analyzed using the SVM algorithm. In the SVM algorithm, the target variable was set to “is JY”, and the chickens were labelled 0 or 1 according to breed (0: N-JY, 1: JY). To evaluate the performance of the machine learning classification models, ROC curves were constructed. These curves measure the diagnostic ability of a binary classifier as its discrimination threshold is varied. Individuals with known genotypes and breeds were used to validate the top-scoring training model, and the resulting data were used to generate the ROC curves. By analyzing the ROC curves, we can assess the sensitivity and specificity of the models, which are critical in determining the accuracy and reliability of the classification system. This validation process was essential for ensuring that the models can accurately distinguish between different chicken breeds, particularly the JY breed, based on the identified SNP markers. The related command scripts are shown in Supplementary Material 3.
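The ROC evaluation described above can be sketched without any external library. The sketch takes the 0/1 breed labels (0: N-JY, 1: JY) and a classifier score per individual, sweeps the threshold from high to low, and integrates the curve with the trapezoidal rule. The function names and the example scores are hypothetical; in practice the scores would come from the trained SVM.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points of the ROC curve."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all individuals tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

An area close to 1 indicates that JY individuals almost always receive higher scores than N-JY individuals, while 0.5 corresponds to random guessing.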
Acknowledgements
The authors would like to thank the Ningxia Feed Engineering Technology Research Center for providing the blood samples and all the students of the Henan Key Laboratory for Innovation and Utilization of Chicken Germplasm Resources for their contribution to sample collection and processing.
Abbreviations
- JY: Jingyuan
- SVs: Structural variations
- WGS: Whole-genome sequencing
- MLR: Multiple Linear Regression
- GBDT: Gradient Boosting Decision Tree
- SVM: Support Vector Machines
- KNN: K-Nearest Neighbors
- NB: Naive Bayes
- DT: Decision Trees
- RF: Random Forest
- BPNN: Back Propagation Neural Network
- GS: Gushi
- ND: Ningdu Yellow
- JH: Jianghan
- WC: Wenchang
- HY: Huiyang bearded
- YC: Guangxi Yao
- HL: Huanglang
- HT: Hetian
- HX: Huaixiang
- HM: Huaibei partridge
- ZSY: Zhengyang sanhuang
- XB: Xichuan black-bone
- BRA & BRB: Broiler sire lines A & B
- RIR: Rhode Island Red
- WL: White Leghorn
- RJFt: Red jungle fowl
- Ho: Observed heterozygosity
- He: Expected heterozygosity
- ROH: Runs of Homozygosity
- LD: Linkage disequilibrium
- ROD: Reduction of diversity
- GO: Gene Ontology
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- KASP: Kompetitive Allele Specific PCR
- N-JY: Not from the JY breed
- PIC: Polymorphism information content
- ROC: Receiver operating characteristic
- FGFR: Fibroblast growth factor receptor
- AUC: Area under the curve
- AA: Arbor Acres broilers
- HBD: Hubbard broiler
- RS308: Ross 308 broiler
- RM: Roman hens
- CS: Changshun
- DX: Dongxiang
Author contributions
Y.T., Z.J., X.K. and X.L. conceived the idea and supervised the project; Y.Y., Z.Z., Y.L. and B.Z. performed the experiments; Y.Y. and X.Z. performed the data analysis; Z.L. and H.L. helped to analyze the results; K.W., X.L. and W.L. helped to process the data; Y.Y. wrote the manuscript; Y.T. and X.L. revised the manuscript. All authors read and approved the final manuscript.
Funding
This research was funded by grants from the Key Research and Development Special Projects of Henan Province (241111113600) and the Key R&D Program of the Ningxia Hui Autonomous Region (2022BBF02034). The funding bodies had no role in study design or in any aspect of the collection, analysis and interpretation of data or in writing the manuscript.
Data availability
This study’s sequence data have been deposited in the National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/) with the accession code PRJNA1099329.
Declarations
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Ethics Statement
All animal experiments were performed according to the Regulations for the Administration of Affairs Concerning Experimental Animals (Ministry of Science and Technology, China, 2004). The blood samples of all JY chickens were provided by the Ningxia Feed Engineering Technology Research Center and granted to us for research purposes. All the blood samples were collected from the brachial veins of chickens by standard venipuncture, and the whole procedure was performed following the Guidelines for Experimental Animals established by the Ministry of Agriculture of China (Beijing, China). The protocols and guidelines were approved by the Institutional Animal Care and Use Committee of Henan Agricultural University, China. The study was carried out in compliance with the ARRIVE guidelines. No animals were euthanized or sacrificed in this study.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Ya-Xin Yue and Xin Zhang have contributed equally to this work.
References
- 1. Jian-ming Z, Xi L, Yin-di L, Ya-lin D, Ya-ling Z, Jian-ju Z, Jun-hui L. The Present Situation and Countermeasures of Jingyuan Chicken Industry Development (in Chinese). Journal of Animal Husbandry and Veterinary Medicine. 2019;38(05):34–36.
- 2. Institute of Animal Husbandry CAoAS. Chinese domestic breeds [chicken]. 1982 edn. Beijing Institute of Animal Science, Chinese Academy of Agricultural Sciences; 1982.
- 3. Lixia M, Guowei C, Hongfang Z, Zhanzhao D, Zhengyun C, Chenghao Z, Wei H, Yaling G, Juan Z. Analysis of genetic variation in a conserved population of Jingyuan chickens based on RAD-seq (in Chinese). Chin J Anim Veterinary Sci. 2022;53(07):2104–17.
- 4. E E H H, DING W. Development and utilization of local breed chicken in Ningxia (in Chinese). Anim Husb Feed Sci. 2013;34(7–8):106–7.
- 5. Shi S, Shao D, Yang L, Liang Q, Han W, Xue Q, Qu L, Leng L, Li Y, Zhao X, et al. Whole genome analyses reveal novel genes associated with chicken adaptation to tropical and frigid environments. J Adv Res. 2023;47:13–25.
- 6. Lv FH, Cao YH, Liu GJ, Luo LY, Lu R, Liu MJ, Li WR, Zhou P, Wang XH, Shen M, et al. Whole-Genome resequencing of worldwide wild and domestic sheep elucidates genetic diversity, introgression, and agronomically important loci. Mol Biol Evol. 2022;39(2).
- 7. Wang Z, Chen Q, Liao R, Zhang Z, Pan Y. Genome-wide genetic variation discovery in Chinese Taihu pig breeds using next generation sequencing. Anim Genet. 2017;48(1):38–47.
- 8. Bosse M, Megens HJ, Madsen O, Paudel Y, Frantz LA, Schook LB, Crooijmans RP, Groenen MA. Regions of homozygosity in the Porcine genome: consequence of demography and the recombination landscape. PLoS Genet. 2012;8(11):e1003100.
- 9. Wang GD, Xie HB, Peng MS, Irwin D, Zhang YP. Domestication genomics: evidence from animals. Annu Rev Anim Biosci. 2014;2:65–84.
- 10. Wang J, Lei QX, Cao DG, Zhou Y, Han HX, Liu W, Li DP, Li FW, Liu J. Whole genome SNPs among 8 chicken breeds enable identification of genetic signatures that underlie breed features. J Integr Agric. 2023;22(7):2200–12.
- 11. Stephan W. Selective sweeps. Genetics. 2019;211(1):5–13.
- 12. Ma YF, Han XM, Huang CP, Zhong L, Adeola AC, Irwin DM, Xie HB, Zhang YP. Population genomics analysis revealed origin and High-altitude adaptation of Tibetan pigs. Sci Rep. 2019;9(1):11463.
- 13. Walsh JB. Genomic selection signatures and animal breeding. J Anim Breed Genet. 2021;138(1):1–3.
- 14. Verma P, Sharma A, Sodhi M, Thakur K, Kataria RS, Niranjan SK, Bharti VK, Kumar P, Giri A, Kalia S, et al. Transcriptome analysis of Circulating PBMCs to understand mechanism of high altitude adaptation in native cattle of Ladakh region. Sci Rep. 2018;8(1):7681.
- 15. Wu DD, Yang CP, Wang MS, Dong KZ, Yan DW, Hao ZQ, Fan SQ, Chu SZ, Shen QS, Jiang LP, et al. Convergent genomic signatures of high-altitude adaptation among domestic mammals. Natl Sci Rev. 2020;7(6):952–63.
- 16. Kim J, Kim M, Cho E, Lee SS, Kim S, Jin D, Lee JH. Analysis of runs of homozygosity in Yeonsan Ogye chickens using 600K single nucleotide polymorphism arrays. J Anim Sci Technol. 2025;67(3):520–35.
- 17. Biswas A, Saran I, Wilson FP. Introduction to supervised machine learning. Kidney360. 2021;2(5):878–80.
- 18. Vabalas A, Gowen E, Poliakoff E, Casson AJ. Machine learning algorithm validation with a limited sample size. PLoS ONE. 2019;14(11):e0224365.
- 19. Joo C, Park H, Kwon H, Lim J, Shin E, Cho H, Kim J. Machine Learning Approach to Predict Physical Properties of Polypropylene Composites: Application of MLR, DNN, and Random Forest to Industrial Data. Polymers (Basel). 2022;14(17).
- 20. Zhou CM, Wang Y, Ye HT, Yan S, Ji M, Liu P, Yang JJ. Machine learning predicts lymph node metastasis of poorly differentiated-type intramucosal gastric cancer. Sci Rep. 2021;11(1):1300.
- 21. Hosseini MP, Nazem-Zadeh MR, Mahmoudi F, Ying H, Soltanian-Zadeh H. Support vector machine with nonlinear-kernel optimization for lateralization of epileptogenic hippocampus in MR images. Annu Int Conf IEEE Eng Med Biol Soc. 2014;2014:1047–50.
- 22. Rish I. An empirical study of the naive Bayes classifier. In: IJCAI-01 Workshop on Empirical Methods in AI; 2001.
- 23. Karalis G. Decision trees and applications. Adv Exp Med Biol. 2020;1194:239–42.
- 24. Zhang SZ, Chen S, Jiang H. A back propagation neural network model for accurately predicting the removal efficiency of ammonia nitrogen in wastewater treatment plants using different biological processes. Water Res. 2022;222:118908.
- 25. Yu B, Liu J, Cai Z, Mu T, Gu Y, Xin G, Zhang J. miRNA-mRNA associations with inosine monophosphate specific deposition in the muscle of Jingyuan chicken. Br Poult Sci. 2022;63(6):821–32.
- 26. Asadollahpour Nanaei H, Kharrati-Koopaee H, Esmailizadeh A. Genetic diversity and signatures of selection for heat tolerance and immune response in Iranian native chickens. BMC Genomics. 2022;23(1):224.
- 27. Kang X, Liu G, Liu Y, Xu Q, Zhang M, Fang M. Transcriptome profile at different physiological stages reveals potential mode for Curly fleece in Chinese Tan sheep. PLoS ONE. 2013;8(8):e71763.
- 28. Xu X, Wei X, Yang Y, Niu W, Kou Q, Wang X, Chen Y. PPARγ, FAS, HSL mRNA and protein expression during Tan sheep fat-tail development. Electron J Biotechnol. 2015;18(2):122–7.
- 29. Szpiech ZA, Xu J, Pemberton TJ, Peng W, Zöllner S, Rosenberg NA, Li JZ. Long runs of homozygosity are enriched for deleterious variation. Am J Hum Genet. 2013;93(1):90–102.
- 30. Good BH. Linkage disequilibrium between rare mutations. Genetics. 2022;220(4).
- 31. Rucai LU. Tradition and Modernity–The life of the Hui people in Ningxia. China Today. 2008;57(11):18–22.
- 32. Armelin HA. Pituitary extracts and steroid hormones in the control of 3T3 cell growth. Proc Natl Acad Sci U S A. 1973;70(9):2702–6.
- 33. Gospodarowicz D, Handley HH. Stimulation of division of Y1 adrenal cells by a growth factor isolated from bovine pituitary glands. Endocrinology. 1975;97(1):102–7.
- 34. Nawrocka D, Krzyscik MA, Opaliński Ł, Zakrzewska M, Otlewski J. Stable fibroblast growth factor 2 dimers with high Pro-Survival and mitogenic potential. Int J Mol Sci. 2020;21(11).
- 35. Khosravi F, Ahmadvand N, Bellusci S, Sauer H. The multifunctional contribution of FGF signaling to cardiac development, homeostasis, disease and repair. Front Cell Dev Biol. 2021;9:672935.
- 36. Murugaiyan K, Amirthalingam S, Hwang NS, Jayakumar R. Role of FGF-18 in bone regeneration. J Funct Biomater. 2023;14(1).
- 37. Nicholls AR, Holt RI. Growth hormone and Insulin-Like growth Factor-1. Front Horm Res. 2016;47:101–14.
- 38. Bhalla S, Mehan S, Khan A, Rehman MU. Protective role of IGF-1 and GLP-1 signaling activation in neurological dysfunctions. Neurosci Biobehav Rev. 2022;142:104896.
- 39. Xi G, Chiu WC, Wah CJP, Chiu LT, Xiaoyan C. Expression of vascular endothelial growth factor A (VEGFA), placental growth factor (PlGF) and insulin-like growth factor 1 (IGF-1) in serum from women undergoing frozen embryo transfer. Hum Fertil (Cambridge England). 2022;26(5):11–11.
- 40. Huang H, Weng H, Sun W, Qin X, Shi H, Wu H, Zhao BS, Mesquita A, Liu C, Yuan CL, et al. Recognition of RNA N(6)-methyladenosine by IGF2BP proteins enhances mRNA stability and translation. Nat Cell Biol. 2018;20(3):285–95.
- 41. Ren F, Lin Q, Gong G, Du X, Dan H, Qin W, Miao R, Xiong Y, Xiao R, Li X, et al. Igf2bp3 maintains maternal RNA stability and ensures early embryo development in zebrafish. Commun Biol. 2020;3(1):94.
- 42. Vong YH, Sivashanmugam L, Leech R, Zaucker A, Jones A, Sampath K. The RNA-binding protein Igf2bp3 is critical for embryonic and germline development in zebrafish. PLoS Genet. 2021;17(7):e1009667.
- 43. Pan Z, Zhao R, Li B, Qi Y, Qiu W, Guo Q, Zhang S, Zhao S, Xu H, Li M, et al. EWSR1-induced circNEIL3 promotes glioma progression and exosome-mediated macrophage immunosuppressive polarization via stabilizing IGF2BP3. Mol Cancer. 2022;21(1):16.
- 44. Jing X, Han C, Li Q, Li F, Zhang J, Jiang Q, Zhao F, Guo C, Chen J, Jiang T, et al. IGF2BP3-EGFR-AKT axis promotes breast cancer MDA-MB-231 cell growth. Biochim Biophys Acta Mol Cell Res. 2023;1870(8):119542.
- 45. Hayward D, Kouznetsova VL, Pierson HE, Hasan NM, Guzman ER, Tsigelny IF, Lutsenko S. ANKRD9 is a metabolically-controlled regulator of IMPDH2 abundance and macro-assembly. J Biol Chem. 2019;294(39):14454–66.
- 46. Schaller L, Lauschke VM. The genetic landscape of the human solute carrier (SLC) transporter superfamily. Hum Genet. 2019;138(11–12):1359–77.
- 47. Rebsamen M, Girardi E, Sedlyarov V, Scorzoni S, Papakostas K, Vollert M, Konecka J, Guertl B, Klavins K, Wiedmer T, et al. Gain-of-function genetic screens in human cells identify SLC transporters overcoming environmental nutrient restrictions. Life Sci Alliance. 2022;5(11).
- 48. Montaigne D, Butruille L, Staels B. PPAR control of metabolism and cardiovascular functions. Nat Rev Cardiol. 2021;18(12):809–23.
- 49. Kim G, Chen Z, Li J, Luo J, Castro-Martinez F, Wisniewski J, Cui K, Wang Y, Sun J, Ren X, et al. Gut-liver axis calibrates intestinal stem cell fitness. Cell. 2024;187(4):914–930.e20.
- 50. Chung KW, Lee EK, Lee MK, Oh GT, Yu BP, Chung HY. Impairment of PPARα and the fatty acid oxidation pathway aggravates renal fibrosis during aging. J Am Soc Nephrol. 2018;29(4):1223–37.
- 51. Plubell DL, Fenton AM, Wilmarth PA, Bergstrom P, Zhao Y, Minnier J, Heinecke JW, Yang X, Pamir N. GM-CSF driven myeloid cells in adipose tissue link weight gain and insulin resistance via formation of 2-aminoadipate. Sci Rep. 2018;8(1):11485.
- 52.Xu WY, Shen Y, Zhu H, Gao J, Zhang C, Tang L, Lu SY, Shen CL, Zhang HX, Li Z, et al. 2-Aminoadipic acid protects against obesity and diabetes. J Endocrinol. 2019;243(2):111–23. [DOI] [PubMed] [Google Scholar]
- 53.Xu WY, Zhu H, Shen Y, Wan YH, Tu XD, Wu WT, Tang L, Zhang HX, Lu SY, Jin XL et al. DHTKD1 deficiency causes Charcot-Marie-Tooth disease in mice. Mol Cell Biol 2018, 38(13). [DOI] [PMC free article] [PubMed]
- 54.Cedraz H, Gromboni JGG, Garcia AAPJ, Farias Filho RV, Souza TM, Oliveira ER, Oliveira EB, Nascimento CSD, Meneghetti C, Wenceslau AA. Heat stress induces expression of HSP genes in genetically divergent chickens. PLoS ONE. 2017;12(10):e0186083. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Razaghi-Moghadam Z, Nikoloski Z. Supervised learning of gene-regulatory networks based on graph distance profiles of transcriptomics data. NPJ Syst Biol Appl. 2020;6(1):21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Rubiolo M, Milone DH, Stegmayer G. Extreme learning machines for reverse engineering of gene regulatory networks from expression time series. Bioinformatics. 2018;34(7):1253–60. [DOI] [PubMed] [Google Scholar]
- 57.Yin C, He Z, Wang Y, He X, Zhang X, Xia M, Zhai D, Chang K, Chen X, Chen X, et al. Improving the regional Y-STR haplotype resolution utilizing haplogroup-determining Y-SNPs and the application of machine learning in Y-SNP haplogroup prediction in a forensic Y-STR database: A pilot study on male Chinese Yunnan Zhaoyang Han population. Forensic Sci Int Genet. 2022;57:102659. [DOI] [PubMed] [Google Scholar]
- 58.Patra P, B RD, Kundu P, Das M, Ghosh A. Recent advances in machine learning applications in metabolic engineering. Biotechnol Adv. 2023;62:108069. [DOI] [PubMed] [Google Scholar]
- 59.Qanbari S, Rubin CJ, Maqbool K, Weigend S, Weigend A, Geibel J, Kerje S, Wurmser C, Peterson AT, Brisbin IL Jr., et al. Genetics of adaptation in modern chicken. PLoS Genet. 2019;15(4):e1007989. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Wang K, Hu H, Tian Y, Li J, Scheben A, Zhang C, Li Y, Wu J, Yang L, Fan X, et al. The chicken Pan-Genome reveals gene content variation and a promoter region deletion in IGF2BP1 affecting body size. Mol Biol Evol. 2021;38(11):5066–81. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Huang X, Otecko NO, Peng M, Weng Z, Li W, Chen J, Zhong M, Zhong F, Jin S, Geng Z, et al. Genome-wide genetic structure and selection signatures for color in 10 traditional Chinese yellow-feathered chicken breeds. BMC Genomics. 2020;21(1):316.
- 62.Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38(16):e164.
- 63.Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60.
- 64.Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.
- 65.DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43(5):491–8.
- 66.Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. The variant call format and VCFtools. Bioinformatics. 2011;27(15):2156–8.
- 67.Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6(2):80–92.
- 68.Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018;35(6):1547–9.
- 69.Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904–9.
- 70.Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009;19(9):1655–64.
- 71.Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75.
- 72.McQuillan R, Leutenegger AL, Abdel-Rahman R, Franklin CS, Pericic M, Barac-Lauc L, Smolej-Narancic N, Janicijevic B, Polasek O, Tenesa A, et al. Runs of homozygosity in European populations. Am J Hum Genet. 2008;83(3):359–72.
- 73.Pickrell JK, Pritchard JK. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 2012;8(11):e1002967.
- 74.Fu W, Wang R, Xu N, Wang J, Li R, Asadollahpour Nanaei H, Nie Q, Zhao X, Han J, Yang N, et al. Galbase: a comprehensive repository for integrating chicken multi-omics data. BMC Genomics. 2022;23(1):364.
- 75.Guo Y, Ou JH, Zan Y, Wang Y, Li H, Zhu C, Chen K, Zhou X, Hu X, Carlborg Ö. Researching on the fine structure and admixture of the worldwide chicken population reveal connections between populations and important events in breeding history. Evol Appl. 2022;15(4):553–64.
- 76.Zhi Y, Wang D, Zhang K, Wang Y, Geng W, Chen B, Li H, Li Z, Tian Y, Kang X, et al. Genome-wide genetic structure of Henan indigenous chicken breeds. Animals (Basel). 2023;13(4).
Data Availability Statement
The sequence data generated in this study have been deposited in the National Center for Biotechnology Information (NCBI; https://www.ncbi.nlm.nih.gov/) under BioProject accession PRJNA1099329.







