Abstract
African indigenous sheep are classified as fat-tail, thin-tail and fat-rump hair sheep. The fat-tail sheep are well adapted to dryland environments, but little is known about their genome profiles. We analyzed patterns of genomic variation by genotyping, with the Ovine SNP50K microarray, 394 individuals from five populations of fat-tail sheep from a desert environment in Egypt. Comparative inferences with other East African and western Asia fat-tail and European sheep reveal at least two phylogeographically distinct genepools of fat-tail sheep in Africa that differ from the European genepool, suggesting separate evolutionary and breeding histories. We identified 24 candidate selection sweep regions, spanning 172 potentially novel and known genes, which are enriched with genes underpinning dryland adaptation physiology. In particular, we found selection sweeps spanning genes and/or pathways associated with metabolism; response to stress, ultraviolet radiation, oxidative stress and DNA damage repair; activation of immune response; regulation of reproduction, organ function and development, body size and morphology, skin and hair pigmentation, and keratinization. Our findings provide insights into the complexity of genome architecture regarding dryland stress adaptation in the fat-tail sheep and showcase the indigenous stocks as appropriate genotypes for adaptation planning to sustain livestock production and human livelihoods under future climates.
Introduction
The development of high-throughput genome-wide assays and associated computational tools has made domestic livestock attractive for investigating how an organism’s genome is influenced by its production and natural environment. The existence of independent livestock populations/breeds within a species presents a natural experimental design that can be used to study the genetic mechanisms of adaptive divergence arising from bio-physical and/or anthropological selection. For instance, genome-wide SNP and sequence data have been used to explore genetic mechanisms of adaptation to contrasting environments1–4 and to investigate evidence for genomic selection relating to domestication, breed formation and improvement5 in livestock species.
Although sheep were domesticated in the Fertile Crescent, Africa is endowed with a diverse repository of the species represented by 179 breeds/populations6 that have been classified into three broad groups: thin-tail, fat-tail and fat-rump hair sheep. The thin-tails occur in Sudan and in the sub-humid and humid regions of West Africa. The fat-tails occur across the deserts of northern Africa, and in the highlands, semi-arid and arid environments of eastern and southern Africa. The fat-rumps are found exclusively in the semi-arid and arid zones of the Horn of Africa. The thin-tails are the most ancient in the continent and were introduced via the Isthmus of Suez and/or the southern Sinai Peninsula, whereas the fat-tails arrived much later, initially via northeastern Africa (Egypt) and later via the Horn of Africa (Ethiopia)7. The origin of the fat-rumps remains unknown.
The genomes of African indigenous sheep have been subjected mainly to natural selection driven by tropical and sub-tropical climates, diseases and parasites. Although their productivity is much lower than that of commercial breeds under intensive production systems, indigenous sheep are often the only option available to millions of resource-poor farmers in agro-pastoral and pastoral production systems, where exotic improved genotypes under-perform under limited (quality and quantity) feed and water resources, and high ecto- and endo-parasite and disease challenges. This is evident in Egypt, a country within the Sahara desert, where sheep of the fat-tail type are preferred by livestock keepers because of their excellent adaptation to desert-like conditions8. This adaptation, and the historical significance of Egypt as one of the centers of dispersion of domesticates into Africa, make Egyptian indigenous sheep of interest in understanding the genetic history of indigenous fat-tail sheep in the continent and the genomic mechanisms underlying their adaptation to dryland environments, which remain poorly investigated. Here, we generated genotype data using the Ovine SNP50K BeadChip from five populations of Egyptian indigenous sheep representative of the fat-tail hair sheep diversity found across the dry belts of Africa, the Middle East and Asia to investigate their genome profiles. Comparative genome analysis with fat-tail sheep from East Africa and western Asia provided insights into the history of the genotype in the northeastern and East Africa regions, and the subsequent inclusion of European commercial breeds in selection sweep analysis allowed us to identify unique genome profiles of fat-tail hair sheep related to dryland adaptation.
Results
Population genetic analysis
Population genetic relationships were assessed with PCA and DAPC (Fig. 1) using 5,140 SNPs that were selected to be unlinked. The first two principal components of the DAPC and PCA accounted for 7.93% and 9.80% (PC1) and 4.58% and 6.74% (PC2), respectively, of the total genetic variation. PC1 separated the Egyptian from the non-Egyptian populations. PC2 separated the East African and western Asia fat-tail sheep, which seem to cluster together, from the European breeds. The five Egyptian populations clustered very close together, whereas the non-Egyptian ones were dispersed along the vertical plane of the two plots. This suggests a lower level of genetic variation between the Egyptian populations but a comparatively higher one between the non-Egyptian populations. Generally, higher PCs (>2) did not result in observable genetic clusters.
Detection of candidate selection sweep regions
Based on the results of the DAPC and PCA, SNP genotypes were used to estimate allele frequency differentiation, measured as di, in a pairwise comparison between the Egyptian and non-Egyptian populations. The genome wide distribution of di values for each SNP is shown in Fig. 2a. A total of 109 significant SNPs (di ≥ 4.0) defined seven candidate regions across six chromosomes (Oar1, Oar2, Oar3, Oar8, Oar9, Oar27; Fig. 2a; Supplementary Table S1a). The strongest candidate region was on Oar9 spanning 24 significant SNPs and 15 genes.
The RsB analysis revealed 154 significant SNPs (pRsB ≥ 4.0) that defined 10 candidate selection sweep regions across nine chromosomes (Oar1, Oar2, Oar5, Oar7, Oar8, Oar13, Oar15, Oar20, Oar26; Fig. 2b; Supplementary Table S1b). It identified two candidate regions as the strongest, one on Oar2 and the other on Oar15, spanning 107 and 9 significant SNPs, and 28 and 0 genes, respectively.
The intra-population iHS analysis was performed for the five Egyptian populations grouped based on the PCA and DAPC. It identified 47 significant SNPs (piHS ≥ 4.0) that defined 14 candidate regions across eight chromosomes (Oar1, Oar2, Oar3, Oar6, Oar10, Oar13, Oar15, Oar17; Fig. 2c; Supplementary Table S1c). Two candidate regions, each defined by seven significant SNPs, on Oar1 and Oar13, spanning 46 and 17 genes, respectively, were the strongest.
Overlap between candidate selection sweep regions
The three approaches (di, RsB, iHS) used here to detect selection sweeps revealed 31 candidate regions across 15 chromosomes. Selection signatures were identified on Oar1, Oar2, Oar3, Oar13 and Oar15 by more than one approach. The signatures identified by the three approaches on Oar1 had an overlapping segment (di = 19,517,811–20,118,195 bp; RsB = 19,651,513–19,761,666 bp; iHS = 19,409,931–21,607,699 bp) (Supplementary Table S1a, S1b, S1c) spanning two genes (TOE1, TESK2). The signatures identified by the three approaches on Oar2, and by di and iHS on Oar3, had no overlaps. Similarly, the selection sweeps that were identified by the three approaches on Oar13 and Oar15 had no overlapping segments. Reducing the significance threshold for di to ≥ 3.0, resulted in overlapping segments with iHS on Oar13 (di = 57,868,286–58,232,687 bp; iHS = 57,771,173–58,251,812 bp) that spanned seven genes (PCK1, CTCFL, ENSOARG00000017883, RAE1, SPO11, BMP7, ZBP1), and Oar15 (di = 43,962,190–44,595,229 bp; iHS = 44,112,180–44,332,191 bp) that spanned three genes (ENSOARG00000015306, ENSOARG00000017164, ENSOARG00000017173).
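The overlapping segments quoted above can be recomputed from the reported coordinates. The following is a minimal sketch in Python, not the procedure used in the study; the helper name `common_segment` is hypothetical, and intervals are treated as inclusive base-pair ranges.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # (start_bp, end_bp), inclusive


def common_segment(*intervals: Interval) -> Optional[Interval]:
    """Return the segment shared by all intervals, or None if there is none."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start <= end else None


# Coordinates reported in the text for Oar1 (threshold of 4.0 for all approaches).
oar1 = common_segment(
    (19_517_811, 20_118_195),  # di
    (19_651_513, 19_761_666),  # RsB
    (19_409_931, 21_607_699),  # iHS
)
# Coordinates reported for Oar13 and Oar15 after relaxing di to >= 3.0 (di, iHS).
oar13 = common_segment((57_868_286, 58_232_687), (57_771_173, 58_251_812))
oar15 = common_segment((43_962_190, 44_595_229), (44_112_180, 44_332_191))

print(oar1, oar13, oar15)
```

On Oar1 the shared segment coincides with the RsB interval, which is the narrowest of the three.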
Gene content and functional annotation of the candidate regions
From the 31 candidate selection sweep regions, seven (di = 2, RsB = 3, iHS = 2) spanned no genes (Supplementary Table S1a, S1b, S1c) on the OARv4.0 genome assembly. Such regions have also been reported in cattle5,9,10. We investigated this further by checking the gene content of the seven regions against the Bovine UMD3.1 and Capra hircus V1 (ARS1 (GCF_001704415.1)) genome assemblies. Interestingly, they spanned 83 and 18 genes, respectively, on the bovine and caprine genomes, suggesting incomplete annotation of the ovine genome assembly. Genome-wide, we identified 172 genes mapping to 24 (di = 5, RsB = 7, iHS = 12) candidate regions that were defined by 218 significant SNPs across 12 chromosomes (Supplementary Table S1a, S1b, S1c).
We performed functional enrichment for the 172 genes using the Enrichr web tool (see Supplementary Table S2). The rank-based and combined score rankings gave similar results and revealed ten GO terms (Table 1) as the most significant (P ≤ 0.01). The genes were associated with diverse biological functions and some had roles in multiple functions. Relevant to this study was that the majority of the functions were associated with adaptation to dryland environment stress (Supplementary Table S3a and S3b). They included response to feed stress; lipid, protein and carbohydrate metabolism; response to heat/temperature stimulus and oxidative stress; protection from ultraviolet radiation; regulation of immune response, DNA damage repair, transcription and translation, protein modification and RNA processing. Other functions included regulation of body size, growth and development; muscle structure, function and adaptation; kidney function and development; and reproduction and nervous system development and function.
Table 1.
Description | Term ID | P-value | Associated genes |
---|---|---|---|
(a) Enrichment analysis | | | |
Regulation of stem cell maintenance | GO:2000036 | 0.005938196 | TAL1, BMP7 |
Granulocyte differentiation | GO:0030851 | 0.005938196 | L3MBTL3, TAL1 |
Vitamin biosynthetic process | GO:0009110 | 0.007613258 | AKR1A1, MMACHC |
Microtubule cytoskeleton organization involved in mitosis | GO:1902850 | 0.005172798 | KIF3B, CHEK2 |
Spindle assembly involved in mitosis | GO:0090307 | 0.003174766 | KIF3B, CHEK2 |
Negative regulation of embryonic development | GO:0045992 | 0.014909626 | COL5A2, BMP7 |
Erythrocyte maturation | GO:0043249 | 0.003790269 | L3MBTL3, TAL1 |
Benzene-containing compound metabolic process | GO:0042537 | 0.013736768 | CMPK1, CYP4B1 |
Hair cell differentiation | GO:0035315 | 0.005938196 | ERCC3, MYO6 |
Protein-lipid complex assembly | GO:0065005 | 0.008521348 | BIN1, PLAGL2 |
(b) KEGG Pathway | | | |
Pyrimidine metabolism | hsa00240 | 0.049048713 | POLR2D, CMPK1, DCK |
Citrate cycle (TCA Cycle) | hsa00020 | 0.022794916 | IDH3B, PCK1 |
(c) WikiPathway | | | |
mRNA processing | WP310 | 0.036816137 | ZBP1, ESRP1, ANKAR, GRSF1, KIAA1429, RAE1, SNRPB |
TCA Cycle | WP434 | 0.022794916 | PDP1, IDH3B |
Oxidation by Cytochrome P450 | WP43/WP1274 | 0.012581695 | CYP27C1, CYP4X1, CYP4B1 |
Splicing factor NOVA regulated synaptic proteins | WP1983 | 0.042470409 | KCNJ6, EPB41L2 |
miRNA targets in ECM and membrane receptors | WP2911 | 0.040651645 | COL3A1, COL5A2 |
Comparisons against the KEGG pathway and WikiPathway databases (Supplementary Table S2), revealed two and five pathways (Table 1), respectively as the most significant (P < 0.05). In general, the analysis of GO terms shows an over-representation of GO categories in pathways relating to stress response and which may underlie dryland stress adaptation in the fat-tail sheep (Supplementary Table S4a, S4b, S4c).
Discussion
Occurring predominantly in the Afro-Asiatic drybelts, the fat-tail sheep account for approximately 25% of the global sheep population. Here, we analysed genotype data generated with the Ovine SNP50K Chip to investigate the genome profiles and the genetic basis of adaptation to dry environments in fat-tail sheep from a desert environment in Egypt. The inclusion of fat-tail sheep from East Africa and western Asia allowed us to gain insights into the history of the fat-tails in Africa. Due to their geographic proximity and ancient history of commercial and religious interactions, we hypothesized that the fat-tail sheep from Egypt and western Asia should show a close genetic relationship. However, the DAPC and PCA showed a genetic divergence of the Egyptian populations from both the western Asia and the East African populations, and a close genetic relationship between the latter two. This confirms the results of a previous analysis with microsatellites in a large sample of African indigenous sheep that showed a clear divergence between Egypt’s fat-tail Ossimi and its East and southern African counterparts11. These results indicate that Egyptian and East African fat-tail sheep represent different genetic stocks, and suggest one of two possibilities: an independent introduction of at least two genepools of fat-tail sheep to Africa, or an introduction of one genetic stock that gave rise to two genepools following reproductive isolation and adaptation to different eco-climates. We favour the first suggestion, which is in line with archaeological evidence supporting two separate entry points of fat-tail sheep into Africa, initially via northeast Africa (between around 7500 and 7000 years ago) and later (around 5000 years ago) via the Horn of Africa12. The close genetic relationship between the East African and western Asia populations is difficult to explain, but we suggest that it may be due to their recent common history.
On the other hand, long-term reproductive isolation may explain the divergence of the Egyptian from western Asia populations. Intensive anthropological selection and/or genetic drift arising from low effective population sizes in the European breeds may explain their divergence from the African and western Asia populations.
The genetic divergence revealed by DAPC and PCA informed the grouping of populations into Egyptian and non-Egyptian ones for selection sweep analysis. The three approaches (di, iHS, RsB) detected one overlapping candidate region, and two more were detected with di and iHS after relaxing the stringency of di. A modest number of overlapping candidate regions has been reported in several studies4,13–15 and Bahbahani et al.9 reported none. Although coincident signatures that are detected by multiple approaches may provide strong evidence of selection16, their modest occurrence may be due to algorithm differences17,18. This may also explain why results from different studies tend to differ. Therefore, a genomic region that has been identified by only one approach may still be under selection19,20. In total, therefore, we detected 24 candidate regions, spanning 172 candidate genes, representing signatures of past and/or on-going selection in Egyptian fat-tail sheep. Unsurprisingly, the regions spanned genes that did not concern production, but rather, adaptation traits. This is because the target populations have mainly been exposed, over a long time period, to complex interacting biophysical stressors (heat, solar radiation, physical exhaustion, resource scarcity, parasites etc.) which impact fitness. This may be the reason why the spanned genes were associated with diverse physiological, molecular and cellular processes and pathways (Supplementary Table S3), emphasizing the importance of multi-functionality for adaptation to dryland environments. The large number of candidate regions and genes detected is also not surprising; similar findings have been reported for livestock species from extreme environments4,13,15,21. It reinforces the view that adaptation is a complex trait that involves many biological processes and quantitative trait loci, each having a small and cumulative effect on the overall expression of the phenotype.
Energy and nutrient metabolism is vital for herbivores in food-scarce environments. Our candidate regions spanned several genes associated with feeding efficiency and regulation of metabolism, and GO clusters associated with energy metabolic processes (glycolysis/gluconeogenesis, TCA Cycle, insulin signaling pathway, pyruvate metabolism etc.). For instance, the gene PIK3R3, which has not been reported before, regulates responses to changes in nutritional conditions as well as cellular metabolism and growth22 and, through its association with 5′ AMP-activated protein kinase (AMPK), serves as a metabolic master switch in response to alterations in cellular energy levels23. Indeed, sheep under heat stress decrease dry matter intake and rumination time by up to 76%24, which is related to eating efficiency and metabolic processes25. In the drylands, most breeds tend to have small body sizes as an adaptation strategy to scarce and poor quality forages and for thermoregulation25. This may explain the occurrence, in candidate regions, of genes such as BMP7, MSTN (GDF8) and STIL, which regulate adult and embryonic size, growth and development22,26,27. BMP7 also plays a crucial role in renal function and development28,29. Renal vasodilation, transmembrane transport, water-salt metabolism, bicarbonate absorption, water retention and reabsorption are key functions of the renal cortex and central to desert environment adaptation.
Prolonged exposure to intense solar and ultraviolet radiation, which are key abiotic stressors in arid environments, can result in ophthalmic and skin conditions. Genes controlling pigmentation of coat and skin30 and eyelids31 and photoreception and visual protection32 have been identified and reported to offer protection against solar and UV radiation. None of these reported genes, however, occurred in the candidate regions, although our study populations had pigmented skins and coats and some, such as Barki, Farafra and Souhagi, had pigmented eyelids8. Instead, we observed ERCC3, a major nucleotide excision repair (NER) protein. In humans, mutations in ERCC3 result in skin disorders, such as xeroderma pigmentosum, Cockayne syndrome and trichothiodystrophy, which result in sensitivity to UV radiation and oxidative stress33–35. Since NER modulates melanocyte stem cell attrition and development of non-pigmented hair36, ERCC3 may be involved in maintaining and regulating the fate and behaviour of melanocyte stem cells and mature melanocytes, and thus the production of melanin, which is responsible for skin, hair and eye colour. Another gene was TGM3, which is widely expressed in skin cells, specifically keratinocytes and corneocytes. During keratinocyte differentiation, TGM3 crosslinks structural proteins and lipids in the formation of the cornified cell envelope, which provides the barrier function of the epidermis against harmful environmental stimuli such as UV radiation37–39.
Our candidate regions spanned several genes (PMS1, SPO11, RAD54L, MUTYH, CHEK2, POLR2D, CMPK1) that maintain cellular functions and DNA repair. For instance, MUTYH repairs 8-oxo-G (a mutagenic product of oxidative DNA damage) in the nucleus and mitochondria, via the base excision repair pathway40,41. PMS1 which belongs to the mutL/hexB family of DNA mismatch repair proteins, also forms heterodimers with MLH1, a DNA mismatch repair protein42. Knockdown of CMPK1 was observed to delay DNA repair during recovery from UV damage, suggesting it contributes to the efficiency of the DNA repair process43. This finding could be associated with the fact that long term exposure to acute and chronic heat stress enhances the production of free radicals which can result in DNA damage and induce oxidative stress leading to mitochondria damage44–46, apoptosis and necrosis47,48. Therefore the DNA repair system serves to preserve genomic integrity under excessive exposure to UV radiation in dryland environments.
The initiation of stress response involves the activation of the neuroendocrine system to trigger physiological and/or behavioural responses. We found several genes in the candidate regions that are involved in the development of the nervous system and in eliciting response to stress, indicating the importance of the neuroendocrine system in activating stress response. One of the genes, DMBX1, is involved in brain and sensory organ development49,50 while TP53INP1 is a key cell stress response protein with antioxidant function51,52. Through its interaction with the ERK1 and p38 mitogen-activated protein kinases, MKNK1 may play a role in responding to environmental stress and control cytokine production and delayed apoptosis53,54. PLAGL2 is also an oxidative stress responding regulator55. The activation of the neuroendocrine system and initiation of stress response results, however, in the chronic production of glucocorticoids and catecholamines, which can dysregulate immune functions56,57, and corticosteroids, which possess potent immunosuppressive properties in lymphocytes58. This appears to be counterintuitive. However, we observed several genes such as ZBP1, PRDX1, MAST2 and LURAP1 in the candidate regions that enhance immune functions. Indeed, some genes such as MAST2 and PRDX1 have dual roles. By controlling the activities of TRAF6 and NF-kB, MAST2 regulates immune response and acts as the first responder to harmful cellular stimuli such as stress, free radicals and UV radiation59,60. While PRDX1 may play an antioxidant protective role in cells, it also contributes to antiviral activity of CD8(+) T-cells61,62. ZBP1 activates innate immune response by binding foreign DNA, enhances DNA-mediated induction of type I interferons and other genes that activate innate immune responses, and is involved in signaling mechanisms underlying DNA-associated antimicrobial immunity and autoimmune disorders63,64.
LURAP1 is an activator of the canonical NF-kB pathway and drives the production of proinflammatory cytokines65. Taken together, these findings suggest that genes evoking cellular stress and immune responses have been the subject of selection in the course of adaptive evolution to dryland environments.
Thermal stress compromises fertility through a direct effect of hyperthermia on the reproductive axis or through an indirect effect on feed intake, which is lowered to reduce metabolic heat production, leading to changes in energy balance and nutrient availability. Four genes (CTCFL, MAST2, TESK2, SPO11) associated with male reproduction physiology were detected in the candidate regions. CTCFL is a testis-specific DNA binding protein that forms methylation-sensitive insulators which regulate X-chromosome inactivation, nuclear architecture and transcription66,67. MAST2 68 and TESK2 69 play an important role in spermatid maturation during spermiogenesis and spermatogenesis, respectively. In mouse, knockout of SPO11 led to meiotic arrest of spermatocytes at zygotene70,71, resulting in sterility and atrophied testes in SPO11 −/− homozygous male mice. Given these direct and indirect effects of thermal stress on fertility, our result suggests that reproductive success is a key determinant of adaptive fitness in the fat-tail sheep.
The observation that some of the genes spanned by the candidate regions are enriched for the GO term “response to hypoxia (GO:0001666; GO:0071456)” and the “HIF-1 (hypoxia inducible factor-1) signaling pathway (hsa04066)” is a novel finding of our study. The HIF-1 pathway and response to hypoxia play an important role in cellular response to systemic oxygen levels and have been associated, so far, with adaptation to high altitude environments1–3. We suggest that this finding could be related to physical exhaustion arising from long-term trekking of long distances in search of feed and water, which results in hypoxia-like conditions and oxygen debt in skeletal muscles. In addition, two candidate genes (MSTN/GDF8, BIN1) occurred in the candidate regions under selection. MSTN tightly regulates skeletal muscle homeostasis and promotes the survival of muscle syncytia, which make up type I (oxidative/slow) or type II (glycolytic/fast) muscle fibers72. Type I fibers are fatigue-resistant and rich in mitochondria and utilize oxidative metabolism to provide a stable and long-lasting supply of ATP73. They have been observed in skeletal muscles of endurance athletes and their activation entrains complex pathways that enhance physical73 and racing performance74. Isoforms of BIN1 are important in the formation of transverse tubules, which play a role in skeletal and cardiac muscle contractility and relaxation75,76. The selection of MSTN, BIN1 and HIF-1 associated genes could have arisen as an adaptive response to endure physical exhaustion arising from long-term long distance walking.
In this study, we generated a catalogue of genetic variants in Egyptian fat-tail sheep. Although the studied populations are only a subset of the fat-tail sheep found in Africa, they illustrate that at least two distinct and phylogeographically structured autosomal gene pools define the genotype in the continent. Genome-wide differentiation and LD-based scans of selection sweeps identified several candidate genomic regions under selection that spanned several novel and reported genes with key adaptive physiological functions. These genes, especially those associated with pigmentation, muscle function and the HIF pathway, need further investigation and validation using full genome sequences and expression studies in sheep and model species to test the hypotheses arising from our study.
Material and Methods
Animals
The animals used in this study are owned by farmers. Prior to sampling, the objectives of the study were explained to them in local languages so that they could make an informed decision with regard to providing consent to sample their animals. Blood sampling was performed by a licensed veterinarian following the guidelines of the General Organization for Veterinary Service (GOVS), Egypt.
Sampling, SNP genotyping and data quality control
Venous blood was collected into EDTA vacutainer tubes from 394 individuals from five fat-tail sheep populations (Barki, Saidi, Farafra, Souhagi, AHS) in Egypt (Table 2). The phenotypic characteristics, and socio-economic and cultural significance of the populations were described by Galal et al.8. The animals were sampled at random from farmers’ flocks where they are managed under a transhumant grazing system, and where veterinary care and anthropic selection are modest or not practised. The blood samples were transferred onto Whatman FTA™ Classic cards (GE Healthcare UK Ltd) for storage. Genotyping was performed on FTA™-spotted blood at GeneSeek Inc (http://www.neogen.com/Genomics/) on the Illumina Ovine SNP50K BeadChip. Similar genotype data from 565 individuals from 15 breeds of European sheep, 196 individuals from seven populations of western Asia fat-tail sheep and 79 individuals from two populations of fat-tail sheep from East Africa (Table 2) were obtained from the Ovine HapMap project (http://www.sheep-hapmap.org). The European breeds were used to represent genotypes from a contrasting temperate environment. The East African populations represented fat-tail sheep from a different geographic region within Africa and the western Asia ones represented fat-tails from a region close to the centre of domestication.
Table 2.
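Region/Ecology | Type | Country | Breed/Population | Sample size |
---|---|---|---|---|
North Africa (Subtropical dryland) | Indigenous | Egypt | Barki | 181 |
North Africa (Subtropical dryland) | Indigenous | Egypt | Saidi | 72 |
North Africa (Subtropical dryland) | Indigenous | Egypt | Farafra | 62 |
North Africa (Subtropical dryland) | Indigenous | Egypt | Souhagi | 49 |
North Africa (Subtropical dryland) | Indigenous | Egypt | AHS | 30 |
Total | | | | 394 |
East Africa (Tropical Highland) | Indigenous | Ethiopia | Menz | 34 |
East Africa (Tropical Highland) | Indigenous | Kenya | Red Maasai | 45 |
Western Asia (Subtropical) | Indigenous | Cyprus | Cyprus fat tail | 30 |
Western Asia (Subtropical) | Indigenous | Iran | Afshari | 37 |
Western Asia (Subtropical) | Indigenous | Iran | Moghani | 34 |
Western Asia (Subtropical) | Indigenous | Iran | Qezel | 35 |
Western Asia (Subtropical) | Indigenous | Turkey | Karakas | 18 |
Western Asia (Subtropical) | Indigenous | Turkey | Norduz | 20 |
Western Asia (Subtropical) | Indigenous | Turkey | Sakiz | 22 |
Total | | | | 196 |
North/Central Europe (Temperate) | Selected | Switzerland | Bundner Oberlander | 21 |
North/Central Europe (Temperate) | Selected | Switzerland | Swiss White Alpine | 21 |
North/Central Europe (Temperate) | Selected | Switzerland | Valais Black Nose | 21 |
North/Central Europe (Temperate) | Selected | Switzerland | Valais Red | 21 |
North/Central Europe (Temperate) | Selected | Germany | East Friesian Brown | 39 |
North/Central Europe (Temperate) | Selected | Germany | East Friesian White | 9 |
North/Central Europe (Temperate) | Selected | England | Border Leicester | 48 |
North/Central Europe (Temperate) | Selected | England | Dorset Horn | 21 |
North/Central Europe (Temperate) | Selected | England | Wiltshire | 23 |
North/Central Europe (Temperate) | Selected | Ireland | Galway | 49 |
North/Central Europe (Temperate) | Selected | Ireland | Irish Suffolk | 55 |
North/Central Europe (Temperate) | Selected | Scotland | Scottish Texel | 80 |
North/Central Europe (Temperate) | Selected | Germany | German Texel | 43 |
North/Central Europe (Temperate) | Selected | Norway | Old Norwegian Spaelsau | 15 |
North/Central Europe (Temperate) | Selected | Norway | Spael coloured | 3 |
North/Central Europe (Temperate) | Selected | Finland | Finnsheep | 96 |
Total | | | | 565 |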
PLINK (version 1.07)77 was used to perform data quality control and processing. A SNP was retained for analysis only if it met all of the following quality criteria: a SNP call rate greater than 95%, a SNP genotyping success rate greater than 90%, a per-SNP minor allele frequency of at least 0.03, and a mapped autosomal position. Overall, 1,234 individuals genotyped for at least 95% of the SNPs and 51,407 SNPs that passed the quality thresholds were available for analysis.
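As an illustration of the per-SNP filters only, the sketch below applies a call-rate and a minor allele frequency threshold to genotypes coded as allele counts. It is written in Python rather than PLINK, the function names and the toy genotypes are hypothetical, and the exact way the 95% and 90% thresholds were applied in the study is an assumption.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # 0, 1 or 2 copies of the counted allele; None = missing


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of non-missing genotype calls for one SNP."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Minor allele frequency among the non-missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def passes_qc(genotypes: Sequence[Genotype],
              min_call_rate: float = 0.95,  # assumed per-SNP threshold
              min_maf: float = 0.03) -> bool:
    """Keep a SNP only if it clears both thresholds."""
    return (call_rate(genotypes) > min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Hypothetical toy data: three SNPs typed in 20 animals.
snps = {
    "snp_ok": [0, 1, 2, 1] * 5,               # fully called, MAF = 0.5
    "snp_rare": [0] * 20,                     # monomorphic, MAF = 0
    "snp_missing": [None, None] + [0, 1] * 9,  # call rate = 0.90
}
kept = [name for name, g in snps.items() if passes_qc(g)]
print(kept)
```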
Population analyses
We performed principal component analysis (PCA) using the adegenet 1.4–2 package78 for R79 to investigate genetic relationships between individuals and populations. To minimize possible confounding effects of linkage disequilibrium on the underlying pattern of population genetic relationship and structure, one in every ten SNPs was sampled and used in PCA. This dataset was also used to perform the discriminant analysis of principal components (DAPC) with the adegenet 1.4–2 package in R, to assess replication of the population clusters and relationships between individuals and populations. The PCA summarizes the total variance in the dataset in a small number of principal components (PCs) that represent population structure based on genetic correlations between individuals. The DAPC maximizes genetic differences between groups while minimizing the variation within the groups.
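The thinning step can be checked with a few lines of Python. This sketch assumes that the SNPs are ordered by chromosome and position and that the 10th, 20th, 30th, ... SNPs are retained; the starting offset is an assumption, chosen because it reproduces the 5,140 SNPs reported in the Results.

```python
N_SNPS_AFTER_QC = 51_407  # SNPs passing quality control
STEP = 10                 # one SNP retained out of every ten

# Zero-based indices of the retained SNPs (assumed offset: the 10th SNP first).
kept_indices = list(range(STEP - 1, N_SNPS_AFTER_QC, STEP))

print(len(kept_indices))
```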
Signatures of selection based on genetic differentiation
We computed locus-specific divergence in allele frequencies between groups of populations revealed by DAPC and PCA, using the di statistic80. The di is a function of unbiased estimates of pairwise values of F_ST calculated for each SNP between one population, or group of populations, and the other(s). For each of the 51,407 SNPs that passed quality control, the expected value and standard deviation of F_ST were calculated. The F_ST values were then averaged across SNPs contained in non-overlapping windows of 20 SNPs. Regions under selection were defined where at least two non-overlapping windows passed the distribution threshold, taking the windows with the highest (top 1%) average values of di as the candidate regions80.
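With only two groups (Egyptian versus non-Egyptian), the di statistic reduces to a single standardized F_ST term per SNP. The Python sketch below illustrates the standardization and the averaging in non-overlapping windows of 20 SNPs. It is not the code used in the study: the F_ST estimator is a simple, biased frequency-based one (the study used unbiased estimates), and the allele frequencies are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def fst_two_groups(p1: float, p2: float) -> float:
    """Simple (biased) F_ST from allele frequencies of two equally weighted groups."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def di_scores(fst: Sequence[float]) -> List[float]:
    """Standardize per-SNP F_ST by its genome-wide mean and standard deviation."""
    mu, sd = mean(fst), pstdev(fst)
    return [(f - mu) / sd for f in fst]


def window_means(values: Sequence[float], size: int = 20) -> List[float]:
    """Average values in consecutive, non-overlapping windows of `size` SNPs."""
    return [mean(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


# Hypothetical frequencies: 20 undifferentiated SNPs followed by 20 differentiated ones.
freq_egyptian = [0.5] * 40
freq_other = [0.5] * 20 + [0.9] * 20

fst = [fst_two_groups(a, b) for a, b in zip(freq_egyptian, freq_other)]
windows = window_means(di_scores(fst))
print(windows)
```

The same `window_means` helper can be applied to raw F_ST values instead of di if the averaging is done before standardization.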
Signatures of selection based on linkage disequilibrium (LD)
LD based selection sweeps were identified using the iHS 17 and RsB 18 approaches. These approaches are based on the extended haplotype homozygosity (EHH) estimates of LD. The iHS detects regions with high levels of unexpected EHH within populations relative to neutral expectations and RsB detects the same but between populations. The analyses were performed using the rehh package81 in R. SNPs under selection were identified by transforming iHS and RsB values into piHS (piHS = −log10[1 − 2(Φ(iHS) − 0.5)] and pRsB (pRsB = −log10[1 − 2(Φ(RsB) − 0.5)], where (Φ(x)) represents the Gaussian cumulative distribution function. Assuming the values of iHS and RsB are normally distributed under neutrality, the piHS and pRsB can be interpreted as −log10(P-value).
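The two-sided transformation used by the rehh package, −log10(1 − 2|Φ(x) − 0.5|), can be sketched in Python as follows. The function name is hypothetical, and the score is assumed to be already standardized so that it follows a standard normal distribution under neutrality.

```python
from math import erfc, log10, sqrt


def neg_log10_p(score: float) -> float:
    """Two-sided -log10(P) for a standardized iHS or RsB score.

    1 - 2|Phi(x) - 0.5| equals erfc(|x| / sqrt(2)), which avoids the loss of
    precision that comes from evaluating Phi(x) close to 1. For extremely
    large |x| the tail underflows to zero and log10 raises an error.
    """
    return -log10(erfc(abs(score) / sqrt(2)))


for x in (0.0, 1.959964, 3.0, 4.0, -4.0):
    print(x, round(neg_log10_p(x), 3))
```

Under this transformation, the significance threshold of 4.0 corresponds to an absolute standardized score of roughly 3.9.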
For iHS, the intra-population integrated haplotype score was computed for the five Egyptian populations as a group. For each SNP, we calculated the natural logarithm of the ratio between the integrated EHH for the ancestral (iHHA) and derived (iHHD) alleles. We inferred the allelic state of each SNP using two approaches: (i) the allele with the highest frequency was taken to be the ancestral allele; and (ii) the states were assigned at random, over 100 permutations, to check the consistency of the results. The RsB scores were computed between the Egyptian and non-Egyptian groups of populations. The integrated EHHS (site-specific EHH) score (iES) was calculated for each SNP and group of populations, and the RsB statistic between the two groups was calculated as the natural logarithm of the ratio between iES Egyptian and iES non-Egyptian.
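Both statistics reduce to the natural logarithm of a ratio of integrated homozygosity curves. The following Python sketch shows that arithmetic on toy numbers; the function names and values are hypothetical, the trapezoidal rule is an assumed integration scheme, and the subsequent standardization of the scores (performed by rehh in the analysis) is not reproduced.

```python
from math import log


def integrate_ehh(positions, ehh):
    """Integrate an EHH (or EHHS) curve over position with the trapezoidal rule.

    positions: SNP coordinates in increasing order.
    ehh: homozygosity value at each position, same length as positions.
    """
    area = 0.0
    for i in range(len(positions) - 1):
        width = positions[i + 1] - positions[i]
        area += 0.5 * (ehh[i] + ehh[i + 1]) * width
    return area


def log_ratio(numerator, denominator):
    """Natural log of a ratio, e.g. ln(iHH_A / iHH_D) or ln(iES_1 / iES_2)."""
    return log(numerator / denominator)


if __name__ == "__main__":
    pos = [0, 10, 20]
    ihh_ancestral = integrate_ehh(pos, [0.0, 1.0, 0.0])
    ihh_derived = integrate_ehh(pos, [0.5, 1.0, 0.5])
    # A negative value means longer haplotypes around the derived allele.
    print(log_ratio(ihh_ancestral, ihh_derived))
```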
Haplotypes for iHS and RsB were constructed by phasing the genotyped SNPs using Beagle82. Haplotype frequencies were then calculated using sliding windows of 20 SNPs that overlapped by five SNPs. SNPs with piHS and pRsB (−log10(P-value)) values ≥ 4.0 (P-value ≤ 0.0001) were considered significant. Candidate regions were defined where the values for at least two adjacent SNPs were significant. Where multiple SNPs were significant, a distance of 0.5 Kb upstream and downstream of the extreme significant SNPs was used to define the candidate region.
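One way to read the region-calling rule above is sketched below in Python. It is an illustration under stated assumptions: the function name and toy data are hypothetical, SNPs are taken to be on a single chromosome and sorted by position, and "extreme" SNPs are interpreted as the first and last significant SNPs in a run of adjacent significant SNPs.

```python
def candidate_regions(positions, scores, threshold=4.0, min_run=2, flank=500):
    """Call candidate regions from per-SNP -log10(P) scores.

    positions: SNP coordinates in bp, sorted, single chromosome.
    scores: piHS or pRsB values aligned with positions.
    A region is reported when at least `min_run` adjacent SNPs reach
    `threshold`; it is extended by `flank` bp (0.5 Kb) on each side.
    """
    regions = []
    run = []  # positions of the current run of significant SNPs

    def close_run():
        if len(run) >= min_run:
            regions.append((max(0, run[0] - flank), run[-1] + flank))

    for pos, score in zip(positions, scores):
        if score >= threshold:
            run.append(pos)
        else:
            close_run()
            run = []
    close_run()  # handle a run that reaches the last SNP
    return regions


if __name__ == "__main__":
    pos = [1000, 2000, 3000, 4000, 5000, 6000]
    val = [4.2, 4.5, 1.0, 5.0, 0.2, 4.1]
    print(candidate_regions(pos, val))
```

Isolated significant SNPs (such as those at 4000 and 6000 bp in the toy data) do not form a region under this reading.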
Functional annotation of candidate regions and genes
The candidate regions were annotated and the associated genes retrieved from the Ensembl Genome Browser 87 using the Sheep Genome Assembly OARv4.0 (GCF_000298735.2). Functional enrichment was performed with Enrichr83 against terms from the Gene Ontology (GO), KEGG (www.kanehisa.jp/) and WikiPathways (www.WikiPathways.org/index.php/WikiPathways) databases, which annotate groups of genes with dedicated terms. To rank the GO terms relevant to the candidate genes, we assessed the significance of the overlap between the input gene list and the gene sets in each library using the z-score of the deviation from the expected rank given by the Fisher exact test (rank-based ranking). This ranking is reported to outperform the Fisher exact test alone and to be comparable to the combined score ranking83. The NCBI PubMed (http://www.ncbi.nlm.nih.gov/pubmed/), GeneCards® (http://www.genecards.org/) and UniProt (http://www.uniprot.org/) databases were also used to determine gene functions.
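The Fisher exact test that underlies the overlap assessment can be written as a one-sided hypergeometric tail probability. The Python sketch below shows only that component; the function name, the background size and the toy counts are assumptions, and the z-score correction applied by Enrichr (which relies on expected ranks precomputed by that tool) is not reproduced.

```python
from math import comb


def fisher_overlap_p(overlap, list_size, set_size, background):
    """One-sided Fisher exact P-value for a gene-list / gene-set overlap.

    overlap: genes shared by the input list and the annotated gene set.
    list_size: number of genes in the input list.
    set_size: number of genes in the annotated set (e.g. one GO term).
    background: total number of genes considered (an assumed value).
    Returns P(X >= overlap) under the hypergeometric distribution.
    """
    total = comb(background, list_size)
    upper = min(list_size, set_size)
    tail = sum(comb(set_size, k) * comb(background - set_size, list_size - k)
               for k in range(overlap, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 172 candidate genes, a 300-gene term,
    # 20,000 background genes and 10 shared genes.
    print(fisher_overlap_p(10, 172, 300, 20000))
```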
Data availability
The data generated in this study are available upon request from Eui-Soo Kim (euisoo.kim@recombinetics.com).
Acknowledgements
We express our gratitude to all the livestock farmers in Egypt whose animals were sampled and to the APRI personnel for their assistance in sampling. The International Sheep Genomics Consortium (http://www.sheephapmap.org) is acknowledged. This project was supported by funding from the College of Agriculture and Life Sciences of Iowa State University, the State of Iowa, Hatch funds, the Ensminger program and the ARC Egypt-ICARDA Collaborative Research Program. ICARDA also thanks the donors supporting the CGIAR Research Program (CRP) on Livestock (formerly Livestock and Fish CRP). This study forms part of our ongoing efforts to understand the adaptation of local indigenous livestock to improve their productivity.
Author Contributions
J.M.M., A.M.A., B.A.R., M.F.R. designed the study. A.R.E. collected the samples. J.M.M., E.-S.K., A.R.E. analysed the data. J.M.M. wrote the manuscript. All the authors read and approved the manuscript.
Competing Interests
The authors declare that they have no competing interests.
Footnotes
Electronic supplementary material
Supplementary information accompanies this paper at 10.1038/s41598-017-17775-3.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Qiu Q, et al. The yak genome and adaptation to life at high altitude. Nature Genet. 2012;44:946–949. doi: 10.1038/ng.2343.
- 2. Dong K, et al. Genomic scan reveals loci under altitude adaptation in Tibetan and Dahe pigs. PLoS One. 2014;9:e110520. doi: 10.1371/journal.pone.0110520.
- 3. Gorkhali NA, et al. Genomic analysis identified a potential novel molecular mechanism for high-altitude adaptation in sheep at the Himalayas. Sci. Rep. 2016;6:29963. doi: 10.1038/srep29963.
- 4. Yang J, et al. Whole-genome sequencing of native sheep provides insights into rapid adaptations to extreme environments. Mol. Biol. Evol. 2016;33:2576–2592. doi: 10.1093/molbev/msw129.
- 5. Xu L, et al. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol. Biol. Evol. 2015;32:711–725. doi: 10.1093/molbev/msu333.
- 6. DAGRIS. Domestic Animal Genetic Resources Information System (ed. Dessie, T., Hanotte, O. & Kemp, S.). International Livestock Research Institute, Addis Ababa, Ethiopia. http://www.dagris.info/ (2017).
- 7. Muigai AWT, Hanotte O. The origin of African sheep: archaeological and genetic perspectives. Afr. Archaeol. Rev. 2013;30:39–50. doi: 10.1007/s10437-013-9129-0.
- 8. Galal, S., Rasoul, F.A., Annous, M.R. & Shoat, I. Small Ruminant Breeds of Egypt. In Iñiguez, L., Characterization of Small Ruminant Breeds in West Asia and North Africa Vol 2: North Africa. International Center for Agricultural Research in the Dry Areas (ICARDA), Aleppo, Syria vi+ 196pp (2005).
- 9. Bahbahani H, et al. Signatures of positive selection in East African shorthorn Zebu: A genome-wide single nucleotide polymorphism analysis. Sci. Rep. 2015;5:11729. doi: 10.1038/srep11729.
- 10. Iso-Touru T, et al. Genetic diversity and genomic signatures of selection among cattle breeds from Siberia, eastern and northern Europe. Anim. Genet. 2016;47:647–657. doi: 10.1111/age.12473.
- 11. Muigai, A. W. T. Characterization and Conservation of Indigenous Animal Genetic Resources: Genetic Diversity and Relationships of Fat-tailed and Thin-tailed Sheep of Africa. PhD thesis, Department of Biochemistry, Jomo Kenyatta University of Agriculture and Technology, Juja, Kenya (2003).
- 12. Gifford-Gonzalez D, Hanotte O. Domesticating animals in Africa: implications of genetic and archeological findings. J. World Prehist. 2011;24:1–23. doi: 10.1007/s10963-010-9042-2.
- 13. Kim E-S, et al. Multiple genomic signatures of selection in goats and sheep indigenous to a hot arid environment. Heredity. 2016;116:255–264. doi: 10.1038/hdy.2015.94.
- 14. Gouveia JJS, et al. Genome-wide search for signatures of selection in three major Brazilian locally adapted sheep breeds. Livest. Sci. 2017;197:36–45. doi: 10.1016/j.livsci.2017.01.006.
- 15. Kim J, et al. The genome landscape of indigenous African cattle. Genome Biol. 2017;18:34. doi: 10.1186/s13059-017-1153-y.
- 16. Ramey HR, et al. Detection of selective sweeps in cattle using genome-wide SNP data. BMC Genomics. 2013;14:382. doi: 10.1186/1471-2164-14-382.
- 17. Voight BF, Kudaravalli S, Wen X, Pritchard JK. A map of recent positive selection in the human genome. PLoS Biol. 2006;4:446–458. doi: 10.1371/journal.pbio.0040446.
- 18. Tang K, Thornton KR, Stoneking M. A new approach for using genome scans to detect recent positive selection in the human genome. PLoS Biol. 2007;5:e171. doi: 10.1371/journal.pbio.0050171.
- 19. Hohenlohe PA, Phillips PC, Cresko WA. Using population genomics to detect selection in natural populations: Key concepts and methodological considerations. Int. J. Plant Sci. 2010;171:1059–1071. doi: 10.1086/656306.
- 20. Oleksyk TK, Smith MW, O’Brien SJ. Genome-wide scans for footprints of natural selection. Philos. Trans. R. Soc. London [Biol] 2010;365:185–205. doi: 10.1098/rstb.2009.0219.
- 21. Lv F-H, et al. Adaptations to climate-mediated selective pressures in sheep. Mol. Biol. Evol. 2014;31:3324–3343. doi: 10.1093/molbev/msu264.
- 22. Engelman JA, Luo J, Cantley LC. The evolution of phosphatidylinositol 3-kinases as regulators of growth and metabolism. Nature Rev. Genet. 2006;7:606–619. doi: 10.1038/nrg1879.
- 23. Winder WW, Hardie DG. AMP-activated protein kinase, a metabolic master switch: possible roles in Type 2 diabetes. Am. J. Physiol. Endocrinol. Metab. 1999;277:E1–E10. doi: 10.1152/ajpendo.1999.277.1.E1.
- 24. Monty DE, Kelly LM, Rice WR. Acclimatization of St. Croix, Karakul and Rambouillet sheep to intense and dry summer heat. Small Ruminant Res. 1991;4:379–392. doi: 10.1016/0921-4488(91)90083-3.
- 25. McManus C, et al. Skin and coat traits in sheep in Brazil and their relation with heat tolerance. Trop. Anim. Health Prod. 2011;43:121–126. doi: 10.1007/s11250-010-9663-6.
- 26. Wu MY, Hill CS. Tgf-β superfamily signaling in embryonic development and homeostasis. Dev. Cell. 2009;16:329–343. doi: 10.1016/j.devcel.2009.02.012.
- 27. Chen D, Zhao M, Mundy GR. Bone morphogenetic proteins. Growth Factors. 2004;22:233–41. doi: 10.1080/08977190412331279890.
- 28. Gould SE, Day M, Jones SS, Dorai H. BMP-7 regulates chemokine, cytokine, and hemodynamic gene expression in proximal tubule cells. Kidney Int. 2002;61:51–60. doi: 10.1046/j.1523-1755.2002.00103.x.
- 29. Zeisberg M, et al. Bone morphogenic protein-7 inhibits progression of chronic renal fibrosis associated with two genetic mouse models. Am. J. Physiol. Renal Physiol. 2003;285:F1060–F1067. doi: 10.1152/ajprenal.00191.2002.
- 30. Cieslak M, Reissmann M, Hofreiter M, Ludwig A. Colours of domestication. Biol. Rev. 2011;86:885–899. doi: 10.1111/j.1469-185X.2011.00177.x.
- 31. Pausch H, et al. Identification of QTL for UV-protective eye area pigmentation in cattle by progeny phenotyping and genome-wide association analysis. PLoS One. 2012;7:e36346. doi: 10.1371/journal.pone.0036346.
- 32. Wu H, et al. Camelid genomes reveal evolution and adaptation to desert environments. Nature Comms. 2014;5:5188. doi: 10.1038/ncomms6188.
- 33. Weeda G, et al. A presumed DNA helicase encoded by ERCC-3 is involved in the human repair disorders xeroderma pigmentosum and Cockayne’s syndrome. Cell. 1990;62:777–791. doi: 10.1016/0092-8674(90)90122-U.
- 34. Weeda G, et al. A mutation in the XPB/ERCC3 DNA repair transcription gene, associated with trichothiodystrophy. Am. J. Hum. Genet. 1997;60:320–329.
- 35. Riou L, et al. The relative expression of mutated XPB genes results in xeroderma pigmentosum/Cockayne’s syndrome or trichothiodystrophy cellular phenotypes. Hum. Mol. Gen. 1999;8:1125–1133. doi: 10.1093/hmg/8.6.1125.
- 36. Yu M, et al. Deficiency in nucleotide excision repair family gene activity, especially ERCC3, is associated with non-pigmented hair fiber growth. PLoS One. 2012;7:e34185. doi: 10.1371/journal.pone.0034185.
- 37. Kalinin AE, Kajava AV, Steinert PM. Epithelial barrier function: assembly and structural features of the cornified cell envelope. Bioessays. 2002;24:789–800. doi: 10.1002/bies.10144.
- 38. Eckert RL, Sturniolo MT, Broome AM, Ruse M, Rorke EA. Transglutaminase function in epidermis. J. Invest. Dermatol. 2005;124:481–492. doi: 10.1111/j.0022-202X.2005.23627.x.
- 39. Hitomi K. Transglutaminases in skin epidermis. Eur. J. Dermatol. 2005;15:313–319.
- 40. Pilati C, et al. Mutational signature analysis identifies MUTYH deficiency in colorectal cancers and adrenocortical carcinomas. J. Pathol. 2017;242:10–15. doi: 10.1002/path.4880.
- 41. Poulsen MLM, Bisgaard ML. MUTYH Associated Polyposis (MAP). Curr. Genomics. 2008;9:420–435. doi: 10.2174/138920208785699562.
- 42. Smith CE, et al. Dominant mutations in S. cerevisiae PMS1 identify the mlh-Pms1 endonuclease active site and an exonuclease 1-independent mismatch repair pathway. PLoS Genet. 2013;9:e1003869. doi: 10.1371/journal.pgen.1003869.
- 43. Tsao N, Lee MH, Zhang W, Cheng YC, Chang ZF. The contribution of CMP kinase to the efficiency of DNA repair. Cell Cycle. 2015;14:354–363. doi: 10.4161/15384101.2014.987618.
- 44. England K, O’Driscoll C, Cotter TG. Carbonylation of glycolytic proteins is a key response to drug-induced oxidative stress and apoptosis. Cell Death Differ. 2004;11:252–260. doi: 10.1038/sj.cdd.4401338.
- 45. Lewandowska A, Gierszewska M, Marszalek J, Liberek K. Hsp78 chaperone functions in restoration of mitochondrial network following heat stress. Biochim. Biophys. Acta. 2006;1763:141–151. doi: 10.1016/j.bbamcr.2006.01.007.
- 46. Song XL, Qian LJ, Li FZ. Injury of heat-stress to rat cardiomyocytes. Chin. J. Appl. Physiol. 2000;16:227–230.
- 47. Rhoads RP, Baumgard LH, Suagee JK, Sanders SR. Nutritional interventions to alleviate the negative consequences of heat stress. Adv. Nutr. 2013;4:267–276. doi: 10.3945/an.112.003376.
- 48. Du J, Di HS, Guo L, Li ZH, Wang GL. Hyperthermia causes bovine mammary epithelial cell death by a mitochondrial-induced pathway. J. Thermal Biol. 2008;33:37–47. doi: 10.1016/j.jtherbio.2007.06.002.
- 49. Kawahara A, Chien CB, Dawid IB. The homeobox gene mbx is involved in eye and tectum development. Dev. Biol. 2002;248:107–117. doi: 10.1006/dbio.2002.0709.
- 50. Miyamoto T, et al. Mbx, a novel mouse homeobox gene. Dev. Genes Evol. 2002;212:104–106. doi: 10.1007/s00427-002-0217-4.
- 51. Shahbazi J, Lock R, Liu T. Tumor Protein 53-Induced Nuclear Protein 1 Enhances p53 Function and Represses Tumorigenesis. Front. Genet. 2013;4:80. doi: 10.3389/fgene.2013.00080.
- 52. Sándor N, et al. TP53inp1 gene is implicated in early radiation response in human fibroblast cells. Int. J. Mol. Sci. 2015;16:25450–25465. doi: 10.3390/ijms161025450.
- 53. Waskiewicz AJ, et al. Phosphorylation of the cap-binding protein eukaryotic translation initiation factor 4E by protein kinase Mnk1 in vivo. Mol. Cell. Biol. 1999;19:1871–1880. doi: 10.1128/MCB.19.3.1871.
- 54. Fortin CF, Mayer TZ, Cloutier A, McDonald PP. Translational control of human neutrophil responses by MNK1. J. Leukoc. Biol. 2013;94:693–703. doi: 10.1189/jlb.0113012.
- 55. Guo Y, Yang MC, Weissler JC, Yang YS. PLAGL2 translocation and SP-C promoter activity – a cellular response of lung cells to hypoxia. Biochem. Biophys. Res. Commun. 2007;360:659–665. doi: 10.1016/j.bbrc.2007.06.106.
- 56. Padgett DA, Glaser R. How stress influences the immune response. Trends Immunol. 2003;24:444–448. doi: 10.1016/S1471-4906(03)00173-X.
- 57. Salak-Johnson JL, McGlone JJ. Making sense of apparently conflicting data: stress and immunity in swine and cattle. J. Anim. Sci. 2007;85:E81–E88. doi: 10.2527/jas.2006-538.
- 58. Cupps TR, Fauci AS. Corticosteroid-mediated immunoregulation in man. Immunol. Rev. 1982;65:133–155. doi: 10.1111/j.1600-065X.1982.tb00431.x.
- 59. Chandel NS, Trzyna WC, McClintock DS, Schumacker PT. Role of oxidants in NF-kappa B activation and TNF-alpha gene transcription induced by hypoxia and endotoxin. J. Immunol. 2000;165:1013–21. doi: 10.4049/jimmunol.165.2.1013.
- 60. Fitzgerald DC, et al. Tumour necrosis factor-alpha (TNF-alpha) increases nuclear factor kappaB (NFkappaB) activity in and interleukin-8 (IL-8) release from bovine mammary epithelial cells. Vet. Immunol. Immunopathol. 2007;116:59–68. doi: 10.1016/j.vetimm.2006.12.008.
- 61. Sun CC, et al. Peroxiredoxin 1 (Prx1) is a dual-function enzyme by possessing Cys-independent catalase-like activity. Biochem. J. 2017;474:1373–1394. doi: 10.1042/BCJ20160851.
- 62. He T, et al. Peroxiredoxin 1 knockdown potentiates β-lapachone cytotoxicity through modulation of reactive oxygen species and mitogen-activated protein kinase signals. Carcinogenesis. 2013;34:760–769. doi: 10.1093/carcin/bgs389.
- 63. Takaoka A, et al. DAI (DLM-1/ZBP1) is a cytosolic DNA sensor and an activator of innate immune response. Nature. 2007;448:501–505. doi: 10.1038/nature06013.
- 64. Rathinam VA, Fitzgerald KA. Innate immune sensing of DNA viruses. Virology. 2011;411:153–162. doi: 10.1016/j.virol.2011.02.003.
- 65. Jing Z, et al. Chromosome 1 open reading frame 190 promotes activation of NF-kB canonical pathway and resistance of dendritic cells to tumor-associated inhibition in vitro. J. Immunol. 2010;185:6719–6727. doi: 10.4049/jimmunol.0903869.
- 66. Loukinov DI, et al. BORIS, a novel male germ-line-specific protein associated with epigenetic reprogramming events, shares the same 11-zinc-finger domain with CTCF, the insulator protein involved in reading imprinting marks in the soma. Proc. Nat. Acad. Sci. USA. 2002;99:6806–6811. doi: 10.1073/pnas.092123699.
- 67. Ohlsson R, Renkawitz R, Lobanenkov V. CTCF is a uniquely versatile transcription regulator linked to epigenetics and disease. Trends Genet. 2001;17:520–527. doi: 10.1016/S0168-9525(01)02366-6.
- 68. Huang N, et al. A screen for genomic disorders of infertility identifies MAST2 duplications associated with non-obstructive azoospermia in humans. Biol. Reprod. 2015;93: 61:1–10. doi: 10.1095/biolreprod.115.131185.
- 69. Røsok O, Pedeutour F, Ree AH, Aasheim HC. Identification and characterisation of TESK2, a novel member of the LIMK/TESK family of protein kinases, predominantly expressed in testis. Genomics. 1999;61:44–54. doi: 10.1006/geno.1999.5922.
- 70. Baudat F, Manova K, Yuen JP, Jasin M, Keeney S. Chromosome synapsis defects and sexually dimorphic meiotic progression in mice lacking Spo11. Mol. Cell. 2000;6:989–998. doi: 10.1016/S1097-2765(00)00098-8.
- 71. Romanienko PJ, Camerini-Otero RD. The mouse Spo11 gene is required for meiotic chromosome synapsis. Mol. Cell. 2000;6:975–987. doi: 10.1016/S1097-2765(00)00097-6.
- 72. Bradley L, Yaworsky PJ, Walsh FS. Myostatin as a therapeutic target for musculoskeletal disease. Cell. Mol. Life Sci. 2008;65:2119–2124. doi: 10.1007/s00018-008-8077-3.
- 73. Wang Y-X, et al. Regulation of muscle fiber type and running endurance by PPARδ. PLoS Biol. 2004;2:e294. doi: 10.1371/journal.pbio.0020294.
- 74. Mosher DS, et al. A mutation in the myostatin gene increases muscle mass and enhances racing performance in heterozygote dogs. PLoS Genet. 2007;3:e79. doi: 10.1371/journal.pgen.0030079.
- 75. Tjondrokoesoemo A, et al. Disrupted membrane structure and intracellular Ca2+ signaling in adult skeletal muscle with acute knockdown of Bin1. PLoS One. 2011;6:e25740. doi: 10.1371/journal.pone.0025740.
- 76. Zhou K, Hong T. Cardiac BIN1 (cBIN1) is a regulator of cardiac contractile function and an emerging biomarker of heart muscle health. Sci. China Life Sci. 2017;60:257–263. doi: 10.1007/s11427-016-0249-x.
- 77. Purcell S, et al. PLINK: a tool set for whole-genome association and population based linkage analyses. Am. J. Hum. Genet. 2007;81:559–575. doi: 10.1086/519795.
- 78. Jombart T, Ahmed I. adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. 2011;27:3070–3071. doi: 10.1093/bioinformatics/btr521.
- 79. R-Development Core Team. R: A Language and Environment for Statistical Computing Version 3.0.1. Vienna, Austria. R Foundation for Statistical Computing (2013).
- 80. Akey JM, et al. Tracking footprints of artificial selection in the dog genome. Proc. Nat. Acad. Sci. USA. 2010;107:1160–1165. doi: 10.1073/pnas.0909918107.
- 81. Gautier M, Vitalis R. rehh: an R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics. 2012;28:1176–1177. doi: 10.1093/bioinformatics/bts115.
- 82. Browning SR, Browning BL. Rapid and accurate haplotype phasing and missing data inference for whole genome association studies using localized haplotype clustering. Am. J. Hum. Genet. 2007;81:1084–1097. doi: 10.1086/521987.
- 83. Kuleshov MV, et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44:W90–W97. doi: 10.1093/nar/gkw377.