Abstract
The quantities and proportions of protein fractions have notable effects on the nutritional and technological value of milk. Although much is known about the effects of genetic variants on milk proteins, the complex relationships among the genes and pathways regulating the synthesis and secretion of the different protein fractions into milk in dairy cows are still not completely understood. We conducted genome-wide association studies (GWAS) for milk nitrogen fractions in a cohort of 1,011 Brown Swiss cows, which uncovered 170 significant single nucleotide polymorphisms (SNPs), mostly located on BTA6 and BTA11. Gene-set analysis and the network-based Association Weight Matrix approach revealed that the genes associated with milk proteins were involved in several biological functions, particularly ion and cation transmembrane transporter activity and neuronal and hormone signalling, in accordance with the structure and function of casein micelles. Deeper analysis of the transcription factors and their predicted target genes within the network revealed that GFI1B, ZNF407 and NR5A1 might act as master regulators of milk protein synthesis and secretion. The information acquired provides novel insight into the regulatory mechanisms controlling milk protein synthesis and secretion in the bovine mammary gland and may be useful in breeding programmes aimed at improving milk nutritional and/or technological properties.
Introduction
Milk is an important source of high-quality proteins owing to their high content of essential amino acids, such as lysine, which is deficient in many human diets1, and their well-known physiological effects, such as immunomodulatory and gastrointestinal activities2. The main proteins in bovine milk are the four key caseins (CN), namely αS1-CN, αS2-CN, β-CN and κ-CN, which are organized in micelles and account for about 80% of the total protein content. Casein micelles have a role in concentrating, stabilizing and transporting essential nutrients in milk, mainly Ca2+ and proteins, to the offspring3. The other protein category is the whey protein fraction, which consists mainly of β-lactoglobulin (β-LG) and α-lactalbumin (α-LA), together with immunoglobulins, serum albumin, lactoferrin, lactoperoxidase and a minor component corresponding to glycomacropeptide3. This fraction makes up approximately 20% of total milk proteins4 and has been shown to affect satiety by reducing food intake, stimulating satiating gut hormone production and slowing stomach emptying in humans and animal models (reviewed by Sánchez-Moya et al.5).
Milk protein content and composition influence milk technological properties (MCPs) and are therefore important for the dairy industry, especially in Europe, where the majority of milk produced is transformed into cheese6. Milk coagulation, curd structure, curd firmness and cheese yield are directly related to casein content7. Additionally, genetic variants of milk protein fractions, and particularly of κ-CN, strongly influence MCPs; κ-CN B milk is indeed characterised by an increased κ-CN content, which favourably affects MCPs8. Moreover, milk payment systems in the dairy sectors producing hard cheeses with EU Protected Designation of Origin (PDO) status often include coagulation and curd firming properties among their payment criteria. These properties are strongly affected by the amounts, proportions and genetic variants of milk protein fractions8, which are in turn related to cheese quality and sensory properties9,10. Different milk protein fractions and genetic variants (such as the A1 and A2 variants of β-CN) also seem to affect human health and wellbeing in different ways11,12.
In recent decades, there have been extraordinary advances in our knowledge of the physiology and biochemistry of the lactating mammary gland. Despite such efforts, little is yet known of the genetic regulation of the physiological and cellular mechanisms required for milk protein synthesis and secretion. It is well known that milk protein synthesis in the mammary gland depends on hormonal and developmental cues that modulate the transcriptional and translational regulation of genes through the activity of specific transcription factors, non-coding RNAs and alterations of the chromatin structure in the mammary epithelial cells13,14. The interplay between all the aforementioned factors might play a key role in milk protein synthesis, which is crucial at the onset of and throughout lactation in high-producing dairy cattle. Recently, it has also been shown that CN phosphorylation, one of the most important factors controlling the stabilization of calcium phosphate nanoclusters in casein micelles and the internal structure of the casein micelles15, is also essential for the protein synthesis machinery in the mammary gland. Differences in the phosphorylation of αS1-CN may be of particular interest as it represents 40% of the total CN fraction in bovine milk16. The possibility of tailoring milk composition, e.g., to obtain milk with high protein content and/or favourable MCPs, would make it possible to meet specific demands from the cheese industry and consumers, and therefore represents a highly desirable goal for the dairy industry. Since milk protein composition is less responsive to diet than milk fat content17, genomic selection may offer a valid alternative for optimising milk protein nutritional value in relation to human health7 while maximizing economic returns for the dairy industry.
There are substantial differences among bovine breeds in the proportions of milk protein fractions and in the frequencies of protein genotypes18. Several studies have investigated the effects of genetic variants of CN and β-LG genes on the milk protein content and cheese-making ability8,18,19. However, other loci seem to contribute to regulating the proportions and characteristics of milk proteins, suggesting that regulation is shared among different genes16,20–26. Deeper knowledge of the set of genes and pathways regulating bovine milk protein synthesis and secretion might, therefore, help to identify their contribution to optimising casein and whey protein contents during lactation. Pathway-based and gene network analyses have often been used as complementary approaches for extracting biological information from genome-wide association studies (GWAS) and for better characterising the genomic structure of complex traits21,22.
To date, only one study has explored this type of integrated analysis for milk protein fractions (albeit limited to κ-CN and β-LG and a small cohort of 164 lactating cows), and it suggests that, in addition to the role played by single genes, a complex multi-hormonal system regulates the expression of milk proteins and the interactions between mammary epithelial cells and the components of the extracellular matrix23. Nevertheless, no GWAS of Brown Swiss populations aimed at unravelling the genomic architecture controlling milk protein synthesis and secretion has yet been reported. The aims of this study, therefore, were: i) to perform a GWAS to identify genomic regions associated with the proportions of non-protein nitrogen (N) and protein fractions in milk samples from 1,011 Brown Swiss cows; ii) to uncover the biological functions regulating the milk N compound profile through gene-set enrichment analysis; and iii) to use an association weight matrix (AWM) approach24, based on in silico SNP co-associations, to identify regulatory networks associated with milk protein synthesis, metabolism and secretion in cattle.
Results
GWAS analysis
Summary statistics and genomic heritabilities for milk N fractions calculated from a cohort of 1,011 Italian Brown Swiss cows are reported in Table 1. Overall, very high genomic heritabilities were found for the proportions of β-CN (0.833), κ-CN (0.681) and αS1-CN (0.661) out of the total nitrogenous compounds. Of the whey proteins, the β-LG proportion also had high heritability (0.558), while the estimates for α-LA were decidedly lower (0.194). Heritabilities of milk non-protein N compounds were moderate (0.363 for minor N compounds, 0.248 for urea).
Table 1.
Trait 1 | Mean | SD | h2 | #SNP2 |
---|---|---|---|---|
Milk yield, kg/d | 24.26 | 7.96 | 0.094 | 2 |
True protein N, % total milk N | 89.05 | 2.29 | 0.402 | 21 |
Milk N fractions, % total milk N | ||||
Caseins | 77.97 | 1.25 | 0.133 | 4 |
β-CN | 32.14 | 2.45 | 0.833 | 64 |
κ-CN | 9.48 | 1.48 | 0.681 | 74 |
αS1-CN | 25.71 | 1.85 | 0.661 | 39 |
αS1P-CN | 1.45 | 0.62 | 0.171 | 3 |
αS1P/αS1-CN | 0.06 | 0.03 | 0.183 | 3 |
αS2-CN | 9.19 | 1.14 | 0.365 | 32 |
Whey proteins | 11.08 | 1.70 | 0.523 | 32 |
β-LG | 8.72 | 1.56 | 0.558 | 29 |
α-LA | 2.36 | 0.51 | 0.194 | 7 |
Other N compounds | 10.95 | 2.28 | 0.402 | 21 |
Minor N compounds | 7.94 | 2.37 | 0.363 | 17 |
MUN | 3.01 | 1.04 | 0.248 | 4 |
1True Protein nitrogen (N) and milk N fractions are expressed as percentage of total milk N; αS2-CN: αS2-casein; α-LA: α-lactalbumin; β-LG: β-lactoglobulin; β-CN: β-casein; κ-CN: κ-casein; αS1-CN: αS1-casein; αS1P-CN/αS1-CN: ratio between αS1(phosphorylated)-casein and αS1-casein; αS1P-CN: αS1(phosphorylated)-casein; caseins: Σcaseins (β-CN+ κ-CN+ αS1-CN+ αS1P-CN+ αS2-CN+ αS1P/αs1-CN); Whey proteins: Σ whey proteins (α-LA + β-LG). Other N compounds: other N compounds (Σurea + minor N compounds); Minor N compounds: minor N compounds (e.g., small peptides, ammonia, creatine, creatinine, etc.); MUN: milk urea N.
SD: standard deviation; h2: genomic heritability.
2#SNP: number of significant SNPs (P < 5 × 10⁻⁵) for each trait.
Table 2 and Supplementary Table S1 report the results of the GWAS analysis. A total of 170 SNPs were significant, mainly located on two Bos taurus autosomes (BTAs), BTA6 and BTA11. Three regions were detected on BTA6, which showed associations with 11 traits (Fig. 1). Region 6a (~37.02–39.60 Mbp) included 3 SNPs close to the significance threshold that were associated with the total CN percentage and milk yield (MY). Region 6b (~68.55–74.85 Mbp) corresponded to 17 SNPs associated with αS2-CN, β-CN and κ-CN. A total of 103 signals were detected in region 6c (~77.19–99.45 Mbp), with significant associations with MY, all the CN fractions except for αS1P-CN and αS1P/αS1-CN, the two whey proteins (α-LA and β-LG), and other N compounds except for milk urea N (MUN). Very high peaks corresponding to κ-CN, β-CN and αS1-CN were detected in this region. In particular, the highest signal corresponded to the marker Hapmap52348-rs29024684 (~87.40 Mbp), which was significantly associated with κ-CN (P = 5.05443E-59). The proportion of additive genetic variance (Va) explained by this SNP was 71.60% (see Supplementary Table S1). Other peaks corresponded to Hapmap28023-BTC-060518 (~87.20 Mbp), which was associated with β-CN (P = 1.72926E-52, Va = 49.67%) and αS1-CN (P = 1.2914E-39, Va = 39.56%), and Hapmap24184-BTC-070077 (~87.25 Mbp), which was associated with β-CN (P = 2.60856E-50, Va = 47.55%) (see Supplementary Table S1). Moderate linkage disequilibrium (LD) was observed between Hapmap52348-rs29024684 and Hapmap28023-BTC-060518, and between Hapmap52348-rs29024684 and Hapmap24184-BTC-070077 (r2 = 0.35). The markers Hapmap28023-BTC-060518 and Hapmap24184-BTC-070077 were in full LD (r2 = 1) (see Supplementary Fig. S1). Two regions were detected on the tail part of BTA11: region 11a (~94.69–98.89 Mbp), containing 7 significant SNPs, and region 11b (~101.27–106.54 Mbp), containing 22 SNPs. Both regions were significantly associated with β-LG, whey proteins, other N compounds and minor N compounds (Table 2; Fig. 2).
The highest signals were detected in region 11b and corresponded to markers ARS-BFGL-NGS-115328 (~103.11 Mbp), associated with β-LG (P = 1.12371E-20), and ARS-BFGL-NGS-104610 (~104.29 Mbp), associated with β-LG (P = 6.92605E-24) and total whey proteins (WP; P = 1.29446E-20). The markers BTA-76907-no-rs and ARS-BFGL-NGS-110734 had undefined positions on the genome and showed highly significant associations with κ-CN (P = 2.80E-16) and β-CN (P = 6.16E-15) (see Supplementary Table S1).
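The pairwise LD values quoted above can be illustrated with a minimal sketch. The text does not say how r2 was computed, so this example assumes the common genotype-based approximation: the squared Pearson correlation between two markers coded as 0/1/2 allele counts. The function name `ld_r2` and the toy genotype vectors are hypothetical and are not taken from the study data.

```python
from statistics import mean


def ld_r2(geno_a, geno_b):
    """Squared Pearson correlation between two markers coded 0/1/2.

    Assumption: genotype-based r2 is used as a proxy for haplotype-based LD.
    """
    if len(geno_a) != len(geno_b) or len(geno_a) < 2:
        raise ValueError("genotype vectors must have equal length >= 2")
    mean_a, mean_b = mean(geno_a), mean(geno_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - mean_a) ** 2 for a in geno_a)
    var_b = sum((b - mean_b) ** 2 for b in geno_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic marker")
    return cov * cov / (var_a * var_b)


# Hypothetical toy genotypes for eight animals (not study data).
snp_1 = [0, 1, 2, 1, 0, 2, 1, 0]
snp_2 = [0, 1, 2, 1, 0, 2, 1, 0]  # identical to snp_1, so in full LD
snp_3 = [0, 1, 2, 0, 1, 2, 0, 1]  # only partly correlated with snp_1

full_ld = ld_r2(snp_1, snp_2)
partial_ld = ld_r2(snp_1, snp_3)
```

Two markers with identical genotype vectors give r2 = 1, which is the situation described for Hapmap28023-BTC-060518 and Hapmap24184-BTC-070077; intermediate values correspond to the moderate LD reported with the top κ-CN marker.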
Table 2.
BTA 1 | #SNP | Interval, Mbp | P-value (range) | Top SNP | Top SNP location, bp | Top SNP MAF | Trait 2 |
---|---|---|---|---|---|---|---|
1 | 1 | — | 2.75E-05 | BTB-01778303 | 151883849 | 0.02 | αS2-CN |
3 | 1 | — | 4.64E-05 | ARS-BFGL-NGS-100159 | 88864456 | 0.49 | α-LA |
3 | 1 | — | 1.23E-05 | ARS-BFGL-NGS-33061 | 44364191 | 0.01 | CN |
4 | 1 | — | 3.68E-05 | BTB-01672972 | 21194199 | 0.01 | Other N, protein |
4 | 1 | — | 3.29E-05 | BTB-01066453 | 53857273 | | Other N, protein |
4 | 2 | 73.60–73.84 | (7.34E-06, 2.72E-05) | BTA-71368-no-rs | 73837632 | 0.05 | MUN |
5 | 1 | — | 1.8E-05 | Hapmap44167-BTA-95489 | 82944314 | 0.07 | MUN |
6a | 3 | 37.02–39.60 | (1.64E-05, 2.23E-05) | Hapmap31921-BTC-033863 | 37019972 | 0.05 | MY, CN |
6b | 16 | 68.55–74.85 | (5.86E-08, 4.5E-05) | Hapmap29639-BTC-041962 | 71350048 | 0.02 | αS2-CN, β-CN, κ-CN |
6c | 105 | 77.19–99.45 | (5.05E-59, 4.96E-05) | Hapmap52348-rs29024684 | 87396306 | 0.24 | κ-CN, β-CN, αS2-CN, αS1-CN, MY, α-LA, Nmin, WP, β-LG, protein, Other N |
9 | 1 | — | 4.34E-05 | BTA-21753-no-rs | 36790663 | 0.01 | αS1-CN |
11a | 7 | 94.69–98.89 | (2.36E-07, 3.60E-05) | Hapmap56906-rs29014970 | 97844929 | 0.31 | β-LG, WP, protein, Other N, Nmin |
11b | 22 | 101.27–106.54 | (6.93E-24, 4.94E-05) | ARS-BFGL-NGS-104610 | 104293559 | 0.45 | β-LG, WP, Other N, protein, Nmin |
13 | 1 | — | 2.9E-05 | ARS-BFGL-NGS-108308 | 28999095 | 0.23 | MUN |
14 | 1 | — | 2.16E-05 | BTA-02620-rs29010169 | 45601728 | 0.01 | αS1P/αS1-CN, αS1P-CN |
20 | 1 | — | 1.27E-05 | ARS-BFGL-NGS-102102 | 10233876 | 0.37 | αS1P-CN, αS1P/αS1-CN |
20 | 1 | — | 6.37E-06 | Hapmap51592-BTA-41521 | 46709345 | 0.37 | αS1P/αS1-CN, αS1P-CN |
20 | 1 | — | 5.85E-06 | BTB-01648552 | 58264762 | 0.42 | Protein, Nmin, Other N |
24 | 1 | — | 4.22E-05 | ARS-BFGL-BAC-42839 | 4118163 | 0.11 | Nmin |
25 | 1 | — | 5.19E-06 | Hapmap31994-BTC-065943 | 5385729 | 0.14 | CN |
#SNP = number of the single nucleotide polymorphisms significantly associated to the trait; Interval: The region on the chromosome spanned among the significant SNP(s) (in Mb); P-value (range) = The P-value of the highest significant SNP adjusted for genomic control and the range of the P-values when multiple SNP were significantly associated to one trait; Top SNP location (bp) = position of the highest significant SNP on the chromosome in base pairs on UMD3.1 (http://www.ensembl.org/index.html); Top SNP MAF = minor allele frequency of the top SNP.
2True Protein nitrogen (N) and milk N fractions are expressed as percentage of total milk N; αS2-CN: αS2-casein; α-LA: α-lactalbumin; Other N: other N compounds (urea + minor nitrogen compounds); MY: milk yield; β-LG: β-lactoglobulin; β-CN: β-casein; κ-CN: κ-casein; αS1-CN: αS1-casein; Nmin: minor N compounds (e.g., small peptides, ammonia, creatine, creatinine, etc.); αS1P/αS1-CN: ratio between αS1(phosphorylated)-casein and αS1-casein; αS1P-CN: αS1(phosphorylated)- casein; CN: casein, Σcaseins (β-CN+ κ-CN+ αS1-CN+ αS1P-CN+ αS2-CN+ αS1P/αS1-CN); WP: whey proteins, Σ whey proteins (α-LA + β-LG); MUN: milk urea N.
The trait with the highest P-value in each genomic region is bolded.
Adjusting for the effect of the highest signals for κ-CN and β-LG altered the SNPs with the most significant associations (see Supplementary Table S1). The genetic variance explained by the SNPs for the κ-CN proportion decreased dramatically (0.124 vs 1.138; −89.1%), as did heritability (0.325 vs 0.681; −73.3%). Significant decreases were also observed for the proportions of β-CN (−43.9% genetic variance, −23.5% heritability) and of β-LG, although to a lesser extent (−23.4% genetic variance, −11.0% heritability) (see Supplementary Table S1).
Pathway analysis
Of the total 37,568 SNPs used in this study, 17,006 were located in the 15 kb flanking region of the annotated genes. These were assigned to 13,269 genes on the basis of the UMD3.1 bovine genome sequence assembly. On average, a total of 600 genes showed significant associations (P < 0.05) with MY or milk N fractions. To gain a better understanding of the functional implications of these 600 significant genes, we performed pathway analyses in order to identify over-represented biological processes. On the one hand, the total CN percentage was significantly enriched by K+ transport pathways, including 7 over-represented gene ontology (GO) categories, e.g., K+ ion transmembrane transport (q = 0.00015), voltage-gated K+ channel complex (q = 6.07E-06) and K+ channel activity (q = 1.16E-05; Fig. 3a). The plasma membrane, plasma membrane protein complex and cell-periphery cellular components were also significantly enriched for CN (q = 0.00011, q = 1.33E-05 and q = 8.94E-05, respectively; Fig. 3a). On the other hand, over-represented pathways for β-CN included cellular responses to stimuli, e.g., alcohol (q = 2.89E-06), corticosteroid hormones (q = 2.30E-05) and ketone bodies (q = 4.54E-05; Fig. 3a). Minor N compounds (Nmin) were significantly associated with the metal ion transport pathways (q = 1.04E-05) (Fig. 3a). The full list of significantly enriched pathways (q < 0.05) is given in Supplementary Table S2.
In addition, the most significant over-represented KEGG pathways for κ-CN included genes involved with Ca2+ homeostasis, Ca2+ cycling and elevation in intracellular Ca2+, as well as hypertrophic cardiomyopathy (HCM) processes (q = 7.22E-06), arrhythmogenic right ventricular cardiomyopathy (ARVC) (q = 2.73E-05) and dilated cardiomyopathy (DCM) (q = 8.63E-05; Fig. 3b). Axon guidance was enriched for total CN (q = 3.92E-07), while salivary secretion was associated with αS1-CN (q = 5.20E-05). Fc γ R-mediated phagocytosis was associated with αS1P-CN (q = 8.86E-05) (Fig. 3b and Supplementary Table S2).
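The two steps described in this section, assigning SNPs to genes within a 15 kb flank and testing gene sets for over-representation, can be sketched as follows. This is only an illustration under stated assumptions: the section does not name the enrichment test, so a one-sided hypergeometric test is assumed here, and the multiple-testing correction that yields the reported q-values is not shown. The gene `GENE_X`, its coordinates, the SNP labels and the term size of 40 genes with 8 hits are all hypothetical.

```python
from math import comb

FLANK_BP = 15_000  # 15 kb flanking window, as stated in the text


def assign_snps_to_genes(snps, genes, flank=FLANK_BP):
    """Map each gene to the SNPs lying within its flank-extended interval.

    snps:  iterable of (snp_id, chromosome, position_bp)
    genes: iterable of (gene_id, chromosome, start_bp, end_bp)
    """
    hits = {}
    for snp_id, chrom, pos in snps:
        for gene_id, g_chrom, start, end in genes:
            if chrom == g_chrom and start - flank <= pos <= end + flank:
                hits.setdefault(gene_id, []).append(snp_id)
    return hits


def hypergeom_enrichment_p(n_background, n_in_term, n_significant, n_overlap):
    """Upper-tail hypergeometric P(X >= n_overlap).

    Assumption: over-representation is tested with a hypergeometric model.
    """
    upper = min(n_in_term, n_significant)
    total = comb(n_background, n_significant)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_significant - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical coordinates: snp_a lies ~3.7 kb upstream of GENE_X, snp_b is far away.
toy_snps = [("snp_a", "6", 87_396_306), ("snp_b", "6", 87_300_000)]
toy_genes = [("GENE_X", "6", 87_400_000, 87_420_000)]
mapping = assign_snps_to_genes(toy_snps, toy_genes)

# Background of 13,269 genes with 600 significant genes (figures from the text);
# the term size (40) and the overlap (8) are made up for illustration.
p_value = hypergeom_enrichment_p(13_269, 40, 600, 8)
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for a background of this size, at the cost of speed when many terms are tested.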
Gene network analyses
A total of 15,277 annotated SNPs were used for the AWM construction and the SNP co-association analyses. The AWM matrix was then built using a total of 15 phenotypes and the 1,917 SNPs that were significantly associated with at least one of these phenotypes (selected after applying the filtering steps described in the Material and Methods section). These SNPs corresponded to 1,917 unique genes. The SNPs selected by the AWM method explained 72% of the phenotypic variance for κ-CN, which was significantly larger (P < 0.001) than the average variance (46%) explained by the same number of randomly selected SNPs (10,000 replicates). Hierarchical clustering of traits was first performed to describe the set of phenotypes that were inevitably correlated with one another. The milk N fraction profiles clustered into three groups: the first comprised the minor N compounds, the second comprised the whey proteins, total CN and the αS-CN fraction, while the third included β-CN, κ-CN, urea, αS1P-CN and the αS1P/αS1-CN ratio (Supplementary Fig. S2). Then, operating on the rows of the AWM matrix, the correlations between all pairs of genes were used to predict gene interactions and generate a regulatory network for the milk N fractions, where the nodes are genes and the edges represent significant interactions between nodes. The PCIT algorithm identified a total of 235,764 edges connecting the 1,917 nodes. After retaining only correlations with absolute values ≥ 0.80, we obtained a regulatory network with 101,284 edges and 1,904 nodes. The analysis of the network topological parameters, e.g., closeness centrality and betweenness centrality, revealed that the genes related to the ion transport pathway (e.g., ITPR2, IQGAP1, TP53RK and LACE1), protein metabolism (e.g., METAP1 and PRC1) and axon guidance (e.g., NTNG1 and ROBO3) might have an important influence on the regulatory network.
Ranking the nodes according to their degree (number of significant interactions), we found BPIFB1 and FAM169A at the top of the list with 481 and 477 edges, respectively (Supplementary Table S3). Analysis with the LASAGNA tool, which predicts transcription factor (TF) binding sites in the genes’ promoter regions, showed that the promoters of BPIFB1 and FAM169A contained binding sites for several TFs involved in regulating milk protein synthesis, such as GR, ER, STAT5A, C/EBP and YY1 (Supplementary Table S4). Additionally, we detected other highly connected nodes within our regulatory network, including the K+ channel KCNK9 (with 455 edges), transporters such as CRABP1 (450 edges) and SLC4A7 (420 edges), and the phosphatase PLPP7 (located 2 Mb from PAEP; 418 edges) (Supplementary Table S3).
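The filtering and ranking steps described above can be sketched in a few lines. Note the simplification: the study used the PCIT algorithm (partial correlations combined with information theory) to call significant edges, whereas this sketch substitutes plain Pearson correlations between AWM rows and applies only the |r| ≥ 0.80 cut-off. The gene labels and effect values in `toy_awm` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant row carries no co-association signal
    return sxy / sqrt(sxx * syy)


def build_network(awm, threshold=0.80):
    """Keep gene pairs whose absolute row correlation reaches the threshold.

    Assumption: plain Pearson correlation stands in for PCIT here.
    """
    edges = []
    for gene_1, gene_2 in combinations(sorted(awm), 2):
        r = pearson(awm[gene_1], awm[gene_2])
        if abs(r) >= threshold:
            edges.append((gene_1, gene_2, r))
    return edges


def rank_by_degree(edges):
    """Count edges per node and sort by decreasing degree, then by name."""
    degree = {}
    for gene_1, gene_2, _ in edges:
        degree[gene_1] = degree.get(gene_1, 0) + 1
        degree[gene_2] = degree.get(gene_2, 0) + 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical AWM: rows are genes, columns are standardized SNP effects per trait.
toy_awm = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.1],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
    "gene_d": [1.0, -1.0, 1.0, -1.0],
}
toy_edges = build_network(toy_awm)
toy_ranking = rank_by_degree(toy_edges)
```

In this toy matrix, `gene_d` correlates only weakly with the other rows and therefore drops out of the network, which mirrors how the node count fell from 1,917 to 1,904 after filtering. The sign of each retained correlation is kept on the edge so that positive and negative co-associations can be told apart later.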
TFs act in a regulatory network and can drive or repress the expression of different genes in a feed-forward and feedback manner. Accordingly, a second network was generated to explore the main putative regulatory TFs in our regulatory network and the connectivity between them. We identified GFI1B, NR5A1 and ZNF407 as the “best” trio of TFs within our regulatory network. Altogether, they potentially regulated the transcription of 452 genes (about 24% of genes in the AWM matrix filtered for correlations ≥|0.80|; Fig. 4). Figure 5A,B show the distribution of the partial correlation coefficients in the full and TF networks. More sophisticated regulation patterns between the TFs and their target genes were provided by the LASAGNA promoter analyser. For instance, the promoters of GFI1B and NR5A1 were found to contain putative binding sites for TFs that are known to regulate milk protein synthesis (e.g. STAT5A, C/EBPbeta, YY1, NFκB, NF-1 and CREB; Supplementary Table S4; Fig. 4). Differences between the correlation values of the full regulatory network and the TFs network were apparent. The absolute correlation values of the full regulatory network ranged from 0.80 to 1.00, with a mean of 0.86, whereas the absolute correlation values of the TF network ranged from 0.80 to 0.99, with a mean of 0.86. Moreover, while NR5A1 repressed most of its target genes (63%), the proportions of repressed and induced target genes were similar for GFI1B and ZNF407 (Fig. 5).
To identify the most important cellular activities controlled by the regulatory network and the TFs network, we analysed over-represented GO biological process terms using ClueGO. The full list of enriched pathways and ontologies is reported in Supplementary Table S5. Most of the molecular functions that were commonly enriched in both the full and TF networks were related to ion and cation transmembrane transporter activity and phosphatidylinositol signalling (Fig. 5C). The two networks also shared a considerable number of pathways and biological processes related to neuronal and hormone (e.g. glucocorticoids and insulin) signalling, reproduction, nitrogenous compound metabolism and molecular transport (Supplementary Table S5). Several functions related to the Golgi apparatus were also enriched in both networks such as Golgi vesicle transport, regulation of Golgi organization, intra-Golgi vesicle-mediated transport and post-Golgi vesicle-mediated transport (Supplementary Table S5). In addition, processes and components belonging to the extracellular matrix (ECM), such as the proteinaceous extracellular matrix (q = 0.00418), and cell proliferation, e.g., epithelial cell proliferation (q = 0.03981), were significantly overrepresented in the full network (Supplementary Table S5). Immune system response was only over-represented in the TF network, e.g., “positive regulation of lymphocyte mediated immunity” (q = 0.03696) and “regulation of adaptive immune response based on somatic recombination of immune receptors built from immunoglobulin superfamily domains” (q = 0.04188) (Supplementary Table S5).
Discussion
GWAS analysis
We carried out GWAS analysis of the bovine milk N profile, including the main CN and whey protein fractions and non-protein N compounds. The genomic heritabilities we found were generally higher than those previously reported, which may be partly due to factors such as differences in breed, population size, analytical method, statistical model and data measurement unit (e.g., yield vs proportion)20–26. Heritabilities of single casein fractions such as κ-CN and β-CN were much higher than that of total caseins. This might be because single protein fractions (as well as totals) were expressed as percentages of total N, so that qualitative (and not quantitative) information was provided. Accordingly, proportions of single milk protein fractions do not share the same profile, nor do they necessarily vary in line with the totals. The same explanation might also apply to the number of significant SNPs (much lower in the case of total caseins). However, it is worth mentioning that when a less stringent P-value was used (as in the case of pathway analyses) the situation was reversed, suggesting that in the case of total caseins the significantly associated signals tended to be mostly weak. These findings might provide further indication that selection for individual milk protein fractions might be more effective than selection based on total caseins, especially when setting breeding programmes aimed at improving milk nutritional and/or technological properties.
As expected, our GWAS results confirmed the highest signals to be on BTA6 in the region of the casein cluster and its flanking region (~86.35–87.40 Mb), and on the tail part of BTA11 including the region of the PAEP gene (~101.27–106.54 Mb), in line with previous results20–23. The most significant SNPs for κ-CN (Hapmap52348-rs29024684), β-CN (Hapmap28023-BTC-060518 and Hapmap24184-BTC-070077, in full LD) and β-LG (ARS-BFGL-NGS-104610 and ARS-BFGL-NGS-115328) are located near (less than 1 Mb from) the causal mutations for protein variants25,27–35.
Even after adjusting for the effect of the most significant SNPs, we still detected high signals on BTA6. Apart from Hapmap28023-BTC-060518 and Hapmap24184-BTC-070077, which are in moderate LD with Hapmap52348-rs29024684, we still found peaks in the ranges 82 to 85 Mb and 88 to 94 Mb. The highest signal in the former region corresponded to Hapmap46932-BTA-111719, which was associated with β-CN, αS1-CN and αS2-CN. This marker was located about 0.2 Mb from CTSL2 and 0.5 Mb from IARS. CTSL2 belongs to the cathepsin family of endogenous proteases, which affect the physicochemical characteristics of fresh milk and the quality of dairy products; an increase in CTSL2 expression in bovine milk was observed over the course of lactation28. IARS, on the other hand, encodes isoleucyl-tRNA synthetase. Aminoacyl-tRNA synthetases are key enzymes involved in translating the genetic code by attaching the correct amino acid to each tRNA species and hydrolysing an incorrectly attached amino acid in the editing process29. Amino acids serve as precursors for protein synthesis but also act as regulators of protein synthesis30. Furthermore, isoleucine seemed to act cooperatively with leucine to increase milk protein synthesis31,32, which appeared to be controlled (at least partially) by the mTOR pathway33. The highest peak in the latter region corresponded to Hapmap43045-BTA-76998, which was associated with αS2-CN and mapped in close proximity to several genes involved in immune system response, e.g., 0.2 Mb from IL8, 0.1 Mb from CXCL6 and 64 kb from PPBP. IL8, for instance, is a highly polymorphic gene that has been linked to mastitis34 and may also be a quantitative trait locus (QTL) for milk production traits35,36. We found highly significant SNPs on BTA11 in the region flanking PAEP (102.94–103.05 Mb) and including the QTL for the β-LG percentage deposited in the Cattle QTL Database.
The marker ARS-USMARC-Parent-AY851163-rs17871661 (associated with β-LG, whey proteins, other N compounds, protein and minor N compounds) was located within GFI1B (intron variant effect), one of the TFs we proposed as master regulators of milk protein synthesis in the bovine mammary gland. We also found a high signal located at 104.29 Mb and corresponding to ARS-BFGL-NGS-104610, which was associated with the same phenotypes. Interestingly, this region (104.13–104.31 Mb) is densely packed with genes coding for small nucleolar RNAs and microRNAs, well-known regulators of gene expression37,38.
Pathway and network analyses
Pathway and network analyses derived from GWAS gave additional insights into the complex relationships among genes and the interconnected pathways that are likely to have a role in regulating protein synthesis and secretion in the mammary gland. For instance, we found several pathway associations within our regulatory network that, to the best of our knowledge, have not been fully described before, namely: (i) ion and cation transmembrane transport (particularly K+, P, and Ca2+); (ii) hormone signalling; (iii) neuronal signalling; and (iv) immune system response (Fig. 6; Supplementary Table S5). Additionally, we identified three TFs that were likely to be key activators and repressors among the 1,904 genes within the regulatory network, namely GFI1B, ZNF407 and NR5A1, which were predicted to control the expression of 260, 197 and 41 genes in the network, respectively. Interestingly, many of these pathways derived from GWAS analysis, such as calcium and potassium transport, neuronal and hormonal signalling, and phosphatidylinositol signalling, have also been related to milk coagulation properties, curd firmness, cheese yield and curd nutrient recovery22. These functional findings might confirm the established relationship between milk protein composition and cheese-making traits.
The relationship between CN percentage in milk and the genes involved in the regulation of Ca2+ and phosphate transmembrane transport is in line with the structure and the main functions of the casein micelles, which on the one hand act as Ca2+-transporting vehicles to supply young mammals with a highly concentrated yet soluble form of calcium phosphate and on the other hand prevent calcified, proteinaceous deposits containing amyloid fibrils in the mammary gland39. Caseins bind Ca2+ via highly phosphorylated sequences called phosphate centres, present in αS1-CN, αS2-CN and β-CN40. Calcium-dependent CN kinase is responsible for κ-CN phosphorylation before micelle formation and milk secretion41. In agreement with our results, the Ca2+ ion-binding GO term has already been associated with κ-CN and β-LG in bovine milk23. These biologically reasonable associations were further confirmed by the enrichment of several functions related to Golgi vesicle transport within the full network and our TF network. Indeed, the milk proteins newly synthesized in the rough endoplasmic reticulum are transferred to the Golgi apparatus, where they are processed for transport to the apical area of the mammary epithelial cells through secretory vesicles3. Cardiovascular-related pathways (e.g. ARVC, HCM and DCM) were also associated with κ-CN, suggesting that this protein fraction is involved in the regulation of Ca2+ homeostasis. Impaired Ca2+ ion regulation (and alteration in insulin signalling) is known to contribute to the pathophysiological effect on cardiomyocyte function42. Furthermore, these cardiovascular-related pathways also included genes coding for integrins, the major ECM receptors that have been identified as important regulators of mammary epithelial cell growth and differentiation43. In relation to these results, pathways pertaining to the extracellular matrix were indeed significantly enriched in our full regulatory network.
Similarly, Gambra et al.23 reported an association between the extracellular matrix receptors and κ-CN and β-LG concentrations in bovine milk. Besides Ca2+ ion transport, K+ transport was also enriched. It is likely that prolactin (PRL), which has a direct role in milk synthesis33, activates the extrusion of Na+ and the entry of K+ in mammary cells in both lactating and pre-lactating tissue44. Interestingly, a plasmin-induced β-CN breakdown product (fraction 1–28) has been found to act as a potent blocker of K+ channels in bovine mammary epithelia apical membranes45.
Our study also showed that milk protein-related genes were associated with the concerted action of hormones such as prolactin, growth hormone, thyroid hormone, corticosteroids and insulin, and of growth factors, which are essential for the regulation of milk protein synthesis within the bovine mammary epithelium33. Lactogenic hormones enter mammary epithelial cells (MECs) by diffusion and synergistically bind to milk protein gene promoters. Indeed, the proximal promoters of the β- and κ-CN genes contain so-called lactogenic response elements that harbour binding sites for TFs, which act either as inducers, such as GR, STAT5, NF-1 and C/EBPβ, or as repressors, such as YY-1 (refs. 46,47). Remarkably, binding sites for the abovementioned TFs were also predicted by the LASAGNA tool for the two most important nodes in the full regulatory network, BPIFB1 and FAM169A, and for the two key TFs, GFI1B and NR5A1. The pathways overrepresented in the networks included regulation of insulin secretion and insulin-like growth factor receptor signalling (Supplementary Table S5). A direct effect of insulin on the bovine mammary gland might be mediated by the milk protein transcription factor ELF5, which appears to be regulated through phosphoinositide 3-kinase/Akt signalling48, a pathway identified as playing a central role in lactation49. Overrepresentation of phosphatidylinositol (PI3K) signalling in the full and TF networks might provide further support for this hypothesis. Both insulin and IGF1 might in turn activate the mTOR signalling pathway, which is crucial for milk protein synthesis50,51. Among the enriched genes included in the insulin secretion pathway, GLUT1 (SLC2A1) is of particular interest. The large uptake of glucose by the mammary gland during lactation considerably induces the expression of GLUT1 (ref. 33), which also appears to be regulated by mTOR52. Both RPTOR and GLUT1 were predicted to be targets of ZNF407 in our TF network.
Additionally, milk protein-associated genes were involved in the activation of neuronal signalling pathways, suggesting an indirect link to the reproduction process and lactation. The overrepresentation of neurotransmitter signalling, such as the cholinergic synapse (enriched in the full network), and of axon guidance may be explained by the stimulation of mechanoreceptors in the teat skin, which induces cholinergic nerve impulses that result in the release of oxytocin from the pituitary gland, essential for milk secretion53. The study carried out by Gao et al.54 supports this hypothesis. These authors reported a significant increase in the expression of all CN genes in the bovine mammary gland at the onset of lactation54, which is consistent with the need to meet the nutritional requirements of new-born calves. Having established that neuronal signalling appeared to be associated with milk protein components, our results also suggest that CN could be related to the control of reproduction. The mammary gland is considered an accessory reproductive organ55. This latter association may be attributable to several genes involved in the regulation of reproductive processes, including NR5A1, which plays an important role in various aspects of reproductive development and function56 and also regulates gene expression of pituitary gonadotropins, such as the luteinizing hormone (LH) and the follicle-stimulating hormone (FSH)57. On the other hand, we found 100 genes in the full network that might be related to amyloidosis. Caseins, like other unfolded proteins, tend to form amyloid fibrils and calcified deposits; to avoid the risk of amyloidosis and calcification, the mammary gland orchestrates different aggregation mechanisms that result in the formation of the casein micelle58.
Amyloidosis and the production of amyloid proteins have been associated with a variety of so-called protein conformational or protein misfolding diseases (including Alzheimer’s disease, Parkinson’s disease and type-II diabetes)59. Caseins have also been found to function as holdase molecular chaperones that prevent the potentially harmful formation of amyloid fibrils58, which might explain the enrichment of signal sequence binding found in our study.
Finally, the enrichment of pathways related to immune response observed for the TF network might be partly related to the biological role of GFI1B, a transcriptional repressor that plays important roles in the differentiation of several haematopoietic cells60. Our findings might also be related to the antimicrobial activity of caseins, and specifically of κ-CN61; of interest, an overall increase in the immune response and/or in the milk antimicrobial activity of the bovine mammary gland has been observed during lactation62.
Milk protein composition is subject to the well-known effect of the major genes coding for the various CNs and whey proteins. In our study, the combination of GWAS with pathway and network analyses revealed several genes that were coordinated and highly connected with one another, making a substantial contribution at different stages of milk protein synthesis. This information advances our understanding of bovine mammary gland functionality and could be helpful to breeding programmes aimed at improving milk quality and/or technological properties. However, the correlative nature of the associations, from which causality cannot be determined, limits the interpretation of our results. It is therefore of paramount importance to carry out larger longitudinal studies to explore the causes and the persistency of these interactions. Additionally, the predicted associations need to be biologically validated, e.g., by integrating genomic data with gene expression profiles, by using machine-learning approaches or by using animal models with knockout genes.
Methods
Ethics statement
The cows included in this study belonged to commercial private herds and were not subjected to any invasive procedures. Milk and blood samples were previously collected during the routine milk recording coordinated by technicians working at the Breeder Association of Trento Province (Italy) and therefore authorized by a local authority.
Phenotypes and genotypes
Individual milk samples were collected from 1,264 Italian Brown Swiss cows from 85 commercial herds located in the Alpine province of Trento (Italy). Details of the animals used in this study and the characteristics of the area are reported in Cipolat-Gotet et al.63 and Cecchinato et al.64.
Milk total nitrogen, casein and urea nitrogen (MUN) were measured using a MilkoScan FT6000 (Foss, Hillerød, Denmark). Proportions of the true proteins, i.e., casein fractions (αS1-, αS1P-, αS2-, β- and κ-CN) and whey proteins [β-lactoglobulin (β-LG) and α-lactalbumin (α-LA)], were determined using validated reversed-phase high-performance liquid chromatography (RP-HPLC)65. Each fraction was expressed as a percentage of the milk total nitrogen (N) content. These percentages were summed and deducted from the milk total N content to arrive at the proportion of the remaining minor milk N compounds.
The Illumina BovineSNP50 v.2 BeadChip (Illumina Inc., San Diego, CA) was used to genotype 1,152 cows (blood samples were not available for all the phenotyped animals). Quality control retained only markers that met the following criteria: call rate >95%, minor allele frequency >0.5% and no extreme deviation from Hardy-Weinberg equilibrium (P > 0.001, Bonferroni corrected). After filtering, 1,011 cows and 37,568 SNPs were retained for subsequent analyses.
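The marker filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names are hypothetical, genotypes are assumed to be coded as 0, 1 or 2 copies of one allele (with None for a missing call), the Hardy-Weinberg check uses a simple 1-df chi-square approximation rather than whichever test the authors applied, and no Bonferroni adjustment of the threshold is made.

```python
import math


def call_rate(genotypes):
    """Fraction of non-missing calls (missing calls are None)."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Minor allele frequency from 0/1/2 allele counts."""
    called = [g for g in genotypes if g is not None]
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)


def hwe_chi2_pvalue(genotypes):
    """P-value of a 1-df chi-square test for Hardy-Weinberg proportions.

    Assumption: a chi-square approximation is used here for brevity.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    observed = [called.count(0), called.count(1), called.count(2)]
    p = (observed[1] + 2 * observed[2]) / (2 * n)
    q = 1 - p
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    if min(expected) == 0:
        return 1.0  # monomorphic marker: no deviation can be tested
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(genotypes, min_call_rate=0.95, min_maf=0.005, min_hwe_p=0.001):
    """Apply the three marker-level thresholds quoted in the text."""
    if call_rate(genotypes) <= min_call_rate:
        return False
    return (minor_allele_frequency(genotypes) > min_maf
            and hwe_chi2_pvalue(genotypes) > min_hwe_p)
```

For example, a marker with 25, 50 and 25 animals in the three genotype classes passes all three filters, whereas a marker with no heterozygotes among 100 animals fails the Hardy-Weinberg check.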
Genome-wide association study
Genome-wide association analyses (GWAS) were conducted using single-marker regression in the GenABEL R package66 and GRAMMAR-GC (Genome-wide Association using Mixed Model and Regression - Genomic Control) with the default function gamma67. GRAMMAR-GC proceeds in three steps: first, an additive polygenic model with a genomic relationship matrix is fitted; second, the residuals obtained from this model are regressed on the SNPs to test for associations; finally, genomic control corrects for the conservativeness of the procedure68. The polygenic model was:
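$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{a} + \mathbf{e} \qquad (1)$$
where $\mathbf{y}$ is a vector of the milk N fractions; $\boldsymbol{\beta}$ is a vector with fixed effects of (i) days in milk of the cow (classes of 30 days each), (ii) parity of each cow (classes of 1, 2, 3, ≥4), and (iii) herd-date effect (n = 85); $\mathbf{X}$ is an incidence matrix connecting each observation to specific levels of the factors in $\boldsymbol{\beta}$. The two random terms in the model were the animal ($\mathbf{a}$) and the residuals ($\mathbf{e}$), which were assumed to be normally distributed as $\mathbf{a} \sim N(\mathbf{0}, \mathbf{G}\sigma_a^2)$ and $\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)$, where $\mathbf{G}$ is the genomic relationship matrix, $\mathbf{I}$ is the identity matrix, $\sigma_a^2$ is the additive genomic variance and $\sigma_e^2$ the residual variance. The $\mathbf{G}$ matrix was built in the GenABEL R package, where for a given pair of individuals i and j, the identical-by-state coefficient ($f_{i,j}$) is calculated as:
$$f_{i,j} = \frac{1}{N}\sum_{k=1}^{N}\frac{(x_{i,k} - p_k)(x_{j,k} - p_k)}{p_k(1 - p_k)} \qquad (2)$$
where N is the number of markers used, $x_{i,k}$ is the genotype of the ith individual at the kth SNP (coded as 0, ½ and 1), $p_k$ is the frequency of the “+” allele and k = 1, …, N.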
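As an illustration of the identical-by-state coefficient, the following Python sketch averages, over markers, the product of the two individuals' genotype deviations from the allele frequency, scaled by p(1 − p). This is a sketch of the frequency-weighted form of the coefficient, not GenABEL's own code: the function names are hypothetical, allele frequencies are estimated from the sample itself, and monomorphic markers are skipped to avoid division by zero (both are assumptions of this example).

```python
def allele_frequencies(genotypes):
    """Per-marker frequency of the '+' allele.

    genotypes[i][k] is the genotype of individual i at marker k,
    coded 0, 0.5 or 1, so the column mean equals the allele frequency.
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    return [sum(row[k] for row in genotypes) / n_ind for k in range(n_snp)]


def ibs_kinship(genotypes):
    """Symmetric matrix of frequency-weighted IBS coefficients f[i][j]."""
    p = allele_frequencies(genotypes)
    # Assumption: monomorphic markers carry no information and are dropped.
    informative = [k for k, pk in enumerate(p) if 0.0 < pk < 1.0]
    if not informative:
        raise ValueError("no polymorphic markers available")
    n_ind = len(genotypes)
    f = [[0.0] * n_ind for _ in range(n_ind)]
    for i in range(n_ind):
        for j in range(i, n_ind):
            total = sum(
                (genotypes[i][k] - p[k]) * (genotypes[j][k] - p[k])
                / (p[k] * (1.0 - p[k]))
                for k in informative
            )
            f[i][j] = f[j][i] = total / len(informative)
    return f
```

With three animals genotyped at two markers as `[[0, 1], [1, 0], [0.5, 0.5]]`, both allele frequencies are 0.5; the two opposite homozygotes get a coefficient of −1 with each other and 1 with themselves, and the double heterozygote gets 0 with everyone.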
A significance threshold of P < 5 × was adopted69. Manhattan plots were drawn using the qqman R package70.
SNP variance was calculated as $2pqa^2$, where p is the frequency of one allele, q = 1 − p is the frequency of the second allele and a is the estimated additive genetic effect. Model (1) was also used to estimate the variance components and the genomic heritability of the traits based on the genomic relationship matrix (2). Heritability was estimated as $h^2 = \sigma_a^2 / (\sigma_a^2 + \sigma_e^2)$, where $\sigma_a^2$ and $\sigma_e^2$ are the additive genomic and residual variances from model (1).
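A small worked example may make these two quantities concrete. The sketch below uses hypothetical function names and assumes the usual definition of genomic heritability as the ratio of the additive genomic variance to the sum of the additive genomic and residual variances.

```python
def snp_variance(p, a):
    """Additive variance attributed to one SNP: 2 * p * q * a**2."""
    q = 1.0 - p
    return 2.0 * p * q * a ** 2


def genomic_heritability(var_additive, var_residual):
    """Share of the total variance that is additive genomic variance.

    Assumption: total variance = additive genomic + residual variance.
    """
    return var_additive / (var_additive + var_residual)


# Illustrative numbers only (not results from the study).
print(snp_variance(p=0.5, a=2.0))
print(genomic_heritability(var_additive=1.0, var_residual=3.0))
```

With these illustrative inputs, an allele frequency of 0.5 and an additive effect of 2 give a SNP variance of 2, and variance components of 1 and 3 give a heritability of 0.25.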
To identify secondary association signals, an association analysis conditioning on the primary associated SNPs was carried out to test for the presence of other significantly associated SNPs. To this end, the most significant SNPs on BTA6 and on BTA11 were fitted as fixed effects in model (1) to obtain SNP effect estimates adjusted for the effect of these highly significant SNPs.
The r-squared statistic (r²) was chosen to quantify the extent of LD. The r² between pairs of SNPs covering the region of the CN loci on BTA6 and the region of the β-LG gene (progestagen-associated endometrial protein, PAEP) on BTA11, together with their respective 1 Mb flanking regions, was calculated using the R package LDheatmap71.
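For readers unfamiliar with the statistic, the sketch below computes r² between two SNPs as the squared Pearson correlation of their genotype allele counts. This is a common approximation for unphased genotypes and is an assumption of this example; estimates from LDheatmap may differ because that package may derive r² from estimated haplotype frequencies. Function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a monomorphic SNP")
    return sxy / math.sqrt(sxx * syy)


def ld_r2(snp_a, snp_b):
    """Approximate LD r² from 0/1/2 allele counts; None marks a missing call."""
    pairs = [(a, b) for a, b in zip(snp_a, snp_b)
             if a is not None and b is not None]
    x = [a for a, _ in pairs]
    y = [b for _, b in pairs]
    return pearson(x, y) ** 2
```

Two SNPs whose genotypes are perfectly (even if inversely) correlated give r² = 1, while two SNPs whose genotypes vary independently in the sample give r² = 0.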
Gene-set enrichment and pathway analyses
Pathway analyses were performed as detailed in Dadousis et al.22 to identify the biological functions regulating the milk N fraction profile. Briefly, SNPs with nominal P-values < 0.05 were assigned to genes if they were located within the gene or within 15 kb of its 5′ and 3′ ends72, using the biomaRt R package73,74 and the Ensembl Bos taurus UMD3.1 assembly. A less stringent significance threshold than in the GWAS was adopted because we aimed to capture SNPs with weaker effects that still contribute to phenotypic variability through genes that are part of biological networks and cellular processes. Combining weaker but related variant signals can improve the prediction of how these variants might be collectively related to the phenotypes of interest. The Kyoto Encyclopaedia of Genes and Genomes (KEGG)75 and the Gene Ontology (GO) databases76 were used to define the functional categories associated with the gene sets. To avoid testing overly broad or overly narrow functional categories, only GO and KEGG terms with >10 and <1000 genes were inspected. A Fisher’s exact test was applied to each functional category to test for overrepresentation of significant genes. A q-value of 0.05 was set as the cut-off for significant enrichments. The gene-set enrichment analysis was performed using the R package goseq77.
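The overrepresentation test can be sketched in a few lines of Python. The example below is a plain one-sided Fisher's exact (hypergeometric) test and is not the goseq implementation; the function names are hypothetical, and the use of the Benjamini-Hochberg procedure to obtain q-values is an assumption, since the text does not state which q-value method was applied.

```python
from math import comb


def term_is_testable(n_genes_in_term):
    """Keep only terms with more than 10 and fewer than 1000 genes."""
    return 10 < n_genes_in_term < 1000


def enrichment_pvalue(n_total, n_in_term, n_significant, n_overlap):
    """One-sided Fisher's exact test: P(overlap >= n_overlap).

    n_total        genes tested overall
    n_in_term      genes annotated to the functional term
    n_significant  genes flagged as significant
    n_overlap      significant genes annotated to the term
    """
    upper = min(n_in_term, n_significant)
    tail = sum(
        comb(n_in_term, k) * comb(n_total - n_in_term, n_significant - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_total, n_significant)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values (assumed q-value method)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, `enrichment_pvalue` would be called once per testable term, and the resulting P-values passed together to `benjamini_hochberg`; terms with an adjusted value below 0.05 would then be reported as enriched.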
SNP co-association and network analyses
The GWAS results were used to build the AWM as described by Fortes et al.24. The selection criteria favour genes harbouring SNPs with significant associations across related traits. In brief, κ-CN was selected as the key phenotype (due to its greater importance for milk technological properties) and the SNPs that were associated with it (P ≤ 0.05) were included in the AWM.
Dependency among phenotypes was explored by estimating the average number of other phenotypes (Ap) that were associated with these SNPs at the same P-value threshold (P ≤ 0.05), which gave Ap = 3. Then, we selected SNPs that were both close (<10 kb) to the nearest annotated gene (UMD3.1 assembly) and associated with ≥3 other traits (P < 0.05). To identify putative regulators, the TFs reported by Vaquerizas et al.78 and the microRNAs (miRNAs) that were mapped to the UMD 3.1 bovine genome assembly (GenBank assembly accession: GCA_000003055.3) were also included in this analysis. To estimate the phenotypic variance explained by the AWM-SNPs, we constructed a first G matrix based only on the SNPs that were selected for the AWM. The same number of randomly selected SNPs was used to build 10,000 G matrices (10,000 replicates) to estimate the variance explained by randomly selected SNPs. Pearson correlations between pairs of AWM columns (standardized SNP effects across traits) were computed, and hierarchical clustering of traits was visualised using the hclust function in R79. The PCIT algorithm80 was used to report significant interactions in the network, which were visualized in Cytoscape81. Every node in the network represents a gene, while every edge connecting two nodes represents a significant interaction. In order to include only the high-confidence gene co-associations determined by PCIT, only those with |r| ≥ 0.80 were retained (n = 1,904 unique genes), on the assumption that these genes have relevant biological significance for the key phenotype from which the AWM-PCIT was constructed. The co-association network was automatically generated using the organic layout algorithm in Cytoscape V2.7 (http://cytoscape.org). Network topological parameters and node centrality values were calculated using the NetworkAnalyzer plugin82 to gain insights into the organisation and structure of the complex networks formed by the interacting molecules.
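The SNP selection and the final correlation filter can be sketched as follows. This is an illustrative reading of the procedure, not the AWM or PCIT software: the function names and data layout are hypothetical, the selection is assumed to be the union of the key-phenotype SNPs and the SNPs that are near a gene and associated with at least three other traits, and the PCIT significance test on partial correlations is not reproduced (only the |r| ≥ 0.80 filter is shown).

```python
import math


def select_awm_snps(pvalues, key_trait, distance_kb,
                    p_cut=0.05, max_distance_kb=10.0, min_other_traits=3):
    """Pick SNPs for the AWM.

    pvalues      {snp: {trait: P-value}}
    distance_kb  {snp: distance to the nearest annotated gene, in kb}
    Assumption: a SNP enters if it is associated with the key trait, or if
    it lies near a gene and is associated with enough other traits.
    """
    selected = []
    for snp, by_trait in pvalues.items():
        key_hit = by_trait[key_trait] <= p_cut
        n_other = sum(1 for trait, p in by_trait.items()
                      if trait != key_trait and p < p_cut)
        near_gene = distance_kb[snp] < max_distance_kb
        if key_hit or (near_gene and n_other >= min_other_traits):
            selected.append(snp)
    return selected


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def coassociation_edges(awm, min_abs_r=0.80):
    """Gene pairs whose standardized effects across traits correlate strongly.

    awm  {gene: [standardized SNP effect for each trait]}
    Note: PCIT's partial-correlation test is NOT applied in this sketch.
    """
    genes = sorted(awm)
    edges = []
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            r = pearson(awm[gene_a], awm[gene_b])
            if abs(r) >= min_abs_r:
                edges.append((gene_a, gene_b, r))
    return edges
```

Each returned edge corresponds to one connection between two gene nodes in the co-association network; in the study, the candidate edges would first have to pass the PCIT test before the correlation filter is applied.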
In parallel, the list of co-associated genes was fed into the Cytoscape plugin ClueGO83 to identify relevant categories of molecular functions, cellular components and biological processes. The ClueGO cut-off for the statistical assessment was FDR < 0.05. In addition, the list of co-associated genes was uploaded to Ingenuity Pathway Analysis (IPA, version 5.5; Ingenuity Systems, USA) to obtain information on molecule type (e.g., transcription factor, cytokine, transporter). Genes in the network were coloured according to the biological processes in which they participate. Then, a list of TFs (based on Vaquerizas et al.78) and the target genes to which they were potentially connected was identified within our high-confidence gene network (|r| ≥ 0.80). An information-lossless approach84 was used to identify the optimal subset of TFs spanning the majority of the network topology. The density plots of the genes’ partial-correlation values in the full and the TF networks were generated using the R package ggpubr.
Prediction of TF binding sites in the genes’ promoter regions was performed by the LASAGNA-Search 2.0 web tool85 using matrices in the TRANSFAC public database and with a significance threshold of P = 0.001.
Acknowledgements
The research was funded by Trento Province (Italy), the Italian Brown Swiss Cattle Breeders Association (ANARB, Verona, Italy), and the Superbrown Consortium of Bolzano and Trento. The authors wish to thank Dr. Christos Dadousis for his help in setting up the genome-wide association analysis and Dr. Ezequiel Luis Nicolazzi for technical support in SNP annotation.
Author Contributions
S.P. contributed to setting up the objectives of this study, performed the statistical analysis and drafted the first version of the manuscript; N.M. and Y.R.C. performed the network analysis and helped with results interpretation; A.C. conceived the study, helped to interpret the results, and supervised the project together with G.B.; S.S. contributed to the results interpretation. All authors have read and approved the final manuscript.
Competing Interests
The authors declare that they have no competing interests.
Footnotes
Electronic supplementary material
Supplementary information accompanies this paper at 10.1038/s41598-017-18916-4.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. WHO Technical Report Series. Protein and Amino Acid Requirements in Human Nutrition: Report of a Joint WHO/FAO/UNU Expert Consultation. At http://apps.who.int/iris/bitstream/10665/43411/1/WHO_TRS_935_eng.pdf.
- 2. Korhonen H, Pihlanto A. Bioactive peptides: Production and functionality. Int. Dairy J. 2006;16:945–960. doi: 10.1016/j.idairyj.2005.10.012.
- 3. Rezaei R, Wu Z, Hou Y, Bazer FW, Wu G. Amino acids and mammary gland development: nutritional implications for milk production and neonatal growth. J. Anim. Sci. Biotechnol. 2016;7:20. doi: 10.1186/s40104-016-0078-8.
- 4. Farrell HM, et al. Nomenclature of the Proteins of Cows’ Milk: Sixth Revision. J. Dairy Sci. 2004;87:1641–1674. doi: 10.3168/jds.S0022-0302(04)73319-6.
- 5. Sánchez-Moya, T. et al. In vitro modulation of gut microbiota by whey protein to preserve intestinal health. Food Funct, 8, 3053–3063 (2017).
- 6. European Union. European Commission & European Union Eurostat. Agriculture, forestry and fishery statistics. (Publications Office of the European Union, 2016).
- 7. Jenkins TC, McGuire MA, Baldwin RL. Major advances in nutrition: impact on milk composition. J. Dairy Sci. 2006;89:1302–10. doi: 10.3168/jds.S0022-0302(06)72198-1.
- 8. Bittante G, Penasa M, Cecchinato A. Invited review: Genetics and modeling of milk coagulation properties. J. Dairy Sci. 2012;95:6843–70. doi: 10.3168/jds.2012-5507.
- 9. Bittante G, et al. Factors affecting the incidence of first-quality wheels of Trentingrana cheese. J. Dairy Sci. 2011;94:3700–3707. doi: 10.3168/jds.2010-3746.
- 10. Bittante G, et al. Monitoring of sensory attributes used in the quality payment system of Trentingrana cheese. J. Dairy Sci. 2011;94:5699–5709. doi: 10.3168/jds.2011-4319.
- 11. Bell SJ, Grochoski GT, Clarke AJ. Health Implications of Milk Containing β-Casein with the A2 Genetic Variant. Crit. Rev. Food Sci. Nutr. 2006;46:93–100. doi: 10.1080/10408390591001144.
- 12. Graf S, Egert S, Heer M. Effects of whey protein supplements on metabolism. Curr. Opin. Clin. Nutr. Metab. Care. 2011;14:569–580. doi: 10.1097/MCO.0b013e32834b89da.
- 13. Rhoads RE, Grudzien-Nogalska E. Translational Regulation of Milk Protein Synthesis at Secretory Activation. J. Mammary Gland Biol. Neoplasia. 2007;12:283–292. doi: 10.1007/s10911-007-9058-0.
- 14. Bian Y, et al. Epigenetic Regulation of miR-29s Affects the Lactation Activity of Dairy Cow Mammary Epithelial Cells. J. Cell. Physiol. 2015;230:2152–2163. doi: 10.1002/jcp.24944.
- 15. Huppertz, T. 1 Proteins-Volume 1A: Basic Aspects in Advanced Dairy Chemistry Volume 1A (Eds McSweeney, P. L. H. & Fox, P. F.) 135–160 (Springer US, 2013).
- 16. Bijl E, van Valenberg H, Huppertz T, van Hooijdonk A, Bovenhuis H. Phosphorylation of αS1-casein is regulated by different genes. J. Dairy Sci. 2014;97:7240–7246. doi: 10.3168/jds.2014-8061.
- 17. Lee J, Seo J, Lee SY, Ki KS, Seo S. Meta-analysis of factors affecting milk component yields in dairy cattle. J. Anim. Sci. Technol. 2014;56:5. doi: 10.1186/2055-0391-56-5.
- 18. Gustavsson F, et al. Effects of breed and casein genetic variants on protein profile in milk from Swedish Red, Danish Holstein, and Danish Jersey cows. J. Dairy Sci. 2014;97:3866–3877. doi: 10.3168/jds.2013-7312.
- 19. Dadousis C, et al. Genome-wide association of coagulation properties, curd firmness modeling, protein percentage, and acidity in milk from Brown Swiss cows. J. Dairy Sci. 2016;99:3654–66. doi: 10.3168/jds.2015-10078.
- 20. Buitenhuis B, Poulsen NA, Gebreyesus G, Larsen LB. Estimation of genetic parameters and detection of chromosomal regions affecting the major milk proteins and their post translational modifications in Danish Holstein and Danish Jersey cattle. BMC Genet. 2016;17:114. doi: 10.1186/s12863-016-0421-2.
- 21. Peñagaricano F, Weigel KA, Rosa GJM, Khatib H. Inferring Quantitative Trait Pathways Associated with Bull Fertility from a Genome-Wide Association Study. Front. Genet. 2013;3:307. doi: 10.3389/fgene.2012.00307.
- 22. Dadousis C, et al. Pathway-based genome-wide association analysis of milk coagulation properties, curd firmness, cheese yield, and curd nutrient recovery in dairy cattle. J. Dairy Sci. 2017;100:1223–1231. doi: 10.3168/jds.2016-11587.
- 23. Gambra R, et al. Genomic architecture of bovine κ-casein and β-lactoglobulin. J Dairy Sci. 2013;96:5333–43. doi: 10.3168/jds.2012-6324.
- 24. Fortes MRS, et al. Association weight matrix for the genetic dissection of puberty in beef cattle. Proc. Natl. Acad. Sci. USA. 2010;107:13642–7. doi: 10.1073/pnas.1002044107.
- 25. Schopen GCB, et al. Whole-genome association study for milk protein composition in dairy cattle. J. Dairy Sci. 2011;94:3148–58. doi: 10.3168/jds.2010-4030.
- 26. Bonfatti V, Cecchinato A, Gallo L, Blasco A, Carnier P. Genetic analysis of detailed milk protein composition and coagulation properties in Simmental cattle. J. Dairy Sci. 2011;94:5183–93. doi: 10.3168/jds.2011-4297.
- 27. Huang W, et al. Association between milk protein gene variants and protein composition traits in dairy cattle. J. Dairy Sci. 2012;95:440–9. doi: 10.3168/jds.2011-4757.
- 28. Wickramasinghe, S., Rincon, G., Islas-Trejo, A. & Medrano, J. F. Transcriptional profiling of bovine milk using RNA sequencing. BMC Genomics 25,13:45 (2012).
- 29. Ling J, Söll D. Severe oxidative stress induces protein mistranslation through impairment of an aminoacyl-tRNA synthetase editing site. Proc. Natl. Acad. Sci. USA. 2010;107:4028–33. doi: 10.1073/pnas.1000315107.
- 30. Meijer AJ. Amino acids as regulators and components of nonproteinogenic pathways. J. Nutr. 2003;133:2057S–2062S. doi: 10.1093/jn/133.6.2057S.
- 31. Appuhamy JA, Knoebel NA, Nayananjalie WA, Escobar J, Hanigan MD. Isoleucine and Leucine Independently Regulate mTOR Signaling and Protein Synthesis in MAC-T Cells and Bovine Mammary Tissue Slices. J. Nutr. 2012;142:484–491. doi: 10.3945/jn.111.152595.
- 32. Richert BT, Goodband RD, Tokach MD, Nelssen JL. Increasing valine, isoleucine, and total branched-chain amino acids for lactating sows. J. Anim. Sci. 1997;75:2117–28. doi: 10.2527/1997.7582117x.
- 33. Bionaz M, Loor JJ. Gene networks driving bovine mammary protein synthesis during the lactation cycle. Bioinform. Biol. Insights. 2011;5:83–98. doi: 10.4137/BBI.S7003.
- 34. Ogorevc J, Kunej T, Razpet A, Dovc P. Database of cattle candidate genes and genetic markers for milk production and mastitis. Anim. Genet. 2009;40:832–51. doi: 10.1111/j.1365-2052.2009.01921.x.
- 35. Olsen HG, et al. A Genome Scan for Quantitative Trait Loci Affecting Milk Production in Norwegian Dairy Cattle. J. Dairy Sci. 2002;85:3124–3130. doi: 10.3168/jds.S0022-0302(02)74400-7.
- 36. Boichard D, et al. Detection of genes influencing economic traits in three French dairy cattle breeds. Genet. Sel. Evol. 2003;35:77–101. doi: 10.1186/1297-9686-35-1-77.
- 37. Shivdasani RA. MicroRNAs: regulators of gene expression and cell differentiation. Blood. 2006;108:3646–53. doi: 10.1182/blood-2006-01-030015.
- 38. Valadkhan S, Gunawardane LS. Role of small nuclear RNAs in eukaryotic gene expression. Essays Biochem. 2013;54:79–90. doi: 10.1042/bse0540079.
- 39. Liu J, et al. The Effect of Milk Constituents and Crowding Agents on Amyloid Fibril Formation by κ-Casein. J. Agric. Food Chem. 2016;64:1335–1343. doi: 10.1021/acs.jafc.5b04977.
- 40. Holt C, Carver JA. Darwinian transformation of a ‘scarcely nutritious fluid’ into milk. J. Evol. Biol. 2012;25:1253–63. doi: 10.1111/j.1420-9101.2012.02509.x.
- 41. Brooks CL, Landt M. Calcium-ion and calmodulin-dependent kappa-casein kinase in rat mammary acini. Biochem. J. 1984;224:195–200. doi: 10.1042/bj2240195.
- 42. Lebeche D, Davidoff AJ, Hajjar RJ. Interplay between impaired calcium regulation and insulin signaling abnormalities in diabetic cardiomyopathy. Nat. Clin. Pract. Cardiovasc. Med. 2008;5:715–724. doi: 10.1038/ncpcardio1347.
- 43. Taddei I, et al. Integrins in Mammary Gland Development and Differentiation of Mammary Epithelium. J. Mammary Gland Biol. Neoplasia. 2003;8:383–394. doi: 10.1023/B:JOMG.0000017426.74915.b9.
- 44. Falconer IR, Rowe JM. Effect of Prolactin on Sodium and Potassium Concentrations in Mammary Alveolar Tissue. Endocrinology. 1977;101:181–186. doi: 10.1210/endo-101-1-181.
- 45. Silanikove N, Shamay A, Shinder D, Moran A. Stress down regulates milk yield in cows by plasmin induced beta-casein product that blocks K+ channels on the apical membranes. Life Sci. 2000;67:2201–12. doi: 10.1016/S0024-3205(00)00808-0.
- 46. Rosen JM, Wyszomierski SL, Hadsell D. Regulation of milk protein gene expression. Annu. Rev. Nutr. 1999;19:407–436. doi: 10.1146/annurev.nutr.19.1.407.
- 47. Lenasi, T., Kokalj-Vokac, N., Narat, M., Baldi, A. & Dovc, P. Functional study of the equine beta-casein and kappa-casein gene promoters. J. Dairy Res. 72 Spec No, 34–43 (2005).
- 48. Menzies KK, Lefèvre C, Macmillan KL, Nicholas KR. Insulin regulates milk protein synthesis at multiple levels in the bovine mammary gland. Funct. Integr. Genomics. 2009;9:197–217. doi: 10.1007/s10142-008-0103-x.
- 49. Lemay DG, Neville MC, Rudolph MC, Pollard KS, German J. Gene regulatory networks in lactation: identification of global principles using bioinformatics. BMC Syst. Biol. 2007;1:56. doi: 10.1186/1752-0509-1-56.
- 50. Castro JJ, Arriola Apelo SI, Appuhamy JADRN, Hanigan MD. Development of a model describing regulation of casein synthesis by the mammalian target of rapamycin (mTOR) signaling pathway in response to insulin, amino acids, and acetate. J. Dairy Sci. 2016;99:6714–6736. doi: 10.3168/jds.2015-10591.
- 51. Haar EV, Lee S, Bandhakavi S, Griffin TJ, Kim D-H. Insulin signalling to mTOR mediated by the Akt/PKB substrate PRAS40. Nat. Cell Biol. 2007;9:316–323. doi: 10.1038/ncb1547.
- 52. Buller CL, et al. A GSK-3/TSC2/mTOR pathway regulates glucose uptake and GLUT1 glucose transporter expression. AJP Cell Physiol. 2008;295:C836–C843. doi: 10.1152/ajpcell.00554.2007.
- 53. Handbook of Milk of Non-Bovine Mammals. (Eds Park, Y. W. & Haenlein, G. F. W.) (John Wiley & Sons, 2008).
- 54. Gao Y, Lin X, Shi K, Yan Z, Wang Z. Bovine Mammary Gene Expression Profiling during the Onset of Lactation. PLoS One. 2013;8:e70393. doi: 10.1371/journal.pone.0070393.
- 55. Blackburn DG, Hayssen V, Murphy CJ. The origins of lactation and the evolution of milk: a review with new hypotheses. Mamm. Rev. 1989;19:1–26. doi: 10.1111/j.1365-2907.1989.tb00398.x.
- 56. Richards JS, et al. Novel Signaling Pathways That Control Ovarian Follicular Development, Ovulation, and Luteinization. Recent Prog Horm Res. 2002;57:195–220. doi: 10.1210/rp.57.1.195.
- 57. Haisenleder DJ, Yasin M, Dalkin AC, Gilrain J, Marshall JC. GnRH regulates steroidogenic factor-1 (SF-1) gene expression in the rat pituitary. Endocrinology. 1996;137:5719–5722. doi: 10.1210/endo.137.12.8940405.
- 58. Holt C, Carver JA, Ecroyd H, Thorn DC. Invited review: Caseins and the casein micelle: Their biological functions, structures, and behavior in foods. J. Dairy Sci. 2013;96:6127–6146. doi: 10.3168/jds.2013-6831.
- 59. Ashraf GM, et al. Protein misfolding and aggregation in Alzheimer’s disease and type 2 diabetes mellitus. CNS Neurol. Disord. Drug Targets. 2014;13:1280–93. doi: 10.2174/1871527313666140917095514.
- 60. van der Meer LT, Jansen JH, van der Reijden BA. Gfi1 and Gfi1b: key regulators of hematopoiesis. Leukemia. 2010;24:1834–1843. doi: 10.1038/leu.2010.195.
- 61.Meredith-Dennis, L. et al. Composition and Variation of Macronutrients, Immune Proteins, and Human Milk Oligosaccharides in Human Milk From Nonprofit and Commercial Milk Banks. J. Hum. Lact. 089033441771063 (2017). [DOI] [PubMed]
- 62.Loor JJ, Moyes KM, Bionaz M. Functional Adaptations of the Transcriptome to Mastitis-Causing Pathogens: The Mammary Gland and Beyond. J. Mammary Gland Biol. Neoplasia. 2011;16:305–322. doi: 10.1007/s10911-011-9232-2. [DOI] [PubMed] [Google Scholar]
- 63.Cipolat-Gotet C, Cecchinato A, De Marchi M, Bittante G. Factors affecting variation of different measures of cheese yield and milk nutrient recovery from an individual model cheese-manufacturing process. J. Dairy Sci. 2013;96:7952–7965. doi: 10.3168/jds.2012-6516. [DOI] [PubMed] [Google Scholar]
- 64.Cecchinato A, Albera A, Cipolat-Gotet C, Ferragina A, Bittante G. Genetic parameters of cheese yield and curd nutrient recovery or whey loss traits predicted using Fourier-transform infrared spectroscopy of samples collected during milk recording on Holstein, Brown Swiss, and Simmental dairy cows. J. Dairy Sci. 2015;98:4914–4927. doi: 10.3168/jds.2014-8599. [DOI] [PubMed] [Google Scholar]
- 65.Bonfatti V, Grigoletto L, Cecchinato A, Gallo L, Carnier P. Validation of a new reversed-phase high-performance liquid chromatography method for separation and quantification of bovine milk protein genetic variants. J. Chromatogr. A. 2008;1195:101–106. doi: 10.1016/j.chroma.2008.04.075. [DOI] [PubMed] [Google Scholar]
- 66.GenABEL project developers GenABEL: genome-wide SNP association analysis. R package version 1.8–0, https://cran.r-project.org/web/packages/GenABEL/index.html (2013).
- 67.Amin N, van Duijn CM, Aulchenko YS. A Genomic Background Based Method for Association Analysis in Related Individuals. PLoS One. 2007;2:e1274. doi: 10.1371/journal.pone.0001274. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Svishcheva GR, Axenovich TI, Belonogova NM, van Duijn CM, Aulchenko YS. Rapid variance components-based method for whole-genome association analysis. Nat. Genet. 2012;44:1166–70. doi: 10.1038/ng.2410. [DOI] [PubMed] [Google Scholar]
- 69.Burton PR, et al. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447:661–678. doi: 10.1038/nature05911.
- 70.Turner SD. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. bioRxiv. 2014.
- 71.Shin J-H, et al. LDheatmap: An R Function for Graphical Display of Pairwise Linkage Disequilibria Between Single Nucleotide Polymorphisms. J. Stat. Softw. 2006;16.
- 72.Pickrell JK, et al. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature. 2010;464:768–72. doi: 10.1038/nature08872.
- 73.Durinck S, et al. BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics. 2005;21:3439–40. doi: 10.1093/bioinformatics/bti525.
- 74.Durinck S, Spellman PT, Birney E, Huber W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat. Protoc. 2009;4:1184–1191. doi: 10.1038/nprot.2009.97.
- 75.Ogata H, et al. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 1999;27:29–34. doi: 10.1093/nar/27.1.29.
- 76.Ashburner M, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000;25:25–9. doi: 10.1038/75556.
- 77.Young MD, Wakefield MJ, Smyth GK, Oshlack A. Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biol. 2010;11:R14. doi: 10.1186/gb-2010-11-2-r14.
- 78.Vaquerizas JM, Kummerfeld SK, Teichmann SA, Luscombe NM. A census of human transcription factors: function, expression and evolution. Nat. Rev. Genet. 2009;10:252–263. doi: 10.1038/nrg2538.
- 79.Ihaka R, Gentleman R. R: A Language for Data Analysis and Graphics. J. Comput. Graph. Stat. 1996;5:299–314.
- 80.Reverter A, Chan EKF. Combining partial correlation and an information theory approach to the reversed engineering of gene co-expression networks. Bioinformatics. 2008;24:2491–2497. doi: 10.1093/bioinformatics/btn482.
- 81.Shannon P, et al. Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res. 2003;13:2498–2504. doi: 10.1101/gr.1239303.
- 82.Assenov Y, Ramirez F, Schelhorn S-E, Lengauer T, Albrecht M. Computing topological parameters of biological networks. Bioinformatics. 2008;24:282–284. doi: 10.1093/bioinformatics/btm554.
- 83.Bindea G, et al. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics. 2009;25:1091–3. doi: 10.1093/bioinformatics/btp101.
- 84.Reverter A, Fortes MRS. Breeding and Genetics Symposium: building single nucleotide polymorphism-derived gene regulatory networks: Towards functional genomewide association studies. J. Anim. Sci. 2013;91:530–6. doi: 10.2527/jas.2012-5780.
- 85.Lee C, Huang C-H. LASAGNA-Search: an integrated web tool for transcription factor binding site search and visualization. Biotechniques. 2013;54:141–53. doi: 10.2144/000113999.
Associated Data