PLOS ONE. 2022 May 13;17(5):e0267967. doi: 10.1371/journal.pone.0267967

Quantitative glycoproteomics of human milk and association with atopic disease

Matilda Holm 1,¤a,¤b, Mayank Saraswat 1,¤c, Sakari Joenväärä 1, Antti Seppo 2, R John Looney 3, Tiialotta Tohmola 1, Jutta Renkonen 1, Risto Renkonen 1, Kirsi M Järvinen 2,‡,*
Editor: Frederique Lisacek 4
PMCID: PMC9106177  PMID: 35559953

Abstract

The prevalence of allergic diseases and asthma is increasing rapidly worldwide, with environmental and lifestyle behaviors implicated as a reason. Epidemiological studies have shown that children who grow up on farms are at lower risk of developing childhood atopic disease, indicating the presence of a protective “farm effect”. The Old Order Mennonite (OOM) community in Upstate New York has a traditional, agrarian lifestyle, a low rate of atopic disease, and long periods of exclusive breastfeeding. Human milk proteins are heavily glycosylated, although there is a paucity of studies investigating the milk glycoproteome. In this study, we have used quantitative glycoproteomics to compare the N-glycoprotein profiles of 54 milk samples from Rochester urban/suburban and OOM mothers, two populations with different lifestyles, exposures, and risk of atopic disease. We also compared N-glycoprotein profiles according to the presence or absence of atopic disease in the mothers and, separately, the children. We identified 79 N-glycopeptides from 15 different proteins and found that proteins including immunoglobulin A1, polymeric immunoglobulin receptor, and lactotransferrin displayed significant glycan heterogeneity. We found that the abundances of 38 glycopeptides differed significantly between Rochester and OOM mothers and also identified three glycopeptides with significantly different abundances between all comparisons. These three glycopeptides may be associated with the development of atopic disease. The findings of this study suggest that the differential glycosylation of milk proteins could be linked to atopic disease.

Introduction

Allergic diseases and asthma are becoming more common worldwide. While an increase in environmental risk factors has contributed to the rise in prevalence, there are also associations between early life circumstances and the development of these diseases [1]. The prevalence of allergic disease and asthma is higher in affluent, developed countries and has increased significantly during the past five decades. The rapid changes in prevalence suggest environmental and lifestyle behaviors as a more likely origin than genetics [2]. Childhood atopic disease includes atopic dermatitis, food allergy, asthma, and allergic rhinitis. The role of breastfeeding in the prevention of childhood atopic disease is controversial, with conflicting data reported. The findings of some studies have shown that breastfeeding decreases the risk of developing atopic disease, while other studies indicate that it either increases the risk or does not affect the development of atopic disease. However, evidence suggests that exclusive breastfeeding for around 4 months may confer a protective effect against the development of childhood atopic disease, especially atopic dermatitis and wheeze [3–5].

Epidemiological studies have shown that children who grow up on farms are at a significantly lower risk of developing childhood atopic disease than children from the same area who do not grow up on farms. This protective “farm effect” is due to factors including early-life contact with livestock and animal feed, as well as the consumption of unprocessed cow’s milk. The timing of exposure is crucial for this protective effect, with the strongest effects observed for exposures that occurred in or before the first year of life and those that continued up to the fifth year of life [6, 7]. The Old Order Mennonites (OOM) in Upstate New York predominantly live on farms, have large families and home births, and avoid the use of antibiotics. Having more siblings has been associated with a lower prevalence of hay fever, and vaginal delivery at home has been shown to confer the greatest protection against atopic disease. This OOM community has been found to have a very low rate of allergic disease as compared to a sample of the US population of similar age, sex, and race [2]. Further, the overall prevalence of food allergies was significantly lower in the New York OOM community as compared to the general US population. Rates of breastfeeding at 6 and 12 months and rates of exclusive breastfeeding at 3 and 6 months were also significantly higher in the OOM community than in New York State [8].

The proteins in human milk are heavily glycosylated, with up to 70% of human milk proteins estimated to be glycosylated [9]. While the human milk proteome, especially the major components of the protein fraction in milk, has been quite extensively studied in comparison [10–12], studies that have investigated the human milk glycoproteome are fewer and have to a large degree focused on characterizing the N-glycoproteins in milk. One study used LC-MS to identify 32 glycoproteins in milk samples from one donor [13], while another study used multiple reaction monitoring to quantify seven milk proteins and their site-specific N-glycosylation from three milk samples [14]. A study by Cao et al. used quantitative glycoproteomics to compare the N-glycoproteins in human colostrum and mature milk and discovered 68 N-glycosylation sites on 38 proteins that were differentially expressed between the groups [15]. The same authors have also compared the N-glycosylation of human milk fat globule membrane proteins between colostrum and mature milk. This study identified 220 N-glycoproteins with 304 N-glycosylation sites that were differentially expressed between the groups [16]. Previous studies investigating the differences in protein or glycoprotein profiles depending on the presence or absence of atopic disease are very few. One previous study compared the milk proteomes between samples from allergic and non-allergic mothers and discovered 19 proteins whose levels differed significantly between the groups, with the authors proposing that these proteins may be linked to allergy and asthma [17].

In this study, we have used quantitative mass spectrometry-based glycoproteomics to analyze the N-glycoprotein profiles of a total of 54 milk samples from Rochester and OOM mothers, two populations with different lifestyles and exposures. The prevalence of allergic disease is higher in Rochester mothers, and Rochester infants are at higher risk of childhood atopic disease as compared to the OOM of the Finger Lakes region in New York. We compared the N-glycopeptide profiles between these two communities as well as between samples according to the presence or absence of atopic disease in the mothers and, separately, the children (regardless of community). Here, we show that the glycosylation of milk proteins varies depending on lifestyle as well as the presence or absence of atopic disease in both children and, separately, mothers. To the best of our knowledge, this is the first study to compare the milk glycoproteome between two communities with different lifestyles and a high or low risk of atopic disease.

Results

Study population and design

Milk samples from a total of 54 mothers were analyzed in this study (see the “Materials and methods” section for details). The samples were divided into groups depending on community (Rochester or OOM) and the N-glycopeptide profiles were compared between these groups. The N-glycopeptide profiles were also compared between samples from mothers whose children developed atopic disease in the first 3 years of life (by parent report) and those whose children did not and, separately, between samples from mothers with or without atopic disease, regardless of which community the mothers belonged to. In our study, childhood atopic disease was reported in twice as many Rochester children (n = 6) as OOM children (n = 3). Atopic disease was also more common in Rochester mothers (n = 16) than in OOM mothers (n = 9).

Identified glycoproteins

We identified a total of 79 N-glycopeptides from 15 different proteins. The 15 proteins identified are listed along with their UniProt ID and the number of identified glycopeptides for each protein in Table 1. For each of the 79 N-glycopeptides, we also identified the N-glycosylation site, peptide sequence, glycan composition, and proposed glycan structure (given in S1 Table). Different glycoforms were identified for multiple proteins, such as the ones presented below.

Table 1. Summary of results.

| Protein name | Protein UniProt ID | Number of identified glycopeptides |
|---|---|---|
| Alpha-1-antitrypsin | A1AT_HUMAN | 1 |
| Butyrophilin subfamily 1 member A1 | BT1A1_HUMAN | 1 |
| Alpha-S1-casein | CASA1_HUMAN | 9 |
| Clusterin | CLUS_HUMAN | 2 |
| Chordin-like protein 2 | CRDL2_HUMAN | 1 |
| Fibrinogen gamma chain | FIBG_HUMAN | 2 |
| Hemopexin | HEMO_HUMAN | 1 |
| Haptoglobin | HPT_HUMAN | 1 |
| Immunoglobulin heavy constant alpha 1 | IGHA1_HUMAN | 8 |
| Immunoglobulin heavy constant alpha 2 | IGHA2_HUMAN | 1 |
| Immunoglobulin J chain | IGJ_HUMAN | 1 |
| Alpha-lactalbumin | LALBA_HUMAN | 1 |
| Lactadherin | MFGM_HUMAN | 2 |
| Polymeric immunoglobulin receptor | PIGR_HUMAN | 24 |
| Lactotransferrin | TRFL_HUMAN | 24 |

Glycan compositions and proposed structures

The proposed structures for the 79 N-glycan compositions identified matched database entries in the GlyTouCan repository (https://glytoucan.org), enabling the provision of the proposed glycan structures for these glycopeptides. The majority of the glycan compositions identified were classified as complex-type N-glycans, although nine high-mannose N-glycans were also identified. Additionally, two glycan compositions were classified as hybrid-type N-glycans, namely H6N3 and H4N3F1. Further, three glycan compositions were classified as monoantennary N-glycans, namely H4N3F2, H4N3F1, and S1H4N3F1. The glycan composition H4N3F1 that was classified as a hybrid-type N-glycan was found at site 156 on lactotransferrin, while the glycan composition H4N3F1 that was classified as a monoantennary N-glycan was found at site 497 on lactotransferrin.

Alpha-S1-casein (CASA1) was found to have nine glycan compositions, all at site 69. In other words, nine different N-glycopeptides were identified that belonged to alpha-S1-casein. Eight of nine glycan compositions were fucosylated, while four were sialylated. One glycan composition identified, H6N3, was classified as a hybrid-type N-glycan, while the others were classified as complex-type N-glycans. For immunoglobulin A1 (IGHA1, IgA1) we identified five different glycan compositions at site 340 and three at site 144. Of these eight glycans, four were classified as complex-type N-glycans, while the remaining four were classified as high-mannose N-glycans. IgA1 with site 340 indicated is shown in Fig 1A, and the five glycan compositions identified at site 340 are shown in Fig 1B. A total of 24 different glycan compositions at four different N-glycosylation sites were identified for polymeric immunoglobulin receptor (pIgR), the majority of which were complex-type N-glycans. The N-glycosylation site with the most glycan compositions identified was site 469, at which eight different glycans were identified. Different glycoforms of lactotransferrin (TRFL) were also identified, with a total of 24 different glycopeptides being identified. Of these, one was classified as a high-mannose glycan, one as a hybrid-type N-glycan, and two as monoantennary N-glycans. The remainder were complex-type N-glycans. Site 497 was the N-glycosylation site with the largest microheterogeneity, with a total of 11 different glycans identified at this site.

Fig 1.

A) IgA1 with N-glycosylation site 340 shown. B) The five different glycan compositions identified at site 340 of IgA1. Five different N-glycopeptides were identified that belonged to IgA1 and contained glycans attached at site 340. Fig 1A shows IgA1 with two of these glycans as an example (it is not possible to determine if these two glycans were both detected on the same IgA1 antibody). The heavy chains are shown in gray and the light chains in blue. The three high-mannose N-glycans and two complex-type N-glycans identified at site 340 are shown in Fig 1B. Monosaccharide symbols follow the SNFG (Symbol Nomenclature for Glycans) system [18]. The m/z value and charge are given below the glycan structures. For the glycan structures, the following abbreviations were used: H = hexose, N = N-acetylhexosamine, F = fucose, S = sialic acid. The peptide sequence is given at the bottom of the figure, with asparagine (N) being the amino acid to which the glycan is linked. L = leucine, A = alanine, G = glycine, K = lysine, P = proline, T = threonine, H = histidine, V = valine, S = serine, M = methionine, E = glutamic acid, D = aspartic acid, C = cysteine, Y = tyrosine.

Of the nine glycan compositions identified that were part of glycopeptides originating from alpha-S1-casein, all but one contained fucose residues. Four of these glycans also contained sialic acid residues. In contrast, only two of the eight glycan compositions identified on IgA1 glycopeptides contained fucose residues, while one of the eight contained a sialic acid residue. Out of the 24 glycopeptides identified that were part of pIgR, more than half were fucosylated or sialylated, with 11 glycopeptides containing both a fucose residue and a sialic acid residue. The same was found true for lactotransferrin, with many of the glycopeptides identified being both fucosylated and sialylated. The glycan composition S1H5N4F2 was identified twice at site 69 on alpha-S1-casein. S1H5N4F2 was also identified twice at site 469 and twice at site 421 on pIgR. The glycan composition S1H5N4F1 was identified twice at site 421 on pIgR. In these cases, the same composition was found in different charge states and with different m/z values and was therefore identified twice at the same site.
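
The composition shorthand used throughout this section (for example S1H5N4F2) can be read mechanically. The following is a minimal sketch, not the authors' software, that parses such a string into residue counts and flags fucosylated and sialylated compositions. The function names are hypothetical, and the residue codes follow the abbreviations given in the figure captions (H = hexose, N = N-acetylhexosamine, F = fucose, S = sialic acid).

```python
import re

# Residue codes as defined in the figure captions of this article:
# H = hexose, N = N-acetylhexosamine, F = fucose, S = sialic acid.
_COMPOSITION = re.compile(r"(?:[SHNF]\d+)+")
_TOKEN = re.compile(r"([SHNF])(\d+)")


def parse_composition(composition: str) -> dict[str, int]:
    """Turn a string such as 'S1H5N4F2' into per-residue counts."""
    if not _COMPOSITION.fullmatch(composition):
        raise ValueError(f"unrecognized composition: {composition!r}")
    counts = {"S": 0, "H": 0, "N": 0, "F": 0}
    for code, number in _TOKEN.findall(composition):
        counts[code] += int(number)
    return counts


def is_fucosylated(composition: str) -> bool:
    """True if the composition contains at least one fucose."""
    return parse_composition(composition)["F"] > 0


def is_sialylated(composition: str) -> bool:
    """True if the composition contains at least one sialic acid."""
    return parse_composition(composition)["S"] > 0


# Compositions named in the text above (not the full set of 79).
named = ["H6N3", "H4N3F1", "H4N3F2", "S1H4N3F1", "S1H5N4F2", "S1H5N4F1"]
fucosylated = [c for c in named if is_fucosylated(c)]
sialylated = [c for c in named if is_sialylated(c)]
print("fucosylated:", fucosylated)
print("sialylated:", sialylated)
```

Counting fucosylated or sialylated glycopeptides per protein, as reported above, then reduces to applying these two predicates to the compositions listed in S1 Table.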

Differences between Rochester and OOM mothers

The samples analyzed in this study were first divided according to community, Rochester (n = 20) or OOM (n = 34). The 79 N-glycopeptides identified in this study were compared between these two cohorts and a total of 38 glycopeptides were found to have significantly different (p<0.05) abundances between the two groups. The top 10 glycopeptides according to fold change are given in Table 2 and all 38 glycopeptides are given in S2 Table. The majority of the glycan compositions identified were classified as complex-type N-glycans, although three glycans were classified as high-mannose N-glycans and two as monoantennary N-glycans. Out of the 38 N-glycopeptides, 26 were fucosylated and 22 were sialylated. Fifteen of 38 glycan compositions contained both fucose and sialic acid residues. The largest fold change (88.9) was observed for an N-glycopeptide originating from Immunoglobulin A2 (IGHA2, IgA2). The abundance of this N-glycopeptide, whose glycan composition was H3N5F1, was higher in the milk of Rochester mothers.
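
The group comparisons in this section rely on the Mann-Whitney U test (see the p-value columns of Tables 2 to 4). As an illustration of what that test computes, here is a minimal pure-Python sketch using the normal approximation with a tie correction. It is an assumption-laden stand-in: the software and exact variant used by the authors are not stated in this section, and the abundance values in the usage example are invented.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U statistic for sample x, approximate two-sided p-value).
    Tied values receive average ranks; no continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted(
        (value, group) for group, sample in enumerate((x, y)) for value in sample
    )
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        # Extend j to the end of the run of values tied with pooled[i].
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u_x, p_value


# Hypothetical abundances for one glycopeptide in two groups.
rochester = [5.1, 6.3, 7.2, 8.0, 9.4]
oom = [2.2, 3.1, 3.9, 4.4, 5.0, 5.6]
u, p = mann_whitney_u(rochester, oom)
print(f"U = {u}, p = {p:.4f}")
```

With group sizes such as 20 and 34 the normal approximation is usually adequate; for very small groups an exact test would be the safer choice.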

Table 2. The top 10 significant (p<0.05) glycopeptides according to fold change when samples were compared between Rochester mothers and OOM mothers.

Nine out of 10 glycopeptides were also significant as analyzed by the Benjamini-Hochberg method. Further details can be found in S2 Table.

| Protein UniProt ID | N-glycosylation site | Peptide sequence | Glycan composition | Fold change | Mann-Whitney U test p-value |
|---|---|---|---|---|---|
| IGHA2_HUMAN | 205 | TPLTANITK | H3N5F1 | 88.93 | 3.45E-03 |
| FIBG_HUMAN | 78 | VDKDLQSLEDILHQVENK | S1H5N4F2 | 3.35 | 5.65E-04 |
| TRFL_HUMAN | 497 | TAGWNIPMGLLFNQTGSCK | S2H5N4F1 | 2.86 | 5.06E-04 |
| CASA1_HUMAN | 69 | NESTQNCVVAEPEK | S1H5N4F2 | 2.76 | 7.99E-03 |
| CASA1_HUMAN | 69 | NESTQNCVVAEPEK | S1H5N4F2 | 2.29 | 1.54E-02 |
| TRFL_HUMAN | 497 | TAGWNIPMGLLFNQTGSCK | S1H5N4F1 | 2.13 | 3.44E-04 |
| PIGR_HUMAN | 499 | WNNTGCQALPSQDEGPSK | S1H5N4 | 2.10 | 4.64E-06 |
| IGHA1_HUMAN | 340 | LAGKPTHVNVSVVMAEVDGTCY | H6N2 | 2.00 | 2.82E-03 |
| PIGR_HUMAN | 186 | QIGLYPVLVIDSSGYVNPNYTGR | S2H5N4 | 1.87 | 1.76E-04 |
| PIGR_HUMAN | 421 | LSLLEEPGNGTFTVILNQLTSR | S1H5N4F1 | 1.86 | 2.94E-02 |
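
The note to Table 2 refers to the Benjamini-Hochberg method. The sketch below shows how Benjamini-Hochberg adjusted p-values are computed in general. It is illustrative only: the function name is hypothetical, and applying it to just the ten p-values of Table 2 does not reproduce the authors' result, because their correction would have been run over the full set of tested glycopeptides (listed in S2 Table), which changes the number of hypotheses.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Raw p-values copied from Table 2 (top 10 rows only, in table order).
table2_p = [3.45e-03, 5.65e-04, 5.06e-04, 7.99e-03, 1.54e-02,
            3.44e-04, 4.64e-06, 2.82e-03, 1.76e-04, 2.94e-02]
for raw, adj in zip(table2_p, benjamini_hochberg(table2_p)):
    print(f"raw = {raw:.2e}  adjusted = {adj:.2e}")
```

A glycopeptide is then called significant at a chosen false discovery rate (for example 0.05) when its adjusted p-value falls at or below that threshold.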

Alpha-S1-casein

The abundances of two of the nine N-glycopeptides identified at site 69 on alpha-S1-casein were significantly different when compared between Rochester and OOM mothers. These two glycan compositions are given in S3 Table. Both glycan compositions were sialylated and classified as complex-type N-glycans, and the abundances of these glycopeptides were higher in samples from Rochester mothers.

IgA

The abundances of five of the eight N-glycopeptides identified that belonged to IgA1 were significantly different when compared between Rochester and OOM mothers (S3 Table). Three were classified as complex-type N-glycans and two as high-mannose N-glycans. Four of the glycan compositions were located at site 340 of IgA1, with the remaining one located at site 144. The glycan composition located at site 144, H3N5, was classified as a complex-type N-glycan with a bisecting GlcNAc. One of the significantly different N-glycopeptides, S1H4N5F1, also contained a sialic acid residue. Further, we also identified an N-glycopeptide belonging to IgA2, which had a fold change of 88.9 between the groups and a higher abundance in Rochester mothers. The glycan composition of this glycopeptide was H3N5F1, which was classified as a complex-type N-glycan.

pIgR

Out of the 24 N-glycopeptides identified that belonged to pIgR, given in Fig 2, the abundances of 14 glycopeptides, given in S3 Table, were significantly different between samples from Rochester and OOM mothers. At site 421, four glycan compositions were identified with different abundances between the groups. An N-glycopeptide containing the complex-type N-glycan S2H5N4F1 was identified at this site and had a fold change of 0.04 when compared between samples from Rochester and OOM mothers, indicating that levels of this N-glycopeptide were around 25 times higher in samples from OOM mothers. At site 469, four glycan compositions with significantly different abundances between the groups were identified, with three of these compositions displaying higher abundances in samples from OOM mothers. At site 499, five glycan compositions with different abundances between the groups were identified. Three of these five N-glycopeptides had significantly higher abundances in samples from Rochester mothers, among them an N-glycopeptide containing the high-mannose glycan H4N2. The glycopeptide containing the glycan composition S1H5N4 at site 499 was found to have a higher abundance in Rochester mothers, while the glycopeptide containing the composition S2H5N4 at site 499 had a higher abundance in OOM mothers.

Fig 2. The 24 glycan compositions identified belonging to N-glycopeptides that were part of pIgR.

Two glycan compositions were identified at site 186, seven at site 421, eight at site 469, and seven at site 499. The m/z value and charge are given above the glycan structures. Monosaccharide symbols follow the SNFG (Symbol Nomenclature for Glycans) system [18]. For the glycan structures, the following abbreviations were used: H = hexose, N = N-acetylhexosamine, F = fucose, S = sialic acid. Gray boxes indicate a significant (p<0.05) p-value. The fold change (FC) when abundances were compared between Rochester and OOM mothers is also given, with values greater than 1 indicating a higher abundance of the glycopeptide containing that specific glycan composition in Rochester mothers and values less than 1 indicating a higher abundance in OOM mothers. NH2 indicates the N-terminus and COOH the C-terminus of the protein.
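
The fold-change convention described in this caption (values above 1 mean higher abundance in Rochester mothers, values below 1 mean higher abundance in OOM mothers) can be made explicit with a small helper. This is a minimal sketch with a hypothetical function name; it simply inverts values below 1 so that both directions are reported as "times higher".

```python
def describe_fold_change(fold_change: float) -> tuple[str, float]:
    """Return (group with the higher abundance, how many times higher).

    Assumes fold change is defined as Rochester abundance / OOM abundance,
    as stated in the Fig 2 caption.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    if fold_change == 1:
        return "neither", 1.0
    if fold_change > 1:
        return "Rochester", fold_change
    return "OOM", 1 / fold_change


# Values quoted in the text and tables of this article.
for fc in (88.93, 0.04, 0.55):
    group, ratio = describe_fold_change(fc)
    print(f"FC = {fc}: higher in {group} by a factor of about {ratio:.1f}")
```

For the pIgR glycopeptide at site 421 with a fold change of 0.04, this gives a factor of about 25 in favor of OOM mothers, matching the statement in the text.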

Lactotransferrin

Of the 24 different N-glycopeptides identified that belonged to lactotransferrin, the abundances of nine glycopeptides, given in S3 Table, were significantly different between samples from Rochester and OOM mothers. Eight of nine glycan compositions were located at site 497, while the ninth glycan was located at site 156. Interestingly, the four N-glycopeptides with higher abundances in Rochester mothers all contained sialylated glycans. The glycan compositions of the five N-glycopeptides with higher abundances in OOM mothers were all unsialylated. The N-glycopeptide containing the glycan composition H5N4F1 at site 497 had the largest fold change (8.9) between the groups.

Differences between children with or without atopic disease

The samples in this study were also compared between mothers whose children developed atopic disease and those whose children did not (regardless of which community the mothers belonged to). Out of the 79 N-glycopeptides identified, the abundances of eight glycopeptides (given in Table 3) were significantly different between the two groups. Further details can be found in S4 Table. Six of eight glycan compositions were classified as complex-type N-glycans, while H8N2 was classified as a high-mannose N-glycan and H4N3F2 as a monoantennary N-glycan. The abundances of two glycopeptides were higher in samples from mothers whose children developed atopic disease, while the abundances of the remaining glycopeptides were higher in samples from mothers whose children did not develop atopic disease. The N-glycopeptide with the largest fold change (3.7) was identified on chordin-like protein 2 (CRDL2) and was more abundant in samples from mothers of children that developed atopic disease.

Table 3. The eight significant (p<0.05) glycopeptides identified, given according to fold change, when samples were compared between children who developed atopic disease and children who did not.

Further details can be found in S4 Table.

| Protein UniProt ID | N-glycosylation site | Peptide sequence | Glycan composition | Max fold change | Mann-Whitney U test p-value |
|---|---|---|---|---|---|
| CRDL2_HUMAN | 114 | SCQHNGTMYQHGEIFSAHELFPSR | S1H5N4F1 | 3.69 | 1.67E-02 |
| PIGR_HUMAN | 499 | WNNTGCQALPSQDEGPSK | S1H5N4 | 1.52 | 1.04E-02 |
| BT1A1_HUMAN | 55 | LSPNASAEHLELR | S2H5N4F1 | 0.76 | 4.30E-02 |
| PIGR_HUMAN | 469 | VPGNVTAVLGETLK | H4N3F2 | 0.75 | 3.86E-02 |
| IGHA1_HUMAN | 340 | LAGKPTHVNVSVVMAEVDGTCY | S1H4N5F1 | 0.45 | 2.20E-02 |
| HPT_HUMAN | 241 | VVLHPNYSQVDIGLIK | S2H5N4 | 0.42 | 1.11E-02 |
| PIGR_HUMAN | 499 | WNNTGCQALPSQDEGPSK | H5N4F2 | 0.37 | 3.95E-03 |
| IGHA1_HUMAN | 340 | LAGKPTHVNVSVVMAEVDGTCY | H8N2 | 0.32 | 2.33E-02 |
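
Each peptide in the tables carries its glycan on an asparagine (N). As a plausibility check, one can look for the standard N-glycosylation consensus motif N-X-S/T, where X is any residue except proline. That motif is general background knowledge rather than something stated in this article, and the helper below is a hypothetical sketch, not part of the authors' pipeline.

```python
import re

# Asparagine followed by any residue except proline, then serine or threonine.
# The lookahead keeps matches from consuming residues, so overlapping
# candidates such as "NNT" are all examined.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(peptide: str) -> list[int]:
    """0-based indices of asparagines that start an N-X-S/T motif (X != P)."""
    return [match.start() for match in SEQUON.finditer(peptide)]


# Peptide sequences copied from Table 3.
table3_peptides = [
    "SCQHNGTMYQHGEIFSAHELFPSR",
    "WNNTGCQALPSQDEGPSK",
    "LSPNASAEHLELR",
    "VPGNVTAVLGETLK",
    "LAGKPTHVNVSVVMAEVDGTCY",
    "VVLHPNYSQVDIGLIK",
]
for peptide in table3_peptides:
    print(peptide, sequon_positions(peptide))
```

The check only sees the peptide itself. For a peptide whose asparagine lies within two residues of the C-terminus, such as VDKDLQSLEDILHQVENK in Table 2, the third position of the motif falls outside the peptide, so the full protein sequence would be needed.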

Differences between atopic and non-atopic mothers

As significant differences were seen in the abundances of multiple N-glycopeptides when compared between samples from mothers whose children developed atopic disease (n = 9) versus those that did not (n = 42), we also decided to compare the samples between mothers with (n = 25) and without atopic disease (n = 26), regardless of community. The abundances of 18 out of 79 N-glycopeptides (given in S5 Table) were significantly different between samples from mothers with atopic disease and those without. Three glycan compositions were classified as high-mannose N-glycans and one as a monoantennary N-glycan, while the remainder were classified as complex-type N-glycans. The abundances of all but one N-glycopeptide were higher in mothers without atopic disease.

Similarities between the comparisons

In Fig 3A, a Venn diagram comparing the significantly different N-glycopeptides in each comparison against each other is shown. Three N-glycopeptides were identified that had significantly different abundances in all three comparisons in this study (see Table 4 for details). The glycan compositions of these three N-glycopeptides are presented in Fig 3B. The abundances of these three N-glycopeptides were higher in samples from OOM mothers, samples from mothers whose children did not develop atopic disease, and samples from mothers without atopic disease. The N-glycopeptide containing the glycan composition H8N2, which was classified as a high-mannose N-glycan, was identified at site 340 on IgA1. The glycan composition H5N4F2, a complex-type N-glycan, was identified at site 499 on pIgR, while the composition H4N3F2, a monoantennary N-glycan, was identified at site 469 on pIgR. The proposed structures of these three glycan compositions matched entries in the GlyTouCan database.

Fig 3.

(A) The Venn diagram showing the significant N-glycopeptides identified in each comparison in relation to each other. (B) The three glycan compositions with significantly different abundances in all three comparisons in this study. Monosaccharide symbols follow the SNFG (Symbol Nomenclature for Glycans) system [18]. For the glycan structures, the following abbreviations were used: H = hexose, N = N-acetylhexosamine, F = fucose, S = sialic acid. The m/z value and charge are given to the left of the glycan structures. OOM = Old Order Mennonites. IgA1 = immunoglobulin A1. pIgR = polymeric immunoglobulin receptor.

Table 4. The three N-glycopeptides identified that had significantly different abundances in all three comparisons (according to community, presence of atopic disease in children, and presence of atopic disease in mothers) in this study.

| Protein UniProt ID | N-glycosylation site | Peptide sequence | Glycan composition | Rochester vs. OOM mothers: fold change | Rochester vs. OOM mothers: Mann-Whitney U test p-value | Atopic disease vs. no disease (children): fold change | Atopic disease vs. no disease (children): Mann-Whitney U test p-value | Atopic disease vs. no disease (mothers): fold change | Atopic disease vs. no disease (mothers): Mann-Whitney U test p-value |
|---|---|---|---|---|---|---|---|---|---|
| PIGR_HUMAN | 469 | VPGNVTAVLGETLK | H4N3F2 | 0.55 | 7.38E-04 | 0.75 | 3.86E-02 | 0.69 | 4.79E-03 |
| PIGR_HUMAN | 499 | WNNTGCQALPSQDEGPSK | H5N4F2 | 0.49 | 7.24E-03 | 0.37 | 3.95E-03 | 0.41 | 1.78E-04 |
| IGHA1_HUMAN | 340 | LAGKPTHVNVSVVMAEVDGTCY | H8N2 | 0.48 | 2.60E-02 | 0.32 | 2.33E-02 | 0.37 | 1.69E-03 |

The abundances of two N-glycopeptides were significantly different both when compared between samples from Rochester and OOM mothers and mothers whose children developed atopic disease versus mothers whose children did not, but not when compared between mothers with or without atopic disease. The N-glycopeptide with higher abundances in samples from Rochester mothers and samples from mothers whose children developed atopic disease was identified as belonging to pIgR. The glycan component of this glycopeptide, S1H5N4, was found at site 499 and classified as a complex-type N-glycan. The N-glycopeptide containing the glycan composition S1H4N5F1, also a complex-type glycan, had higher abundances in samples from OOM mothers and mothers whose children did not develop atopic disease. This glycan composition was found at site 340 of IgA1.
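
The overlaps described in this subsection are plain set intersections. The sketch below illustrates the idea with glycopeptides keyed by (protein, site, glycan composition). The children set is complete (Table 3), but the community and mothers sets are deliberately partial and contain only the entries named in Table 4 and in the paragraph above; the full lists are in S2 and S5 Tables. All variable names are hypothetical.

```python
# Complete list for the children comparison (Table 3).
children = {
    ("CRDL2", 114, "S1H5N4F1"),
    ("PIGR", 499, "S1H5N4"),
    ("BT1A1", 55, "S2H5N4F1"),
    ("PIGR", 469, "H4N3F2"),
    ("IGHA1", 340, "S1H4N5F1"),
    ("HPT", 241, "S2H5N4"),
    ("PIGR", 499, "H5N4F2"),
    ("IGHA1", 340, "H8N2"),
}

# PARTIAL list for the community comparison: only the entries that the
# text and Table 4 state are shared with the children comparison.
community = {
    ("PIGR", 469, "H4N3F2"),
    ("PIGR", 499, "H5N4F2"),
    ("IGHA1", 340, "H8N2"),
    ("PIGR", 499, "S1H5N4"),
    ("IGHA1", 340, "S1H4N5F1"),
}

# PARTIAL list for the mothers comparison: only the Table 4 entries.
mothers = {
    ("PIGR", 469, "H4N3F2"),
    ("PIGR", 499, "H5N4F2"),
    ("IGHA1", 340, "H8N2"),
}

shared_by_all = community & children & mothers
community_and_children_only = (community & children) - mothers

print("all three comparisons:", sorted(shared_by_all))
print("community and children only:", sorted(community_and_children_only))
```

Keying on the full (protein, site, composition) triple matters here, because the same composition can occur at several sites or on several proteins.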

Discussion

In this study, we analyzed a total of 54 milk samples that were divided according to community (Rochester or OOM) and, separately, the presence or absence of atopic disease in children and mothers, regardless of community. The Rochester community represents a population at high risk of atopic disease, while the OOM community represents a population at low risk of atopic disease. A total of 38 N-glycopeptides had significantly different abundances when samples were compared according to community, while eight and 18 significantly different N-glycopeptides were identified when samples were compared according to the presence or absence of atopic disease in children and mothers, respectively. These findings indicate that the significant differences seen in milk N-glycopeptide profiles between Rochester and OOM mothers, although they may to some degree be related to the development of atopic disease, are likely due to factors such as genetics, lifestyle, and exposures. The abundances of six of the eight N-glycopeptides identified when samples were compared according to the presence or absence of atopic disease in children were higher in samples from mothers whose children did not develop atopic disease. This suggests that these glycopeptides may be associated with protection against atopic disease in children.

Three N-glycopeptides were identified that had significantly different abundances in all three comparisons in this study. As the abundances of all three N-glycopeptides were higher in the OOM community and in samples from mothers without atopic disease and from mothers whose children did not develop atopic disease, these findings indicate that these glycan compositions may reflect various environmental exposures but could be linked to the development of atopic disease and could affect the development of the infant’s immune system. Two glycan compositions were identified on pIgR and the third on IgA. IgA is the predominant immunoglobulin in human milk and is produced by B cells that have migrated to the mammary gland. pIgR is present at the basal surface of mammary epithelial cells, where it binds to IgA and is subsequently internalized into endosomes. This complex is then transported via transcytosis to the apical surface, where pIgR is proteolytically cleaved, leading to the release of secretory IgA (SIgA). SIgA, which consists of IgA and a large extracellular fragment of pIgR, the so-called secretory component, is then secreted into mammary alveoli [19, 20]. A study by Steffen et al. showed that altered glycosylation affects the effector functions of IgA. The authors found that the enzymatic removal of sialic acid increases the pro-inflammatory capacity of IgA1 [21]. Our findings suggest that the differential glycosylation of pIgR and IgA, two proteins that interact closely with each other, may have a role in the development of atopic disease and may confer a protective effect against the development of atopic disease in both mothers and children. It is likely that the differences in glycosylation observed for IgA and pIgR in our study lead to functional differences, although the exact effects are unknown.

Strengths of this study include access to the OOM community, a unique population with an agrarian lifestyle and a low rate of allergic disease, and the large number of samples analyzed, which enabled the comparison of N-glycopeptide profiles between different groups and lends validity to the findings of our statistical analyses. The number of samples analyzed in our study is larger than what previous glycoproteomic studies of milk have used and, to the best of our knowledge, this is the first study to comprehensively analyze the N-glycoproteome between two communities at different risk of atopic disease. Further, in several cases, the same glycan composition, although with different charge states and m/z values, was identified twice at the same site of a protein. This serves to validate our findings, especially regarding the quantitation, which was independently done. As a limitation, the follow-up on development of allergic diseases was limited to self-reporting of physician diagnosis and/or a retrospective query of allergic symptoms performed by a pediatric allergist. The latter was implemented to mitigate the possible bias due to lower healthcare utilization by the OOM.

In conclusion, we show that the differential glycosylation of milk proteins may play a role in the development of atopic disease. We identified six N-glycopeptides that may protect against the development of atopic disease in children, and 16 that may protect against the development of atopic disease in mothers. Further, we identified three glycan compositions that may confer a protective effect against the development of atopic disease in both mothers and children.

Materials and methods

Study population

Milk samples from 54 mothers from a previously studied cohort [22] were analyzed. Of these, 20 samples came from Rochester mothers and 34 from OOM mothers. The OOM of the Finger Lakes region in New York State were recruited by a nurse midwife during prenatal visits in her clinic. Rochester urban and suburban mothers were recruited from the University of Rochester Medical Center using posted fliers. This study was approved by the Institutional Review Board of the University of Rochester Medical Center (RSRB52971) and all subjects provided written consent prior to enrollment in the study. The samples used in this study are given in S6 Table together with relevant details.

Data were collected by questionnaires regarding maternal atopic disease (asthma, eczema, allergic rhinitis and food allergy), which was self-reported. In children, the presence of possible atopic disease in the first three years of life was determined through a blinded telephone follow-up for allergic symptoms, performed by a pediatric allergist (KJ). We queried physician-diagnosis of atopic dermatitis, allergic rhinitis, food allergy, and asthma, or symptoms consistent with atopic disease including chronic/remitting pruritic rash in a distribution age-typical for atopic eczema that had been treated by steroidal treatments, chronic or recurring symptoms of rhinorrhea/congestion/sneezing treated with antihistamines, symptoms suggestive of IgE-mediated food allergy (itching/swelling of lips/mouth/throat, urticaria or severe vomiting within 2 h of ingestion of a specific food) and allergic proctocolitis, and recurrent wheezing episodes. A few intermittent wheezing episodes associated with viral infections alone were not considered indicative of asthma due to young age, whereas recurrent exercise-induced symptoms and persistent wheezing treated with inhaled corticosteroids were labeled as “recurrent wheezing/asthma”. Some data regarding the presence or absence of atopic disease in mothers and children were unavailable (marked N/A in S6 Table) due to a lack of follow-up data.

Sample collection

Milk samples were collected in the morning using UV-irradiated sterile manual breast pumps (Harmony Breastpump, Medela) or by manual expression with gloved hands. Foremilk was collected after cleaning the breast with Castile soap. All milk samples were frozen at -20°C immediately upon collection and transferred to -80°C within four weeks for storage.

Trypsin digestion

Milk samples were defatted by centrifugation at 2,000g for 20 minutes at 4°C, after which the supernatant was transferred to a new tube and centrifuged at 5,000g for 10 minutes at 4°C. The supernatant was again collected and transferred to a new tube, and this supernatant was used for the BCA assay. Equivalent amounts of protein were adjusted to the same concentration and diluted 1:1 with PBS (pH 7.4), and 6 volumes of cold acetone were added to the samples. After vortexing to mix the samples, the tubes were incubated at -20°C for 2 hours. The samples were centrifuged at 10,000g for 30 minutes at 4°C and the supernatant was carefully decanted. Pellets were dissolved in 50 mM Tris buffer + 8 M urea (pH 8.0), and dithiothreitol (DTT) was added to a final concentration of 10 mM. Samples were reduced for 1 hour at room temperature (RT) with mixing. Iodoacetamide was added to a final concentration of 40 mM and the samples were alkylated at RT for 1 hour with shaking in the dark. DTT was then added at 40 mM to prevent overalkylation by iodoacetamide. Bovine pancreatic trypsin was added to the solution at a 1:50 ratio and the samples were incubated at 37°C overnight.

Lectin affinity chromatography

Lectin affinity chromatography was performed as described previously [23]. Briefly, tryptic peptides were diluted 1:10 with 10 mM HEPES buffer (pH 7.4) containing 1 mM CaCl2 and 1 mM MnCl2. Diluted sample mixtures were individually incubated with lectin-agarose columns. Con-A:SNA:LCA:AAL (5:3:3:1) were used for a final volume of 150 μL lectin resin slurry, which was mixed with samples and incubated overnight at 4°C with rotation. After 24 hours, the lectin beads were washed three times with HEPES buffer and the N-glycopeptides were eluted with a solution containing fucose, α-methyl mannoside, α-methyl glucoside, and lactose, followed by 1% formic acid. The N-glycopeptides were cleaned using C18 micro spin columns according to the manufacturer’s instructions. The eluted N-glycopeptides were dissolved in 0.1% formic acid for analysis by Ultra-Performance Liquid Chromatography (UPLC)-mass spectrometry (MS).
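As a worked example of the 5:3:3:1 lectin mixture, the sketch below splits the 150 μL of resin slurry among the four lectins. It assumes the ratio is expressed by slurry volume, which the text does not state explicitly; the function name `split_by_ratio` is illustrative only.

```python
# Ratio parts for each lectin, taken from the 5:3:3:1 mixture described above.
LECTIN_RATIO = {"Con-A": 5, "SNA": 3, "LCA": 3, "AAL": 1}


def split_by_ratio(total_ul, ratio):
    """Split a total volume (in microliters) in proportion to integer ratio parts.

    Assumption: the ratio refers to volumes of resin slurry.
    """
    parts = sum(ratio.values())
    return {name: total_ul * r / parts for name, r in ratio.items()}


volumes = split_by_ratio(150, LECTIN_RATIO)
for lectin, ul in volumes.items():
    print(f"{lectin}: {ul} uL")
```

Under that assumption, the 12 ratio parts correspond to 12.5 μL each, so Con-A contributes 62.5 μL, SNA and LCA 37.5 μL each, and AAL 12.5 μL.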

UPLC-MS/MS2 and -MSE

N-glycopeptides were analyzed using a Waters SYNAPT G2 High Definition MS connected to a Waters nanoACQUITY UPLC. MSE (100–2000 Da mass range) was performed in positive mode with sensitivity mode for quantification, and MS2 FAST DDA (positive and sensitivity) mode was used for N-glycopeptide fragmentation (50–2500 Da mass range) for identification. A scan time of one second was used for both MSE and FAST DDA. Trap collision energy (high energy function) was ramped from 14 to 44 V for MSE. Continuum data format and deisotope peak selection were set in the parameters. In FAST DDA, the trap collision energy was ramped from 20 to 60 V across the low-mass to high-mass range. Calibration was performed with sodium formate. The trap column was a nanoACQUITY UPLC Trap, 180 μm x 20 mm (5 μm), Symmetry®C18, and the analytical column was a nanoACQUITY UPLC, 75 μm x 100 mm (1.8 μm), HSS T3. Samples were loaded, trapped, and washed for two minutes at 8.0 μL/min with 1% B. The analytical gradient was as follows: 0–1 minutes 1% B, at 2 minutes 5% B, at 45 minutes 30% B, at 48 minutes 50% B, at 50 minutes 85% B, at 53 minutes 85% B, at 54 minutes 1% B, and at 60 minutes 1% B. The flow rate was 450 nL/min for MSE and 300 nL/min for N-glycopeptide fragmentation.
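The analytical gradient can be written as a table of breakpoints, which makes it easy to check the solvent composition at any time point. The sketch below assumes linear ramps between the stated breakpoints, which is the usual convention but is not stated in the text; the function name `percent_b` is illustrative.

```python
from bisect import bisect_right

# (time in minutes, % B) breakpoints taken from the gradient described above.
GRADIENT = [(0, 1), (1, 1), (2, 5), (45, 30), (48, 50),
            (50, 85), (53, 85), (54, 1), (60, 1)]


def percent_b(t_min):
    """Return % B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-60 minute gradient")
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1  # index of the segment start
    if i == len(GRADIENT) - 1:          # exactly at the final breakpoint
        return float(GRADIENT[-1][1])
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 1.5, 23.5, 45, 51, 60):
    print(f"{t:>5} min: {percent_b(t):.1f}% B")
```

For instance, halfway through the main 2 to 45 minute ramp (at 23.5 minutes) the interpolated composition is 17.5% B.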

Data analysis

The raw files were imported to Progenesis QI for proteomics (Version V2, Nonlinear Dynamics, Newcastle, UK). Post-acquisition mass correction was performed when the files were imported using a lock mass correction of 785.8426 m/z (doubly-charged Glu1-Fibrinopeptide B). The default parameters for peak picking and alignment were used. Progenesis QI for proteomics performs the label-free quantification and works as follows:

Run alignment

The run with the most peaks (ions) is used as a reference to which all other runs are aligned.

Peak picking

All data from the aligned runs is aggregated into a dataset that contains all peak information from all sample files. This aggregate peak list is then used to match each individual sample file.

Ion abundance quantification

Peptide ion abundance is the sum of the peak areas calculated from the intensity curves obtained from the MSE runs. Charge selection is applied by considering +3 to +5 charged ions as potential N-glycopeptides. The groups are then compared to reveal differences in abundance and, finally, the MS/MS data are matched to the MSE runs to collate quantitation and identification information.
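The logic of the steps above can be sketched in a few lines. This is only an illustration of the workflow as described here, not the internal implementation of Progenesis QI; the data layout (dictionaries with `charge` and `areas` keys), the run names, and all numbers are hypothetical.

```python
def pick_reference_run(peak_counts):
    """Return the name of the run with the most detected peaks (ions)."""
    return max(peak_counts, key=peak_counts.get)


def glycopeptide_candidates(ions, min_charge=3, max_charge=5):
    """Keep only ions whose charge state lies in the +3 to +5 window."""
    return [ion for ion in ions if min_charge <= ion["charge"] <= max_charge]


def ion_abundance(ion):
    """Abundance of one peptide ion: the sum of its peak areas."""
    return sum(ion["areas"])


# Hypothetical peak counts per run; the run with the most peaks is the reference.
peak_counts = {"run_A": 1200, "run_B": 1850, "run_C": 1490}
reference = pick_reference_run(peak_counts)

# Hypothetical ions with made-up charge states and peak areas.
ions = [
    {"id": "ion1", "charge": 2, "areas": [10.0, 5.0]},
    {"id": "ion2", "charge": 3, "areas": [4.0, 6.0, 2.5]},
    {"id": "ion3", "charge": 5, "areas": [1.5]},
    {"id": "ion4", "charge": 6, "areas": [9.0]},
]
abundances = {ion["id"]: ion_abundance(ion)
              for ion in glycopeptide_candidates(ions)}

print(reference)
print(abundances)
```

In this toy data set, `ion1` (+2) and `ion4` (+6) fall outside the charge window and are dropped before abundances are computed.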

N-glycopeptide ions were identified as previously described [23]. Briefly, deconvolution of the MS/MS spectra was performed using the MaxEnt3 module of Waters MassLynx 4.1 software and the spectra were exported as peak lists (.pkl). The publicly available software GlycopeptideID was used to identify the N-glycopeptides, as GlycopeptideID can analyze CID MS/MS spectra in an automated manner. A database of tryptic peptides from known human proteins was used (Uniprot, two miscleavages allowed, and the N-X-S/T/C sequon required in the peptide sequence, where X is any amino acid other than proline), and a glycan library was generated within the software by downloading human glycans from the GlyTouCan glycan structure repository (https://glytoucan.org). All deconvoluted MS/MS spectra are imported as .pkl files combined into one file. MS2 spectra are matched against this database and scores for potential peptide backbones are generated. GlycopeptideID then searches glycan compositions against a glycan database and returns glycan compositions, which are fitted onto the spectrum by generating glycan scores. Peptide and glycan scores are summed to provide glycopeptide scores [24]. The results are then ranked and an annotated spectrum is drawn for each potential result. Manual assessment of the matching y and b ions, glycan fragments, and glycopeptide fragments is subsequently performed. The target-decoy strategy is used to calculate the false-discovery rate. The decoy database was generated by reversing the peptide sequences of the tryptic peptide database specified above. The false-discovery rate was set to 2%.
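Three pieces of the identification workflow lend themselves to a short illustration: the sequon filter on the peptide database, the reversed-sequence decoys, and the target-decoy estimate of the false-discovery rate. The sketch below is not GlycopeptideID code. The peptide sequences and scores are invented, and the FDR estimator (decoy hits divided by target hits above a score threshold) is one common choice that is assumed here, since the text does not give the exact formula.

```python
import re

# N-X-S/T/C sequon, where X is any residue except proline.
SEQUON = re.compile(r"N[^P][STC]")


def has_sequon(peptide):
    """True if the peptide contains at least one N-X-S/T/C sequon (X != P)."""
    return SEQUON.search(peptide) is not None


def make_decoys(peptides):
    """Build decoy sequences by reversing each target peptide."""
    return [p[::-1] for p in peptides]


def estimated_fdr(hits, threshold):
    """Estimate FDR as decoy hits / target hits at or above a score threshold.

    hits is a sequence of (score, is_decoy) pairs. Assumed estimator.
    """
    hits = list(hits)
    targets = sum(1 for s, d in hits if s >= threshold and not d)
    decoys = sum(1 for s, d in hits if s >= threshold and d)
    return decoys / targets if targets else 0.0


# Invented example peptides, not taken from the study.
peptides = ["AANGSK", "ANPSK", "NPTNKTR", "PEPTIDEK"]
targets = [p for p in peptides if has_sequon(p)]
decoys = make_decoys(targets)

# Invented scored hits: (glycopeptide score, is_decoy).
hits = [(9.1, False), (8.4, False), (7.9, True),
        (7.5, False), (6.0, False), (5.2, True)]

print(targets, decoys)
print(estimated_fdr(hits, 8.0), estimated_fdr(hits, 7.0))
```

In practice the score threshold would be lowered until the estimated FDR reaches the chosen limit, which is 2% in this study.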

Glycan compositions

Glycan compositions are specified as one-letter abbreviations: H = hexose, N = hexosamine, S = sialic acid, and F = fucose. The number following each abbreviation indicates the number of monosaccharides. For example, S2H6N5 indicates a glycan containing two sialic acids, six hexoses, and five hexosamines. A simplified version of the Consortium for Functional Glycomics (CFG) Modified IUPAC condensed format is used for glycan structure output (see www.functionalglycomics.org/static/consortium/Nomenclature.shtml for details). The stereoisomer (α, β) and regioisomer (e.g., 1–4) notations are removed and single-letter codes (H = Hex, N = HexNAc, F = Fuc, S = NeuAc) are used for monosaccharides. Branching is shown by parentheses and written from the non-reducing to the reducing end. As the glycan compositions identified matched database entries in the GlyTouCan repository, it was possible to provide proposed glycan structures for these compositions. Glycans were also divided into complex-type, high-mannose, hybrid, or monoantennary N-glycans on the basis of their proposed structures.
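The composition shorthand is simple enough to parse mechanically. The following sketch converts a code such as S2H6N5 into monosaccharide counts; the function name `parse_composition` is illustrative and is not part of any software used in the study.

```python
import re

# One-letter codes as defined in the text above.
MONOSACCHARIDES = {"H": "hexose", "N": "hexosamine",
                   "S": "sialic acid", "F": "fucose"}
TOKEN = re.compile(r"([HNSF])(\d+)")


def parse_composition(code):
    """Convert a composition code such as 'S2H6N5' into a dict of counts."""
    if not re.fullmatch(r"(?:[HNSF]\d+)+", code):
        raise ValueError(f"not a valid composition code: {code!r}")
    counts = {}
    for letter, number in TOKEN.findall(code):
        name = MONOSACCHARIDES[letter]
        counts[name] = counts.get(name, 0) + int(number)
    return counts


print(parse_composition("S2H6N5"))
```

Codes containing letters outside H, N, S, and F, or letters without a count, raise a `ValueError` rather than being silently ignored.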

Statistical analysis

The samples were divided into groups based on community (Rochester or OOM), as well as the presence or absence of atopic disease in the children and mothers (see S6 Table). The Mann-Whitney U test was used to analyze the differences between the groups and a p-value of <0.05 was considered significant.
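For readers unfamiliar with the test, the sketch below computes the Mann-Whitney U statistic and a two-sided p-value for two groups of abundances. It uses the tie-corrected normal approximation without a continuity correction, which is an assumption: the text does not say which software or p-value method was used, and statistical packages typically switch to an exact calculation for small samples. The example values are invented.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    # U for x: count pairs where x exceeds y; ties count one half.
    u1 = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    # Tie correction: sum of (t^3 - t) over groups of tied values.
    pooled = sorted(list(x) + list(y))
    n = n1 + n2
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        tie_term += t ** 3 - t
        i = j

    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical, so no evidence of a difference
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p


# Invented abundances for one glycopeptide in two groups.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]
u_stat, p_value = mann_whitney_u(group_a, group_b)
print(u_stat, round(p_value, 4), p_value < 0.05)
```

With groups as small as in this toy example the normal approximation is rough, so the p-value should be read as illustrative only.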

Supporting information

S1 Table. The 79 N-glycopeptides identified in this study.

(XLSX)

S2 Table. The 38 N-glycopeptides with significantly different abundances when samples were compared according to community (Rochester vs. OOM).

(XLSX)

S3 Table. The N-glycopeptides identified from alpha-S1-casein, IgA, pIgR, and lactotransferrin with significantly different abundances when samples were compared according to community (Rochester vs. OOM).

(XLSX)

S4 Table. The eight N-glycopeptides with significantly different abundances when samples were compared between children who developed atopic disease and those that did not.

(XLSX)

S5 Table. The 18 N-glycopeptides with significantly different abundances when samples were compared between mothers with or without atopic disease.

(XLSX)

S6 Table. The milk samples analyzed in this study together with details regarding the mother’s community and the presence or absence of atopic disease in both mothers and children.

(XLSX)

Acknowledgments

The authors would like to thank Mary Ann Martin and the late Joyce Wade, CNM for their guidance and assistance in recruitment efforts within the Old Order Mennonite Community and the participating families without whom this research would not be possible.

Data Availability

The data underlying this article are available at the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository and can be accessed with the dataset identifier PXD026644.

Funding Statement

This work was supported by the National Center for Advancing Translational Sciences [Grant Number R21 TR002516 to KJS and RJL]; Founders’ Distinguished Professorship in Pediatric Allergy [KJ]; and Pilot Awards from University of Rochester Clinical and Translational Science Institute and Environmental Health Sciences Center Pilot Award [to RJL]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

References

  • 1. Pawankar R. Allergic diseases and asthma: a global public health concern and a call to action. World Allergy Organ J. 2014;7(1):12. doi: 10.1186/1939-4551-7-12
  • 2. Martina C, Looney RJ, Marcus C, Allen M, Stahlhut R. Prevalence of allergic disease in Old Order Mennonites in New York. Ann Allergy Asthma Immunol. 2016;117(5):562–3 e1. doi: 10.1016/j.anai.2016.08.023
  • 3. Friedman NJ, Zeiger RS. The role of breast-feeding in the development of allergies and asthma. J Allergy Clin Immunol. 2005;115(6):1238–48. doi: 10.1016/j.jaci.2005.01.069
  • 4. Järvinen KM, Bergmann KE, Bergmann R. 15—Breast—Always Best? In: Wahn U, Sampson HA, editors. Allergy, Immunity and Tolerance in Early Childhood. Amsterdam: Academic Press; 2016. p. 235–60.
  • 5. Elbert NJ, van Meel ER, den Dekker HT, de Jong NW, Nijsten TEC, Jaddoe VWV, et al. Duration and exclusiveness of breastfeeding and risk of childhood atopic diseases. Allergy. 2017;72(12):1936–43. doi: 10.1111/all.13195
  • 6. von Mutius E, Vercelli D. Farm living: effects on childhood asthma and allergy. Nat Rev Immunol. 2010;10(12):861–8. doi: 10.1038/nri2871
  • 7. Riedler J, Braun-Fahrlander C, Eder W, Schreuer M, Waser M, Maisch S, et al. Exposure to farming in early life and development of asthma and allergy: a cross-sectional survey. Lancet. 2001;358(9288):1129–33. doi: 10.1016/S0140-6736(01)06252-3
  • 8. Phillips JT, Stahlhut RW, Looney RJ, Jarvinen KM. Food allergy, breastfeeding, and introduction of complementary foods in the New York Old Order Mennonite Community. Ann Allergy Asthma Immunol. 2020;124(3):292–4 e2. doi: 10.1016/j.anai.2019.12.019
  • 9. Zhu J, Dingess KA. The Functional Power of the Human Milk Proteome. Nutrients. 2019;11(8). doi: 10.3390/nu11081834
  • 10. O’Donnell R, Holland JW, Deeth HC, Alewood P. Milk proteomics. International Dairy Journal. 2004;14(12):1013–23.
  • 11. Mange A, Bellet V, Tuaillon E, Van de Perre P, Solassol J. Comprehensive proteomic analysis of the human milk proteome: contribution of protein fractionation. J Chromatogr B Analyt Technol Biomed Life Sci. 2008;876(2):252–6. doi: 10.1016/j.jchromb.2008.11.003
  • 12. Liao Y, Alvarado R, Phinney B, Lonnerdal B. Proteomic characterization of human milk whey proteins during a twelve-month lactation period. J Proteome Res. 2011;10(4):1746–54. doi: 10.1021/pr101028k
  • 13. Picariello G, Ferranti P, Mamone G, Roepstorff P, Addeo F. Identification of N-linked glycoproteins in human milk by hydrophilic interaction liquid chromatography and mass spectrometry. Proteomics. 2008;8(18):3833–47. doi: 10.1002/pmic.200701057
  • 14. Huang J, Kailemia MJ, Goonatilleke E, Parker EA, Hong Q, Sabia R, et al. Quantitation of human milk proteins and their glycoforms using multiple reaction monitoring (MRM). Anal Bioanal Chem. 2017;409(2):589–606. doi: 10.1007/s00216-016-0029-4
  • 15. Cao X, Song D, Yang M, Yang N, Ye Q, Tao D, et al. Comparative Analysis of Whey N-Glycoproteins in Human Colostrum and Mature Milk Using Quantitative Glycoproteomics. J Agric Food Chem. 2017;65(47):10360–7. doi: 10.1021/acs.jafc.7b04381
  • 16. Cao X, Kang S, Yang M, Li W, Wu S, Han H, et al. Quantitative N-glycoproteomics of milk fat globule membrane in human colostrum and mature milk reveals changes in protein glycosylation during lactation. Food Funct. 2018;9(2):1163–72. doi: 10.1039/c7fo01796k
  • 17. Hettinga KA, Reina FM, Boeren S, Zhang L, Koppelman GH, Postma DS, et al. Difference in the breast milk proteome between allergic and non-allergic mothers. PLoS One. 2015;10(3):e0122234. doi: 10.1371/journal.pone.0122234
  • 18. Varki A, Cummings RD, Aebi M, Packer NH, Seeberger PH, Esko JD, et al. Symbol Nomenclature for Graphical Representations of Glycans. Glycobiology. 2015;25(12):1323–4. doi: 10.1093/glycob/cwv091
  • 19. Niimi K, Usami K, Fujita Y, Abe M, Furukawa M, Suyama Y, et al. Development of immune and microbial environments is independently regulated in the mammary gland. Mucosal Immunol. 2018;11(3):643–53. doi: 10.1038/mi.2017.90
  • 20. Mostov KE. Transepithelial transport of immunoglobulins. Annu Rev Immunol. 1994;12:63–84. doi: 10.1146/annurev.iy.12.040194.000431
  • 21. Steffen U, Koeleman CA, Sokolova MV, Bang H, Kleyer A, Rech J, et al. IgA subclasses have different effector functions associated with distinct glycosylation profiles. Nat Commun. 2020;11(1):120. doi: 10.1038/s41467-019-13992-8
  • 22. Seppo AE, Bu K, Jumabaeva M, Thakar J, Choudhury RA, Yonemitsu C, et al. Infant gut microbiome is enriched with Bifidobacterium longum ssp. infantis in Old Order Mennonites with traditional farming lifestyle. Allergy. 2021. doi: 10.1111/all.14877
  • 23. Joenvaara S, Saraswat M, Kuusela P, Saraswat S, Agarwal R, Kaartinen J, et al. Quantitative N-glycoproteomics reveals altered glycosylation levels of various plasma proteins in bloodstream infected patients. PLoS One. 2018;13(3):e0195006. doi: 10.1371/journal.pone.0195006
  • 24. Joenvaara S, Ritamo I, Peltoniemi H, Renkonen R. N-glycoproteomics—an automated workflow approach. Glycobiology. 2008;18(4):339–49. doi: 10.1093/glycob/cwn013

Decision Letter 0

Frederique Lisacek

8 Nov 2021

PONE-D-21-30987

Quantitative glycoproteomics of human milk and association with atopic disease

PLOS ONE

Dear Dr. Jarvinen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Both reviewers expressed concerns in finding clear explanations on some key points such as quantification and the 2nd review stresses more concerns on precision and rigor lacking in data analysis and interpretation that are crucial steps in the study.

Please submit your revised manuscript by Dec 23 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Frederique Lisacek

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

3. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: No

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Overall, this is a well performed study and is well described. I think that it provides a nice set of information for the field which may lead to future research examining the potential impact of the associations between site-specific glycosylation of milk proteins and allergy.

Abstract

32: N-glycopeptide should be N-glycoprotein—as you are using tryptic peptides to identify the glycoproteins.

34: same comment. Should be glycoprotein, not glycopeptide profiles. Saying glycopeptide implies that you studied the naturally occurring glycopeptides rather than the trypsin-produced peptides.

Intro:

93: you are actually comparing N-glycoprotein profiles (via tryptic N-glycopeptides), not N-glycopeptide profiles—the N-glycopeptides here are created in your study, not a natural phenomenon.

Results

Table 2: max fold change and P-value numbers should have decimal points as periods, not commas. Same for Table 3.

Method

424: Should give time and temperatures for centrifugation steps.

433: iodoacetamide should not be capitalized.

Statistics: It seems that a correction for multiple testing should be applied (each test increases the likelihood of a false positive). I do not see one listed.

Reviewer #2: Data interpretation: The abstract is overstretched. “These four glycopeptides may have a protective effect against the development of atopic disease. Our findings indicate that the differential glycosylation of milk proteins may affect the development of atopic disease, something previously uninvestigated.” As the authors do present a mere association, yet no mechanistic hypothesis, and also no experimental support for a causal relationship whatsoever, these final sentences of the abstract appear poorly supported. Please adjust.

Data interpretation: what are the differences between the milk samples? Please try to distill from the data some major changes: how is site-specific glycosylation different, is e.g. sialylation up or down, are there differences in fucosylation?

To which extent are the differences in glycopeptide signals caused by differences in the expression of proteins, versus differences in site-specific glycosylation? An integrated proteomic and glycoproteomic analysis of the data would be necessary.

Glycopeptide identification: The authors mention that MS/MS assignments were manually checked, but they do not display any of the MS/MS spectra. Please provide assigned MS/MS spectra as supplementary data for all de novo assignments and for all more exotic compositions and proposed structures.

Figure 1: the H3N3S1 glycan structure is not credible, with the sialic acid linked to GlcNAc; please adjust

Figure 2: what is the structural basis for assigning fucoses to the core, to the antennary GlcNAc, or antennary galactose; please provide support for these assignments, same for figure 3.

Quantification: quantifying each charge state separately does not make sense, as the aim is to quantify glycopeptides, not MS signals. Hence, please sum up the different charge states before quantification, or rely on the most intense one.

The quantification needs some clarification: For example, it is unclear how the fold change was calculated, as the tables have a column headed Max fold change. What is a Max fold change ? I searched the manuscript for explanation, but could not find it.

In line with this comment, please provide comprehensive legends for all suppl. Tables to guide the reader in assessing the data.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Decision Letter 1

Frederique Lisacek

28 Feb 2022

PONE-D-21-30987R1

Quantitative glycoproteomics of human milk and association with atopic disease

PLOS ONE

Dear Dr. Jarvinen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. The revised manuscript still does not include enough evidence supporting some claims in the text. This requires according to reviewer #2, the enhancement of supplementary material and the inclusion of further details justifying some structural assignments.

Please submit your revised manuscript by April 15, 2022. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Frederique Lisacek

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: I Don't Know

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: All changes requested have been made accurately. The manuscript is now ready for publication in the journal.

Reviewer #2: The authors have considerably improved the manuscript by toning down on some statements, and including more information in e.g. figure and table legends. A lot of the key data of the paper are, however, not sufficiently accessible, and I therefore propose a major revision of the manuscript, to make some of the conclusions more substantiated, make the paper more comprehensible, and ultimately increase its impact.

1. Display of MS/MS data. The authors indicate that MS/MS data are made online available. I think this is not sufficient. For the most "exotic" assignments the authors should provide fully assigned spectra to support their conclusions. I would e.g. love to see the assigned MS/MS spectrum of S2H3N10F5 now indicated in Figure 2. Also for the small number of additional, de novo assigned glycopeptides, a display of the assigned spectra may help to support the claims.

2. Another aspect of concern is the use of "Max fold change". I think one can safely agree that "Max fold change" is not a conventional way of assessing differences between groups, and is not the most meaningful parameter. Obviously, I am not interested in seeing the maximum difference between groups, but the differences of medians or means in these cross-sectional comparisons. Displaying such averages and ranges (e.g. interquartile ranges, or standard deviations) would make the paper biologically way more meaningful.

3. As a reader I would need more relevant information in the supplemental tables: what is the score column? Were the p-values corrected? If so, in which manner, with which factor? Which significance threshold was applied?

4. For some glycopeptides the antennary fucoses were assigned to galactoses, for other to N-acetylglucosamine. I would love to see the data that support this assignment, in the form of some exemplary MS/MS spectra. I would be particularly interested to know whether the authors saw diagnostic B ions such as fucose-hexose, that would substantiate such a claim.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 May 13;17(5):e0267967. doi: 10.1371/journal.pone.0267967.r004

Author response to Decision Letter 1


30 Mar 2022

Replies to reviewer comments

We thank the reviewers for their suggestions. We have addressed all the comments as follows.

Reviewer #1: All changes requested have been made accurately. The manuscript is now ready for publication in the journal.

Reviewer #2: The authors have considerably improved the manuscript by toning down on some statements, and including more information in e.g. figure and table legends. A lot of the key data of the paper are, however, not sufficiently accessible, and I therefore propose a major revision of the manuscript, to make some of the conclusions more substantiated, make the paper more comprehensible, and ultimately increase its impact.

1. Display of MS/MS data. The authors indicate that MS/MS data are made online available. I think this is not sufficient. For the most "exotic" assignments the authors should provide fully assigned spectra to support their conclusions. I would e.g. love to see the assigned MS/MS spectrum of S2H3N10F5 now indicated in Figure 2. Also for the small number of additional, de novo assigned glycopeptides, a display of the assigned spectra may help to support the claims.

We have now provided the PRIDE database ID for each glycan composition in the respective tables to make it easier to access the mass spectrometric data. The spectra and fragments were also previously accessible in the database. For clarity and simplicity, we have removed the de novo compositions from this study.

2. Another aspect of concern is the use of "Max fold change". I think one can safely agree that "Max fold change" is not a conventional way of assessing differences between groups, and is not the most meaningful parameter. Obviously, I am not interested in seeing the maximum difference between groups, but the differences of medians or means in these cross-sectional comparisons. Displaying such averages and ranges (e.g. interquartile ranges, or standard deviations) would make the paper biologically way more meaningful.

We have now removed the maximum fold changes and replaced them with fold changes calculated using the difference of the mean abundance of each glycopeptide between the groups compared.
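As an illustration of a mean-based comparison of this kind, the following is a minimal sketch, assuming the fold change is taken as the ratio of the group mean abundances and that the standard deviation is reported alongside each mean. The function names and the abundance values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def summarize(abundances):
    """Return (mean, sample standard deviation) for one group of abundances."""
    return mean(abundances), stdev(abundances)


def mean_fold_change(group_a, group_b):
    """Fold change of group_a relative to group_b, based on group means.

    Assumption: fold change = mean(group_a) / mean(group_b).
    """
    mean_b = mean(group_b)
    if mean_b == 0:
        raise ValueError("mean of the reference group is zero")
    return mean(group_a) / mean_b


if __name__ == "__main__":
    # Hypothetical abundances of one glycopeptide in two groups of samples.
    rochester = [2.0, 4.0, 3.0]
    oom = [1.0, 2.0, 1.5]
    print("Rochester mean, SD:", summarize(rochester))
    print("OOM mean, SD:", summarize(oom))
    print("Fold change (Rochester / OOM):", mean_fold_change(rochester, oom))
```

Reporting the mean together with a spread measure, as the reviewer requested, lets a reader judge whether a given fold change is large relative to the variation within each group.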

3. As a reader I would need more relevant information in the supplemental tables: what is the score column? Were the p-values corrected? If so, in which manner, with which factor? Which significance threshold was applied?

The glycopeptide scores in the supplemental tables are calculated as a negative logarithm of the probability that a random set of fragments would have as many or more shared peaks with the measured spectrum as the ranked glycopeptide. The probability that the random spectra have more or equal shared peaks than the glycopeptide spectrum is calculated using binomial distribution. The glycopeptide scores therefore provide an indication of the quality of the identification. We have now referred to the study explaining this in detail in the Material and methods section.
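The scoring idea described above can be sketched as follows. This is only an illustration under stated assumptions, not the software's actual implementation: the logarithm is assumed to be base 10, and `n_fragments`, `n_shared`, and `p_match` (the chance that a single random fragment matches a measured peak) are hypothetical parameter names introduced here.

```python
import math


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed directly from the pmf."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def glycopeptide_score(n_fragments, n_shared, p_match):
    """Negative log10 of the probability that a random fragment set shares
    at least n_shared peaks with the measured spectrum.

    Assumptions (not from the source): base-10 logarithm, and each of the
    n_fragments random fragments matches independently with probability
    p_match.
    """
    tail = binomial_tail(n_fragments, n_shared, p_match)
    if tail <= 0.0:
        return math.inf  # probability underflowed to zero
    return -math.log10(tail)


if __name__ == "__main__":
    # Hypothetical numbers: 20 candidate fragments, 10% random match chance.
    for shared in (2, 5, 10):
        print(shared, "shared peaks ->", glycopeptide_score(20, shared, 0.1))
```

Under this formulation, a higher score means that the observed number of shared peaks is less likely to have arisen by chance.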

The p-values provided in the tables in the manuscript are uncorrected, with p-values of <0.05 being considered significant. We also used the Benjamini-Hochberg procedure to control the false-discovery rate and indicate in the supplementary tables whether the adjusted p-values were significant.
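For readers unfamiliar with the Benjamini-Hochberg procedure, a minimal sketch is given below. It implements the standard step-up adjustment; the function name, the false-discovery-rate level `alpha=0.05`, and the example p-values are assumptions made for illustration and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (adjusted p-values, significance flags) in the input order.

    Standard Benjamini-Hochberg step-up adjustment:
    adjusted p at rank r (1-based, ascending) = min over ranks >= r of p * m / rank.
    alpha is an assumed FDR level used only to set the flags.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    significant = [q <= alpha for q in adjusted]
    return adjusted, significant


if __name__ == "__main__":
    # Hypothetical uncorrected p-values for four glycopeptides.
    raw = [0.01, 0.04, 0.03, 0.5]
    adj, flags = benjamini_hochberg(raw)
    for p, q, sig in zip(raw, adj, flags):
        print(f"p={p:.3f}  adjusted={q:.4f}  significant={sig}")
```

In this hypothetical example, three of the uncorrected p-values fall below 0.05 but only one remains significant after adjustment, which is why the two are reported separately.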

4. For some glycopeptides the antennary fucoses were assigned to galactoses, for other to N-acetylglucosamine. I would love to see the data that support this assignment, in the form of some exemplary MS/MS spectra. I would be particularly interested to know whether the authors saw diagnostic B ions such as fucose-hexose, that would substantiate such a claim.

The MS/MS spectra of all fucosylated glycopeptides were manually examined in order to see if, in the case of fucose residues assigned to the core, we could identify primarily and secondarily peptide-HexNAc-fucose ions so that monosaccharides instead of HexNAc residues could be assigned to the N-core, as only one fucose can be assigned to the N-core. In the case of fucose residues assigned to branches, we looked for fucose-hexose-HexNAc (FHN) ions.

In cases where fucose residues were only assigned to the core, no FHN ions were present in any of the spectra. In cases where fucose residues were assigned to branches, the software identified diagnostic FHN ions in 28 cases but missed 3 cases. In all cases, the final glycan composition used is the one with the best fit as calculated by fitting the total mass of the glycan composition, all the available intact peptide-glycan fragments, glycan fragments, and peptide-partial glycan fragments to the peptide whose sequence was assigned to that glycan. However, we would like to note that, wherever glycan structures are mentioned, we have only discussed proposed structures.

Attachment

Submitted filename: Response to reviewers.docx

Decision Letter 2

Frederique Lisacek

20 Apr 2022

Quantitative glycoproteomics of human milk and association with atopic disease

PONE-D-21-30987R2

Dear Dr. Jarvinen,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Frederique Lisacek

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Acceptance letter

Frederique Lisacek

27 Apr 2022

PONE-D-21-30987R2

Quantitative glycoproteomics of human milk and association with atopic disease

Dear Dr. Jarvinen:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Frederique Lisacek

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. The 79 N-glycopeptides identified in this study.

    (XLSX)

    S2 Table. The 38 N-glycopeptides with significantly different abundances when samples were compared according to community (Rochester vs. OOM).

    (XLSX)

    S3 Table. The N-glycopeptides identified from alpha-S1-casein, IgA, pIgR, and lactotransferrin with significantly different abundances when samples were compared according to community (Rochester vs. OOM).

    (XLSX)

    S4 Table. The eight N-glycopeptides with significantly different abundances when samples were compared between children who developed atopic disease and those that did not.

    (XLSX)

    S5 Table. The 18 N-glycopeptides with significantly different abundances when samples were compared between mothers with or without atopic disease.

    (XLSX)

    S6 Table. The milk samples analyzed in this study together with details regarding the mother’s community and the presence or absence of atopic disease in both mothers and children.

    (XLSX)

    Attachment

    Submitted filename: Response to reviewers.docx

    Data Availability Statement

    The data underlying this article are available at the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository and can be accessed with the dataset identifier PXD026644.


    Articles from PLoS ONE are provided here courtesy of PLOS