Prediction and systematic analysis of pollen allergens in model plants suggest that pollen allergens evolved by gene duplication and then functional specification.
Abstract
Pollen allergies have long been a major pandemic health problem for humans. However, the evolutionary events and biological function of pollen allergens in plants remain largely unknown. Here, we report the genome-wide prediction of pollen allergens and their biological function in the dicotyledonous model plant Arabidopsis (Arabidopsis thaliana) and the monocotyledonous model plant rice (Oryza sativa). In total, 145 and 107 pollen allergens were predicted from rice and Arabidopsis, respectively. These pollen allergens are putatively involved in stress responses and metabolic processes such as cell wall metabolism during pollen development. Interestingly, these putative pollen allergen genes were derived from large gene families and became diversified during evolution. Sequence analysis across 25 plant species from green alga to angiosperms suggests that about 40% of putative pollen allergenic proteins existed in both lower and higher plants, while other allergens emerged during evolution. Although a high proportion of gene duplication has been observed among allergen-coding genes, our data show that these genes might have undergone purifying selection during evolution. We also observed that epitopes of an allergen might have a biological function, as revealed by comprehensive analysis of two known allergens, expansin and profilin. This implies a crucial role of conserved amino acid residues in both in planta biological function and allergenicity. Finally, a model explaining how pollen allergens were generated and maintained in plants is proposed. This study provides insight into the phylogenetic and evolutionary scenario of pollen allergens that will be helpful to future characterization and epitope screening of pollen allergens.
During the past four decades, allergic diseases have become a pandemic health problem. In general, pollen allergens are considered a major risk factor for both seasonal allergic rhinitis and asthma, and studies have shown that more than 50% of patients with perennial allergic rhinitis are sensitized to pollen allergens. The sensitization rate to pollen is up to 30%, and the number of people affected by pollen allergy is increasing worldwide (D’Amato et al., 2007; Pawankar et al., 2013). Unfortunately, pollen allergens are difficult to avoid because of the extremely small size and high prevalence of pollen, and this may contribute to pollen-food and pollen-fruit syndromes by cross-reactivity (Vieths et al., 2002).
Pollen from trees, grasses, and weeds has been found to elicit allergic reactions in atopic individuals (Emberlin, 2009). To date, 11 groups of grass pollen allergens with the ability to elicit a specific IgE response in atopic individuals have been identified (Hrabina et al., 2008), and other studies have focused mainly on pollen allergens from weeds and trees (Gadermaier et al., 2004; Mothes and Valenta, 2004). Those studies also suggested that these pollen allergens belong to only a few protein families, such as expansins, profilins, and calcium-binding proteins. Profilins are conserved in plants and act as pan-allergens capable of inducing allergic reactions in various species (Valenta et al., 1992; Radauer and Breiteneder, 2006). Biologically, many pollen allergenic proteins are thought to play important physiological roles in pollen, especially in the pollination process (Songnuan, 2013).
Pollen is the microgametophyte of seed plants that produces the male gametes (sperm cells) for subsequent sexual reproduction. The pollen protoplasm is surrounded by a specialized cell wall, the pollen wall, in which the inner pollen wall (also called the intine) is typically a thin multilayer composed of cellulose and pectin. In contrast, the exine refers to the very resistant outer wall that provides robust protection of the pollen grain from disintegration (Shi et al., 2015). Allergenic proteins are usually located within the pollen protoplast and readily released during the rehydration process (Grote, 1999). For example, birch (Betula spp.) pollen allergens Bet v 1 and Bet v 2 (profilin) are located within the pollen cytoplasm in the anhydrous state, in close proximity to ribosome-rich areas. Upon rehydration, birch pollen allergens are released within minutes from apertures and subsequently found on the entire pollen surface (Grote et al., 1993).
Over the past few decades, increasing information about allergens, together with the advancement of bioinformatics tools, has enabled scientists to predict and compare allergens from different sources (FAO/WHO, 2003; Stadler and Stadler, 2003; Saha and Raghava, 2006; Soeria-Atmadja et al., 2006; Wang et al., 2013c). These advances provided the prerequisites for a comparative analysis and a molecular evolution analysis of pollen allergens. Radauer and Breiteneder (2007) first introduced an evolutionary perspective on the origin of plant allergens and proposed two scenarios for allergen evolution. One was that allergenicity could be an intrinsic property of the ancestral members of certain protein families that persists in present-day allergens, and the other was that allergenicity emerged randomly in certain proteins and was inherited by their descendants. Recently, the evolution of major allergen gene families in peanut (Arachis hypogaea) was analyzed and revealed lineage-specific expansion and loss of allergenic genes (Ratnaparkhe et al., 2014). However, little information on the origin and evolution of pollen allergens has been reported.
In this study, we performed genome-wide analysis of potential pollen allergens in two well-studied model plants, the dicot Arabidopsis (Arabidopsis thaliana) and the monocot rice (Oryza sativa ssp. japonica), as well as their homologs in 25 species ranging from basal green alga to angiosperms. While some pollen allergens seemed to be derived from the duplication and diversification of large gene families from lower to higher plants, other allergens seemed to be recently evolved. Importantly, these genes seemed to have undergone purifying selection during evolution, implying that allergenic motifs are associated with the biological function of the allergens. A model is also proposed to explain how plants produced and maintained pollen allergens. This phylogenetic and evolutionary insight into pollen allergens will be useful in future characterization, epitope screening, and medical prevention of pollen allergens.
RESULTS
Prediction and Classification of Pollen Allergens in Arabidopsis and Rice
To identify putative pollen allergens from Arabidopsis and rice, we analyzed 186 and 261 candidate allergenic proteins by comparing the proteomic data of mature pollen from rice and Arabidopsis, respectively (Holmes-Davis et al., 2005; Noir et al., 2005; Dai et al., 2006; Sheoran et al., 2006). Next, using the combination of two methods for allergen prediction (PREAL [Wang et al., 2013c] and a sequence-based approach [FAO/WHO, 2003]), a total of 20 and 31 candidate proteins were identified as allergen proteins from rice and Arabidopsis, respectively. Furthermore, by analyzing transcriptomic data from mature pollen, 140 rice proteins and 94 Arabidopsis proteins were identified as putative allergens (Qin et al., 2009; Wei et al., 2010; Fig. 1, A and B). Together, we obtained 145 and 107 putative pollen allergens from rice and Arabidopsis, respectively (Fig. 1, A and B; Table I). Among the 145 rice candidates, five proteins were present only in the proteomic data, 15 in both the proteomic and transcriptomic data, and the remaining 125 only in the transcriptomic data. Similarly, of the 107 putative pollen allergens in Arabidopsis, 13 proteins were identified only from the proteomic data, 18 from both the proteomic and transcriptomic data, and the remaining 76 only from the transcriptomic data. The observation that most putative allergens were predicted from transcriptomic data sets is likely explained by the relatively low sensitivity of proteomic analysis.
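The totals above follow from simple inclusion-exclusion over the two evidence sources. The short Python sketch below only rechecks that arithmetic using the counts stated in the text; the dictionary layout and the function name `summarize` are illustrative choices, not part of the original analysis.

```python
# Counts reported in the text for each evidence category.
counts = {
    "rice": {"proteome_only": 5, "both": 15, "transcriptome_only": 125},
    "Arabidopsis": {"proteome_only": 13, "both": 18, "transcriptome_only": 76},
}


def summarize(c):
    """Return (proteome total, transcriptome total, union) for one species."""
    proteome = c["proteome_only"] + c["both"]
    transcriptome = c["transcriptome_only"] + c["both"]
    # Inclusion-exclusion: candidates supported by both sources count once.
    union = proteome + transcriptome - c["both"]
    return proteome, transcriptome, union


summary = {species: summarize(c) for species, c in counts.items()}
for species, (p, t, u) in summary.items():
    print(f"{species}: proteome={p}, transcriptome={t}, union={u}")
```

For rice this gives 20 proteome-supported and 140 transcriptome-supported candidates with a union of 145; for Arabidopsis, 31 and 94 with a union of 107, in agreement with the reported totals.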
Figure 1.
Genome-wide identification and expression pattern analysis of pollen allergen genes in rice and Arabidopsis. A and B, Prediction of potential pollen allergens from both proteome and transcriptome data. Totals of 145 and 107 putative pollen allergens were predicted in rice and Arabidopsis, respectively. Among these, 15 rice and 16 Arabidopsis potential pollen allergens already described are shown. C, Expression patterns of 143 putative pollen allergens (two putative allergens in rice, LOC_Os06g45180 and LOC_Os03g01630, have no matched Affymetrix probe identifier) in 18 tissues/developmental stages of rice (a, anther An1; b, anther Mei1; c, anther M1; d, anther M2; e, anther M3; f, anther P1; g, anther P2; h, anther P3; i, inflorescence P1; j, inflorescence P2; k, inflorescence P3; l, inflorescence P4; m, inflorescence P5; n, inflorescence P6; o, seed; p, root; q, shoot; and r, mature leaf). Red labels represent putative allergens expressed specifically in pollen, while green cluster labels represent ubiquitously expressed putative allergens. D, Expression patterns of 107 putative pollen allergens in 13 tissues/developmental stages of Arabidopsis (a′, uninucleate microspore; b′, bicellular pollen; c′, tricellular pollen; d′, mature pollen; e′ to h′, flower stages 9, 10/11, 12, and 15; i′, flower; j′, seed; k′, root; l′, vegetative shoot apex; and m′, leaf). Red labels represent putative allergens expressed specifically in pollen, while green cluster labels represent ubiquitously expressed putative allergens.
Table I. Gene information and family classification of putative pollen allergens.
| Gene Identifier | Gene Family | Interpro | Specific Expression | Transcriptome | Proteome |
|---|---|---|---|---|---|
| Arabidopsis | |||||
| AT1G23800 | Aldehyde dehydrogenase | IPR015590 | | T | |
| AT4G25780 | Allergen V5/Tpx-1 related | IPR001283 | S | T | |
| AT3G09590 | Allergen V5/Tpx-1 related | IPR001283 | | T | |
| AT1G01310 | Allergen V5/Tpx-1 related | IPR001283 | S | T | |
| AT1G66400 | EF hand | IPR002048 | | | P |
| AT5G17480 | EF hand | IPR002048 | S | T | |
| AT3G03430 | EF hand | IPR002048 | S | T | |
| AT4G03290 | EF hand | IPR002048 | S | T | |
| AT1G77840 | eIF4-γ/eIF5/eIF2-ε | IPR003307 | | T | |
| AT2G36530 | Enolase | IPR000941 | | | P |
| AT5G57320 | Gelsolin | IPR007122 | S | T | |
| AT2G41740 | Gelsolin | IPR007122 | | | P |
| AT4G27120 | Gene family candidate, 0016872 | / | | T | |
| AT2G05620 | Gene family candidate, 0020947 | / | | T | |
| AT5G18310 | Gene family candidate, 0025808 | / | | T | |
| AT3G14040 | Glycoside hydrolase, family 28 | IPR000743 | S | | P |
| AT3G07850 | Glycoside hydrolase, family 28 | IPR000743 | S | | P |
| AT5G48140 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT1G60590 | Glycoside hydrolase, family 28 | IPR000743 | | T | |
| AT1G02790 | Glycoside hydrolase, family 28 | IPR000743 | S | | P |
| AT1G48100 | Glycoside hydrolase, family 28 | IPR000743 | | T | |
| AT3G07820 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT3G07840 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT3G07830 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT1G55120 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT1G62660 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT3G52600 | Glycoside hydrolase, family 32 | IPR001362 | S | T | |
| AT2G36190 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT1G12240 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT1G09080 | Heat shock protein70 | IPR013126 | S | | P |
| AT3G09440 | Heat shock protein70 | IPR013126 | | | P |
| AT5G02490 | Heat shock protein70 | IPR013126 | | | P |
| AT5G09590 | Heat shock protein70 | IPR013126 | | | P |
| AT5G42020 | Heat shock protein70 | IPR013126 | | | P |
| AT5G28540 | Heat shock protein70 | IPR013126 | | | P |
| AT4G37910 | Heat shock protein70 | IPR013126 | | T | |
| AT3G12580 | Heat shock protein70 | IPR013126 | | T | |
| AT1G11660 | Heat shock protein70 | IPR013126 | | | P |
| AT5G02500 | Heat shock protein70 | IPR013126 | | | P |
| AT5G03030 | Heat shock protein DnaJ, N terminal | IPR001623 | | T | |
| AT4G24190 | Heat shock protein90 | IPR001404 | | T | |
| AT5G56000 | Heat shock protein90 | IPR001404 | | T | |
| AT5G03380 | Heavy metal transport/detoxification protein | IPR006121 | | T | |
| AT3G15020 | Malate dehydrogenase, type 1 | IPR010097 | | | P |
| AT3G47520 | Malate dehydrogenase, type 1 | IPR010097 | | | P |
| AT1G53240 | Malate dehydrogenase, type 1 | IPR010097 | | | P |
| AT3G10920 | Manganese and iron superoxide dismutase | IPR001189 | | | P |
| AT4G08580 | Microfibrillar-associated1, C terminal | IPR009730 | | T | |
| AT3G63140 | NAD-dependent epimerase/dehydratase | IPR001509 | | T | |
| AT3G04500 | Nucleotide-binding, α-β plait | IPR012677 | | T | |
| AT1G14420 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT5G15110 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT3G24670 | Pectate lyase/Amb allergen | IPR002022 | | T | |
| AT3G07010 | Pectate lyase/Amb allergen | IPR002022 | | T | |
| AT3G01270 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT2G02720 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT1G69940 | Pectinesterase, catalytic | IPR000070 | S | | P |
| AT5G07430 | Pectinesterase, catalytic | IPR000070 | S | T | |
| AT5G07420 | Pectinesterase, catalytic | IPR000070 | S | T | |
| AT3G06830 | Pectinesterase, catalytic | IPR000070 | S | T | |
| AT5G43060 | Peptidase C1A, papain | IPR013128 | | T | |
| AT1G09850 | Peptidase C1A, papain | IPR013128 | | T | |
| AT4G39090 | Peptidase C1A, papain | IPR013128 | | T | |
| AT1G47128 | Peptidase C1A, papain | IPR013128 | | T | |
| AT3G63470 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT3G52000 | Peptidase S10, Ser carboxypeptidase | IPR001563 | S | T | |
| AT2G05850 | Peptidase S10, Ser carboxypeptidase | IPR001563 | S | T | |
| AT2G12480 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT3G45010 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT3G10410 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT4G00230 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| AT3G14067 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| AT4G26330 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| AT4G34870 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | T | |
| AT2G16600 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | | P |
| AT4G38740 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | | P |
| AT2G21130 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | | P |
| AT3G60570 | Pollen allergen, N terminal | IPR007117 | S | T | |
| AT2G39700 | Pollen allergen, N terminal | IPR007117 | | T | |
| AT1G29140 | Pollen Ole e 1 allergen and extensin | IPR006041 | S | T | |
| AT4G08685 | Pollen Ole e 1 allergen and extensin | IPR006041 | | T | |
| AT2G19760 | Profilin, plant | IPR005455 | | | P |
| AT4G29340 | Profilin, plant | IPR005455 | S | | P |
| AT2G19770 | Profilin, plant | IPR005455 | S | | P |
| AT3G44590 | Ribosomal protein 60S | IPR001813 | | T | |
| AT1G01100 | Ribosomal protein 60S | IPR001813 | | T | |
| AT4G00810 | Ribosomal protein 60S | IPR001813 | | T | |
| AT2G27710 | Ribosomal protein 60S | IPR001813 | | | P |
| AT1G74000 | Strictosidine synthase | IPR004141 | S | T | |
| AT2G28190 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | |
| AT1G08830 | Superoxide dismutase, copper/zinc binding | IPR001424 | | | P |
| AT1G75040 | Thaumatin, pathogenesis related | IPR001938 | | T | |
| AT1G75050 | Thaumatin, pathogenesis related | IPR001938 | | T | |
| AT1G45145 | Thioredoxin fold | IPR012335 | | T | |
| AT3G08710 | Thioredoxin fold | IPR012335 | | T | |
| AT5G42980 | Thioredoxin fold | IPR012335 | | T | |
| AT1G21750 | Thioredoxin like | IPR017936 | | | P |
| AT2G47470 | Thioredoxin like | IPR017936 | | | P |
| AT1G77510 | Thioredoxin like | IPR017936 | | | P |
| AT5G19510 | Translation elongation factor EF1B, β- and δ-chains, guanine nucleotide exchange | IPR014038 | | | P |
| AT1G68185 | Ubiquitin | IPR000626 | | T | |
| AT1G80040 | Ubiquitin system component Cue | IPR003892 | | T | |
| AT2G39900 | Zinc finger, LIM type | IPR001781 | | T | |
| AT3G55770 | Zinc finger, LIM type | IPR001781 | | T | |
| AT5G64920 | Zinc finger, RING type | IPR001841 | | T | |
| AT3G54680 | / | / | | T | |
| AT1G30750 | / | / | | T | |
| Rice | |||||
| LOC_Os03g18850 | Bet v I allergen | IPR000916 | | T | |
| LOC_Os12g36850 | Bet v I allergen | IPR000916 | | T | |
| LOC_Os07g10570 | Bifunctional inhibitor/plant lipid transfer protein/seed storage | IPR016140 | | T | |
| LOC_Os01g55690 | Cupin1 | IPR006045 | S | T | |
| LOC_Os10g26060 | Cupin1 | IPR006045 | | T | |
| LOC_Os06g36010 | Cupredoxin | IPR008972 | S | T | |
| LOC_Os01g08410 | Cyclin like | IPR005814 | | T | |
| LOC_Os05g27730 | DNA-binding WRKY | IPR003657 | | T | |
| LOC_Os08g44660 | EF hand | IPR002048 | S | T | |
| LOC_Os09g24580 | EF hand | IPR002048 | | T | |
| LOC_Os06g48350 | eIF4-γ/eIF5/eIF2-ε | IPR003307 | | T | |
| LOC_Os09g15770 | eIF4-γ/eIF5/eIF2-ε | IPR003307 | | T | |
| LOC_Os10g08550 | Enolase | IPR000941 | | T | P |
| LOC_Os03g14450 | Enolase | IPR000941 | | T | |
| LOC_Os09g20820 | Enolase | IPR000941 | | T | |
| LOC_Os04g25150 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os04g25160 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | P |
| LOC_Os04g25190 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os06g45200 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os06g44470 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | P |
| LOC_Os06g45160 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os06g45180 | Expansin, cellulose-binding-like domain | IPR007117 | | T | P |
| LOC_Os10g40090 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | |
| LOC_Os03g01610 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | P |
| LOC_Os12g36040 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | |
| LOC_Os03g01640 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | P |
| LOC_Os01g60770 | Expansin/pollen allergen, DPBB domain | IPR007112 | | T | |
| LOC_Os02g51040 | Expansin/pollen allergen, DPBB domain | IPR007112 | | T | |
| LOC_Os03g01630 | Expansin/pollen allergen, DPBB domain | IPR007112 | | T | P |
| LOC_Os01g47780 | FAS1 domain | IPR000782 | | T | |
| LOC_Os01g57570 | Flavodoxin/nitric oxide synthase | IPR008254 | | T | |
| LOC_Os05g42190 | Flavodoxin/nitric oxide synthase | IPR008254 | | T | |
| LOC_Os04g51440 | Gelsolin | IPR007122 | S | T | |
| LOC_Os03g24220 | Gelsolin | IPR007122 | | T | |
| LOC_Os06g44890 | Gelsolin | IPR007122 | | T | |
| LOC_Os01g46080 | Gene family candidate, 0005331 | IPR001087 | | T | |
| LOC_Os05g32190 | Gene family candidate, 0014938 | / | | T | |
| LOC_Os09g12620 | Gene family candidate, 0015182 | / | S | T | |
| LOC_Os06g21120 | Gene family candidate, 0015338 | / | | T | |
| LOC_Os08g09180 | Gene family candidate, 0018686 | / | | T | |
| LOC_Os03g17140 | Gene family candidate, 0020281 | / | | T | |
| LOC_Os02g38490 | Gene family candidate, 0021210 | / | | T | |
| LOC_Os08g13980 | Glycoside hydrolase, family 16 | IPR013320 | | T | |
| LOC_Os06g39060 | Glycoside hydrolase, family 17 | IPR000490 | S | T | |
| LOC_Os05g41610 | Glycoside hydrolase, family 17 | IPR000490 | | T | |
| LOC_Os01g71810 | Glycoside hydrolase, family 17 | IPR000490 | | T | |
| LOC_Os09g36280 | Glycoside hydrolase, family 17 | IPR000490 | | T | P |
| LOC_Os04g41680 | Glycoside hydrolase, family 19, catalytic | IPR000726 | | T | |
| LOC_Os02g39330 | Glycoside hydrolase, family 19, catalytic | IPR000726 | | T | |
| LOC_Os08g41100 | Glycoside hydrolase, family 19, catalytic | IPR000726 | | T | |
| LOC_Os02g10300 | Glycoside hydrolase, family 28 | IPR012334 | S | T | P |
| LOC_Os01g33300 | Glycoside hydrolase, family 28 | IPR012334 | S | T | |
| LOC_Os06g35320 | Glycoside hydrolase, family 28 | IPR012334 | S | T | |
| LOC_Os08g23790 | Glycoside hydrolase, family 28 | IPR012334 | S | T | P |
| LOC_Os06g40890 | Glycoside hydrolase, family 28 | IPR012334 | | T | |
| LOC_Os05g46510 | Glycoside hydrolase, family 28 | IPR012334 | | T | |
| LOC_Os03g61800 | Glycoside hydrolase, family 28 | IPR012334 | | T | |
| LOC_Os02g01590 | Glycoside hydrolase, family 32 | IPR013148 | | T | |
| LOC_Os04g45290 | Glycoside hydrolase, family 32 | IPR013148 | | T | |
| LOC_Os11g08440 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os03g16920 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os03g16880 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os11g47760 | Heat shock protein70 | IPR001023 | | | P |
| LOC_Os03g16860 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os02g02410 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os02g53420 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os03g02260 | Heat shock protein70 | IPR001023 | | | P |
| LOC_Os03g60620 | Heat shock protein70 | IPR001023 | | T | P |
| LOC_Os09g31486 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os05g38530 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os04g01740 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os08g39140 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os09g30412 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os09g29840 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os01g51140 | Helix-loop-helix DNA binding | IPR011598 | | T | |
| LOC_Os08g04390 | Helix-loop-helix DNA binding | IPR011598 | | T | |
| LOC_Os10g25420 | Lipase, GDSL | IPR001087 | | T | |
| LOC_Os03g25040 | Lipase, GDSL | IPR001087 | | T | |
| LOC_Os05g49880 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os01g46070 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os08g33720 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os01g61380 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os03g56280 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os05g25850 | Manganese and iron superoxide dismutase | IPR001189 | | T | P |
| LOC_Os06g43710 | Man-6-P receptor, binding | IPR009011 | | T | |
| LOC_Os07g06590 | MD-2-related lipid recognition | IPR003172 | | T | |
| LOC_Os06g27760 | Met sulfoxide reductase B | IPR002579 | | T | |
| LOC_Os09g28060 | MORN motif | IPR003409 | | T | |
| LOC_Os06g05660 | Nucleosome assembly protein | IPR002164 | | T | |
| LOC_Os06g05209 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| LOC_Os02g12300 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| LOC_Os06g38510 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| LOC_Os11g45730 | Pectinesterase, catalytic | IPR012334 | S | T | |
| LOC_Os02g43010 | Peptidase C13, legumain | IPR001096 | | T | |
| LOC_Os04g24600 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os02g27030 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os05g01810 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os01g73980 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os11g14900 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os03g09190 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| LOC_Os04g47150 | Peptidase S8, subtilisin related | IPR015500 | S | T | P |
| LOC_Os02g44590 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| LOC_Os06g49480 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | T | |
| LOC_Os09g39780 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | | P |
| LOC_Os05g01270 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | T | |
| LOC_Os02g02890 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | | P |
| LOC_Os05g40010 | Plant lipid transfer protein | IPR000528 | | T | |
| LOC_Os09g39950 | Pollen Ole e 1 allergen and extensin | IPR006041 | S | | P |
| LOC_Os06g36240 | Pollen Ole e 1 allergen and extensin | IPR006041 | S | T | |
| LOC_Os10g17660 | Profilin, plant | IPR002097 | S | T | |
| LOC_Os06g05880 | Profilin, plant | IPR002097 | | T | |
| LOC_Os02g55710 | Proteasome maturation factor UMP1 | IPR008012 | | T | |
| LOC_Os04g57810 | Protein of unknown function DUF689 | IPR007785 | | T | |
| LOC_Os01g72490 | Protein of unknown function DUF702 | IPR007818 | | T | |
| LOC_Os02g05630 | Protein phosphatase 2C, N terminal | IPR015655 | | T | |
| LOC_Os01g16430 | Proteinase inhibitor I25, cystatin | IPR000010 | | T | |
| LOC_Os01g25540 | Rapid alkalinization factor | IPR008801 | | T | |
| LOC_Os06g48780 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os01g13080 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os05g37330 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os08g02340 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os01g09510 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os02g32760 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os03g22810 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | P |
| LOC_Os07g46990 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | |
| LOC_Os03g11960 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | |
| LOC_Os11g36340 | Targeting for Xklp2 | IPR009675 | | T | |
| LOC_Os01g24090 | Tetratricopeptide region | IPR013026 | | T | |
| LOC_Os04g44830 | Thioredoxin, core | IPR013766 | | T | |
| LOC_Os05g06430 | Thioredoxin like | IPR013766 | | T | |
| LOC_Os09g27830 | Thioredoxin like | IPR013766 | | T | |
| LOC_Os01g23740 | Thioredoxin like | IPR013766 | | T | |
| LOC_Os06g42000 | Thioredoxin-like fold | IPR012336 | | T | |
| LOC_Os10g25290 | Tify | IPR010399 | | T | |
| LOC_Os07g46750 | Translation elongation factor EF1B, β- and δ-chains, guanine nucleotide exchange | IPR014038 | | T | |
| LOC_Os07g42300 | Translation elongation factor EF1B, β- and δ-chains, guanine nucleotide exchange | IPR014038 | | T | P |
| LOC_Os08g37310 | Uncharacterized protein family UPF0029, N terminal | IPR001498 | | T | |
| LOC_Os05g50490 | X8 domain | IPR012946 | | T | |
| LOC_Os09g36820 | Zinc finger, Mcm10/DnaG type | IPR015408 | | T | |
| LOC_Os01g72480 | Zinc finger, RING type | IPR001841 | | T | |
| LOC_Os05g30940 | / | / | S | T | |
| LOC_Os05g29740 | / | / | S | T | |
| LOC_Os09g31080 | / | / | | T | |
| LOC_Os08g25080 | / | / | | T | |
| LOC_Os04g57440 | / | / | | T | |
Genes that belong to the pollen specifically expressed gene cluster are marked S.
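For readers who want to tabulate the evidence codes in Table I, the sketch below parses three rows copied from the table and records which of the markers S (pollen-specific cluster), T (transcriptome), and P (proteome) each gene carries. It reads the markers by letter rather than by column position, so it does not depend on how empty cells are laid out. The function and field names are illustrative assumptions, not part of the original analysis.

```python
from collections import Counter

# Three rows copied from Table I (pipe-delimited text).
ROWS = """\
| AT4G25780 | Allergen V5/Tpx-1 related | IPR001283 | S | T | |
| AT2G36530 | Enolase | IPR000941 | P | ||
| LOC_Os04g25160 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | P |
"""


def parse_row(line):
    """Split one table row and collect its S/T/P evidence markers."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    gene, family, interpro = cells[:3]
    # Identify markers by letter so collapsed empty cells do not matter.
    flags = {c for c in cells[3:] if c in {"S", "T", "P"}}
    return {
        "gene": gene,
        "family": family,
        "interpro": interpro,
        "pollen_specific": "S" in flags,
        "transcriptome": "T" in flags,
        "proteome": "P" in flags,
    }


records = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

# Tally how many of the example genes carry each marker.
evidence = Counter()
for rec in records:
    for key, letter in (("pollen_specific", "S"), ("transcriptome", "T"), ("proteome", "P")):
        if rec[key]:
            evidence[letter] += 1

print(dict(evidence))
```

Applied to the full table, the same tally could be grouped by the `family` field to see which families are supported mainly by proteomic or by transcriptomic evidence.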
The AllFam database of allergen families (Radauer and Breiteneder, 2006) contains over 2,500 protein families present in seed plants, and of these, about 59 plant protein families contain inhalation allergens. In our analysis, we identified 252 putative pollen allergens (145 in rice and 107 in Arabidopsis) that were classified into 81 protein families, including most of the known allergenic pollen protein families present in the AllFam database (Radauer et al., 2008; Table I; Supplemental Fig. S1). Within these 81 families, 10 of the 13 known allergens from Arabidopsis and rice were recovered; the three exceptions were Ara t GLP, Ara t 3, and Ory s 23 (Supplemental Table S1). This recovery rate supports the reliability of our prediction method. The absence of the three known allergens is possibly due to their low expression levels in pollen and the absence of probes in microarrays (Qin et al., 2009; Wei et al., 2010).
Expression Analysis and Functional Prediction of Candidate Pollen Allergens
To understand the biological functions of these identified putative pollen allergens from Arabidopsis and rice, we performed Gene Ontology analysis and observed that these putative pollen allergens have key housekeeping biological functions, such as metabolic and cellular activities, stress response, and cellular component formation (Fig. 2A). For instance, Bet v 1 proteins, which belong to the PR10 family, are associated with stress responses; profilins regulate actin polymerization by sequestering or releasing actin monomers during pollen growth; and polcalcins are involved in calcium signaling to help guide pollen tube growth (Supplemental Table S1). To further characterize the functions of these putative pollen allergens, we performed in silico expression analysis. Although all of these candidates were present in pollen proteomic and transcriptomic data, clustering analysis showed that they displayed distinct temporal expression patterns. In Figure 1, C and D, genes present in the green cluster (60 of 143 genes in rice and 33 of 107 genes in Arabidopsis) displayed ubiquitous expression that was associated mainly with stress responses, oxygen species metabolism, and glycolysis. The genes in the red cluster (31 of 143 genes in rice and 26 of 107 genes in Arabidopsis) exhibited high expression specifically in tricellular and mature pollen, and these genes were largely related to cell wall metabolism and organization (Figs. 1, C and D, and 2B). The main allergens specifically expressed in pollen include polygalacturonases, pectate lyases, and expansins that participate in the metabolism of carbohydrates and pollen tube wall formation during germination (Barral et al., 2005). These analyses imply that pollen-specific allergens are functionally restricted to cell wall metabolic activities in pollen, while the ubiquitously expressed putative allergens are associated mainly with stress responses (Supplemental Fig. S2).
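To put the cluster sizes quoted above on a common scale, the following sketch converts them into percentages of the profiled genes in each species. It uses only the counts given in the text; the variable and function names are illustrative.

```python
# Cluster sizes quoted in the text (genes with expression profiles).
clusters = {
    "rice": {"total": 143, "ubiquitous": 60, "pollen_specific": 31},
    "Arabidopsis": {"total": 107, "ubiquitous": 33, "pollen_specific": 26},
}


def shares(c):
    """Percentage of profiled genes in each named cluster, to one decimal."""
    return {
        name: round(100 * c[name] / c["total"], 1)
        for name in ("ubiquitous", "pollen_specific")
    }


percentages = {species: shares(c) for species, c in clusters.items()}
# Genes that fall in neither of the two named clusters.
unassigned = {
    species: c["total"] - c["ubiquitous"] - c["pollen_specific"]
    for species, c in clusters.items()
}

for species in clusters:
    print(species, percentages[species], "other:", unassigned[species])
```

On these numbers, the ubiquitous cluster covers about 42% of the profiled rice genes and about 31% of the Arabidopsis genes, while the pollen-specific cluster covers roughly 22% and 24%, respectively.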
Figure 2.
Gene Ontology (GO) enrichment analysis of putative pollen allergens in rice and Arabidopsis. A, GO analysis of putative allergen genes in rice and Arabidopsis. B, Significant biological process GO terms of ubiquitously expressed putative pollen allergens and pollen-specific putative allergens. The significance of each GO term was evaluated by –log10 (P value). Enrichment against all genes in the same GO term and percentage of query genes in each GO term are shown in parentheses.
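The caption reports significance as –log10 (P value) together with an enrichment value, but this section does not name the statistical test. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for GO enrichment, not confirmed by the source), the calculation could look as follows. All counts in the example are hypothetical placeholders, not values from the study.

```python
from math import comb, log10


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n carry the GO term."""
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def enrichment(k, M, n, N):
    """Return (fold enrichment, P value, -log10 P) for one GO term."""
    fold = (k / N) / (n / M)
    p = hypergeom_sf(k, M, n, N)
    return fold, p, -log10(p)


# Hypothetical example: 1,000 background genes, 50 annotated with the term,
# 20 query genes, 5 of which carry the term.
fold, p_value, score = enrichment(k=5, M=1000, n=50, N=20)
print(f"fold={fold:.1f}, P={p_value:.2e}, -log10(P)={score:.2f}")
```

A larger –log10 (P value) indicates that the overlap between the query set and the GO term is less likely under random sampling; in practice, P values across many terms would also need multiple-testing correction.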
Phylogenetic Analysis of Putative Pollen Allergens among 25 Plant Species
To understand the evolutionary events that gave rise to pollen allergens in plants, we identified the closest homologs (present in protein families) of these putative pollen allergens from rice and Arabidopsis in 25 sequenced plant species ranging from lower plants (green alga) to higher plants (angiosperms; Fig. 3). During angiosperm evolution, multiple rounds of polyploidy occurred (Bowers et al., 2003; Adams and Wendel, 2005); therefore, we hypothesized that pollen allergens might have expanded via gene duplication. In our analysis, totals of 1,797 and 1,302 close homologs of the rice and Arabidopsis pollen allergens, respectively, were identified from the genomes of the 25 plant species, and in most families, the number of homologs increased from green alga to angiosperms. Notably, some putative allergenic protein families displayed multiple sequences with high similarity in one species. For example, two rice expansins had 12 close sequence homologs in Fragaria vesca with little variability (Musidlowska-Persson et al., 2007).
Figure 3.
Taxonomic distribution of putative pollen allergen homologs. The phylogenetic tree shows homologous genes of putative pollen allergens of each protein family identified in rice, Arabidopsis, and 25 other plant species ranging from green alga to higher plants. The numbers of homologous genes found in other species are shown in the matrix. The numbers of recognized plant allergens in databases are shown under the names for each family.
Forty-eight putative pollen allergenic protein families, such as the HEAT SHOCK PROTEIN70 (Hsp70), profilin, and thioredoxin-like families, seemed to have an ancient origin, as evidenced by the presence of homologs in lower plants. Hsp70 was shown to be expressed in maize (Zea mays) and tomato (Solanum lycopersicum) pollen grains; although it has not been functionally characterized there, it is plausibly associated with protecting cellular structures from stress (Gruehn et al., 2003; Supplemental Table S1). One maize pollen-expressed profilin, designated ZmPRO4, was characterized to have the poly-l-Pro-binding function that is required for the modulation of actin cytoskeletal dynamics in pollen (Gibbon et al., 1998). Thioredoxin-like proteins expressed in Arabidopsis pollen grains have been reported to be required for osmotic stress tolerance and male sporogenesis as well as male-female interaction (Lakhssassi et al., 2012). Given the conservation of these gene families in multiple species, it is likely that these genes share an ancient common ancestor and that their functions have been retained in plants (Fig. 3; Supplemental Table S2).
In contrast, putative allergens in 33 other families only had homologs in either monocots or dicots, suggesting that these genes arose later, in higher plants. Plant lipid transfer proteins are small, abundant lipid-binding proteins that are able to exchange lipids between membranes. The pollen allergenic lipid transfer proteins such as Ara t 3, Zea m 14, and Tri a 14 may function in transferring lipids and fatty acids through cell membranes (Thoma et al., 1993; Arondel et al., 2000; Pastorello et al., 2000; Wang et al., 2005; Sander et al., 2011). One monocot-specific lipid transfer protein, OsC6, expressed mainly in tapetal cells, was shown to bind lipidic molecules and affect the pollen wall and fertility (Zhang et al., 2010). Several members encoding expansins containing the cellulose-binding-like domain (IPR007117) were observed only in monocots (Fig. 3; Supplemental Fig. S3). The expansin family regulates cell wall expansion, and pollen-expressed β-expansins aid in pollen tube growth and penetration (Supplemental Table S2). The cellulose-binding-like expansin homologs seemed to share a recent monocot common ancestor and to have high sequence conservation between species; however, these genes may have further evolved grass-specific functions compared with the DPBB expansins that are found in both monocots and dicots (Fig. 3). Likewise, the pollen-expressed polygalacturonase, one major allergen in some grass and cypress species, only has homologs in higher plants. Allergenic polygalacturonases from the Japanese cypress Chamaecyparis obtusa (Mori et al., 1999) and timothy grass (Phleum pratense; Suck et al., 2000) play roles in pollen maturation and pollen tube growth (Supplemental Table S2). Another allergen found only in higher plants, pollen Ole e 1 allergen (Jimenez-Lopez et al., 2012), accumulates in pollen tube cell walls and may have a role in pollen germination and pollen tube growth (Supplemental Table S2).
Interestingly, Arabidopsis pectinesterase, another pollen allergen (Mahler et al., 2001), only has homologs in dicots, and their sequences differ from those of the rice counterparts. Pectinesterase from olive (Olea europaea) was reported to affect cell wall stability during pollen germination and pollen tube growth through the deesterification of pectin into pectate and methanol (Salamanca et al., 2010; Esteve et al., 2012; Jimenez-Lopez et al., 2012). Altogether, our observations on the putative allergens of these 33 families suggest that they may have evolved in parallel in either monocots or dicots with diversified biological functions.
Evolutionary Events in Generating and Maintaining Pollen Allergens
Gene duplication events that produce functionally redundant genes have been considered a main driver underlying gene evolution (Nei, 1969; Lynch and Conery, 2000; Cui et al., 2015). Therefore, we asked whether sequence variation within these duplicated genes affects the allergenicity of proteins. Pollen allergens seemed to be produced by gene duplication events. The proportion of duplicated genes (including tandem repeat and block repeat) in pollen-expressed genes was about 40% in Arabidopsis and 30% in rice. However, the percentage of duplicated genes in putative pollen allergens increased markedly, 60% in Arabidopsis and 49% in rice (Fig. 4, A and B). In genetics, Ka/Ks represents the ratio of the number of nonsynonymous substitutions per nonsynonymous site to the number of synonymous substitutions per synonymous site, and the value of Ka/Ks can be used as an indicator of selective pressure on a protein-coding gene. A gene with Ka/Ks > 1 is usually regarded as having evolved under positive selection, while Ka/Ks < 1 is usually regarded as an indicator of genes having undergone purifying selection (Hurst, 2002). Although many putative pollen allergen genes seemed to be produced by duplication, the Ka/Ks values of these genes were low, which means a low ratio of nonsynonymous substitutions of these genes, suggesting that these pollen allergenic proteins evolved under purifying selection (Fig. 4, C and D). In rice, the Ka/Ks rate of allergen genes was around 0.25, which is similar to that of Arabidopsis (below 0.2), indicating that pollen allergens generated from duplication events have been maintained by purifying selection.
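The interpretation of Ka/Ks described above can be sketched in a few lines of Python. This is only an illustration of the convention, not the pipeline used in this study; the function names `ka_ks_ratio` and `classify_selection` and the example Ka and Ks values are hypothetical. Treating a ratio of exactly 1 as neutral is the usual convention and is an assumption here, since the text only discusses ratios above or below 1.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Return Ka/Ks; Ks must be positive for the ratio to be defined."""
    if ks <= 0:
        raise ValueError("Ks must be > 0 to compute Ka/Ks")
    return ka / ks


def classify_selection(ratio: float) -> str:
    """Map a Ka/Ks ratio onto its conventional interpretation."""
    if ratio > 1:
        return "positive selection"
    if ratio < 1:
        return "purifying selection"
    # Assumption: a ratio of exactly 1 is read as neutral evolution.
    return "neutral"


# Hypothetical substitution rates, chosen for illustration only.
ratio = ka_ks_ratio(ka=0.05, ks=0.25)
label = classify_selection(ratio)
print(f"Ka/Ks = {ratio:.2f}: {label}")
```

In practice, estimates close to 1 carry uncertainty, so a single gene pair with a ratio slightly above or below 1 should not be overinterpreted.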
Figure 4.
Percentage of gene duplication and Ka/Ks rate of potential pollen allergens. A, Percentage of gene duplication events found in background (pollen-expressed genes), putative pollen allergens, and pollen-specific allergens. Different duplication types are shown as different colors (blue = block, purple = tandem and block, and red = tandem). The number of genes detected and the total number of genes are shown on top of each column. The significance of differences was calculated by a hypergeometric distribution (**, P < 10−4 and ***, P < 10−5). B, Box plot of Ka/Ks rates of putative pollen allergens and pollen-expressed genes in rice and Arabidopsis. Four related species were selected to calculate Ka/Ks rates for rice (Oryza sativa ssp. indica 9311, Oryza brachyantha, Oryza glaberrima, and Peiai 64S [PA64S]) and Arabidopsis (Arabidopsis lyrata, Brassica rapa, Capsella rubella, and Thellungiella parvula).
Profilins Represent the Ancient Allergenic Families
To further investigate the evolution of pollen allergens and the relationship between their allergenicity and biological function, two major allergenic families, profilins and expansins, were analyzed in detail. Profilin is an actin-binding protein involved in the dynamic turnover and restructuring of the actin cytoskeleton. Plant profilins share many of the same biochemical properties and are structurally similar to nonplant profilins (Thorn et al., 1997). Profilin is a common pan-allergen in plants and is present in many plant organs, thereby leading to various routes of exposure depending on the plant species (Valenta et al., 1992). As shown in the phylogenetic tree, profilins in six monocots were present in one clade, and allergenic profilins that have the same route of exposure tend to be in another clade. For example, pollen profilins were seen in the grass family, fruit profilins in family Rosaceae, and seed profilins in family Leguminosae (Fig. 5A). LOC_Os10g17660 and LOC_Os10g17680 are tandem duplicated genes in rice, and both were highly expressed in late anther developmental stages, while tandem duplicated gene pairs (AtProfilin1/AtProfilin5 and AtProfilin2/AtProfilin4) showed totally different expression patterns in Arabidopsis. AtProfilin1 and AtProfilin2 were expressed in many tissues, while AtProfilin4 and AtProfilin5 were expressed specifically and highly in pollen (Fig. 5B). AtProfilin4 and AtProfilin5 redundantly regulate polarized pollen tube growth (Liu et al., 2015). Proteins like AtProfilin4 and AtProfilin5 are therefore more likely to be pollen allergens. The sequence and structure of Ara h 5 in peanut have been studied extensively, and seven surface-exposed epitopes were identified (Radauer et al., 2006; Cabanos et al., 2010) and were mapped in Figure 5C for comparison.
These epitopes included some crucial amino acid residues required for biological function and structural roles in profilin; for example, epitope 1 includes two poly-l-Pro-binding residues (Thorn et al., 1997; Fig. 5C; Supplemental Fig. S5). Most known allergenic profilins, such as Zea m 12, Mal d 4, and Api g 4, displayed almost no variation in epitope sequence, while profilins in rice and lower plants exhibited more variation (Fig. 5C; Supplemental Fig. S4). These results indicate that the allergenicity of the profilin family may have changed during evolution. Furthermore, variations in epitope position caused structural changes in the proteins of Ara h 5 and AtProfilin1 (Fig. 5D). Variations in epitopes also were found among members of the profilin family of Arabidopsis.
Figure 5.
Phylogenetic tree, expression patterns, and sequence alignment of the profilin family. A, Unrooted neighbor-joining tree generated from sequence alignments of profilins in rice and Arabidopsis and major profilin allergens in other species. Ama r, Amaranthus retroflexus; Amb a, Ambrosia artemisiifolia; Art v, Artemisia vulgaris; Bet v, Betula verrucosa; Mer a, Mercurialis annua; Ric c, Ricinus communis; Par j, Parietaria judaica; Pro j, Prosopis juliflora; Sal k, Salsola kali; Ana c, Ananas comosus; Cyn d, Cynodon dactylon; Lil l, Lilium longiflorum; Phl p, Phleum pratense; Tri a, Triticum aestivum; Zea m, Zea mays; Cit s, Citrus sinensis; Fra a, Fragaria ananassa; Mal d, Malus domestica; Pyr c, Pyrus communis; Pru p, Prunus persica; Man i, Mangifera indica; Gly m, Glycine max; Ara h, Arachis hypogaea; Sin a, Sinapis alba; Hev b, Hevea brasiliensis. B, Expression patterns of profilins in rice and Arabidopsis. Developmental stages and tissues are described in Figure 1. C, Partial sequence alignment (amino acids 1–56) of known pollen allergens in the profilin family and profilins in various species. The secondary structure of Ara h 5 is shown in the first line, and seven putative surface-exposed epitopes are marked by black boxes (#1–#7; Radauer et al., 2006; Cabanos et al., 2010). Within these seven surface-exposed epitopes, amino acids in common with Ara h 5 of allergenic profilins are colored in yellow. Amino acids different from Ara h 5 common epitopes are colored in blue. Crucial residues of known biological function and structural role are marked by red stars (Thorn et al., 1997). Ara_h, Arachis hypogaea; Mal_d, Malus domestica; Api_g, Apium graveolens; Zea_m, Zea mays; Os, Oryza sativa; CR, Chlamydomonas reinhardtii; VC, Volvox carteri; MRCC, Micromonas sp. RCC. D, Three-dimensional (3D) models of Ara h 5 (left) and AtProfilin1 (right), with putative surface epitopes of Ara h 5 and the corresponding position of AtProfilin1 shown. Different epitopes (#1–#7) were mapped on the surface in different colors.
Allergenicity Evolved with the Functional Specification of Expansins in Grass
Expansins are proteins that promote cell wall loosening and extension (Cosgrove, 2000). In pollen, expansins may facilitate cell wall deposition in pollen grains and are involved in pollen germination (Choi et al., 2006). Even though expansins have numerous family members present in both dicots and monocots, only members in the EXPB-I (for β-expansin I) clade of β-expansins in grass are allergenic. In grasses, the EXPB-I clade was separated into two groups (conservative EXPB-I and divergent EXPB-I) by the sigma whole-genome duplication, while known allergenic β-expansins gathered in subbranches of divergent EXPB-I (Tang et al., 2010). The divergent EXPB-I might have evolved to act on highly substituted xylans that were the interstitial material of primary walls in grasses (Sampedro et al., 2015). Phylogenetic analysis showed that expansins in rice clustered into two main branches (conservative EXPB-I and divergent EXPB-I), and all expansins in Arabidopsis belonged to the conserved EXPB-I clade (Fig. 6A). Ory s 1 allergens, which include OsEXPB1, OsEXPB10, and OsEXPB13, were highly expressed in late developmental stages of anther (microspore/pollen) and inflorescence development (Xu et al., 1995; Hirano et al., 2013; Fig. 6B).
Figure 6.
Phylogenetic tree, expression patterns, and sequence alignment of the expansin family. A, Unrooted neighbor-joining tree generated from sequence alignments of some known allergenic β-expansins in plants and β-expansins in rice and Arabidopsis. Known allergens are colored red. OsEXPB1, OsEXPB10, and OsEXPB13 are known pollen allergens and are shown as Ory s 1. Rice β-expansins were separated into two clades, the conservative EXPB-I (yellow) and the divergent EXPB-I (red and green). A short-range translocation event separated the divergent EXPB-I into two clades: a pollen-expressed clade and a vegetative-expressed clade. All known allergens clustered together with the pollen-expressed divergent EXPB-I. Cyn d, Cynodon dactylon; Dac g, Dactylis glomerata; Hol l, Holcus lanatus; Lol p, Lolium perenne; Phl p, Phleum pratense; Poa p, Poa pratensis; Zea m, Zea mays. B, Expression patterns of β-expansins in rice and Arabidopsis. Developmental stages and tissues are described in Figure 1. C, Partial sequence alignment (amino acids 1–47) of known pollen allergens in the β-expansin family and β-expansins in rice and Arabidopsis. The secondary structure of Zea m 1 is shown in the first line. The known epitopes are marked by black squares, and functional binding sites are marked by red stars. Zea_m, Zea mays; Cyn_d, Cynodon dactylon; Dac_g, Dactylis glomerata; Os, Oryza sativa; SM, Selaginella moellendorffii; PP, Physcomitrella patens.
Sequence alignment of β-expansins demonstrated that the identified epitopes of allergenic expansins differed from those of their nonallergenic expansin orthologs present in lower plants (Selaginella moellendorffii and Physcomitrella patens), dicots, and monocots (Fig. 6C; Supplemental Fig. S5). These epitopes also included important residues: the epitope SITE-A identified by Esch and Klapper (1989) contained a short binding pocket, and SITE-D identified by Hiller et al. (1997) covered part of the long conserved binding surface with the motif TWYG (Yennawar et al., 2006). Sequence variation among these expansins may lead to diverse functions and allergenicity of each expansin. Rice Ory s 1 is homologous to the maize allergen Zea m 1 and two other pollen allergens, Lol p 1 and Phl p 1 from ryegrass (Lolium perenne) and timothy grass, respectively (Petersen et al., 1995; Cosgrove et al., 1997; Yennawar et al., 2006). Zea m 1 was suggested to be involved in cell wall loosening of the stigma and style, aiding in pollen tube invasion of maternal tissue (Cosgrove et al., 1997). Likewise, Zea m 1 and its isoforms also were shown to have a dose effect in inducing cell wall expansion in wheat (Triticum aestivum) pollen and nonreproductive cells (Li et al., 2003). Furthermore, mutated Zea m 1 isoforms caused delayed pollen growth and the accumulation of large aggregates, possibly as a consequence of aberrant cell wall expansion (Valdivia et al., 2009). Both Zea m 1 and Ory s 1 were present in the divergent EXPB-I group and showed high expression in pollen (Hirano et al., 2013), suggesting that these pollen allergens may have evolved from a common ancestor and have a conserved biological function. In support of this, Zea m 1, Ory s 1, and Phl p 1 isoforms share a conserved functional binding site (Fig. 6; Petersen et al., 1995; Yennawar et al., 2006).
DISCUSSION
Allergy caused by pollen grains is one of the most intractable problems in allergy research. Large numbers of pollen allergens have been characterized, but little is known about their evolution and taxonomic distribution patterns. To address these questions, we performed genome-wide allergen prediction using transcriptomic and proteomic data sets from the model monocot rice and the model dicot Arabidopsis and performed phylogenetic analysis of pollen allergens. The taxonomic distribution of putative pollen allergens was investigated using phylogenetic analysis, which showed distinct distribution patterns for some of these allergens. Both the expression pattern and the taxonomic distribution of these putative pollen allergens in model plants are likely to be useful for predicting potential allergens in other plant species, especially those species without complete genome sequences. The sequence variation of allergen proteins among species, especially between lower and higher plants, indicated that allergenicity might change along with plant evolution.
In many pollen allergens like allergenic expansins and profilins, epitopes usually include important functional amino acid residues. We observed low Ka/Ks values and higher gene duplication ratios in putative pollen allergens, which importantly also indicated a relationship between allergenicity and the evolution of protein functions. Therefore, we suggest that allergenicity might be a by-product of gene duplication and functional specification.
Conserved epitope sequences in allergens have been proposed to result in desensitization in humans after long-term exposure (Radauer et al., 2012). Gene duplication promotes neofunctionalization by variation of protein sequence, thereby creating the opportunity for new allergens to form or for the allergenicity of existing allergens to change. We observed significantly higher gene duplication rates of putative pollen allergens in both rice and Arabidopsis. Allergenicity emerged from gene duplication events in some cases. For example, the EXPB-I clade of the expansin family was separated into two groups by gene duplication: a divergent group containing allergenic β-expansins and a conservative group (Sampedro et al., 2015). The lack of divergent EXPB-I genes in eudicots or in the recently sequenced genomes of banana (Musa spp.), date palm (Phoenix dactylifera), and oil palm (Elaeis guineensis) also supports a recent split (Tang et al., 2010). In rice, divergent and conservative EXPB-I groups were inferred to have evolved from the sigma whole-genome duplication in grasses, and changes in tissue expression of divergent EXPB-I permitted pollen-specific β-expansins (OsEXPB1, OsEXPB10, OsEXPB13, and OsEXPB9). OsEXPB1, OsEXPB10, and OsEXPB13 were produced by tandem duplication events, and OsEXPB9 was produced by the rho whole-genome duplication (Tang et al., 2010). In addition, features of the expansin family demonstrated how gene duplication led to functional specification and allergenicity. Divergent EXPB-I proteins may have evolved to act on a preferred substrate, highly substituted xylans in grasses (Sampedro et al., 2015). Unfortunately, these changes also generated the specific epitopes recognized by immunoglobulins from individuals allergic to group 1 grass pollen allergens (Flicker et al., 2006).
Allergens have stringent structural and epitope requirements (Burks et al., 1999); however, variation within the epitope may create new allergens or disrupt the allergenicity. One good example is the peanut allergen Ara h 3 gene family, which arose by segmental and tandem duplications and evolved in a conservative manner (Ratnaparkhe et al., 2014). Low Ka/Ks rates of putative pollen allergens in rice and Arabidopsis indicate that these allergens might have experienced purifying selection (Fig. 4, C and D). The limited ratio of nonsynonymous mutations implied that these allergens might have evolved to have unique functions in pollen. The molecular function of a protein requires a stable structure, and so do existing allergens. Our data suggest that epitopes might be located in conserved functional sites of putative allergenic proteins, as we observed a limited ratio of nonsynonymous mutation in putative pollen allergens. As mentioned previously, pollen allergens tended to be involved in cell wall (pollen wall) metabolic processes and stress responses (Supplemental Table S2), which indicated that they underwent a strict purifying selection through pollen competition or other stresses to perform the function. That also may be the reason for the phenomenon that putative pollen allergens showed both higher gene duplication rates and lower Ka/Ks values. Allergenic β-expansins are good examples influencing the outcome of pollen competition by affecting pollen tube growth (Valdivia et al., 2007).
CONCLUSION
In summary, this work predicted 145 and 107 pollen allergens from rice and Arabidopsis, respectively, and these pollen allergens are associated with stress responses and metabolic events during pollen development. Interestingly, sequence analysis across 25 plant species from lower to higher plants suggests that some pollen allergens belong to large gene families generated by gene duplication, purifying selection, and functional diversification during evolution. During this process, two selection processes were evident: the fixation of duplication (maintaining the allergenicity) and the fixation of allergen-determining residues (retaining allergenic epitopes). Stress, pollen competition, and functional selection (like cell wall metabolic processes) could be involved in the fixation processes (Fig. 7). Our analysis of putative pollen allergens from model plants should help in predicting pollen allergens in other species and in the future medical treatment of pollen allergy. Our model of pollen allergen evolution could provide insight into the mechanisms by which allergenicity evolved and help in the identification of epitopes.
Figure 7.
Model of the origination and evolution of pollen allergen genes in plants. Conserved allergens may lead to the induction of immunological tolerance, while duplicated genes may either diverge in protein sequence to generate new allergens or maintain the original allergenicity. During this process, two selection processes are likely: fixation of the duplication (copies maintained) or fixation of allergen-determining mutations (retaining of allergenic epitopes).
MATERIALS AND METHODS
Identification of Allergenic Genes
Gene sequences for the prediction of allergens present in mature pollen grains of Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) were collected from the literature (Holmes-Davis et al., 2005; Noir et al., 2005; Dai et al., 2006; Sheoran et al., 2006). Gene expression data sets of pollen in Arabidopsis and rice (Qin et al., 2009; Wei et al., 2010), GSM692545, GSM692546, GSM69254, GSM433634, GSM433635, GSM433636, and GSM433637, were downloaded from the Gene Expression Omnibus at the National Center for Biotechnology Information (Barrett and Edgar, 2006). Only genes found in proteome data and presented (MAS5.0 AP call) in more than half of the replicates of microarray analysis were chosen as candidate genes for allergen prediction (Pepper et al., 2007).
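The candidate selection rule described above (a gene must be detected in the proteome data and be called present in more than half of the microarray replicates) could be expressed roughly as follows. This is a sketch under stated assumptions rather than the code used in this study: the function name `is_candidate`, the gene identifiers, and the encoding of MAS5.0 detection calls as the strings "P", "M", and "A" are all hypothetical.

```python
def is_candidate(gene_id, proteome_ids, detection_calls):
    """Return True if a gene passes both selection criteria.

    proteome_ids: set of gene identifiers detected in proteome data.
    detection_calls: one MAS5.0-style call per replicate; here "P" is
    assumed to mean present, and anything else is not counted.
    """
    if gene_id not in proteome_ids:
        return False
    n_present = sum(1 for call in detection_calls if call == "P")
    # "More than half" is strict: exactly half of the replicates is not enough.
    return n_present > len(detection_calls) / 2


# Hypothetical identifiers and calls, for illustration only.
proteome = {"gene_a", "gene_b"}
calls = {
    "gene_a": ["P", "P", "A"],
    "gene_b": ["P", "A", "A", "A"],
    "gene_c": ["P", "P", "P"],
}
candidates = sorted(g for g, c in calls.items() if is_candidate(g, proteome, c))
print(candidates)
```

In this toy input, only gene_a is both in the proteome set and called present in more than half of its replicates.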
All gene identifiers, transcript identifiers, and the corresponding descriptions were collected from the Arabidopsis Information Resource database (http://www.arabidopsis.org/; Lamesch et al., 2012), the Rice Genome Annotation Project database (http://rice.plantbiology.msu.edu/; Kawahara et al., 2013), or the Rice Annotation Project database (http://rapdb.dna.affrc.go.jp/; Sakai et al., 2013). Information of known pollen allergens was obtained from the Allergome database (http://www.allergome.org/; Mari et al., 2006) and the World Health Organization/IUIS Allergen Nomenclature official database (http://www.allergen.org/). Protein sequence data in FASTA format were downloaded from the Universal Protein Resource database release 2014_03 (http://www.uniprot.org/; UniProt Consortium, 2014).
Sequence-based and maximum relevance minimum redundancy feature selection methods were used to detect potential allergens in mature pollen using two published prediction tools, proAP (Wang et al., 2013b) and PREAL (Wang et al., 2013c), on our server. The sequence-based approach was proposed by the Food and Agriculture Organization of the United Nations and the World Health Organization (FAO/WHO, 2003), and the number of exact matches in a stretch of consecutive identical amino acids was set to more than eight (rule 1). Proteins predicted by both methods were retained as potential pollen allergens to ensure accuracy (Wang et al., 2013c). The prediction results and information on putative pollen allergens are shown in Supplemental Data Set S1.
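To make the sequence-based rule and the consensus step concrete, the following sketch checks whether a query protein shares a stretch of consecutive identical amino acids with a known allergen and then keeps only the proteins flagged by both predictors. It does not reproduce proAP or PREAL, whose internals are not described here. The function name, the toy sequences, and the protein identifiers are hypothetical, and the window length is left as a parameter (defaulting to 8 as an assumption) because the exact cutoff should be taken from the tools themselves.

```python
def shares_identical_stretch(query: str, allergen: str, length: int = 8) -> bool:
    """True if query and allergen share `length` consecutive identical residues."""
    if len(query) < length or len(allergen) < length:
        return False
    # Index every window of the allergen, then scan the query windows.
    allergen_windows = {
        allergen[i:i + length] for i in range(len(allergen) - length + 1)
    }
    return any(
        query[i:i + length] in allergen_windows
        for i in range(len(query) - length + 1)
    )


# Toy sequences (not real proteins): they share the 9-residue stretch AYIAKQRQI.
query_seq = "MKTAYIAKQRQISFVKSHFSRQ"
allergen_seq = "GGGAYIAKQRQIGGG"
hit_8 = shares_identical_stretch(query_seq, allergen_seq, length=8)
hit_10 = shares_identical_stretch(query_seq, allergen_seq, length=10)

# Consensus step: retain proteins predicted by both methods (hypothetical IDs).
predicted_by_method_1 = {"prot_1", "prot_2", "prot_3"}
predicted_by_method_2 = {"prot_2", "prot_3", "prot_4"}
consensus = predicted_by_method_1 & predicted_by_method_2
print(hit_8, hit_10, sorted(consensus))
```

Taking the intersection trades sensitivity for specificity, which matches the stated aim of retaining only proteins supported by both methods.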
Gene Expression Profile Analysis
The expression data of genes corresponding to potential allergens in Arabidopsis and rice were downloaded from the Bio-Analytic Resource for Plant Biology (http://bar.utoronto.ca/; Toufighi et al., 2005) or the Rice Oligonucleotide Array Database (http://www.ricearray.org/; Cao et al., 2012), respectively. Information about the expression data, such as growth stage, tissues, and samples, is listed in Supplemental Data Set S2. To avoid batch effects, ComBat (Johnson et al., 2007), an R package, was used to adjust expression data from different experiments. To examine expression patterns and the specificity of target genes, the data were clustered by Genesis release 1.7.6 (Sturn et al., 2002).
MapMan and GO Analysis
The PLAZA database version 2.5 (http://bioinformatics.psb.ugent.be/plaza/; Van Bel et al., 2012) and the PANTHER classification system (http://pantherdb.org/; Mi et al., 2013) were used to perform GO classification and enrichment analysis. To investigate the metabolic processes involved, MapMan was used to check the metabolic overview of potential allergens (Thimm et al., 2004), and significance was tested by a hypergeometric distribution test.
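A hypergeometric enrichment test of the kind mentioned above can be written with the Python standard library alone. The sketch below computes the upper-tail probability of seeing at least the observed number of annotated genes in a sample drawn without replacement. The function name `hypergeom_enrichment_p` and all counts in the example are hypothetical; they are not values from this study.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           sample: int, observed: int) -> float:
    """Upper-tail hypergeometric P value, P(X >= observed).

    population: number of background genes
    annotated:  background genes carrying the annotation of interest
    sample:     number of genes in the tested set
    observed:   genes in the tested set carrying the annotation
    """
    total = comb(population, sample)
    upper = min(sample, annotated)
    # Sum the probability mass from the observed count up to the maximum.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1,000 background genes, 300 annotated, and a tested
# set of 100 genes of which 50 carry the annotation.
p_value = hypergeom_enrichment_p(1000, 300, 100, 50)
print(f"P = {p_value:.3g}")
```

Exact integer arithmetic with `math.comb` avoids rounding problems for moderate gene counts; for very large backgrounds, a statistics library would be the more practical choice.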
Protein Family and Taxonomic Distribution Analysis
Genes were classified into protein families using the Pfam protein families database version 27.0 (http://pfam.xfam.org/; Finn et al., 2014) and the Plant Gene Family Database (http://green.dna.affrc.go.jp/PGF-DB/). Homologs including in-paralogs (i.e. BLAST hit of genes in the same species having higher bit scores than the best hit from any other species) were obtained from 25 plants (including Arabidopsis and rice) after BLAST at the PLAZA database version 2.5 (E value threshold of 1e-05).
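The in-paralog definition given above (same-species BLAST hits that score higher than the best hit from any other species) translates into a short filter. The sketch below assumes that hits have already been parsed into tuples of subject identifier, subject species, and bit score, that they already pass the E value threshold, and that the self-hit has been removed. The function name, the tuple layout, and the example identifiers are hypothetical.

```python
def find_inparalogs(query_species, hits):
    """Return subject IDs that qualify as in-paralogs of the query gene.

    hits: iterable of (subject_id, subject_species, bit_score) tuples.
    """
    hits = list(hits)
    # Best bit score among hits from any other species; if there are none,
    # every same-species hit qualifies (an assumption made in this sketch).
    best_other = max(
        (score for _, species, score in hits if species != query_species),
        default=float("-inf"),
    )
    return [
        subject_id
        for subject_id, species, score in hits
        if species == query_species and score > best_other
    ]


# Hypothetical hits for one rice query gene.
example_hits = [
    ("rice_gene_2", "rice", 410.0),
    ("rice_gene_3", "rice", 180.0),
    ("maize_gene_1", "maize", 250.0),
    ("arabidopsis_gene_1", "arabidopsis", 120.0),
]
inparalogs = find_inparalogs("rice", example_hits)
print(inparalogs)
```

Here rice_gene_2 is retained because its score exceeds that of the best maize hit, whereas rice_gene_3 is not.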
Construction of the Phylogenetic Tree, Sequence Analysis, and 3D Modeling
The Clustal Omega (Sievers and Higgins, 2014) server at the European Bioinformatics Institute (http://www.ebi.ac.uk/Tools/msa/clustalo/) was used to compare protein sequences downloaded from the UniProt database. Results of sequence alignments are shown with known secondary structure information from the Protein Data Bank (Berman et al., 2000) by the Web-based tool Easy Sequencing in PostScript (Robert and Gouet, 2014). Unrooted phylogenetic trees were reconstructed by MEGA6 (Tamura et al., 2013) using neighbor-joining and maximum likelihood methods. The 3D structures of Ara h 5 and AtProfilin1 were obtained from the Protein Data Bank under accession numbers 4ESP (Wang et al., 2013d) and 1A0K (Thorn et al., 1997), and the 3D models were visualized by UCSF Chimera (Pettersen et al., 2004).
Gene Duplication Analysis and Genome-Level Ka/Ks Estimation
Gene duplication data were obtained from the PLAZA database version 2.5 including tandem duplication and block duplication. These duplication events were identified through collinearity information using i-ADHoRe version 3.0 (Proost et al., 2012). To estimate selective pressure acting on genes, four closely related species to Arabidopsis and rice were chosen to calculate Ka/Ks rates in each species. Homolog gene pairs of Arabidopsis and Arabidopsis lyrata, Brassica rapa, Capsella rubella, or Thellungiella parvula were identified with the method of best hits of BLASTP at the PLAZA database version 2.5. ParaAT 1.0 (Zhang et al., 2012) and Clustal Omega were used for multiple sequence alignment, then Ka/Ks rates were calculated by KaKs_Calculator 2.0 (Wang et al., 2010) using the γ-MYN method (Wang et al., 2009). For homolog gene pairs in rice, Ka/Ks data calculated by the γ-MYN method in Oryza brachyantha, Oryza sativa ssp. indica 9311, Oryza glaberrima, and Peiai 64S (PA64S) were downloaded from RGKbase (Wang et al., 2013a). Gene pairs with number of nonsynonymous substitutions per nonsynonymous site < 0.5, number of synonymous substitutions per synonymous site < 5, and Ka/Ks < 2 were retained for comparison (Supplemental Data Set S3).
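The filtering thresholds stated above (Ka < 0.5, Ks < 5, and Ka/Ks < 2) can be applied to a table of gene pairs as in the following sketch. The `GenePair` record, the function name, and the example values are hypothetical, and the requirement that Ks be positive is an added assumption so that the ratio is defined. Ka and Ks themselves would come from a tool such as KaKs_Calculator; they are not computed here.

```python
from statistics import median
from typing import NamedTuple


class GenePair(NamedTuple):
    gene: str
    homolog: str
    ka: float  # nonsynonymous substitutions per nonsynonymous site
    ks: float  # synonymous substitutions per synonymous site


def passes_filter(pair: GenePair) -> bool:
    """Apply the thresholds Ka < 0.5, Ks < 5, and Ka/Ks < 2."""
    if pair.ks <= 0:  # assumption: skip pairs with an undefined ratio
        return False
    return pair.ka < 0.5 and pair.ks < 5 and pair.ka / pair.ks < 2


# Hypothetical gene pairs, for illustration only.
pairs = [
    GenePair("gene_a", "homolog_a", 0.05, 0.25),  # kept, Ka/Ks = 0.2
    GenePair("gene_b", "homolog_b", 0.60, 1.00),  # dropped, Ka too high
    GenePair("gene_c", "homolog_c", 0.10, 6.00),  # dropped, Ks too high
    GenePair("gene_d", "homolog_d", 0.40, 0.10),  # dropped, Ka/Ks = 4
    GenePair("gene_e", "homolog_e", 0.03, 0.30),  # kept, Ka/Ks = 0.1
]
retained = [p for p in pairs if passes_filter(p)]
median_ratio = median(p.ka / p.ks for p in retained)
print(len(retained), round(median_ratio, 3))
```

Summarizing the retained ratios by their median mirrors the box plot comparison between putative allergens and pollen-expressed background genes.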
Accession Numbers
Accession numbers for the genes in this article are as follows: OsEXPB1a (LOC_Os03g01610), OsEXPB1b (LOC_Os03g01650), OsEXPB2 (LOC_Os10g40710), OsEXPB3 (LOC_Os10g40720), OsEXPB4 (LOC_Os10g40730), OsEXPB5 (LOC_Os04g46650), OsEXPB6 (LOC_Os10g40700), OsEXPB7 (LOC_Os03g01270), OsEXPB8 (LOC_Os03g01260), OsEXPB9 (LOC_Os10g40090), OsEXPB10 (LOC_Os03g01640), OsEXPB11 (LOC_Os02g44108), OsEXPB12 (LOC_Os03g44290), OsEXPB13 (LOC_Os03g01630), OsEXPB14 (LOC_Os02g44106), OsEXPB15 (LOC_Os04g46630), OsEXPB16 (LOC_Os02g42650), OsEXPB17 (LOC_Os04g44780), OsEXPB18 (LOC_Os05g15690), AtEXPB1 (AT2G20750), AtEXPB2 (AT1G65680), AtEXPB3 (AT4G28250), AtEXPB4 (AT2G45110), AtEXPB5 (AT3G60570), AtProfilin1 (AT2G19760), AtProfilin2 (AT4G29350), AtProfilin3 (AT5G56600), AtProfilin4 (AT4G29340), and AtProfilin5 (AT2G19770).
Supplemental Data
The following supplemental materials are available.
Supplemental Figure S1. Analysis of the putative pollen allergen protein families in rice and Arabidopsis.
Supplemental Figure S2. Summary of the MapMan classification of the candidate allergen genes in rice and Arabidopsis.
Supplemental Figure S3. Sequence alignment of the expansin family.
Supplemental Figure S4. Taxonomic distribution of expansin homologs.
Supplemental Figure S5. Sequence alignment of the profilin family.
Supplemental Table S1. Published allergens found in pollen genes predicted in rice and Arabidopsis.
Supplemental Table S2. Summary of gene family functions of putative allergen homologs.
Supplemental Data Set S1. Candidate pollen allergens identified in proteome and transcriptome.
Supplemental Data Set S2. Expression profiles of putative pollen allergens in rice and Arabidopsis.
Supplemental Data Set S3. Ka/Ks values of putative pollen allergens in rice and Arabidopsis.
Glossary
- Ka/Ks
ratio of the number of nonsynonymous substitutions per nonsynonymous site to the number of synonymous substitutions per synonymous site
- 3D
three-dimensional
- GO
Gene Ontology
Footnotes
This work was supported by The National Key Technologies Research and Development Program of China (2016YFD0100804); the National Natural Science Foundation of China (grant nos. 31370026, 31570312, J1210047, and 31110103915); the China Innovative Research Team, Ministry of Education, and the Programme of Introducing Talents of Discipline to Universities (111 Project, grant no. B14016); the Chun-Tsung Program of Shanghai Jiao Tong University; the Innovative Research Team in University of the Ministry of Education of China; the Innovative Research Team in University of the Ministry of Science and Technology of China; the School of Agriculture, Food, and Wine, University of Adelaide (start-up grant to D.Z.); and the Australian Research Council (grant no. FT130100525 to I.S.).
Articles can be viewed without a subscription.
References
- Adams KL, Wendel JF (2005) Polyploidy and genome evolution in plants. Curr Opin Plant Biol 8: 135–141 [DOI] [PubMed] [Google Scholar]
- Arondel VV, Vergnolle C, Cantrel C, Kader J (2000) Lipid transfer proteins are encoded by a small multigene family in Arabidopsis thaliana. Plant Sci 157: 1–12 [DOI] [PubMed] [Google Scholar]
- Barral P, Suárez C, Batanero E, Alfonso C, Alché JdeD, Rodríguez-García MI, Villalba M, Rivas G, Rodríguez R (2005) An olive pollen protein with allergenic activity, Ole e 10, defines a novel family of carbohydrate-binding modules and is potentially implicated in pollen germination. Biochem J 390: 77–84 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Barrett T, Edgar R (2006) Gene Expression Omnibus: microarray data storage, submission, retrieval, and analysis. Methods Enzymol 411: 352–369 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE (2000) The Protein Data Bank. Nucleic Acids Res 28: 235–242 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bowers JE, Chapman BA, Rong J, Paterson AH (2003) Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422: 433–438 [DOI] [PubMed] [Google Scholar]
- Burks AW, King N, Bannon GA (1999) Modification of a major peanut allergen leads to loss of IgE binding. Int Arch Allergy Immunol 118: 313–314 [DOI] [PubMed] [Google Scholar]
- Cabanos C, Tandang-Silvas MR, Odijk V, Brostedt P, Tanaka A, Utsumi S, Maruyama N (2010) Expression, purification, cross-reactivity and homology modeling of peanut profilin. Protein Expr Purif 73: 36–45 [DOI] [PubMed] [Google Scholar]
- Cao P, Jung KH, Choi D, Hwang D, Zhu J, Ronald PC (2012) The Rice Oligonucleotide Array Database: an atlas of rice gene expression. Rice (N Y) 5: 17. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Choi D, Cho H, Lee Y (2006) Expansins: expanding importance in plant growth and development. Physiol Plant 126: 511–518 [Google Scholar]
- Cosgrove DJ, Bedinger P, Durachko DM (1997) Group I allergens of grass pollen as cell wall-loosening agents. Proc Natl Acad Sci USA 94: 6559–6564 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cosgrove DJ. (2000) Loosening of plant cell walls by expansins. Nature 407: 321–326 [DOI] [PubMed] [Google Scholar]
- Cui X, Lv Y, Chen ML, Nikoloski Z, Twell D, Zhang DB (2015) Young genes out of the male: an insight from evolutionary age analysis of the pollen transcriptome. Mol Plant 8: 935–945 [DOI] [PubMed] [Google Scholar]
- Dai S, Li L, Chen T, Chong K, Xue Y, Wang T (2006) Proteomic analyses of Oryza sativa mature pollen reveal novel proteins associated with pollen germination and tube growth. Proteomics 6: 2504–2529 [DOI] [PubMed] [Google Scholar]
- D’Amato G, Cecchi L, Bonini S, Nunes C, Annesi-Maesano I, Behrendt H, Liccardi G, Popov T, van Cauwenberge P (2007) Allergenic pollen and pollen allergy in Europe. Allergy 62: 976–990 [DOI] [PubMed] [Google Scholar]
- Emberlin J. (2009) Grass, tree, and weed pollen. In Kay AB, Kaplan AP, Bousquet J, Holt PG, eds, Allergy and Allergic Diseases, Ed 2 Wiley-Blackwell, Oxford, pp 942–962 [Google Scholar]
- Esch RE, Klapper DG (1989) Identification and localization of allergenic determinants on grass group I antigens using monoclonal antibodies. J Immunol 142: 179–184 [PubMed] [Google Scholar]
- Esteve C, Montealegre C, Marina ML, Garcia MC (2012) Analysis of olive allergens. Talanta 92: 1–14 [DOI] [PubMed] [Google Scholar]
- FAO/WHO (2003) Report of a Joint FAO/WHO Expert Consultation on Allergenicity of Foods Derived from Bio-technology. FAO/WHO, Rome [Google Scholar]
- Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, et al. (2014) Pfam: the protein families database. Nucleic Acids Res 42: D222–D230 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Flicker S, Steinberger P, Ball T, Krauth MT, Verdino P, Valent P, Almo S, Valenta R (2006) Spatial clustering of the IgE epitopes on the major timothy grass pollen allergen Phl p 1: importance for allergenic activity. J Allergy Clin Immunol 117: 1336–1343 [DOI] [PubMed] [Google Scholar]
- Gadermaier G, Dedic A, Obermeyer G, Frank S, Himly M, Ferreira F (2004) Biology of weed pollen allergens. Curr Allergy Asthma Rep 4: 391–400 [DOI] [PubMed] [Google Scholar]
- Gibbon BC, Zonia LE, Kovar DR, Hussey PJ, Staiger CJ (1998) Pollen profilin function depends on interaction with proline-rich motifs. Plant Cell 10: 981–993 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Grote M. (1999) In situ localization of pollen allergens by immunogold electron microscopy: allergens at unexpected sites. Int Arch Allergy Immunol 118: 1–6 [DOI] [PubMed] [Google Scholar]
- Grote M, Vrtala S, Valenta R (1993) Monitoring of two allergens, Bet v I and profilin, in dry and rehydrated birch pollen by immunogold electron microscopy and immunoblotting. J Histochem Cytochem 41: 745–750 [DOI] [PubMed] [Google Scholar]
- Gruehn S, Suphioglu C, O’Hehir RE, Volkmann D (2003) Molecular cloning and characterization of hazel pollen protein (70 kD) as a luminal binding protein (BiP): a novel cross-reactive plant allergen. Int Arch Allergy Immunol 131: 91–100 [DOI] [PubMed] [Google Scholar]
- Hiller KM, Esch RE, Klapper DG (1997) Mapping of an allergenically important determinant of grass group I allergens. J Allergy Clin Immunol 100: 335–340 [DOI] [PubMed] [Google Scholar]
- Hirano K, Hino S, Oshima K, Okajima T, Nadano D, Urisu A, Takaiwa F, Matsuda T (2013) Allergenic potential of rice-pollen proteins: expression, immuno-cross reactivity and IgE-binding. J Biochem 154: 195–205 [DOI] [PubMed] [Google Scholar]
- Holmes-Davis R, Tanaka CK, Vensel WH, Hurkman WJ, McCormick S (2005) Proteome mapping of mature pollen of Arabidopsis thaliana. Proteomics 5: 4864–4884 [DOI] [PubMed] [Google Scholar]
- Hrabina MÃ, Peltre G, Van Ree R, Moingeon PÃ (2008) Grass pollen allergens. Clin Exp Allergy Rev 3: 7–11 [Google Scholar]
- Hurst LD. (2002) The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends Genet 18: 486. [DOI] [PubMed] [Google Scholar]
- Jimenez-Lopez JC, Kotchoni SO, Rodriguez-Garcia MI, Alche JD (2012) Structure and functional features of olive pollen pectin methylesterase using homology modeling and molecular docking methods. J Mol Model 18: 4965–4984 [DOI] [PubMed] [Google Scholar]
- Johnson WE, Li C, Rabinovic A (2007) Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8: 118–127 [DOI] [PubMed] [Google Scholar]
- Kawahara Y, de la Bastide M, Hamilton JP, Kanamori H, McCombie WR, Ouyang S, Schwartz DC, Tanaka T, Wu J, Zhou S, et al. (2013) Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice (N Y) 6: 4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lakhssassi N, Doblas VG, Rosado A, Esteban del Valle A, Pose D, Jimenez AJ, Castillo AG, Valpuesta V, Borsani O, Botella MA (2012) The Arabidopsis tetratricopeptide thioredoxin-like gene family is required for osmotic stress tolerance and male sporogenesis. Plant Physiol 158: 1252–1266 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, Muller R, Dreher K, Alexander DL, Garcia-Hernandez M, et al. (2012) The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res 40: D1202–D1210 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li LC, Bedinger PA, Volk C, Jones AD, Cosgrove DJ (2003) Purification and characterization of four β-expansins (Zea m 1 isoforms) from maize pollen. Plant Physiol 132: 2073–2085 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liu X, Qu X, Jiang Y, Chang M, Zhang R, Wu Y, Fu Y, Huang S (2015) Profilin regulates apical actin polymerization to control polarized pollen tube growth. Mol Plant 8: 1694–1709 [DOI] [PubMed] [Google Scholar]
- Lynch M, Conery JS (2000) The evolutionary fate and consequences of duplicate genes. Science 290: 1151–1155 [DOI] [PubMed] [Google Scholar]
- Mahler V, Fischer S, Heiss S, Duchêne M, Kraft D, Valenta R (2001) cDNA cloning and characterization of a cross-reactive birch pollen allergen: identification as a pectin esterase. Int Arch Allergy Immunol 124: 64–66. [DOI] [PubMed] [Google Scholar]
- Mari A, Scala E, Palazzo P, Ridolfi S, Zennaro D, Carabella G (2006) Bioinformatics applied to allergy: allergen databases, from collecting sequence information to data integration. The Allergome platform as a model. Cell Immunol 244: 97–100 [DOI] [PubMed] [Google Scholar]
- Mi H, Muruganujan A, Thomas PD (2013) PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res 41: D377–D386 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mori T, Yokoyama M, Komiyama N, Okano M, Kino K (1999) Purification, identification, and cDNA cloning of Cha o 2, the second major allergen of Japanese cypress pollen. Biochem Biophys Res Commun 263: 166–171 [DOI] [PubMed] [Google Scholar]
- Mothes N, Valenta R (2004) Biology of tree pollen allergens. Curr Allergy Asthma Rep 4: 384–390 [DOI] [PubMed] [Google Scholar]
- Musidlowska-Persson A, Alm R, Emanuelsson C (2007) Cloning and sequencing of the Bet v 1-homologous allergen Fra a 1 in strawberry (Fragaria ananassa) shows the presence of an intron and little variability in amino acid sequence. Mol Immunol 44: 1245–1252 [DOI] [PubMed] [Google Scholar]
- Nei M. (1969) Gene duplication and nucleotide substitution in evolution. Nature 221: 40–42 [DOI] [PubMed] [Google Scholar]
- Noir S, Bräutigam A, Colby T, Schmidt J, Panstruga R (2005) A reference map of the Arabidopsis thaliana mature pollen proteome. Biochem Biophys Res Commun 337: 1257–1266 [DOI] [PubMed] [Google Scholar]
- Pastorello EA, Farioli L, Pravettoni V, Ispano M, Scibola E, Trambaioli C, Giuffrida MG, Ansaloni R, Godovac-Zimmermann J, Conti A, et al. (2000) The maize major allergen, which is responsible for food-induced allergic reactions, is a lipid transfer protein. J Allergy Clin Immunol 106: 744–751 [DOI] [PubMed] [Google Scholar]
- Pawankar R, Canonica GW, Holgate ST, Lockey RF, Blaiss M (2013) The WAO White Book on Allergy (Update 2013). Wisconsin World Allergy Organization, Milwaukee, WI [Google Scholar]
- Pepper SD, Saunders EK, Edwards LE, Wilson CL, Miller CJ (2007) The utility of MAS5 expression summary and detection call algorithms. BMC Bioinformatics 8: 273. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Petersen A, Schramm G, Bufe A, Schlaak M, Becker WM (1995) Structural investigations of the major allergen Phl p I on the Complimentary-DNA and protein levels. J Allergy Clin Immunol 95: 987–994 [DOI] [PubMed] [Google Scholar]
- Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE (2004) UCSF Chimera: a visualization system for exploratory research and analysis. J Comput Chem 25: 1605–1612 [DOI] [PubMed] [Google Scholar]
- Proost S, Fostier J, De Witte D, Dhoedt B, Demeester P, Van de Peer Y, Vandepoele K (2012) i-ADHoRe 3.0: fast and sensitive detection of genomic homology in extremely large data sets. Nucleic Acids Res 40: e11. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Qin Y, Leydon AR, Manziello A, Pandey R, Mount D, Denic S, Vasic B, Johnson MA, Palanivelu R (2009) Penetration of the stigma and style elicits a novel transcriptome in pollen tubes, pointing to genes critical for growth in a pistil. PLoS Genet 5: e1000621. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Radauer C, Breiteneder H (2006) Pollen allergens are restricted to few protein families and show distinct patterns of species distribution. J Allergy Clin Immunol 117: 141–147 [DOI] [PubMed] [Google Scholar]
- Radauer C, Breiteneder H (2007) Evolutionary biology of plant food allergens. J Allergy Clin Immunol 120: 518–525 [DOI] [PubMed] [Google Scholar]
- Radauer C, Bublin M, Wagner S, Mari A, Breiteneder H (2008) Allergens are distributed into few protein families and possess a restricted number of biochemical functions. J Allergy Clin Immunol 121: 847–852.e7 [DOI] [PubMed] [Google Scholar]
- Radauer C, Guhslc E, Bublin M, Breiteneder H (2012) Pollen allergens differ from nonallergenic pollen proteins by their lower extent of evolutionary conservation. World Allergy Organ J (Suppl 2) 5: S23 [Google Scholar]
- Radauer C, Willerroider M, Fuchs H, Hoffmann-Sommergruber K, Thalhamer J, Ferreira F, Scheiner O, Breiteneder H (2006) Cross-reactive and species-specific immunoglobulin E epitopes of plant profilins: an experimental and structure-based analysis. Clin Exp Allergy 36: 920–929 [DOI] [PubMed] [Google Scholar]
- Ratnaparkhe MB, Lee TH, Tan X, Wang X, Li J, Kim C, Rainville LK, Lemke C, Compton RO, Robertson J, et al. (2014) Comparative and evolutionary analysis of major peanut allergen gene families. Genome Biol Evol 6: 2468–2488 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Robert X, Gouet P (2014) Deciphering key features in protein structures with the new ENDscript server. Nucleic Acids Res 42: W320–W324 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Saha S, Raghava GPS (2006) AlgPred: prediction of allergenic proteins and mapping of IgE epitopes. Nucleic Acids Res 34: W202–W209 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sakai H, Lee SS, Tanaka T, Numa H, Kim J, Kawahara Y, Wakimoto H, Yang CC, Iwamoto M, Abe T, et al. (2013) Rice Annotation Project Database (RAP-DB): an integrative and interactive database for rice genomics. Plant Cell Physiol 54: e6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Salamanca G, Rodriguez R, Quiralte J, Moreno C, Pascual CY, Barber D, Villalba M (2010) Pectin methylesterases of pollen tissue, a major allergen in olive tree. FEBS J 277: 2729–2739 [DOI] [PubMed] [Google Scholar]
- Sampedro J, Guttman M, Li LC, Cosgrove DJ (2015) Evolutionary divergence of β-expansin structure and function in grasses parallels emergence of distinctive primary cell wall traits. Plant J 81: 108–120 [DOI] [PubMed] [Google Scholar]
- Sander I, Rozynek P, Rihs HP, van Kampen V, Chew FT, Lee WS, Kotschy-Lang N, Merget R, Bruning T, Raulf-Heimsoth M (2011) Multiple wheat flour allergens and cross-reactive carbohydrate determinants bind IgE in baker's asthma. Allergy 66: 1208–1215 [DOI] [PubMed] [Google Scholar]
- Sheoran IS, Sproule KA, Olson DJH, Ross ARS, Sawhney VK (2006) Proteome profile and functional classification of proteins in Arabidopsis thaliana (Landsberg erecta) mature pollen. Sex Plant Reprod 19: 185–196 [Google Scholar]
- Shi J, Cui M, Yang L, Kim YJ, Zhang D (2015) Genetic and biochemical mechanisms of pollen wall development. Trends Plant Sci 20: 741–753 [DOI] [PubMed] [Google Scholar]
- Sievers F, Higgins DG (2014) Clustal Omega, accurate alignment of very large numbers of sequences. Methods Mol Biol 1079: 105–116 [DOI] [PubMed] [Google Scholar]
- Soeria-Atmadja D, Lundell T, Gustafsson MG, Hammerling U (2006) Computational detection of allergenic proteins attains a new level of accuracy with in silico variable-length peptide extraction and machine learning. Nucleic Acids Res 34: 3779–3793 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Songnuan W. (2013) Wind-pollination and the roles of pollen allergenic proteins. Asian Pac J Allergy Immunol 31: 261–270 [PubMed] [Google Scholar]
- Stadler MB, Stadler BM (2003) Allergenicity prediction by protein sequence. FASEB J 17: 1141–1143 [DOI] [PubMed] [Google Scholar]
- Sturn A, Quackenbush J, Trajanoski Z (2002) Genesis: cluster analysis of microarray data. Bioinformatics 18: 207–208 [DOI] [PubMed] [Google Scholar]
- Suck R, Petersen A, Hagen S, Cromwell O, Becker WM, Fiebig H (2000) Complementary DNA cloning and expression of a newly recognized high molecular mass allergen Phl p 13 from timothy grass pollen (Phleum pratense). Clin Exp Allergy 30: 324–332 [DOI] [PubMed] [Google Scholar]
- Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol 30: 2725–2729 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tang H, Bowers JE, Wang X, Paterson AH (2010) Angiosperm genome comparisons reveal early polyploidy in the monocot lineage. Proc Natl Acad Sci USA 107: 472–477 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Thimm O, Bläsing O, Gibon Y, Nagel A, Meyer S, Krüger P, Selbig J, Müller LA, Rhee SY, Stitt M (2004) MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant J 37: 914–939 [DOI] [PubMed] [Google Scholar]
- Thoma S, Kaneko Y, Somerville C (1993) A non-specific lipid transfer protein from Arabidopsis is a cell wall protein. Plant J 3: 427–436 [DOI] [PubMed] [Google Scholar]
- Thorn KS, Christensen HEM, Shigeta R, Huddler D, Shalaby L, Lindberg U, Chua NH, Schutt CE (1997) The crystal structure of a major allergen from plants. Structure 5: 19–32 [DOI] [PubMed] [Google Scholar]
- Toufighi K, Brady SM, Austin R, Ly E, Provart NJ (2005) The Botany Array Resource: e-northerns, expression angling, and promoter analyses. Plant J 43: 153–163 [DOI] [PubMed] [Google Scholar]
- UniProt Consortium (2014) UniProt: a hub for protein information. Nucleic Acids Res 43: D204–D212 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Valdivia ER, Stephenson AG, Durachko DM, Cosgrove DJ (2009) Class B beta-expansins are needed for pollen separation and stigma penetration. Sex Plant Reprod 22: 141–152 [DOI] [PubMed] [Google Scholar]
- Valdivia ER, Wu Y, Li LC, Cosgrove DJ, Stephenson AG (2007) A group-1 grass pollen allergen influences the outcome of pollen competition in maize. PLoS ONE 2: e154. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Valenta R, Duchene M, Ebner C, Valent P, Sillaber C, Deviller P, Ferreira F, Tejkl M, Edelmann H, Kraft D, et al. (1992) Profilins constitute a novel family of functional plant pan-allergens. J Exp Med 175: 377–385 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Van Bel M, Proost S, Wischnitzki E, Movahedi S, Scheerlinck C, Van de Peer Y, Vandepoele K (2012) Dissecting plant genomes with the PLAZA comparative genomics platform. Plant Physiol 158: 590–600 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Vieths S, Scheurer S, Ballmer-Weber B (2002) Current understanding of cross-reactivity of food allergens and pollen. Ann N Y Acad Sci 964: 47–68 [DOI] [PubMed] [Google Scholar]
- Wang D, Xia Y, Li X, Hou L, Yu J (2013a) The Rice Genome Knowledgebase (RGKbase): an annotation database for rice comparative genomics and evolutionary biology. Nucleic Acids Res 41: D1199–D1205 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang D, Zhang Y, Zhang Z, Zhu J, Yu J (2010) KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies. Genomics Proteomics Bioinformatics 8: 77–80 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang DP, Wan HL, Zhang S, Yu J (2009) γ-MYN: a new algorithm for estimating Ka and Ks with consideration of variable substitution rates. Biol Direct 4: 20. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang J, Yu Y, Zhao Y, Zhang D, Li J (2013b) Evaluation and integration of existing methods for computational prediction of allergens. BMC Bioinformatics (Suppl 4) 14: S1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang J, Zhang D, Li J (2013c) PREAL: prediction of allergenic protein by maximum relevance minimum redundancy (mRMR) feature selection. BMC Syst Biol (Suppl 5) 7: S9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang Y, Fu TJ, Howard A, Kothary MH, McHugh TH, Zhang Y (2013d) Crystal structure of peanut (Arachis hypogaea) allergen Ara h 5. J Agric Food Chem 61: 1573–1578 [DOI] [PubMed] [Google Scholar]
- Wang Z, Xie W, Chi F, Li C (2005) Identification of non-specific lipid transfer protein-1 as a calmodulin-binding protein in Arabidopsis. FEBS Lett 579: 1683–1687 [DOI] [PubMed] [Google Scholar]
- Wei LQ, Xu WY, Deng ZY, Su Z, Xue Y, Wang T (2010) Genome-scale analysis and comparison of gene expression profiles in developing and germinated pollen in Oryza sativa. BMC Genomics 11: 338. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xu H, Theerakulpisut P, Goulding N, Suphioglu C, Singh MB, Bhalla PL (1995) Cloning, expression and immunological characterization of Ory s 1, the major allergen of rice pollen. Gene 164: 255–259 [DOI] [PubMed] [Google Scholar]
- Yennawar NH, Li LC, Dudzinski DM, Tabuchi A, Cosgrove DJ (2006) Crystal structure and activities of EXPB1 (Zea m 1), a beta-expansin and group-1 pollen allergen from maize. Proc Natl Acad Sci USA 103: 14664–14671 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang D, Liang W, Yin C, Zong J, Gu F, Zhang D (2010) OsC6, encoding a lipid transfer protein, is required for postmeiotic anther development in rice. Plant Physiol 154: 149–162 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang Z, Xiao J, Wu J, Zhang H, Liu G, Wang X, Dai L (2012) ParaAT: a parallel tool for constructing multiple protein-coding DNA alignments. Biochem Biophys Res Commun 419: 779–781 [DOI] [PubMed] [Google Scholar]