Author manuscript; available in PMC: 2015 May 6.
Published in final edited form as: Chembiochem. 2009 Aug 17;10(12):1951–1954. doi: 10.1002/cbic.200900373

Amyloids go genomic: Insights regarding the sequence determinants of prion formation from genome-wide studies **

Hilal A Lashuel a*, Rohit V Pappu b*
PMCID: PMC4422328  NIHMSID: NIHMS149892  PMID: 19598186

Abstract

The availability of fully sequenced genomes provides a useful starting point for identifying putative amyloid and prion forming sequences through genome-wide scans. With an inventory in hand, one can assess the amyloid forming potential and the functional consequences of amyloid formation for each sequence, thereby advancing our understanding of how cells process deleterious aggregates and utilize functional ones.

Keywords: Amyloid, Prions, poly-glutamine (glutamine-rich), poly-asparagine (asparagine-rich), fibrils, misfolding, aggregation


Protein misfolding is the primary cause of several systemic and neurodegenerative diseases[1, 2] and a major challenge in the development of protein-based therapeutics,[3] Figure 1a. In misfolding diseases such as cystic fibrosis and α1-antitrypsin deficiency, degradation and/or mistrafficking of specific proteins causes loss of protein function. A second class of misfolding diseases includes systemic amyloidoses, type II diabetes, Alzheimer’s disease, and Parkinson’s disease. These are caused by gains of toxic function resulting from the aggregation of specific misfolded proteins into highly ordered structures, termed amyloid fibrils, Figure 1b. Amyloid-like protein aggregates that act as infectious agents by replicating and transmitting their misfolded state are called prions,[4] Figure 1c. Prion diseases include Creutzfeldt-Jakob disease (CJD), Gerstmann-Sträussler syndrome (GSS), and fatal familial insomnia (FFI) in humans, bovine spongiform encephalopathy (BSE) in cattle, scrapie in sheep, and chronic wasting disease in elk. These diseases are characterized by the presence of an abnormal form of the prion protein (PrP) in the brain.[5] Prion aggregates isolated from tissues or prepared in vitro from purified recombinant proteins share a common core structure (cross-β-sheet), have fibrillar morphologies, and show dye binding properties that are akin to those of classical amyloids. What distinguishes prion diseases from other amyloid diseases is their transmission via misfolded and/or aggregated form(s) of the prion protein, which act as the infectious agent.

Figure 1.

Schematic illustration of proposed mechanisms of protein misfolding, prion formation, and prion propagation in yeast. The molecular chaperone Hsp104 plays a critical role in prion propagation by fragmenting prion particles, thereby creating additional seeds that nucleate the formation of new prion particles, which are transmitted to the daughter cells.

Figure 2.

Schematic illustrating the primary sequences of the prionogenic glutamine and / or asparagine-rich domains of the prion proteins Sup35NM (Swiss-Prot P05453), Ure2p (Swiss-Prot P23202), Rnq1p (Swiss-Prot P25367), HET-s (Swiss-Prot B8M2I5) and three of the prion candidates identified by Alberti et al.: MOT3 (Swiss-Prot P54785), NRP1 (Swiss-Prot P32770), and SWI1 (Swiss-Prot P09547). Glutamine (Q) residues are colored red and asparagine (N) residues are colored blue.

Given their link to disease, research interest in amyloids and prions has centered on understanding the molecular mechanisms that govern their formation and toxicity.[2] The goal is to identify molecules or agents that can inhibit and/or reverse these processes in vitro and slow or reverse disease progression in animal models. During the last decade, the debate on amyloid has shifted from whether amyloids are the cause or consequence of disease[6] to whether they may even have a protective physiological role. The idea of amyloid as a protective rather than a toxic species is anchored in the recognition that amyloid formation appears to be a property that is generic to many proteins. Furthermore, amyloid fibrils and prions appear to be essential constituents of many living organisms and are associated with a number of important biological functions.[1, 7] In bacterial cells, the process of amyloid formation has been linked to biofilm formation on their surfaces.[8] In mammalian cells, amyloid formation mediates melanin formation,[9] synaptic changes involved in memory storage,[10] and trait inheritance.[11]

Generating inventories of functional amyloids

The full extent to which amyloid formation is used in biological functions remains unknown. Clearly, this knowledge will be useful for understanding the mechanisms by which cells process or utilize pathological amyloid formation. It also has relevance for understanding the role of amyloid formation in regulation of different biological processes. To fill these gaps in our knowledge, it is useful to start with an inventory of putative amyloid forming sequences. Annotations of these sequences will provide hints regarding the functional / dysfunctional roles of amyloid formation. The availability of fully sequenced genomes provides a useful starting point for identifying putative amyloid forming sequences through genome-wide scans. With an inventory in hand, one can assess the amyloid forming potential and the functional consequences of amyloid formation for each sequence.

Recent progress

Although the approach outlined above is easy to articulate, there are numerous hurdles to overcome, not the least of which is the lack of clarity regarding the signatures of amyloid forming sequences. The recent work of Lindquist and coworkers[12] provides the first proof-of-concept study that starts with genome-wide scanning to generate an inventory of putative amyloid formers (yeast prions in this case) and then goes beyond this inventory to identify the bona fide prions. Starting with prior knowledge regarding the signatures of prion forming domains in Sup35, Ure2p, and Rnq1, Alberti et al. trained a hidden Markov model and carried out a scan of the S. cerevisiae proteome.[12] The training set biased the search toward proteins with lengthy domains that are rich in glutamine and / or asparagine residues. The scan yielded ca. 200 candidate proteins with putative prion forming domains. Alberti et al. selected the top 100 candidates and tested them for prion-like properties using an array of in vivo functional assays and in vitro biophysical studies. Specifically, they tested the ability of these proteins to a) form cytosolic foci; b) form ordered aggregates that are resistant to detergent; c) form amyloids in vitro, as monitored using thioflavin-T (ThT) fluorescence; and d) generate so-called [PSI+] states when the candidate prion forming domains are fused as chimeras to the M and C-terminal domains of Sup35.[12]
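The idea of scanning a proteome for lengthy glutamine / asparagine-rich domains can be illustrated with a much simpler stand-in than the hidden Markov model used by Alberti et al. The sketch below is not their method: it slides a fixed-length window along each sequence and reports the regions where the combined Q/N fraction exceeds a cutoff. The function names, the window length of 60 residues, and the cutoff of 0.4 are all illustrative assumptions.

```python
def qn_fraction(window: str) -> float:
    """Fraction of residues in the window that are glutamine (Q) or asparagine (N)."""
    return sum(residue in "QN" for residue in window) / len(window)


def scan_qn_rich(sequence: str, window: int = 60, threshold: float = 0.4):
    """Return merged (start, end) spans, 0-based with end exclusive, that are
    covered by windows whose Q/N fraction is at least `threshold`.

    The window length and threshold are illustrative assumptions, not values
    taken from Alberti et al.
    """
    sequence = sequence.upper()
    spans = []
    for start in range(len(sequence) - window + 1):
        if qn_fraction(sequence[start:start + window]) >= threshold:
            end = start + window
            if spans and start <= spans[-1][1]:
                # Overlapping or adjacent window: extend the previous span.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
    return spans


def scan_proteome(proteome: dict, window: int = 60, threshold: float = 0.4) -> dict:
    """Map protein name -> Q/N-rich spans, keeping only proteins with a hit."""
    hits = {}
    for name, sequence in proteome.items():
        spans = scan_qn_rich(sequence, window, threshold)
        if spans:
            hits[name] = spans
    return hits
```

A composition window of this kind ignores residue order and sequence context, which is one reason a trained probabilistic model is a more natural choice for the actual scan; the sketch only conveys how an inventory of candidates can be generated from sequence alone.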

Not all primary amides are the same

Prior to this study, it was believed that although the glutamine and asparagine content is important in determining the prion-forming propensity for a given sequence, the actual ratio of glutamine to asparagine plays a minor role. The assessments of Alberti et al. yielded a rather surprising result because they noticed that the aggregation prone prion forming domains tended to be enriched in asparagine residues. Conversely, the false positive putative prions were enriched in glutamine residues as well as prolines and charged residues, Figure 2. One might argue that this result is the consequence of enrichment in charged residues and amyloid breakers such as proline. Lindquist and co-workers have initiated studies to understand the preference for asparagine-rich – as opposed to glutamine-rich – regions in strongly amyloidogenic prion forming sequences.
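The compositional comparison described above (asparagine versus glutamine content, together with proline and charged residues) is straightforward to tabulate for any candidate domain. The following is a minimal sketch; the function name `composition_summary` and the choice to count only D, E, K, and R as charged are assumptions made for illustration and are not taken from Alberti et al.

```python
from collections import Counter

# Residues treated as charged in this sketch (an assumption; histidine is excluded).
CHARGED = set("DEKR")


def composition_summary(domain: str) -> dict:
    """Return the fractional content of N, Q, P, and charged residues in a domain,
    plus the difference between the N and Q fractions."""
    domain = domain.upper()
    if not domain:
        raise ValueError("domain sequence must not be empty")
    counts = Counter(domain)
    length = len(domain)
    return {
        "N": counts["N"] / length,
        "Q": counts["Q"] / length,
        "P": counts["P"] / length,
        "charged": sum(counts[residue] for residue in CHARGED) / length,
        # Positive values indicate an asparagine-biased domain.
        "N_minus_Q": (counts["N"] - counts["Q"]) / length,
    }
```

Comparing these summaries between domains that passed the prion assays and those that did not would show whether asparagine enrichment and proline / charge depletion vary together, which is the confound raised in the text.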

Understanding the origins of the observed specificities

Polyglutamine-rich regions have occupied the attention of researchers in the amyloid field because of their association with nine neurodegenerative diseases.[13, 14] In these diseases, there is a striking inverse correlation between the lengths of polyglutamine expansions within specific proteins and the ages-of-onset for diseases associated with the aggregation of these proteins or their fragments.[15] Biophysical studies have shed light on several important aspects of polyglutamine tracts.[16-21] In aqueous milieus, monomeric polyglutamine forms collapsed structures that minimize the interface with the surrounding solvent.[17, 22] In short polyglutamine tracts, this preference for collapsed structures can be reversed by the addition of multiple lysine residues at the N- and C-termini.[23] Initial studies suggested that polyglutamine aggregation proceeds through a homogeneous nucleation mechanism involving the formation of a monomeric β-sheet conformation.[24] Recent investigations indicate that the mechanism of polyglutamine aggregation is considerably more complex and in all likelihood involves the spontaneous formation of either spherical aggregates[21] or large linear aggregates,[20] depending on the presence or absence of flanking lysine residues. The formation of ordered aggregates, which might be questionable for polyglutamine alone, appears to be a slow step that involves conformational rearrangements of small numbers of molecules within droplets that can be referred to as molten oligomers.[25, 26] Polyglutamine molecules collapse on themselves partly because the sidechain primary amides solvate backbone secondary amides; similarly, soluble molten oligomers form primarily through favorable interactions between sidechain primary amides.

If the preference for asparagine-rich regions in putative prion forming domains originates in intrinsic differences between asparagine and glutamine residues, then it should be possible to tease out these differences by comparing the conformational ensembles for monomeric and oligomeric polyglutamine and polyasparagine, respectively. Alternatively, the differences observed by Alberti et al. may reflect the effects of different sequence contexts, which modulate the common intrinsic preferences of asparagine and glutamine residues, Figure 2. Ongoing studies from different laboratories that utilize combinations of techniques will allow us to dissect the origin for the observed specificity for asparagine over glutamine in putative prion forming domains. Of particular interest is the role of sequence context and coarse grain characteristics such as the charge / hydropathy of flanking sequences in determining the differential prion / amyloid forming abilities of glutamine and asparagine rich regions.
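The coarse grain characteristics mentioned above, namely the charge and hydropathy of the sequences flanking a glutamine or asparagine rich region, can be computed directly from sequence. The sketch below uses the Kyte-Doolittle hydropathy scale and a simple net charge count (K and R as +1, D and E as -1). The function names, the default flank length of 20 residues, and the choice of scale are illustrative assumptions; other scales or charge models could be substituted.

```python
# Kyte-Doolittle hydropathy scale (higher values are more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
POSITIVE = set("KR")
NEGATIVE = set("DE")


def net_charge(segment: str) -> int:
    """Count K/R as +1 and D/E as -1; histidine is treated as neutral (assumption)."""
    return sum((residue in POSITIVE) - (residue in NEGATIVE) for residue in segment)


def mean_hydropathy(segment: str):
    """Mean Kyte-Doolittle score, or None if the segment has no scorable residues."""
    scores = [KYTE_DOOLITTLE[r] for r in segment if r in KYTE_DOOLITTLE]
    return sum(scores) / len(scores) if scores else None


def flank_properties(sequence: str, start: int, end: int, flank: int = 20) -> dict:
    """Summarize the flanks of the region sequence[start:end] (0-based, end exclusive).

    `flank` is the number of residues examined on each side (illustrative default).
    """
    sequence = sequence.upper()
    n_flank = sequence[max(0, start - flank):start]
    c_flank = sequence[end:end + flank]
    return {
        "n_flank": {"net_charge": net_charge(n_flank),
                    "mean_hydropathy": mean_hydropathy(n_flank)},
        "c_flank": {"net_charge": net_charge(c_flank),
                    "mean_hydropathy": mean_hydropathy(c_flank)},
    }
```

For a toy sequence such as `"KKKKQQQQDDAA"` with the Q tract at positions 4 to 8 and `flank=4`, the N-terminal flank carries a net charge of +4 and the C-terminal flank a net charge of -2, illustrating how two tracts of identical composition can sit in very different contexts.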

Looking ahead

The ability of proteins to adopt functional and pathogenic states sharing similar structural properties suggests that formation of the functional state is tightly regulated and occurs under defined conditions. It has been hypothesized that proteins do not have to choose between an amyloidogenic / prionogenic and a functional state; rather, they can switch back and forth between the two states in response to cellular cues and / or environmental stresses. This hypothesis is supported by increasing evidence demonstrating the dynamic nature and reversibility of amyloid formation.[27, 28] If this is true, then what determines whether an amyloidogenic or prionogenic protein is pathogenic or beneficial? Solving this puzzle requires detailed analysis of the prion-coding domains, as well as a mapping of the intracellular localization of these proteins and their roles in different biological pathways. The ability of the same precursor protein to form distinct fibrillar morphologies, i.e., prion strains, also remains a mystery.[29] This phenomenon has important implications for the diversity in function and pathogenicity of amyloid and prion forming proteins and may constitute the molecular basis underlying the ability of these proteins to switch between the different states.

The work of Alberti et al. is encouraging and constitutes an important first step toward more large-scale studies designed to answer precise questions regarding the amyloid forming abilities of different sequences, not just those that contain glutamine and asparagine rich regions. In particular, systematic studies are needed to assess the validity of various predictors of amyloidogenicity that have been developed by different laboratories.[30-32] Such studies will provide us with an improved understanding of comparative driving forces and mechanisms of protein aggregation and are likely to be useful in developing quantitative models that enable us to understand how cells process deleterious or functional aggregates. From a biological standpoint, the work of Alberti et al. is encouraging because it suggests routes that can be explored to discover novel amyloid and prion forming candidates in the genomes of other organisms, especially those exposed to stressful environments where the ability to switch between different functional conformations may be advantageous. A better understanding of the sequence determinants of amyloid and prion formation and propagation, and of the molecular determinants that govern the reversibility between functional and pathogenic amyloids and prions, will improve our understanding of how nature maintains the delicate balance between the two states to generate novel biological functions and protect against disease.

Acknowledgment

The authors would like to thank Mr. Bruno Fauvet for preparing Figure 2.

Footnotes

[**]

This work was supported by grants from the Ecole Polytechnique Federale de Lausanne and the Swiss National Science Foundation (HAL) and by the National Institutes of Health, grant 5R01NS056114 (RVP).

References

  • [1]. Chiti F, Dobson CM. Annu Rev Biochem. 2006;75:333. doi: 10.1146/annurev.biochem.75.101304.123901.
  • [2]. Lansbury PT, Lashuel HA. Nature. 2006;443:774. doi: 10.1038/nature05290.
  • [3]. Wang N, Smith WF, Miller BR, Aivazian D, Lugovskoy AA, Reff ME, Glaser SM, Croner LJ, Demarest SJ. Proteins. 2009;76:99. doi: 10.1002/prot.22319.
  • [4]. Prusiner SB. Proc Natl Acad Sci U S A. 1998;95:13363. doi: 10.1073/pnas.95.23.13363.
  • [5]. Aguzzi A, Sigurdson C, Heikenwaelder M. Annu Rev Pathol. 2008;3:11. doi: 10.1146/annurev.pathmechdis.3.121806.154326.
  • [6]. Caughey B, Lansbury PT. Annu Rev Neurosci. 2003;26:267. doi: 10.1146/annurev.neuro.26.010302.081142.
  • [7]. Fowler DM, Koulov AV, Balch WE, Kelly JW. Trends Biochem Sci. 2007;32:217. doi: 10.1016/j.tibs.2007.03.003.
  • [8]. Barnhart MM, Chapman MR. Annu Rev Microbiol. 2006;60:131. doi: 10.1146/annurev.micro.60.080805.142106.
  • [9]. Fowler DM, Koulov AV, Alory-Jost C, Marks MS, Balch WE, Kelly JW. PLoS Biol. 2006;4:e6. doi: 10.1371/journal.pbio.0040006.
  • [10]. Si K, Lindquist S, Kandel ER. Cell. 2003;115:879. doi: 10.1016/s0092-8674(03)01020-1.
  • [11]. Shorter J, Lindquist S. Nat Rev Genet. 2005;6:435. doi: 10.1038/nrg1616.
  • [12]. Alberti S, Halfmann R, King O, Kapila A, Lindquist S. Cell. 2009;137:146. doi: 10.1016/j.cell.2009.02.044.
  • [13]. MacDonald ME, Ambrose CM, Duyao MP, Myers RH, Lin C, Srinidhi L, Barnes G, Taylor SA, James M, Groot N, MacFarlane H, Jenkins B, Anderson MA, Wexler NS, Gusella JF, Bates GP, Baxendale S, Hummerich H, Kirby S. Cell. 1993;72:971.
  • [14]. Cummings CJ, Zoghbi HY. Hum Mol Genet. 2000;9:909. doi: 10.1093/hmg/9.6.909.
  • [15]. Ross CA, Poirier MA. Nature Reviews Neuroscience. 2004:S10. doi: 10.1038/nm1066.
  • [16]. Chen S, Berthelier V, Yang W, Wetzel R. J Mol Biol. 2001;311:173. doi: 10.1006/jmbi.2001.4850.
  • [17]. Crick SL, Jayaraman M, Frieden C, Wetzel R, Pappu RV. Proc Natl Acad Sci U S A. 2006;103:16764. doi: 10.1073/pnas.0608175103.
  • [18]. Bhattacharyya AM, Thakur AK, Wetzel R. Proc Natl Acad Sci U S A. 2005;102:15400. doi: 10.1073/pnas.0501651102.
  • [19]. Bhattacharyya A, Thakur AK, Chellgren VM, Thiagarajan G, Williams AD, Chellgren BW, Creamer TP, Wetzel R. J Mol Biol. 2006;355:524. doi: 10.1016/j.jmb.2005.10.053.
  • [20]. Lee CC, Walters RH, Murphy RM. Biochemistry. 2007;46:12810. doi: 10.1021/bi700806c.
  • [21]. Vitalis A, Wang X, Pappu RV. J Mol Biol. 2008;384:279. doi: 10.1016/j.jmb.2008.09.026.
  • [22]. Vitalis A, Wang X, Pappu RV. Biophys J. 2007;93:1923. doi: 10.1529/biophysj.107.110080.
  • [23]. Singh VR, Lapidus LJ. J Phys Chem B. 2008;112:13172. doi: 10.1021/jp805636p.
  • [24]. Chen SM, Berthelier V, Hamilton JB, O’Nuallain B, Wetzel R. Biochemistry. 2002;41:7391. doi: 10.1021/bi011772q.
  • [25]. Krishnan R, Lindquist SL. Nature. 2005;435:765. doi: 10.1038/nature03679.
  • [26]. Pappu RV, Wang X, Vitalis A, Crick SL. Arch Biochem Biophys. 2008;469:132. doi: 10.1016/j.abb.2007.08.033.
  • [27]. Mimna R, Camus MS, Schmid A, Tuchscherer G, Lashuel HA, Mutter M. Angew Chem Int Ed Engl. 2007;46:2681. doi: 10.1002/anie.200603681.
  • [28]. Schenk D, Barbour R, Dunn W, Gordon G, Grajeda H, Guido T, Hu K, Huang J, Johnson-Wood K, Khan K, Kholodenko D, Lee M, Liao Z, Lieberburg I, Motter R, Mutter L, Soriano F, Shopp G, Vasquez N, Vandevert C, Walker S, Wogulis M, Yednock T, Games D, Seubert P. Nature. 1999;400:173. doi: 10.1038/22124.
  • [29]. Aguzzi A. Nat Cell Biol. 2004;6:290. doi: 10.1038/ncb0404-290.
  • [30]. Tartaglia GG, Pawar AP, Campioni S, Dobson CM, Chiti F, Vendruscolo M. J Mol Biol. 2008;380:425. doi: 10.1016/j.jmb.2008.05.013.
  • [31]. Pawar AP, Dubay KF, Zurdo J, Chiti F, Vendruscolo M, Dobson CM. J Mol Biol. 2005;350:379. doi: 10.1016/j.jmb.2005.04.016.
  • [32]. Fernandez-Escamilla AM, Rousseau F, Schymkowitz J, Serrano L. Nat Biotechnol. 2004;22:1302. doi: 10.1038/nbt1012.