Abstract
The regulation of gene transcription is fundamental to the existence of complex multicellular organisms such as humans. This process dictates which genes are expressed in which tissues, and controls how various cell types grow, differentiate, and respond to their environments. Although the deciphering of the human genome sequence has given us the “source code” for life, we still know far too little about the mechanisms that control which sets of genes are active in which tissues, and how their expression is regulated. It is clear, however, that much of this system depends upon the sequence-specific interactions of regulatory proteins with particular genetic loci. To be able to unravel the details of these interactions on a genome-wide basis, it is necessary to know what proteins are bound to the DNA where in the genome, and to be able to monitor how those proteins change over time and in response to external stimuli. Developing a new technology to provide this information constitutes a “Grand Challenge” for Analytical Chemistry. In this brief article we outline the nature of this challenge, and propose one strategy to address it.
Introduction
One of the most important biological functions of a cell is the regulation of gene transcription to translate the information encoded in the genome into biological function. Gene expression is primarily controlled by the availabilities and activities of specific transcription factors and other regulatory proteins and by the physical accessibility of specific genomic regions to the transcriptional machinery. 1 These DNA-binding proteins influence genetic expression by interacting with promoters, enhancers, silencers, insulators and locus control regions, both proximal and distal to a gene. 2 Despite the availability of the entire sequence of many genomes, our empirical knowledge of the DNA sequences that are targeted for binding by regulatory proteins is limited, and the prediction of these sites and sequences computationally from DNA sequence information continues to prove challenging. 3 This situation was underscored by the Encyclopedia of DNA Elements (ENCODE) Consortium 2, an international effort to analyze and annotate the human genome and its DNA sequence. To quote from their findings, “Consensus sequences of transcription factor binding sites (typically 6 to 10 bases) have relatively little information content and are present numerous times in the genome, with the great majority of these not participating in transcriptional regulation. Does chromatin structure then determine whether such a sequence has a regulatory role? Are there complex inter-factor interactions that integrate the signals from multiple sites? How are signals from different distal regulatory elements coupled without affecting all neighboring genes?”
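The low information content of short consensus sequences can be made concrete with a back-of-the-envelope estimate. The sketch below is illustrative only: it assumes a genome of about 3×10⁹ base pairs with equal, independent base frequencies, and the function name `expected_chance_matches` is hypothetical rather than taken from the ENCODE analysis.

```python
def expected_chance_matches(motif_length, genome_size_bp=3.0e9, both_strands=True):
    """Expected number of exact chance matches to one fixed motif.

    Assumes each base is A, C, G or T with probability 1/4, independently
    (a simplification; real genomes have biased composition and repeats).
    """
    positions = genome_size_bp - motif_length + 1   # possible start sites per strand
    strands = 2 if both_strands else 1              # a site may lie on either strand
    return strands * positions / 4 ** motif_length


if __name__ == "__main__":
    for k in (6, 8, 10):
        print(f"{k}-base motif: ~{expected_chance_matches(k):,.0f} chance matches")
```

Under these assumptions a 6-base motif is expected more than a million times by chance and a 10-base motif several thousand times, which is consistent with the observation that most sequence matches cannot be functional binding sites.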
We briefly review below the roles that chromatin structure and accessibility, epigenetic modification of histones and DNA, and critical genomic regulatory elements play in controlling transcription. We will then make the case that the major missing component in gaining a complete understanding of gene transcription is knowledge of the identities and locations of the proteins that associate with and control expression of the genome. Developing a technology that will reveal this information constitutes a “Grand Challenge” for Analytical Chemistry.
Chromatin Changes and Histone Modifications
Modification of histone proteins affects the accessibility of genomic DNA in the nucleus to the protein machinery responsible for transcription. Genomic DNA within the nucleus of a cell is normally packaged into a smaller volume by forming a complex with histones and other structural proteins. This complex of protein and DNA, referred to as chromatin, provides multiple layers of structural organization. The base unit of chromatin, the nucleosome core particle, comprises 147 base pairs of DNA wrapped 1.6 times around a core histone octamer. This octamer consists of two molecules each of the four core histone proteins: H2A, H2B, H3, and H4. A short span of “linker” DNA connects adjacent core particles and is capped in mammalian cells by a molecule of linker histone H1. Further levels of compaction, and the addition of scaffold proteins, produce complex arrangements creating chromatin fibers of varying thickness and with different transcriptional activity. DNA in highly compacted fibers is usually transcriptionally inactive, and genes in those regions are not actively expressed. In contrast, DNA regions that are actively transcribed are usually opened to give transcription factors, regulatory proteins, and the enzymes required to copy the DNA sequence into RNA molecules access to the DNA. 4
Accessibility of chromatin and the packaging of DNA into condensed structures are mediated by changes in structural proteins associated with chromatin. Specifically, the N-terminal tails of histone proteins at the core of the nucleosome undergo extensive posttranslational modifications, including acetylation, methylation, phosphorylation, ubiquitination, ADP ribosylation, biotinylation, citrullination, and sumoylation. 5 As early as 1964 Allfrey et al. observed that increased histone acetylation levels correlate with active transcription. 6 Conversely, histone methylation has been linked to gene repression. 7 Antibodies targeting specific histone modifications have uncovered site- or pattern-dependent correlations between modification and gene activity. Turner et al., for example, examined Drosophila polytene chromosomes to revisit Allfrey’s findings. 8 With antibodies specific to each of the four acetylation sites of H4, they found strikingly divergent staining patterns: acetylated lysine 5 or lysine 8 (acK5 or acK8) stained throughout euchromatin (extended chromatin) regions, while acK12 was overrepresented in heterochromatin (condensed chromatin) and acK16 localized to the male X chromosome. These and subsequent studies 9 led to the proposal of the histone code hypothesis by Strahl and Allis: “multiple histone modifications, acting in a combinatorial or sequential fashion on one or multiple histone tails, specify unique downstream functions”. 10 This hypothesis essentially proposes that the location of histone cores along the DNA is not static. Rather, different types of histone modification(s) modulate and alter the binding of DNA to nucleosome cores in chromatin structures, and thus affect transcriptional activation or repression.
DNA Methylation
In addition to the modification of histone proteins, chemical modification of the DNA molecule itself has also been shown to affect the level of gene expression. DNA methylation 11, 12 involves the addition of a methyl group, commonly to the number 5 carbon of the cytosine pyrimidine ring. In adult somatic tissues, DNA methylation typically occurs in a CpG dinucleotide. Clusters of CpG dinucleotides (CpG islands) are often found in regulatory regions surrounding genes, and increased methylation of these regions is correlated with suppression of transcription. DNA methylation may affect transcription of genes by physically hindering the binding of transcriptional proteins to the gene, or because methylated DNA is preferentially bound by proteins known as methyl-CpG-binding domain proteins (MBDs). MBDs bind to histone deacetylases and other chromatin remodeling proteins that can modify histones, thereby forming compact, inactive chromatin, which has been associated with a variety of human disorders. The loss of methyl-CpG-binding protein 2 (MeCP2) has been implicated in Rett Syndrome, 13 and methyl-CpG binding domain protein 2 (MBD2) mediates the transcriptional silencing of hypermethylated genes in cancer. 14
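To make the notion of a "cluster of CpG dinucleotides" concrete, the following sketch counts CpG dinucleotides in a sequence and compares the count with what base composition alone would predict. The observed/expected ratio used here is a commonly applied heuristic and is not taken from this article; the function name `cpg_stats` and the example sequences are hypothetical.

```python
def cpg_stats(seq):
    """Return GC content and the observed/expected CpG ratio for a DNA string.

    observed = number of 'CG' dinucleotides
    expected = (count of C * count of G) / sequence length
    """
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    observed = seq.count("CG")          # 'CG' cannot overlap itself, so count() is exact
    expected = (c_count * g_count / n) if n else 0.0
    return {
        "gc_content": (c_count + g_count) / n if n else 0.0,
        "cpg_observed": observed,
        "cpg_obs_exp": observed / expected if expected else 0.0,
    }


if __name__ == "__main__":
    print(cpg_stats("CGCGCGCG"))   # CpG-dense toy sequence
    print(cpg_stats("ATATATAT"))   # toy sequence with no CpG at all
```

A window whose ratio and GC content are both high is a candidate CpG island; the thresholds applied in practice vary between studies and are deliberately left out of this sketch.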
Identification of Gene Regulatory Elements and DNA-binding Proteins
Methods to identify DNA-binding proteins and their specific position of binding are beginning to emerge. 15–19 However, no methodologies exist to probe protein-DNA interactions on a genome-wide scale. Traditionally, individual DNA sequences are investigated for binding of nuclear proteins using electrophoretic mobility shift assays (EMSA). 20 In this approach, protein binding is assessed by examining the mobility of the DNA fragment on the gel, which is retarded if the DNA is bound by proteins. While this method is often used to test whether a particular DNA sequence has the ability to bind nuclear proteins, it does not allow routine identification of the bound proteins, and is only suitable for the analysis of individual DNA sequences. Similar to EMSA, DNaseI footprinting 21 has been used to analyze protein-DNA interactions in vitro, whereby proteins of interest (e.g. nuclear extracts) are allowed to bind to a DNA fragment. The fragment is then cut with DNaseI and/or other agents, generating a series of smaller fragments of various sizes. The part of the DNA bound by a protein cannot be cut, resulting in a gap in the ladder of small fragments produced. This approach can be applied genome-wide by using DNA tiling arrays. 22 It is important to note, however, that these methods do not provide information on the nature of the protein-DNA interaction that is present in vivo, which is critical to understanding the regulatory mechanism and dynamics.
Alternatively, numerous methods have been developed that allow the investigation of individual proteins and their interaction with DNA. Chromatin immunoprecipitation (ChIP) is probably the most widely accepted method for studying protein-DNA interactions. 23, 24 Formaldehyde cross-linking, which is necessary for ChIP analysis of most non-histone proteins, is followed by sonication to generate DNA fragments of a few hundred bp with physically attached proteins. Antibodies specific to a protein of interest, including specific PTMs, are used to immunoprecipitate the protein-DNA complex. The DNA fragments are then dissociated from the protein and analyzed by PCR, real-time PCR, DNA microarray (ChIP-Chip), or sequencing (ChIP-seq). 25 The resolution and coverage of ChIP-Chip ultimately depend on the composition of the DNA chip. Efficient sequencing of short DNA fragments following ChIP (ChIP-seq) 26, on the other hand, may reveal unexpected protein binding sites within DNA sequences that have not been pre-selected.
Both DNaseI and ChIP-Chip approaches have been used extensively by the ENCODE Consortium to reveal transcription factor binding sites using antibodies against a growing panel of sequence-specific transcription factors, components of the general transcription machinery, and modified histone proteins. In addition, the Consortium tested more than 600 potential promoter fragments for transcriptional activity by transient-transfection reporter assays in different human cell lines. This functional test is the primary validation of a biological role of the identified DNA sequences. However, such validation is elaborate and difficult, and negative results do not prove that a particular sequence is without biological relevance in transcriptional regulation (it may simply not play a significant role in the cell system tested).
New methods for mass-spectrometric identification of proteins binding to specific genomic loci are also beginning to emerge. 17–19, 27–30 Early attempts at accomplishing this involved exposure of synthetic dsDNA as bait to trap specific DNA-binding proteins from nuclear extract. 27–29 The technique of SILAC (Stable Isotope Labeling by Amino acids in Cell culture) has been used to improve the confidence of such methods. 17 These ex vivo approaches have an advantage in that large amounts of DNA and extract can be used to isolate sufficient material for MS identification. In contrast, in vivo approaches are considerably more challenging because the DNA sequence of interest may be present at a level of as few as one copy per cell. Butala et al. 18 were able to achieve successful identification of proteins from protein-DNA complexes formed in vivo in bacteria by increasing the abundance of the DNA through clever use of a low copy number plasmid containing the sequence of interest and LacI to facilitate extraction. Déjardin and Kingston 19 used locked nucleic acid (LNA) probes to isolate genomic DNA with its associated proteins. There, they captured telomeric sequences, which are highly repetitive regions at the end of chromosomes, to obtain sufficient material for protein identification. It remains to be seen if any of these methods can be multiplexed for parallel analysis of multiple gene sequences. Furthermore, none have yet demonstrated sensitivity for identification of in vivo bound DNA-binding proteins when the sequence of interest is present at only a single copy per cell.
A “Grand Challenge” for Analytical Chemistry
Based on this brief overview, it is clear that a complete analysis of transcriptional regulation and the identification of mechanisms underlying changes in gene transcription observed in physiological systems or disease will require a comprehensive analysis of global gene expression, DNA methylation, protein-DNA interactions and histone modifications, including the resulting changes in nucleosome positioning and DNA binding. While technologies already exist for the genome-wide analysis of gene transcription and DNA methylation, there is a desperate need for new technologies that enable the comprehensive parallel analysis of all protein-DNA interactions (including histones) without prior knowledge or assumptions.
How might this be achieved? One powerful strategy that we are pursuing as part of the Wisconsin Center of Excellence in Genomics Science is called GENECAPP, for Global ExoNuclease-based Enrichment of Chromatin-Associated Proteins for Proteomics, and is illustrated in Figure 1. In this approach, a specific DNA fragment is captured in a sequence-specific manner, allowing the isolation and subsequent characterization of all proteins bound to that region. As for chromatin immunoprecipitation, formaldehyde may be used to crosslink proteins and DNA in vivo, locking into place the protein-DNA interactions which are present at that time. The chromatin is then fragmented, either by a physical means such as sonication, or by restriction enzyme digestion. An exonuclease then removes one of the two strands of the duplexes protruding from the complex, leaving behind a free single-stranded region suitable for DNA hybridization. Incubation of this material with a solid support modified with complementary single-stranded DNA capture probes results in specific binding of the chromatin fragments of interest along with associated proteins. Subsequent characterization of these bound proteins by standard proteomic mass spectrometry techniques provides a comprehensive identification of all proteins that are bound to the targeted DNA region, and potentially the characterization of posttranslational protein modifications. Parallel analysis of many regions across the genome can be achieved by using the multiplexing capabilities of either array-based or bead-based platforms with multiple capture oligonucleotide probes that are complementary to targeted DNA regions of interest.
Figure 1.
The GENECAPP (Global ExoNuclease-based Enrichment of Chromatin-Associated Proteins for Proteomics) strategy. The parallel identification of proteins bound at specific genomic loci begins with fragmentation of cross-linked chromatin from fixed cells or tissues. Single-stranded regions of DNA are created on each fragment using an exonuclease to digest one of the two DNA strands. Individual fragments are captured on DNA arrays by hybridization to the free single-stranded portions of DNA on each fragment prior to proteolytic digestion and identification of the proteins using mass spectrometry.
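The exonuclease and hybridization steps can be illustrated with a toy model of a single fragment. This is a sketch under stated assumptions, not the authors' protocol: it assumes an exonuclease with 5'→3' polarity, assumes digestion halts exactly at the edge of the protein-protected region, and uses hypothetical names (`capture_probes`, `protected_start`, `protected_end`) and a made-up sequence.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def capture_probes(top_strand, protected_start, protected_end):
    """Capture probes (5'->3') for the single-stranded ends of one fragment.

    top_strand      : top strand of the duplex fragment, written 5'->3'
    protected_start : first index of the protein-protected region (assumed)
    protected_end   : index one past the protected region (assumed)

    With an assumed 5'->3' exonuclease, the top strand is removed from its
    5' end up to the protected region, and the bottom strand is removed
    from its own 5' end (the right-hand end of the duplex) likewise.
    """
    top = top_strand.upper()
    if not 0 <= protected_start <= protected_end <= len(top):
        raise ValueError("protected region must lie within the fragment")

    digested_left = top[:protected_start]   # top strand lost here; bottom strand is exposed
    exposed_right = top[protected_end:]     # bottom strand lost here; top strand is exposed

    return {
        # The exposed bottom strand pairs with the sequence of the removed top strand.
        "left_probe": digested_left,
        # The exposed top strand pairs with its reverse complement.
        "right_probe": reverse_complement(exposed_right),
    }


if __name__ == "__main__":
    print(capture_probes("ATGCCCGGGTAC", protected_start=3, protected_end=9))
```

In this idealized picture, a probe for one end of the fragment has the same sequence as the strand the exonuclease removed there, which is why the capture oligonucleotides can be designed directly from the genomic sequence of the targeted region.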
Although this would appear to be a conceptually straightforward task, it is in fact an extremely challenging proposition (and hence worthy of the term "Grand Challenge"!). The biggest obstacle to the successful implementation of this approach involves detection sensitivity. In the ChIP approach, the captured nucleic acid is amplified by PCR, to convert trace amounts of captured DNA into the much larger quantities needed for subsequent identification and analysis. However, since no such capability is available to amplify trace amounts of proteins, one must make do with what one is able to capture. A specific discovery proteomics experiment, for example, may require as much as a picomole of a protein of interest to be captured and available for mass spectrometry analysis. If the protein of interest is present at a very low abundance (e.g. a single copy of the protein binds to a specific target sequence), as is often the case for “master regulators” of gene expression such as transcription factors, the approach described above, at best, would allow the isolation of a single protein copy per cell. Obtaining a picomole of protein, accordingly, would require at least a picomole of cells, that is 6×10¹¹ cells. In tissue culture of human cells, 10⁶ cells per ml of culture media is a reasonable standard cell density, which leads one to the somewhat less reasonable projection of 10⁵ ml or 10² liters to capture and isolate the necessary amount of protein for the mass spectral analysis. These 100 liters of cell culture medium would need to end up eventually in a volume of a few microliters for the mass spectral analysis, a concentration factor of well over a million.
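The estimate above can be written out step by step. The worked arithmetic below uses the figures given in the text (one picomole of protein, one captured copy per cell, 10⁶ cells per ml) together with Avogadro's number; the final volume of 5 µl is an assumed stand-in for "a few microliters".

```latex
\begin{align*}
N_{\text{molecules}} &= 10^{-12}\ \text{mol} \times 6.022\times10^{23}\ \text{mol}^{-1}
                      \approx 6\times10^{11} \\
N_{\text{cells}}     &= \frac{N_{\text{molecules}}}{1\ \text{copy per cell}}
                      \approx 6\times10^{11} \\
V_{\text{culture}}   &= \frac{6\times10^{11}\ \text{cells}}{10^{6}\ \text{cells/ml}}
                      = 6\times10^{5}\ \text{ml} = 6\times10^{2}\ \text{l} \\
\text{concentration factor} &= \frac{V_{\text{culture}}}{V_{\text{final}}}
                      = \frac{6\times10^{8}\ \mu\text{l}}{5\ \mu\text{l}}
                      \approx 10^{8}
\end{align*}
```

Carried through without rounding, the culture volume comes out at several hundred liters, the same hundreds-of-liters scale as the projection in the text, and the concentration factor is indeed well over a million.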
So, is this at all plausible? Sure, but it clearly is not easy. Continuous advances in instrument sensitivity for mass spectrometers suggest we may need considerably less than a picomole of a protein of interest for mass spectral analysis. Furthermore, many proteins of interest will be present at much higher levels than one copy per cell (e.g. histone proteins on a DNA fragment with multiple nucleosomes, or transcription factors that bind as dimers or multimers or whose target regions contain multiple binding sites for cooperative binding). With such revised assumptions, the total required culture volume for cells may be significantly reduced. For individual studies of high importance, it is certainly plausible that an investigator would be willing and able to produce the requisite tens to hundreds of liters of cell culture. However, for the approach described above to have widespread impact, it would be most advantageous to develop very efficient strategies for isolating and concentrating molecules of interest away from the complex cellular background, and to push the state of the art in mass spectrometric analysis as far as possible, thereby reducing the requirements for such voluminous and expensive cell culture work at the front end. Much work will also need to be done to implement the multiplexed technology necessary to probe many sites across the genome in parallel, and the technology will have to be rapid, efficient, and inexpensive, in order for it to be feasible to study the critically important dynamic and spatial variations occurring in cells and tissues during processes such as development, differentiation, and cellular communication and signaling.
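How strongly the required culture volume depends on these assumptions can be explored with a small calculation. The sketch below is illustrative: the function name `culture_volume_liters`, the `recovery` parameter (fraction of target complexes actually captured), and the scenario values are all assumptions introduced here, not measurements from this work.

```python
AVOGADRO = 6.022e23  # molecules per mole


def culture_volume_liters(required_mol, copies_per_cell,
                          recovery=1.0, cells_per_ml=1e6):
    """Culture volume (liters) needed to capture `required_mol` of a protein.

    required_mol    : amount the mass spectrometer needs, in moles
    copies_per_cell : protein copies bound at the target locus per cell
    recovery        : assumed fraction of complexes recovered (0 < recovery <= 1)
    cells_per_ml    : culture density (10^6 cells/ml, as in the text)
    """
    if required_mol <= 0 or copies_per_cell <= 0 or cells_per_ml <= 0:
        raise ValueError("amounts and densities must be positive")
    if not 0 < recovery <= 1:
        raise ValueError("recovery must be in (0, 1]")
    molecules_needed = required_mol * AVOGADRO
    cells_needed = molecules_needed / (copies_per_cell * recovery)
    return cells_needed / cells_per_ml / 1000.0   # ml -> liters


if __name__ == "__main__":
    scenarios = {
        "1 pmol, 1 copy/cell (baseline in the text)": (1e-12, 1, 1.0),
        "100 fmol, 2 copies/cell (hypothetical)": (1e-13, 2, 1.0),
        "100 fmol, 2 copies/cell, 10% recovery (hypothetical)": (1e-13, 2, 0.1),
    }
    for label, (mol, copies, rec) in scenarios.items():
        print(f"{label}: {culture_volume_liters(mol, copies, rec):.1f} l")
```

The volume scales linearly with the amount of protein the instrument needs and inversely with both the copy number per cell and the capture efficiency, so gains in instrument sensitivity can easily be cancelled by losses during enrichment.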
The challenge does not end there. Other noteworthy issues and questions to be addressed include:
Will the kinetics of the formaldehyde cross-linking be fast enough to capture transient protein:protein interactions, which, although weak, may nonetheless play critical roles in gene regulation?
Will there be enough DNA present and accessible in the fragmented chromatin for robust DNA hybridization to solid supports?
Will it be possible to identify post-translational modifications of the bound proteins and monitor how they vary with time?
Will it be possible to obtain not just qualitative, but also quantitative information on all forms of the proteins?
Will it be possible to integrate the results that are obtained with information from other studies (e.g. genome sequence variation; gene expression analysis) to develop an integrated view of the system-wide regulation of gene expression?
These and many other issues will have to be tackled as the development of this novel and important new technology proceeds. The reward for this effort will be critical new insights into the workings of the genome, leading eventually to the much fuller understanding of normal and disease biology that is the ultimate goal of biological research worldwide.
Conclusions
The successful sequencing of the human and many other genomes has ushered in a new age in Biology – the “Genome Age”. Armed with this “Blueprint of Life”, we must now learn the mechanisms that control which sets of genes are active in which tissues, and how their expression is regulated. New technologies are needed to provide information on how organisms control and express their “source code”. In this brief article we have described the need to know what proteins are bound to the DNA where in the genome, and to be able to monitor how those proteins change in time and in response to external stimuli. We have outlined one possible strategy for obtaining this information. Addressing this problem clearly constitutes a “Grand Challenge” in Analytical Chemistry.
Acknowledgements
This work was funded by the Wisconsin Center for Excellence in Genomics Science through NIH/NHGRI grant 1P50HG004952. We gratefully acknowledge A.J. Bureta for figure preparation.
Notes and references
- 1. Farnham PJ. Nat Rev Genet. 2009;10:605–616. doi: 10.1038/nrg2636.
- 2. Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, Kuehn MS, Taylor CM, Neph S, Koch CM, Asthana S, Malhotra A, Adzhubei I, Greenbaum JA, Andrews RM, Flicek P, Boyle PJ, Cao H, Carter NP, Clelland GK, Davis S, Day N, Dhami P, Dillon SC, Dorschner MO, Fiegler H, Giresi PG, Goldy J, Hawrylycz M, Haydock A, Humbert R, James KD, Johnson BE, Johnson EM, Frum TT, Rosenzweig ER, Karnani N, Lee K, Lefebvre GC, Navas PA, Neri F, Parker SC, Sabo PJ, Sandstrom R, Shafer A, Vetrie D, Weaver M, Wilcox S, Yu M, Collins FS, Dekker J, Lieb JD, Tullius TD, Crawford GE, Sunyaev S, Noble WS, Dunham I, Denoeud F, Reymond A, Kapranov P, Rozowsky J, Zheng D, Castelo R, Frankish A, Harrow J, Ghosh S, Sandelin A, Hofacker IL, Baertsch R, Keefe D, Dike S, Cheng J, Hirsch HA, Sekinger EA, Lagarde J, Abril JF, Shahab A, Flamm C, Fried C, Hackermuller J, Hertel J, Lindemeyer M, Missal K, Tanzer A, Washietl S, Korbel J, Emanuelsson O, Pedersen JS, Holroyd N, Taylor R, Swarbreck D, Matthews N, Dickson MC, Thomas DJ, Weirauch MT, Gilbert J, Drenkow J, Bell I, Zhao X, Srinivasan KG, Sung WK, Ooi HS, Chiu KP, Foissac S, Alioto T, Brent M, Pachter L, Tress ML, Valencia A, Choo SW, Choo CY, Ucla C, Manzano C, Wyss C, Cheung E, Clark TG, Brown JB, Ganesh M, Patel S, Tammana H, Chrast J, Henrichsen CN, Kai C, Kawai J, Nagalakshmi U, Wu J, Lian Z, Lian J, Newburger P, Zhang X, Bickel P, Mattick JS, Carninci P, Hayashizaki Y, Weissman S, Hubbard T, Myers RM, Rogers J, Stadler PF, Lowe TM, Wei CL, Ruan Y, Struhl K, Gerstein M, Antonarakis SE, Fu Y, Green ED, Karaoz U, Siepel A, Taylor J, Liefer LA, Wetterstrand KA, Good PJ, Feingold EA, Guyer MS, Cooper GM, Asimenos G, Dewey CN, Hou M, Nikolaev S, Montoya-Burgos JI, Loytynoja A, Whelan S, Pardi F, Massingham T, Huang H, Zhang NR, Holmes I, Mullikin JC, Ureta-Vidal A, Paten B, Seringhaus M, Church D, Rosenbloom K, Kent WJ, Stone EA, Batzoglou S, Goldman N, Hardison RC, Haussler D, Miller W, Sidow A, Trinklein ND, Zhang ZD, Barrera L, Stuart R, King DC, Ameur A, Enroth S, Bieda MC, Kim J, Bhinge AA, Jiang N, Liu J, Yao F, Vega VB, Lee CW, Ng P, Yang A, Moqtaderi Z, Zhu Z, Xu X, Squazzo S, Oberley MJ, Inman D, Singer MA, Richmond TA, Munn KJ, Rada-Iglesias A, Wallerman O, Komorowski J, Fowler JC, Couttet P, Bruce AW, Dovey OM, Ellis PD, Langford CF, Nix DA, Euskirchen G, Hartman S, Urban AE, Kraus P, Van Calcar S, Heintzman N, Kim TH, Wang K, Qu C, Hon G, Luna R, Glass CK, Rosenfeld MG, Aldred SF, Cooper SJ, Halees A, Lin JM, Shulha HP, Xu M, Haidar JN, Yu Y, Iyer VR, Green RD, Wadelius C, Farnham PJ, Ren B, Harte RA, Hinrichs AS, Trumbower H, Clawson H, Hillman-Jackson J, Zweig AS, Smith K, Thakkapallayil A, Barber G, Kuhn RM, Karolchik D, Armengol L, Bird CP, de Bakker PI, Kern AD, Lopez-Bigas N, Martin JD, Stranger BE, Woodroffe A, Davydov E, Dimas A, Eyras E, Hallgrimsdottir IB, Huppert J, Zody MC, Abecasis GR, Estivill X, Bouffard GG, Guan X, Hansen NF, Idol JR, Maduro VV, Maskeri B, McDowell JC, Park M, Thomas PJ, Young AC, Blakesley RW, Muzny DM, Sodergren E, Wheeler DA, Worley KC, Jiang H, Weinstock GM, Gibbs RA, Graves T, Fulton R, Mardis ER, Wilson RK, Clamp M, Cuff J, Gnerre S, Jaffe DB, Chang JL, Lindblad-Toh K, Lander ES, Koriabine M, Nefedov M, Osoegawa K, Yoshinaga Y, Zhu B, de Jong PJ. Nature. 2007;447:799–816. doi: 10.1038/nature05874.
- 3. Boeva V, Surdez D, Guillon N, Tirode F, Fejes AP, Delattre O, Barillot E. Nucleic Acids Res. 2010;38:e126. doi: 10.1093/nar/gkq217.
- 4. Chioda M, Becker PB. Heredity. 2010;105:71–79. doi: 10.1038/hdy.2010.34.
- 5. Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P. Molecular biology of the cell. New York: Garland Science; 2002.
- 6. Allfrey VG, Faulkner R, Mirsky AE. Proc Natl Acad Sci U S A. 1964;51:786–794. doi: 10.1073/pnas.51.5.786.
- 7. Desrosiers R, Tanguay RM. Biochem Biophys Res Commun. 1985;133:823–829. doi: 10.1016/0006-291x(85)90978-7.
- 8. Turner BM, Birley AJ, Lavender J. Cell. 1992;69:375–384. doi: 10.1016/0092-8674(92)90417-b.
- 9. Brownell JE, Allis CD. Proc Natl Acad Sci U S A. 1995;92:6364–6368. doi: 10.1073/pnas.92.14.6364.
- 10. Strahl BD, Allis CD. Nature. 2000;403:41–45. doi: 10.1038/47412.
- 11. Klose RJ, Bird AP. Trends Biochem Sci. 2006;31:89–97. doi: 10.1016/j.tibs.2005.12.008.
- 12. Morgan HD, Santos F, Green K, Dean W, Reik W. Hum Mol Genet. 2005;14(Spec No 1):R47–R58. doi: 10.1093/hmg/ddi114.
- 13. Amir RE, Van den Veyver IB, Wan M, Tran CQ, Francke U, Zoghbi HY. Nat Genet. 1999;23:185–188. doi: 10.1038/13810.
- 14. Berger J, Bird A. Biochem Soc Trans. 2005;33:1537–1540. doi: 10.1042/BST0331537.
- 15. Lambert JP, Fillingham J, Siahbazi M, Greenblatt J, Baetz K, Figeys D. Mol Syst Biol. 2010;6:448. doi: 10.1038/msb.2010.104.
- 16. Jiang D, Jarrett HW, Haskins WE. J Chromatogr A. 2009;1216:6881–6889. doi: 10.1016/j.chroma.2009.08.044.
- 17. Mittler G, Butter F, Mann M. Genome Res. 2009;19:284–293. doi: 10.1101/gr.081711.108.
- 18. Butala M, Busby SJ, Lee DJ. Nucleic Acids Res. 2009;37:e37. doi: 10.1093/nar/gkp043.
- 19. Dejardin J, Kingston RE. Cell. 2009;136:175–186. doi: 10.1016/j.cell.2008.11.045.
- 20. Perez-Romero P, Imperiale MJ. Methods Mol Med. 2007;131:123–139. doi: 10.1007/978-1-59745-277-9_10.
- 21. Boyle AP, Song L, Lee BK, London D, Keefe D, Birney E, Iyer VR, Crawford GE, Furey TS. Genome Res. 2010. doi: 10.1101/gr.112656.110.
- 22. Shi B, Guo X, Wu T, Sheng S, Wang J, Skogerbo G, Zhu X, Chen R. BMC Genomics. 2009;10:92. doi: 10.1186/1471-2164-10-92.
- 23. Elnitski L, Jin VX, Farnham PJ, Jones SJ. Genome Res. 2006;16:1455–1464. doi: 10.1101/gr.4140006.
- 24. Walhout AJ. Genome Res. 2006;16:1445–1454. doi: 10.1101/gr.5321506.
- 25. Collas P. Mol Biotechnol. 2010;45:87–100. doi: 10.1007/s12033-009-9239-8.
- 26. Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, Zeng T, Euskirchen G, Bernier B, Varhol R, Delaney A, Thiessen N, Griffith OL, He A, Marra M, Snyder M, Jones S. Nat Methods. 2007;4:651–657. doi: 10.1038/nmeth1068.
- 27. Stead JA, Keen JN, McDowall KJ. Mol Cell Proteomics. 2006;5:1697–1702. doi: 10.1074/mcp.T600027-MCP200.
- 28. Nordhoff E, Krogsdam AM, Jorgensen HF, Kallipolitis BH, Clark BF, Roepstorff P, Kristiansen K. Nat Biotechnol. 1999;17:884–888. doi: 10.1038/12873.
- 29. Griffin TJ, Aebersold R. J Biol Chem. 2001;276:45497–45500. doi: 10.1074/jbc.R100014200.
- 30. Himeda CL, Ranish JA, Angello JC, Maire P, Aebersold R, Hauschka SD. Mol Cell Biol. 2004;24:2132–2143. doi: 10.1128/MCB.24.5.2132-2143.2004.

