Author manuscript; available in PMC: 2017 Sep 12.
Published in final edited form as: Cell Rep. 2016 Sep 27;17(1):303–315. doi: 10.1016/j.celrep.2016.08.095

DEEPN as an Approach for Batch Processing of Yeast 2-Hybrid Interactions

Natasha Pashkova 1,3, Tabitha A Peterson 1,3, Venkatramanan Krishnamani 1, Patrick Breheny 2, Mark Stamnes 1, Robert C Piper 1,4
PMCID: PMC5594928  NIHMSID: NIHMS885867  PMID: 27681439

Summary

We adapted the yeast 2-hybrid assay to simultaneously uncover multiple transient protein interactions within a single screen by using a strategy termed DEEPN (dynamic enrichment for evaluation of protein networks). This approach incorporates high-throughput DNA sequencing and computation to follow competition among a plasmid population encoding interacting partners. To demonstrate the capacity of DEEPN, we identify a wide range of ubiquitin-binding proteins, including interactors that we verify biochemically. To demonstrate the specificity of DEEPN, we show that DEEPN allows simultaneous comparison of candidate interactors across multiple bait proteins, allowing differential interactions to be identified. This feature was used to identify interactors that distinguish between GTP- and GDP-bound conformations of Rab5.

Graphical abstract

Using deep sequencing to follow a highly complex library upon selection for yeast 2-hybrid interactions provides a way to compare the interactome of one protein to another in parallel. Pashkova et al. provide a comprehensive workflow complete with a bioinformatics package that makes this approach accessible to a variety of investigators.


Introduction

One of the most effective strategies for understanding how a particular protein drives a biological process is to identify interaction partners, which often leads to mechanistic hypotheses that can be tested definitively. Moreover, more comprehensive maps of protein interactions help conceptualize associations that lead to unexpected functions, as is evident from bioinformatic developments in network analyses exploiting large datasets (Fraser et al., 2013; Ryan et al., 2013). Although many methods can yield an extensive repertoire of protein interactors (interactome), they generally have significant limitations and pose further challenges for accurately comparing the interactomes of two different proteins or two forms of the same protein. Moreover, many methods that identify protein-protein interactions are biased in favor of abundant partners that bind with high affinity (Silberberg et al., 2014). For example, a common approach is to purify a protein complex from crude cell lysates and then identify its components by mass spectrometry (Aebersold and Mann, 2003; Nilsson et al., 2010). However, many well-acknowledged technical limitations diminish the ability to identify important interacting proteins that are only present at a few copies per cell or that interact with their binding partner only transiently (Hamdan and Righetti, 2002; Hein et al., 2015; Lubec and Afjehi-Sadat, 2007). An approach that partly circumvents these limitations is the yeast 2-hybrid (Y2H) assay, which exploits engineered yeast cells that grow only when a protein of interest binds to a fragment of an interacting partner (Fields and Song, 1989). However, results from Y2H experiments are often incomplete because there is a limit to how many individual yeast colonies can be isolated and analyzed in a single experiment.
Increasing the stringency of interaction by demanding higher levels of transcriptional activation reduces the number of colonies to be analyzed but potentially eliminates hundreds of authentic interactions. Although the Y2H approach has been adapted to matrix-formatted methodology using curated ORFeome prey plasmids, this requires a large infrastructure (Ito et al., 2001; Li et al., 2004; Rual et al., 2005) that is not easily accessible to individual investigators studying the interactions of particular proteins or protein domains. Next-generation sequencing (NGS) has been applied to Y2H screens wherein prey plasmids can be detected in pools of yeast (Rajagopala, 2015; Suter et al., 2015). Those studies used NGS as a way to better count the colonies that are produced in a traditional Y2H format, and this method works well when small, well-characterized prey libraries (e.g., those containing pools of an ORFeome) are used (Rolland et al., 2014; Weimann et al., 2013). However, this type of approach is not readily accessible for a typical investigator focused on the functional aspects of a particular protein. This type of analysis is also insufficient for highly complex prey libraries in which a single gene could be represented by several different cDNA fragments that start and end at different points and in different reading frames relative to the reference mRNA. In addition, analyzing pools of Y2H-positive colonies preserves the bias in the initial composition of the prey library, such that rare prey plasmids may stay too rare to be reliably detected in small pools of sequence reads. An equally challenging problem is how to reliably preserve and monitor enough library complexity to directly compare the interactions of one bait versus another so that differential interactions can be reliably detected.

Here we describe a method termed DEEPN (dynamic enrichment for evaluation of protein networks) that avoids these limitations while minimizing the requirements for instrumentation, optimization, and bioinformatics expertise. It monitors Y2H interactions among prey plasmids as they compete in batch for a growth advantage, allowing rare components within the prey library to amplify. Analysis of NGS data is facilitated by DEEPN software, which extracts the identity of individual prey plasmids that confer Y2H-positive growth and applies a statistical model to rank the specificity of interactions across multiple bait proteins.

Results

DEEPN follows the abundance of Y2H-cDNA prey plasmids in a library population as they are selected for interaction with a given bait plasmid. If different prey populations selected using distinct bait plasmids arise from similar initial populations of a Y2H-cDNA library, then a direct comparison of Y2H interactions will reveal the relative affinity of a specific prey fusion protein for one bait protein versus another. Executing this strategy relies on reproducibly generating populations of yeast containing the same initial distribution of library plasmids for each bait plasmid. We optimized a mating procedure between MATA PJY69 yeast (carrying the Gal4 DNA-binding domain [DBD]-bait construct) and MATα Y187 yeast (containing the Y2H prey library composed of the Gal4 activation domain [AD] fused to mouse cDNA fragments) (Figure 1). This protocol generated ∼5 × 10⁶ to 10 × 10⁶ independent diploid strains for a cDNA library that we determined to have ∼1 × 10⁶ elements (Figure S1). Deep sequencing of PCR products that spanned cDNA inserts amplified from multiple independently generated diploid populations showed that the cDNA library populations were reproduced well (Figure 1B). In addition, cells that carried different Gal4-DBD-bait plasmids yielded comparable Gal4-AD-cDNA library populations. Thus, the bait fusion constructs did not distort the library population under conditions for which a Y2H interaction was not under selection. As expected, most of the variability across these yeast populations was confined to the least abundant genes (Figures 1C and 1D).

Figure 1. Generation of Reproducible Y2H Prey/Bait Populations.


(A) Mating strategy. Top row: individual MAT(A) PJY69 strains carrying a TRP1-containing Gal4-DBD vector plasmid, alone (vector) or fused to 1 of 3 different bait sequences (Bait 1–3), were mated to the MATα Y187 strain containing a Gal4-AD mouse cDNA library. Middle row: mating reactions produced 6.2 × 10⁶ to 2.6 × 10⁷ initial diploids. Diploid populations were expanded in SD-Trp-Leu for five generations, after which the cDNA library inserts were PCR amplified and deep sequenced and the number of reads corresponding to each mouse mRNA was determined. Correlations (R²) of these gene counts between each diploid population are shown. For R² values designated by an asterisk, the comparison is made to the average gene count across all seven datasets.

(B) Abundance of each gene (15,000 total) in each diploid population was plotted as a function of the average abundance for each gene across all seven independent populations. Each population is plotted in color corresponding to that indicated in (A).

(C) Abundance of each gene (15,000 total) on a log2 scale plotted as a function of the average abundance of that gene across all populations. The average is plotted as a black line.

(D) Ranked abundance (right y axis) and relative SD (left y axis) of each gene as an average for all populations on a log2 scale.

(E) Models of population abundance under non-limiting and limiting conditions. Left: theoretical growth curves for a population of yeast that contains five types of plasmids, modeled using low-stringency conditions (top) and high-stringency conditions (bottom). Under low-stringency conditions, a plasmid that does not support growth is initially abundant but then de-enriched, with its population reduced by 10% per unit time. In the cases of four other plasmids that start at the same low initial abundance but impart a growth advantage, the populations expand over time. Right: the proportion of each plasmid in plots at left that would be detected in a finite number of sequence reads. As the population grows, the plasmid that confers the strongest growth rate eventually outcompetes other plasmids. Under high-stringency conditions (bottom), growth rates are more stratified, making other competitors diminish more. The percent growth rate equals the amount a population increases per unit time.

In PJY69/Y187 diploids, a positive Y2H interaction results in the production of His3, and in principle, the growth of strains containing Y2H-interacting plasmids will outpace that of competitors in media lacking histidine (His). Thus, it was necessary to consider how the population responds to selection at different levels of stringency, as illustrated in a simple model for growth (Figure 1E). Whereas in a growth assay the expansion of a population would be unlimited, DEEPN assesses only a finite number of sequence reads, sampling a finite read space or competition space. Thus, detecting the expansion of one prey plasmid comes at the expense of others. Initially, growth under selection depletes the plasmids that do not have a positive Y2H interaction, leading to an increase in abundance of all those that do. However, at later times of growth, when the proportion of Y2H-negative plasmids is low, the Y2H-positive plasmids must compete against one another. Thus, the abundance of all but one Y2H-interacting plasmid will initially expand but then contract during selective growth. This model also explains how competition is enhanced when the growth rate is further stratified, such as when the stringency of selection is increased by addition of the His3 competitive inhibitor 3-aminotriazole (3AT). Here, increasing the disparity in growth rates quickly allows a few dominant plasmids to crowd out multiple interactors. Thus, fewer rounds of growth under low-stringency selection favor detection of the largest set of Y2H-interacting partners.
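The competition described above can be illustrated with a minimal numerical sketch. It is not the model used to draw Figure 1E; the starting abundances, the growth rates, and the function name `simulate` are all hypothetical values chosen only to reproduce the qualitative behavior: a non-interacting plasmid is depleted, weaker interactors first rise and then fall as a share of a finite read space, and stratifying growth rates lets the strongest plasmid dominate more completely.

```python
def simulate(initial, rates, steps):
    """Return the proportion of each plasmid at every time step.

    initial: starting abundances (arbitrary units)
    rates:   fractional change per unit time (-0.10 means a 10% loss)
    """
    abundances = list(initial)
    history = []
    for _ in range(steps + 1):
        total = sum(abundances)
        # Proportions model a finite read space: one plasmid's gain is another's loss.
        history.append([a / total for a in abundances])
        abundances = [a * (1.0 + r) for a, r in zip(abundances, rates)]
    return history

# Hypothetical values: one abundant non-interactor and four rare interactors.
initial = [1000.0, 1.0, 1.0, 1.0, 1.0]
low_rates = [-0.10, 0.10, 0.15, 0.20, 0.25]   # low stringency (assumed)
high_rates = [-0.10, 0.05, 0.15, 0.30, 0.50]  # high stringency (assumed)

low = simulate(initial, low_rates, 60)
high = simulate(initial, high_rates, 60)

# Share of the weakest interactor over time under low stringency.
weakest = [step[1] for step in low]
peak_time = weakest.index(max(weakest))
print("weakest interactor peaks at step", peak_time)
print("final share of strongest plasmid (low, high):", low[-1][4], high[-1][4])
```

Under these assumed rates, the weakest interactor peaks at an intermediate time point, which is the reason given in the text for limiting the number of rounds of selective growth.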

We developed a bioinformatics workflow to identify Y2H candidates based on Illumina sequencing of prey plasmid inserts (Figure 2). Tophat2 is used to generate a dataset that is mapped to the genome from which the prey library inserts were generated, as well as to a dataset of unmapped reads containing fragments of the flanking Gal4-AD expression plasmid. The DEEPN GeneCount module counts the number of mapped reads per candidate gene, measuring the level of enrichment a given gene undergoes during growth under selection. Output from GeneCount can be used by the separate StatMaker program that provides a two- or three-way statistical ranking of Y2H interactions. JunctionMake and QueryBlast modules analyze the unmapped sequences by finding reads that span the junction between the Gal4-AD and the prey insert and searching these junctions against a curated mRNA or open reading frame (ORF) database. This determines where the fusion point is located along a given cDNA and whether the protein fragment encoded by the prey insert is in the proper translational frame. Finally, ReadDepth calculates read depth along the length of a prey insert to extract information about the border of the 3′ end and uses this information to reconstruct the candidate Y2H prey plasmid for further validation.
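As a rough illustration of the kind of bookkeeping the gene-counting step involves, the sketch below converts per-read gene assignments into parts per million and compares a selected population with an unselected one. It is not the DEEPN GeneCount code: the function names, the pseudocount, and the toy gene labels are assumptions made for the example, and the input is taken to be a list with one gene name per mapped read.

```python
from collections import Counter

def counts_to_ppm(gene_hits):
    """Convert a list of gene names (one per mapped read) to parts per million."""
    counts = Counter(gene_hits)
    total = sum(counts.values())
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}

def fold_enrichment(unselected_ppm, selected_ppm, pseudo_ppm=1.0):
    """Ratio of selected to unselected abundance for every gene seen in either set.

    pseudo_ppm is an assumed pseudocount that avoids division by zero for
    genes absent from one population.
    """
    genes = set(unselected_ppm) | set(selected_ppm)
    return {
        g: (selected_ppm.get(g, 0.0) + pseudo_ppm)
           / (unselected_ppm.get(g, 0.0) + pseudo_ppm)
        for g in genes
    }

# Toy data with hypothetical gene labels.
unselected = counts_to_ppm(["geneA"] * 2 + ["geneB"] * 8)
selected = counts_to_ppm(["geneA"] * 9 + ["geneB"] * 1)
fold = fold_enrichment(unselected, selected)
for gene in sorted(fold):
    print(gene, round(fold[gene], 3))
```

Expressing abundance in parts per million makes populations sequenced to different depths comparable, which is why the junction and enrichment plots in this paper use that unit.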

Figure 2. Data Processing.


(A) Computational workflow. Fastq files are mapped using Tophat2/Bowtie2 to produce sets of mapped and unmapped .sam files that are subsequently analyzed by four modules within the DEEPN software. GeneCount enumerates the number of reads corresponding to each gene, which can then be ranked using StatMaker. JunctionMake filters unmapped reads for junctions between the Gal4-AD and the cDNA insert and performs a Blastn search to identify insert fragments. QueryBlast determines the abundance, position, and reading frame of each junction site for a particular gene. ReadDepth uses mapped read data to plot the abundance of reads for a given cDNA as a function of nucleotide position.

(B) Window of the graphic user interface that controls QueryBlast.

(C) Information retrieved from the QueryBlast and ReadDepth modules. Top: plot of the junction sites along a cDNA (corresponding to TBC1D15, NM_025706) across the x axis and the corresponding abundance across the y axis in parts per million for the starting library population grown under non-selective conditions (gray) and under selective conditions (blue) for interaction with diUb. Middle: junction sites (blue) for selective data on the right y axis, and read depth for that cDNA in the same dataset plotted on the left y axis. Lower: scheme to recreate deduced prey plasmid from calculated data. PCR amplification from DNA isolated from the selected population is used to generate an amplicon that is then recombined into an empty host Gal4-AD-expressing plasmid.

Search for Ub-Interacting Proteins

We used DEEPN to discover proteins that interact with ubiquitin (Ub). Ub was chosen because it has many binding partners that typically bind at low affinity (∼20–200 µM). Preliminary experiments showed that when the known Ub-binding domain of HD-PTP was fused to Gal4 as bait, a Y2H interaction could not be detected by growth on synthetic minimal media with dextrose (SD)-His. However, when two linear Ub moieties (linked by a Ub-peptidase-resistant [G75S76M1] linkage) were fused to Gal4 to increase avidity, growth on SD-His occurred. Although diubiquitin (diUb) could increase avidity for mono-Ub-binding proteins, other Ub-binding proteins require multiple Ub moieties with more precise spacing, as conferred by linkage through specific Ub lysines, such as K63. Computational predictions showed that a linear diUb containing a three- or four-residue in-frame spacer could adopt the same conformations as K63-diUb when both Ub moieties are bound to K63-diUb-specific Ub-binding domains, such as the tandem ubiquitin-interaction motifs (UIMs) of Rap80, the NZF domain of TAB2, and the catalytic domain of AMSH (PDB: 3A1Q, 2WXO, and 2ZNV, respectively). Therefore, we also included Gal4-Ub-3 spacer-Ub and Gal4-Ub-4 spacer-Ub fusion protein bait plasmids.

Diploid populations containing the diUb-Gal4 fusion baits and a Gal4-AD mouse cDNA prey library were divided and grown in the presence and absence of His, the latter of which showed a lag phase of ∼24 hr. With a normal doubling time of ∼4 hr for PJY69/Y187 diploids, this delay suggested that ∼1% of diploids in the culture carried plasmids capable of a Y2H interaction. The yeast populations were selected in two consecutive rounds of growth, each initiated by a 1:37 dilution. DNA was then isolated and cDNA inserts were amplified by PCR, sheared, and subjected to Illumina sequencing and analysis using the DEEPN software. We were able to follow 15,668 genes for interaction with linear diUb; among these, 65 showed 2-fold or greater enrichment under selection, whereas the remainder (99.6%) were de-enriched and 79.1% were eliminated (Figure 3; Table S1). The candidate list was refined by eliminating genes that were enriched when selected with the Gal4-DBD vector alone, as well as those that did not contain a protein-encoding region within proper translational reading frame. This filtered set of 32 proteins (Figure 3A) included several known to have Ub-binding activity, confirming gene enrichment analysis that showed this cohort was highly statistically significant for Ub binding (Figure S1A). Closer inspection revealed that for the known Ub interactors, the cDNA fragments that were enriched upon Y2H selection all contained known Ub-binding modules (Figure 3D), indicating that this type of junction analysis could indicate whether a particular subdomain is sufficient for interaction. In addition, for each of the Gal4-AD-cDNA clones present in the library before selection, we could follow how the abundance evolved. This revealed that although many plasmids carried parts of a given cDNA, only a few of these were enriched during selection. Thus, for individual plasmids, the fold enrichment was far higher than that quantified based on the total of mapped gene counts.
For several candidate Ub-interacting genes, we found that the cDNA library contained additional unique plasmids that carry the gene but were not enriched during selection. Invariably, these had cDNA fragments that were not in frame or were fused to the 3′ end of the region encoding the known Ub-binding module.
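The estimate earlier in this section, that a ∼24 hr lag at a ∼4 hr doubling time implies roughly 1% Y2H-competent diploids, can be checked with a short calculation. The sketch assumes, as a simplification not stated in the text, that the lag equals the time a growing subpopulation needs to double up to the size of the whole starting culture; the function name `interacting_fraction` is hypothetical.

```python
def interacting_fraction(lag_hr, doubling_hr):
    """Fraction of the starting culture able to grow, inferred from the lag.

    Assumption: only the Y2H-positive subpopulation divides, and the lag ends
    once that subpopulation has doubled enough times to match the size of the
    original culture.
    """
    doublings = lag_hr / doubling_hr
    return 1.0 / (2.0 ** doublings)

frac = interacting_fraction(24.0, 4.0)  # six doublings
print(f"{frac:.4f} of diploids ({frac * 100:.1f}%)")
```

Six doublings correspond to 1/64 of the culture, or about 1.6%, which is in line with the order-of-magnitude figure of ∼1% quoted in the text.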

Figure 3. DEEPN Identifies Proteins that Interact with diUb.


Protein-protein interactions and protein expression in yeast expressing the Gal4-DBD vector, alone or fused to linear diUb (0sp), a form of diUb separated with a three amino acid linker (3sp), or a form of diUb separated with four amino acid linker (4sp).

(A) Heatmap display of the level of enrichment (red) or de-enrichment (green) for a subset of genes after selection for a Y2H interaction on the indicated Gal4-DBD-bait plasmids. The dataset includes novel and known Ub-interacting proteins. To the far left is a summary of relevant genes that were either found previously to bind Ub through known Ub-binding domains or found here as a new verified Y2H interaction. Heatmap results from two independent vector-alone populations (1 and 2) are shown for populations grown in SD-His to select for Y2H interactions. For diUb (0sp), diUb (4sp), and diUb (3sp), the diploid populations were also selected in SD-His media containing 3AT.

(B) Enrichment as a function of serial rounds of growth. A yeast population containing a bait plasmid and the Gal4-AD-cDNA library was grown in the absence of selection (0) or under selection (SD-His) for one (1) or two (2) sequential rounds of dilution. Left: DNA gel of PCR products obtained from amplification across cDNA inserts. Note the emergence of major bands after longer selection times. Right: relative abundance of empty vector within the library as the population is grown under selection (left y axis), and percentage of reads from those PCR products that were successfully mapped onto the mouse genome (right y axis).

(C) Dynamics of candidate enrichment during selection. Abundance of the indicated interacting genes in the starting population and during the first and second round of growth. Data were normalized by expressing all values as a function of abundance in the initial non-selected population. Total represents the average change across all genes.

(D) Junction sequences within the data describing how a given cDNA candidate is fused to Gal4-AD. Abundance (in parts per million) is plotted against the position of the fusion point; data from the starting (unselected; gray) and selected (SD-His; blue) populations are shown. Start and stop codons are indicated, and a domain map highlighting known Ub-interacting modules is shown to scale and is proportional to the size of the ORF.

(E) Left: immunoblot (anti-myc) of Gal4-DBD fusions of vector alone (ø) or fused to the diUb variants. Right: growth of PJY69/Y187 diploids harboring the Gal4-AD empty library plasmid and the indicated Gal4-DBD-diUb bait plasmids.

(F) Y2H-positive Gal4-AD-cDNA clones containing Usp5, Mysm1, and Dcn1 were reconstructed from sequence data and tested in a traditional Y2H format. A positive Y2H interaction was measured on plates lacking His, alone and in the presence of 3 mM 3AT.

After recreating the corresponding Gal4-AD-cDNA clones for Usp5, Mysm1, and Dcn1 (Figure 2C), we found that all had a specific Y2H interaction with diUb in a traditional Y2H format (Figure 3F). DEEPN also indicated several proteins and protein fragments that were not previously known to bind Ub. Further validation showed that 8 of the 11 candidates interacted specifically with Ub using a traditional Y2H assay (Figure 4A). These proteins also bound Ub in vitro. For this assessment, the cDNA regions that yielded a positive interaction in the Y2H assay were expressed as recombinant glutathione S-transferase (GST) fusion proteins in bacteria, immobilized on glutathione beads, incubated with K63 polyubiquitin chains or a linear polyubiquitin (Ub5), and analyzed by SDS-PAGE and immunoblotting with anti-Ub antibodies (Figures 4C–4E).

Figure 4. DEEPN Identifies Novel Ub Interactors.


(A) Analysis of junction sequences of novel candidates for interaction with Ub by frequency analysis and traditional Y2H assay. Left: the abundance of each junction read (in parts per million) is plotted as a function of the position of the fusion point within the indicated cDNA in the unselected (gray) and selected (blue) populations. Right: the corresponding Gal4-AD-cDNA plasmid was extrapolated from sequencing data, reconstructed, and tested for Y2H interaction in traditional format. Magenta lines indicate the 3′ regions of the junction site that were predicted to be included in the interacting plasmid based on ReadDepth analysis.

(B) Analysis as in (A), but for candidates that did not pass subsequent verification.

(C) Binding of K63 polyubiquitin chains in a GST pull-down experiment. The indicated GST fusion proteins were immobilized on reduced glutathione (GSH) beads and incubated with K63-linked polyubiquitin chains. Beads were washed and bound protein was immunoblotted with anti-Ub monoclonal antibodies in the presence of a 5% equivalent of input polyubiquitin chains.

(D) GST pull-down experiments as in (C) but using a purified 6 × His-V5 epitope-tagged concatamer of five in-frame (linear) repeats of Ub. Bound fractions, together with a 5% equivalent of input, were immunoblotted with anti-V5 monoclonal antibodies.

(E) Coomassie blue stained gel of bead fractions used in the immunoblots in (C) and (D).

Altogether, these data uncover Ub-binding proteins and modules, providing a molecular lead into how they may function or be regulated. For example, Tdp2 (tyrosyl-DNA phosphodiesterase 2), which repairs double-stranded DNA breaks caused by aberrant topoisomerase activity, has an N-terminal ubiquitin-associated (UBA) domain, which has been proposed to regulate its ability to bind to ubiquitinated topoisomerase (Pommier et al., 2014). However, DEEPN and biochemical data indicate that interaction with Ub can also be mediated by a C-terminal fragment encompassing its catalytic domain. Another Ub interactor was the Rab7 GTPase accelerating protein (GAP) TBC1D15, which also binds the light chain 3 (LC3) protein and is involved in coordinating mitophagy induced by the Ub ligase Parkin. Thus, TBC1D15 may belong to an expanding class of proteins (including optineurin, p62, and NBR1) that bind both LC3 and Ub to properly convey substrates into autophagosomes (Johansen and Lamark, 2011; Kirkin et al., 2009). Ub binding might orient TBC1D15 to particular subdomains of the growing phagophore, where regulation of Rab7 is needed (Yamano et al., 2014). Another Ub interactor was FKBP6, which works with Hsp90 to load piRNAs onto PIWI proteins as a means to suppress the transcription of transposons during spermatogenesis (Watanabe and Lin, 2014; Xiol et al., 2012). Ubiquitination plays roles in both degradation of piwi-interacting RNA (piRNA)-PIWI complexes and specification of the loading of piRNAs into PIWI proteins (Anand and Kai, 2012; Hein et al., 2015; Zhang et al., 2011). Finally, Ub bound the N-terminal portion of Hip1, an ANTH domain-containing protein that belongs to a larger family of structurally related proteins containing ENTH and VHS domains that are known to have a variety of Ub-binding modules (De Craene et al., 2012; Legendre-Guillemin et al., 2004).

Overall, the analysis with diUb revealed important aspects of how DEEPN performs. Sequence analysis of the populations showed that changes in the abundance of genes with a positive Y2H interaction were apparent only after the second round of growth, indicating that multiple rounds of division are required for His+/Y2H-positive cells to outgrow their competitors (Figures 3B and 3C). Data also demonstrated the advantage of using low-stringency conditions for selection, because many genes that were enriched under low-stringency selection were crowded out under higher-stringency selection (addition of 3AT to the media) (Figure S1A). Examples include Usp5, Mysm1, and Dcn1 (Figure 3A), which disappeared from populations grown in 3AT, even though each could support growth on 3AT in a traditional Y2H assay (Figure 3F).

One initially surprising finding was that some candidates that appeared to be specific Ub interactors on the basis of their DEEPN enrichment profiles interacted non-specifically when tested in a traditional Y2H assay (Figure 4B). Moreover, in the cases of some prey plasmids (e.g., those encoding fragments of Arhgdib and CD48), interaction with the Gal4-DBD vector alone was even stronger than that with the Gal4-DBD-diUb fusions. This led us to question why the initial gene enrichment data did not identify Arhgdib and CD48 as non-specific interactors. Statistical analysis found that much of this discrepancy could be explained by noise in the system, and it motivated us to develop ways to minimize this problem (discussed later). Such noise greatly affects plasmids that are far less abundant in the initial population, as was the case for Mysm1, Khdc1, Aup1, Usp3, and N4bp2. Genes such as Poli and Stambp were missed by the simple stratification criteria in Figure 3A but were found in more sophisticated statistical ranking calculations (Table S2). Finally, genes such as Khdc1 and Stambp/AMSH likely showed discrepant results because of their specificity for K63 polyubiquitin over other topologies, a feature that the bait of diUb with the three-residue spacer was designed to mimic.

DEEPN as a Tool to Identify Conformation-Specific Interacting Partners

A feature of the DEEPN approach is that by validating the generation of separate yeast populations that contain different bait plasmids but comparable distributions of the Gal4-AD-cDNA library and then following those populations under selection, one can simultaneously detect interactions that are strong in the case of one bait but weak in the case of another. We found that DEEPN could discriminate interactors that were specific for the guanosine triphosphate (GTP)-bound conformation of Rab5 versus its inactive guanosine diphosphate (GDP)-bound conformation, which can be achieved using a Q79L versus S34N substitution mutation, respectively (Bucci et al., 1992). Initially, lists of candidates were sorted based on enrichment on either Rab5-QL or Rab5-SN but de-enrichment on vector alone. Subsequently, gene enrichment data were evaluated with a statistical model (described later). These ranked candidates were then filtered for those that contain protein-encoding regions in the proper translational frame, yielding a subset summarized in Figure 5 and Table S3.
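The initial sorting step, enrichment on either bait combined with de-enrichment on vector alone, can be sketched as a simple filter. This is an illustration rather than the procedure used to build Figure 5: the threshold, the pseudocount, the dictionary keys, and the toy abundances are all assumptions, and `bait_a` and `bait_b` stand in for the two bait forms being compared.

```python
import math

def log2_fold(selected_ppm, unselected_ppm, pseudo=1.0):
    """log2 change in abundance, with an assumed pseudocount for stability."""
    return math.log2((selected_ppm + pseudo) / (unselected_ppm + pseudo))

def differential_candidates(table, min_log2=1.0):
    """Keep genes de-enriched on vector and enriched on at least one bait.

    table maps gene -> ppm values under the keys 'unselected', 'vector',
    'bait_a', and 'bait_b' (hypothetical layout). Results are sorted by the
    size of the difference between the two baits.
    """
    hits = []
    for gene, ppm in table.items():
        vector = log2_fold(ppm["vector"], ppm["unselected"])
        a = log2_fold(ppm["bait_a"], ppm["unselected"])
        b = log2_fold(ppm["bait_b"], ppm["unselected"])
        if vector < 0 and max(a, b) >= min_log2:
            hits.append((gene, a - b))
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)

# Toy abundances in ppm; gene names and numbers are made up.
table = {
    "gene1": {"unselected": 10, "vector": 2, "bait_a": 400, "bait_b": 3},
    "gene2": {"unselected": 50, "vector": 300, "bait_a": 350, "bait_b": 320},
    "gene3": {"unselected": 20, "vector": 5, "bait_a": 90, "bait_b": 100},
}
hits = differential_candidates(table)
for gene, diff in hits:
    print(gene, round(diff, 2))
```

In this toy table, gene2 is dropped because it is also enriched on vector alone, gene1 ranks first as strongly selective for one bait, and gene3 is retained as an interactor that does not distinguish between the two baits.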

Figure 5. DEEPN Identifies Rab5 Interactors that Distinguish between the GTP- and the GDP-Bound Conformations.


(A) Heatmap display of genes enriched for Y2H interaction with the Gal4-DBD, alone (ΔVect) or fused to Rab5 locked into either the GTP-bound conformation by the Q79L mutation or the GDP-bound conformation by the S34N mutation. The double asterisk indicates that the candidate was out of the proper translational reading frame.

(B) Expression of Gal4-DBD fusion proteins of Rab5(SN)GDP and Rab5(QL)GTP by immunoblot.

(C) Analysis of junction sequences of Rab5-GTP interactors. Left: the abundance of each junction read (in parts per million) plotted as a function of the position of the fusion within the indicated cDNA for the unselected starting population (gray) and the population selected on the RabGTP-Q79L bait and the RabGDP-S34N bait fusion proteins (blue). Right: the corresponding Gal4-AD-cDNA plasmid was extrapolated from sequencing data, reconstructed, and verified for Y2H interaction by growth on plates.

(D) Analysis of junction sequences for known Rab5GDP interactors that were not identified by DEEPN.

(E) cDNA fragments for EEA1 and ZFyve20 that were present in the unselected library population. Arrowheads denote cDNA fusions that are out of the proper translational frame.

We validated these high-ranking candidates with a subsequent traditional Y2H assay and subjected lower-ranked candidates to Y2H assays to gauge the predictive value of the ranking system. This analysis confirmed that RabEP1/Rabaptin and the phosphatidylinositol phosphatase Inpp5b interacted with the GTP-bound form of Rab5. Previous experiments showed that Rab5 is required for the localization of Inpp5b to endosomes and that Inpp5b interacts with another Rab5 effector, Appl1 (Bohdanowicz et al., 2012). These DEEPN data suggest that Rab5 GTP can interact with Inpp5b directly, in addition to binding indirectly via Appl1. Two other candidates, Alcam and Pcgf6, were ranked highly for differential interaction with Rab5-QL(GTP) versus Rab5-SN(GDP) and were able to recapitulate that behavior when assessed in isolated Y2H format. However, validation assays showed considerable interaction with vector alone despite a relatively low indication of this from the DEEPN data. As discussed later, this can be explained partly by noise inherent in the selection and expansion process of Y2H-positive clones. Another contributor would be that these particular candidates competed relatively poorly against other prey plasmids in conditions selecting for interaction with the Gal4-DBD vector alone versus when selected with Gal4-DBD-Rab5 fusions. Notwithstanding this discrepancy in the results for DEEPN and isolated Y2H with regard to specificity, the correlation in behavior of candidates in the two analyses was appreciable.

Our analysis did not identify other known Rab5-interacting proteins, such as Appl1, Zfyve20/Rabenosyn, and EEA1. However, computational analysis of the available clones within the Gal4-AD-cDNA library revealed that no clones encoding these proteins were present to mediate a Y2H interaction with Rab5 (Figure 5E). This was due to the absence of the gene in the library (e.g., Appl1), fusions to the 3′ UTR of the cDNA, or clones that were fused to regions of the ORF but were out of frame. The results suggest a way to consider negative results when using DEEPN.

One of the strongest candidates for a Rab5-GTP interactor based on our gene enrichment data was RabEP1/Rabaptin, a well-known Rab5 effector (Stenmark et al., 1995). However, analysis of the junction reads showed that the interacting portion of the Rabep1 cDNA was out of frame with the upstream Gal4-AD protein (Figure 6). Additional experiments showed that the positive Y2H interaction was due to a low level of frameshifting into the +1 reading frame encoding RabEP1. Three Gal4-AD-RabEP1 fusion plasmids were made: one that reconstructed the out-of-frame version deduced from the deep-sequence data, one in which we took the out-of-frame version and inserted a stop codon farther downstream in the +1 reading frame, and another in which RabEP1 was in the same reading frame as Gal4-AD (Figure 6A). Immunoblot analysis of these hemagglutinin (HA)-epitope tagged Gal4-AD fusion proteins revealed that the in-frame RabEP1 fusion (IF) produced a single protein of ∼72 kDa but that both out-of-frame Rabep1 fusion plasmids produced a protein of only 27 kDa, the size of Gal4-AD alone; no larger species could be detected by immunoblotting, even upon extended exposure (Figure 6B). Nonetheless, in the context of an isolated Y2H assay, the out-of-frame Gal4-AD-RabEP1 plasmid corresponding to that in the Y2H library gave a clear Y2H-positive interaction specific for the GTP-bound form of Rab5 (Figure 6C). In contrast, interrupting the +1 reading frame encoding the Rabep1 protein destroyed the Y2H interaction. As expected, the Gal4-AD fusion in which Rabep1 was in frame also resulted in a positive Y2H interaction that was stronger than the original out-of-frame Rabep1 plasmid. Many previous studies showed that translational frameshifting can take place at a low rate and can be promoted by mutations in the translation machinery and by particular sequences around the shift site (Atkins and Björk, 2009).
Regardless of the precise mechanism, the discovery of this protein-protein interaction by DEEPN when the levels of the interacting partner were too low to detect is remarkable. These findings illustrate that the DEEPN method is extremely sensitive, and they caution against automatically eliminating candidates based on reading frame alone.
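The reading-frame bookkeeping behind this caveat can be sketched in a few lines. This is an illustrative sketch only, not the DEEPN software's junction analysis: the function name, its arguments, and the coordinate convention (insert position measured in nucleotides from the first base of the prey ORF) are assumptions made for the example.

```python
def classify_junction(insert_start, orf_length):
    """Classify a Gal4-AD/cDNA junction relative to the prey ORF.

    insert_start: hypothetical coordinate, in nucleotides from the first base
        of the ORF's start codon, of the insert base that the upstream
        Gal4-AD reading frame reads as codon position 1. Negative values
        mean the insert begins in the 5' UTR.
    orf_length: ORF length in nucleotides.
    """
    if insert_start >= orf_length:
        # The insert contains only sequence downstream of the coding region.
        return "3' of ORF"
    if insert_start % 3 == 0:
        # Gal4-AD codons line up with ORF codons. For negative positions the
        # frame still matches, although 5' UTR stop codons may intervene.
        return "in frame"
    return "out of frame"


if __name__ == "__main__":
    for start in (300, 301, 950):
        print(start, classify_junction(start, orf_length=900))
```

Given the RabEP1 result, a label of "out of frame" from a check like this is better treated as a flag for manual review than as grounds for discarding a strongly enriched candidate.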

Figure 6. DEEPN Identifies Rabep1/Rabaptin as a Rab5-GTP Interactor despite a Frameshift.

(A) Sequences of Gal4-AD-Rabep1 fusion plasmids. RabEP1-S1 is the original plasmid deduced by DEEPN analysis and shows RabEP1 out of frame with upstream Gal4-AD. RabEP1-S2 has a second stop codon in the alternative frame, which is that of RabEP1. In RabEP1-IF, RabEP1 is in frame with upstream Gal4-AD. Shown are the Sanger sequence chromatograms and the deduced amino acid sequence of the C-terminal end of Gal4-AD (red) and the N terminus of the Rabep1 cDNA insert (blue).

(B) Immunoblot (α-HA) of the Gal4-AD-RabEP1 fusion plasmids that were described in (A).

(C) Isolated Y2H assay with the three GAL4-AD-Rabep1 fusion plasmids described in (A) and the Gal4-DBD vector, alone (ø) or fused to the Rab5-QL mutant in the GTP conformation or the Rab5-SN mutant in the GDP conformation. Growth of serially diluted PJY69/Y187 diploid yeast under non-selective conditions (+His), low-stringency selection (−His), or higher-stringency selection (−His+3AT).

Statistical Analysis of Gene Enrichment

Although the simple descriptive techniques of sorting and filtering the number of reads mapped to a given gene can be used to identify candidates with large enrichments and de-enrichments during selective growth, it was apparent for many reasons that analysis would benefit from a more sophisticated statistical model. In some instances, the prey construct simply maintained its abundance in the context of selection for interaction with the bait protein of interest but was eliminated from the population under selection for interaction with vector alone. Such patterns are difficult to detect using simple sorting techniques but are consistent with the possibility that the candidate interacting prey protein has a specific yet modest binding affinity for the bait protein of interest. Statistical models also provide a systematic method to account for the variability of count data depending on the mean. For example, an observed 2-fold enrichment from 1 to 2 ppm is less convincing than a 2-fold enrichment from 1,000 to 2,000 ppm. Statistical inference could calculate the probability of a true bait-target interaction and indicate how reliable each signal is and how likely it is to be replicated. Finally, statistical modeling provides a measure of the experimental noise, which is useful for troubleshooting.

Models for characterizing the digital nature of count data from sequencing experiments often use a negative binomial distribution (Auer and Doerge, 2010). This distribution is naturally discrete and describes the probability of observing exact counts and zero counts. In addition, it provides a simple relationship between the mean and the variance of a given count. For a negative binomial distribution, the variance divided by the mean is given by 1 + μϕ, where μ is the expected number of counts (directly proportional to the abundance of the target) and ϕ is a quantity known as the overdispersion parameter. The special case ϕ = 0 is known as the Poisson distribution—in a sense, the mathematical ideal of a count distribution, in which the only factors that increase variability are those inherent to observing count processes. In real experiments, however, variability is larger than the predicted Poisson distribution and is accommodated using the overdispersion parameter.
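The mean-variance relationship described above can be made concrete with a short numerical sketch. The function names are illustrative, μ is taken as an expected read count, and the overdispersion values are those reported for the unselected and selected populations in this study, assuming they use the same parameterization as the 1 + μϕ expression.

```python
def nb_variance(mu, phi):
    """Variance of a negative binomial count with mean mu and overdispersion
    phi, using variance / mean = 1 + mu * phi."""
    return mu * (1.0 + mu * phi)


def coefficient_of_variation(mu, phi):
    """Standard deviation divided by the mean for the same distribution."""
    return nb_variance(mu, phi) ** 0.5 / mu


if __name__ == "__main__":
    # phi = 0 is the Poisson case; 0.06 and 3.65 are the values reported
    # for unselected and selected populations, respectively.
    for phi in (0.0, 0.06, 3.65):
        low = coefficient_of_variation(10, phi)
        high = coefficient_of_variation(10000, phi)
        print(f"phi={phi}: CV at mu=10 is {low:.3f}, CV at mu=10000 is {high:.3f}")
```

Because the coefficient of variation equals the square root of (1/μ + ϕ), deeper sequencing reduces only the 1/μ term. With heavy overdispersion the relative noise stays large however many reads are collected.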

For the DEEPN experiments here, we made the simplifying assumption that the overdispersion parameter is the same for all genes in an experiment. This enabled us to estimate the experiment-wide overdispersion using just two control replicates (i.e., vector alone). A Bayesian hierarchical model was fit to both the Ub and the Rab5 DEEPN datasets to estimate specificity of Y2H interactions. In addition, for the Rab5 datasets, we evaluated the posterior probabilities of specific binding to Rab5 in each of the two conformations (Rab5-QL versus Rab5-SN). This procedure was codified into a standardized software program termed StatMaker (Figure 2). The resulting rankings agreed well with the empirical results, demonstrating that the statistical model is useful for identifying candidates for further validation (Tables S1 and S3). For instance, higher probabilities were calculated for the Ub interactors that were verified (Rev1, Usp5, Fkbp6, and Birc2) than for those that were not (Cd48 and Arhgdib). Likewise, the verified Rab5-QL (GTP)-specific interactors (Rabep1 and Inpp5b) scored better than candidates that failed (Ddx10, Rpn2, and Cntn1).

An important aspect of this analysis is that the level of variability between duplicate experiments under selective conditions was unexpectedly high, as seen in the data for two separate experiments using the Gal4-DBD vector alone. This overdispersion was not due to differences in the initial unselected populations, which were highly reproducible (Figure 1) with essentially no overdispersion (φ = 0.06). However, populations grown under selective conditions were characterized by heavy overdispersion (φ = 3.65). This degree of overdispersion substantially diminishes the confidence with which one can assign specific binding partners and is responsible for substantial statistical uncertainty for many of the strongest DEEPN signals (Table S3).
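To illustrate how an experiment-wide overdispersion can be read from two replicates, the sketch below uses a simple method-of-moments estimator. This is a substituted technique chosen for brevity, not the Bayesian hierarchical model used in this study. The function name and the input layout (one tuple of replicate counts per gene, already scaled to equal sequencing depth) are assumptions.

```python
def estimate_overdispersion(replicate_counts):
    """Pooled method-of-moments estimate of a shared overdispersion phi.

    replicate_counts: iterable of per-gene tuples of counts from replicate
        experiments (at least two replicates per gene, equal depth assumed).
    Uses var = mu + phi * mu**2, so phi ~ sum(s2 - m) / sum(m**2) over genes.
    """
    numerator = 0.0
    denominator = 0.0
    for counts in replicate_counts:
        n = len(counts)
        mean = sum(counts) / n
        # Unbiased sample variance across the replicates of this gene.
        s2 = sum((c - mean) ** 2 for c in counts) / (n - 1)
        numerator += s2 - mean
        denominator += mean ** 2
    if denominator == 0.0:
        return 0.0
    # Negative estimates (less noise than Poisson) are clipped to zero.
    return max(0.0, numerator / denominator)


if __name__ == "__main__":
    toy = [(10, 30), (200, 260), (5, 5), (1000, 400)]  # made-up counts
    print(estimate_overdispersion(toy))
```

An estimator of this kind is noisy with only two replicates, which is one motivation for pooling information across all genes as the simplifying assumption above allows.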

A possible source of variability is that the initial sub-population of yeast subjected to Y2H selection was too small. This could be remedied in two ways. First, genes present at low abundance in the population could be computationally eliminated, because the fold enrichment they would undergo would be more susceptible to undersampling in the starting population. Second, the scheme for sampling cells that are grown under selection could be enhanced. The first possibility was explored by imposing a threshold value on the gene enrichment data so that only genes with a minimum parts per million value in the starting unselected population were ranked. Overdispersion decreased to φ = 2.86 when a parts per million threshold of 50 was imposed, down from φ = 3.65 when no threshold was used. However, this moderate improvement also diminished the number of candidates that could be evaluated, because fewer met that threshold (Table S3). To enhance sampling, we evaluated the effect of sample size on reproducibility when generating populations under selective conditions. In the preceding experiments, we generated a population of library-containing diploids that would undergo selection by taking 2 mL (4 optical density 600 [OD600], or ∼1 × 10^8 cells) of the starting saturated unselected diploid population and diluting that to 75 mL in selective SD-His media. The initial round of selective (SD-His) growth was always delayed compared to that in the non-selective SD+His media because of the smaller proportion of yeast that could produce His3 and grow in SD-His. Calculating from the initial library distribution in the unselected populations, this sampling of 1 × 10^8 cells contained between 1 million and 2 million cells that were Y2H positive, indicating that this was the main evolutionary bottleneck. Thus, we increased the sample size of the yeast population in the first dilution growth phase by diluting 20 mL of cells into 750 mL of media. 
As shown in Figure 7, the larger sampling size yielded gene counts after selection that were far more reproducible. The estimated overdispersion under the increased sample size scheme dropped from 3.65 to 1.05 (>3-fold reduction). Thus, sampling size is an important parameter that must be considered if DEEPN is to be optimally executed.
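The effect of the sampling bottleneck can be approximated with simple arithmetic. The sketch below is a back-of-the-envelope illustration under stated assumptions: founders are treated as a Poisson sample, 1.5% of sampled cells are taken to be Y2H positive (the midpoint of the 1 million to 2 million per 1 × 10^8 cells estimated above), and a rare plasmid is assumed to make up 0.002% of Y2H-positive cells. The function names are hypothetical.

```python
import math


def expected_founders(cells_sampled, positive_fraction, plasmid_share):
    """Expected number of sampled cells carrying a given Y2H-positive plasmid."""
    return cells_sampled * positive_fraction * plasmid_share


def poisson_relative_error(expected_count):
    """Relative sampling error (SD / mean) for a Poisson-distributed count."""
    return 1.0 / math.sqrt(expected_count)


if __name__ == "__main__":
    for label, cells in (("2 mL sample", 1e8), ("20 mL sample", 1e9)):
        founders = expected_founders(cells, positive_fraction=0.015,
                                     plasmid_share=2e-5)
        error = poisson_relative_error(founders)
        print(f"{label}: ~{founders:.0f} founders, relative error {error:.2f}")
```

Under these assumptions a 10-fold larger sample shrinks founder-number noise by the square root of 10. That accounts for part of the improved reproducibility, although it does not model growth competition during selection.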

Figure 7. Reproducing Evolved Populations.

(A) Scheme for examining sample size and the populations resulting from growth in the presence or absence of Y2H selection. The 2.6 × 10^7 independent diploids, formed when mating cells with the Gal4-DBD bait vector alone to a Gal4-AD-cDNA library, were grown to saturation in 500 mL and subjected to sequential rounds of growth. Large circles represent a dilution of 20 mL into 750 mL of minimal media, and small circles represent a dilution of 2 mL into 75 mL of minimal media. The abundance of each gene within the cDNA library of each population was correlated as indicated.

(B) Graphic representation of correlation of the abundance of each gene in the cDNA library of the starting population with that in the ending populations in (A). In media with no selection for a Y2H interaction (+His), the population is stable. However, under selection (−His), the use of a larger sample (20 versus 2 mL) results in a more reproducible population of library plasmids.

Discussion

The fidelity of DEEPN is underpinned by reproducibly introducing the same Y2H prey library into strains carrying different bait plasmids and by sampling sufficient volumes of the resulting yeast populations to give consistent results after selection for positive Y2H interactions. Large-scale mating (Figure 1) ensures adequate generation of the starting populations, and sampling a large proportion of this population yields similar evolutionary results after selection (Figure 7). Once these procedures are in place, our data demonstrate that following differential growth rates in a pool of yeast transformants greatly facilitates identification of candidate binding partners that faithfully reproduce an interaction within a traditional Y2H format. With the DEEPN software, the protein fragment responsible for interaction can be quickly deduced and the corresponding Y2H prey plasmid can be reconstructed de novo for further verification and analysis. Altogether, these tools solve the logistical challenges associated with following many Y2H-positive interactions at once and provide a way to directly compare the repertoire of interactors that have specificity for one bait protein over another.

Provided the read or competition space is large enough, sequencing allows one to monitor the enrichment of multiple candidates simultaneously, without overwhelming interference from a small subset of high-abundance Y2H-positive plasmids. This was especially helpful in identifying interactors of Ub, which has multiple partners. With a traditional Y2H assay, it would be necessary to screen many individual positive colonies to identify novel interacting candidates. We calculated that plasmids encoding the Ub-binding proteins Usp5 and Mysm1 made up only ∼0.02% and ∼0.002%, respectively, of the plasmids in the starting population that gave a Y2H-positive interaction. With such low proportions, surveys of 1.5 × 10^4 and 1.5 × 10^5 colonies would be required to have a 95% chance of detecting Usp5 and Mysm1, respectively, in a traditional Y2H format. Because DEEPN is dynamic, competitive growth provides a way for rare plasmids within the library to amplify and be reliably detected. The reproducibility of that process is increased when a larger sampling size is used: one that contains an estimated 10 million to 20 million Y2H-positive cells within a larger cohort of ∼1 × 10^9 cells.
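The colony-survey estimate rests on a standard calculation: if each Y2H-positive colony independently carries the plasmid of interest with probability p, the chance of seeing it at least once among n colonies is 1 − (1 − p)^n. The sketch below solves for n. The function name is illustrative, and independence between colonies is an assumption.

```python
import math


def colonies_needed(plasmid_fraction, detection_probability=0.95):
    """Smallest number of colonies giving the requested chance of observing
    at least one colony that carries a plasmid present at plasmid_fraction."""
    # Solve 1 - (1 - p)**n >= P for n; log1p keeps precision for tiny p.
    n = math.log(1.0 - detection_probability) / math.log1p(-plasmid_fraction)
    return math.ceil(n)


if __name__ == "__main__":
    for name, fraction in (("Usp5", 0.0002), ("Mysm1", 0.00002)):
        print(name, colonies_needed(fraction))
```

For fractions of 0.02% and 0.002%, this gives roughly 1.5 × 10^4 and 1.5 × 10^5 colonies, in line with the figure of more than 150 plates of 1,000 colonies each cited for Mysm1.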

Previous studies have applied NGS to identify genes within colonies produced in a traditional Y2H format by scraping the resulting colonies together into sets of pools followed by sequencing (Lewis et al., 2012), which has been followed with several refinements (Weimann et al., 2013; Yu et al., 2011). Rather than the scaled response of DEEPN data, colony growth offers a more digital response. For methods that use controlled bait-prey cell matings to produce normalized confined pools of a relatively small ORFeome library, the use of an NGS Y2H-sequencing method to survey colonies is effective (Weimann et al., 2013; Yu et al., 2011). However, complex prey libraries with millions of different prey plasmids offer the benefit of breaking down known genes into domains, which often bind better to their biological partners than they do within full-length proteins (Boxem et al., 2008). Complex libraries are also the norm for most Y2H libraries available. Still, such complex libraries cannot be maintained in arrays and must be used in large pools, in which the proportion of each prey plasmid can be skewed and the identity of each plasmid will be ambiguous if it is based solely on mapping NGS reads to a genome. DEEPN works around these problems by following dynamic changes within a complex pool of prey plasmids and using computation to extract the identity of the particular prey inserts that confer Y2H-positive interactions. As exemplified earlier for detecting Mysm1, more than 150 plates containing 1,000 colonies each would need to be pooled just to detect Mysm1 as a Ub interactor, let alone the full list of interactors in a single batch-processed DEEPN approach. Moreover, DEEPN required large sample sizes (∼1 × 10^9) of Y2H cells to be placed under selection to provide results with low overdispersion so that bait-to-bait comparisons could be made. 
Another aspect of DEEPN is that it keeps track of the library distribution within separate unselected populations containing different bait proteins so that meaningful enrichment measurements can be made. However, bootstrapping simulations revealed to us that more than 1 million colonies of even the exact same library population would be needed to verify the distribution of components to the same level of variability that DEEPN obtained across true experimental replicates. Such numbers would overtax any plate-type format for growth. Nonetheless, extensive use of a given Y2H library might be enhanced by an additional effort to capture some non-specific prey clones that interact with vector alone using a colony-based digital approach, thus providing identification of clones that interact with vector-alone but compete poorly against vector-specific Y2H-interacting clones (Figure S1C).

With the ability of DEEPN to comprehensively sample and track the prey library, its main limitation is the composition of the Y2H prey library. Much of the commercial library used here contained fragments of 3′ UTRs (Figure S1C). We found that while the library was composed of just over 1 million different plasmids, more than 50% contained only regions that were 3′ of the coding sequence. In addition, 18% of all genes in the library were exclusively represented by 3′ UTRs. We see the potential for a more effective DEEPN approach if library quality could be increased by eliminating irrelevant plasmids and normalized better in favor of rare mRNAs. One possibility would be to use exome-capture technologies to obtain cDNAs that contain only coding regions and that are normalized per exon (Mercer et al., 2014). Another would be to use libraries constructed from randomly sheared fragments of ORFeome collections (Waaijers et al., 2013). Another improvement would be to sequence the entire insert of every plasmid within the library population, made possible using long-read sequencing technology (Rhoads and Au, 2015). A targeted-resequencing strategy focused on just the junctions that bridge Gal4-AD with the cDNA fragment (acting as a unique molecular identifier) would provide enough information to enable identification of the whole plasmid and allow more efficient use of sequencing. Finally, as sequencing costs continue to decrease, it may be useful to monitor the levels of each plasmid multiple times during selection so that expansion and contraction trends can be better tracked well before the competition space within the population approaches exhaustion.

Experimental Procedures

Detailed protocols and computational software for DEEPN can be found at https://github.com/emptyewer/DEEPN/releases.

Materials: Antibodies, Y2H Library, Plasmids, and Strains

Monoclonal anti-HA antibodies were purchased from BioLegend (cat# 901514). Polyclonal anti-myc antibodies were purchased from QED Biosciences (cat# 18826). Monoclonal anti-6 × His antibodies were purchased from GenScript (cat# A00186). Monoclonal V5 antibodies were purchased from Novex by Life Technologies (Thermo Fisher Scientific, cat# R96025). Monoclonal Ub antibodies were purchased from Santa Cruz Biotechnology (cat# sc-8017). The normalized universal mouse cDNA library Mate&Plate was purchased from Clontech Laboratories (cat# 630483). Plasmids and strains used are listed in Table S1.

Construction and Validation of Gal4-DBD Plasmids

Expression of Gal4-DBD (residues 1–147) fusion bait proteins was accomplished with the TRP1- and Kanr-containing plasmid pGBKT7. DNA fragments encoding proteins of interest to be cloned downstream of the Gal4 DNA binding domain region were made by gene synthesis using the codon bias of S. cerevisiae (gBlocks, Integrated DNA Technologies, or Strings, Thermo Fisher Scientific) and cloned into pGBKT7 cut with EcoRI and BamHI using the Gibson Assembly Master Mix kit (NEB). For immunoblotting, cells were resuspended in 1 mL of 0.2 N NaOH for 5 min at 25°C, repelleted, and solubilized in 100 μL of 8 M urea, 5% SDS, and 10 mM Tris (pH 6.8) before SDS-PAGE.

To check for baseline production of His3 from different bait plasmids in PJ69-4A/Y187 diploids, growth was monitored on SD-His plates containing 0, 0.2, 0.5, and/or 1 mM 3AT.

Mating and Selection

An optimized procedure for mating the pGBKT7-transformed PJ69-4A (PJY69) strain with the library-containing Y187 strain was developed, because Y187 does not mate efficiently in comparison to other strains such as BY4742, SEY6210, and W303. PJ69-4A and Y187 transformants were grown in SD-Trp (tryptophan) or SD-Leu (leucine), respectively, to an OD600 of 1.0–1.5. Cells were pelleted and resuspended in YPDA (YPD-rich media supplemented with 100 μM adenine) to 5 OD600/mL (PJ69-4A) and 3.5 OD600/mL (Y187). A mixture containing 1 mL from each cell suspension was incubated in a 50 mL conical tube with gentle agitation for 90 min at 30°C. Cells were pelleted and spread onto a 150 mm YPD agar plate and incubated for 12 hr. Cells were harvested in 40 mL SD-Leu-Trp media, titrated for the number of diploids present, and grown for 36 hr in 500 mL SD-Leu-Trp media at 30°C until saturation (∼2.0 OD/mL).

To generate selected and unselected yeast populations, 20 mL of the 500 mL SD-Leu-Trp culture were diluted in 750 mL SD-His-Leu-Trp and 750 mL SD-Leu-Trp media, respectively. Cultures were grown until they reached saturation (OD600 of ∼2), typically for 35–40 hr for cells in SD-His-Leu-Trp media. A 10 mL sample was removed for analysis of gene distribution in the unselected library, pelleted, and stored at −20°C. A 2 mL sample of the saturated SD-His-Leu-Trp culture was further diluted into 75 mL and grown to saturation, after which 10 mL were removed to analyze the library distribution following selection and the cells were pelleted and stored at −20°C.

Sample Preparation and Sequencing

Yeast pellets were resuspended in 500 μL of 50 mM Tris/20 mM EDTA and 1% β-mercaptoethanol and digested with zymolyase at 37°C for 24–26 hr. Samples were extracted with phenol/chloroform/isoamyl alcohol, ethanol precipitated, resuspended in 100 μL of 50 mM Tris/20 mM EDTA containing 10 μg RNase A, incubated for 1 hr at 37°C, and ethanol precipitated after the addition of 10 μL of 4 M NaCl. After resuspension in 100 μL TE (10 mM Tris, pH 8.0, 1 mM EDTA), 5 μg DNA was used to PCR across the cDNA inserts using 50 μL of NEBNext High-Fidelity Master Mix for 25 cycles with extension times of 3 min. Then, 500 ng of PCR product (purified using the QIAquick PCR purification kit) was sheared using a Covaris E220, yielding fragments of an average length of ∼300 bp. Indexed sequencing libraries were generated using the KAPA Hyper Prep kit (KAPA Biosystems, cat# KK8500) for Illumina sequencing to add linkers encoding barcodes, priming sites, and capture sequences asymmetrically to the ends of the DNA fragments. Indexed libraries were then pooled and sequenced using Illumina 2 × 100 nt SBS v3 chemistry and run on an Illumina HiSeq2500.

Computational Analysis

Illumina reads were mapped using Tophat2 (https://ccb.jhu.edu/software/tophat/index.shtml), using the mouse GRCm38/mm10 database (http://genome.ucsc.edu). Output was to .sam files containing separate mapped and unmapped reads, which were further processed and analyzed using the DEEPN application (https://github.com/emptyewer/DEEPN/releases) as outlined in Figure 2.
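As a minimal illustration of what separating mapped from unmapped reads involves at the file level, the sketch below partitions SAM records using the standard FLAG field, in which bit 0x4 marks an unmapped segment. It is not code from the DEEPN application. The function name and the toy records are made up for the example.

```python
def split_sam_records(lines):
    """Partition SAM alignment lines into (mapped, unmapped) lists.

    Each record is returned as its list of tab-separated fields. Header
    lines (starting with '@') and blank lines are skipped.
    """
    mapped, unmapped = [], []
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # second column of a SAM record is FLAG
        if flag & 0x4:  # bit 0x4 set: the read is unmapped
            unmapped.append(fields)
        else:
            mapped.append(fields)
    return mapped, unmapped


if __name__ == "__main__":
    toy_sam = [
        "@HD\tVN:1.0\tSO:unsorted",
        "read1\t0\tchr1\t100\t50\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
        "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
        "read3\t16\tchr2\t500\t50\t10M\t*\t0\t0\tTTGCATTGCA\tIIIIIIIIII",
    ]
    mapped, unmapped = split_sam_records(toy_sam)
    print(len(mapped), "mapped;", len(unmapped), "unmapped")
```

In a workflow of this kind, mapped reads would feed gene-level counts, while unmapped reads are the natural place to look for reads that span the Gal4-AD/cDNA junction.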

Statistical Analysis

Bayesian negative binomial regression models with log links were fit to the data on the raw-count scale. Hierarchical modeling was used to estimate the dispersion parameter and assess the degree of shrinkage for the estimated enrichment ratios. The posterior probability that enrichment in the presence of bait exceeded enrichment in vector alone is reported as the probability of bait-target binding; for three-way comparisons, the probability that enrichment for one bait exceeds that for the other, as well as that for vector, was also factored in. All analysis was carried out using R (http://www.r-project.org) and JAGS (http://mcmc-jags.sourceforge.net).
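The idea of a posterior probability that enrichment with bait exceeds enrichment with vector alone can be illustrated with a much simpler stand-in model. The sketch below is not the hierarchical negative binomial model fit in R and JAGS. It treats each count as Poisson with an independent gamma posterior on its rate and ignores overdispersion entirely, so its probabilities will be overconfident for real selected populations. All names, the prior offset of 0.5, and the example counts are assumptions.

```python
import random


def prob_bait_exceeds_vector(bait_unsel, bait_sel, vec_unsel, vec_sel,
                             draws=5000, seed=0):
    """Monte Carlo estimate of P(enrichment with bait > enrichment with vector).

    Each argument is a (gene_count, total_reads) tuple for one population.
    Simplified Poisson-gamma model: rate ~ Gamma(count + 0.5) / total_reads.
    """
    rng = random.Random(seed)

    def sample_rate(count_and_total):
        count, total = count_and_total
        return rng.gammavariate(count + 0.5, 1.0) / total

    hits = 0
    for _ in range(draws):
        bait_enrichment = sample_rate(bait_sel) / sample_rate(bait_unsel)
        vector_enrichment = sample_rate(vec_sel) / sample_rate(vec_unsel)
        if bait_enrichment > vector_enrichment:
            hits += 1
    return hits / draws


if __name__ == "__main__":
    depth = 10_000_000  # made-up sequencing depth per sample
    p = prob_bait_exceeds_vector(bait_unsel=(20, depth), bait_sel=(2000, depth),
                                 vec_unsel=(20, depth), vec_sel=(25, depth))
    print(f"P(bait enrichment > vector enrichment) = {p:.3f}")
```

Replacing the Poisson assumption with a negative binomial likelihood and a shared, estimated dispersion, as described above, widens these posteriors and lowers the reported probabilities for noisy genes.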

Highlights.

  • Parameters of the batch Y2H approach are defined

  • Data processing and statistical analysis software packages are outlined

  • A DEEPN screen identifies multiple proteins that interact with ubiquitin

  • A DEEPN screen differentiates proteins that bind Rab5-GTP versus Rab5-GDP

Acknowledgments

We thank Anna Pashkova and Chelsea Christopher for technical help in verifying Y2H-interacting candidates. We also thank the staff within the Institute of Human Genetics for help with sequence analysis and NGS library preparation. This work was supported by the NIH (5R01GM058202), the Iowa Cardiovascular Interdisciplinary Research Fellowship (T32HL007121), and the NSF Research Project (grant 1517110).

Footnotes

Supplemental information: Supplemental Information includes one figure, three tables, and one data file and can be found with this article online at http://dx.doi.org/10.1016/j.celrep.2016.08.095.

Author Contributions: N.P. and T.A.P. conducted experiments and developed methods. V.K. wrote GUI applications. P.B. developed statistics methods. M.S. and R.C.P. developed computational methods, designed experiments, and wrote the manuscript.

References

  1. Aebersold R, Mann M. Mass spectrometry-based proteomics. Nature. 2003;422:198–207. doi: 10.1038/nature01511.
  2. Anand A, Kai T. The tudor domain protein kumo is required to assemble the nuage and to generate germline piRNAs in Drosophila. EMBO J. 2012;31:870–882. doi: 10.1038/emboj.2011.449.
  3. Atkins JF, Björk GR. A gripping tale of ribosomal frameshifting: extragenic suppressors of frameshift mutations spotlight P-site realignment. Microbiol Mol Biol Rev. 2009;73:178–210. doi: 10.1128/MMBR.00010-08.
  4. Auer PL, Doerge RW. Statistical design and analysis of RNA sequencing data. Genetics. 2010;185:405–416. doi: 10.1534/genetics.110.114983.
  5. Bohdanowicz M, Balkin DM, De Camilli P, Grinstein S. Recruitment of OCRL and Inpp5B to phagosomes by Rab5 and APPL1 depletes phosphoinositides and attenuates Akt signaling. Mol Biol Cell. 2012;23:176–187. doi: 10.1091/mbc.E11-06-0489.
  6. Boxem M, Maliga Z, Klitgord N, Li N, Lemmens I, Mana M, de Lichtervelde L, Mul JD, van de Peut D, Devos M, et al. A protein domain-based interactome network for C. elegans early embryogenesis. Cell. 2008;134:534–545. doi: 10.1016/j.cell.2008.07.009.
  7. Bucci C, Parton RG, Mather IH, Stunnenberg H, Simons K, Hoflack B, Zerial M. The small GTPase rab5 functions as a regulatory factor in the early endocytic pathway. Cell. 1992;70:715–728. doi: 10.1016/0092-8674(92)90306-w.
  8. De Craene JO, Ripp R, Lecompte O, Thompson JD, Poch O, Friant S. Evolutionary analysis of the ENTH/ANTH/VHS protein super-family reveals a coevolution between membrane trafficking and metabolism. BMC Genomics. 2012;13:297. doi: 10.1186/1471-2164-13-297.
  9. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340:245–246. doi: 10.1038/340245a0.
  10. Fraser JS, Gross JD, Krogan NJ. From systems to structure: bridging networks and mechanism. Mol Cell. 2013;49:222–231. doi: 10.1016/j.molcel.2013.01.003.
  11. Hamdan M, Righetti PG. Modern strategies for protein quantification in proteome analysis: advantages and limitations. Mass Spectrom Rev. 2002;21:287–302. doi: 10.1002/mas.10032.
  12. Hein MY, Hubner NC, Poser I, Cox J, Nagaraj N, Toyoda Y, Gak IA, Weisswange I, Mansfeld J, Buchholz F, et al. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell. 2015;163:712–723. doi: 10.1016/j.cell.2015.09.053.
  13. Ito T, Chiba T, Ozawa R, Yoshida M, Hattori M, Sakaki Y. A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci USA. 2001;98:4569–4574. doi: 10.1073/pnas.061034498.
  14. Johansen T, Lamark T. Selective autophagy mediated by autophagic adapter proteins. Autophagy. 2011;7:279–296. doi: 10.4161/auto.7.3.14487.
  15. Kirkin V, McEwan DG, Novak I, Dikic I. A role for ubiquitin in selective autophagy. Mol Cell. 2009;34:259–269. doi: 10.1016/j.molcel.2009.04.026.
  16. Legendre-Guillemin V, Wasiak S, Hussain NK, Angers A, McPherson PS. ENTH/ANTH proteins and clathrin-mediated membrane budding. J Cell Sci. 2004;117:9–18. doi: 10.1242/jcs.00928.
  17. Lewis JD, Wan J, Ford R, Gong Y, Fung P, Nahal H, Wang PW, Desveaux D, Guttman DS. Quantitative interactor screening with next-generation sequencing (QIS-Seq) identifies Arabidopsis thaliana MLO2 as a target of the Pseudomonas syringae type III effector HopZ2. BMC Genomics. 2012;13:8. doi: 10.1186/1471-2164-13-8.
  18. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, Vidalain PO, Han JD, Chesneau A, Hao T, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303:540–543. doi: 10.1126/science.1091403.
  19. Lubec G, Afjehi-Sadat L. Limitations and pitfalls in protein identification by mass spectrometry. Chem Rev. 2007;107:3568–3584. doi: 10.1021/cr068213f.
  20. Mercer TR, Clark MB, Crawford J, Brunck ME, Gerhardt DJ, Taft RJ, Nielsen LK, Dinger ME, Mattick JS. Targeted sequencing for gene discovery and quantification using RNA CaptureSeq. Nat Protoc. 2014;9:989–1009. doi: 10.1038/nprot.2014.058.
  21. Nilsson T, Mann M, Aebersold R, Yates JR 3rd, Bairoch A, Bergeron JJ. Mass spectrometry in high-throughput proteomics: ready for the big time. Nat Methods. 2010;7:681–685. doi: 10.1038/nmeth0910-681.
  22. Pommier Y, Huang SY, Gao R, Das BB, Murai J, Marchand C. Tyrosyl-DNA-phosphodiesterases (TDP1 and TDP2). DNA Repair (Amst). 2014;19:114–129. doi: 10.1016/j.dnarep.2014.03.020.
  23. Rajagopala SV. Mapping the protein-protein interactome networks using yeast two-hybrid screens. Adv Exp Med Biol. 2015;883:187–214. doi: 10.1007/978-3-319-23603-2_11.
  24. Rhoads A, Au KF. PacBio sequencing and its applications. Genomics Proteomics Bioinformatics. 2015;13:278–289. doi: 10.1016/j.gpb.2015.08.002.
  25. Rolland T, Tasan M, Charloteaux B, Pevzner SJ, Zhong Q, Sahni N, Yi S, Lemmens I, Fontanillo C, Mosca R, et al. A proteome-scale map of the human interactome network. Cell. 2014;159:1212–1226. doi: 10.1016/j.cell.2014.10.050.
  26. Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437:1173–1178. doi: 10.1038/nature04209.
  27. Ryan CJ, Cimermancic P, Szpiech ZA, Sali A, Hernandez RD, Krogan NJ. High-resolution network biology: connecting sequence with function. Nat Rev Genet. 2013;14:865–879. doi: 10.1038/nrg3574.
  28. Silberberg Y, Kupiec M, Sharan R. A method for predicting protein-protein interaction types. PLoS ONE. 2014;9:e90904. doi: 10.1371/journal.pone.0090904.
  29. Stenmark H, Vitale G, Ullrich O, Zerial M. Rabaptin-5 is a direct effector of the small GTPase Rab5 in endocytic membrane fusion. Cell. 1995;83:423–432. doi: 10.1016/0092-8674(95)90120-5.
  30. Suter B, Zhang X, Pesce CG, Mendelsohn AR, Dinesh-Kumar SP, Mao JH. Next-generation sequencing for binary protein-protein interactions. Front Genet. 2015;6:346. doi: 10.3389/fgene.2015.00346.
  31. Waaijers S, Koorman T, Kerver J, Boxem M. Identification of human protein interaction domains using an ORFeome-based yeast two-hybrid fragment library. J Proteome Res. 2013;12:3181–3192. doi: 10.1021/pr400047p.
  32. Watanabe T, Lin H. Posttranscriptional regulation of gene expression by Piwi proteins and piRNAs. Mol Cell. 2014;56:18–27. doi: 10.1016/j.molcel.2014.09.012.
  33. Weimann M, Grossmann A, Woodsmith J, Özkan Z, Birth P, Meierhofer D, Benlasfer N, Valovka T, Timmermann B, Wanker EE, et al. A Y2H-seq approach defines the human protein methyltransferase interactome. Nat Methods. 2013;10:339–342. doi: 10.1038/nmeth.2397.
  34. Xiol J, Cora E, Koglgruber R, Chuma S, Subramanian S, Hosokawa M, Reuter M, Yang Z, Berninger P, Palencia A, et al. A role for Fkbp6 and the chaperone machinery in piRNA amplification and transposon silencing. Mol Cell. 2012;47:970–979. doi: 10.1016/j.molcel.2012.07.019.
  35. Yamano K, Fogel AI, Wang C, van der Bliek AM, Youle RJ. Mitochondrial Rab GAPs govern autophagosome biogenesis during mitophagy. eLife. 2014;3:e01612. doi: 10.7554/eLife.01612.
  36. Yu H, Tardivo L, Tam S, Weiner E, Gebreab F, Fan C, Svrzikapa N, Hirozane-Kishikawa T, Rietman E, Yang X, et al. Next-generation sequencing to generate interactome datasets. Nat Methods. 2011;8:478–480. doi: 10.1038/nmeth.1597.
  37. Zhang Z, Xu J, Koppetsch BS, Wang J, Tipping C, Ma S, Weng Z, Theurkauf WE, Zamore PD. Heterotypic piRNA Ping-Pong requires qin, a protein with both E3 ligase and Tudor domains. Mol Cell. 2011;44:572–584. doi: 10.1016/j.molcel.2011.10.011.
