Abstract
The endoplasmic reticulum (ER), the most pervasive organelle, exchanges information and material with many other organelles, but the extent of its inter-organelle connections and the proteins that form bridges are not well known. The integral ER membrane protein VAMP-associated protein (VAP) is found in multiple bridges, interacting with many proteins that contain a short linear motif consisting of “two phenylalanines in an acidic tract” (FFAT). The VAP-FFAT interaction is the most common mechanism by which cytoplasmic proteins, particularly inter-organelle bridges, target the ER. Therefore, predicting new FFAT motifs may both find new individual peripheral ER proteins and identify new routes of communication involving the ER. Here we searched for FFAT motifs across whole proteomes. The excess of eukaryotic proteins with FFAT motifs over background was ≥0.8%, suggesting this is the minimum proportion of the proteome made up of peripheral ER proteins. In yeast, where VAP was previously known to bind 4 proteins with FFAT motifs, a detailed analysis of a subset of proteins predicted 20 FFAT motifs. Extrapolating these findings to the whole proteome estimated the number of FFAT motifs in yeast at approximately 50-55 (0.9% of proteome). Among these previously unstudied FFAT motifs, most are in proteins with known functions outside the ER, so they could be involved in inter-organelle communication. Many of these can target well-characterised membrane contact sites; however, some are in nucleoli and eisosomes, organelles previously unknown to have molecular bridges to the ER. We speculate that the nucleolar and eisosomal proteins with predicted motifs may function while bridging to the ER, indicating novel ER-nucleolus and ER-eisosome routes of inter-organelle communication.
Keywords: endoplasmic reticulum, lipid transfer protein, membrane contact sites, nucleolus, short linear motif, eisosome, VPS13, NOP2/Nsun6, BOP1, UBR4
Introduction
The endoplasmic reticulum (ER) is the most widely distributed organelle in eukaryotic cells. Its functions are created in part by peripheral membrane proteins that reversibly bind its cytoplasmic surface. The sole widely documented mechanism by which cytoplasmic proteins target the ER is through a short linear motif (SLiM) called “two phenylalanines in an acidic tract” (FFAT) (Loewen et al., 2003). Proteins with FFAT motifs communicate between the ER and many other compartments, including plasma membrane, Golgi apparatus, peroxisomes and mitochondria (Hanada et al., 2003; Hynynen et al., 2005; Saita et al., 2009; De Vos et al., 2011; Murphy and Levine, 2016; Costello et al., 2017; Hua et al., 2017; Yadav et al., 2018). This fits the idea that SLiM interactions allow links to arise between seemingly distinct subcellular processes (Kim et al., 2014). Significantly, all proteins in a non-ER location with FFAT motifs have since been found to target both that location and the ER at the same time. They bridge between the two compartments at membrane contact sites (Murphy and Levine, 2016). Thus, finding a FFAT motif is significant because it suggests direct bridging to the ER, rather than the existence of a long-distance intracellular shuttle between two compartments (Kumagai and Hanada, 2019).
FFAT motifs bind to the major sperm protein (MSP, 120 residues) domain of Vesicle-Associated Membrane Protein (VAMP)-Associated Protein (VAP), a protein found in almost all eukaryotes. The MSP domain is linked by a flexible region to the cytoplasmic face of the ER (Skehel et al., 2000). The core of FFAT motifs is an extended peptide of 7 residues, two of which bind into pockets in VAP (Figure 1A) (Kaiser et al., 2005). Prior to the engagement of the core, an acidic tract N-terminal to the core has a low-affinity charge-based interaction with the MSP domain (Furuita et al., 2010). All FFAT motifs that fitted the early, strict definition have since been validated as interacting with VAP, except for one that is located in the lysosomal lumen. Subsequently, predicting new FFAT motifs has been made possible because the definition of FFAT motifs has been expanded through the discovery of new variants (Saita et al., 2009; Mikitova and Levine, 2012; Baron et al., 2014). Further sequence variation has been shown with substitution of anionic positions with serine or threonine (S/T in one-letter notation), which allows reversible phospho-mimicking of motifs (Alpy et al., 2013; Kumagai et al., 2014; McCune et al., 2017; Johnson et al., 2018; Kirmiz et al., 2018).
Domain-SLiM interactions are hard to capture by standard protein-protein interaction techniques, and their discovery rate in databases is estimated at no more than 4% (Davey et al., 2017). Bioinformatics can predict new instances (Krystkowiak and Davey, 2017). Widening the definition of the FFAT pattern allowed the prediction of many new human motifs, which were assessed against three criteria to reduce the number of false positives: (i) cytoplasmic location, (ii) structure allowing formation of an extended loop, also described as an intrinsically disordered region (IDR), (iii) local conservation (Mikitova and Levine, 2012). Many of the FFAT motifs identified by bioinformatics have been verified in high-throughput protein interactomes (Hein et al., 2015; Huttlin et al., 2015), and some have been verified by specific mutagenesis (Costello et al., 2017; Kumar et al., 2018). To identify motifs that vary at multiple positions, a position weight matrix (PWM) can be used. The PWM for FFAT motifs produces a FFAT score for runs of 13 amino acids (the 7 core residues and the 6 residues of the preceding acidic tract) (Murphy and Levine, 2016). FFAT scores start at 0 for “perfect” motifs and increase with each suboptimal substitution (Figure 1B) (Murphy and Levine, 2016). Three sources of information contributed to the PWM: (i) residues in VAP interactors and their homologues; (ii) the effect of substitutions on affinity for VAP; (iii) VAP-FFAT structures (Mikitova and Levine, 2012). A cut-off FFAT score ≤2.5 was chosen to reduce false positives, initially estimated at ~1% for this cut-off (Murphy and Levine, 2016). This PWM has been used to scan interactors of the two major human VAPs: VAPA and VAPB. These are 77% similar by sequence and share 80% of interactors (Huttlin et al., 2015), so they can be considered together. 40% of interactors contained a motif with FFAT score ≤2.5, and a further 10% formed complexes with these, so 50% of VAP interactions are attributable to FFAT motifs (Murphy and Levine, 2016).
Here we applied the PWM for FFAT motifs systematically across eukaryote proteomes using negative controls to estimate background. We predict that at least 0.8% of the proteome (50 proteins in budding yeast, 170 in humans) bind VAP. This estimate was confirmed by detailed consideration of a subset of yeast motifs. Some of the FFAT positive proteins we predict act in compartments that have no known molecular link to the ER. Since most known FFAT positive proteins carry out their function while binding VAP, this indicates novel pathways of inter-organelle communication.
Results
Proteins with highly optimal FFAT motifs are already known to bind VAP
Every residue of every protein in 6 model eukaryotes (human, D. melanogaster, C. elegans, budding yeast S. cerevisiae, A. thaliana and P. falciparum) was scanned with the PWM we developed previously to identify FFAT scores for every sequence of 13 residues. The lowest FFAT score found at any position in a protein was recorded as the FFAT score for that protein (Murphy and Levine, 2016). The overall distribution of FFAT scores was non-normal, and was similar across eukaryotes (Figure 2A, and data not shown). Among proteins with extremely low FFAT scores, where VAP interactors are expected, the proteomes showed some variation; for example, Arabidopsis had fewer proteins with FFAT scores ≤1 (Figure 2B).
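The scanning step can be sketched as follows. This is a minimal illustration and not the published PWM: the penalty values in `flank_penalty` and `CORE_PENALTIES`, and all function names, are hypothetical placeholders chosen only so that a canonical motif scores 0 and suboptimal residues add to the score. Only the window layout (6 flank residues followed by 7 core residues) and the rule of keeping the lowest-scoring window per protein are taken from the text.

```python
ACIDIC = set("DE")

def flank_penalty(res):
    # Hypothetical penalties for the 6-residue acidic tract.
    if res in ACIDIC:
        return 0.0
    if res in "ST":          # potentially phosphorylatable, assumed milder penalty
        return 0.25
    return 0.5

# Hypothetical per-position penalties for the 7-residue core (positions 1-7).
CORE_PENALTIES = [
    lambda r: 0.0 if r in ACIDIC else 0.5,
    lambda r: {"F": 0.0, "Y": 0.5}.get(r, 4.0),
    lambda r: 0.0 if r == "F" else 0.5,
    lambda r: 0.0 if r in ACIDIC else (0.5 if r in "ST" else 2.0),
    lambda r: 0.0 if r == "A" else 1.0,
    lambda r: 0.0,
    lambda r: 0.0 if r in ACIDIC else 0.5,
]

def window_score(window):
    """Score one 13-residue window: 6 flank residues then 7 core residues."""
    if len(window) != 13:
        raise ValueError("window must be 13 residues long")
    flank, core = window[:6], window[6:]
    return (sum(flank_penalty(r) for r in flank)
            + sum(f(r) for f, r in zip(CORE_PENALTIES, core)))

def protein_score(seq):
    """Return (lowest score, 0-based start) over all 13-residue windows,
    or None if the sequence is shorter than 13 residues."""
    best = None
    for i in range(len(seq) - 12):
        s = window_score(seq[i:i + 13])
        if best is None or s < best[0]:
            best = (s, i)
    return best
```

With these toy penalties, the Opi1p window listed in Table 1 (flank EDDDDE, core EFFDASE) scores 0.0, and substituting Y at core position 2 raises the score to 0.5.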
We next determined the correlation between low FFAT score and documented physical interaction with VAP, combining data from the two species with the best documented protein-protein interaction networks: human and yeast. For the human VAP interactome, we included VAPA, VAPB and also MOSPD2. The latter is a newly described variant of VAP known to bind only a minority of the interactors of VAPA/B, possibly because it has not been studied in as much detail, or because the region of its MSP domain that binds FFATs is divergent from VAPA/B (Di Mattia et al., 2018). The three VAPs have 341 distinct interactors (1.7% of proteome) (Chatr-Aryamontri et al., 2017). Yeast has two VAPs, the major form Scs2p and a minor homologue Scs22p, with 74 physical interactors between them (1.2% of proteome). All of the proteins with FFAT score =0.0 (2 in yeast, 6 in human) are already known to bind VAP. This is to be expected because the sequences of proteins with the lowest FFAT scores were used to create our PWM, so by definition their FFAT scores are low. The two yeast proteins in this group are the transcriptional regulator Opi1p and the oxysterol binding protein (OSBP)-related protein (ORP) homolog Osh1p (Table 1A and B). For FFAT scores marginally higher (range 0.5 – 1.0), 2 of the 6 yeast proteins (Table 1A and B) and 13 of 17 human proteins are documented to bind VAP (65% overall, Figure 2C). The motifs in the four missing human proteins (FRPD1, S22AF, TSYL2, ORC2) were all identified previously by SCAN-PROSITE (Mikitova and Levine, 2012). The four missing yeast proteins (Bbc1p, Kri1p, Seg2p, Ypr097wp, Table 1) are considered below.
Table 1. Yeast proteins with motifs with FFAT scores ≤1.5.
Name | FFAT score | Start | Flank | Core | End |
---|---|---|---|---|---|
(a) Description of motifs | |||||
OPI1 a | 0 | 194 | EDDDDE | EFFDASE | 206 |
OSH1 a | 0 | 710 | EDSDAD | EFFDAEE | 722 |
BBC1 | 1 | 547 | EDTDDH | EFEDAND | 559 |
KRI1 | 1 | 243 | NEEDDE | EFEDAAE | 255 |
OSH2 a | 1 | 739 | EASDAD | EFYDAAE | 751 |
OSH3 a | 1 | 508 | YLSEND | EFFDAEE | 520 |
SEG2 | 1.2 | 637 | DDDDDD | EYHDSYD | 649 |
YPR097W | 1 | 449 | EESDFD | EYKDASD | 461 |
ASG7 | 1.5 | 121 | SDSNSE | EYYESKD | 133 |
EDE1 | 1.5 | 1130 | SSSDDD | EFEDTRE | 1142 |
ERB1 | 1.5 | 46 | SDEDDD | EYESAVE | 58 |
EXO1 | 1.5 | 541 | DGDTSE | DYSETAE | 553 |
MCM3 | 1.5 | 746 | IDEEES | EYEEALS | 758 |
NOP2 | 1.5 | 254 | SPAEAM | EFFEANE | 266 |
OST1 | 1.5 | 59 | ASEPAT | EYFTAFE | 71 |
PET123 | 1.5 | 224 | FNDETE | EFTDAYD | 236 |
RQC1 | 1.5 | 149 | DDTNEE | GFFTASE | 161 |
SFB3 | 1.5 | 747 | LEETDL | TFYDAND | 759 |
SKP2 | 1.5 | 458 | SLDTED | DFDDCNS | 470 |
STE13 | 1.5 | 554 | EYETDT | IFFTANE | 566 |
TUB1 | 1.5 | 412 | EGMEEG | EFTEARE | 424 |
TUB3 | 1.5 | 412 | EGMEEG | EFTEARE | 424 |
UBP10 | 1.5 | 139 | EEEEGE | IFHEARD | 151 |
UBR1 | 1.5 | 1608 | EVEEEL | EFEDTAE | 1620 |
VID27 | 1.5 | 193 | LDSSSD | DFQDAKD | 205 |
WSC2 | 1.5 | 484 | VSDGDD | DYDDAKD | 496 |
YLR149C | 1.5 | 605 | DVFEDD | EYYEAYN | 617 |
YOR238W | 1.5 | 280 | GIEDDE | EYFETKI | 292 |
(b) Descriptions of proteins
FFAT score | Protein | Description |
---|---|---|
0.0 | Opi1p b | transcription factor that senses PA in ER to regulate phospholipid pathways |
0.0 | Osh1p b | lipid transfer protein in the oxysterol binding protein (OSBP)-related protein (ORP) family found in all eukaryotes; 64% similar to paralogue Osh2p (below). Other name: Swh1p |
1.0 | Bbc1p | assembles actin patches in clathrin-mediated endocytosis; only few homologues |
1.0 | Kri1p | nucleolus; synthesis of 40S ribosomal subunit; widespread homologues incl. human |
1.0 | Osh2p b | ORP lipid transfer protein; 64% similar to paralogue Osh1p (above) |
1.0 | Osh3p b | ORP lipid transfer protein with GOLD domain, a combination only found in fungi |
1.0 | Seg2p | stabilises eisosomes, which are fungal-only protein-rich assemblies that deform the plasma membrane into furrows (Moreira et al., 2012); no OrthoDB family |
1.0 | Ypr097wp | protein of unknown function (PUF) found in purified mitochondria with PX and PXB domains – the same domain structure as human sorting nexins SNX20/21, indicating endocytic sorting function (Clairfeuille et al., 2015; Danson et al., 2018) |
1.5 | Asg7p | regulates signaling from G beta subunit Ste4p in mating |
1.5 | Ede1p | scaffolds clathrin-mediated endocytosis, binding membrane proteins with a C-terminal UBA domain; homologue of human Eps15, which has a C-terminal UIM-domain |
1.5 | Erb1p | nucleolus; 60S ribosomal subunit maturation; homologues in all eukaryotes, called BOP1 |
1.5 | Exo1p | 5'-3' exonuclease & flap-endonuclease involved in many pathways |
1.5 | Mcm3p | subunit of the mini chromosome maintenance DNA helicase complex |
1.5 | Nop2p | nucleolus; rRNA-methyltransferase of 25S rRNA, required for ribosomal maturation |
1.5 | Ost1p | N-linked oligosaccharyltransferase complex in ER lumen |
1.5 | Pet123p | component of 37S mito-ribosomal small subunit |
1.5 | Rqc1p | component of ribosomal quality control pathway; functional homologues in all eukaryote kingdoms share domain called Tcf25; mammalian components bind VAP (Zuzow et al., 2018) |
1.5 | Sfb3p | homologous to Sec24, forming an alternate coat for COPII vesicles in heterodimers with Sec23p |
1.5 | Skp2p | SCF ubiquitin ligase complex subunit |
1.5 | Ste13p | Golgi-localised dipeptidyl aminopeptidase |
1.5 | Tub1p | alpha-tubulin; 95% similar to paralogue Tub3p |
1.5 | Tub3p | alpha-tubulin; 95% similar to paralogue Tub1p |
1.5 | Ubp10p | nucleolus; Ub-specific peptidase in rRNA synthesis; human orthologue is USP36 |
1.5 | Ubr1p | N-recognin E3 ubiquitin-ligase; degrades proteins by the N-end rule; homologous to UBR4 in most non-fungal eukaryotes |
1.5 | Vid27p | implicated in the vacuolar import and degradation (VID) pathway for lysosomal protein degradation; homologues in all eukaryote kingdoms except animals |
1.5 | Wsc2p | type I plasma membrane protein involved in sensing stress (Wilk et al., 2010) |
1.5 | Ylr149cp | cytosolic PUF consisting almost entirely of WD40 repeats |
1.5 | Yor238wp | cytosolic PUF; predicted by HHsearch c to bind S-adenosyl methionine like YdcF |
Notes:
FFAT = two phenylalanines in an acidic tract; ER = endoplasmic reticulum; VAP = vesicle-associated membrane protein-associated protein; ORP = oxysterol binding protein-related protein; PUF = protein of unknown function; VID = vacuolar import and degradation.
a: 4 motifs already known.
b: 4 proteins already shown to bind VAP.
c: HHsearch was carried out to predict domains for all yeast proteins, using settings described previously (Fidler et al., 2016).
Few of the many proteins with sub-optimal FFAT motifs are known to bind VAP
The number of proteins containing FFAT motifs that score 1.5, 2.0 or 2.5 rose steeply, so that across the 6 model eukaryotes 3.5% of proteins have sub-optimal FFAT motifs with such scores (Figure 2B, and Supplementary Tables 1A, 2 & 3). With higher FFAT scores the proportion already identified as interacting with VAP declined: 11%, 6.7% and 2.7% for FFAT score =1.5, 2.0 and 2.5 respectively (Figure 2D). This could reflect a genuine decrease in the proportion of proteins with these motifs that ever bind VAP. Alternatively, the co-purification and co-precipitation methods of major high-throughput studies (Hein et al., 2015; Huttlin et al., 2015) could be too harsh to identify interactions between VAP and sub-optimal motifs (Davey et al., 2017). Interactions between domains and SLiMs tend to have lower affinity than the interactions between domain pairs (Kim et al., 2014).
Estimating the proportion of proteins with suboptimal FFAT motifs that interact with VAP
Among the eukaryotic proteins that have suboptimal FFAT motifs, we used two approaches to estimate the proportion that are irrelevant, equivalent to random background. In the first approach, we analysed species without VAP, which have no selective pressure for FFAT motifs, so they may act as negative controls. Organisms from the two non-eukaryotic kingdoms (bacteria and archaea) have been used before for this purpose (Meszaros et al., 2012). In addition, we identified a small group of variant unicellular eukaryotes that lack VAP: Euglena gracilis, Spironucleus salmonicida and Paratrypanosoma confusum. The levels of suboptimal FFAT motifs were lowest in bacteria, and similar in archaea and VAP-negative eukaryotes. For FFAT scores ≤1.0, the controls had ~10% as many observed motifs as VAP-positive eukaryotes. This proportion rose to 40-50% for motifs with FFAT score = 2.5 (Figure 3A).
The second estimation of the background level of low FFAT scores was achieved by randomising proteomes. To check that this randomisation acted as expected, we compared the numbers of motifs before and after randomisation in VAP-negative proteomes, with the expectation that there would be no difference. Four out of 13 proteomes had more motifs after randomisation. Among the 9 others, we tested for an excess of motifs over background, using the “N-1” Chi2 test (Campbell, 2007) to compare total number of motifs at different levels of FFAT score: there was no significant excess of observed motifs over expected (lowest p-value =0.08, Supplementary Table 1B). By comparison, all VAP-positive proteomes had excess motifs with p-values varying from 10^-4 to 10^-20 (Supplementary Table 1B, bottom). This showed that the randomisation is a useful way to estimate background.
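For readers unfamiliar with the “N-1” Chi2 test, the statistic for a 2x2 table is the Pearson chi-squared value multiplied by (N-1)/N. The sketch below is an illustration only: the function name and the layout of the table (rows for observed and randomised proteomes, columns for proteins with and without a motif) are assumptions, not a description of the exact tabulation used in Supplementary Table 1B. The p-value is two-sided with one degree of freedom, so the direction of any difference has to be checked separately.

```python
import math

def n_minus_1_chi2(a, b, c, d):
    """'N-1' chi-squared test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical):
        a = observed proteins with a motif,   b = observed proteins without
        c = randomised proteins with a motif, d = randomised proteins without
    Returns (statistic, two-sided p-value) for 1 degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero; the test is undefined")
    stat = (a * d - b * c) ** 2 * (n - 1) / denom
    # Survival function of chi-squared with 1 d.f.: erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p
```

For example, `n_minus_1_chi2(30, 70, 10, 90)` gives a statistic of 12.4375, which corresponds to p < 0.001, whereas identical rows give a statistic of 0 and p = 1.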
The background from randomisation of VAP-positive proteomes was 76% for all FFAT scores ≤2.5, mainly because it was 85% for FFAT score =2.5 (Figure 3B). This background is higher than the background estimated from VAP-negative organisms, and even this more conservative prediction of FFAT motifs still produces a minimum estimate that ~0.8% of the proteome are positively selected above background to have motifs with FFAT score ≤2.5 (50 in yeast and 170 in human). The two background estimations agreed that the proportion of VAP interactors is higher where the FFAT score is lower: rates of false positives start low but rise steeply as FFAT score increases from 1.5 to 2.5 (40% to 85% respectively, Figure 3C).
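The logic of the randomisation estimate can be sketched as below. Several details are assumptions made for illustration: the text does not state how sequences were randomised, so the sketch shuffles residues within each protein (preserving length and composition); `has_motif` stands in for any motif predicate, such as a PWM score at or below a cut-off; and all function names are hypothetical. The final helper shows the arithmetic linking a background fraction to an excess over background.

```python
import random

def shuffle_sequence(seq, rng):
    """Shuffle residues within one protein (assumed randomisation scheme)."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)

def background_fraction(proteome, has_motif, rng):
    """Ratio of motif-positive proteins after vs before randomisation.

    Returns None when no motif is observed in the real sequences.
    """
    observed = sum(1 for s in proteome if has_motif(s))
    randomised = sum(1 for s in proteome
                     if has_motif(shuffle_sequence(s, rng)))
    if observed == 0:
        return None
    return randomised / observed

def excess_over_background(motif_fraction, background):
    """Fraction of the proteome with motifs beyond the random expectation."""
    return motif_fraction * (1.0 - background)
```

As a rough consistency check only, combining the 3.5% of proteins with sub-optimal motifs and the 76% background gives `excess_over_background(0.035, 0.76)`, about 0.0084, which is in line with the ~0.8% minimum estimate quoted above.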
Using families of fungal orthologues to determine whether the motif in an individual protein is subject to preferential conservation
To study VAP interactors in one organism we chose yeast because its proteome is small and better documented than other eukaryotes. Inclusion criteria for determining if motifs are functionally relevant have been developed previously: (i) location in the cytoplasm, not inside an organelle or secreted; (ii) predicted to form an extended loop, not in a helix or sheet, particularly if the secondary structural element is conserved and in a globular domain and if it has been confirmed experimentally (crystal structure etc.); (iii) preferential conservation of motifs that are not excluded by the other two criteria among closely related homologues, ideally orthologues (Mikitova and Levine, 2012; Murphy and Levine, 2016).
This third criterion was previously assessed by manual examination of homologues, particularly of the key core residues of the motif (#2/4/5). Here we developed the criterion to measure conservation statistically. Families of orthologues of each yeast protein have been created by tools such as OrthoDB (family size n=18-700, median 150, Table 2A) (Kriventseva et al., 2015). We tested if the rate of occurrence of motifs among orthologues was significantly above background, which was calculated from randomised sequences, as done for proteomes above. Based on the statistical robustness of randomising VAP-negative control proteomes (Supplementary Table 1B), our cut-off for statistical significance of an excess of motifs at the different FFAT score thresholds (≤2.5/2.0/1.5/1.0) was p≤0.01, with borderline excess where p-values lie between 0.001 and 0.01, and confirmed excess for p≤0.001.
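The significance bands can be written as a small helper operating on -log10(p), the form in which p-values are tabulated in Table 2. This is a sketch with hypothetical function names. Two choices are assumptions: a p-value of exactly 0.001 is treated as confirmed, and the most significant value across the four thresholds decides the call, which follows the statement in the Table 2 notes that the most significant Chi2 is used to colour the protein name.

```python
def classify_excess(neg_log10_p):
    """Map -log10(p) onto the three bands described in the text."""
    if neg_log10_p >= 3.0:      # p <= 0.001
        return "confirmed"
    if neg_log10_p >= 2.0:      # 0.001 < p <= 0.01
        return "borderline"
    return "nonsignificant"     # p > 0.01

def best_call(neg_log10_p_values):
    """Classify a family by its most significant threshold.

    Entries of None stand for cells where no valid test was reported.
    """
    valid = [v for v in neg_log10_p_values if v is not None]
    if not valid:
        return "nonsignificant"
    return classify_excess(max(valid))
```

Applied to values from Table 2A, `best_call([3.2, 4.2, 4.0, 3.5])` (Opi1) returns "confirmed", `best_call([2.0, 1.3, 1.1, 0.8])` (Seg2) returns "borderline", and `best_call([1.4, 1.6, 1.7, None])` (Sfb3) returns "nonsignificant".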
Table 2. Application of exclusion criteria to potential FFAT motifs in yeast.
A. Motifs with FFAT scores 0.0 to 1.5
Name | FFAT score | 1. Location (if not cyto.) | 2. Structure (if not IDR) | 3. Conservation: orthologues with FFAT score ≤2.5 | Size of family | Observed motifs 2.5 | Observed motifs 2.0 | Observed motifs 1.5 | Observed motifs 0.0-1.0 | Expected (random) 2.5 | Expected (random) 2.0 | Expected (random) 1.5 | Expected (random) 0.0-1.0 | -log10 p, ≤2.5 | -log10 p, ≤2.0 | -log10 p, ≤1.5 | -log10 p, 0.0-1.0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Opi1 er | 0 | | H b | 13% | 151 | 1 | 3 | 2 | 13 | 2.2 | 0.7 | 0.0 | 0.0 | 3.2 | 4.2 | 4.0 | 3.5 |
Osh1 er | 0 | | | 99% | 154 | 55 | 26 | 3 | 149 | 9.2 | 2.7 | 0.5 | 0.1 | 44 | 38 | 34 | 34 |
Bbc1 | 1 | | H b | 20% | 168 | 15 | 8 | 4 | 7 | 7.0 | 1.6 | 0.4 | 0.0 | 3.9 | 3.7 | 2.8 | 2.1 |
Kri1 | 1 | | H b | 23% | 154 | 14 | 3 | 5 | 19 | 12 | 5.0 | 1.1 | 0.4 | 2.4 | 3.4 | 5 | 4.6 |
Osh2 er | 1 | | H b | as Osh1 | | | | | | | | | | | | | |
Osh3 er | 1 | | | 74% | 156 | 13 | 31 | 23 | 62 | 6.6 | 1.2 | 0.0 | 0.0 | 24 | 26 | 20 | 15 |
Seg2 | 1.2 | | | 33% | 21 | 3 | 1 | 1 | 2 | 0.1 | 0.0 | 0.0 | 0.0 | 2.0 | 1.3 | 1.1 | 0.8 |
Ypr097w | 1 | | | 14% | 168 | 7 | 6 | 4 | 7 | 9.0 | 1.7 | 0.3 | 0.1 | 1.5 | 3.2 | 2.8 | 2.0 |
Asg7 | 1.5 | lumen | | 11% | 18 | 0 | 1 | 1 | 0 | 0.2 | 0.0 | 0.0 | 0.0 | 0.6 | 0.8 | / | / |
Ede1 | 1.5 | | | 18% | 221 | 16 | 7 | 11 | 5 | 12 | 2.6 | 0.3 | 0.0 | 3.0 | 4.1 | 4.0 | 1.6 |
‡ Erb1 er | 1.5 | | | 31% | 125 | 25 | 10 | 11 | 0 | 5.3 | 2.3 | 0.5 | 0.0 | 7 | 3.7 | 2.7 | / |
Exo1 | 1.5 | | E | 9.9% | 171 | 17 | 0 | 1 | 0 | 4.7 | 0.7 | 0.2 | 0.0 | 2.0 | / | / | / |
Mcm3 | 1.5 | | H b | 3.1% | 732 | 18 | 4 | 1 | 0 | 26 | 6.0 | 2.0 | 0.2 | / | / | / | / |
Nop2 | 1.5 | | H | 85% | 145 | 62 | 18 | 46 | 16 | 5.9 | 1.3 | 0.8 | 0.2 | 27 | 17 | 14 | 4.1 |
Nop2 (loop motifs) | | | | 12% | 17 | 4 | 9 | 0 | 4 | | | | | 1.1 | 2.2 | 0.7 | 1.2 |
‡ Ost1 er | 1.5 | lumen | E | 2.0% | 148 | 2 | 0 | 1 | 0 | 1.9 | 0.6 | 0.2 | 0.0 | 0.0 | / | / | / |
Pet123 | 1.5 | mito. | | 5.7% | 35 | 0 | 1 | 1 | 0 | 0.4 | 0.0 | 0.0 | 0.0 | 0.5 | 0.8 | / | / |
Rqc1 | 1.5 | | | 20% | 151 | 7 | 9 | 6 | 8 | 6.9 | 1.7 | 0.1 | 0.1 | 3.2 | 4.6 | 3.6 | 2.4 |
Sfb3 er | 1.5 | | | 7.4% | 274 | 10 | 4 | 6 | 0 | 6.8 | 1.9 | 0.1 | 0.1 | 1.4 | 1.6 | 1.7 | / |
Skp2 | 1.5 | | | 5.4% | 37 | 1 | 0 | 1 | 0 | 0.6 | 0.4 | 0.0 | 0.0 | 0.2 | / | / | / |
Ste13 | 1.5 | lumen | E | 4.8% | 230 | 8 | 2 | 2 | 0 | 10 | 2.4 | 0.7 | 0.1 | / | 0.1 | 0.3 | / |
Tub1 | 1.5 | | H | 100% | 74 | 2 | 0 | 71 | 0 | 3.0 | 0.4 | 0.0 | 0.0 | 15 | 16 | 17 | / |
Tub3 | 1.5 | | H | as Tub1 | | | | | | | | | | | | | |
Ubp10 | 1.5 | | H b | 8.1% | 533 | 20 | 14 | 8 | 4 | 24 | 6.2 | 1.0 | 0.1 | 1 | 2.9 | 2.6 | 1.3 |
Ubr1 | 1.5 | | H b | 18% | 186 | 18 | 13 | 9 | 3 | 19 | 3.6 | 1.1 | 0.1 | 1.7 | 3.7 | 2.5 | 1.0 |
Vid27 | 1.5 | | | 39% | 155 | 17 | 25 | 17 | 15 | 12 | 1.5 | 0.7 | 0.0 | 10 | 12 | 7 | 4.0 |
Wsc2 | 1.5 | | | 1.9% | 105 | 1 | 0 | 1 | 0 | 1.6 | 0.2 | 0.0 | 0.0 | 0.0 | / | / | / |
Ylr149c | 1.5 | | H b | 5.1% | 99 | 3 | 1 | 1 | 0 | 2.9 | 0.6 | 0.2 | 0.1 | 0.2 | 0.3 | / | / |
Yor238w | 1.5 | | H b | 0.9% | 117 | 1 | 0 | 0 | 0 | 1.3 | 0.1 | 0.1 | 0.0 | / | / | / | / |
B. Samples of motifs with FFAT scores 2.0 and 2.5
Name | FFAT score | 1. Location (if not cyto.) | 2. Structure (if not IDR) | 3. Conservation: orthologues with FFAT score ≤2.5 | Size of family | Observed motifs 2.5 | Observed motifs 2.0 | Observed motifs 1.5 | Observed motifs 0.0-1.0 | Expected (random) 2.5 | Expected (random) 2.0 | Expected (random) 1.5 | Expected (random) 0.0-1.0 | -log10 p, ≤2.5 | -log10 p, ≤2.0 | -log10 p, ≤1.5 | -log10 p, 0.0-1.0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Acs1 | 2.0 | | E | 10% | 240 | 12 | 11 | 1 | 0 | 5.1 | 1.3 | 0.2 | 0.0 | 2.8 | 2.4 | / | / |
Ecm13 | 2.0 | | | 27% | 123 | 24 | 12 | 0 | 2 | 5.7 | 2.7 | 0.7 | 0.1 | 4.6 | 1.9 | 0.3 | 0.7 |
Efr3 | 2.0 | | | 8.2% | 159 | 4 | 2 | 6 | 1 | 4.3 | 0.6 | 0.1 | 0.0 | 1.2 | 2.1 | 2.0 | / |
Ent3 | 2.0 | | | 71% | 139 | 85 | 19 | 0 | 0 | 3.4 | 0.6 | 0.2 | 0.0 | 21 | 4.4 | / | / |
Irr1 | 2.0 | | | 9.9% | 302 | 23 | 8 | 0 | 0 | 24.3 | 5.9 | 2.0 | 0.6 | 0.1 | 0.0 | / | / |
Kar2 er | 2.0 | lumen | | 3.1% | 413 | 6 | 13 | 0 | 0 | 13.1 | 3.5 | 1.3 | 0.0 | 0.1 | 1.3 | / | / |
Met4 | 2.0 | | | 5.0% | 20 | 1 | 0 | 0 | 0 | 0.4 | 0.0 | 0.1 | 0.0 | / | / | / | / |
Mgr3 | 2.0 | mito. | | 33% | 43 | 2 | 12 | 0 | 0 | 0.7 | 0.0 | 0.1 | 0.1 | 3.2 | 3.1 | / | / |
Msh6 | 2.0 | | | 13% | 563 | 57 | 20 | 1 | 1 | 26.7 | 5.6 | 1.3 | 0.3 | 4.7 | 2.2 | 0.1 | / |
Rkr1 | 2.0 | | | 4.5% | 154 | 5 | 2 | 0 | 0 | 9.9 | 1.9 | 0.5 | 0.0 | / | / | / | / |
Sxm1 | 2.0 | | | 19% | 158 | 27 | 4 | 1 | 4 | 12.8 | 3.2 | 0.3 | 0.1 | 2.2 | 0.9 | 1.3 | 1.3 |
‡ Utp9 | 2.0 | | | 38% | 13 | 1 | 4 | 0 | 0 | 0.2 | 0 | 0 | 0 | 1.5 | 1.3 | / | / |
Yhl026c | 2.0 | | | 3.7% | 54 | 0 | 2 | 0 | 0 | 0.7 | 0.0 | 0.1 | 0.0 | 0.3 | 0.7 | / | / |
Aim44 | 2.5 | | | 35% | 20 | 5 | 2 | 0 | 0 | 0.8 | 0.1 | 0.1 | 0.0 | 1.5 | 0.6 | / | / |
Chm7 er | 2.5 | | | 7.4% | 149 | 11 | 0 | 0 | 0 | 3.0 | 0.7 | 0.1 | 0.1 | 1.2 | / | / | / |
Dyn1 | 2.5 | | | 2.3% | 175 | 4 | 0 | 0 | 0 | 23.8 | 4.7 | 0.9 | 0.4 | / | / | / | / |
‡ Epo1 er | 2.5 | | | 24% | 38 | 8 | 2 | 0 | 0 | 0.6 | 0.1 | 0.0 | 0.0 | 2.4 | 0.7 | / | / |
Kxd1 (x2) | 2.5 | | H b | 6.0% | 83 | 6 | 1 | 0 | 0 | 0.5 | 0.8 | 0.0 | 0.0 | 1.3 | / | / | / |
Mdv1 | 2.5 | | E | 4.1% | 765 | 25 | 5 | 0 | 0 | 16.2 | 3.2 | 0.6 | 0.2 | 0.8 | 0.1 | / | / |
Mga2 er | 2.5 | | | 7.1% | 311 | 13 | 5 | 3 | 0 | 7.2 | 2.1 | 0.3 | 0.1 | 1.4 | 1.0 | 0.8 | / |
‡ Num1 er | 2.5 | | | 52% | 42 | 15 | 7 | 4 | 1 | 6.9 | 2.3 | 0.0 | 0.0 | 2.5 | 2.0 | 1.6 | / |
Pbi1 ‡ | 2.5 | | H | 4.2% | 24 | 1 | 0 | 0 | 0 | 0.5 | 0.3 | 0.0 | 0.0 | 0.1 | / | / | / |
Sec2 | 2.5 | | | 20% | 163 | 24 | 7 | 9 | 0 | 4.6 | 1.7 | 0.2 | 0.1 | 6 | 3.0 | 2.4 | / |
Sec27 | 2.5 | | | 3.2% | 1157 | 26 | 10 | 2 | 0 | 35.8 | 7.6 | 1.1 | 0.3 | 0.3 | 0.3 | 0.1 | / |
Sec63 | 2.5 | lumen | | 6.4% | 141 | 4 | 4 | 1 | 0 | 6.9 | 1.2 | 0.3 | 0.0 | 0.1 | 0.8 | / | / |
Sec66 | 2.5 | lumen | | 2.2% | 136 | 2 | 0 | 1 | 0 | 1.2 | 0.5 | 0.0 | 0.0 | 0.3 | / | / | / |
Tor1 (x2) | 2.5 | | H (x2) | 75% | 158 | 133 | 1 | 0 | 0 | 9.3 | 1.2 | 0.1 | 0.0 | 24 | / | / | / |
Ubp1 er | 2.5 | | | 5.2% | 154 | 3 | 4 | 1 | 0 | 5.5 | 1.0 | 0.3 | 0.0 | 0.1 | 0.9 | / | / |
Ubp2 | 2.5 | | H | 11% | 154 | 18 | 1 | 0 | 0 | 13.9 | 2.7 | 0.8 | 0.3 | 0.1 | / | / | / |
Vps13 er | 2.5 | | | 48% | 142 | 27 | 24 | 10 | 14 | 23.0 | 4.7 | 1.2 | 0.3 | 5 | 8 | 5 | 3.5 |
Yml002w | 2.5 | | H | 11% | 113 | 11 | 0 | 1 | 0 | 5.2 | 1.1 | 0.3 | 0.0 | 0.7 | / | / | / |
Ypt7 | 2.5 | | | 11% | 38 | 3 | 1 | 0 | 0 | 0.9 | 0.2 | 0.0 | 0.0 | 0.7 | / | / | / |
Notes (part A): Sequences in order of increasing FFAT score. Red text, yellow fill indicates motif excluded and exclusion criterion: Location: lumen = translocated; mito. = mitochondrion; empty: cytoplasm. Structure: H helix, E sheet (red confirmed, black predicted, H b indicates helix between domains); empty: loop. Conservation: exclusion either if observed motifs ≤ expected (numbers of motifs in red) or if all p-values for excess motifs with FFAT score ≤2.5/2.0/1.5/1.0 are ≥0.01 (i.e. -log10 p-value <2.0). Display of p-value: nonsignificant (p>10^-2); borderline (10^-3≤p<10^-2): pink, or only 2 motifs: grey; positive (p≤10^-3): blue. The most significant Chi2 is used to colour the protein name. Chi2 is not valid if all expected values =0, and not given if only 1 motif was observed (“/”). ‡ Erb1 and Ost1 have been previously identified as binding yeast VAP (Chatr-Aryamontri et al., 2017). Superscript er: known on ER. Nop2 has motifs of 2 main types: one in a confirmed helix and others in a loop.
Notes (part B): colouring, p-values and exclusions as in part A. Kxd1 and Tor1 have 2 motifs with FFAT score =2.5. ‡ Epo1, Num1 and Pbi1 have been reported to bind VAP, but potential FFAT motifs have not been tested (Chao et al., 2014; Riekhof et al., 2014). Superscript er: known on ER.
We first tested this approach on the four yeast proteins with known FFAT motifs: Osh1/2/3p and Opi1p. Since these sequences were used to create our PWM, by definition their FFAT scores are low; nevertheless, they illustrate the use of families of orthologues. For Osh1p (FFAT score =0.0), which includes Osh2p (FFAT score =1.0) as a paralogue in the same family, 99% of orthologues had motifs with FFAT score ≤2.5, indicating uniform conservation of the VAP-Osh1/2 interaction. By comparison, only 74% of Osh3p orthologues (FFAT score =1.0) had motifs with FFAT score ≤2.5. Another 15% were short sequences representing fragments of longer Osh3p sequences that were readily identifiable in sequence databases. All of those longer sequences contained motifs with low FFAT scores (data not shown). The remaining 10% of Osh3 orthologues all had degenerate FFAT-like motifs at the same position as the motif in Osh3p. For example, Meyerozyma guilliermondii has GFVDAQD without acids upstream. Motifs with tryptophan substituted at position 2 (which cannot arise from a point mutation of F or Y) were found both in Taphrinomycotina, a branch of the Ascomycota (Pneumocystis, S. pombe, etc.) and in Agaricomycetes, a branch of the Basidiomycota (Fomitopsis, Coprinopsis, etc.), but not in the closely related Ustilaginomycotina. Since the lineages carrying these position 2=tryptophan FFAT-like motifs are evolutionarily separate (Stajich et al., 2009), this substitution appears to have evolved in Osh3 orthologues on multiple occasions.
In the family of orthologues of Opi1p (FFAT score =0.0), only 13% of proteins had a motif with FFAT score ≤2.5. This created a statistical excess (p=10^-4 for motifs with scores ≤2.0) of far less statistical significance than for Osh3 (p=10^-26) or Osh1 (p=10^-44). Many of the Opi1 orthologues have an alternate name: “related to clock-controlled protein 8 (ccg-8)”. ccg-8 is a Neurospora crassa transcription factor that regulates an aspect of lipid metabolism different from that regulated by Opi1p in S. cerevisiae (Xue et al., 2019). FFAT motifs are absent from all proteins named ccg-8. This indicates that the fungal orthologue families have subfamilies with diverged function. To address this, we first looked at a narrower orthology family from the Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology database (Kanehisa et al., 2004), excluding all ccg-8s. This had a higher incidence of FFAT motifs (28%), but the smaller size of the family meant that there was no greater statistical significance (Supplementary Table 4). We then created a custom-made family using BLAST to identify proteins closer to the budding yeast Opi1p, but with only 15-20% overall sequence identity, and no selection for the FFAT motif in the BLAST search. These proteins had a 48% incidence of motifs with FFAT scores ≤2.5 (most FFAT scores =0.0), and a much higher statistical significance (p=10^-10, Supplementary Table 4).
Overall, this indicates that low discovery rates of a motif among Opi1p orthologues did not result from a lack of sensitivity (i.e. not false negatives). Instead, there appears to have been evolutionary pressure to interact with VAP only in some ecological niches, reducing the statistical significance of excess motifs across a wide range of organisms.
Ten new predicted VAP interactors in 23 yeast proteins identified with FFAT scores 1.0 or 1.5
To predict new FFAT motifs, first we considered proteins with FFAT score =1.0 (n=4) and =1.5 (n=20) (Table 1A and B). Among these 24 proteins, Erb1p had previously been identified as an interactor of yeast VAP (Ho et al., 2002). This protein contained a motif that met all our criteria (p=10^-7), strongly suggesting a VAP-FFAT interaction in this family. Among the 23 newly identified proteins, our criteria excluded 13 candidates, all with motifs with FFAT score =1.5 (Table 2A). Seven proteins were excluded by location or structure of the region containing the motif. Five of these had weak (or no) conservation (Asg7p, Exo1p, Ost1p, Pet123p and Ste13p, Table 2A). In contrast, two highly homologous proteins (Tub1p and Tub3p) were excluded because the motifs are in a helix, even though the motifs were universal among their orthologues. All α-tubulins, including in humans, apparently have the same inaccessible FFAT-like sequence (Murphy and Levine, 2016).
Six proteins were excluded because they failed the conservation criterion. Two of these had fewer motifs with low FFAT scores in the orthologue family than expected: Mcm3p, Yor238wp (Table 2A). Three had low numbers of motifs similar to the expected numbers (Skp2p, Wsc2p, Ylr149cp). The final protein, Sfb3p, had a non-significant excess of motifs with FFAT score 1.5–2.5 within a large family of orthologues (7% of 274, p=0.02, i.e. 10^-1.7, Table 2A). Limiting Sfb3 orthologues to the narrower group of Ascomycota increased the rate of occurrence of motifs with low FFAT scores but reduced the significance (16% of 43, p=0.06, i.e. 10^-1.2, Supplementary Table 4), so there is a statistical trend for an excess of motifs in Sfb3 orthologues, but they fail to meet the statistical criterion.
After these exclusions, ten new proteins with FFAT score =1.0 or 1.5 met all the criteria for new predictions of FFAT motifs: Bbc1p, Ede1p, Kri1p, Nop2p, Rqc1p, Seg2p, Ubp10p, Ubr1p, Vid17p and Ypr097wp (Table 2A and described below). The positive prediction rates for motifs with FFAT scores =1.0 and 1.5 were 100% and 32% respectively. The latter is lower than the frequency of motifs we had predicted (60%), but with the low numbers involved the difference is not statistically significant.
Three of the predictions only reached borderline statistical significance (p-value between 0.001 and 0.01): Seg2p, Ubp10p and Nop2p. For Seg2p, the low significance was most likely because the protein family is small (n=21). Ubp10p orthologues showed a small excess of motifs over background with FFAT scores 1.5–2.0 in a large orthologue family. Like Opi1p, the motif showed more significant conservation in a subfamily of Ubp10 orthologues from Ascomycota only (Supplementary Table 4). Nop2 was unique among the proteins we studied in having two types of motifs with low FFAT scores in different orthologues: the vast majority (97%), including in S. cerevisiae, had a motif in a helix near the start of the RNA methyltransferase domain (Yakubovskaya et al., 2012). A minority (12%) had a second motif with FFAT score as low as 1.0 in a highly anionic IDR, and one orthologue had a motif after the methyltransferase domain. After excluding motifs with helical structure, the minority in IDRs were still significantly in excess over background (Table 2A). This predicts that Nop2 orthologues functionally interact with VAP. This implies that the potential motif in Nop2p is conserved to interact with VAP (see Discussion).
These ten proteins with putative FFAT motifs are implicated in diverse cellular activities (Figure 4). Three (Rqc1p, Sfb3p and Ubr1p) have diverse cytoplasmic functions that are already linked to the ER. Four are involved in processes that have previously been linked to VAP-FFAT interactions: (i) Bbc1p and Ede1p act in clathrin coated endocytosis, which is also linked to VAP and ORPs via Myo2p (Encinar Del Dedo et al., 2017); (ii) Ypr097wp localises to the vacuole (the yeast equivalent of the lysosome), and resembles a sorting nexin, so it may parallel the binding of human SNX2 to VAP (Dong et al., 2016); (iii) Vid27p localises to puncta resembling late (or multivesicular) endosomes and it has an ill-defined function in endocytosis, similar to human proteins at ER-endosome contacts (Rocha et al., 2009; Saita et al., 2009). The four other hits are involved with cellular components not previously known to be linked to VAP: Erb1p, Kri1p, Nop2p and Ubp10p assemble ribosomes in nucleoli; Seg2p stabilises eisosomes.
Seven new predicted VAP interactors in a sample of 28 yeast proteins identified with FFAT scores 2.0 and 2.5
Even though yeast has a relatively small proteome for a eukaryote, there are too many proteins with FFAT scores 2.0 and 2.5 (n=44 and 143 respectively, Supplementary Table 1A) to consider them all here in detail. Therefore, we sampled these groups (Supplementary Table 1C). The basis for this sampling was firstly to include all proteins previously identified as interactors of yeast VAP, and secondly to randomly sample the remaining proteins.
Among 13 proteins we examined with FFAT score =2.0, one had previously been shown to bind yeast VAP: Utp9p. This protein contained a motif that just failed the statistical significance criterion, possibly because the family of orthologues is small (n=13, p=0.04, i.e. 10^-1.5), suggesting the VAP-FFAT interaction is conserved at a rate below the sensitivity of our approach. Among the other 12 proteins in this group, three were excluded by structure or location and four by lack of conservation. Five met all criteria for predicting an active motif (p<0.01), two with borderline statistical significance (Efr3p and Sxm1p) and three with the stronger significance (Ecm13p, Ent3p and Msh6p, Table 2B, top). Changing the orthologue families for Efr3p and Sxm1p to Ascomycota only increased the proportion of proteins positive for motifs, but did not increase their statistical significance (Supplementary Table 4). For the sample of proteins with motifs with FFAT score =2.0, the positive prediction rate 5/12 (42%) was close to the predicted rate (40%, Figure 3B/C). This implies a further 13 motifs would be found among the 31 proteins with FFAT score =2.0 we did not study.
Among 19 proteins with FFAT score =2.5 we examined, three (Epo1p, Num1p and Pbi1p) have been shown to bind yeast VAP previously, though their predicted FFAT motifs have not been tested (Chao et al., 2014; Riekhof et al., 2014). Our analysis predicted that both Epo1p and Num1p have motifs that meet all the criteria for being an active FFAT motif, albeit with borderline significance in terms of conservation (Table 2B, bottom). However, the motif in Pbi1p is in a known helix and did not meet the conservation criterion (Table 2B, bottom). Among the other 16 proteins, six proteins were excluded by structure or location criteria, and eight by lack of sequence conservation. Four of these just missed our significance criterion (p≤0.05 but p>0.01, like Utp9p see above), including integral ER proteins (Mga2p and Ubp1p), a trafficking protein (Kxd1p) and a polarity protein (Aim44p) (Table 2B). A subfamily of sequences more closely related to Mga2p showed no greater statistical significance (Supplementary Table 4). This left two proteins that met all the criteria for predicting a FFAT motif: Sec2p and Vps13p (Table 2B, bottom). The positive prediction rate among motifs with FFAT score =2.5 (2/16, 13%) is similar to our proteome-wide estimate (15%, Figure 3B). Applying this rate to all other 124 yeast proteins with FFAT scores =2.5 would produce a further 16 motifs.
The nine predicted motifs with FFAT scores 2.0 or 2.5 are in proteins localised to multiple cellular components that communicate with the ER, functioning in membrane traffic (Ent3p, Sec2p, Vps13p), cytoskeleton and polarity (Epo1p and Num1p), PI4P synthesis at the plasma membrane (Efr3p), nuclear components (Msh6p and Sxm1p) and a cytoplasmic protein of unknown function (Ecm13p) (Figure 4).
These findings create an overall picture of the number of FFAT motifs in yeast: 4 previously known, 3 in 5 partially characterised VAP interactors, and 17 we predict by detailed study. Extrapolating from the sample of proteins with FFAT scores 2.0–2.5 indicates that the remainder of the proteome contains a further 29 proteins with motifs that meet all our criteria. However, with 155 proteins yet to be studied in these groups, this number has considerable margin for error. Thus, the bioinformatics approach predicts that the number of yeast proteins with FFAT motifs is approximately 50-55 (0.90% of the whole proteome), >10-fold higher than currently tested and “proven”.
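The extrapolation above is simple arithmetic and can be reproduced as a quick check. The sketch below uses only the counts quoted in the text; the variable names are illustrative, and rounding each group to the nearest whole number before summing is an assumption about how the totals were reached.

```python
# Counts quoted in the text for the two sampled groups.
# Each entry: (newly predicted motifs, proteins assessed, proteins not studied)
sampled = {
    2.0: (5, 12, 31),
    2.5: (2, 16, 124),
}

def extrapolate(hits, assessed, unstudied):
    """Apply the sample's positive prediction rate to the unstudied proteins."""
    return hits / assessed * unstudied

further = {score: extrapolate(*counts) for score, counts in sampled.items()}
# Round half up for each group (assumption), then sum.
further_total = sum(int(value + 0.5) for value in further.values())

known = 4        # previously known FFAT motifs
partial = 3      # motifs in partially characterised VAP interactors
predicted = 17   # motifs predicted by detailed study
total = known + partial + predicted + further_total

for score, value in further.items():
    print(f"FFAT score {score}: ~{value:.1f} further motifs")
print("further motifs (rounded):", further_total)
print("estimated total:", total)
```

The total falls inside the 50-55 range quoted above. The margin for error noted in the text comes from the small sample sizes behind the two rates.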
Comparison of these predictions with the MEME Suite automated pipeline
We next asked if an automated pipeline for SLiM discovery would perform just as well as the custom-made PWM. For this we used the MEME Suite, which creates ungapped SLiMs from sequences and then searches for other occurrences in unrelated sequences (Bailey et al., 2009), and has been used to identify motifs in contact site biology (Bean et al., 2018). Seeded with a family of FFAT positive proteins (here we used ORPs), MEME identified their shared FFAT motifs as the 7 residue core, together with acidic flanking residues both upstream and downstream (Supplementary Table 5A). We then used the FIMO (find motifs) module in MEME to identify motifs that fit this statistical model. To understand the significance of hits at the whole proteome level, we searched both the actual yeast proteome and randomisations of the yeast proteome, as we had done for the PWM. Motifs that match the MEME model most closely (with p-values as low as 2×10^-12) were only found in the actual proteome. With increasing p-values, the background hit rate increased, eventually approaching that of the observed proteome (Supplementary Table 5B), which is the same as we had observed for our PWM. The excess number of strong hits to the matrix in the actual proteome compared to random background was 37±16 (number of randomisations =4), which is ~70% of the level estimated by the PWM.
At the level of individual proteins there was variation after the 6th hit (Supplementary Table 5C), and the rank order of yeast hits correlated only partially with that produced by FFAT scores (r=0.3). ~20% of the top 3% of MEME hits contained residues that would prevent productive interaction with VAP (for example: L2, E5). Thus, while the automated tool MEME generated a list of hits with much less work than our approach, in detail its predictions are less informative.
Widespread motifs found in non-fungal homologues in 30% of yeast proteins with a motif
Finally we looked for conservation of yeast FFAT motifs in other species for the 20 proteins with newly predicted FFAT motifs. 14 have non-fungal families, of which 6 had a clear excess of FFAT motifs, i.e. overall rate 30% (Table 3). Excess FFAT motifs occur in all eukaryotic kingdoms for Vps13, Erb1 (alternate name BOP1 in other species), and Ubr1 (orthologue called UBR4 in mammals) (data not shown). For Nop2, motifs are found in animal homologues but not significantly in plants. For Kri1, motifs are found in plants not animals. Vid27 homologues are only found in plants and alveolates, both of which have more motifs than fungal homologues. 12 of the 20 new yeast hits have human homologues, three of which have motifs with low FFAT scores: VPS13 =0.0, NOP2 and UBR4 =2.5.
Table 3. Using fungal hits to search for FFAT motifs in non-fungal homologues.
Name | FFAT score | Alternate name | Species group | FFAT scores ≤2.5 | Size of family | Observed motifs: 2.5 | Observed: 2.0 | Observed: 1.5 | Observed: 0.0–1.0 | Expected background (random): 2.5 | Expected: 2.0 | Expected: 1.5 | Expected: 0.0–1.0 | p-value (-log10): ≤2.5 | p-value: ≤2.0 | p-value: ≤1.5 | p-value: ≤1.0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
EDE1 | 1.5 | EPS15 | non-fungi | 2.6% | 391 | 9 | 1 | 0 | 0 | 17 | 4.6 | 0.5 | 0.1 | / | / | / | / |
EFR3 | 2.0 | | animals | 3.0% | 364 | 11 | 0 | 0 | 0 | 8.2 | 1.9 | 0.4 | 0.0 | 0.0 | / | / | / |
EFR3 | 2.0 | | plants | 3.0% | 560 | 11 | 6 | 0 | 0 | 16 | 2.7 | 0.8 | 0.1 | 0.2 | 0.4 | 0.5 | / |
ENT3 | 2.0 | Epsin | animals | 5.5% | 218 | 11 | 1 | 0 | 0 | 2.6 | 0.8 | 0.1 | 0.0 | 1.5 | 0.0 | / | / |
ENT3 | 2.0 | Epsin | plants | 5.9% | 170 | 8 | 2 | 0 | 0 | 3.6 | 1.0 | 0.3 | 0.0 | 0.7 | 0.1 | / | / |
ERB1 | 1.5 | BOP1 | all | 17% | 904 | 91 | 41 | 31 | 2 | 36 | 7.9 | 2.5 | 0.3 | 15 | 11 | 6 | 0.6 |
KRI1 | 1.0 | | animals | 14% | 226 | 16 | 16 | 1 | 0 | 19 | 5.8 | 1.5 | 0.3 | 0.4 | 1.2 | / | / |
KRI1 | 1.0 | | plants | 39% | 79 | 17 | 14 | 5 | 0 | 6.7 | 2.5 | 0.5 | 0.1 | 4.0 | 3.1 | 1.2 | / |
MSH6 | 2.0 | | non-fungi | 6.7% | 794 | 44 | 11 | 0 | 0 | 30 | 7.0 | 0.6 | 0.1 | 1.1 | 0.4 | / | / |
NOP2 | 1.5 | | bacteria | 1.0% | 620 | 4 | 1 | 0 | 1 | 8.2 | 1.6 | 0.4 | 0.0 | / | 0.0 | / | / |
NOP2 | 1.5 | | animals | 82% | 56 | 48 | 1 | 0 | 0 | 2.5 | 0.8 | 0.0 | 0.0 | 10 | / | | |
NOP2 | 1.5 | | plants | 9.8% | 122 | 10 | 1 | 1 | 0 | 3.3 | 1.0 | 0.2 | 0.0 | 1.2 | 0.2 | | |
RQC1 | 1.5 | TCF25 | animals | 6.2% | 324 | 21 | 0 | 0 | 0 | 7.4 | 1.8 | 0.5 | 0.0 | 1.4 | / | | |
RQC1 | 1.5 | TCF25 | plants | 14% | 107 | 15 | 0 | 0 | 0 | 4.1 | 0.9 | 0.0 | 0.0 | 1.6 | / | | |
SEC2 | 2.5 | | non-fungi | 0.0% | 76 | 1 | 0 | 0 | 0 | 1.1 | 0.2 | 0.0 | 0.0 | / | / | / | / |
SXM1 | 2.0 | Cse1 | all | 9.9% | 565 | 44 | 20 | 1 | 0 | 37 | 7.0 | 2.0 | 0.5 | 1.1 | 1.4 | | |
UBP10 | 1.5 | USP36 | animals | 24% | 306 | 1 | 0 | 0 | 0 | 1.2 | 0.8 | 0.0 | 0.0 | / | / | / | / |
UBR1 | 1.5 | UBR4 | non-fungi | 32% | 507 | 147 | 20 | 4 | 3 | 58 | 12 | 2.2 | 0.4 | 10 | 1.3 | 0.8 | 0.8 |
VID27 | 1.5 | | non-fungi | 24% | 304 | 47 | 26 | 6 | 7 | 10 | 3.0 | 0.8 | 0.0 | 12 | 7 | 3.0 | 2.1 |
VPS13 | 2.5 | | all | 18% | 3860 | 415 | 210 | 83 | 138 | 242 | 50 | 17 | 3.0 | 55 | 58 | 38 | 29 |
Notes: Alternate names for human homologues and range of species searched are in column 3. For some families, different species groups were searched separately (indicated in column 4). Colouring, p values and exclusions as in Table 2. Where different groups of species within an overall family have different significance levels, only the most significant group is coloured (column 4). FFAT = two phenylalanines in an acidic tract. "/" indicates that χ² is not valid or only one motif was observed.
Discussion
We studied one SLiM in one organism in detail. The potential FFAT motifs we predicted need verification of ER targeting dependent on VAP. We cannot exclude the possibility that these sequences are selected for another reason. Many eukaryotic SLiMs combine aromatic and charged residues. One overlaps with the FFAT motif: “FxDxF”, which sorts endocytic cargo (Brett et al., 2002; Dinkel et al., 2016). In future, it is predicted that many more SLiMs will be discovered (Tompa et al., 2014), and some may overlap FFAT more closely.
Maximising accurate detection with a PWM
There are some issues with our PWM. Firstly, it overvalues the known VAP interactors used to create it. Position 3 of FFAT motifs exemplifies this: while the original motifs varied little here, crystal structures showed little interaction with VAP (Kaiser et al., 2005), and divergent residues were later found (Saita et al., 2009; Baron et al., 2014). This raises a second problem: the PWM weights need verification by systematic measurement of how different substitutions affect affinity, especially combinations of multiple substitutions, which have only rarely been studied (Furuita et al., 2010; Mikitova and Levine, 2012). Phage display of peptides, particularly variants of FFAT, might address this in future. For example, the finding of repeated substitution of F2 with tryptophan in Osh3 suggests the possibility that tryptophan at this position is not completely inhibitory to VAP interaction. This could be tested experimentally.
Searching for motifs in orthologues
Application of a PWM to large families of orthologues provided evidence that powerfully influenced the assessment of motifs. In some cases the OrthoDB families were too inclusive: subfamilies of proteins more closely related to Opi1p and Ubp10p produced more excess motifs that were more statistically significant. Therefore, improved accuracy in curating orthologue families would increase sensitivity of this approach.
Nop2p is unique because the yeast protein, like almost all fungal NOPs, has a motif that fails to meet our criteria because it occurs in a helix in the methyltransferase domain. However, other motifs in a minority of Nop2 orthologues cast doubt on the exclusion. These additional motifs in an anionic IDR (Figure 5A) were significantly in excess over the background of the whole family (p=0.006, Table 2A). This has implications for the motifs in the methyltransferase domain, both in fungi (consensus EFFEANE) and mammals (hNOP2 called NSun6, consensus EFLEANE), but not in plants (consensus ELIEAFE) and bacteria (consensus ALLAALN). One explanation for the motif in yeast and human is that the motif in the methyltransferase domain has a function unrelated to VAP. An alternative explanation is that the first helix of the domain identified in crystal structures is dynamic (Figure 5B), and can switch to an unstructured loop. This scenario speculates that sequences that match FFAT motifs have been positively selected in opisthokonts in a region with variable structure that is permissive for FFAT formation. This would explain the definite but low rate of motifs with low FFAT scores in bacterial NOP2 homologues (Table 3).
Estimates of the background rate of FFAT motifs
Background levels of FFAT motifs were estimated from VAP-negative proteomes and sequence randomisation. The main issue with VAP-negative proteomes is their variability. For example, 4.5% of proteins in the halophile archaeon Haloterrigena turkmenica have FFAT scores ≤2.5. The level of low FFAT scores (more than eukaryotes) was not changed by randomising the proteome (data not shown). This effect is associated with halophiles having high proportions of acidic residues and few bases to survive in their highly saline (2-4M salt) habitat (Lanyi, 1974). VAP-negative eukaryotes may represent better controls than prokaryotes, but they are separated from the bulk of eukaryotes by long phylogenetic branches, so their proteomes may vary for other unknown reasons.
For our randomisations, all residues in each protein were treated the same. In future we might attempt to improve this, randomising residues in folded domains separately from those in IDRs, which are enriched for charged residues (Davey et al., 2012). A different way to avoid mixing IDRs with folded domains is to randomise within tiled windows longer than the motif. However, sequences rich in D/E with a minority of F/Y may produce a low FFAT score randomly. For example, the FFAT score of …DDDDDDD[F/Y]DD@DD… (where @ is A/C/S/T) is 1.0-1.5. Therefore, sliding windows would over-estimate background levels of FFAT motifs because of their inherent repetitive nature.
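The two randomisation schemes discussed above can be sketched as follows. This is a minimal illustration, not the authors' script: the function names, the window length and the toy hit counter in the usage note are all assumptions, and any motif-counting function could be passed in place of the toy one.

```python
import random
import statistics

def shuffle_protein(seq, rng):
    """Shuffle all residues of one protein, preserving its composition."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)

def shuffle_tiled(seq, window, rng):
    """Shuffle residues only within consecutive, non-overlapping windows.

    Local composition is kept, so acidic disordered stretches are not
    mixed with residues from folded domains.
    """
    return "".join(
        shuffle_protein(seq[i:i + window], rng)
        for i in range(0, len(seq), window)
    )

def background(seqs, count_hits, n_repeats=10, seed=0):
    """Mean and standard deviation of hits over repeated randomisations.

    count_hits is any function mapping a sequence to a number of motifs.
    """
    rng = random.Random(seed)
    totals = [
        sum(count_hits(shuffle_protein(s, rng)) for s in seqs)
        for _ in range(n_repeats)
    ]
    return statistics.mean(totals), statistics.stdev(totals)
```

For example, `background(proteome, lambda s: s.count("FF"))` returns the mean and standard deviation of a toy count over ten shuffles. Replacing `shuffle_protein` with `shuffle_tiled` inside `background` would give the windowed variant, which by the argument above is expected to return more low-scoring motifs in acidic stretches.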
Evolution of SLiMs
The ability of FFAT motifs to arise simply by the appearance in an acidic tract of F/Y and a small residue at +3 shows how FFAT motifs might evolve, especially as Y and A can both result from point mutations of D codons. We assume that motifs that occur in isolation have evolved relatively recently. Like any mutation, they may have no productive function initially, and could be counter-selected if deleterious. This may have occurred in the highly anionic motor protein Dyn1, orthologues of which have many fewer motifs than the same sequences randomised.
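The statement that Y and A are each one point mutation away from D can be checked by enumerating single-nucleotide changes in the standard genetic code. The sketch below builds the codon table from the conventional TCAG ordering; the helper name `point_mutants` is illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def point_mutants(aa):
    """Residues reachable from any codon of `aa` by one nucleotide change."""
    reachable = set()
    for codon, residue in CODON_TABLE.items():
        if residue != aa:
            continue
        for pos in range(3):
            for base in BASES:
                if base != codon[pos]:
                    mutant = codon[:pos] + base + codon[pos + 1:]
                    reachable.add(CODON_TABLE[mutant])
    reachable.discard(aa)
    return reachable

print(sorted(point_mutants("D")))
```

Both Y and A appear in the set for D, whereas F does not, since reaching F from a D codon needs two changes. By the same enumeration, E codons can reach A but not Y in a single step.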
Among the eukaryotes we studied, A. thaliana lacks proteins with the lowest FFAT scores (Figure 2B). Could the plant VAP-FFAT interaction be marginally different from yeast/human (opisthokonts)? Comparison of motifs in A. thaliana with those in yeast and human showed that the cores were similar, but upstream residues were less acidic in plants (data not shown). This suggests that plant VAP might have a compensatory difference from yeast and human VAP.
Can VAP bind ~0.9% of the proteome?
There are pleiotropic phenotypes in cells lacking VAP that are consistent with dysfunction of many pathways in combination (Loewen and Levine, 2005; Di Mattia et al., 2018). However, the finding that cells still grow without VAP implies that loss of VAP-FFAT interactions mainly leads to reduced efficiency, which can be compensated by over-expression (Hanada et al., 2003). There are far fewer copies of VAP than the sum of its interactors, so only a fraction of FFAT motifs can be active at any one time. Part of the explanation might be that FFAT motifs interact only weakly (micromolar Kd), so they only bind VAP when multimerised. In addition, they may be masked by intramolecular interactions regulated by post-translational modifications (Alli-Balogun and Levine, 2019), as indicated by considerable diffuse background of GFP-tagged FFAT positive proteins (Breker et al., 2014). If the active FFAT motifs in total saturate VAP, then competitive binding will depend on both affinity (related to FFAT score) and avidity determined by local concentration. Thus, interaction may favour proteins concentrated near the ER, such as on an adjacent organelle.
FFAT motifs at contact sites in yeast
Most of the proteins with FFAT motifs newly predicted here are located on organelles that are already known to use VAP-FFAT interactions to communicate with the ER, either in yeast or other cells (Figure 4). Some are involved in functions apparently outside the ER, but their location allows them to access it, for example the intranuclear proteins Sxm1 and Msh6, which can access VAP that diffuses to the inner nuclear envelope (Beilharz et al., 2003; Brickner and Walter, 2004; James et al., 2019).
There are two yeast organelles that contain proteins with putative FFAT motifs where ER contact is not well characterised: nucleolus and eisosome. For the nucleolus, we had four hits. Among these, Erb1p and Nop2p localise partly to the nuclear envelope, so may be constitutively associated with VAP, while Kri1p and Ubp10p are diffuse in the nucleolus (Breker et al., 2014). For the eisosome, Seg2p has a motif with a low FFAT score, as does its paralogue Seg1p. Eisosomes are grooves 50 nm deep and 1 μm long in the yeast plasma membrane that sense tension (Berchtold et al., 2012). The majority of eisosomes are well separated from the nearest element of the cortical ER network (150-250 nm). This has been the basis for considering eisosomes to be separated from the ER (Stradalova et al., 2012). However, ~20% of eisosomes are ≤50 nm distant from cortical ER (Stradalova et al., 2012), making it possible for Seg2p or Seg1p to bridge from eisosomes to ER. Mutating putative FFAT motifs in both Seg1p and Seg2p could probe this potential ER-eisosome bridging.
Contact sites in humans and beyond
Conserved ER-nucleolar relationships are already known for the integral inner nuclear envelope protein RRS1 both in yeast (Horigome et al., 2011) and human cells, where it mediates targeting of ER proteins to nucleoli (Sutherland et al., 2004; Carnemolla et al., 2009). We found an excess of sequences with low FFAT scores in nucleolar proteins in different kingdoms: ERB1 in all eukaryotes, KRI1 in fungi and plants, and NOP2 (with the caveats above) in fungi and animals (Table 3). These should be the focus of future study into the relationship between the ER and nucleolus. Given that proteins with FFAT motifs typically engage VAP when they function to form bridges between the ER and other compartments, we speculate that there is ER-nucleolar bridging. ER-nucleolus bridging in yeast is easy to conceptualise, because the nucleolus forms a crescent adherent to ~50% of the inner nuclear envelope (Thiry and Lafontaine, 2005). By contrast, nucleoli in plants and animals are spherical structures deep in the nucleoplasm. Despite this, ER-nucleolar contact could be achieved by the tubular nucleoplasmic reticulum, which extends from the inner nuclear envelope deep into the nucleus of animal and plant cells (Malhas et al., 2011; Ohsaki et al., 2016). Our data suggest that nucleoplasmic reticulum may directly contact nucleoli. Alternatively, there may be no bridging, and the nucleolar proteins with FFAT motifs have inactive pools targeted to the nuclear periphery. Therefore, motifs with low FFAT scores in proteins with unexpected locations (nucleoli and likewise eisosomes) prompt further study of the relationship between the ER and other organelles.
Methods
Producing FFAT scores
The scoring system is identical to that used previously (Murphy and Levine, 2016). In brief, to produce FFAT scores, suboptimal substitutions in any core of 7 residues were allocated a score in a PWM (Table 4A). Some sequences in databases contain undetermined residues (identified as “X”). These were scored as inhibiting VAP binding. The 6 residues upstream were assessed as a single unit for charge (Table 4B), and the estimate of overall negative charge was scaled (Table 4C) and combined with the core to create the FFAT score. This represents the sum total of sub-optimal scores across 13 contiguous residues. FFAT scores were binned into 0.5 unit bins: “0” = 0-0.25, “0.5” = 0.251-0.75, “1.0” = 0.751-1.25, etc. For proteomes and families of sequences the scoring system was implemented in Python. The code for the scripts will be made available on GitHub or on request from the authors.
Table 4. Suboptimal Scores for Substitutions in Core of FFAT motif.
(a) Score for core residues

Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
Ideal | E | F | F | D | A | x | E |
A | 1 | 4 | 1 | 2 | 0 | 0 | 1 |
C | 1 | 4 | 1 | 2 | 0 | 0 | 1 |
D | 0 | 4 | 1 | 0 | 4 | 0 | 0 |
E | 0 | 4 | 1 | 0 | 4 | 0 | 0 |
F | 1 | 0 | 0 | 2 | 2 | 0 | 1 |
G | 1 | 4 | 1 | 2 | 1 | 0 | 1 |
H | 1 | 4 | 0.7 | 2 | 2 | 0 | 1.5 |
I | 1 | 4 | 1 | 2 | 2 | 0 | 1 |
K | 1.5 | 4 | 1 | 2 | 2.5 | 0 | 1.5 |
L | 1 | 4 | 1 | 2 | 2 | 0 | 1 |
M | 1 | 4 | 1 | 2 | 2 | 0 | 1 |
N | 1 | 4 | 1 | 2 | 2 | 0 | 1 |
P | 1 | 4 | 1 | 2 | 2 | 0 | 1 |
Q | 1 | 4 | 1 | 2 | 2 | 0 | 1 |
R | 2 | 4 | 1 | 2 | 3 | 0 | 1.5 |
S | 0.5 | 4 | 1 | 0.5 | 0.5 | 0 | 0.5 |
T | 0.5 | 4 | 1 | 0.5 | 0.5 | 0 | 0.5 |
V | 1 | 4 | 1 | 2 | 2 | 0 | 1 |
W | 1 | 2 | 0.7 | 2 | 2 | 0 | 1 |
Y | 1 | 0 | 0.5 | 2 | 2 | 0 | 1 |
X | 2 | 4 | 2 | 2 | 2 | 2 | 2 |

(b) Acidic tract (N-terminal to core): charge

Position | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
Ideal | D/E | D/E | D/E | D/E | D/E | D/E |
D/E | +1 | +1 | +1 | +1 | +1 | +1 |
S/T | +0.5 | +0.5 | +0.5 | +0.5 | +0.5 | +0.5 |
K/R | -1 | -1 | -1 | -1 | -1 | -1 |
Others | 0 | 0 | 0 | 0 | 0 | 0 |

(c) Convert charge into score

Charge | ≥4 | 3-3.5 | 2-2.5 | ≤1.5 |
---|---|---|---|---|
Score | 0 | 0.5 | 1 | 1.5 |
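The scoring scheme in Table 4 can be sketched in a few lines of Python. This is an illustrative reading of the table, not the authors' script: the function names are hypothetical, residues absent from the table are treated like “X”, and upstream positions missing near the N-terminus are treated as “Others” (both assumptions).

```python
import math

# Core PWM transcribed from Table 4(a): penalties at core positions 1-7.
_GENERIC = (1, 4, 1, 2, 2, 0, 1)
CORE_PWM = {aa: _GENERIC for aa in "ILMNPQV"}
CORE_PWM.update({
    "A": (1, 4, 1, 2, 0, 0, 1),
    "C": (1, 4, 1, 2, 0, 0, 1),
    "D": (0, 4, 1, 0, 4, 0, 0),
    "E": (0, 4, 1, 0, 4, 0, 0),
    "F": (1, 0, 0, 2, 2, 0, 1),
    "G": (1, 4, 1, 2, 1, 0, 1),
    "H": (1, 4, 0.7, 2, 2, 0, 1.5),
    "K": (1.5, 4, 1, 2, 2.5, 0, 1.5),
    "R": (2, 4, 1, 2, 3, 0, 1.5),
    "S": (0.5, 4, 1, 0.5, 0.5, 0, 0.5),
    "T": (0.5, 4, 1, 0.5, 0.5, 0, 0.5),
    "W": (1, 2, 0.7, 2, 2, 0, 1),
    "Y": (1, 0, 0.5, 2, 2, 0, 1),
    "X": (2, 4, 2, 2, 2, 2, 2),
})
# Table 4(b): charge contribution of each of the 6 upstream residues.
TRACT_CHARGE = {"D": 1, "E": 1, "S": 0.5, "T": 0.5, "K": -1, "R": -1}

def tract_score(upstream):
    """Table 4(c): convert summed upstream charge into a penalty."""
    charge = sum(TRACT_CHARGE.get(aa, 0) for aa in upstream)
    if charge >= 4:
        return 0.0
    if charge >= 3:
        return 0.5
    if charge >= 2:
        return 1.0
    return 1.5

def ffat_score(seq, start):
    """Score the 7-residue core at `start` plus the 6 residues upstream."""
    seq = seq.upper()
    core = seq[start:start + 7]
    if len(core) < 7:
        raise ValueError("core must contain 7 residues")
    core_penalty = sum(
        CORE_PWM.get(aa, CORE_PWM["X"])[i] for i, aa in enumerate(core)
    )
    return core_penalty + tract_score(seq[max(0, start - 6):start])

def best_ffat_score(seq):
    """Lowest score over all core positions, returned as (score, start)."""
    return min((ffat_score(seq, i), i) for i in range(len(seq) - 6))

def bin_score(score):
    """Bin into 0.5-unit bins: 0-0.25 -> 0, 0.251-0.75 -> 0.5, etc."""
    return max(0, math.ceil((score - 0.25) / 0.5)) * 0.5
```

As a check against the text, `ffat_score("DDDDDDDFDDADD", 6)` gives 1.0 and the same sequence with S in place of A gives 1.5, matching the 1.0-1.5 range quoted in the Discussion for F/Y in an acidic tract with a small residue at +3.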
Sequences
Proteome-wide sequences were obtained from Uniprot. Control proteomes were: bacteria – E. coli, M. tuberculosis, S. aureus, M. hominis; archaea – Pyrococcus horikoshii, Methanosarcina mazei, Thermoplasma acidophilum, Archaeoglobus fulgidus, Sulfolobus solfataricus, Methanocaldococcus jannaschii; VAP negative eukaryotes – Euglena gracilis, the diplomonad Spironucleus salmonicida, and the kinetoplastid Paratrypanosoma confusum. Randomisation was carried out within each protein, repeated 10-fold to obtain a mean and standard deviation for FFAT scores. Families of proteins orthologous to yeast hits were obtained from Uniprot using the OrthoDB identifier of the yeast protein (Supplementary Table 1D). All lists were filtered to remove fragments, and also with “identity:0.9” filter, or “identity:0.5” if there were more than 1000 sequences. Where no OrthoDB family was available (Ypt7p, Seg2p), we searched for homologues using BLAST filtered to nr90 (Ypt7p), or for iterative searches we used HHblits 3 iterations (Seg2p) at the Tuebingen Toolkit (Zimmermann et al., 2018). Similarly subfamilies made by KEGG and OMA (Kanehisa et al., 2004; Schneider et al., 2007) used the KEGG/OMA identifiers. For OSBP homologues to submit to MEME, non-redundant proteins containing the Pfam domain PF01237 (Oxysterol_BP) were downloaded from Uniprot, and submitted (at meme-suite.org/tools/meme), with the resulting matrix forwarded to FIMO (http://meme-suite.org/tools/fimo) (Bailey et al., 2009).
Analysis of potential motifs in yeast
Criteria applied to each potential FFAT motif included two used previously: location and secondary structure (Murphy and Levine, 2016). Location was determined by combining two sources of data: (a) topology prediction from signal sequences (SignalP 4.0) and transmembrane domains (TMHMM2.0 and TOPCONS) (Kahsay et al., 2005; Petersen et al., 2011; Tsirigos et al., 2015); and (b) reconsideration of topology predicted in (a) in the light of functional requirements implied by domain analysis in HHsearch (Soding et al., 2005), which is available online at the Tuebingen Toolkit as HHpred (Zimmermann et al., 2018). Predicted secondary structure in three states (helix, sheet, unstructured loop) was produced by PSI-PRED 3.0 as part of HHsearch (Buchan et al., 2013).
For the third criterion, conservation of the motif in closely related species, excess motifs with low FFAT scores in families of orthologues were estimated by comparing numbers of observed motifs with FFAT scores ≤2.5 (and ≤2.0, ≤1.5, ≤1.0 in parallel) with the average numbers of motifs in randomised proteins by the “N-1” χ² test, which compares (ad-bc)²(N-1)/(mnrs) with the χ² distribution with 1 degree of freedom, where a, b, c and d are the four cells of the 2×2 table, N is their total, and m, n, r and s are the row and column totals, as described (Campbell, 2007). Multiple incidences of motifs with FFAT scores ≤2.5 per protein were included, i.e. statistics refer to every residue, with no reduction to the most optimal FFAT score per protein. The most significant of the 4 tests (excess ≤2.5, ≤2.0, ≤1.5 or ≤1.0) was assumed to apply (Table 2/3 etc.). A cut-off p≤10^-2 was used (Table 2, final 4 columns: cut-off ≥2 when expressed as -log10). P-values between 10^-2 and 10^-3 were considered borderline. Any protein family with only one motif or fewer motifs than the background at a particular FFAT score was excluded as having too low numbers to assess, and with only two motifs it was considered of borderline significance. The level of motifs with low FFAT scores in all randomised families together was similar to that in the randomised yeast proteome (data not shown).
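A minimal sketch of the “N-1” χ² test is given below, using only the Python standard library. How the 2×2 table is filled from motif counts is not spelled out in the text, so the layout in the usage note (observed versus randomised family in rows, with-motif versus without-motif in columns) and the example counts are assumptions for illustration only.

```python
import math

def n_minus_1_chi2(a, b, c, d):
    """'N-1' chi-squared test for a 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    total = a + b + c + d
    row1, row2 = a + b, c + d      # row totals (m, n)
    col1, col2 = a + c, b + d      # column totals (r, s)
    if 0 in (row1, row2, col1, col2):
        raise ValueError("test is undefined when a row or column total is zero")
    stat = (a * d - b * c) ** 2 * (total - 1) / (row1 * row2 * col1 * col2)
    # Survival function of chi-squared with 1 d.f.: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: 10 of 100 observed sequences carry a motif,
# against 2 of 100 after randomisation.
stat, p = n_minus_1_chi2(10, 90, 2, 98)
print(f"chi2 = {stat:.2f}, p = {p:.3f}, -log10(p) = {-math.log10(p):.1f}")
```

With these made-up counts the result falls short of the p≤10^-2 cut-off, so such a family would not meet the conservation criterion.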
For data mining, documented physical interactions of VAPA/VAPB/MOSPD2 and Scs2/Scs22 were obtained from BIOGRID. Where published literature is not informative on intracellular targeting, we used high-throughput localisation studies at the LOQATE database (Breker et al., 2014), which includes genome-wide libraries of all yeast proteins tagged with fluorescent proteins at both N- and C-termini (Weill et al., 2018).
Analysis of potential motifs in non-fungal relatives of yeast hits
Twenty lists of non-fungal homologues of 14 yeast hits were constructed from UNIREF (non-redundant as above) based either on OrthoDB families, or on combinations of domains (Supplementary Table 1E). Sequence fragments were removed, and sequences filtered for non-redundancy as above. Observed motifs with low FFAT scores were compared with the number of motifs after sequence randomisation (n=10), including multiple incidences per protein as above.
Supplementary Material
Acknowledgments
We thank Martin Frisher (Keele University) and Matt Hayes (UCL) for helpful discussions. This work was supported by a grant from the Bioinformatics and Biological Resources Fund of the Biotechnology and Biological Sciences Research Council UK (BB/M011801/1).
Abbreviations
- ER: endoplasmic reticulum
- FFAT: two phenylalanines in an acidic tract
- IDR: intrinsically disordered region
- MSP: major sperm protein
- ORP: OSBP-related protein
- OSBP: oxysterol binding protein
- PUF: protein of unknown function
- PWM: position weight matrix
- SLiM: short linear motif
- TMH: transmembrane helix
- VAMP: vesicle-associated membrane protein
- VAP: VAMP-associated protein
References
- Alli-Balogun GO, Levine TP. Regulation of targeting determinants in interorganelle communication. Curr Opin Cell Biol. 2019;57:106–114. doi: 10.1016/j.ceb.2018.12.010. [DOI] [PubMed] [Google Scholar]
- Alpy F, Rousseau A, Schwab Y, Legueux F, Stoll I, Wendling C, Spiegelhalter C, Kessler P, Mathelin C, Rio MC, et al. STARD3 or STARD3NL and VAP form a novel molecular tether between late endosomes and the ER. J Cell Sci. 2013;126:5500–5512. doi: 10.1242/jcs.139295. [DOI] [PubMed] [Google Scholar]
- Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 2009;37:W202–208. doi: 10.1093/nar/gkp335. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Baron Y, Pedrioli PG, Tyagi K, Johnson C, Wood NT, Fountaine D, Wightman M, Alexandru G. VAPB/ALS8 interacts with FFAT like proteins including the p97 cofactor FAF1 and the ASNA1 ATPase. BMC Biol. 2014;12:39. doi: 10.1186/1741-7007-12-39. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bean BDM, Dziurdzik SK, Kolehmainen KL, Fowler CMS, Kwong WK, Grad LI, Davey M, Schluter C, Conibear E. Competitive organelle-specific adaptors recruit Vps13 to membrane contact sites. J Cell Biol. 2018;217:3593–3607. doi: 10.1083/jcb.201804111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Beilharz T, Egan B, Silver PA, Hofmann K, Lithgow T. Bipartite signals mediate subcellular targeting of tail-anchored membrane proteins in Saccharomyces cerevisiae. J Biol Chem. 2003;278:8219–8223. doi: 10.1074/jbc.M212725200. [DOI] [PubMed] [Google Scholar]
- Berchtold D, Piccolis M, Chiaruttini N, Riezman I, Riezman H, Roux A, Walther TC, Loewith R. Plasma membrane stress induces relocalization of Slm proteins and activation of TORC2 to promote sphingolipid synthesis. Nat Cell Biol. 2012;14:542–547. doi: 10.1038/ncb2480. [DOI] [PubMed] [Google Scholar]
- Breker M, Gymrek M, Moldavski O, Schuldiner M. LoQAtE--Localization and Quantitation ATlas of the yeast proteomE. A new tool for multiparametric dissection of single-protein behavior in response to biological perturbations in yeast. Nucleic Acids Res. 2014;42:D726–730. doi: 10.1093/nar/gkt933. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Brett TJ, Traub LM, Fremont DH. Accessory protein recruitment motifs in clathrin-mediated endocytosis. Structure. 2002;10:797–809. doi: 10.1016/s0969-2126(02)00784-0. [DOI] [PubMed] [Google Scholar]
- Brickner JH, Walter P. Gene Recruitment of the Activated INO1 Locus to the Nuclear Membrane. PLoS Biol. 2004;2:E342. doi: 10.1371/journal.pbio.0020342. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Buchan DW, Minneci F, Nugent TC, Bryson K, Jones DT. Scalable web services for the PSIPRED Protein Analysis Workbench. Nucleic Acids Res. 2013;41:W349–357. doi: 10.1093/nar/gkt381. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Campbell I. Chi-squared and Fisher-Irwin tests of two-by-two tables with small sample recommendations. Stat Med. 2007;26:3661–3675. doi: 10.1002/sim.2832. [DOI] [PubMed] [Google Scholar]
- Carnemolla A, Fossale E, Agostoni E, Michelazzi S, Calligaris R, De Maso L, Del Sal G, MacDonald ME, Persichetti F. Rrs1 is involved in endoplasmic reticulum stress response in Huntington disease. J Biol Chem. 2009;284:18167–18173. doi: 10.1074/jbc.M109.018325. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chao JT, Wong AK, Tavassoli S, Young BP, Chruscicki A, Fang NN, Howe LJ, Mayor T, Foster LJ, Loewen CJ. Polarization of the endoplasmic reticulum by ER-septin tethering. Cell. 2014;158:620–632. doi: 10.1016/j.cell.2014.06.033. [DOI] [PubMed] [Google Scholar]
- Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, O'Donnell L, Oster S, Theesfeld C, Sellam A, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45:D369–D379. doi: 10.1093/nar/gkw1102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Clairfeuille T, Norwood SJ, Qi X, Teasdale RD, Collins BM. Structure and Membrane Binding Properties of the Endosomal Tetratricopeptide Repeat (TPR) Domain-containing Sorting Nexins SNX20 and SNX21. J Biol Chem. 2015;290:14504–14517. doi: 10.1074/jbc.M115.650598. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Costaguta G, Duncan MC, Fernandez GE, Huang GH, Payne GS. Distinct roles for TGN/endosome epsin-like adaptors Ent3p and Ent5p. Mol Biol Cell. 2006;17:3907–3920. doi: 10.1091/mbc.e06-05-0410. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Costello JL, Castro IG, Hacker C, Schrader TA, Metz J, Zeuschner D, Azadi AS, Godinho LF, Costina V, Findeisen P, et al. ACBD5 and VAPB mediate membrane associations between peroxisomes and the ER. J Cell Biol. 2017;216:331–342. doi: 10.1083/jcb.201607055. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Danson CM, Pearson N, Heesom KJ, Cullen PJ. Sorting nexin-21 is a scaffold for the endosomal recruitment of huntingtin. J Cell Sci. 2018;131 doi: 10.1242/jcs.211672. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Davey NE, Seo MH, Yadav VK, Jeon J, Nim S, Krystkowiak I, Blikstad C, Dong D, Markova N, Kim PM, Ivarsson Y. Discovery of short linear motif-mediated interactions through phage display of intrinsically disordered regions of the human proteome. FEBS J. 2017;284:485–498. doi: 10.1111/febs.13995. [DOI] [PubMed] [Google Scholar]
- Davey NE, Van Roey K, Weatheritt RJ, Toedt G, Uyar B, Altenberg B, Budd A, Diella F, Dinkel H, Gibson TJ. Attributes of short linear motifs. Mol Biosyst. 2012;8:268–281. doi: 10.1039/c1mb05231d. [DOI] [PubMed] [Google Scholar]
- De Vos KJ, Morotz GM, Stoica R, Tudor EL, Lau KF, Ackerley S, Warley A, Shaw CE, Miller CC. VAPB interacts with the mitochondrial protein PTPIP51 to regulate calcium homeostasis. Hum Mol Genet. 2011;21:1299–1311. doi: 10.1093/hmg/ddr559. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Di Mattia T, Wilhelm LP, Ikhlef S, Wendling C, Spehner D, Nomine Y, Giordano F, Mathelin C, Drin G, Tomasetto C, Alpy F. Identification of MOSPD2, a novel scaffold for endoplasmic reticulum membrane contact sites. EMBO Rep. 2018;19 doi: 10.15252/embr.201745453. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dinkel H, Van Roey K, Michael S, Kumar M, Uyar B, Altenberg B, Milchevskaya V, Schneider M, Kuhn H, Behrendt A, et al. ELM 2016--data update and new functionality of the eukaryotic linear motif resource. Nucleic Acids Res. 2016;44:D294–300. doi: 10.1093/nar/gkv1291. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dong R, Saheki Y, Swarup S, Lucast L, Harper JW, De Camilli P. Endosome-ER Contacts Control Actin Nucleation and Retromer Function through VAP-Dependent Regulation of PI4P. Cell. 2016;166:408–423. doi: 10.1016/j.cell.2016.06.037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Encinar Del Dedo J, Idrissi FZ, Fernandez-Golbano IM, Garcia P, Rebollo E, Krzyzanowski MK, Grotsch H, Geli MI. ORP-Mediated ER Contact with Endocytic Sites Facilitates Actin Polymerization. Dev Cell. 2017;43:588–602 e586. doi: 10.1016/j.devcel.2017.10.031. [DOI] [PubMed] [Google Scholar]
- Fidler DR, Murphy SE, Courtis K, Antonoudiou P, El-Tohamy R, Ient J, Levine TP. Using HHsearch to tackle proteins of unknown function: A pilot study with PH domains. Traffic. 2016;17:1214–1226. doi: 10.1111/tra.12432. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Friedman JR, Dibenedetto JR, West M, Rowland AA, Voeltz GK. Endoplasmic reticulum-endosome contact increases as endosomes traffic and mature. Mol Biol Cell. 2013;24:1030–1040. doi: 10.1091/mbc.E12-10-0733. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Furuita K, Jee J, Fukada H, Mishima M, Kojima C. Electrostatic interaction between oxysterol-binding protein and VAMP-associated protein A revealed by NMR and mutagenesis studies. J Biol Chem. 2010;285:12961–12970. doi: 10.1074/jbc.M109.082602. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hanada K, Kumagai K, Yasuda S, Miura Y, Kawano M, Fukasawa M, Nishijima M. Molecular machinery for non-vesicular trafficking of ceramide. Nature. 2003;426:803–809. doi: 10.1038/nature02188. [DOI] [PubMed] [Google Scholar]
- Hein MY, Hubner NC, Poser I, Cox J, Nagaraj N, Toyoda Y, Gak IA, Weisswange I, Mansfeld J, Buchholz F, et al. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell. 2015;163:712–723. doi: 10.1016/j.cell.2015.09.053. [DOI] [PubMed] [Google Scholar]
- Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, Adams SL, Millar A, Taylor P, Bennett K, Boutilier K, et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature. 2002;415:180–183. doi: 10.1038/415180a. [DOI] [PubMed] [Google Scholar]
- Horigome C, Okada T, Shimazu K, Gasser SM, Mizuta K. Ribosome biogenesis factors bind a nuclear envelope SUN domain protein to cluster yeast telomeres. EMBO J. 2011;30:3799–3811. doi: 10.1038/emboj.2011.267. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hua R, Cheng D, Coyaud E, Freeman S, Di Pietro E, Wang Y, Vissa A, Yip CM, Fairn GD, Braverman N, et al. VAPs and ACBD5 tether peroxisomes to the ER for peroxisome maintenance and lipid homeostasis. J Cell Biol. 2017;216:367–377. doi: 10.1083/jcb.201608128. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huttlin EL, Ting L, Bruckner RJ, Gebreab F, Gygi MP, Szpyt J, Tam S, Zarraga G, Colby G, Baltier K, et al. The BioPlex Network: A Systematic Exploration of the Human Interactome. Cell. 2015;162:425–440. doi: 10.1016/j.cell.2015.06.043. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hynynen R, Laitinen S, Kakela R, Tanhuanpaa K, Lusa S, Ehnholm C, Somerharju P, Ikonen E, Olkkonen VM. Overexpression of OSBP-related protein 2 (ORP2) induces changes in cellular cholesterol metabolism and enhances endocytosis. Biochem J. 2005;390:273–283. doi: 10.1042/BJ20042082. [DOI] [PMC free article] [PubMed] [Google Scholar]
- James C, Muller M, Goldberg MW, Lenz C, Urlaub H, Kehlenbach RH. Proteomic mapping by rapamycin-dependent targeting of APEX2 identifies binding partners of VAPB at the inner nuclear membrane. J Biol Chem. 2019 doi: 10.1074/jbc.RA118.007283. [Epub ahead of print] [DOI] [PMC free article] [PubMed] [Google Scholar]
- Johnson B, Leek AN, Sole L, Maverick EE, Levine TP, Tamkun MM. Kv2 potassium channels form endoplasmic reticulum/plasma membrane junctions via interaction with VAPA and VAPB. Proc Natl Acad Sci U S A. 2018;115:E7331–E7340. doi: 10.1073/pnas.1805757115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kahsay RY, Gao G, Liao L. An improved hidden Markov model for transmembrane protein detection and topology prediction and its applications to complete genomes. Bioinformatics. 2005;21:1853–1858. doi: 10.1093/bioinformatics/bti303. [DOI] [PubMed] [Google Scholar]
- Kaiser SE, Brickner JH, Reilein AR, Fenn TD, Walter P, Brunger AT. Structural basis of FFAT motif-mediated ER targeting. Structure. 2005;13:1035–1045. doi: 10.1016/j.str.2005.04.010. [DOI] [PubMed] [Google Scholar]
- Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M. The KEGG resource for deciphering the genome. Nucleic Acids Res. 2004;32:D277–280. doi: 10.1093/nar/gkh063. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kim I, Lee H, Han SK, Kim S. Linear motif-mediated interactions have contributed to the evolution of modularity in complex protein interaction networks. PLoS Comput Biol. 2014;10:e1003881. doi: 10.1371/journal.pcbi.1003881. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kirmiz M, Vierra NC, Palacio S, Trimmer JS. Identification of VAPA and VAPB as Kv2 Channel-Interacting Proteins Defining Endoplasmic Reticulum-Plasma Membrane Junctions in Mammalian Brain Neurons. J Neurosci. 2018;38:7562–7584. doi: 10.1523/JNEUROSCI.0893-18.2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kriventseva EV, Tegenfeldt F, Petty TJ, Waterhouse RM, Simao FA, Pozdnyakov IA, Ioannidis P, Zdobnov EM. OrthoDB v8: update of the hierarchical catalog of orthologs and the underlying free software. Nucleic Acids Res. 2015;43:D250–256. doi: 10.1093/nar/gku1220. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Krystkowiak I, Davey NE. SLiMSearch: a framework for proteome-wide discovery and annotation of functional modules in intrinsically disordered regions. Nucleic Acids Res. 2017;45:W464–W469. doi: 10.1093/nar/gkx238. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kumagai K, Hanada K. Structure, functions and regulation of CERT, a lipid-transfer protein for the delivery of ceramide at the ER-Golgi membrane contact sites. FEBS Lett. 2019 doi: 10.1002/1873-3468.13511. [DOI] [PubMed] [Google Scholar]
- Kumagai K, Kawano-Kawada M, Hanada K. Phosphoregulation of the ceramide transport protein CERT at serine 315 in the interaction with VAMP-associated protein (VAP) for inter-organelle trafficking of ceramide in mammalian cells. J Biol Chem. 2014;289:10748–10760. doi: 10.1074/jbc.M113.528380. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kumar N, Leonzino M, Hancock-Cerutti W, Horenkamp FA, Li P, Lees JA, Wheeler H, Reinisch KM, De Camilli P. VPS13A and VPS13C are lipid transport proteins differentially localized at ER contact sites. J Cell Biol. 2018 doi: 10.1083/jcb.201807019. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lanyi JK. Salt-dependent properties of proteins from extremely halophilic bacteria. Bacteriol Rev. 1974;38:272–290. doi: 10.1128/br.38.3.272-290.1974. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liu RJ, Long T, Li J, Li H, Wang ED. Structural basis for substrate binding and catalytic mechanism of a human RNA:m5C methyltransferase NSun6. Nucleic Acids Res. 2017;45:6684–6697. doi: 10.1093/nar/gkx473. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Loewen CJ, Levine TP. A highly conserved binding site in VAP for the FFAT motif of lipid binding proteins. J Biol Chem. 2005;280:14097–14104. doi: 10.1074/jbc.M500147200. [DOI] [PubMed] [Google Scholar]
- Loewen CJ, Roy A, Levine TP. A conserved ER targeting motif in three families of lipid binding proteins and in Opi1p binds VAP. EMBO J. 2003;22:2025–2035. doi: 10.1093/emboj/cdg201. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Malhas A, Goulbourne C, Vaux DJ. The nucleoplasmic reticulum: form and function. Trends Cell Biol. 2011;21:362–373. doi: 10.1016/j.tcb.2011.03.008. [DOI] [PubMed] [Google Scholar]
- McCune BT, Tang W, Lu J, Eaglesham JB, Thorne L, Mayer AE, Condiff E, Nice TJ, Goodfellow I, Krezel AM, Virgin HW. Noroviruses Co-opt the Function of Host Proteins VAPA and VAPB for Replication via a Phenylalanine-Phenylalanine-Acidic-Tract-Motif Mimic in Nonstructural Viral Protein NS1/2. MBio. 2017;8 doi: 10.1128/mBio.00668-17. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Meszaros B, Dosztanyi Z, Simon I. Disordered binding regions and linear motifs--bridging the gap between two models of molecular recognition. PLoS One. 2012;7:e46829. doi: 10.1371/journal.pone.0046829. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mikitova V, Levine TP. Analysis of the key elements of FFAT like motifs identifies new proteins that potentially bind VAP on the ER, including two AKAPs and FAPP2. PLoS One. 2012;7:e30455. doi: 10.1371/journal.pone.0030455. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Moreira KE, Schuck S, Schrul B, Frohlich F, Moseley JB, Walther TC, Walter P. Seg1 controls eisosome assembly and shape. J Cell Biol. 2012;198:405–420. doi: 10.1083/jcb.201202097. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Murphy SE, Levine TP. VAP, a Versatile Access Point for the Endoplasmic Reticulum: Review and analysis of FFAT like motifs in the VAPome. Biochim Biophys Acta. 2016;1861:952–961. doi: 10.1016/j.bbalip.2016.02.009. [DOI] [PubMed] [Google Scholar]
- Ohsaki Y, Kawai T, Yoshikawa Y, Cheng J, Jokitalo E, Fujimoto T. PML isoform II plays a critical role in nuclear lipid droplet formation. J Cell Biol. 2016;212:29–38. doi: 10.1083/jcb.201507122. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods. 2011;8:785–786. doi: 10.1038/nmeth.1701. [DOI] [PubMed] [Google Scholar]
- Phillips MJ, Voeltz GK. Structure and function of ER membrane contact sites with other organelles. Nat Rev Mol Cell Biol. 2016;17:69–82. doi: 10.1038/nrm.2015.8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Riekhof WR, Wu WI, Jones JL, Nikrad M, Chan MM, Loewen CJ, Voelker DR. An assembly of proteins and lipid domains regulates transport of phosphatidylserine to phosphatidylserine decarboxylase 2 in Saccharomyces cerevisiae. J Biol Chem. 2014;289:5809–5819. doi: 10.1074/jbc.M113.518217. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rocha N, Kuijl C, van der Kant R, Janssen L, Houben D, Janssen H, Zwart W, Neefjes J. Cholesterol sensor ORP1L contacts the ER protein VAP to control Rab7-RILP-p150 Glued and late endosome positioning. J Cell Biol. 2009;185:1209–1225. doi: 10.1083/jcb.200811005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Saita S, Shirane M, Natume T, Iemura S, Nakayama KI. Promotion of neurite extension by protrudin requires its interaction with vesicle-associated membrane protein-associated protein. J Biol Chem. 2009;284:13766–13777. doi: 10.1074/jbc.M807938200. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schneider A, Dessimoz C, Gonnet GH. OMA Browser--exploring orthologous relations across 352 complete genomes. Bioinformatics. 2007;23:2180–2182. doi: 10.1093/bioinformatics/btm295. [DOI] [PubMed] [Google Scholar]
- Skehel PA, Fabian-Fine R, Kandel ER. Mouse VAP33 is associated with the endoplasmic reticulum and microtubules. Proc Natl Acad Sci U S A. 2000;97:1101–1106. doi: 10.1073/pnas.97.3.1101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Soding J, Biegert A, Lupas AN. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 2005;33:W244–248. doi: 10.1093/nar/gki408. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stajich JE, Berbee ML, Blackwell M, Hibbett DS, James TY, Spatafora JW, Taylor JW. The fungi. Curr Biol. 2009;19:R840–845. doi: 10.1016/j.cub.2009.07.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stradalova V, Blazikova M, Grossmann G, Opekarova M, Tanner W, Malinsky J. Distribution of cortical endoplasmic reticulum determines positioning of endocytic events in yeast plasma membrane. PLoS One. 2012;7:e35132. doi: 10.1371/journal.pone.0035132. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sutherland HG, Lam YW, Briers S, Lamond AI, Bickmore WA. 3D3/lyric: a novel transmembrane protein of the endoplasmic reticulum and nuclear envelope, which is also present in the nucleolus. Exp Cell Res. 2004;294:94–105. doi: 10.1016/j.yexcr.2003.11.020. [DOI] [PubMed] [Google Scholar]
- Thiry M, Lafontaine DL. Birth of a nucleolus: the evolution of nucleolar compartments. Trends Cell Biol. 2005;15:194–199. doi: 10.1016/j.tcb.2005.02.007. [DOI] [PubMed] [Google Scholar]
- Tompa P, Davey NE, Gibson TJ, Babu MM. A million peptide motifs for the molecular biologist. Mol Cell. 2014;55:161–169. doi: 10.1016/j.molcel.2014.05.032. [DOI] [PubMed] [Google Scholar]
- Tsirigos KD, Peters C, Shu N, Kall L, Elofsson A. The TOPCONS web server for consensus prediction of membrane protein topology and signal peptides. Nucleic Acids Res. 2015;43:W401–407. doi: 10.1093/nar/gkv485. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Weill U, Yofe I, Sass E, Stynen B, Davidi D, Natarajan J, Ben-Menachem R, Avihou Z, Goldman O, Harpaz N, et al. Genome-wide SWAp-Tag yeast libraries for proteome exploration. Nat Methods. 2018;15:617–622. doi: 10.1038/s41592-018-0044-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wilk S, Wittland J, Thywissen A, Schmitz HP, Heinisch JJ. A block of endocytosis of the yeast cell wall integrity sensors Wsc1 and Wsc2 results in reduced fitness in vivo. Mol Genet Genomics. 2010;284:217–229. doi: 10.1007/s00438-010-0563-2. [DOI] [PubMed] [Google Scholar]
- Xue W, Yin Y, Ismail F, Hu C, Zhou M, Cao X, Li S, Sun X. Transcription factor CCG-8 plays a pivotal role in azole adaptive responses of Neurospora crassa by regulating intracellular azole accumulation. Curr Genet. 2019;65:735–745. doi: 10.1007/s00294-018-0924-7. [DOI] [PubMed] [Google Scholar]
- Yadav S, Thakur R, Georgiev P, Deivasigamani S, Krishnan H, Ratnaparkhi G, Raghu P. RDGBalpha localization and function at membrane contact sites is regulated by FFAT VAP interactions. J Cell Sci. 2018;131 doi: 10.1242/jcs.207985. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yakubovskaya E, Guja KE, Mejia E, Castano S, Hambardjieva E, Choi WS, Garcia-Diaz M. Structure of the essential MTERF4:NSUN4 protein complex reveals how an MTERF protein collaborates to facilitate rRNA modification. Structure. 2012;20:1940–1947. doi: 10.1016/j.str.2012.08.027. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zimmermann L, Stephens A, Nam SZ, Rau D, Kubler J, Lozajic M, Gabler F, Soding J, Lupas AN, Alva V. A Completely Reimplemented MPI Bioinformatics Toolkit with a New HHpred Server at its Core. J Mol Biol. 2018;430:2237–2243. doi: 10.1016/j.jmb.2017.12.007. [DOI] [PubMed] [Google Scholar]
- Zuzow N, Ghosh A, Leonard M, Liao J, Yang B, Bennett EJ. Mapping the mammalian ribosome quality control complex interactome using proximity labeling approaches. Mol Biol Cell. 2018;29:1258–1269. doi: 10.1091/mbc.E17-12-0714. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data