Abstract
Proteogenomics and ribosome profiling concurrently show that genes may code for both a large and one or more small proteins translated from annotated coding sequences (CDSs) and unannotated alternative open reading frames (named alternative ORFs or altORFs), respectively, but the stoichiometry between large and small proteins translated from the same gene is unknown. MIEF1, a gene recently identified as a dual-coding gene, harbors a CDS and a newly annotated and actively translated altORF located in the 5′UTR. Here, we use absolute quantification with stable isotope-labeled peptides and parallel reaction monitoring to determine levels of both proteins in two human cell lines and in human colon. We report that the main MIEF1 translational product is not the canonical 463 amino acid MiD51 protein but the small 70 amino acid alternative MiD51 protein (altMiD51). These results demonstrate the inadequacy of the single CDS concept and provide a strong argument for incorporating altORFs and small proteins in functional annotations.
Keywords: Proteogenomics, Absolute quantification, Parallel reaction monitoring, Translation*, Gene Expression*, Knockouts*, Mass Spectrometry, Mitochondria function or biology, alternative translation, short ORF
According to the traditional view of protein synthesis, each protein-coding gene harbors a single annotated open reading frame (ORF) or coding sequence (CDS) encoding a canonical protein. However, genes contain more than one ORF, and the longest ORF is generally designated as the canonical CDS in genome annotations (1). In eukaryotes, alternative splicing results in the production of several mRNAs and the translation of different isoforms, in addition to the canonical protein. Hence, the translational output of a protein-coding gene is currently assumed to be limited to a canonical protein and one or several isoforms.
This concept was recently disproved by two modern approaches to the accurate measurement of translation, ribosome profiling and proteogenomics. Ribosome profiling maps the regions of the transcriptome that are actively translated with nucleotide resolution (2). Proteogenomics approaches use customized protein databases and mass spectrometry (MS)-based proteomics to detect translated proteins (3–10). Both methods have revealed prevalent translation of ORFs outside of annotated CDSs and of out-of-frame ORFs (altORFs) (2, 11). These findings call into question the concept of the single CDS in eukaryotic mRNAs (12). In addition, they highlight the need to redefine translated sequences (13), modernize functional genome annotations with shorter ORFs (14), and reassess the translation output of protein-coding genes by considering smaller proteins in addition to larger canonical proteins. The cellular stoichiometry of a canonical protein versus a small protein encoded in the same gene, and their respective concentrations, are unknown. Yet, proteins are the primary effectors of biological processes, and deciphering the function of a gene in health and disease requires accurate characterization of its products.
The mitochondrial elongation factor 1 gene or MIEF1 also termed SMCR7L/MiD51, localized at the Chr22q13.1 locus, codes for a mitochondrial receptor of Drp1, a GTPase which functions in mitochondrial fission (15–17).
Ribosome profiling and proteogenomics studies recently demonstrated the translation of a stable 70 amino acid protein product encoded in an altORF localized in the 5′UTR (Fig. 1A) (3, 18–24). Thus, MIEF1 is a prototypical gene coding for both a large and a small protein. For simplicity, we termed this novel protein “alternative-MiD51” or altMiD51. Remarkably, both proteins are localized at the mitochondria. MiD51 is an outer mitochondrial membrane protein (17) whereas altMiD51 is located in the mitochondrial matrix (24), and both are involved in mitochondrial fission (17, 24). AltMiD51 has also been reported to be a new assembly factor of the mitochondrial ribosome and implicated in its biogenesis (25).
Here, we employ a targeted proteomics approach based on AQUA peptides to reliably quantify the absolute amount of MiD51 and altMiD51 in two human cell lines and one human tissue, and thus we establish an improved map of the translational output of MIEF1/SMCR7L/MiD51 by directly measuring the final protein products.
EXPERIMENTAL PROCEDURES
Experimental Design and Statistical Rationale
In this study, our aim is to determine the stoichiometry of two distinct proteins encoded within the gene MIEF1, the canonical protein MiD51 and altMiD51. AltMiD51 is a 70 amino acid protein coded by a short ORF in the 5′UTR region. We expect to unveil the stoichiometry of the two proteins using targeted proteomics and stable isotope-labeled synthetic peptides. To determine proteotypic peptides, we first performed AP-MS DDA experiments on both proteins. Peptides were then validated in overexpressed and endogenous conditions using targeted proteomics. Selected peptides were tested for coefficient of variation (CV) using technical replicates and were validated if their CV fell below 20%. One stable isotope-labeled peptide for each protein was purchased and tested for linearity in a peptide mixture using two technical replicates at various concentrations. Both peptides showed a linearity range between 40 amol and 250 fmol. Absolute quantification experiments were then conducted in two cell lines (three biological replicates each) and colon tissue (three technical replicates), in MiD51 knockout HeLa cells and altMiD51 knockout HeLa cells (three biological replicates each). Protein amounts were compared using Welch's two-sample t-tests.
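The CV acceptance criterion described above can be illustrated with a minimal sketch. The function names and the replicate values are hypothetical and are not taken from the study; the CV is assumed here to be the sample standard deviation divided by the mean, expressed as a percentage.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m * 100.0


def passes_cv_threshold(values, threshold=20.0):
    """Return True when the CV is strictly below the threshold (20% by default)."""
    return coefficient_of_variation(values) < threshold


# Hypothetical peak areas from technical replicates of one transition.
replicate_areas = [90.0, 100.0, 110.0]
print(coefficient_of_variation(replicate_areas), passes_cv_threshold(replicate_areas))
```

With only two or three replicates, as used here, the CV estimate is itself noisy, so the threshold is best read as a screening criterion rather than a precise measure of precision.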
Tissue Collection and Ethics
A normal human colon tissue sample was obtained from the Biobanque des maladies digestives du CRCHUS. The patient gave informed consent for the banking and use of the tissue sample. The ethics review board at the Centre intégré universitaire de santé et de services sociaux de l'Estrie - Centre hospitalier universitaire de Sherbrooke (CIUSSS de l'Estrie - CHUS) approved the use of this sample for this study. Briefly, following resection for colorectal cancer, the colon was washed thoroughly and normal tissue sampling was performed more than 10 cm from the tumor, within a region confirmed by a pathologist to be uninvolved by tumor cells (H&E staining). The fresh sample was flash frozen in liquid nitrogen within 30 min of surgical resection.
Cell Culture
Cells were grown in Dulbecco's Modified Eagle Medium (DMEM, Wisent, St-Bruno, Quebec, Canada) supplemented with 10% fetal bovine serum (FBS, Wisent) and antibiotic-antimycotic mixture (Wisent). Cells were mycoplasma free (routinely tested). For transfections, cells were grown in 100 mm Petri dishes until 80% confluent, transfected by adding 10 μg of plasmid DNA in 2 ml of FBS/antibiotics-free DMEM and 10 μl of GeneCellIn (Eurobio, Les Ulis, France), and left to grow for 24 h before cell lysis. For parallel reaction monitoring (PRM) experiments, cells were grown in 60 mm Petri dishes until about 80% confluent.
DNA Constructs
DNA constructs were generated by Gibson assembly (26) of synthetic DNA (Gblocks, IDT, Skokie, Illinois) using the NEBuilder HiFi DNA Assembly Cloning Kit (New England BioLabs, Ipswich, Massachusetts) according to the manufacturer's recommendations. DNA blocks of C-terminally LAP-tagged (27) MiD51 and GFP-tagged altMiD51 were inserted separately into pcDNA 3.1(-) expression vector (Thermo Fisher Scientific, Waltham, Massachusetts). The context construct was built by assembling the full 5′ region containing the altMiD51 coding sequence with a C-terminal 2 FLAG tag and the canonical MiD51 coding sequence with a C-terminal HA tag (transcript NM_019008.4) into pcDNA 3.1(-) expression vector. DNA sequences were verified by sequencing.
Mitochondrial Extracts
For Western blot analysis of HeLa and CRISPR-Cas9 HeLa clones as well as PRM optimizations, mitochondrial extracts were prepared according to (28) with minor modifications. Cells were grown in three 100 mm dishes until 80% confluent, rinsed twice with PBS 1× and collected using a cell scraper. Cells were pelleted by centrifugation at 500 × g for 10 min at 4 °C. Supernatant was discarded and cells were suspended in mitochondrial buffer (mito-buffer: 210 mM mannitol, 70 mM sucrose, 1 mM EDTA, 10 mM HEPES-NaOH, pH 7.5, 2 mg/ml Bovine Serum Albumin (BSA), 0.5 mM PMSF and EDTA-free protease inhibitor (Thermo Fisher Scientific)) and disrupted by passage through a 25G1 0.5 × 25 needle syringe 15 consecutive times on ice followed by a 3 min centrifugation at 2000 × g at 4 °C. Supernatant was collected and the pellet was resuspended in mitochondrial buffer. The breakage procedure was repeated four times. All four supernatants containing mitochondria were again passed through the syringe needle in mito-buffer and cleared by centrifugation for 3 min at 2000 × g at 4 °C. Supernatants were collected and centrifuged for 10 min at 13,000 × g at 4 °C to pellet mitochondria. Pellets were washed twice with BSA-free mitochondrial buffer and pooled. The final mitochondrial pellet was lysed in SDS buffer (4% SDS, Tris-HCl 100 mM pH 7.6). After sonication, protein content was assessed using BCA assay (Thermo Fisher Scientific).
Mass Spectrometry Sample Preparation
Preliminary Affinity-purification (AP)
Cells were rinsed twice with cold PBS 1× and lysed with 1 ml of AP-buffer (NP-40 0.5%, Tris-HCl 50 mM pH 7.5, NaCl 150 mM, EDTA-free protease inhibitor 1×). Lysate was cleared by centrifugation (2000 × g, 5 min) and supernatant was collected. GFP-Trap agarose beads (ChromoTek, Planegg-Martinsried) were conditioned with three consecutive PBS 1× washes followed by three AP-buffer washes. Lysate supernatant was mixed with beads and incubated at 4 °C for 18 h on a rotating device. Beads were then washed 3 times with AP-buffer and 5 times with 50 mM NH4HCO3 (ABC). Digestion was performed on beads by adding 1 μg of trypsin (Promega, Madison, Wisconsin) in 100 μl ABC at 37 °C overnight. Digestion was quenched with formic acid to a final concentration of 1% and the supernatant, containing peptides, was collected. Beads were then washed once with acetonitrile/water/formic acid (1/1/0.01 v/v) and the wash was pooled with the supernatant. Peptides were dried using a speedvac, desalted using a C18 Zip-Tip (Millipore Sigma, Etobicoke, Ontario, Canada) and resuspended in 25 μl of 1% formic acid in water prior to MS analysis.
PRM Experiments
For mitochondrial extracts, the mitochondrial pellet was lysed using SDS buffer as described above. For whole cell lysates, cells were rinsed twice with cold PBS 1× and lysed using SDS buffer. Tissue samples were homogenized using a TissueRuptor (Qiagen, Toronto, Ontario, Canada) in SDS buffer. Lysates were sonicated to reduce viscosity, followed by a 5 min centrifugation at 14,000 × g to discard debris and insoluble material. Protein content was assessed using BCA protein assay (Thermo Fisher Scientific). A total of 100 μg of protein and 1 μg of recombinant Glutathione S-transferase (GST, Schistosoma japonicum) were reduced by adding dithiothreitol to a final concentration of 50 mM and incubated 15 min at 55 °C. Lysates were prepared according to the filter aided sample preparation protocol (FASP) with minor modifications (29). Lysates were diluted with 500 μl of 8 M urea solution, transferred into a 3 kDa centrifugation device (Amicon Ultra, Millipore Sigma) and centrifuged for 30 min at 14,000 × g. After one 8 M urea wash and centrifugation, samples were diluted with 200 μl of 50 mM iodoacetamide in 8 M urea and left at room temperature in the dark for 30 min. Samples were centrifuged and washed 3 times with 8 M urea. Buffer was then exchanged for 50 mM ABC with three consecutive 200 μl washes. The final retentate was digested overnight at 37 °C with 1 μg of trypsin (Gold, Promega) in 40 μl ABC and AQUA (30) peptides (pepoTec Ultimate, Thermo Fisher Scientific). Tryptic peptides were collected by filter centrifugation followed by three ABC washes and centrifugation. Peptide-containing filtrate was concentrated using a speedvac and then acidified with formic acid to a final concentration of 1%. Peptides were desalted using a C18 Zip-Tip and dried using a speedvac.
Calf Intestinal Phosphatase Treatment
Peptides were dephosphorylated using calf intestinal phosphatase (CIP) according to (31). Briefly, 5 μg of desalted peptides were solubilized with 10 units of CIP (New England Biolabs) in 50 μl of CIP buffer (100 mM NaCl, 50 mM Tris-HCl, 10 mM MgCl2 and 1 mM DTT; pH 7.9) and incubated at 37 °C for 2 h. The mixture was acidified by adding trifluoroacetic acid (TFA) to a final concentration of 0.5%. Peptides were desalted using a C18 Zip-Tip (Millipore Sigma), dried and solubilized with 25 μl of 1% formic acid in water.
nanoLC-MS/MS Analysis
Instrument Setup
A total of 12 μl of peptide mixture was loaded onto a trap column (Acclaim PepMap100 C18 column, 0.3 mm id × 50 mm, Thermo Fisher Scientific) at a constant flow rate of 4 μl/min. Peptides were separated in a PepMap C18 nano column (75 μm × 50 cm, Thermo Fisher Scientific) using a 0–35% gradient (0–215 min) of 90% acetonitrile, 0.1% formic acid at a flow rate of 200 nL/min followed by acetonitrile wash and column re-equilibration for a total gradient duration of 4 h with a RSLC Ultimate 3000 (Thermo Fisher Scientific, Dionex). Peptides were sprayed using an EASYSpray source (Thermo Fisher Scientific) at 2 kV coupled to a quadrupole-Orbitrap (QExactive, Thermo Fisher Scientific) mass spectrometer.
Affinity Purification and MS Analysis (AP-MS)
For preliminary AP-MS of GFP-tagged constructs, the mass spectrometer was used in data-dependent acquisition (DDA) mode. Full-MS spectra within a m/z 350–1600 mass range at 70,000 resolution were acquired with an automatic gain control (AGC) target of 1e6 and a maximum accumulation time (maximum IT) of 20 ms. Fragmentation (MS/MS) of the top ten ions detected in the Full-MS scan was performed at 17,500 resolution, with an AGC target of 5e5, a maximum IT of 60 ms, a fixed first mass of 50, a 3 m/z isolation window and a normalized collision energy (NCE) of 25. Dynamic exclusion was set to 40 s.
Data-dependent Protein Identification of AP-MS Samples
Mass spectrometry RAW files were searched with Andromeda (32), the search engine implemented in MaxQuant 1.5.5.1 (33). Trypsin/P was set as digestion mode with a maximum of two missed cleavages per peptide. Oxidation of methionine and N-terminal acetylation were set as variable modifications. Carbamidomethylation of cysteine was set as fixed modification. Precursor and fragment tolerances were set at 4.5 and 20 ppm, respectively (default settings). Files were searched using a target-decoy approach (34) against the Uniprot-Human 03/2017 release (35) and GST (92,949 entries) at a 1% false discovery rate at the peptide-spectrum-match, peptide and protein levels. Peptide sequences were recovered from MaxQuant output files.
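The principle of target-decoy filtering at a fixed false discovery rate can be illustrated with the simplified sketch below. It is not the MaxQuant/Andromeda implementation: it assumes that each peptide-spectrum match is reduced to a hypothetical `(score, is_decoy)` pair and estimates the FDR as the number of decoy hits divided by the number of target hits above a score cutoff.

```python
def score_cutoff_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms: iterable of (score, is_decoy) pairs; higher scores are better.
    The FDR above a cutoff is estimated as decoys / targets.
    Returns None when no cutoff satisfies the requested FDR.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the accepted list while the estimate stays acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Hypothetical scores, for illustration only.
example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
print(score_cutoff_at_fdr(example, max_fdr=0.01))
```

The sketch ignores tied scores and the separate peptide-level and protein-level filters mentioned above, which a production search engine handles explicitly.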
PRM Method Refinement
The PRM analyses developed and applied here were Tier 2 assays. A first PRM method was set up with a large number of peptides in order to identify peptides that were detectable in overexpression as well as endogenous conditions in mitochondrial extracts. Peptide uniqueness was first checked using the neXtProt peptide uniqueness checker (36). MS/MS spectra were then manually inspected, and peptides with the highest MS intensities, no missed cleavage and high identification scores were selected for preliminary PRM peptide evaluation. The peptide list consisted of 30 mass-to-charge values corresponding to unique peptides of MiD51 (11 peptides), altMiD51 (4 peptides), HSP60 (4 peptides) and GST (5 peptides) at various charge states. The method consisted of a Full-MS spectra acquisition with an AGC target of 3e6, maximum IT of 70 ms and a resolution of 70,000, followed by an unscheduled targeted-MS2 method with an AGC target of 5e5 ions, maximum IT of 130 ms, resolution of 17,500 with a 2 m/z isolation window and an NCE of 27.
A second method was used to evaluate the signal recovered after CIP treatment on endogenous mitochondrial extracts. The peptide list consisted of 11 mass-to-charge values based on previous PRM experiments, corresponding to peptides of MiD51 (3 peptides), altMiD51 (2 peptides), HSP60 (3 peptides) and GST (3 peptides). The method consisted of a Full-MS spectra acquisition with an AGC target of 3e6, maximum IT of 70 ms and a resolution of 70,000, followed by an unscheduled targeted-MS2 method with an AGC target of 5e5 ions, maximum IT of 150 ms, resolution of 17,500 with a 2 m/z isolation window and an NCE of 27. All method optimization files were processed using Skyline (37).
High Sensitivity PRM
For endogenous CV analysis in whole cell extracts and mitochondrial extracts as well as absolute quantification experiments, the mass spectrometer was set for highest sensitivity according to (38). The method consisted of a Full-MS spectra acquisition with an AGC target of 3e6, maximum IT of 70 ms and a resolution of 70,000, followed by an unscheduled targeted MS2 method with an AGC target of 1e6 ions, maximum IT of 250 ms, resolution of 70,000 and an NCE of 27. The isolation list contained one peptide from altMiD51, one from MiD51 and their AQUA standards, as well as one peptide from the GST spike-in and one peptide from HSP60, which were used as sample processing controls.
High Sensitivity PRM Sample Analysis
Mass-spectrometry RAW files were analyzed using Xcalibur 2.2 (Thermo Fisher Scientific) by measuring the area of each peptide's monoisotopic transitions within a 3 ppm mass precision window. For AQUA peptide calibration curves, internal standards were spiked into a HeLa digest and analyzed with high sensitivity PRM in the conditions described above. For each peptide, five precursor-to-fragment transitions starting from the N terminus within a mass deviation of 3 ppm were assessed for linearity and CV analysis, considering that a transition with a CV below 20% at a given concentration is quantifiable. For endogenous CV analysis, the most quantifiable precursor-to-fragment transitions were measured for each peptide within a 3 ppm precision window, and two replicates were compared. For absolute quantification experiments, protein concentration was determined from the ratio of the endogenous peptide to the spiked-in AQUA standard and the known concentration of the standard, using the same precursor-to-fragment transitions within a 3 ppm mass precision window. Peptide ratios were kept below 25. Spectral similarity was controlled by importing RAW files into Skyline, and peptides were validated if their spectral contrast angles (39) or ratio dot products were close to 1 and their retention times matched those of the AQUA standards.
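The quantification principle described above can be sketched as follows. This is an illustration under stated assumptions, not the processing actually performed in Xcalibur or Skyline: the function names and input values are hypothetical, the ratio limit of 25 is assumed to apply symmetrically (endogenous/standard or standard/endogenous), and the similarity score shown is one common definition of a normalized spectral contrast angle, which may differ from the exact score used in the analysis.

```python
from math import acos, pi, sqrt


def endogenous_amount(light_area, heavy_area, spiked_heavy_amount):
    """Amount of endogenous peptide, in the same unit as the spiked standard.

    light_area: transition area of the endogenous (light) peptide.
    heavy_area: area of the same transition for the AQUA (heavy) standard.
    """
    if heavy_area <= 0:
        raise ValueError("heavy standard area must be positive")
    return light_area / heavy_area * spiked_heavy_amount


def ratio_within_limit(light_area, heavy_area, limit=25.0):
    """Assumed symmetric check that light and heavy signals differ by < limit-fold."""
    ratio = light_area / heavy_area
    return max(ratio, 1.0 / ratio) < limit


def normalized_spectral_contrast_angle(intensities_a, intensities_b):
    """Return 1 - 2*acos(cosine similarity)/pi; 1.0 means identical fragment patterns."""
    dot = sum(a * b for a, b in zip(intensities_a, intensities_b))
    norm_a = sqrt(sum(a * a for a in intensities_a))
    norm_b = sqrt(sum(b * b for b in intensities_b))
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # guard against rounding
    return 1.0 - 2.0 * acos(cosine) / pi


# Hypothetical values: 10 fmol of standard spiked, light signal half of heavy.
print(endogenous_amount(5.0e5, 1.0e6, 10.0))
```

Because the light and heavy forms of a peptide co-elute and fragment identically, the area ratio of a shared transition cancels most run-to-run variation, which is why the same transitions are compared for both species.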
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (40) partner repository with the data set identifier PXD008147.
CRISPR-Cas9-mediated MiD51 and altMiD51 Knockout (KO)
Knockouts Clonal Cell Generation
CRISPR-Cas9-mediated MiD51 and altMiD51 KO HeLa cells were generated according to (41) with minor modifications. Briefly, sgRNAs were designed using the Broad Institute sgRNA Designer (CRISPRko) tool (http://portals.broadinstitute.org/gpp/public/analysis-tools/sgrna-design, (42)) and confirmed with the CCTOP tool (http://crispr.cos.uni-heidelberg.de/, (43)). CRISPR-Cas9 related oligonucleotides are described in Table I. The sgRNA inserts, containing an extra G in 5′ required for the U6 RNA polymerase III promoter, were prepared by annealing the top and bottom oligos (Table I) and cloned into the pSpCas9(BB)-2A-GFP plasmid (Addgene #48138, Cambridge, Massachusetts, (41)). The resulting plasmids were verified by sequencing. Enrichment for Cas9–2A-GFP expressing cells and isolation of clonal cell populations were performed 24 h after transfection by single-cell FACS sorting. The initial validation of genome editing was done by mismatch-cleavage assay using T7 Endonuclease I (NEB) and the GenElute Mammalian Genomic DNA Miniprep Kit (Millipore Sigma) with mismatch assay primers (Table I). Genome editing was confirmed by Western blotting (Fig. 3) and by sequencing PCR amplicons derived from the target sites.
Table I. Oligonucleotide sequences used for CRISPR-Cas9 genome editing experiments.
Oligonucleotide | Sequence for altMiD51 knockout | Sequence for MiD51 knockout |
---|---|---|
Genomic target site | 5′-TGGAGCCGAGAGGCGGTGCT-3′ | 5′-CGCTGGCAGTTAAGCGGGTA-3′ |
Top oligonucleotide | 5′-CACCGAGCACCGCCTCTCGGCTCCA-3′ | 5′-CACCGTACCCGCTTAACTGCCAGCG-3′ |
Bottom oligonucleotide | 5′-AAACTGGAGCCGAGAGGCGGTGCTC-3′ | 5′-AAACCGCTGGCAGTTAAGCGGGTAC-3′ |
T7 endonuclease I mismatch assay primers | 5′-GGGGTCTCTGGAACTTGGAT-3′; 5′-TCCTTTTCTCGGTCCCTTGC-3′ | 5′-GGTCCCAGTACTTATGGCCG-3′; 5′-CCACGCAGAAAATCTCAGGG-3′ |
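The relationship between the genomic target sites and the annealed oligonucleotides in Table I can be reproduced with the short sketch below. The pattern is inferred from the sequences in Table I and is an assumption rather than a stated design rule: the top oligonucleotide is taken as CACC, the extra G, and the reverse complement of the listed target site, and the bottom oligonucleotide as AAAC, the listed target site, and a terminal C that pairs with the extra G. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def sgrna_oligos(genomic_target):
    """Return (top, bottom) oligonucleotides, both written 5' to 3'.

    Assumes the guide is the reverse complement of the listed target site,
    as observed for both knockouts in Table I.
    """
    guide = reverse_complement(genomic_target)
    top = "CACC" + "G" + guide                        # overhang + extra G + guide
    bottom = "AAAC" + reverse_complement(guide) + "C"  # overhang + guide complement + C
    return top, bottom


for target in ("TGGAGCCGAGAGGCGGTGCT", "CGCTGGCAGTTAAGCGGGTA"):
    print(sgrna_oligos(target))
```

Running the sketch on the two genomic target sites regenerates the four top and bottom oligonucleotides listed in Table I, which provides a quick consistency check of the table.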
Characterization of Heterozygote MiD51 KOs
Genomic DNA was amplified using the MiD51 mismatch assay primers carrying extensions that allowed insertion of the amplicon into linearized (EcoRI, BamHI, New England Biolabs) pcDNA 3.1(-) expression vector via Gibson assembly as mentioned above. Plasmids were purified and sequenced.
Western Blotting
For each sample, 50 μg of mitochondrial protein extract was mixed 1/1 (v/v) with Laemmli buffer (4% SDS w/v, 20% glycerol v/v, Tris-HCl 100 mM pH 6.8, 5% β-mercaptoethanol v/v) and heated at 95 °C for 5 min. For altMiD51, proteins were separated in a 4% stacking/15% acrylamide-bisacrylamide (29/1 w/w) resolving SDS-PAGE for one hour at 200 V constant voltage using a glycine buffer. For MiD51, proteins were separated in a 4% stacking/10% acrylamide-bisacrylamide (49.5% T, 3% C) resolving tricine SDS-PAGE (15, 44) gel (16 × 18 cm) for 18 h using 0.2 M Tris-HCl pH 8.9 as anode buffer and 0.1 M Tris-HCl, 0.1 M tricine, 0.1% SDS pH 8.25 as cathode buffer (25 mA constant current). Proteins were transferred onto polyvinylidene difluoride membranes. The membranes were blocked with 5% milk in Tris-buffered saline with 0.2% Tween-20 (TBST). Membranes were probed with a custom anti-altMiD51 rabbit antibody (Proteintech, Rosemont, Illinois, see below), a polyclonal anti-MiD51 rabbit antibody (Proteintech 20164–1-AP) and a mouse monoclonal anti-mitochondrial HSP 70 antibody (MA3028, Thermo Fisher Scientific) at 4 °C overnight. The membranes were then washed three times with TBST and probed with horseradish peroxidase-conjugated goat anti-mouse (sc-2005, Santa-Cruz Biotechnology, Mississauga, Ontario, Canada) or goat anti-rabbit antibodies (7074S, Cell Signaling Technology, Danvers, Massachusetts).
AltMiD51 Antibodies
Rabbit polyclonal anti-altMiD51 antibodies were raised against the full-length 70 amino acid recombinant altMiD51 protein and affinity purified (Proteintech).
Statistics
All graphics and statistics were made using R (45) 3.3.2 and ggplot2 (46) 2.2.1 or higher.
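For readers unfamiliar with Welch's two-sample t-test, the statistic and its degrees of freedom can be computed as in the sketch below. This is an illustration in Python rather than the R code used for the study, and the input values are hypothetical. The p value additionally requires the cumulative t distribution, which is provided, for example, by `t.test` in R (Welch's variant is its default) or by `scipy.stats.ttest_ind(..., equal_var=False)`.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test.

    df is the Welch-Satterthwaite approximation of the degrees of freedom.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    se2_a = variance(sample_a) / n_a  # squared standard error of mean A
    se2_b = variance(sample_b) / n_b  # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical protein amounts from three biological replicates per condition.
print(welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Unlike Student's t-test, this variant does not assume equal variances in the two groups, which is a cautious choice with only three replicates per condition.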
Cross Validation with Elongating Ribosomes
Ribo-Seq coverage of both ORFs was extracted from GWIPS (47) for HEK and HeLa cells. All nucleotides of both ORFs were considered mappable according to the Umap track of the UCSC Genome Browser (48), and ribosome densities were compared between the altMiD51 and MiD51 ORFs.
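A comparison of ribosome densities between two ORFs of different lengths can be sketched as below. The exact normalization used in the analysis is not specified here, so the sketch assumes that density is the mean per-nucleotide coverage of each ORF; the function names and the coverage values are hypothetical.

```python
def mean_density(coverage):
    """Mean Ribo-Seq coverage per nucleotide over an ORF."""
    if not coverage:
        raise ValueError("coverage must contain at least one position")
    return sum(coverage) / len(coverage)


def density_ratio(alt_orf_coverage, cds_coverage):
    """Ratio of mean ribosome density on the altORF to that on the CDS."""
    return mean_density(alt_orf_coverage) / mean_density(cds_coverage)


# Hypothetical per-nucleotide coverage vectors, for illustration only.
alt_orf = [4, 4, 4, 4]
cds = [1, 1, 1, 1, 1, 1, 1, 1]
print(density_ratio(alt_orf, cds))
```

Normalizing by ORF length matters because the CDS is several times longer than the altORF, so raw read counts alone would favor the CDS.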
RESULTS
Determination of MiD51 and altMiD51 Proteotypic Peptides
In addition to the canonical CDS (Consensus CDS CCDS13995; RefSeq NM_019008.5; Ensembl ENST00000325301) and associated protein (RefSeq NP_061881, UniProt Q9NQG6; Ensembl ENSP00000327124), human MIEF1 contains a functional and recently annotated altORF (GenBank HF548110) coding for a small protein (UniProt L0R8F8; GenBank CCO13821.1; Ensembl ENSP00000490747) (Fig. 1A and 1B). Thus, MIEF1 is clearly a prototypical dual-coding gene for which the absolute quantification of the large and small protein products is unknown. We evaluated the ability of MiD51 and altMiD51 to generate proteotypic peptides after trypsin digestion. Proteotypic peptides are specific for each protein and must be consistently detected with excellent quality precursor and fragment mass transitions (49–52). To facilitate the detection of specific tryptic peptides for both proteins, we used affinity purification coupled with mass spectrometry. MiD51GFP and altMiD51GFP were independently overexpressed in HeLa cells. Both proteins were affinity purified and analyzed via data-dependent acquisition (DDA) nano capillary liquid chromatography mass spectrometry (nanoLC-MS/MS). Several proteotypic peptides were detected for a total sequence coverage of 68.6% and 71.9% for altMiD51 and MiD51, respectively (Fig. 1B and supplemental Data S1). After manual evaluation (see Experimental Procedures), the best quality proteotypic peptides were selected for parallel reaction monitoring (PRM) optimization (38, 53).
PRM Optimization for MiD51 and altMiD51 Proteotypic Peptides
Selected proteotypic peptides were then validated in low-sensitivity PRM experiments in 3 kDa FASP processed samples (29). Indeed, altMiD51 is a small protein of 70 amino acids and a low molecular weight cut-off is necessary to ensure protein retention during sample preparation. Because both MiD51 and altMiD51 are mitochondrial proteins (24), mitochondria were isolated from mock-transfected cells and from cells transfected with a cDNA containing both the CDS coding for MiD51 and the native 5′UTR containing the altORF coding for altMiD51 (RefSeq transcript NM_019008.4). Two proteotypic peptides for each protein were detected in these mitochondrial extracts (supplemental Fig. S1). Signal intensity for an endogenous mitochondrial HSP60 peptide shows that the protein concentrations of mock and transfected mitochondrial extracts were similar, and that the intensity difference for altMiD51 and MiD51 peptides between mock-transfected and altMiD51/MiD51-transfected samples did not result from differences in mitochondria preparation.
As MiD51's most intensely detected peptide (AISAPTSPTR) bore known phosphosites (Fig. 1B), a second PRM method including a dephosphorylation step using calf intestinal phosphatase (CIP) was implemented (31). A fraction of MiD51 was phosphorylated, because CIP treatment resulted in a 20% increase in intensity for AISAPTSPTR (supplemental Fig. S2). The efficiency of CIP treatment was validated with two known HSP60 tryptic phosphorylated peptides, VGGTSDVEVNEK and VTDALNATR (phosphosite.org), with an increase in intensity of 103 and 133%, respectively. The intensity of a non-phosphorylated HSP60 peptide did not change significantly. AltMiD51 peptides are clearly nonphosphorylated because CIP treatment did not significantly change their intensity (supplemental Fig. S2). Based on these results, we selected peptides EAVLSLYR and AISAPTSPTR for absolute quantification of altMiD51 and MiD51, respectively.
Finally, the precision of the most sensitive PRM method across different samples was estimated with the measure of the coefficient of variation (CV) on mitochondrial and whole cell extracts. Indeed, a CV below 20% is required for absolute quantification (54). The CVs were systematically below 20%, indicating that both mitochondrial and whole cell extracts were suitable for quantification (supplemental Fig. S3). Even though peptide intensities are higher in mitochondrial extracts, we decided to use whole cell lysates for absolute quantification of altMiD51 and MiD51 as their preparation does not involve cell fractionation, with the risk of variable mitochondrial recovery.
AltMiD51 and MiD51 Protein Abundances
Two synthetic stable isotope-labeled peptides for absolute quantification (AQUA) (30), EAVLSLYR and AISAPTSPTR, were spiked into the protein sample after trypsin digestion from HeLa cells and analyzed via PRM (Fig. 1B). A total of 5 y ion transitions starting from the most N-terminal amino acid were measured and both peptides displayed at least one quantifiable transition within a range of 40 amol - 250 fmol and a CV < 20% (supplemental Fig. S4).
Absolute quantification PRM experiments were performed by spiking AQUA peptides with trypsin into the digestion mixture as described by (30). After desalting and dephosphorylation with CIP treatment, the resulting peptides were processed using a high sensitivity PRM method (38). For each peptide, retention times for the corresponding native and AQUA species as well as spectral contrast angles or ratio dot products (39) were controlled to ensure correct identification (Fig. 2A, 2B and supplemental Fig. S6–S12). The absolute amounts of native peptides were thus determined (supplemental Data S2).
CRISPR-Cas9-mediated Independent Inactivation of altMiD51 or MiD51
As this is the first absolute quantification of a large and a small protein encoded by two independent ORFs in the same gene, it is important to show that absolute amounts of MiD51 and altMiD51 are partially or completely abolished by inactivating their respective coding sequences. Experimental modulation of altMiD51 expression independently of MiD51 expression using an RNAi-based knockdown approach is impossible because both proteins are coded by the same gene, and both coding sequences are present in the same transcripts. This is a general challenge for the study of small and large proteins coded in the same gene (14). Thus, we implemented a CRISPR-Cas9 approach to independently prevent the expression of either altMiD51 or MiD51 (Fig. 3A and 3B) (41, 55).
Genome-edited clonal cell lines were validated by sequencing the targeted genomic region. The sequence of the PCR-amplified altMiD51 genomic region confirmed the homozygous 1 bp insertion of an A/T at position 40 of exon 2, at the Cas9 cleavage site (Fig. 3C). For MiD51, the sequence electropherogram of the PCR-amplified genomic region showed overlapping peaks (Fig. 3C), indicating the presence of heterozygous mutations in the different alleles.
AltMiD51 was completely undetectable both by Western blotting (Fig. 3D) and absolute quantification (Fig. 4A, Welch's t test p value = 0.0013), confirming successful editing of the altMiD51 ORF (Fig. 3C). Remarkably, levels of MiD51 were significantly increased in altMiD51-edited cells (Fig. 4A, Welch's t test p value = 0.0006). Although MiD51 was not detected by Western blotting in CRISPR-Cas9 MiD51-edited cells (Fig. 3D), PRM analyses showed only an 86% reduction in MiD51 levels (Fig. 4A, Welch's t test p value = 0.0004), initially suggesting that non-edited WT alleles remained. However, sequencing of the alleles of MiD51-edited HeLa cells (Fig. 3E) detected no WT sequence, suggesting that the residual PRM signal results from a 6-nucleotide (and thus 2-amino-acid) deletion in the MiD51 sequence, which gives a non-frameshifted sequence coding for a shortened protein that still contains the AQUA MiD51 peptide (Fig. 3E, blue bar). Overall, genome editing of altMiD51 and MiD51 validated the proteotypic peptides selected for absolute quantification, and confirmed the presence of two functional and physically independent coding sequences in the same gene.
Absolute Amounts and Ratio of altMiD51 to MiD51
Absolute quantification performed in HEK 293 cells, HeLa cells and colon tissue indicates that altMiD51 is the most abundant protein product of MIEF1 (Fig. 4A, supplemental Data S2). We compared absolute quantities of altMiD51 and MiD51 in these samples to determine their stoichiometric relationship. The ratio of altMiD51 to MiD51 is 2.71 in HEK 293 cells, 5.73 in HeLa cells, and 2.62 in human colon tissue (Fig. 4B and supplemental Fig. S13), showing that the most abundant translation product from MIEF1 is altMiD51 rather than the canonical MiD51 protein. This observation is consistent with our analysis of ribosome occupancy with data extracted from GWIPS (supplemental Fig. S14; supplemental Data S3).
DISCUSSION
Ribosome profiling and proteogenomics strongly support the translation of alternative protein products from altORFs in addition to the translation of canonical CDSs. Yet, the absolute quantification of a small and a large protein coded by the same gene is unknown. Here, we show that levels of the 70 amino acid altMiD51, a small protein encoded in an exon of MIEF1/SMCR7L/MiD51 originally annotated as “non-coding”, are two to six times higher than the levels of the canonical MiD51 protein in cells and in a human tissue. This work illustrates that small proteins are important contributors to the proteome, and that the absence of altORFs and alternative proteins from annotations, unlike large proteins, does not mean that they do not exist. It is very likely that this is not a general feature of altORFs and that the expression levels of small and large proteins coded by the same genes are highly variable and gene-specific. Also, protein abundance does not necessarily correlate with functional importance, and an altMiD51/MiD51 ratio > 2 does not mean that the function of altMiD51 is more significant than that of MiD51.
Several physiological processes could explain the higher ratio of altMiD51 to MiD51, including a difference in protein synthesis, a difference in protein degradation or a combination of both. Nonetheless, according to the scanning model for translation initiation, the most likely mechanism is the localization of altMiD51 upstream of MiD51 that would favor altMiD51 translation. This hypothesis is supported by ribosome profiling data aligned to the MIEF1 locus which indicate that the density of elongating ribosomes is higher on the altORF compared with the CDS (21, 23, 47), suggesting that ribosomes efficiently translate altMiD51. In addition, MIEF1 is moderately resistant to eIF2 repression in response to severe stress induced by sodium arsenite (21). Genes resistant to eIF2 repression are characterized by the presence of an efficiently translated upstream ORF and partial repression of translation of the main CDS in normal conditions, and derepression in response to environmental stresses (56). Our proteomics data agree with the dampening of MiD51 translation under physiological conditions.
CRISPR-mediated altMiD51 and MiD51 KO experiments resulted in two important observations. First, we observed that MiD51 expression was significantly increased in altMiD51 KO cells (Fig. 4A). In these cells, the single bp (A/T) insertion in the Cas9-edited altMiD51 coding sequence resulted in the truncation of the altMiD51 ORF from 210 bp to 78 bp and a parallel increase of the altMiD51-MiD51 intercistronic distance from 98 to 231 nucleotides. The combination of a shorter upstream ORF and a longer intercistronic distance was previously shown to increase re-initiation at the downstream ORF (12, 57–60). Thus, in addition to its role as a coding sequence for a novel mitochondrial fission factor (24) and an assembly factor implicated in mitoribosomal biogenesis (25), the altMiD51 ORF may function as an upstream ORF regulating the translation of MiD51. Second, MiD51 KO cells still express altMiD51 at normal levels, which demonstrates that knocking out the canonical CDS does not completely inactivate MIEF1. This result illustrates for the first time that inactivating an annotated CDS may not necessarily obliterate a gene.
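The ORF lengths and intercistronic distances quoted above are mutually consistent, which can be checked with a short calculation. This is a minimal sketch: the function name `edited_intercistronic` is hypothetical, and it assumes that the WT and edited ORF lengths are measured with the same convention and that the altMiD51 start codon and the MiD51 start codon are unchanged by the edit.

```python
def edited_intercistronic(wt_orf: int, wt_gap: int, edited_orf: int, inserted: int) -> int:
    """Intercistronic distance after an insertion that truncates the upstream ORF.

    The span from the upstream start codon to the downstream start codon
    grows only by the number of inserted nucleotides; whatever is no longer
    part of the truncated upstream ORF becomes intercistronic sequence.
    """
    span = wt_orf + wt_gap + inserted
    return span - edited_orf


# Values quoted in the text: 210 bp ORF, 98 nt gap, 78 bp truncated ORF, 1 bp insertion
print(edited_intercistronic(wt_orf=210, wt_gap=98, edited_orf=78, inserted=1))
```

The result matches the 231-nucleotide intercistronic distance reported for the altMiD51-edited cells.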
A combination of several circumstances has allowed small proteins to go unnoticed until recently. First, according to current human annotations, protein-coding genes have a single CDS, generally the longest ORF (1). Thus, all efforts to find the physiological function or the role in pathology of a specific gene are invariably focused on the protein encoded by this CDS, or one of its variants generated by alternative splicing. Second, in the absence of annotation of non-canonical ORFs, the corresponding proteins cannot be routinely detected by MS-based proteomics approaches, which rely on current protein databases containing the sequences of canonical proteins only. Third, the widely used Western blotting technique relies on specific antibodies, but antibodies have been raised and commercialized for canonical proteins only. Raising novel specific antibodies may take time and several attempts, thus delaying investigations on small proteins. Fourth, the detection of small proteins by MS-based proteomics is more challenging than for large proteins. Typically, the proteome has to be fractionated to enrich low molecular weight proteins, and the identification often relies on a single tryptic peptide (5, 9). In addition, a small protein may contain no sites for trypsin digestion, and peptides exceeding 25 aa are rarely identified in bottom-up proteomics. Fifth, because they are short, small proteins are less likely to have known protein domains discovered in large proteins, or to display a specific structure. Thus, there might exist a biased perception that small proteins have minor functions compared with large proteins in biological mechanisms. Yet, many small proteins have essential functions in prokaryotes and eukaryotes (14, 61).
AltMiD51 was integrated into the automatically annotated UniProtKB/TrEMBL database (identifier L0R8F8) in March 2013, following its detection under the name altSMCR7L (3). It was integrated into the manually annotated UniProtKB/Swiss-Prot database in March 2017. Following the example of MIEF1, which is now manually annotated as a bicistronic gene, it will be important to update genome annotations according to recent proteogenomics studies (14, 24). Indeed, the function of a dual-coding gene should not be inferred according to the molecular activity of the larger protein product only. In addition, the impact of mutations on gene function should not be analyzed in the conceptual frame of a single CDS, because mutations outside currently annotated CDSs may affect noncanonical ORFs and ultimately, gene function (62). Finally, our results clearly demonstrate that knocking out the canonical CDS in a gene and leaving altORFs unaltered does not completely abrogate the translation output of that gene.
DATA AVAILABILITY
Mass spectrometry data are available at the PRIDE repository with the data set identifier PXD008147.
Acknowledgments
We thank Michael T. Ryan and Laura Osellame for constructive exchanges, particularly on MiD51 detection by Western blotting, and François-Michel Boisvert for access to the mass spectrometer. Recombinant GST was a generous gift of Marie-Line Dubois.
Footnotes
* This study was supported by Canadian Institutes of Health Research (CIHR) grants MOP-137056 and MOP-136962, and by a Canada Research Chair in Functional Proteomics and Discovery of Novel Proteins to X.R.
This article contains supplemental Figures and Data.
1 The abbreviations used are: ORF, open reading frame.
REFERENCES
- 1. Dinger M. E., Pang K. C., Mercer T. R., and Mattick J. S. (2008) Differentiating protein-coding and noncoding RNA: challenges and ambiguities. PLoS Computational Biol. 4, e1000176.
- 2. Brar G. A., and Weissman J. S. (2015) Ribosome profiling reveals the what, when, where and how of protein synthesis. Nat. Rev. Mol. Cell Biol. 16, 651–664
- 3. Vanderperre B., Lucier J. F., Bissonnette C., Motard J., Tremblay G., Vanderperre S., Wisztorski M., Salzet M., Boisvert F.M., and Roucou X. (2013) Direct detection of alternative open reading frames translation products in human significantly expands the proteome. PLoS ONE 8, e70698.
- 4. Menschaert G., Van Criekinge W., Notelaers T., Koch A., Crappé J., Gevaert K., and Van Damme P. (2013) Deep proteome coverage based on ribosome profiling aids mass spectrometry-based protein and peptide discovery and provides evidence of alternative translation products and near-cognate translation initiation events. Mol. Cell. Proteomics 12, 1780–1790
- 5. Ma J., Ward C. C., Jungreis I., Slavoff S. A., Schwaid A. G., Neveu J., Budnik B. A., Kellis M., and Saghatelian A. (2014) Discovery of human sORF-encoded polypeptides (SEPs) in cell lines and tissue. J. Proteome Res. 13, 1757–1765
- 6. Koch A., Gawron G., Steyaert S., Ndah E., Crappé J., De Keulenaer S., De Meester E., Ma M., Shen B., Gevaert K., Van Criekinge W., Van Damme P., Menschaert G. (2014) A proteogenomics approach integrating proteomics and ribosome profiling increases the efficiency of protein identification and enables the discovery of alternative translation start sites. Proteomics 14, 2688–2698
- 7. Bazzini A. A., Johnstone T. G., Christiano R., Mackowiak S. D., Obermayer B., Fleming E. S., Vejnar C. E., Lee M. T., Rajewsky N., Walther T. C., Giraldez A. J. (2014) Identification of small ORFs in vertebrates using ribosome footprinting and evolutionary conservation. EMBO J. e201488411.
- 8. Nesvizhskii A. I. (2014) Proteogenomics: concepts, applications and computational strategies. Nat. Methods 11, 1114–1125
- 9. Ma J., Diedrich J. K., Jungreis I., Donaldson C., Vaughan J., Kellis M., Yates J. R. III, and Saghatelian A. (2016) Improved identification and analysis of small open reading frame encoded polypeptides. Anal. Chem. 88, 3967–3975
- 10. Olexiouk V., and Menschaert G. (2016) Identification of small novel coding sequences, a proteogenomics endeavor. Proteogenomics 926, 49–64
- 11. Ingolia N.T., Ghaemmaghami S., Newman J. R., and Weissman J. S. (2009) Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223
- 12. Mouilleron H., Delcourt V., and Roucou X. (2015) Death of a dogma: eukaryotic mRNAs can code for more than one protein. Nucleic Acids Res. 44, 14–23
- 13. Ingolia N. T. (2016) Ribosome footprint profiling of translation throughout the genome. Cell 165, 22–33
- 14. Delcourt V., Staskevicius A., Salzet M., Fournier I., and Roucou X. (2017) Small proteins encoded by unannotated ORFs are rising stars of the proteome, confirming shortcomings in genome annotations and current vision of an mRNA. Proteomics 18, e1700058.
- 15. Palmer C. S., Osellame L. D., Laine D., Koutsopoulos O. S., Frazier A. E., and Ryan M. T. (2011) MiD49 and MiD51, new components of the mitochondrial fission machinery. EMBO Reports 12, 565–573
- 16. Zhang Z., Liu L., Wu S., and Xing D. (2016) Drp1, Mff, Fis1, and MiD51 are coordinated to mediate mitochondrial fission during UV irradiation–induced apoptosis. FASEB J. 30, 466–476
- 17. Osellame L. D., Singh A. P., Stroud D. A., Palmer C. S., Stojanovski D., Ramachandran R., and Ryan M. T. (2016) Cooperative and independent roles of the Drp1 adaptors Mff, MiD49 and MiD51 in mitochondrial fission. J. Cell Sci. 129, 2170–2181
- 18. Lee S., Liu B., Lee S., Huang S. X., Shen B., and Qian S. B. (2012) Global mapping of translation initiation sites in mammalian cells at single-nucleotide resolution. Proc. Natl. Acad. Sci. USA 109, E2424–E2432
- 19. Kim M. S., Pinto S. M., Getnet D., Nirujogi R. S., Manda S. S., Chaerkady R., Madugundu A. K., Kelkar D. S., Isserlin R., Jain S., Thomas J. K., Muthusamy B., Leal-Rojas P., Kumar P., Sahasrabuddhe N. A., Balakrishnan L., Advani J., George B., Renuse S., Selvan L. D., Patil A. H., Nanjappa V., Radhakrishnan A., Prasad S., Subbannayya T., Raju R., Kumar M., Sreenivasamurthy S. K., Marimuthu A., Sathe G. J., Chavan S., Datta K. K., Subbannayya Y., Sahu A., Yelamanchi S. D., Jayaram S., Rajagopalan P., Sharma J., Murthy K. R., Syed N., Goel R., Khan A. A., Ahmad S., Dey G., Mudgal K., Chatterjee A., Huang T. C., Zhong J., Wu X., Shaw P. G., Freed D., Zahari M. S., Mukherjee K. K., Shankar S., Mahadevan A., Lam H., Mitchell C. J., Shankar S. K., Satishchandra P., Schroeder J. T., Sirdeshmukh R., Maitra A., Leach S. D., Drake C. G., Halushka M. K., Prasad T. S., Hruban R. H., Kerr C. L., Bader G. D., Iacobuzio-Donahue C. A., Gowda H., and Pandey A. (2014) A draft map of the human proteome. Nature 509, 575–581
- 20. Crappé J., Ndah E., Koch A., Steyaert S., Gawron D., De Keulenaer S., De Meester E., De Meyer T., Van Criekinge W., Van Damme P., Menschaert G. (2014) PROTEOFORMER: deep proteome coverage through ribosome profiling and MS integration. Nucleic Acids Res. 43, e29–e29
- 21. Andreev D. E., O'Connor P. B. F., Fahey C., Kenny E. M., Terenin I. M., Dmitriev S. E., Cormican P., Morris D. W., Shatsky I. N., and Baranov P. V. (2015) Translation of 5′ leaders is pervasive in genes resistant to eIF2 repression. Elife 4, e03971.
- 22. Sidrauski C., McGeachy A. M., Ingolia N.T., and Walter P. (2015) The small molecule ISRIB reverses the effects of eIF2α phosphorylation on translation and stress granule assembly. Elife 4, e05033.
- 23. Calviello L., Mukherjee N., Wyler E., Zauber H., Hirsekorn A., Selbach M., Landthaler M., Obermayer B., and Ohler U. Detecting actively translated open reading frames in ribosome profiling data. Nat. Methods 13, 165.
- 24. Samandi S., Roy A. V., Delcourt V., Lucier J. F., Gagnon J., Beaudoin M. C., Vanderperre B., Breton M. A., Motard J., Jacques J. F., Brunelle M., Gagnon-Arsenault I., Fournier I., Ouangraoua A., Hunting D. J., Cohen A. A., Landry C. R., Scott M. S., Roucou X. Deep transcriptome annotation enables the discovery and functional characterization of cryptic small proteins. eLife 6, e27860.
- 25. Brown A., Rathore S., Kimanius D., Aibara S., Bai X. C., Rorbach J., Amunts A., and Ramakrishnan V. (2017) Structures of the human mitochondrial ribosome in native states of assembly. Nat. Structural Mol. Biol. 24, 866–869
- 26. Gibson D. G., Young L., Chuang R. Y., Venter J. C., Hutchison C. A., and Smith H. O. (2009) Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343–345
- 27. Cheeseman I. M., and Desai A. (2005) A combined approach for the localization and tandem affinity purification of protein complexes from metazoans. Sci. STKE 2005, pl1.
- 28. Antonsson B., Montessuit S., Sanchez B., and Martinou J. C. (2001) Bax is present as a high molecular weight oligomer/complex in the mitochondrial membrane of apoptotic cells. J. Biol. Chem. 276, 11615–11623
- 29. Wisniewski J. R., Zougman A., Nagaraj N., and Mann M. (2009) Universal sample preparation method for proteome analysis. Nat. Methods 6, 359.
- 30. Gerber S. A., Rush J., Stemman O., Kirschner M. W., and Gygi S. P. (2003) Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci. U.S.A. 100, 6940–6945
- 31. Wu R., Haas W., Dephoure N., Huttlin E. L., Zhai B., Sowa M. E., and Gygi S. P. (2011) A large-scale method to measure absolute protein phosphorylation stoichiometries. Nat. Methods 8, 677–683
- 32. Cox J., Neuhauser N., Michalski A., Scheltema R. A., Olsen J. V., and Mann M. (2011) Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805
- 33. Cox J., and Mann M. (2008) MaxQuant enables high peptide identification rates, individualized ppb-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367–1372
- 34. Elias J. E., and Gygi S. P. (2007) Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat. Methods 4, 207–214
- 35. UniProt. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 45, D158–D169
- 36. Schaeffer M., Gateau A., Teixeira D., Michel P. A., Zahn-Zabal M., and Lane L. The neXtProt peptide uniqueness checker: a tool for the proteomics community. Bioinformatics 33, 3471–3472
- 37. MacLean B., Tomazela D. M., Shulman N., Chambers M., Finney G. L., Frewen B., Kern R., Tabb D. L., Liebler D. C., and MacCoss M. J. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 26, 966–968
- 38. Gallien S., Bourmaud A., Kim S. Y., and Domon B. (2014) Technical considerations for large-scale parallel reaction monitoring analysis. J. Proteomics 100, 147–159
- 39. Wan K. X., Vidavsky I., and Gross M. L. (2002) Comparing similar spectra: from similarity index to spectral contrast angle. J. Am. Soc. Mass Spectrometry 13, 85–88
- 40. Vizcaíno J. A., Csordas A., Del-Toro N., Dianes J. A., Griss J., Lavidas I., Mayer G., Perez-Riverol Y., Reisinger F., Ternent T., Xu Q. W., Wang R., Hermjakob H. (2015) 2016 update of the PRIDE database and its related tools. Nucleic Acids Res. 44, D447–D456
- 41. Ran F. A., Hsu P. D., Wright J., Agarwala V., Scott D. A., and Zhang F. (2013) Genome engineering using the CRISPR-Cas9 system. Nature Protocols 8, 2281–2308
- 42. Doench J. G., Fusi N., Sullender M., Hegde M., Vaimberg E. W., Donovan K. F., Smith I., Tothova Z., Wilen C., Orchard R., Virgin H. W., Listgarten J., Root D. E. (2016) Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat. Biotechnol. 34, 184–191
- 43. Stemmer M., Thumberger T., del Sol Keyer M., Wittbrodt J., and Mateo J. L. (2015) CCTop: an intuitive, flexible and reliable CRISPR/Cas9 target prediction tool. PLoS ONE 10, e0124633.
- 44. Schägger H. (2006) Tricine–SDS-PAGE. Nat. Protocols 1, 16–22
- 45. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2014. URL http://www.R-project.org/
- 46. Wickham H. (2016) ggplot2: elegant graphics for data analysis. Springer
- 47. Michel A. M., Fox G., M Kiran A., De Bo C., O'Connor P. B., Heaphy S. M., Mullan J. P., Donohue C. A., Higgins D. G., and Baranov P. V. (2013) GWIPS-viz: development of a ribo-seq genome browser. Nucleic Acids Res. 42, D859–D864
- 48. Karimzadeh M., Ernst C., Kundaje A., and Hoffman M. M. (2016) Umap and Bismap: quantifying genome and methylome mappability. bioRxiv 095463
- 49. Kuster B., Schirle M., Mallick P., and Aebersold R. (2005) Scoring proteomes with proteotypic peptide probes. Nat. Reviews Mol. Cell Biol. 6, 577–583
- 50. Mallick P., Schirle M., Chen S. S., Flory M. R., Lee H., Martin D., Ranish J., Raught B., Schmitt R., Werner T., Kuster B., Aebersold R. (2007) Computational prediction of proteotypic peptides for quantitative proteomics. Nat. Biotechnol. 25, 125–131
- 51. Wilhelm M., Schlegl J., Hahne H., Gholami A. M., Lieberenz M., Savitski M. M., Ziegler E., Butzmann L., Gessulat S., Marx H., Mathieson T., Lemeer S., Schnatbaum K., Reimer U., Wenschuh H., Mollenhauer M., Slotta-Huspenina J., Boese J. H., Bantscheff M., Gerstmair A., Faerber F., Kuster B. (2014) Mass-spectrometry-based draft of the human proteome. Nature 509, 582.
- 52. Zolg D. P., Wilhelm M., Schnatbaum K., Zerweck J., Knaute T., Delanghe B., Bailey D.J., Gessulat S., Ehrlich H.C., Weininger M., Yu P., Schlegl J., Kramer K., Schmidt T., Kusebauch U., Deutsch E. W., Aebersold R., Moritz R. L., Wenschuh H., Moehring T., Aiche S., Huhmer A., Reimer U., Kuster B. (2017) Building ProteomeTools based on a complete synthetic human proteome. Nat. Methods 14, 259–262
- 53. Bourmaud A., Gallien S., and Domon B. (2016) Parallel reaction monitoring using quadrupole-orbitrap mass spectrometer: Principle and applications. Proteomics 16, 2146–2159
- 54. Gallien S., Kim S. Y., and Domon B. (2015) Large-scale targeted proteomics using internal standard triggered-parallel reaction monitoring (IS-PRM). Mol. Cell. Proteomics 14, 1630–1644
- 55. Barrangou R., Fremaux C., Deveau H., Richards M., Boyaval P., Moineau S., Romero D. A., Horvath P. (2007) CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712
- 56. Young S. K., and Wek R. C. Upstream open reading frames differentially regulate gene-specific translation in the integrated stress response. J. Biol. Chem. R116.
- 57. Kozak M. (1987) Effects of intercistronic length on the efficiency of reinitiation by eucaryotic ribosomes. Mol. Cell. Biol. 7, 3438–3445
- 58. Luukkonen B. G., Tan W., and Schwartz S. (1995) Efficiency of reinitiation of translation on human immunodeficiency virus type 1 mRNAs is determined by the length of the upstream open reading frame and by intercistronic distance. J. Virol. 69, 4086–4094
- 59. Kozak M. (2001) Constraints on reinitiation of translation in mammals. Nucleic Acids Res. 29, 5226–5232
- 60. Barbosa C., Peixeiro I., and Romão L. Gene expression regulation by upstream open reading frames and human disease. PLoS Genetics 9, e1003529.
- 61. Storz G., Wolf Y. I., and Ramamurthi K. S. (2014) Small proteins can no longer be ignored. Ann. Rev. Biochem. 83, 753–777
- 62. Brunet M. A., Levesque S. A., Hunting D. J., Cohen A. A., and Roucou X. (2018) Recognition of the polycistronic nature of human genes is critical to understanding the genotype-phenotype relationship. Genome Res. 28, 609–624
- 63. Hornbeck P. V., Zhang B., Murray B., Kornhauser J. M., Latham V., and Skrzypek E. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 43, D512–D520