Author manuscript; available in PMC: 2020 Apr 22.
Published in final edited form as: J Proteome Res. 2020 Feb 24;19(3):1147–1153. doi: 10.1021/acs.jproteome.9b00666

Matrix-Matched Calibration Curves for Assessing Analytical Figures of Merit in Quantitative Proteomics

Lindsay K Pino 1, Brian C Searle 2, Han-Yin Yang 3, Andrew N Hoofnagle 4, William S Noble 5, Michael J MacCoss 6
PMCID: PMC7175947  NIHMSID: NIHMS1579317  PMID: 32037841

Abstract

Mass spectrometry is a powerful tool for quantifying protein abundance in complex samples. Advances in sample preparation and the development of data-independent acquisition (DIA) mass spectrometry approaches have increased the number of peptides and proteins measured per sample. Here, we present a series of experiments demonstrating how to assess whether a peptide measurement is quantitative by mass spectrometry. Our results demonstrate that increasing the number of detected peptides in a proteomics experiment does not necessarily result in increased numbers of peptides that can be measured quantitatively.

Keywords: quantitative proteomics, mass spectrometry, label-free quantification, calibration curve, figures of merit

INTRODUCTION

Mass spectrometry-based proteomics has made great progress and is being used to address essential questions in basic biology and of biomedical significance. Of interest, the development of data-independent acquisition mass spectrometry (DIA-MS) has made it possible to measure tens of thousands of peptides in a protein digest in 1–2 h of instrument time. The sampling of tandem mass spectra in DIA-MS is unbiased with respect to precursor intensity1 and systematic,2 in principle making it an appealing compromise between a narrowly focused targeted data acquisition strategy3 and an irregularly sampled discovery method. Although fully targeted proteomics assays often include validation experiments to assess whether the change in measured signal is reflective of the actual change in peptide abundance, proteomics assays measuring thousands of analytes in an unbiased fashion rarely assess which peptide measurements are truly quantitative.

A measurement is quantitative when the change in the measured signal reflects a change in the quantity of the analyte.4 Specifically, in mass spectrometry proteomics, for a method to be considered quantitative, the relationship between the measured signal and the peptide quantity must be assessed. This assessment uses a calibration curve, where the analyte is diluted systematically to demonstrate that the measured signal is precise and above the lower limit of quantitation (LLOQ), the quantity below which a change in signal no longer reflects a change in quantity. Because liquid chromatography-tandem mass spectrometry is subject to matrix effects, calibration curves must be constructed in a relevant sample matrix. For endogenous compounds like peptides that are present in the sample matrix, assessment is frequently performed with reverse calibration curves, where a heavy isotope-labeled synthetic version of the analyte is diluted in the sample matrix.5,6 Although a signal measured below the LLOQ may still be useful to assess a difference between two conditions when compared to a signal above the LLOQ, the magnitude of the difference in the signal is not reflective of the true difference in the analyte quantity. In some papers, this phenomenon has been referred to as ratio compression.7 One of the first reports of this underestimate of the relative abundance in proteomics was in the original implementation of DIA.8 Thus, unless the relationship between the quantity and the signal for each analyte is documented, mass spectrometry measurements should be considered only differential rather than quantitative. In targeted proteomics studies, reverse calibration curves of increasing concentrations of stable isotope-labeled internal standard peptides can be used to approximate the LLOQ and precision of unlabeled peptide responses. However, large-scale studies on the order of 1000’s to 10 000’s of peptides like most DIA/SWATH-MS experiments do not evaluate peptide response. 
Calibration curves for up to 30 stable isotope-labeled internal standard peptides have been collected using DIA/SWATH-MS methods,9 but it is cost-prohibitive to synthesize stable isotope-labeled peptides for the number of targets detected in DIA. In this work, we propose a framework for discriminating between peptides that are only detectable and those which are both detectable and quantitative in a mass spectrometry experiment. We introduce an alternative to reverse calibration curves called matrix-matched calibration curves.

MATERIALS AND METHODS

Experimental Design

Research objectives.

The goal of this work was to create a method to construct a calibration curve that does not require the use of standards. The curves used in this work followed the Clinical and Laboratory Standards Institute (CLSI) recommendations.6 Specifically, the CLSI recommends that calibration curves for LC-MS assays be composed of, at minimum, a blank (a sample containing the matrix only) and six to eight calibration standards, with the calibration standards commonly spaced logarithmically across several orders of magnitude.

Designing a Serial Dilution Standard Curve Using the Matrix-Matched Calibration Curve Approach.

The matrix-matched curves make use of a diluent matrix that reflects the complexity of the matrix of interest, described in more methodological detail below. Briefly, the yeast dilution series is composed of 13 calibration points and a blank consisting of the matched matrix alone (Table S1). It is also recommended that calibration curves not be composed of one continuous serial dilution because this can propagate pipetting errors throughout the curve. We therefore constructed these yeast calibration curves as a set of five serial dilutions, with each of points A, B, C, D, and E mixed individually from reference and matrix-matched materials, and then subsequent points are dilutions of those original five (F is a dilution of B, G is a dilution of C, H is a dilution of D, and I is a dilution of E, and then J is a dilution of F, K is a dilution of G, L is a dilution of H, and M is a dilution of I). If a pipetting error occurred in one of the dilutions, it would appear as an outlying point in the final calibration curve.
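The branching dilution design described above can be sketched in a few lines of Python. The parent map is taken from the text; the primary fractions for points A to E and the dilution step are hypothetical placeholders chosen only for illustration (the actual values are given in Table S1).

```python
# Dilution lineage stated in the text: A-E are mixed independently from reference
# and matrix-matched material; every other point is a dilution of an earlier point.
PARENT = {"F": "B", "G": "C", "H": "D", "I": "E",
          "J": "F", "K": "G", "L": "H", "M": "I"}

# HYPOTHETICAL values for illustration only; the real fractions are in Table S1.
PRIMARY_FRACTION = {"A": 1.0, "B": 0.7, "C": 0.5, "D": 0.3, "E": 0.1}
DILUTION_STEP = 0.1  # assumed 10-fold dilution at each step


def lineage(point):
    """Return the chain from `point` back to its independently mixed ancestor."""
    chain = [point]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def reference_fraction(point):
    """Fraction of reference material at `point`, given the assumed values above."""
    chain = lineage(point)
    return PRIMARY_FRACTION[chain[-1]] * DILUTION_STEP ** (len(chain) - 1)


for name in "ABCDEFGHIJKLM":
    print(name, " <- ".join(lineage(name)), f"{reference_fraction(name):.4g}")
```

Because no chain is longer than three points, a pipetting error in one primary mix affects at most three calibration points rather than the whole curve.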

For the cerebrospinal fluid (CSF) curves, we followed the same fractional dilution scheme as above but did not include points K, L, and M due to limited availability of the matched-matrix material (18O-enriched CSF).

For the formalin-fixed paraffin-embedded (FFPE) tissue block proof of concept, we created concentration points of human plasma by diluting healthy donor-pooled plasma into PBS and then mixing an equal volume of each plasma dilution with liver homogenate using an open-end 2 mL syringe (Table S2). Each concentration point-spiked liver homogenate sample was then formalin-fixed and paraffin-embedded into individual tissue blocks. Tissue blocks were scraped and prepared for the analysis by mass spectrometry as described below.

Sample Preparation and Mass Spectrometry Data Acquisition

Yeast Culture and Sample Preparation.

Yeast strains BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) and S288C (MATα) (Dharmacon) were cultured in YEPD and 15N minimal media, respectively, for matrix-matched calibration curve experiments. Cultures of 50 mL were grown to mid-log phase, harvested, and lysed individually with 8 M urea buffer solution and bead beating (7 cycles of 4 min beating with 1 min rest on ice). Cell lysates were reduced with 5 mM DTT, alkylated with 15 mM IAA, and digested for 16 h with 1:50 trypsin to protein. The peptide digests were desalted with a mixed-mode (MCX) method, dried down via speedvac overnight, and resuspended with synthetic iRT peptide standards (Pierce Peptide Retention Time Calibration Mixture) to 1 μg/μL total proteome using calculations from a bicinchoninic acid (BCA) assay (Pierce BCA Protein Assay Kit) performed on the lysate.

Cerebrospinal Fluid Sample Preparation.

Pooled human cerebrospinal fluid (CSF) from healthy donors was purchased from Golden West Biologicals. CSF was denatured with 0.2% PPS Silent Surfactant, reduced with 5 mM DTT, alkylated with 15 mM IAA, and digested for 16 h with 1:25 trypsin to protein. The peptide digest was desalted with a mixed-mode (MCX) method, the desalted peptides were split into two aliquots, and each aliquot was dried down via speedvac overnight. Twenty-four hours prior to MS acquisition, one dried aliquot was resuspended in 0.05 μg/μL trypsin in 18O-enriched water (Cambridge Isotope Laboratories, Inc.) following a standard 18O-labeling protocol,10 and the other was resuspended in 0.05 μg/μL trypsin in conventional molecular-grade water. The digest was incubated overnight and then quenched with 5 mM DTT, cooled to room temperature, and acidified with formic acid.

Formalin-Fixed Paraffin-Embedded (FFPE) Sample Preparation.

Pooled human plasma (75 μg/μL; Na-Citrate, Cat# 7303806, Unit# 23–45456A) was diluted with DPBS (Life Technologies, 14190–144) to make a plasma dilution series with 13 different concentrations. For each concentration point, 30 μL of human plasma or blank sample was mixed thoroughly with 80 μL of homogenized chicken liver in an open-ended syringe. Each mixture was then quickly mixed with 200 μL of 20% formalin, followed by 90 μL of 1% agarose. The syringe was then sealed and left on the bench overnight at room temperature to allow the protein–liver mixture to form a gel-like structure. Each resulting product was then pushed gently out of the syringe and placed into a tissue cassette for the standard paraffin-embedding procedure.

Six 10 μm thick tissue slides were obtained from each protein–chicken liver block and then deparaffinized. Proteins on the deparaffinized tissue slides were resolubilized in 60 μL of 0.1% RapiGest buffer through cycles of high heat and sonication. The reconstituted protein mixture was reduced, alkylated, and digested with 5 μL of trypsin overnight. The protein digests were stored at −80 °C until the day of analysis.

Liquid Chromatography-Mass Spectrometry.

Peptides were separated by liquid chromatography before the analysis by mass spectrometry, either with a Waters NanoAcquity ultra-performance liquid chromatography (UPLC) system for yeast and human CSF DIA experiments or a Thermo easy-nanoLC for FFPE tissue block selected reaction monitoring (SRM) experiments. On all systems, peptides were separated by reverse-phase liquid chromatography using pulled-tip columns created in-house from 75 μm inner diameter fused silica capillary (New Objective, Woburn, MA) using a laser pulling device and packed with 3 μm ReproSil-Pur C18 beads (Dr. Maisch GmbH, Ammerbuch, Germany) to 30 cm. Trap columns were created from 150 μm inner diameter fused silica capillary fritted with Kasil on one end and packed with the same C18 beads to 3 cm.

14N BY4741 Yeast Proteome Separation on Waters NanoAcquity UPLC.

Solvent A was 0.1% formic acid in water (v/v), and solvent B was 0.1% formic acid in 98% acetonitrile (v/v). For each injection, approximately 1 μg of total protein was loaded and eluted using a 90 min gradient from 5 to 35% B, followed by a 40 min wash and equilibration (35 to 60% B for 10 min, 60 to 95% B for 5 min, 95% B for 5 min, 95 to 2% B for 1 min, and 2% B for 19 min).

16O human CSF Proteome Separation on Waters NanoAcquity UPLC.

Solvent A was 0.1% formic acid in water (v/v), and solvent B was 0.1% formic acid in 98% acetonitrile (v/v). For each injection, approximately 1 μg of total protein was loaded and eluted using a 60 min gradient from 5 to 35% B, followed by a 40 min wash and equilibration (35 to 60% B for 10 min, 60 to 95% B for 5 min, 95% B for 5 min, 95 to 2% B for 1 min, and 2% B for 19 min).

FFPE Tissue Block Proteome Separation on Thermo easy-nanoLC.

Solvent A was 0.1% formic acid in water (v/v), and solvent B was 0.1% formic acid in 98% acetonitrile (v/v). For each injection, approximately 1 μg of total protein was loaded and eluted using a 30 min gradient from 0 to 40% B, followed by an 18 min wash and equilibration (40 to 60% B for 5 min, 60% B for 5 min, 60 to 100% B for 1 min, 100% B for 5 min, 100 to 0% B for 1 min, and 0% B for 1 min).

Selected Reaction Monitoring Mass Spectrometry (SRM-MS).

FFPE tissue block curve data were acquired using a targeted SRM-MS method on a Thermo TSQ Quantiva triple quadrupole mass spectrometer. The target list was developed and optimized to measure amyloidosis proteins from formalin-fixed paraffin-embedded (FFPE) human renal biopsy specimens. Protein targets were selected based on amyloidosis typing, including amyloid A (AA), amyloid light chain-κ (AL-κ), and amyloid light chain-λ (AL-λ) amyloidosis subtypes following previous works.11,12 Peptides and transitions were optimized for this amyloidosis typing purpose. Instrument details were as follows: dwell time 2 ms, Q1 resolution set to 0.7 full width at half-maximum (FWHM), Q3 resolution set to 0.7 FWHM, CID gas set to 1.5 mTorr.

Data-Independent Acquisition Mass Spectrometry (DIA-MS).

Yeast curve data were acquired using a data-independent acquisition (DIA) method on a Thermo Q-Exactive HF Orbitrap mass spectrometer. Human CSF curve data were acquired using an equivalent DIA method on a Thermo Lumos mass spectrometer. Both DIA methods followed the chromatogram library workflow, described in greater detail elsewhere.13 To create the chromatogram library, the mass spectrometer was configured to acquire six gas phase fractions of the undiluted reference proteome for each curve (e.g., 14N BY4741 yeast proteome and 16O human CSF).

Thermo Q-Exactive HF Orbitrap Method Details.

The mass range of 388.43190–1012.70480 m/z was monitored in the yeast experiments. The chromatogram library, gas phase-fractionated "narrow-window" Thermo QEHF method details were as follows: 4 m/z overlapped windows (effectively 2 m/z isolation), 30K resolution, 55 maximum ion inject time, 1e6 AGC. The quantitative, single-shot "wide-window" Thermo QEHF method details were as follows: 24 m/z overlapped windows (effectively 12 m/z isolation),14 30K resolution, 55 maximum ion inject time, 1e6 AGC. All DIA spectra were programmed with a normalized collision energy of 27 and an assumed charge state of +2.
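To illustrate why overlapped 24 m/z windows correspond to an effective 12 m/z isolation, the following minimal sketch lays out two alternating cycles of windows offset by half a window width and lists the regions bounded by consecutive window edges. The window placement is an assumption made for illustration (simple tiling from the lower bound of the monitored range) and does not reproduce the exact window edges of the acquisition method.

```python
import math


def staggered_cycles(start, stop, width):
    """Two cycles of (low, high) windows; the second is shifted down by width / 2."""
    n = math.ceil((stop - start) / width)
    cycle_a = [(start + i * width, start + (i + 1) * width) for i in range(n)]
    cycle_b = [(lo - width / 2, hi - width / 2) for lo, hi in cycle_a]
    return cycle_a, cycle_b


def effective_segments(cycle_a, cycle_b):
    """Regions bounded by consecutive window edges taken from either cycle."""
    edges = sorted({edge for window in cycle_a + cycle_b for edge in window})
    return list(zip(edges[:-1], edges[1:]))


# Mass range from the text; 24 m/z windows as in the wide-window method.
cycle_a, cycle_b = staggered_cycles(388.4319, 1012.7048, 24.0)
segments = effective_segments(cycle_a, cycle_b)
widths = {round(hi - lo, 6) for lo, hi in segments}
print(len(cycle_a), "windows per cycle; segment widths:", widths)
```

Each interior segment is covered by one window from each cycle, which is what allows signals to be assigned to a region half as wide as the acquired window.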

Thermo Lumos Method Details.

The mass range of 394.4319–1006.704807 m/z was monitored in the CSF experiments. The chromatogram library, gas phase narrow-window Thermo Lumos method details were as follows: 4 m/z overlapped windows (effectively 2 m/z isolation), 30K resolution, 60 maximum ion inject time, 4e5 AGC. The quantitative, single-shot "wide-window" Thermo Lumos method details were as follows: 12 m/z overlapped windows (effectively 6 m/z isolation), 15K resolution, 20 maximum ion inject time, 4e5 AGC. All DIA spectra were programmed with a normalized collision energy of 33%.

DIA-MS Data Analysis.

Thermo RAW files were converted to mzML format using the ProteoWizard package (version 3.0.10106), with peak picking performed using vendor libraries. Converted acquisition files were processed using EncyclopeDIA (version 0.8.0) configured with default settings (10 ppm precursor, fragment, and library tolerances; both b and y ions considered; trypsin digestion assumed). EncyclopeDIA was configured to use Percolator (version 3.1).

Data Availability.

The RAW files, converted mzML files, EncyclopeDIA elib files, and Skyline documents have been deposited in the ProteomeXchange Consortium15 via the Panorama16 partner repository with the identifiers PXD014815 (ProteomeXchange) and https://panoramaweb.org/matrix-matched_calcurves.url (Panorama). While chromatograms provided in the Panorama Project are visualized using Skyline, all of the statistical analyses in this work were performed using the quantitative matrix output from EncyclopeDIA.

Statistical Analysis

Piecewise Linear Model to Fit Sparse, Label-Free LC-MS Calibration Curves.

We developed a model to fit the data produced by the matrix-matched calibration curve method. The model is an extension of the work described previously by Galitzine et al.9 Below, we briefly summarize the main steps of the model and then discuss each step in detail.

First, the model assumes that two segments are present in the calibration curve: a noise segment where the measured signal yn (reported as intensity, peak area, estimated concentration, etc.) does not exceed background noise and a signal segment where the measured signal ys is within the linear range for the analyte. Formally, we express this model as

$$f(x) = \begin{cases} y_n = b_n, & x < \mathrm{LOD} \\ y_s = m_s x + b_s, & x > \mathrm{LOD} \end{cases}$$

where x is the experimentally constructed analyte dilution values given by concentration, copies-per-cell, fractional dilution, etc. We use weighted least squares to minimize the function (lmfit package) using as weights the inverse square root of the curve points, and we constrain the parameters (bn, bs, ms) as follows

$$m_s > 0$$
$$b_n > b_s$$
$$b_n > 0$$

With these constraints, we enforce that the signal segment must have a positive, nonzero slope, and we enforce that the intersection of the noise and the signal segments must be positive. The model is fit in linear space; in other words, the curve points and quantitative measures provided in the input data are not transformed prior to fitting the piecewise equation above.

To determine the standard deviation associated with the noise segment, we calculate the empirical standard deviation of all yn values associated with the noise segment. The yn values are those where the corresponding xn values are less than the intersection Px of the noise and linear segments.

$$P_x = \frac{b_n - b_s}{m_s}$$

Thus, we compute the empirical standard deviation σ in yn for all points for which x < Px.
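As a brief worked step, using only the symbols already defined, the intersection follows from setting the noise segment equal to the signal segment. The same step shows why a positive slope together with a noise intercept that exceeds the signal intercept gives a positive intersection:

$$b_n = m_s P_x + b_s \;\Longrightarrow\; P_x = \frac{b_n - b_s}{m_s}, \qquad m_s > 0 \ \text{and}\ b_n > b_s \;\Longrightarrow\; P_x > 0.$$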

Next, we determine the figures of merit: limit of detection (LOD) and limit of quantitation (LOQ). We define the limit of detection (LOD) as x for which the corresponding signal y is one standard deviation (σ) above the noise segment,

$$\mathrm{LOD} = \frac{b_n + \sigma_{y_n} - b_s}{m_s}$$
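The intersection and LOD expressions can be checked numerically with a short sketch. The fitted parameters and curve data below are hypothetical values chosen for illustration, and the use of the sample standard deviation (rather than the population form) is an assumption, since the text specifies only an "empirical standard deviation".

```python
import statistics


def intersection(b_n, b_s, m_s):
    """x at which the flat noise segment meets the linear signal segment."""
    return (b_n - b_s) / m_s


def limit_of_detection(b_n, b_s, m_s, sigma_noise):
    """x at which the signal segment is one noise standard deviation above b_n."""
    return (b_n + sigma_noise - b_s) / m_s


# HYPOTHETICAL fitted parameters and curve data, for illustration only.
b_n, b_s, m_s = 100.0, 50.0, 5000.0
x = [0.0, 0.001, 0.003, 0.03, 0.1]
y = [95.0, 100.0, 105.0, 200.0, 550.0]

p_x = intersection(b_n, b_s, m_s)
noise = [yi for xi, yi in zip(x, y) if xi < p_x]  # points below the intersection
sigma = statistics.stdev(noise)  # sample standard deviation (an assumption)
lod = limit_of_detection(b_n, b_s, m_s, sigma)
print(p_x, sigma, lod)
```

With these numbers the LOD sits slightly above the intersection, as expected: the offset between the two is σ divided by the slope.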

The limit of quantitation (LOQ; also referred to as the lower limit of the measuring interval (LLMI)) is defined by the Clinical and Laboratory Standards Institute6 as "the lowest measurand concentration at which all defined performance characteristics of the measurement procedure are met." We define the LOQ as the lowest analyte concentration that (1) is above the LOD and (2) achieves a coefficient of variation (CV) less than a threshold τ selected by the researcher (default is a 20% CV, τ = 0.2). To determine the value of x that meets these two criteria, we first uniformly discretize the range of x above the LOD into 100 bins (xi) and use bootstrapping to calculate 100 predicted yi for each xi. Then, we calculate the standard deviation and mean of the 100 predicted yi for each xi.

For bootstrapping, we resample the entire dataset with replacement N times (default N = 100). Each of the N resampled datasets is fit to the piecewise regression model described earlier in this section. We use the piecewise regression parameters to calculate the predicted response for a series of curve points spanning the range of curve points in the empirical data. The mean and standard deviation of the bootstrapped yi values are used to calculate a bootstrapped coefficient of variation (CVyi) for each of the curve points in the series. Finally, the LOQ is calculated as the lowest value in the curve point series above the LOD that passes at or below the CVyi threshold specified by the researcher (default CVyi = 0.2). The user has the option to set more or less conservative thresholds. For instance, the CV threshold recommended by the Clinical and Laboratory Standards Institute guidelines is 10:1 signal:noise, which equates to a CVyi = 0.1 threshold.6
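The procedure described in this section can be sketched end to end in plain Python. This is an illustrative reimplementation under stated assumptions, not the authors' code: instead of lmfit, it scans candidate breakpoints over the observed curve points and solves a closed-form weighted least squares problem at each one; the blank (x = 0) borrows the weight of the lowest nonzero point; failed bootstrap refits are skipped; and the curve data are hypothetical.

```python
import math
import random
import statistics


def fit_piecewise(x, y):
    """Fit y = b_n below the breakpoint and y = m_s*x + b_s above it.

    Returns (b_n, b_s, m_s), or None if no fit has m_s > 0 and b_n > 0.
    """
    positive = sorted({xi for xi in x if xi > 0})
    if len(positive) < 2:
        return None
    # Assumed weights: 1/sqrt(x); the blank uses the lowest nonzero point's weight.
    w = [1.0 / math.sqrt(max(xi, positive[0])) for xi in x]
    best = None
    for c in positive[:-1]:  # candidate breakpoints
        h = [max(0.0, xi - c) for xi in x]  # hinge term: y = b_n + m_s * h
        sw = sum(w)
        swh = sum(a * b for a, b in zip(w, h))
        swy = sum(a * d for a, d in zip(w, y))
        swhh = sum(a * b * b for a, b in zip(w, h))
        swhy = sum(a * b * d for a, b, d in zip(w, h, y))
        det = sw * swhh - swh ** 2
        if det <= 0:
            continue
        m_s = (sw * swhy - swh * swy) / det
        b_n = (swy - m_s * swh) / sw
        if m_s <= 0 or b_n <= 0:
            continue
        sse = sum(a * (d - b_n - m_s * b) ** 2 for a, b, d in zip(w, h, y))
        if best is None or sse < best[0]:
            best = (sse, b_n, b_n - m_s * c, m_s)  # b_s = b_n - m_s * c < b_n
    return None if best is None else best[1:]


def figures_of_merit(x, y, n_boot=100, n_grid=100, tau=0.2, seed=0):
    """Return the LOD and the bootstrapped LOQ (None if no grid point passes)."""
    params = fit_piecewise(x, y)
    if params is None:
        return None
    b_n, b_s, m_s = params
    p_x = (b_n - b_s) / m_s
    noise = [yi for xi, yi in zip(x, y) if xi < p_x]
    sigma = statistics.stdev(noise) if len(noise) > 1 else 0.0
    lod = (b_n + sigma - b_s) / m_s
    x_max = max(x)
    if lod >= x_max:
        return {"lod": lod, "loq": None}
    grid = [lod + (x_max - lod) * (i + 1) / n_grid for i in range(n_grid)]
    rng = random.Random(seed)
    preds = [[] for _ in grid]
    n = len(x)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        boot = fit_piecewise([x[i] for i in idx], [y[i] for i in idx])
        if boot is None:
            continue  # assumption: skip resamples that cannot be fit
        bb_n, bb_s, bm_s = boot
        for k, xi in enumerate(grid):
            preds[k].append(max(bb_n, bm_s * xi + bb_s))
    loq = None
    for xi, p in zip(grid, preds):
        if len(p) > 1 and statistics.stdev(p) / statistics.mean(p) <= tau:
            loq = xi  # lowest grid point at or below the CV threshold
            break
    return {"lod": lod, "loq": loq}


# HYPOTHETICAL duplicate measurements across a dilution series.
x = [0.0, 0.0, 0.001, 0.001, 0.003, 0.003, 0.01, 0.01,
     0.03, 0.03, 0.1, 0.1, 0.3, 0.3, 1.0, 1.0]
y = [98, 103, 101, 97, 102, 99, 100, 104,
     205, 195, 560, 540, 1530, 1570, 5100, 4950]
print(figures_of_merit(x, y))
```

Writing the model as a hinge at a fixed breakpoint keeps the two segments continuous and makes the remaining parameters linear, so each candidate fit is a two-parameter least squares solve. An optimizer with bounded parameters, as used in the original work, avoids restricting the breakpoint to observed curve points.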

The code is available on Bitbucket (https://bitbucket.org/lkpino/matrix-matched_calcurves).

RESULTS

Our goal was to construct calibration curves and determine the LLOQ for every detectable peptide in a complex protein mixture of interest using one dilution series and without predetermining targets. We propose a matrix-matched calibration curve, in which a complex protein sample of interest (a reference material17) is diluted with a matrix-matched material. A matrix-matched material may be any sample of equivalent biochemical complexity but should not share any endogenous analytes with the reference material. For example, a matrix-matched material could be a stable isotope-labeled reference material that preserves the matrix complexity but shifts the peptide masses or could be an equivalent biosample from an evolutionarily diverged species (Figures 1 and S1). The type of matrix-matched material used for a given reference material should be chosen based on practicality. With cell culture reference materials, culturing in heavy media to incorporate heavy isotopes is feasible; however, for biofluids where it is not feasible to incorporate the heavy isotope by culturing, 18O digestion can be used to incorporate heavy isotopes into the matrix-matched material. Each calibration curve point in the dilution series has the same total protein concentration, composed of some ratio of the reference and matrix-matched material (Figure 1a) spanning several orders of magnitude (Table S1). To fit calibration curves to this novel data, we developed a computational model (Figure 1b), which extends the work described previously by Galitzine et al.9 to accommodate the sparseness of matrix-matched calibration curve data and to determine the LLOQ for each detected analyte. Briefly, the model first fits a piecewise linear regression to the noise and the signal segments of the curve data, then bootstraps the observed data, and refits the piecewise regression to the bootstrapped data to predict the signal over the range of quantities measured. 
Finally, we calculate the coefficient of variation (CV) of the predicted signal and define the LLOQ as the minimum quantity at which the predicted signal passes a predetermined CV threshold (CV < 20% for the results reported here) (see Materials and Methods).

Figure 1.

Constructing reference material calibration curves using a matched-matrix diluent. (A) Reference material is diluted into a matrix-matched material of similar matrix complexity but with no shared endogenous analytes, for example, by stable isotope labeling the matrix or using a diverged species. The curve is made from dilutions spanning several orders of magnitude plus a blank with only the matrix-matched proteome. (B) Model for assessing the lower limit of quantification (LLOQ) using the sparse matrix-matched calibration curve data. We assess the LLOQ (cyan line) as the first point that is statistically different from the background (pink line) and has a CV < 20% using bootstrapping (red line). (C) The sequence of plasma membrane ATPase (Pma1) is represented as the black line. The transmembrane domains along the sequence are depicted in gray. Each peptide detected by DIA-MS is represented by a colored box placed along the sequence. The color of the box ranks the peptide LLOQs. Three of the peptide calibration curves are shown above the sequence. The yellow shading indicates two standard deviations above and below the median for the bootstrapped data.

We apply the matrix-matched calibration curve framework first in yeast and find that it highlights the division between detection and quantification especially at low protein abundances. Using the highly abundant yeast proteome plasma membrane ATPase protein (Pma1), as an example, we detect 28 peptides at a 1% FDR threshold across the protein sequence (Figure 1c). Of the detected peptides, only half (15 peptides) are deemed quantitative, defined as a peptide whose endogenous signal is above the LLOQ value determined by the matrix-matched calibration curve. The quantitative peptides display a range of LLOQs spanning more than 20×. The yeast proteome has the advantage of an established reference quantity for each protein, allowing us to contextualize our results. Ghaemmaghami et al. affinity-tagged the protein-coding regions in yeast and reported the protein abundances in molecules-per-cell for 4102 proteins, 3869 of which could be quantified above 50 molecules/cell.18 Using data-independent acquisition mass spectrometry (DIA-MS),13 we detected 24 400 peptides from 2870 of the proteins they quantified in the reference yeast proteome (Figure 2a,b). Using matrix-matched calibration curves to assess the quantitative accuracy of the detected peptides, we found that half of the detected proteins had at least one quantitative peptide (1427 proteins; 8630 peptides) (Figure S2). The proteins with validated peptides are primarily high-quantity proteins, particularly those above 10 000 molecules/cell. As the reported quantity18 decreases, fewer detected proteins have at least one quantitative peptide (Figure 2a,b). We compared the peptides determined to be quantitative by matrix-matched calibration curves with the peptides determined to be quantitative by a more conventional synthetic peptide approach,19 which chose candidate peptides prior to quantitative assessment and then produced paired QConCat standards for each of the candidates. 
Overall, the proposed framework assessed 6× more candidate peptides and defined 4.7× more peptides as quantitative (Figure S3), demonstrating the higher throughput of the proposed framework compared to conventional approaches.

Figure 2.

Comparison between the detection of a peptide and the quantification of a peptide. The (A) number and (B) percentage of proteins detected in yeast at different orders of magnitude of abundance. Ghaemmaghami et al. comprehensively estimated protein copies-per cell in yeast (black, 3869 proteins) using epitope tagging.18 The wide-window DIA using a chromatogram-library approach13 detects, at 1% protein-level FDR, 74% of these proteins (blue, 2870 proteins). The number of proteins quantifiable by DIA-MS (proteins with at least one peptide with a defined LLOQ) encompasses 52% of the detected proteins or 39% of the expressed proteins (green, 1511 proteins). (C) Peptides detected in the yeast lysate narrow-window library are ranked by intensity, and the wide-window detected and quantitative peptides are shown for each decile. (D) Cerebrospinal fluid peptides detected in the narrow-window library (8698 total peptides, 2994 protein groups) are ranked by intensity, and the wide-window detected and quantitative (3183 peptides; 1303 protein groups) peptides are shown for each decile.

The matrix-matched calibration curve approach is generalizable beyond cell culture. To illustrate its flexibility, we adapted the framework to two human samples: cerebrospinal fluid (CSF) and formalin-fixed paraffin-embedded (FFPE) tissue. For the CSF reference material, we chose a commercially available pool of healthy donor CSF (Golden West Biologicals, Inc.), which we prepared following conventional protocols. For the CSF matrix-matched material, we performed a second enzymatic digest in the presence of 18O-enriched water. This reaction preferentially exchanges one or both oxygens at the C-terminus of the peptide with 18O, shifting the peptides by 2 or 4 mass units via incorporation of one or two 18O atoms. Following the matrix-matched calibration curve framework, we found that 36% of peptides detected in the CSF reference material library (8698 peptides; 2994 protein groups) had a defined LLOQ (3183 peptides; 1303 protein groups) (Figure 2d). In both the yeast (Figure 2c) and CSF (Figure 2d) references, the most intense peptides in the reference are more likely to be detected and quantified. We also applied the matrix-matched calibration curve approach to an FFPE sample (Figure S4) and acquired the data by another form of mass spectrometry (selected reaction monitoring). To construct the FFPE matrix-matched calibration curve, we spiked human plasma into homogenized chicken liver as a reference and used the unspiked homogenized chicken liver for the background proteome. In this example, because the matrix-matched material needed to undergo formalin-fixation and paraffin-embedding protocols, it was not feasible to perform an 18O digest of the human plasma as the back-exchange rate throughout the FFPE tissue blocking process would have caused an undesirable increase in the endogenous peptide signal. In the FFPE human plasma samples, we targeted 84 peptides (18 proteins) and found that 33 of the targeted peptides were quantitative (13 proteins).
This demonstrates that the matched-matrix calibration curve approach is generalizable broadly across not only sample types but also mass spectrometry acquisition approaches.
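For the 18O-labeled CSF matrix described above, the nominal shift of 2 or 4 mass units translates into an m/z shift that depends on the precursor charge. The short sketch below computes it from the monoisotopic masses of 16O and 18O; the function name and the example charge states are illustrative choices rather than part of the original method.

```python
# Monoisotopic masses in daltons.
MASS_16O = 15.99491462
MASS_18O = 17.99915961


def mz_shift(n_18o, charge):
    """m/z shift of a peptide ion after exchanging n_18o C-terminal oxygens for 18O."""
    if n_18o not in (1, 2):
        raise ValueError("C-terminal labeling incorporates one or two 18O atoms")
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return n_18o * (MASS_18O - MASS_16O) / charge


for n in (1, 2):
    for z in (1, 2, 3):
        print(f"{n} x 18O, charge {z}+: shift = {mz_shift(n, z):.4f} m/z")
```

A doubly labeled, doubly charged peptide therefore moves by about 2 m/z, which is the separation the matrix-matched material relies on to avoid contributing signal to the unlabeled reference peptide.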

DISCUSSION

Using the matrix-matched approach, we find that highly abundant proteins often contain peptides that are detected at 1% FDR but are not quantifiable because the observed abundance in the reference material is below the LLOQ. A peptide with no LLOQ is a less accurate quantitative proxy for its parent protein of origin, while a peptide with a low LLOQ is a more accurate quantitative proxy and is more accurate over a wider linear range. If the purpose of an assay is simply to discover differences between samples, then the strategies reported here may not be needed. However, if the magnitude of the changes is important for an assay, then the measurements must be demonstrated to be above the LLOQ. The extreme range of peptide responsiveness illustrated in these matrix-matched calibration curves regardless of the sample matrix emphasizes the necessity to carefully select which peptides should act as quantitative proxies for their protein of origin.

A strength of this approach is that every peptide (or other type of analyte) in the reference material is diluted through the curve, meaning that calibration curves are constructed for all peptides detected in the reference material. However, this also introduces a limitation in that the maximum possible peptide quantification is restricted to the endogenous abundance of the peptide in the reference, which for low-abundance peptides results in stunted linear range. As a practical consideration, the matrix-matched calibration curve approach requires enough reference material and instrument time for the particular experimental curve design, necessitating careful prior consideration of reference material and instrumental limitations. Another consequence of the endogenous abundance limitation is that matrix-matched calibration curve data is extremely sparse compared to conventional calibration curves because low-abundance reference peptides produce a low signal, which reduces to zero signal as the reference is diluted.

While the quantitative peptides reported here may serve as a starting point for future assay development, we emphasize that these LLOQs are specific to these exact conditions. Matrix-matched calibration curves, like all calibration curves, are only reflective of the peptide measured on a given platform. While most quantitative methods report precision, this does not assess whether a change in signal reflects the change in quantity. Our findings here suggest that quantitative proteomics experiments should, at minimum, confirm analyte signal linearity within the dynamic range where most samples are likely to fall20 and, at the most rigorous, should describe the entirety of each reported analyte’s dynamic range under the specific experimental and signal processing conditions. Therefore, matrix-matched calibration curves should be used in all proteomics experiments that require an assessment of which peptides reflect a change in quantity, as opposed to those that are merely differential.

Supplementary Material

Supporting Information: Matrix-matched calibration curves for assessing analytical figures of merit in quantitative proteomics

Algorithm 1.

Model for determining LOD and LOQ from matrix-matched calibration curves

Input: x curve points, y measured signals
1: Fit piecewise regression (parameters bn, bs, ms)
2: Find intersection of piecewise components
3: Calculate standard deviation of noise segment (σyn)
4: Calculate LOD
5: Uniformly discretize the range LOD ≤ xi ≤ xmax into 100 bins xi
6: for each of N bootstrap iterations do
 7: Resample n = |x| data points from x, y with replacement
 8: Fit piecewise regression to the resampled points
 9: For each xi predict yi using the regression parameters
 10: For each xi calculate CVyi
11: end for
12: Calculate LOQ (LOQ = min(xi) for which CVyi ≤ 0.2)
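
The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (fit_piecewise, predict, lod_loq), the breakpoint scan used to fit the piecewise model, and the synthetic demo data are all hypothetical. The listing does not state how the LOD is derived from the fit, so the sketch assumes the LOD is the point at which the linear segment rises one noise standard deviation above the mean noise level. The CV at each grid point is computed across the bootstrap fits once resampling is finished.

```python
import numpy as np

def fit_piecewise(x, y):
    """Fit a flat noise segment (b_n) joined to a line (m_s * x + b_s).

    Scans candidate breakpoints between dilution levels and keeps the split
    with the lowest total squared error. Returns None if no valid fit exists.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    levels = np.unique(x)
    best = None
    # Leave at least one level for the noise segment and two for the line.
    for k in range(1, len(levels) - 1):
        noise, signal = x < levels[k], x >= levels[k]
        b_n = y[noise].mean()
        m_s, b_s = np.polyfit(x[signal], y[signal], 1)
        if m_s <= 0:
            continue  # the quantitative segment must have a positive slope
        resid_signal = y[signal] - (m_s * x[signal] + b_s)
        sse = np.sum((y[noise] - b_n) ** 2) + np.sum(resid_signal ** 2)
        if best is None or sse < best[0]:
            sigma_n = y[noise].std(ddof=1) if noise.sum() > 1 else 0.0
            best = (sse, b_n, m_s, b_s, sigma_n)
    return None if best is None else best[1:]

def predict(params, x):
    """Predicted signal: whichever is larger, the noise floor or the line."""
    b_n, m_s, b_s, _ = params
    return np.maximum(b_n, m_s * np.asarray(x, float) + b_s)

def lod_loq(x, y, n_boot=200, n_bins=100, cv_max=0.2, seed=0):
    """Return (LOD, LOQ); LOQ is NaN if no grid point meets the CV threshold."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    params = fit_piecewise(x, y)
    if params is None:
        return np.nan, np.nan
    b_n, m_s, b_s, sigma_n = params
    # ASSUMPTION: LOD is where the line reaches mean noise + 1 SD of noise.
    lod = (b_n + sigma_n - b_s) / m_s
    if not lod < x.max():
        return lod, np.nan
    grid = np.linspace(lod, x.max(), n_bins)
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(x), size=len(x))  # resample with replacement
        boot = fit_piecewise(x[idx], y[idx])
        if boot is not None:
            preds.append(predict(boot, grid))
    if len(preds) < 2:
        return lod, np.nan
    preds = np.array(preds)
    cv = preds.std(axis=0, ddof=1) / preds.mean(axis=0)
    passing = grid[cv <= cv_max]
    return lod, (passing.min() if passing.size else np.nan)

# Synthetic demo (hypothetical values): noise floor of 100, slope of 10000.
demo_rng = np.random.default_rng(1)
x_demo = np.repeat([0, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1.0], 3)
y_demo = np.maximum(100.0, 10000.0 * x_demo) + demo_rng.normal(0, 10, x_demo.size)
lod, loq = lod_loq(x_demo, y_demo)
print(f"LOD = {lod:.4f}, LOQ = {loq:.4f} (fraction of reference material)")
```

In this sketch, x is the fraction of reference material at each curve point, so the LOD and LOQ are expressed in the same relative units. A breakpoint scan is used only because it is short and deterministic; any piecewise fitting routine could be substituted for it.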

ACKNOWLEDGMENTS

This work is supported in part by National Institutes of Health Grants F31 AG055257 (to L.K.P.); F31 GM119273 (to B.C.S.); U01 DK121289 (to A.N.H.); R01 GM121818 (to W.S.N.); P41 GM103533, RF1 AG053959, and R01 GM103551 (to M.J.M.).

Footnotes

Complete contact information is available at: https://pubs.acs.org/10.1021/acs.jproteome.9b00666

Supporting Information

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jproteome.9b00666.

Reference materials must be diluted with a similarly complex material to preserve matrix properties (Figure S1); curation of DIA data using common targeted proteomics criteria can filter for the highest-quality peptides (Figure S2); matrix-matched calibration curves can assess more candidate targets than conventional approaches without predetermining targets (Figure S3); matrix-matched calibration curves can be used to rapidly develop targeted methods (Figure S4); dilution series for the yeast matrix-matched calibration curves (Table S1); dilution series for the FFPE tissue block matrix-matched calibration curves (Table S2); frequently asked questions (FAQ) (PDF)

The authors declare the following competing financial interest(s): The MacCoss Lab at the University of Washington has a sponsored research agreement with Thermo Fisher Scientific, the manufacturer of the instrumentation used in this research. Additionally, M.J.M. is a paid consultant for Thermo Fisher Scientific. The Hoofnagle laboratory receives instrument and grant support from Waters.

Contributor Information

Lindsay K. Pino, Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States.

Brian C. Searle, Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States.

Han-Yin Yang, Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States.

Andrew N. Hoofnagle, Department of Laboratory Medicine, University of Washington, Seattle, Washington 98195, United States.

William S. Noble, Department of Genome Sciences and Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States.

Michael J. MacCoss, Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States.

REFERENCES

  • (1) Ting YS; Egertson JD; Bollinger JG; Searle BC; Payne SH; Noble WS; MacCoss MJ. PECAN: Library-Free Peptide Detection for Data-Independent Acquisition Tandem Mass Spectrometry Data. Nat. Methods 2017, 14, 903–908.
  • (2) Collins BC; Hunter CL; Liu Y; Schilling B; Rosenberger G; Bader SL; Chan DW; Gibson BW; Gingras A-C; Held JM; et al. Multi-Laboratory Assessment of Reproducibility, Qualitative and Quantitative Performance of SWATH-Mass Spectrometry. Nat. Commun. 2017, 8, No. e1002165.
  • (3) Method of the Year 2012. Nat. Methods 2013, 10, 1. DOI: 10.1038/nmeth.2329.
  • (4) Nic M; Jirat J; Kosata B; Jenkins A; McNaught A. IUPAC Compendium of Chemical Terminology: Gold Book, version 2.1.0; IUPAC: Research Triangle Park, NC, 10.1351/goldbook.2009.
  • (5) Abbatiello SE; Schilling B; Mani DR; Zimmerman LJ; Hall SC; MacLean B; Albertolle M; Allen S; Burgess M; Cusack MP; et al. Large-Scale Interlaboratory Study to Develop, Analytically Validate and Apply Highly Multiplexed, Quantitative Peptide Assays to Measure Cancer-Relevant Proteins in Plasma. Mol. Cell. Proteomics 2015, 14, 2357–2374.
  • (6) Lynch KL. CLSI C62-A: A New Standard for Clinical Mass Spectrometry. Clin. Chem. 2016, 62, 24–29.
  • (7) Savitski MM; Mathieson T; Zinn N; Sweetman G; Doce C; Becher I; Pachl F; Kuster B; Bantscheff M. Measuring and Managing Ratio Compression for Accurate iTRAQ/TMT Quantification. J. Proteome Res. 2013, 12, 3586–3598.
  • (8) Venable JD; Dong M-Q; Wohlschlegel J; Dillin A; Yates JR. Automated Approach for Quantitative Analysis of Complex Peptide Mixtures from Tandem Mass Spectra. Nat. Methods 2004, 1, 39–45.
  • (9) Galitzine C; Egertson JD; Abbatiello S; Henderson CM; Pino LK; MacCoss M; Hoofnagle AN; Vitek O. Nonlinear Regression Improves Accuracy of Characterization of Multiplexed Mass Spectrometric Assays. Mol. Cell. Proteomics 2018, 17, 913–924.
  • (10) Petritis BO; Qian W-J; Camp DG; Smith RD. A Simple Procedure for Effective Quenching of Trypsin Activity and Prevention of 18O-Labeling Back-Exchange. J. Proteome Res. 2009, 8, 2157–2163.
  • (11) Vrana JA; Gamez JD; Madden BJ; Theis JD; Bergen HR; Dogan A. Classification of Amyloidosis by Laser Microdissection and Mass Spectrometry–Based Proteomic Analysis in Clinical Biopsy Specimens. Blood 2009, 114, 4957–4959.
  • (12) Theis JD; Dasari S; Vrana JA; Kurtin PJ; Dogan A. Shotgun-Proteomics-Based Clinical Testing for Diagnosis and Classification of Amyloidosis: Shotgun-Proteomics-Based Clinical Testing. J. Mass Spectrom. 2013, 48, 1067–1077.
  • (13) Searle BC; Pino LK; Egertson JD; Ting YS; Lawrence RT; MacLean BX; Villen J; MacCoss MJ. Chromatogram Libraries Improve Peptide Detection and Quantification by Data Independent Acquisition Mass Spectrometry. Nat. Commun. 2018, 9, No. 5128.
  • (14) Amodei D; Egertson J; MacLean BX; Johnson R; Merrihew GE; Keller A; Marsh D; Vitek O; Mallick P; MacCoss MJ. Improving Precursor Selectivity in Data-Independent Acquisition Using Overlapping Windows. J. Am. Soc. Mass Spectrom. 2019, 30, 669–684.
  • (15) Vizcaíno JA; Deutsch EW; Wang R; Csordas A; Reisinger F; Ríos D; Dianes JA; Sun Z; Farrah T; Bandeira N; et al. ProteomeXchange Provides Globally Coordinated Proteomics Data Submission and Dissemination. Nat. Biotechnol. 2014, 32, 223–226.
  • (16) Sharma V; Eckels J; Taylor GK; Shulman NJ; Stergachis AB; Joyner SA; Yan P; Whiteaker JR; Halusa GN; Schilling B; et al. Panorama: A Targeted Proteomics Knowledge Base. J. Proteome Res. 2014, 13, 4205–4210.
  • (17) Pino LK; Searle BC; Huang EL; Noble WS; Hoofnagle AN; MacCoss MJ. Calibration Using a Single-Point External Reference Material Harmonizes Quantitative Mass Spectrometry Proteomics Data between Platforms and Laboratories. Anal. Chem. 2018, 90, 13112–13117.
  • (18) Ghaemmaghami S; Huh W-K; Bower K; Howson RW; Belle A; Dephoure N; O’Shea EK; Weissman JS. Global Analysis of Protein Expression in Yeast. Nature 2003, 425, 737–741.
  • (19) Lawless C; Holman SW; Brownridge P; Lanthaler K; Harman VM; Watkins R; Hammond DE; Miller RL; Sims PFG; Grant CM; et al. Direct and Absolute Quantification of over 1800 Yeast Proteins via Selected Reaction Monitoring. Mol. Cell. Proteomics 2016, 15, 1309–1322.
  • (20) Grant RP; Hoofnagle AN. From Lost in Translation to Paradise Found: Enabling Protein Biomarker Method Transfer by Mass Spectrometry. Clin. Chem. 2014, 60, 941–944.


Data Availability Statement

The RAW files, converted mzML files, EncyclopeDIA elib files, and Skyline documents have been deposited in the ProteomeXchange Consortium15 via the Panorama16 partner repository with the identifiers PXD014815 (ProteomeXchange) and https://panoramaweb.org/matrix-matched_calcurves.url (Panorama). While chromatograms provided in the Panorama Project are visualized using Skyline, all of the statistical analyses in this work were performed using the quantitative matrix output from EncyclopeDIA.
