American Journal of Physiology - Lung Cellular and Molecular Physiology. 2008 May 2;295(1):L16–L22. doi: 10.1152/ajplung.00044.2008

Challenges in translating plasma proteomics from bench to bedside: update from the NHLBI Clinical Proteomics Programs

Robert E Gerszten 1, Frank Accurso 2, Gordon R Bernard 3, Richard M Caprioli 4, Eric W Klee 5, George G Klee 5, Iftikhar Kullo 6, Theresa A Laguna 2, Frederick P Roth 7, Marc Sabatine 8, Pothur Srinivas 9, Thomas J Wang 1, Lorraine B Ware 3; for the NHLBI Clinical Proteomics Programs
PMCID: PMC2494793  PMID: 18456800

Abstract

The emerging scientific field of proteomics encompasses the identification, characterization, and quantification of the protein content or proteome of whole cells, tissues, or body fluids. The potential for proteomic technologies to identify and quantify novel proteins in the plasma that can function as biomarkers of the presence or severity of clinical disease states holds great promise for clinical use. However, there are many challenges in translating plasma proteomics from bench to bedside, and relatively few plasma biomarkers have successfully transitioned from proteomic discovery to routine clinical use. Key barriers to this translation include the need for “orthogonal” biomarkers (i.e., uncorrelated with existing markers), the complexity of the proteome in biological samples, the presence of high abundance proteins such as albumin in biological samples that hinder detection of low abundance proteins, false positive associations that occur with analysis of high dimensional datasets, and the limited understanding of the effects of growth, development, and age on the normal plasma proteome. Strategies to overcome these challenges are discussed.

Keywords: protein content, proteome


The emerging scientific field of proteomics encompasses the identification, characterization, and quantification of the protein content or proteome of whole cells, tissues, or body fluids. The potential for proteomic technologies to identify and quantify novel proteins in the plasma that can function as biomarkers of the presence or severity of clinical disease states holds great promise for clinical use. However, there are challenges in translating plasma proteomics from bench to bedside (37), and to date, relatively few biomarkers have successfully transitioned from proteomic discovery to routine clinical use.

To directly address some of these challenges, the National Heart, Lung, and Blood Institute's (NHLBI) Clinical Proteomics Programs were established in 2005. The overall goal of the NHLBI Clinical Proteomics Programs is to promote systematic, comprehensive, large-scale validation of existing and new candidate protein markers that are appropriate for routine use in the diagnosis and management of heart, lung, blood, and sleep diseases. Specific goals include 1) to design panels of candidate proteins for unmet clinical disease areas; 2) to develop high-throughput analytical methods to simultaneously assay multiple putative markers; 3) to assess the predictive value of these proteomic measurements using biological specimens and clinical data from existing study populations; and 4) to establish procedures and standards for quality control.

In October 2007, the steering committee of the NHLBI Clinical Proteomics Programs met in Rochester, Minnesota. The meeting focused on current perceptions of the barriers to achieving rapid and effective translation of plasma protein biomarkers from discovery to clinical use, and importantly, research directions aimed at overcoming existing limitations. Here, we summarize highlights from this meeting. Table 1 presents an overview of the key challenges that were identified and strategies for overcoming these challenges.

Table 1.

Strategies for overcoming key challenges in plasma proteomics

Challenges | Strategies
Correlated biomarkers often do not improve disease prediction | Search for uncorrelated (orthogonal) biomarkers either through: unbiased discovery experiments or targeted examination of novel pathways (including those identified by recent genetic association studies)
Complexity of the proteome in clinical samples | Assemble data from multiple large patient populations for comparison; incorporate emerging bioinformatics approaches
High abundance proteins in clinical samples | Evaluate emerging depletion techniques for high abundance plasma constituents; make use of improved mass spectrometry techniques with greater dynamic range
Validation of multiplex assays | Compare assays across each source of reagents in each specimen matrix; collaborate with laboratory standardization groups and agencies (e.g., National Institute of Standards, Clinical Laboratory Standards Institute); apply mass spectrometry approaches to multiplex protein quantification
False positive associations with high-throughput proteomic data analysis | Apply pathway/functional trend analysis
Heterogeneity in approach to clinical specimen acquisition | Use standardized specimen collection and storage protocols; expand studies of factors affecting analyses of clinical samples (e.g., freeze-thaw cycles, processing details)
Insufficient knowledge of the effect of growth and development on the normal proteome | Population-based studies of the normal proteome in children

The need for novel biomarkers.

The majority of plasma protein biomarkers that are currently in clinical use in lung and heart disease were developed as an extension of targeted physiological studies, investigating previously identified pathways such as inflammation, endothelial injury, or coagulation and fibrinolysis. Although individual biomarker concentrations have often been associated with a disease diagnosis or an increased risk of adverse events or clinical outcomes, combining individual biomarkers has added only moderately to the prediction of risk in a given individual. This phenomenon is illustrated by recent findings from the Framingham Heart Study. Wang and colleagues (46) evaluated 10 contemporary biomarkers of cardiovascular disease in over 3,000 study participants who were followed for development of cardiovascular disease for almost 10 years. Among the 10 biomarkers measured, many were significant predictors of cardiovascular events (B-type natriuretic peptide, urinary albumin excretion) and mortality (C-reactive protein, B-type natriuretic peptide, urinary albumin excretion, renin, homocysteine). Risk prediction was improved by combining biomarkers into a multimarker score; however, combining the biomarkers yielded only modest changes in the area under the receiver-operating curve (AUC) (Fig. 1).

Fig. 1.

Receiver-operating-characteristic curves for death during 5-yr follow-up. For each end point, curves are based on models of the prediction of risk with the use of conventional risk factors with or without biomarkers (multimarker score). Biomarkers for death were B-type natriuretic peptide, C-reactive protein, the urinary albumin-to-creatinine ratio, homocysteine, and renin. [From Wang et al. (46).]

Evaluation of diagnostic or predictive tests by receiver-operating curve analysis uses the AUC as a measure of the test's ability to discriminate individuals with disease from those without disease. Distributions of biomarkers in individuals with and without disease typically overlap a great deal, which provides one explanation for the modest increase in AUC seen in most clinical biomarker studies (47). Furthermore, existing candidate biomarkers usually derive from pathways, such as lipoprotein metabolism or inflammation, that have already been implicated in atherosclerotic vascular disease. As a consequence, these biomarkers provide predictive information that is correlated with characteristics that are already being measured, limiting their incremental predictive value. The failure of correlated biomarkers to provide incremental predictive value is illustrated in models of hypothetical cancer biomarkers, as shown by Pepe and Thompson (32). Indeed, a large number of correlated biomarkers is likely to be less informative than a smaller number of uncorrelated biomarkers. Thus, to maximize clinical utility, there is a need to identify uncorrelated (“orthogonal”) biomarkers along novel pathways.
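The effect of correlation on incremental predictive value can be illustrated with a small calculation. The sketch below is not taken from Pepe and Thompson (32); it assumes an idealized binormal model in which each marker is normally distributed with unit variance in cases and controls, the two groups differ by d standard deviations on each marker, and the two markers share a correlation rho. Under those assumptions the best linear combination has AUC = Φ(Δ/√2), with Δ² = 2d²/(1 + rho). The function names are hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def auc_single(d: float) -> float:
    """AUC of one marker whose case and control distributions are
    unit-variance normals separated by d (assumed binormal model)."""
    return normal_cdf(d / math.sqrt(2.0))


def auc_pair(d: float, rho: float) -> float:
    """AUC of the optimal linear combination of two such markers
    that have correlation rho (requires rho > -1)."""
    delta = math.sqrt(2.0 * d * d / (1.0 + rho))  # Mahalanobis distance
    return normal_cdf(delta / math.sqrt(2.0))


if __name__ == "__main__":
    d = 0.5  # illustrative separation, not a value from the source
    for rho in (0.0, 0.5, 0.9):
        print(f"rho={rho:.1f}  single={auc_single(d):.3f}  pair={auc_pair(d, rho):.3f}")
```

In this model a second marker that is perfectly correlated with the first (rho = 1) adds nothing, and the gain grows as rho falls toward zero, which is the sense in which "orthogonal" markers are more informative.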

Because it is difficult to demonstrate improvements in the AUC, some investigators have advocated use of other metrics to evaluate new biomarkers, such as model calibration (10) and reclassification percentage (31). The former refers to the correspondence between predicted event rates and observed event rates and is assessed quantitatively using goodness-of-fit measures such as the Hosmer-Lemeshow statistic. Reclassification refers to the ability of new biomarkers to move people between discrete risk categories, so that some low-risk individuals may be reclassified as high risk and vice versa. The utility of these alternative metrics remains under evaluation. For instance, reclassification relies on the use of cutpoints to separate patients into risk categories. Movement between risk categories may not be meaningful if the cutpoints are not well accepted, if there is no management strategy explicitly linked to the categorization, or if most of the movements involve small shifts in absolute risk from just below to just above the cutpoint (45).
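Reclassification can be made concrete with a short sketch. The code below assigns each individual to a risk category under an old and a new model and reports how many move up or down. The cutpoints (6% and 20%) and the predicted risks are hypothetical values chosen for illustration; they are not drawn from the studies cited above.

```python
from bisect import bisect_right


def risk_category(risk: float, cutpoints: list) -> int:
    """Return the index of the risk category (0 = lowest) for a predicted risk."""
    return bisect_right(cutpoints, risk)


def reclassification(old_risks, new_risks, cutpoints):
    """Count individuals who change category when the new model replaces the old."""
    up = down = 0
    for old, new in zip(old_risks, new_risks):
        before = risk_category(old, cutpoints)
        after = risk_category(new, cutpoints)
        if after > before:
            up += 1
        elif after < before:
            down += 1
    total = len(old_risks)
    return {"up": up, "down": down,
            "percent_reclassified": 100.0 * (up + down) / total}


if __name__ == "__main__":
    cutpoints = [0.06, 0.20]          # hypothetical category boundaries
    old = [0.05, 0.07, 0.19, 0.25]    # hypothetical predicted risks
    new = [0.07, 0.05, 0.21, 0.26]
    print(reclassification(old, new, cutpoints))
```

In this toy input every move is a shift of about two percentage points across a cutpoint, so the reclassification percentage is large even though absolute risk barely changes. That is the caveat raised above.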

Challenges inherent to plasma proteomics.

Of the many sources available for identifying new biomarkers of clinical disease diagnosis and severity, the proteome offers the most promise for identifying previously unknown biomarkers that have the potential to be from novel pathways and to be complementary to previously identified biomarkers. The promise of proteomics may exceed that of genomics approaches because proteins and their biological and enzymatic activities are the major determinants of the diversity of phenotypes that can manifest from a common set of genes. The complement of expressed proteins changes rapidly in response to environmental cues. Thus, the proteome is highly suited to represent the state of a cell, tissue, or organism at a given time, in the context of a specific stimulus.

One of the barriers to successful plasma biomarker identification and translation from discovery platforms is the high level of complexity of the proteome. This complexity presents unique analytical challenges that are further magnified with the use of clinical plasma samples to search for novel biomarkers of clinical disease. The plasma proteome is composed of tens of thousands of unique proteins. The plasma proteome does not result from expression of a particular cellular genome; rather, it reflects contributions from the collective expression of many cellular genomes. In fact, it has been hypothesized that the estimated complement of over 300,000 human polypeptide species arising from variable splicing and posttranslational modifications could be present in the plasma proteome (3). Proteins from all functional classes and cellular localizations are found in the plasma, and a majority of the lower abundance proteins are intracellular or membrane proteins, presumably found in the plasma as a result of cellular turnover.

One of the challenges inherent to plasma or serum studies is the issue of high abundance proteins. Greater than 95% of the serum proteome is composed of ∼20 high abundance proteins including albumin and the immunoglobulins (3). These high abundance proteins hinder the ability to detect low abundance proteins. However, it is the low abundance proteins that are most likely to be biologically relevant as markers of a disease state. Concentrations of low abundance proteins may differ from those of high abundance proteins by as much as 10 orders of magnitude. For example, plasma levels of markers of myocardial injury such as the troponins are in the nanomolar range, levels of insulin are in the picomolar range, and levels of the proinflammatory cytokine TNFα may be in the femtomolar range. To address the issue of high abundance proteins, mass spectrometers with wider dynamic ranges have been developed. In addition, there have been advances in technologies that allow depletion of high abundance proteins. New immunodepletion strategies efficiently remove as many as 20 of the high abundance constituents (7). Techniques for immunoextraction and concentration of targeted biomarker fragments may be more reliable (2, 11, 24, 28, 38, 49). However, the degree to which relevant low abundance proteins are lost during processing to remove high abundance proteins is unclear and may be highly variable. A recent study reported that albumin depletion also removed 58% of IL-6, 60% of TNF, and 74% of IL-8 (14).

Challenges in biostatistical analysis of proteomic and biomarker datasets.

Although a high-throughput proteomic approach to plasma biomarker discovery has many advantages, it also brings a danger of generating false positive associations due to multiple testing and overfitting of data. Application of traditional statistical approaches (e.g., Bonferroni correction) in this setting tends to levy an insurmountable statistical penalty that can obscure biologically relevant associations. Even newer statistical techniques, such as advanced resampling methods or control of the false discovery rate (40), do not address adequately the fundamental problem of how to detect subtle but important changes in multiple variables identified with high-throughput proteomic approaches.
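The difference in stringency between the two families of correction can be sketched as follows. The Bonferroni rule is standard. For false discovery rate control the sketch uses the Benjamini-Hochberg step-up procedure as a common example, which is not necessarily the method of Ref. 40. The p-values are invented for illustration.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for false discovery rate control."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    # Find the largest rank k whose ordered p-value is at most k * alpha / m.
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for 10 candidate proteins.
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print("Bonferroni rejections:", sum(bonferroni(pvals)))
    print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg(pvals)))
```

Both procedures treat each protein as an isolated test; neither uses the fact that several proteins in one pathway may shift together, which is the gap that pathway analysis aims to fill.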

In contrast to traditional statistical approaches, a bioinformatics approach using pathways analysis harnesses the vast information gathered in proteomics experiments and turns it into a strength. Specifically, although measurement error in the marker discovery phase often prevents high confidence in any one particular protein's correlation, the observation that multiple proteins in a particular biological pathway are moving in tandem brings confidence that a particular pathway, and hence any biomarkers in that pathway, truly are correlated with the perturbation. By utilizing a more principled selection process for candidate marker triage, this approach increases the likelihood that candidate biomarkers will be validated in subsequent prospective validation studies. This approach also enhances the ability to use the proteomics data collected in the biomarker discovery phase to gain insight into disease biology. Identification of relevant pathways facilitates focus on other biomarkers in a perturbed pathway that may not have been identified in traditional screens as well as exploration of these pathways as possible targets for therapeutic intervention.

Although not yet widely used in proteomics, systematic analysis of functional trends has become widespread and important in the analysis of DNA microarray data from model organisms. An early use of this approach was an analysis by Tavazoie et al. in 1999 (41), in which clusters of genes with mutually similar expression in a synchronized Saccharomyces cerevisiae time-course experiment were examined. In this study, each cluster of genes was examined for overrepresented functional annotation trends (41). This study not only rigorously demonstrated the intuitive notion that coexpressed genes often share a function, but also objectively highlighted specific functional trends, e.g., that budding and cell polarity genes are overrepresented among genes expressed in the M-phase of the cell cycle.

The value of this approach in human studies was illustrated in a recent analysis of high-throughput differential mRNA expression (27). Expression of mRNA was assessed on more than 22,000 genes comparing patients with type 2 diabetes mellitus and unaffected controls (patients with normal glucose tolerance). A group of genes with depressed expression in diabetes vs. controls was identified and tested for association with a collection of other gene characteristics. It was found that this gene set was enriched for genes involved in oxidative phosphorylation. Although individual oxidative phosphorylation genes were not dramatically reduced in expression, as a group the trend was highly significant. Furthermore, the effect was attributable to a subset of oxidative phosphorylation genes regulated by peroxisome proliferator-activated receptor coactivator 1, a cold-inducible regulator of mitochondrial biogenesis. Thus, the analysis of trends among differentially expressed genes led directly to insight into altered metabolism in diabetes patients and hinted at therapeutic hypotheses involving the modulation of oxidative phosphorylation pathways.
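The idea that modest individual changes can add up to a strong group-level trend can be illustrated with a simple calculation. The sketch below combines per-gene z-scores with Stouffer's method; this is a stand-in chosen for brevity, not the enrichment statistic used in Ref. 27, and it assumes the genes' z-scores are independent. The input values are hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def stouffer_z(z_scores):
    """Combine independent z-scores into one group-level z-score."""
    return sum(z_scores) / math.sqrt(len(z_scores))


if __name__ == "__main__":
    # Hypothetical gene set: 16 genes, each only one standard error below control.
    z_scores = [-1.0] * 16
    print("one-sided p for a single gene:", normal_cdf(z_scores[0]))
    group_z = stouffer_z(z_scores)
    print("group z:", group_z, "one-sided p:", normal_cdf(group_z))
```

No single gene in this toy set would pass a conventional significance threshold, yet the combined statistic is far in the tail. Real gene sets are correlated, so the independence assumption overstates the evidence and a permutation-based method would be preferable in practice.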

Emerging software tools, including FuncAssociate (5), recently described by Berriz et al., may be used in conjunction with essentially any high-throughput experimental approach for identifying or ranking genes or proteins. Furthermore, although this approach has generally been used in conjunction with controlled vocabulary functional annotation, e.g., Gene Ontology (GO) annotation, it can be used in conjunction with many different sources of gene/protein/metabolite annotation, e.g., expression pattern in other studies, phenotype, protein complex membership, disease association, or phylogenetic profile.
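The over-representation question that such tools ask can be sketched with a one-sided hypergeometric test: given N measured proteins of which K carry an annotation, how likely is it that a selected list of n proteins contains k or more annotated members by chance? The code is a generic illustration and is not a description of how FuncAssociate is implemented; the function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p(N: int, K: int, n: int, k: int) -> float:
    """Probability of drawing at least k annotated items when n items are
    sampled without replacement from N items, K of which are annotated."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


if __name__ == "__main__":
    # Hypothetical counts: 500 proteins measured, 20 in a pathway,
    # 30 flagged as changed, 6 of those in the pathway.
    print(enrichment_p(N=500, K=20, n=30, k=6))
```

When many annotation terms are tested, the resulting p-values still need a multiple-testing adjustment, but the number of terms is usually far smaller than the number of individual proteins.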

Strengths and limitations of current multiplexing platforms for biomarker validation.

Having established which novel plasma biomarkers are of sufficient interest for validation, emerging technologies allow us to assay multiple markers at once. Below we discuss the strengths and limitations of several multiplex platforms. Strengths and limitations are also summarized in Table 2.

Table 2.

Strengths and limitations of currently available multiplexing platforms for biomarker validation

Multiplex immunoassays: suspension arrays
  Strengths
  1. Smaller sample volume and higher throughput than traditional single analyte immunoassays
  2. Many commercially available multiplex assays available from multiple sources
  Weaknesses
  1. Nonspecific binding of serum proteins directly to microspheres may result in bead aggregation and nonspecific fluorescent emission
  2. Difficult to optimize assay conditions for multiple analytes

Multiplex immunoassays: planar arrays
  Strengths
  1. Smaller sample volume and higher throughput than traditional single analyte immunoassays
  2. Does not require flow cytometric bead analysis
  Weaknesses
  1. Damage to the capture antibodies by mechanical forces may occur during spotting
  2. Large dynamic range of serum protein abundance limits potential combinations of analyte proteins within an array
  3. Difficult to optimize assay conditions for multiple analytes

Mass spectrometry with selected reaction monitoring
  Strengths
  1. Exquisite specificity
  2. Ability to multiplex hundreds of analytes
  Weaknesses
  1. Less sensitivity than immunoassays for low abundance proteins
  2. Generation of peptide targets is labor intensive
  3. High abundance proteins may interfere with analyte detection

Multiplex immunoassay technologies.

There is considerable interest in multiplex arrays that enable simultaneous quantification of multiple proteins (37). Single protein measurement can be laborious, time-consuming, and costly, whereas concurrent measurement of multiple protein biomarkers permits reduced sample consumption, technician time, and reagent volumes and increased sample throughput. The information provided by measurement of a single protein is limited, and multiple markers may be more useful for disease screening and assessing multiple physiological pathways that contribute to disease activity and prognosis.

ELISA, a “workhorse” for protein measurement for decades, has been adapted for multiplex biomarker assessment. Multiplex immunoassay formats may include suspension arrays or planar arrays. For planar arrays, analytes are traditionally detected using fluorescent or chemiluminescent sandwich immunoassay principles in which immobilized “capture” antibodies complex with protein in a biological sample and “detection” antibodies linked to reporter molecules bind the captured protein to create a “sandwich.” The signal generated by the reporter molecules is directly proportional to protein concentration in the unknown sample.
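The proportionality between reporter signal and concentration is what allows an unknown to be read off a standard curve. The sketch below assumes the simplest case, a straight line through the origin fitted to calibrators of known concentration. Real immunoassay curves are often nonlinear and fitted with other models (for example, a four-parameter logistic), so this is an illustration of the principle only. All numbers and function names are hypothetical.

```python
def fit_slope_through_origin(calibrators):
    """Least-squares slope for signal = slope * concentration.

    calibrators: iterable of (concentration, signal) pairs.
    """
    sum_xy = sum(conc * signal for conc, signal in calibrators)
    sum_xx = sum(conc * conc for conc, _ in calibrators)
    if sum_xx == 0:
        raise ValueError("at least one calibrator must have nonzero concentration")
    return sum_xy / sum_xx


def concentration_from_signal(signal, slope):
    """Invert the proportional standard curve for an unknown sample."""
    return signal / slope


if __name__ == "__main__":
    # Hypothetical calibrators: (concentration in pg/mL, reporter signal).
    calibrators = [(10.0, 200.0), (20.0, 400.0), (40.0, 800.0)]
    slope = fit_slope_through_origin(calibrators)
    print(concentration_from_signal(500.0, slope))
```

In a multiplex assay one such curve is needed per analyte, which is part of why optimizing conditions for many analytes at once is difficult.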

In bead-based suspension arrays, capture antibodies are immobilized on polystyrene microspheres (beads) suspended in buffer. Biological sample is added to mixtures of the beads, and a detection antibody-fluorophore conjugate binds the captured protein. Flow cytometric systems allow simultaneous discrimination of bead types and quantification of captured sample proteins. As beads pass through laser beams housed in the flow cytometer, the reporter fluorophore is excited and emits light that is converted to a numeric signal by internal digital processors. Simultaneous excitation of internal bead dyes allows measurement of bead fluorescent intensity that is unique to each assay and used to assign fluorophore values to the correct assay. Several bead-based suspension arrays are commercially available (4, 6, 23, 25, 34). Advantages include in-house assay development, automated nature, and avoidance of the need to spot antibody material (19). A disadvantage is that nonspecific binding of serum proteins directly to the microspheres may result in bead aggregation and nonspecific fluorescent emission, thereby limiting assay sensitivity and accuracy (48).
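The decoding step described above, in which the internal bead dye identifies the assay and the reporter fluorophore carries the measurement, can be sketched as follows. The data layout, the intensity windows for each bead type, and the use of the median as the summary statistic are assumptions made for illustration; they do not describe any particular commercial instrument.

```python
from statistics import median


def decode_beads(events, regions):
    """Assign each bead event to an assay by its internal-dye intensity and
    summarize the reporter signal per assay.

    events:  iterable of (classifier_intensity, reporter_intensity) pairs.
    regions: dict mapping assay name -> (low, high) classifier window.
    """
    per_assay = {name: [] for name in regions}
    for classifier, reporter in events:
        for name, (low, high) in regions.items():
            if low <= classifier < high:
                per_assay[name].append(reporter)
                break  # events matching no window are discarded
    return {name: median(values) for name, values in per_assay.items() if values}


if __name__ == "__main__":
    regions = {"IL-6": (100, 200), "TNF": (300, 400)}   # hypothetical windows
    events = [(150, 10), (160, 30), (155, 20),          # IL-6 beads
              (350, 5), (360, 7),                       # TNF beads
              (250, 999)]                               # unclassifiable event
    print(decode_beads(events, regions))
```

A robust summary such as the median limits the influence of aggregated beads or other outlying events, which relates to the nonspecific binding problem noted above.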

For planar arrays, capture antibodies are discretely immobilized on a rigid microplate surface using robotic arrayers (13). Antibodies can be spotted directly onto the plate's surface using tiny pins (51) or by noncontact arrayers that use piezoelectric elements to transfer capture material to the microplate surface (51). Planar array protocols are comparable to traditional ELISAs and typically use a camera to detect chemiluminescent signal. Numeric values are generated based on the density of light image spots, and data are assigned to a specific assay based on the intra-well location of the light spot. Planar multiplex platforms that can assay 9–16 analytes concurrently are commercially available (12, 26). Technical limitations include the possibility of damage to the capture antibodies by mechanical forces during spotting, and the large dynamic range of serum protein levels, a factor that limits combination of analyte proteins within an array.

For validation of panels of biomarkers by either suspension or planar antibody array technologies, well-characterized multiplex assay components are needed to ensure that the data derived from multiplex assays are useful in the clinical setting. Capture and detection antibody materials should be well characterized and exhibit minimal interlot variability. A sustainable source of antibodies with adequate specificity is key, and lack of specific antibodies can be a major impediment to both singleplex and multiplex assays. Ongoing large-scale efforts to generate antibodies against human epitopes will undoubtedly provide new reagents for multiplex assay development (30). Also, there is a need for validated reference standards that allow accurate and consistent quantification of proteins. Multiplex arrays are classified as in vitro diagnostic multivariate index assays by the Food and Drug Administration (FDA) center. Although formal regulatory guidance for clinical validation of multiplex assays is lacking, regulatory requirements are under review (43).

Multiplex biomarker measurements with mass spectrometry.

Currently, the core technology for identification of novel plasma protein biomarkers is tandem mass spectrometry (MS/MS). Tandem mass spectrometry, coupled with upfront liquid chromatography, is applicable to the readily accessible biological fluids (serum, plasma, urine, etc.) and is highly sensitive for peptides and other small molecules. Recent advances in MS/MS now enable researchers to determine masses of analytes with high precision and accuracy such that many peptides and metabolites can be identified unambiguously even in complex fluids. With a wealth of novel proteins being found in discovery efforts, the emerging field of clinical proteomics focuses on the triage and validation of newly identified protein biomarkers.

Beyond its utility as a platform for biomarker discovery, mass spectrometry with selected reaction monitoring is a powerful tool for identification of small molecules such as drugs and steroids (15, 22) and is increasingly being applied to peptide analyses (20). This method has exquisite specificity due to the unique mass-to-charge ratios of these molecules and the corresponding daughter ion fragments of the selected target peptides. High sensitivity also can be achieved if mass spectrometry is coupled with immunogenic extraction and enrichment methods to increase low concentration target selection (2). Multiplex combinations of these peptide molecules can be quantitated by comparison to isotopically labeled internal standards that are similar to the endogenous substances but are shifted by a small number of mass units (15). The wider application of this technique to measure low concentrations of protein biomarkers requires procedures to cleave the proteins into smaller segments (2, 49) and methods to extract or concentrate the biomarker segments. The basic processes for mass spectrometry measurement of peptide digestion fragments of biomarkers are illustrated in Fig. 2.
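Quantification against a labeled internal standard reduces to a ratio calculation. The sketch below assumes that the endogenous ("light") peptide and its isotopically labeled ("heavy") counterpart give the same signal per mole, so that the ratio of their peak areas equals the ratio of their amounts. The function name, peptide labels, and numbers are hypothetical.

```python
def quantify_by_internal_standard(light_area, heavy_area, heavy_spiked_conc):
    """Estimate endogenous peptide concentration from the light/heavy peak-area
    ratio and the known concentration of the spiked labeled standard."""
    if heavy_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return (light_area / heavy_area) * heavy_spiked_conc


def quantify_panel(measurements):
    """Apply the same calculation to several peptides measured in one run.

    measurements: dict mapping peptide -> (light_area, heavy_area, heavy_conc).
    """
    return {peptide: quantify_by_internal_standard(*values)
            for peptide, values in measurements.items()}


if __name__ == "__main__":
    # Hypothetical peak areas and spiked standard concentrations (fmol/uL).
    panel = {
        "peptide_A": (5000.0, 10000.0, 2.0),
        "peptide_B": (30000.0, 10000.0, 0.5),
    }
    print(quantify_panel(panel))
```

Because each analyte has its own co-eluting standard, losses during digestion cleanup or extraction that affect both forms equally cancel in the ratio.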

Fig. 2.

Flow diagram illustrating mass spectrometry measurement of biomarkers utilizing prior enzyme digestion and immuno-affinity extraction of peptide fragments.

Bioinformatics analysis of the candidate biomarkers is a critical first step. Multiple features need to be evaluated when selecting the optimal peptides for biomarker validation. These features include:

  1) Enzymatic cleavage sites. The peptide cleavage products can be identified by computational tryptic digestion of target protein sequences using PeptideCutter (http://ca.expasy.org/tools/peptidecutter/). This program also predicts cleavage points for 35 common alternative enzymes and chemicals.

  2) Specificity for targeted biomarker. Peptides that are 7–21 residues in length generally are selected to ensure protein-specific targeting. The tryptic peptides can be compared with the RefSeq Human Protein (33) sequence set using the BLAST (blastp) algorithm (1). Peptides displaying strong homology with nonspecific protein targets are excluded.

  3) Immunogenicity for antibody production. Peptide sequence immunogenic potential can be assessed using the GCG Peptidestructure program (17). This program identifies sequences scoring well for immunogenic characteristics such as hydrophilicity, surface probability, and flexibility.

  4) Mass spectrometry signal strength. The mass spectrometry signal strength can be predicted based on the number of prolines present minus the number of methionines and cysteines present.

  5) Likelihood that peptide would be found in plasma. Transmembrane domains are less likely to be found in circulation and are excluded using predictions or annotations in the SwissProt (9) database.

  6) Low probability of posttranslational metabolic changes. Peptides with predicted glycosylation sites and/or phosphorylation sites are avoided. N-linked glycosylation sites can be predicted from NetNGlyc 1.0 (http://www.cbs.dtu.dk/services/NetNGlyc/), O-linked glycosylation site predictions from NetOGlyc 3.1 (18), and phosphorylation site predictions from NetPhos 2.0 (8).
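Several of the selection criteria above can be sketched in a few lines. The code is a simplified stand-in and does not reproduce the tools named in the list: it assumes the basic trypsin rule (cleave after K or R unless the next residue is P), applies the 7–21 residue length window, discards peptides containing the N-X-S/T sequon as a crude proxy for N-linked glycosylation, and ranks the remainder by the proline minus methionine minus cysteine score. Homology, immunogenicity, transmembrane, and phosphorylation checks are omitted, and the example sequence is invented.

```python
import re

# N-X-S/T motif where X is any residue except proline (crude N-glycosylation proxy).
SEQUON = re.compile(r"N[^P][ST]")


def tryptic_digest(sequence):
    """Split after K or R unless the next residue is P (simplified trypsin rule)."""
    return [pep for pep in re.split(r"(?<=[KR])(?!P)", sequence) if pep]


def signal_score(peptide):
    """Number of prolines minus the number of methionines and cysteines."""
    return peptide.count("P") - peptide.count("M") - peptide.count("C")


def select_candidates(sequence, min_len=7, max_len=21):
    """Return (peptide, score) pairs passing the length and sequon filters,
    ordered from highest to lowest predicted signal score."""
    kept = []
    for pep in tryptic_digest(sequence):
        if not (min_len <= len(pep) <= max_len):
            continue  # too short or too long to be a specific target
        if SEQUON.search(pep):
            continue  # possible N-linked glycosylation site
        kept.append((pep, signal_score(pep)))
    return sorted(kept, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy_sequence = "AAKPEPTIDEPLKMCRGNGTAVLLPPKACDEFGHR"  # invented sequence
    for peptide, score in select_candidates(toy_sequence):
        print(peptide, score)
```

A full pipeline would pass the surviving peptides to sequence comparison and structure prediction tools before committing to peptide synthesis and antibody production.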

Once the key peptide sequences have been selected, both the natural peptide (with a conjugation site such as cysteine added) and an isotopically labeled form of the peptide (often 13C) can be readily synthesized using automated platforms. The natural peptides can be conjugated to a carrier protein and used to make polyclonal and/or monoclonal antibodies. The isotopically labeled peptides are used as internal standards for the mass spectrometry (2).

Clinical specimen considerations.

Validation of potential biomarkers of diagnosis or disease severity requires the use of biological fluids from substantial numbers of patients with the relevant disease and appropriate controls. Banked specimens that have been collected in conjunction with prior clinical trials or observational studies have the advantage of immediate availability in conjunction with well-phenotyped patient populations. However, the use of banked specimens also has significant limitations. Foremost, a clinical trial is usually not designed with the goal of validating a diagnostic or disease severity biomarker. Strict inclusion and exclusion criteria may limit the generalizability of findings. Appropriate controls may not be included in the study population. For studies of disease states such as myocardial ischemia, the inherent unpredictability of the onset complicates the timing of blood sampling. The effects of the clinical intervention may also affect the biomarker that is being studied or may confound any association with disease severity (21, 29). Finally, the sample collection and storage procedures may not be sufficiently uniform for reliable biomarker assays.

There are a number of preanalytic variables that can affect the validity of biomarker assays, and these variables need to be considered when designing validation studies and assessing the potential utility of banked specimens (35). Preanalytic variables that can affect assay validity include the method of sample collection, the type of anticoagulants or preservatives that are used, the procedure used to process the sample, the time between collection and assay, and the storage conditions used during this interval. Freezing and thawing, especially repetitive freeze-thaw cycles, may be particularly harmful to some protein analytes (44). Protein degradation can occur at any time from sample collection to time of assay. Investigators (36) in the Vanderbilt Clinical Proteomics program found that significant protein degradation was ongoing in plasma samples that were collected with EDTA and allowed to remain on ice for 7–8 h (Fig. 3) before reverse phase purification for a MALDI-TOF-based discovery platform. Furthermore, archival plasma samples that had been collected as part of a randomized clinical trial of two ventilator strategies in patients with ARDS (42) also showed significant evidence of protein degradation, suggesting that sample collection procedures in that study may not have been optimal for discovery proteomic studies. These findings, in conjunction with a growing body of published literature (reviewed in Ref. 35) on the importance of standardization of sample collection and processing for discovery proteomic studies, suggest that the potential limitations of archival samples must be carefully assessed before using these samples for discovery or validation of protein biomarkers. For some applications, prospective sample collection may be required, although this approach, by necessity, is more time-consuming and expensive. 
Biomarkers that are selected from smaller, carefully phenotyped cohorts that are prospectively collected can be subsequently validated in larger, more heterogeneous populations.

Fig. 3.

Matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) of reverse phase purified plasma samples. Light blue line shows the spectrum from an archival EDTA plasma sample from a patient with acute lung injury (ALI/ARDS) enrolled in a clinical trial. The other spectra (black, red, green, dark blue) are from a single EDTA plasma sample drawn from a healthy volunteer that was spun and frozen immediately, thawed, and either reverse phase purified immediately (black line) or allowed to remain on ice for 7–8 h before reverse phase purification and spotting for MALDI-TOF MS. Note that control samples that were kept on ice for 7–8 h developed new peaks in the low mass-to-charge (m/z) range that were not visualized in the freshly thawed control plasma sample. The archival acute lung injury plasma sample also had numerous peaks in the low m/z range.

Pediatric populations present special challenges for biomarker discovery and validation (16). Often, only small volumes of biological fluids are available for analysis, which limits the number of analytes that can be assessed. In addition, blood collection tubes that are specially designed for proteomic analysis may be precluded if they require more blood than can be taken from small infants and young children. Young children also cannot cooperate with the collection of induced sputum, a specimen type that is increasingly used for airway biomarker studies (39). Even outpatient collection of urine is more difficult in small children, and families may have to wait hours to obtain the specimen.

Biomarker discovery and validation in pediatrics also occur against a background of growth and development. Because the normal proteome in children is not well understood, it can be difficult to determine whether protein products are related to disease or to normal growth and development. For example, Winfield et al. (50) found that urinary levels of desmosine, a breakdown product of elastin, were much higher in normal infants less than 2 yr old than in older children. A possible interpretation is that the lung is undergoing remodeling as part of normal development in infants. In any case, this finding illustrates that before elastin breakdown can be studied as a biomarker of lung injury in pediatrics, studies of normal children of all ages are necessary.
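One way to make the need for age-specific norms concrete is to compute reference intervals separately for each age band from healthy-control measurements and to interpret a patient value only against the band for that child's age. The sketch below assumes a central 95% interval (2.5th to 97.5th percentile). The age bands, function names, and all values are synthetic placeholders in arbitrary units, and are not data from Winfield et al. (50).

```python
from statistics import quantiles

# Hypothetical age bands in years, each covering [low, high).
AGE_BANDS = [(0.0, 2.0), (2.0, 6.0), (6.0, 12.0)]


def band_for(age_years):
    """Return the age band containing age_years, or None if outside all bands."""
    for low, high in AGE_BANDS:
        if low <= age_years < high:
            return (low, high)
    return None


def reference_intervals(controls):
    """Build per-band (2.5th, 97.5th percentile) intervals.

    controls: iterable of (age_years, value) pairs from healthy children.
    Bands with fewer than two control values get no interval.
    """
    by_band = {}
    for age, value in controls:
        band = band_for(age)
        if band is not None:
            by_band.setdefault(band, []).append(value)
    intervals = {}
    for band, values in by_band.items():
        if len(values) < 2:
            continue
        # n=40 yields cut points every 2.5%; first and last bound the central 95%.
        cuts = quantiles(values, n=40, method="inclusive")
        intervals[band] = (cuts[0], cuts[-1])
    return intervals


def classify(age_years, value, intervals):
    """Compare a value with the reference interval for the child's age band."""
    band = band_for(age_years)
    if band is None or band not in intervals:
        return "no_reference"
    low, high = intervals[band]
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "within"


if __name__ == "__main__":
    # Synthetic healthy-control values (arbitrary units), for illustration only.
    controls = [(0.5, 80), (0.8, 90), (1.0, 100), (1.2, 110), (1.5, 120),
                (7.0, 10), (8.0, 15), (9.0, 20), (10.0, 25), (11.0, 30)]
    intervals = reference_intervals(controls)
    for age in (1.0, 3.0, 8.0):
        print(age, classify(age, 90, intervals))
```

In this toy example the same measured value is unremarkable for an infant but elevated for an older child, and cannot be interpreted at all for an age band with no healthy controls, which is the situation the paragraph above warns about.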

Conclusions.

With the explosion of genetic and genomic studies of human disease, including the growing number of genome-wide association studies, there is a critical need for complementary proteomic technologies. The potential for plasma proteomic analysis to identify and quantify novel proteins that can function as plasma biomarkers of the presence or severity of clinical disease continues to hold great promise for clinical use. Standardized approaches to sample collection and preparation, new analytical techniques, and novel algorithms for biostatistical and bioinformatics analysis will facilitate the translation of plasma proteomics from the bench to the bedside and allow the great potential of clinical proteomics to be realized.

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

REFERENCES

  • 1. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol 215: 403–410, 1990.
  • 2. Anderson NL, Anderson NG, Haines LR, Hardi DB, Olafson RW, Pearson TW. Mass spectrometric quantitation of peptides and proteins using stable isotope standards and capture by anti-peptide antibodies. J Proteome Res 3: 235–244, 2004.
  • 3. Anderson NL, Polanski M, Pieper R, Gatlin T, Tirumalai RS, Conrads TP, Veenstra TD, Adkins JN, Pounds JG, Fagan R, Lobley A. The human plasma proteome: a nonredundant list developed by combination of four separate sources. Mol Cell Proteomics 3: 311–326, 2004.
  • 4. BD Biosciences. BD Biosciences FACSarray, accessed December 2007 at http://www.bdbiosciences.com/pharmingen/products/display_product.php?keyID=9.
  • 5. Berriz GF, King OD, Bryant B, Sander C, Roth FP. Characterizing gene sets with FuncAssociate. Bioinformatics 19: 2502–2504, 2003.
  • 6. Bio-Rad Laboratories. Bio-Rad Bio-Plex System and Suspension Array Technology, accessed December 2007 at www.biorad.com/bio-plex.
  • 7. Bjorhall K, Miliotis T, Davidsson P. Comparison of different depletion strategies for improved resolution in proteomic analysis of human serum samples. Proteomics 5: 307–317, 2005.
  • 8. Blom N, Gammeltoft S, Brunak S. Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol 294: 1351–1362, 1999.
  • 9. Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, Martin MJ, Michoud K, O'Donovan C, Phan I, Pilbout S, Schneider M. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 31: 365–370, 2003.
  • 10. Cook NR. Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation 115: 928–935, 2007.
  • 11. Elia G, Silacci M, Scheurer S, Scheuermann J, Neri D. Affinity-capture reagents for protein arrays. Trends Biotechnol 20: S19–S22, 2002.
  • 12. Endogen. Endogen Searchlight, accessed December 2007 at http://www.endogen.com/services/.
  • 13. Glokler J, Angenendt P. Protein and antibody microarray technology. J Chromatogr B Analyt Technol Biomed Life Sci 797: 229–240, 2003.
  • 14. Granger J, Siddiqui J, Copeland S, Remick D. Albumin depletion of human plasma also removes low abundance proteins including the cytokines. Proteomics 5: 4713–4718, 2005.
  • 15. Guo T, Chan M, Soldin SJ. Steroid profiles using liquid chromatography-tandem mass spectrometry with atmospheric pressure photoionization source. Arch Pathol Lab Med 128: 469–475, 2004.
  • 16. Hunsucker SW, Accurso FJ, Duncan MW. Proteomics in pediatric research and practice. Adv Pediatr 54: 9–28, 2007.
  • 17. Jameson BA, Wolf H. The antigenic index: a novel algorithm for predicting antigenic determinants. Comput Appl Biosci 4: 181–186, 1988.
  • 18. Julenius K, Molgaard A, Gupta R, Brunak S. Prediction, conservation analysis, and structural characterization of mammalian mucin-type O-glycosylation sites. Glycobiology 15: 153–164, 2005.
  • 19. Kersten B, Wanker EE, Hoheisel JD, Angenendt P. Multiplex approaches in protein microarray technology. Expert Rev Proteomics 2: 499–510, 2005.
  • 20. Keshishian H, Addona T, Burgess M, Kuhn E, Carr SA. Quantitative, multiplexed assays for low abundance proteins in plasma by targeted mass spectrometry and stable isotope dilution. Mol Cell Proteomics 6: 2212–2229, 2007.
  • 21. Kirschenlohr HL, Griffin JL, Clarke SC, Rhydwen R, Grace AA, Schofield PM, Brindle KM, Metcalfe JC. Proton NMR analysis of plasma is a weak predictor of coronary artery disease. Nat Med 12: 705–710, 2006.
  • 22. Klee GG. Laboratory techniques for recognition of endocrine disorders. In: Williams Textbook of Endocrinology (11th ed.), edited by Kronenberg HM, Melmed S, Polonsky KS, and Larsen PR. Philadelphia, PA: Saunders Elsevier, 2008, p. 67–81.
  • 23. Luminex Corporation. Luminex Corporation xMAP Technology, accessed December 2007 at http://www.luminexcorp.com/index.html.
  • 24. Martosella J, Zolotarjova N, Liu H, Nicol G, Boyes BE. Reversed-phase high-performance liquid chromatographic prefractionation of immunodepleted human serum proteins to enhance mass spectrometry identification of lower-abundant proteins. J Proteome Res 4: 1522–1537, 2005.
  • 25. Melton L. Protein arrays: proteomics in multiplex. Nature 429: 101–107, 2004.
  • 26. Meso Scale Discovery. Meso-scale Discoveries Sector Imager, accessed December 2007 at http://www.mesoscale.com/.
  • 27. Mootha VK, Lindgren CM, Eriksson KF, Subramanian A, Sihag S, Lehar J, Puigserver P, Carlsson E, Ridderstrale M, Laurila E, Houstis N, Daly MJ, Patterson N, Mesirov JP, Golub TR, Tamayo P, Spiegelman B, Lander ES, Hirschhorn JN, Altshuler D, Groop LC. PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet 34: 267–273, 2003.
  • 28. Morozov VN, Morozova TY. Active bead-linked immunoassay on protein microarrays. Anal Chim Acta 564: 40–52, 2006.
  • 29. Nielsen EM, Hansen L, Carstensen B, Echwald SM, Drivsholm T, Glumer C, Thorsteinsson B, Borch-Johnsen K, Hansen T, Pedersen O. The E23K variant of Kir6.2 associates with impaired post-OGTT serum insulin response and increased risk of type 2 diabetes. Diabetes 52: 573–577, 2003.
  • 30. Nilsson P, Paavilainen L, Larsson K, Odling J, Sundberg M, Andersson AC, Kampf C, Persson A, Al-Khalili Szigyarto C, Ottosson J, Bjorling E, Hober S, Wernerus H, Wester K, Ponten F, Uhlen M. Towards a human proteome atlas: high-throughput generation of mono-specific antibodies for tissue profiling. Proteomics 5: 4327–4337, 2005.
  • 31. Pencina MJ, D'Agostino RBS, D'Agostino RBJ, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 27: 157–172, 2008.
  • 32. Pepe MS, Thompson ML. Combining diagnostic test results to increase accuracy. Biostatistics 1: 123–140, 2000.
  • 33. Pruitt KD, Tatusova T, Maglott DR. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res 35: D61–D65, 2007.
  • 34. Qiagen. Qiagen Liquichip, accessed December 2007 at http://www1.qiagen.com/Products/Protein/Assay/LiquiChipSystem/LiquiChipWorkstation.aspx.
  • 35. Rai AJ, Vitzthum F. Effects of preanalytical variables on peptide and protein measurements in human serum and plasma: implications for clinical proteomics. Expert Rev Proteomics 3: 409–426, 2006.
  • 36. Raj JU, Aliferis C, Caprioli RM, Cowley AW Jr, Davies PF, Duncan MW, Erle DJ, Erzurum SC, Finn PW, Ischiropoulos H, Kaminski N, Kleeberger SR, Leikauf GD, Loyd JE, Martin TR, Matalon S, Moore JH, Quackenbush J, Sabo-Attwood T, Shapiro SD, Schnitzer JE, Schwartz DA, Schwiebert LM, Sheppard D, Ware LB, Weiss ST, Whitsett JA, Wurfel MM, Matthay MA. Genomics and proteomics of lung disease: conference summary. Am J Physiol Lung Cell Mol Physiol 293: L45–L51, 2007.
  • 37. Rifai N, Gillette MA, Carr SA. Protein biomarker discovery and validation: the long and uncertain path to clinical utility. Nat Biotechnol 24: 971–983, 2006.
  • 38. Safarik I, Safarikova M. Magnetic techniques for the isolation and purification of proteins and peptides. Biomagn Res Technol 2: 7, 2004.
  • 39. Sagel SD, Chmiel JF, Konstan MW. Sputum biomarkers of inflammation in cystic fibrosis lung disease. Proc Am Thorac Soc 4: 406–417, 2007.
  • 40. Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci USA 100: 9440–9445, 2003.
  • 41. Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, Church GM. Systematic determination of genetic network architecture. Nat Genet 22: 281–285, 1999.
  • 42. The Acute Respiratory Distress Syndrome Network. Ventilation with lower tidal volumes as compared with traditional tidal volumes for acute lung injury and the acute respiratory distress syndrome. N Engl J Med 342: 1301–1308, 2000.
  • 43. US Department of Health and Human Services and Food and Drug Administration Center for Devices. Draft Guidance for Industry, Clinical Laboratories, and FDA Staff. In Vitro Diagnostic Multivariate Index Assays, accessed December 2007 at http://www.fda.gov/cdrh/oivd/guidance/1610.pdf.
  • 44. Villanueva J, Philip J, Chaparro CA, Li Y, Toledo-Crow R, DeNoyer L, Fleisher M, Robbins RJ, Tempst P. Correcting common errors in identifying cancer-specific serum peptide signatures. J Proteome Res 4: 1060–1072, 2005.
  • 45. Wang TJ. New cardiovascular risk factors exist, but are they clinically useful? Eur Heart J 29: 441–444, 2008.
  • 46. Wang TJ, Gona P, Larson MG, Tofler GH, Levy D, Newton-Cheh C, Jacques PF, Rifai N, Selhub J, Robins SJ, Benjamin EJ, D'Agostino RB, Vasan RS. Multiple biomarkers for the prediction of first major cardiovascular events and death. N Engl J Med 355: 2631–2639, 2006.
  • 47. Ware JH. The limitations of risk factors as prognostic tools. N Engl J Med 355: 2615–2617, 2006.
  • 48. Waterboer T, Sehr P, Pawlita M. Suppression of non-specific binding in serological Luminex assays. J Immunol Methods 309: 200–204, 2006.
  • 49. Whiteaker JR, Zhao L, Zhang HY, Feng LC, Piening BD, Anderson L, Paulovich AG. Antibody-based enrichment of peptides on magnetic beads for mass-spectrometry-based quantification of serum biomarkers. Anal Biochem 362: 44–54, 2007.
  • 50. Winfield KR, Gard S, Kent GN, Sly PD, Brennan S. Assay for urinary desmosines in a healthy pre-pubertal population using an improved extraction technique. Ann Clin Biochem 43: 146–152, 2006.
  • 51. Zhu H, Snyder M. Protein chip technology. Curr Opin Chem Biol 7: 55–63, 2003.

Articles from American Journal of Physiology. Lung Cellular and Molecular Physiology are provided here courtesy of American Physiological Society
