Author manuscript; available in PMC: 2011 Feb 1.
Published in final edited form as: Trends Analyt Chem. 2010 Feb 1;29(2):128. doi: 10.1016/j.trac.2009.11.007

Biomarker discovery and clinical proteomics

Jerzy Silberring 1, Pawel Ciborowski 2,*
PMCID: PMC2822390  NIHMSID: NIHMS172045  PMID: 20174458

Abstract

New biomarkers are urgently needed to accelerate efforts in developing new drugs and treatments of known diseases. New clinical and translational proteomics studies emerge almost every day. However, discovery of new diagnostic biomarkers lags behind because of variability at every step in proteomics studies (e.g., assembly of a cohort of patients, sample preparation and the nature of body fluids, selection of a profiling method and uniform protocols for data analysis).

Quite often, the validation step that follows the discovery phase does not reach desired levels of sensitivity and specificity or reproducibility between laboratories. Mass spectrometry and gel-based methods do not provide enough throughput for screening thousands of clinical samples. Further development of protein arrays may address this issue.

Despite many obstacles, proteomics delivers vast amounts of information useful for understanding the molecular mechanisms underlying diseases.

Keywords: Bioinformatics, Biomarker, Clinical, Electrophoresis, Immunodepletion, Liquid chromatography, Mass spectrometry, Protein array, Proteomics, Sample preparation

1. Introduction

The proteome is a reflection of all processes of any living organism. Proteomes are indispensable in determining pathophysiology and defining what “normal state” means functionally.

Besides identifying protein sequences, modern proteomics aims to develop novel methodologies that have a direct impact on clinical diagnosis, new drug design, clinical trials and control of therapy. For example, genomic profiling identified the deletion of phenylalanine at position 508 of the cystic fibrosis transmembrane conductance regulator (CFTR) protein, which impairs intracellular trafficking to the cell membrane and is directly linked to the causes of cystic fibrosis [1]. However, in other instances, identification of mutations is only an indication of the risk of developing the disease, which may not be substantiated at the molecular level [2]. Although genomic sequencing and proteomics each address a portion of the problem, it would be naive to assess which part is more important because they depend on each other. Knowing and understanding changes in proteomes are therefore integral parts of the broad evaluation of prognosis, diagnosis, disease progression and effectiveness of treatment.

Technological development has led us to an era of individualized therapy. New-generation sequencing technology allows massive sequencing of genetic material at a speed that was unimaginable five years ago. This lays the foundation for the broad use of pharmacogenomics [3]. Clinical proteomics, which is expected to enhance translational “bench-to-bed” studies, still faces methodological drawbacks that need to be resolved before it is truly mature and robust enough to be implemented in clinical diagnosis [4].

2. What is clinical proteomics?

Defining clinical proteomics is complicated. Proteomics emerged as a result of two factors: the maturation of global genomic sequencing and profiling, and the unprecedented technological boom in mass spectrometry (MS) in the 1990s. The term “proteomics” was quickly accepted and the field expanded. The growth was associated with the hope that accelerated advancement in understanding how biological systems work would result in identification of new drug targets and diagnostic biomarkers. The discovery of diagnostic biomarkers for a plethora of diseases appeared to be close and spurred numerous publications reporting screening of clinical samples.

This led to the development of clinical proteomics, characterized by two parallel avenues. One is to utilize body fluids as primary clinical material. The other is to use tissue samples representing pathological change. However, each approach faces inherent problems related to global protein profiling. A dominant common problem in both cases is proper assembly of clinical cohorts providing samples from clinically well-defined individuals. More questions arise after the clinical samples are assembled, such as the selection of protocols used in sample preparation, profiling platform, and ways of analyzing data. In many instances, after lengthy and costly effort, many biomarker candidates do not reach the desired sensitivity and specificity.

An early, straightforward strategy to differentiate groups of samples was based on large-scale profiling of proteins to accelerate direct application to clinical diagnosis, but it has not delivered the results expected. Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF, Ciphergen, Inc.) technology was meant to serve this purpose but proved disappointing.

New proteomics profiling strategies followed. However, we still do not observe a flood of new diagnostic biomarkers. Immunodepletion of abundant proteins from plasma or cerebrospinal fluid (CSF) was the next step in reducing the dynamic range of proteins in plasma/serum/CSF samples; however, the process of preparing samples introduces variability and the problem of sample normalization. Within the healthy population, a substantial range of serum albumin concentration (3.5–5.0 g/dl) creates uncertainty as to whether validation should be performed based on non-depleted or immunodepleted samples.

Summarizing, clinical proteomics is still in its infancy and requires a combination of basic research and clinical studies. Between those two disciplines, there is a constant exchange of information, a driving force of translational medicine, which brings together all the data necessary to support diagnosis (genome, proteome, histopathology, and imaging). This is part of systems biology (i.e. understanding how the human organism functions in health and disease). It will be very difficult, if not impossible, to advance clinical proteomics without a strong foundation and constant feedback from basic research. For now, applications of clinical proteomics, such as monitoring of clinical trials, personalized diagnosis and therapy, and rapid search for any toxic effects of drugs, are still waiting for new biomarkers. Moreover, it seems that we are still far away from implementing rapid assays that can be uniformly applied in clinical laboratories.

3. Disease and diagnostic biomarkers

Broadly speaking, a biomarker can be any biomolecule or a specific characteristic, feature, or indicator of a change in any biological structure or function that objectively reflects the state of a living organism.

For clinical biomarkers, the major objective is to evaluate an individual's state of health for diagnosis of any type of disease, preventive treatment, and/or treatment efficacy if a disease has been diagnosed. These diagnostic biomarkers need to be sensitive and specific enough to be considered useful in objective laboratory tests. Finding such biomarkers is difficult, and many potential candidates, including protein markers, do not meet such criteria. This has led to a decline in approvals of diagnostic biomarkers by the US Food and Drug Administration (FDA), despite the development of technologies, including proteomics [5].
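The two criteria named above can be made concrete with a small calculation. The following is a minimal sketch, not part of the original article: the function name and the cohort counts are hypothetical and serve only to show how sensitivity and specificity are derived from the outcome of a validation study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from validation counts.

    tp: diseased subjects the candidate marker flags as positive
    fn: diseased subjects the marker misses
    tn: healthy subjects the marker correctly reports as negative
    fp: healthy subjects the marker wrongly flags as positive
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy subject")
    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of healthy subjects cleared
    return sensitivity, specificity


# Hypothetical cohort: 100 patients and 100 healthy controls.
sens, spec = sensitivity_specificity(tp=85, fn=15, tn=90, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Both values depend on how well the cohort is defined clinically, which is one reason why candidates that look promising in discovery can fall short in validation.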

Disease biomarkers constitute another group that can be informative about underlying molecular mechanisms of disease or treatment, although they do not necessarily meet the criteria of diagnostic specificity and sensitivity. An example is disease biomarkers for neurodegenerative disorders. They can be indicative of neuroinflammation, although not specific for any particular disease of the CNS. An example is the decreasing levels of complement C3 in CSF and plasma of HIV-infected patients, which correlates with development of HIV-1 Associated Dementia (HAD), reflecting inflammation and immunosuppression [6–8], although it does not constitute a diagnostic biomarker because of the lack of specificity to HIV-1 infection. Other proteins may correlate with development of disease, but their function(s) is (are) not clear enough to use as biomarker(s) because they may represent a confounding effect.

Regardless of study objectives, the proteomics approach has to contain sequential steps of workflow. Fig. 1 summarizes such a workflow, comprising sample preparation, quantitative analysis, data acquisition, database searches and final bioinformatics analysis. We next discuss the strengths and the weaknesses of each phase, then validation of biomarkers and verification of whether they have diagnostic value or remain disease markers.

Figure 1.

Typical proteomic workflow used in clinical proteomics leading to biomarker discovery. Sample preparation and the first steps of fractionation (e.g., immunodepletion) can be considered as one process, introducing substantial variability. The chemical-labeling step, whether of intact proteins or of peptides, is the least susceptible to introduction of variability because of the wide application of well-established procedures. Most researchers use standard nano-LC-ESI-MS2 or MALDI-TOF2 methods, which can be easily compared and cross-validated. Bioinformatics tools may translate only part of the data into biologically-relevant information and currently no solid standards are available. Building knowledge (data interpretation) is entirely at the discretion of individual investigators.

4. Sample preparation as a primary step in creating artifacts

Table 1 summarizes samples used for clinical proteomics and their brief characteristics. Please note that, in all of these samples, one can find highly abundant proteins that may easily mask proteins of interest – potential biomarkers. Certainly, serum albumin is the most abundant and ubiquitous.

Table 1.

Summary of material used for clinical proteomics

Sample/material Characteristics Comments
Serum/plasma Relatively easy to get patient's consent to draw blood. Rich in highly-abundant proteins so requires depletion of at least six most abundant proteins. May contain significant levels of protein-degradation products. Mixture of over 3000 proteins. Depletion of highly-abundant proteins may lead to removal of other proteins of interest. If red blood cells lyse during sample collection and processing, hemoglobin will be abundantly present and will interfere with profiling.
Cerebrospinal fluid (CSF) 10–100 times lower amount of total protein content than serum/plasma. Invasive procedure of lumbar puncture limits justification for collecting CSF. Like serum/plasma, requires depletion of highly-abundant proteins. The only clinical material having direct contact with the central nervous system that can be obtained from a living person. Very useful to study neuropeptides and neurotransmitters.
Urine Easy to obtain from pediatric patients. Also useful in search of biomarkers for extrarenal diseases (e.g., coronary artery disease). High salt content. Results require normalization (e.g., creatinine). Very useful for diagnosis of renal pathologies. May contain bacterial proteins.
Saliva Easy to obtain in relatively high quantities. High content of glycoproteins (e.g., mucins). Very large number of phenotypes (polymorphism). Suitable and recommended source for detecting illicit drugs.
Solid-tissue biopsies Usually small pieces of tissue providing limited material for proteomics studies. Clinical material from the central nervous system is available only post mortem. One biopsy sample may contain cells at various stages of pathology (e.g., cancer). Lack of homogeneity may lead to difficulty in biomarker identification.
Blood cells More difficult to obtain than fluids. Larger amount of blood necessary or plasmapheresis for white cells. Important information (e.g., for leukemia studies). Isolation of sub-fractions requires density-gradient centrifugation (e.g., Lymphoprep). Need for extensive washing. T helper cells are characterized by a variety of proteins belonging to the MHC system.
Tears Methods of material collection are still under debate. Limited amount of protein requires sample pooling. Some highly-abundant proteins (e.g., albumin, lactoferrin, lysozyme) may mask other proteins.

Early experiments in proteomics profiling of serum and plasma made it evident that there is no technology platform that can analyze proteins quantitatively with a dynamic range of concentration as high as 10^12 [5] and that pre-fractionation of these samples is necessary [9,10]. One reason for the failure of SELDI-TOF was the assumption that low-specificity binding of proteins to chips with ion-exchange or hydrophobic surfaces would be enough to substitute for pre-fractionation and/or immunodepletion of serum/plasma samples. Although SELDI-TOF technology is still used by some [11–13], this platform is far from the mainstream in biomarker-discovery efforts.

Currently, the major objective of clinical proteomics utilizing body fluids is to reduce the dynamic range of proteins in analyzed samples [5,10]. Initially, columns and cartridges for albumin and IgG were available [14,15] and were soon followed by columns for multiple-protein removal, based on immunodepletion [16]. In a relatively short period, removal of the most abundant proteins from serum/plasma and CSF samples became a standard first step in clinical proteomics analyses aiming at biomarker discovery [17]. This widely-used approach is now commonly accepted as the first step in sample preparation and it is quite obvious that immunodepletion of the 12 most abundant proteins is necessary (i.e. serum albumin, IgG, fibrinogen, transferrin, IgA, IgM, haptoglobin, apo A-I, apo A-II, α1-antitrypsin, α1-acid glycoprotein, α2-macroglobulin). These proteins comprise over 96% of total protein content in plasma/serum [5]. It is apparent that, without their removal from the plasma sample, the number of protein spots identified on a standard pH 3–10, 17-cm two-dimensional electrophoresis (2DE) gel is significantly lower than for immunodepleted samples [6,18].

However, immunodepletion of multiple proteins also risks losing proteins of interest or low-abundance candidate biomarkers that are removed along with those specifically depleted. Albumin is the most abundant protein in plasma and is a carrier of many proteins and other compounds (e.g., lipoproteins and amino acids), so the removal of albumin may have a profound impact on the final quantitative effect of proteomics profiling [19]. Subsequently, other approaches were used (e.g., lectin-affinity columns [20]).

Regardless of how we attempt to reduce the complexity of the plasma/serum sample, there is no consensus on how many and which proteins or groups of proteins should be removed from these samples prior to proteomics profiling. As stated above, this is one of the significant problems awaiting a solution to make the discovery of clinical biomarkers more effective. Moreover, removal of highly-abundant proteins also applies to cell and tissue lysates. We have found that supernatants from in vitro cultures of human monocyte-derived macrophages (MDM) that were washed thoroughly three times with serum-free medium and subsequently cultured in such medium for 24 hours still contained serum albumin at a level that interfered with proteomics profiling [21].

Numerous methods of protein fractionation prior to proteomics profiling have been established [22]. However, these methods do not meet the criteria needed to reduce the complexity of the sample to measure, e.g., the degree of phosphorylation of one potential site that may occur at a level below 10% of a medium- or low-abundance protein. In addition, if such a protein is in the complex mixture of proteins of the whole cell lysate or serum/plasma, accurate quantitation remains a big challenge.

Can multiple reaction monitoring (MRM) be a solution? Such studies, although promising and able to measure at femtomolar levels [23], do not give us a clear answer about their utility for global profiling of hundreds of proteins in one analytical run [24–26]. Also, such experiments often require specialized tools (e.g., isotopically-labeled and phosphorylated protein standards, which need to be synthesized and used as internal controls [27]). This makes them expensive and time consuming, and often they do not target the proteins and/or protein forms that are vital for analysis.

Summarizing, in the years ahead, sample preparation will require much more standardized effort at the bench level.

5. Multistep profiling is another source of variability

Many investigators share the view that 2DE analysis has too many steps requiring manual manipulation and therefore introduces significantly higher experimental error, and they see a remedy in applying experimental protocols based on automated processes. We disagree with this view and provide a comparative analysis of multi-step protocols of two widely-used methods of quantitative proteomics profiling (i.e. iTRAQ and 2DE DIGE).

Regardless of the method used, immunodepletion, if chosen as the first step of sample preparation, is identical in all experimental designs. Post-immunodepletion sample concentration and preparation for labeling either with Cy-dyes or iTRAQ involve very similar, if not identical, steps of manual manipulation. However, the latter technique requires in-solution enzymatic digestion (usually with trypsin), which has to be complete to be fully reproducible. Subsequently, samples are separated in two dimensions, either by strong cation-exchange chromatography in combination with reversed-phase high-performance liquid chromatography (RP-HPLC) when iTRAQ is used, or by isoelectric focusing (IEF) followed by gel electrophoresis in the second dimension of 2DE DIGE. This step raises the most controversy. Proponents of the HPLC step argue that multidimensional protein identification technology (MudPIT) is an automated process using autosamplers for loading samples, so the error is lower than in hand manipulation of IPG strips inserted manually onto PAGE gels.

Agilent, Inc. recently developed an OFFGEL Fractionator for separations based on isoelectric focusing (IEF) in solution. Such ideas are not new. Many years ago, BioRad developed the Rotofor, which it recently miniaturized, and now offers a family of such devices. Also, Invitrogen offers the ZOOM IEF Fractionator. The ProteomeLab PF2D system from BeckmanCoulter, Inc. uses LC columns for first-dimension fractionation based on IEF in solution. The disadvantage of the ProteomeLab PF2D system is that it requires a relatively high initial volume of plasma/serum sample to perform quantitative profiling in full scale. However, all samples are in a liquid form and can be used for all downstream analyses [28].

The most recent device in this category is the dPC Fractionator offered by Protein Forest, Inc. Whether IEF in a liquid phase provides better reproducibility than IPG-strip technology remains to be seen; more studies and thorough comparisons are needed.

Further steps of protein identification using LC-MS2 are identical and use the same software and algorithms [29]. We therefore see 2DE DIGE remaining a vital proteomics platform [30,31], complementary to those based on LC. Each profiling approach has inherent problems with variability that will inevitably have an impact on results.

Based on current status, 2DE DIGE and MudPIT, including iTRAQ approaches, are used equally frequently in proteomics studies of plasma/serum. Other techniques (e.g., SELDI-TOF and ProteomeLab PF2D) are still used but to a lesser extent. We expect that such trends will remain in clinical proteomics for the foreseeable future.

6. Quantitation in proteomics experiments

One of the goals of global profiling pursued by many investigators is to maximize the number of high-confidence identifications of peptides, and thus proteins [32,33], but this does not address quantitative differences, without which it is impossible to discover good, reliable biomarkers. MS-based protein identification combined with quantitative measurements is therefore at the center of development of new technologies and methods.

For many years, chemometrics has provided tools widely used in analytical chemistry. Those same tools can be utilized in proteomics. However, we have to be aware of a number of caveats that may have a profound effect on whether an observed difference between two samples is real. First, it is necessary to quantify the relative change between samples. This is widely used in 2DE DIGE experiments, in which the pooled internal standard is used for data normalization rather than absolute quantitation. Because the gel image in 2DE DIGE experiments is three-dimensional, the most efficient way of quantitation is comparison of spot volumes. In LC, peaks are two-dimensional and the peak area is used for quantitative purposes. Overall, there is not much difference in methods of analysis and the same mathematical models can be used. The advantage of this approach is that, in one experiment, we are able to generate ratios of relative abundance for thousands of proteins.
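The relative quantitation described above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions rather than the procedure of any particular software package: each spot volume is divided by the volume of the same spot in the pooled internal standard run on the same gel, and groups are then compared on a log2 scale. The function names and spot volumes are hypothetical.

```python
import math


def standardized_abundance(sample_volume, standard_volume):
    """Spot volume of a sample relative to the pooled internal standard on the same gel."""
    if sample_volume <= 0 or standard_volume <= 0:
        raise ValueError("spot volumes must be positive")
    return sample_volume / standard_volume


def log2_fold_change(case_gels, control_gels):
    """Mean log2 standardized abundance of cases minus that of controls.

    Each argument is a list of (sample_volume, standard_volume) pairs for
    one protein spot, one pair per gel.
    """
    def mean_log2(gels):
        if not gels:
            raise ValueError("each group needs at least one gel")
        logs = [math.log2(standardized_abundance(s, st)) for s, st in gels]
        return sum(logs) / len(logs)

    return mean_log2(case_gels) - mean_log2(control_gels)


# Hypothetical spot measured on two gels per group. The second gel of each
# group stained more strongly overall, which the internal standard corrects.
cases = [(2000.0, 1000.0), (4400.0, 2200.0)]
controls = [(1000.0, 1000.0), (2200.0, 2200.0)]
print(log2_fold_change(cases, controls))
```

The same arithmetic applies to LC peak areas, with the peak area taking the place of the spot volume.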

Several methods have been developed to aid quantitation of complex mixtures of proteins, including labeling approaches (SILAC, iTRAQ, ICAT, 18O) and label-free methods. Results of any of these techniques provide only partial information and show trends in protein expression, making validation necessary downstream. Moreover, all these techniques appear to be complementary and not exclusive [34], and the application of some of them in clinical proteomics is limited. For example, SILAC can be used to study differential expression of secreted proteins in in vitro cultures. Results from such experiments can be used indirectly in biomarker discovery: we may select proteins and use MRM to analyze plasma samples to investigate whether these proteins show different levels in circulation. Other factors affecting full reproducibility across these techniques include sample-preparation methods, as discussed above, but also possibly steric hindrance, the labeling method and the nature of the chemistry used [e.g., cleavable ICAT (cICAT) binds to Cys residues].

Another problem arises from the application of various detergents recommended for each particular labeling. This again adds further uncertainty to the final results [35]. Spiking with internal standards definitely helps cross comparisons between individuals and groups within the study. Minimizing errors at each and every step still requires joint efforts in protocol development.

The isotope-dilution method of quantification has been used for many years in measuring small molecules and metabolites. Recently, this principle has been applied to proteomics. This method requires internal standards to be spiked at the exact concentration. It also requires isotope-labeled analogues. It is simple to obtain isotopically-labeled analogues of peptides, but the extra cost associated with this step may discourage some investigators. To obtain similar standards in the form of intact proteins is more challenging. Nevertheless, this seems to be the right direction towards unambiguous quantitation of detected proteins, thus eliminating a potential source of errors and improving overall reproducibility [36,37].
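The principle of isotope dilution reduces to a single ratio. The sketch below is illustrative only and assumes that the isotope-labeled standard and the endogenous (unlabeled) peptide give the same signal per mole, so that the ratio of their peak areas equals the ratio of their amounts. The function name, units and peak areas are hypothetical.

```python
def isotope_dilution_concentration(area_light, area_heavy, spiked_concentration):
    """Estimate the endogenous analyte concentration from a spiked labeled standard.

    area_light: peak area of the endogenous (unlabeled) peptide
    area_heavy: peak area of the isotope-labeled internal standard
    spiked_concentration: known concentration of the labeled standard
    The result is in the same units as spiked_concentration.
    """
    if area_heavy <= 0 or area_light < 0 or spiked_concentration <= 0:
        raise ValueError("areas and spiked concentration must be positive")
    return (area_light / area_heavy) * spiked_concentration


# Hypothetical run: standard spiked at 50 fmol/µl, endogenous peak twice as large.
print(isotope_dilution_concentration(3.0e6, 1.5e6, 50.0))
```

Because the estimate scales directly with the spiked concentration, any error in preparing the standard carries over to the result, which is why the standard has to be added at an exactly known concentration.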

7. Peptidomics

Hormones and neuropeptides have been at the center of interest for many years, even decades. Their presence and effects were observed long before they were formally discovered (e.g., the effect of leptin, a 164-amino-acid-long polypeptide/protein, was reported by Ingalls and co-workers from Jackson Laboratories [38] in 1950, but leptin was discovered only in 1994 by Friedman and colleagues [39]).

Further advances led to the discovery of short neuropeptides [e.g., neuropeptide Y (NPY)], gastrointestinal peptides [40] and opioid peptides, in particular dynorphin [41], endorphins and enkephalins [42]. Subsequently, combined sequential activation/conversion of biologically-active peptides was studied extensively before the suffix “-omics” started to be used to define specialized fields of study. As such, peptidomics is not a novel concept. It is rather emerging as a field re-defined by technological advances. This, in turn, does not minimize the importance of peptidomics, but gives it new meaning and a deserved place within proteomics [43].

Peptidomics can be distinguished from proteomics in the same way that peptides can be distinguished from proteins, though there is no consensus in this respect. Peptides are short chains of amino acids linked by peptide bonds while proteins are long polypeptides. How short or how long? There is no clear distinction and the criteria used are artificial. Some scientists define peptides as chains of no more than 50 amino acids, while others state that a peptide is as long as it can be chemically synthesized. Both descriptions are artificial because technology allows us to synthesize polypeptide chains longer than 100 amino acids, although such synthesis is laborious and costly, and there is a high risk of amino-acid deletions or duplications occurring during chemical synthesis.

Technologies used in peptidomics and proteomics overlap to a certain extent, and lead to comprehensive analysis of patterns of peptides in a biological sample at any given point in time. Endogenous peptides and their synthetic analogs play a critical role in regulating many physiological processes.

Biologically-active peptides offer a premier example of how the proteinaceous product of one gene may produce multiple, diverse functionalities that are tightly regulated in time and space. Peptides are proteolytically derived from precursor proteins by a complex array of more than 500 proteases. The half-lives of peptides vary over a wide range. Technically, MS analysis of peptides is neither easier nor more difficult than MS analysis of proteins. The challenges are similar, and a given ionization method (e.g., electrospray) has similar strengths and obstacles for peptides as for proteins.

Peptidomics can also be seen as part of broadly-viewed post-translational modifications (PTMs). However, PTM is usually associated with addition and/or removal of chemical groups other than polypeptides. Of course, one can argue that sumoylation is a modification by which proteins are labeled by addition of a polypeptide chain (small ubiquitin-like modifier). We agree with this view, but sumoylation is not currently part of peptidomics. There are other similarities between peptidomics and other areas of PTM (e.g., the “map” of proteases generating active peptides, or kinases/phosphatases in the case of phosphorylation/dephosphorylation, respectively). For example, pre-propeptide precursors (e.g., POMC) are cleaved by a variety of proteolytic enzymes leading to a release of β-endorphin, α-MSH, ACTH and β-lipotropin, to name a few. The presence of such fragments in body fluids or tissues may indicate an appropriate cleavage pathway involving peptidases of particular specificity. Regardless of semantics, further development of peptidomics is a profoundly important area of biomarker research that has great potential for adding significant clinical value [44].

8. Protein arrays, a forgotten element in ‘-omics’?

The technological advances of the 1980s made it possible to create libraries based on array technology and comprising random peptide sequences. Such libraries were used for broad screenings and thus profiling of protein-protein interactions in search of interactive regions/domains [45], drug discovery [46], and other applications. Early on, this approach was a particularly attractive way to map epitopes for the purpose of screening antibodies, which can further be utilized for vaccine development [47–49] or for the evaluation of passive immunization (e.g., the HIV gp120 protein [50,51]). Further work of Scott and Smith on phage display of short amino-acid sequences [52,53] advanced this technology platform and made it possible to probe complex mixtures of proteins to find interactive protein domains. One weakness of using short sequences is the relatively high ratio of non-specific or low-affinity-binding pairs that generate false positive results that cannot be validated in the following step (Ciborowski, unpublished data).

Subsequent arrays were developed based on antigen-antibody interactions. Although this approach addresses the problem of specificity, three issues remained:

  • whether monoclonal or polyclonal antibodies should be used;

  • the lack of quality antibodies to many proteins and their forms limits broader application of the strategy; and,

  • the necessity of purifying multiple proteins to obtain pure antibody fractions, which is the most limiting factor of this approach.

Nevertheless, protein-array platforms became an attractive profiling approach among many proteomics technologies [54–57] because of the promise of large-scale analysis that can be performed with relatively small amounts of sample, technical ease and high throughput [58,59]. Protein arrays have also become a very convenient screening tool for testing signaling pathways [60] or identifying enzymatic substrates.

In recent years, reverse-phase protein arrays (RPPAs) have been developed [61]. The principle of this method is to “print” the specimens from the cohort of patients on nitrocellulose-covered glass slides and probe the slides with a specific antibody. Each specimen is printed in serial dilutions, which allows quantitation [61]. An advantage of this method is that there is no need to purify proteins, which was one of the major limiting factors in previous methods. This approach was successfully applied to a variety of screening experiments (e.g., substrates for kinases, which control many check points of signaling, and thus the cell function [62], proteins involved in known signaling pathways and their phosphorylated forms [63], and identification of polyubiquitin binding proteins [64]).
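How a serial dilution supports quantitation can be sketched with a simple calculation. The example below is a deliberately simplified illustration, not the analysis used in the cited studies: it assumes that all spots fall within the linear range of the detection system, fits a straight line of signal against dilution factor for each specimen, and reports the ratio of slopes as the relative abundance. All names and signal values are hypothetical, and published RPPA analyses often fit non-linear response curves instead.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("dilution factors must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def relative_abundance(dilutions, sample_signal, reference_signal):
    """Ratio of signal-versus-dilution slopes: sample relative to reference."""
    return ols_slope(dilutions, sample_signal) / ols_slope(dilutions, reference_signal)


# Hypothetical two-fold dilution series printed for a patient specimen
# and for a reference specimen on the same slide.
dilutions = [1.0, 0.5, 0.25, 0.125]
patient = [800.0, 400.0, 200.0, 100.0]
reference = [400.0, 200.0, 100.0, 50.0]
print(relative_abundance(dilutions, patient, reference))
```

Using the whole dilution series rather than a single spot makes the estimate less sensitive to one saturated or weak spot.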

RPPA is also an attractive option for screening large numbers of serum/plasma samples, including controls, all in one experiment. Proof of principle was shown by Janzi and co-authors, who quantified IgA in samples from patients with immunodeficiency diseases [65]. The reproducibility and sensitivity of RPPA and their strong correlation with immunohistochemistry, which is considered the “gold standard”, warrant further development of this platform to increase its sensitivity, while maintaining flexibility and multiplexing as strengths [66].

The next advancement in protein microarrays was the development of high-density, self-assembling protein microarrays, based on the concept of the nucleic-acid programmable protein array (NAPPA) [55,67,68]. The concept is to synthesize proteins on the high-density chip using spotted cDNA and a T7-coupled rabbit reticulocyte lysate in vitro transcription-translation (IVTT) system [55]. Translated proteins contain a C-terminal glutathione S-transferase (GST) tag, through which they are captured by a co-printed anti-GST antibody. NAPPA represents a crucial step in addressing many of the concerns related to manufacturing limitations (e.g., density of printing, reproducibility, and quality of immobilized proteins). Nevertheless, other factors (e.g., proper folding and PTMs) remain to be addressed in some way; otherwise, protein arrays will not be useful in meeting the goals and the objectives of many experiments.

Please note that accuracy and reproducibility of protein arrays also depend on proper experimental design, normalization procedures, elimination of systematic bias, and appropriate statistical analysis. Using several different clustering tools for verification instead of a single cluster analysis, as proposed by Eckel-Passow and colleagues for discovering differences within a class of molecules (e.g., cytokines), may decrease false results [69].

Ciphergen, Inc. developed the PS10 Protein Chip, which is used for covalent immobilization of antibodies, for its SELDI-TOF Protein Reader. Subsequently, the masses of the immunocaptured antigens are measured by MALDI-TOF. We have successfully used this approach to investigate proteolytic processing of the cytokine SDF-1α by MMP2 [70]. The weaknesses of this technique are low throughput and the relatively low sensitivity and low mass accuracy of the SELDI-TOF Protein Reader.

Despite the technological advances, protein microarrays still suffer from skepticism and criticism. Nevertheless, we can foresee this technology gaining ground in biomedical research with an increasing potential for application in clinical diagnosis. However, to attract users, it will require more examples demonstrating a proof of concept for new applications, technical feasibility and economy, free samples, and a “no obligation” trial period from manufacturers. Development of whole-proteome arrays may also increase interest and broader use of this technology platform. At present, protein arrays remain an emerging technology [71,72]; they require further technological developments and refinements but have great potential to be complementary to other profiling platforms [66].

9. Bioinformatics is the Achilles' heel of proteomics

Proteomics experiments produce a vast amount of raw data that needs to be analyzed in a reasonable time. Simultaneous analysis of the raw data well exceeds the capabilities of the standard or specialized software delivered with equipment. Unless software is highly specialized, commercial software packages comprise modules that meet the criteria of an average user. In some instances, such software may offer more than investigators need. However, the software code is protected, so users cannot adapt it to the very specific requirements of an experimental design. This has generated a widespread effort to create “in-house” software ranging from small scripts to quite sophisticated packages. In many instances, such software is available at no cost to smaller groups that do not have computing capabilities for their own bioinformatics development. Although this is a very promising trend, the important element of standardization is missing. If we agree that the selection criteria for evaluation and interpretation of multiple data sets markedly influence the final results of analysis, the lack of standardization and uniform criteria may lead to accumulation of data that are not comparable or are of low quality. How will we benefit from hundreds or thousands of peptides and proteins listed, if we cannot draw a conclusion as to what they mean biologically?

Even the best algorithm for database searches of MS2 spectra will produce false-positive results if the database itself is corrupt. The identification of many proteins is based on matching the sequences of two peptides of 10 amino acids each. This may cover only a few percent of the protein sequence and, in some instances, point to proteins that were deposited in the database as theoretical translations of cDNA-nucleotide sequences. If we all agree that protein databases should be curated, then somebody will make decisions about whether a protein should be included or excluded, and such a process can introduce bias. This is inevitable, so detailed criteria and the process of database curation must be disclosed to make users aware of the potential bias and to allow them to eliminate artifacts.
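To make the “few percent” figure concrete, the following sketch computes sequence coverage for two matched 10-residue peptides. The 500-residue protein length, the peptide positions, and the function name `sequence_coverage` are illustrative assumptions rather than values from any cited study.

```python
def sequence_coverage(protein_length, peptide_spans):
    """Fraction of residues covered by matched peptides.

    peptide_spans holds (start, end) pairs, 0-based with end exclusive.
    Overlapping peptides are counted only once.
    """
    covered = set()
    for start, end in peptide_spans:
        covered.update(range(start, end))
    return len(covered) / protein_length

# Two non-overlapping 10-residue peptides in a hypothetical 500-residue protein.
coverage = sequence_coverage(500, [(20, 30), (310, 320)])
print(f"{coverage:.1%}")  # prints 4.0%
```

With an identification resting on 4% of the sequence, the remaining 96% is inferred from the database entry, which is why the quality of that entry matters so much.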

The identification of a vast number of “hypothetical proteins”, whose database entries are based not on a complete amino-acid sequence submitted to the protein database but on a translated nucleotide sequence in a gene bank, adds another level of ambiguity to data interpretation and to defining the function of a biological system. Can we therefore be certain that such a protein exists in a given organism and possesses a biological function that should be added to our knowledge?

The introduction of 2DE with DIGE opened the door for chemometricians, statisticians and software engineers to create user-friendly, reliable software for the average user. Although very helpful, the initial versions of 2DE software had many weaknesses, mostly in such modules as background subtraction and spot alignment. Statistical analyses were based on well-developed models and used the normal (Gaussian) distribution, Student's t-test, and analysis of variance (one-way ANOVA).

Another weakness was the substantial amount of work that had to be done manually, including the removal of artificial spots on the edges of gel images as a prerequisite to further analysis. Many of these problems were addressed with some success in newer versions [73]. However, there is no widely accepted consensus as to how many biological replicates one analysis should have. Good statistics can be obtained with 3–7 repeats of each sample (i.e., independent samples, each taken through the entire proteomics procedure) when Student's t-test is applied [74], but, in clinical proteomics, we deal with inherent variations in genetic background within the general population of patients that may lead to false-positive discoveries. If we include larger cohorts of samples in the experiments, we will incur increased costs and effort without guarantee of success. For this reason, many researchers can effectively be discouraged from contributing to biomarker-discovery efforts and thus to the field of clinical proteomics.
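A minimal sketch of the kind of comparison discussed above is a two-sample Student's t-test on three replicates per group. The spot volumes and the helper name `student_t` are hypothetical; the critical values are the standard two-sided 5% values from t tables for 3 to 7 replicates per group, and the test assumes roughly normal data with equal variances, which small clinical cohorts may not satisfy.

```python
from math import sqrt
from statistics import mean, variance

# Two-sided 5% critical values of Student's t, keyed by degrees of freedom
# (df = 2n - 2 for n = 3..7 replicates per group).
T_CRIT_05 = {4: 2.776, 6: 2.447, 8: 2.306, 10: 2.228, 12: 2.179}

def student_t(group_a, group_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a)
              + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df

# Hypothetical normalized spot volumes, three independent replicates each.
control = [1.00, 1.10, 0.90]
disease = [1.50, 1.70, 1.60]

t, df = student_t(disease, control)
significant = abs(t) > T_CRIT_05[df]
print(f"t = {t:.2f}, df = {df}, significant at 5%: {significant}")
```

With only four degrees of freedom the critical value is high, so only large and consistent differences pass; adding patient-to-patient variability to the replicates would quickly erode the statistic.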

Recent introduction of proteomics-data analysis using cloud computing and open-source algorithms [e.g., Virtual Proteomics Data Analysis Cluster (ViPDAC), which works as an on demand service, and is accessible to anyone with a credit card] may increase the possibilities of further production of ambiguous information.

10. From discovery to validation of biomarkers

In everyday practice, bringing a biomarker candidate to the level of validated diagnostic marker is an ambitious goal and provides several methodological challenges for proteomics. Besides overcoming the challenges of sensitivity and specificity, a biomarker must be capable of being reproducibly quantified in independent laboratories by independent personnel. The cost of personnel can be resolved by utilizing laboratory robots. Another way to meet the challenges is by requiring the use of particular reagents from a single manufacturer or specified manufacturers along with standardized procedural protocols, as these issues are usually resolved during preparation of an assay for manufacture.
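For reference, the sensitivity and specificity mentioned above can be computed for a single candidate marker and a chosen cutoff as in this sketch. The marker levels, the cutoff, and the function name are invented for illustration; in practice the cutoff would itself have to be fixed on one cohort and tested on another.

```python
def sensitivity_specificity(values, has_disease, cutoff):
    """Classify a sample as positive when its marker level is >= cutoff."""
    pairs = list(zip(values, has_disease))
    tp = sum(1 for v, d in pairs if d and v >= cutoff)      # true positives
    fn = sum(1 for v, d in pairs if d and v < cutoff)       # false negatives
    tn = sum(1 for v, d in pairs if not d and v < cutoff)   # true negatives
    fp = sum(1 for v, d in pairs if not d and v >= cutoff)  # false positives
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical marker levels: four patients followed by four healthy controls.
levels = [5.1, 6.3, 4.0, 7.2, 2.1, 3.0, 4.5, 1.8]
diseased = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(levels, diseased, cutoff=4.2)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Repeating the same calculation on measurements of the same samples from a second laboratory is one simple way to see whether a candidate survives the reproducibility requirement.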

The situation differs during the discovery phase because many biomarker candidates are proposed and reported in the literature by one group of researchers, but are not identified by other groups, even those using similar approaches to proteomics. One reason is the lack of standardization of multi-step procedures; for example, differences in background and baseline subtraction for quantitation purposes result in poor overall reproducibility. Selection of specific criteria during MS-based protein identification (i.e., number of peptides chosen for fragmentation, sequence coverage, mass accuracy, and number of PTMs) adds layers of complexity and potential error, and eventually contributes to the lack of desired reproducibility between laboratories. Without a global consensus on how data are generated by various groups, results will therefore be “contaminated” by too many technical factors, will not truly reflect differences between cohorts of patients, and might be recognized as originating from different genetic backgrounds, but not achieve clinical status.

Other factors can add to variability (e.g., age, gender, prior disease background, current medical treatment, dietary preferences, general health status and even the day and the time of sample collection). Eventually, many studies will create background noise in identifying new biomarkers instead of helping us to refine and to solidify or to reject candidates already proposed. Fig. 2 recapitulates the approach in which both biomarker discovery and biomarker validation should be performed in a “blinded fashion.” By “blinded fashion”, we mean that all confounding factors described above should be removed, leaving only objective facts resulting from analytical measurements.

Figure 2.

Biomarker discovery and validation should be free from any bias and performed in a similar fashion to weighing justice – based on facts.

Proteomics studies, when published, should include data validation generated by different assays (e.g., ELISA or quantitative Western-blot analysis). Even if limited, such data can indicate whether observed differences in protein levels might be associated with differences in the genetic background of patients who have provided the clinical samples. Based on our experience, having such information helps tremendously in designing future experiments.

Chronic inflammation or infection (e.g., HIV and hepatitis) and certain bone marrow diseases (e.g., multiple myeloma, amyloidosis and monoclonal gammopathy) manifest themselves in abnormally high levels of proteins circulating in the blood. Malnutrition may cause the opposite effect. Major proteins also vary over a broad range [e.g., albumin (3.5–5.0 g/dl) or fibrinogen (0.2–0.45 g/dl)]. There is therefore no plasma/serum protein that could be used as a reference point for data normalization, as actin is for cell lysates, which makes validation more difficult.

If plasma/serum samples are immunodepleted of the most abundant proteins prior to profiling, which is almost standard procedure, which samples should be used for validation: immunodepleted or not? This question remains open so far.

We have learned from our experience that the relative change over time for any given patient is more informative than an absolute level of a protein measured per mL or per mg of the total pool of serum/plasma proteins.
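The point about relative change can be illustrated with a small sketch that expresses each patient's measurements relative to that patient's own first sample. The patient identifiers, time points and protein levels are invented for illustration.

```python
def fold_change_from_baseline(series):
    """Express each time point relative to the patient's first measurement."""
    baseline = series[0]
    return [value / baseline for value in series]

# Hypothetical protein levels (arbitrary units) at three visits per patient.
patients = {
    "patient_1": [2.0, 3.0, 4.0],
    "patient_2": [8.0, 8.0, 8.4],
}

relative = {pid: fold_change_from_baseline(levels)
            for pid, levels in patients.items()}

for pid, changes in relative.items():
    print(pid, [round(c, 2) for c in changes])
```

In this made-up case patient_2 has the higher absolute level at every visit, yet it is patient_1 whose level doubles; a comparison of absolute values against a population range would miss that change.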

11. Will the “-omics” approach support efficient biomarker discovery?

We believe that the answer to this question is positive – under several conditions. In an interview [75], Amos Bairoch, one of the most competent people in this field (Swiss-Prot and the ExPASy server, to name his most important achievements), stated: “…much of the data generated by proteomics groups over the past decade is junk”. This is a categorical, controversial statement. Nevertheless, we are aware of this problem.

A constantly growing number of research groups entering the “-omics” arena is a positive trend, although it may increase the risk of lacking in-depth understanding of the issues associated with the biomarker-discovery process, including all those involved from the first step of sample selection to the final conclusions. Without a firm understanding of the issues, research may result in low-quality data that will increase “background noise” rather than contribute to real progress in clinical proteomics.

One remedy is to demand validation of candidate biomarkers using independent assays (e.g., ELISA, immunohistochemistry, quantitative Western blot and/or flow cytometry) and cohorts of clinical samples unrelated to the discovery phase. Quite often, studies are simply based on the statistical analysis of screening shotgun-type analyses.

However, the situation becomes more complicated when a candidate biomarker represents a change in PTMs, mutations or alternative splicing rather than an absolute or relative level (e.g., of a protein circulating in the bloodstream). The lack of antibodies, or the lack of specificity of existing and commercially available antibodies, may discourage many from adding such a step to their study. In addition, all commonly applied proteomics techniques measure the expression of a protein and not its activity. What, then, is the relationship between overexpression of an enzyme, its catalytic activity, and the state of the biological system in question?

Another issue is the magnitude of the difference between “control” and tested samples. One can argue that, if a candidate biomarker shows satisfactory (beneficial for supporting diagnosis) sensitivity and specificity, the absolute or relative difference is of lesser concern. One must also keep in mind that a smaller difference will require a larger number of samples (measurements) to show statistical significance, so more money and effort must be committed when the outcome is uncertain. This is a risk that few investigators are willing to take and their decisions are usually based on intuition.
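How quickly the required cohort grows as the difference shrinks can be sketched with the textbook normal-approximation formula for comparing two group means, n = 2(z_(1-α/2) + z_power)² (σ/Δ)² per group. The 25% standard deviation, the 5% significance level and the 80% power used below are assumptions chosen for illustration, not values taken from the studies cited here, and the approximation is optimistic for very small groups.

```python
from math import ceil
from statistics import NormalDist

def samples_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate samples per group for a two-sided two-sample comparison.

    Uses the normal approximation n = 2 * ((z_alpha + z_beta) * sd / diff)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)

# Assumed standard deviation: 25% of the control mean (mean scaled to 1.0).
sd = 0.25
for difference in (0.50, 0.20, 0.05):
    n = samples_per_group(difference, sd)
    print(f"difference of {difference:.0%}: {n} samples per group")
```

Under these assumptions a 50% difference needs 4 samples per group, a 20% difference needs 25, and a 5% difference needs 393, which is the cost trade-off the paragraph above describes.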

It is also obvious that we look at past experience and try to rationalize decisions made for the direction(s) of future studies. For example, past lessons from clinical chemistry tell us that the normal range (controls) and standard deviation should be established based on a large cohort of samples obtained from healthy volunteers. Two examples illustrate the dilemma. The normal range of creatine kinase is 40–140 U/L or even higher, while TSH has a much narrower range (0.4–4 mU/L). At the level of proteomics investigation, the meaningful difference is unknown. Should it be around 5%, 50%, or higher? Our experience with multiple CSF and serum/plasma-profiling experiments is that substantial intuition is involved in determining whether any given protein should be pursued further. Many examples from the literature show that any difference below 20% is heavily contaminated with experimental error.

12. Putting things together

In spite of “-omics”-based studies undergoing phases of enthusiasm and hope for new discoveries as well as phases of pessimism and doubts, these strategies can assist in biomarker discovery for diagnosis, therapy efficacy and drug design. It seems that, after the euphoric era, we observe a growing number of reports tackling real problems and challenges in “-omics” [76].

We first need to envision all “-omics” approaches as one integrated field comprising many sub-disciplines. Subsequently, we should integrate “-omics” disciplines to build our knowledge and overall picture of the biological process. The classical approach starts from the genome and proceeds to the transcriptome, through translation, and to the proteome. This is a good starting point and there is a lot of work that needs to be done in integrating these elements. However, a more challenging aspect is to integrate proteomics with other disciplines (e.g., bioimaging or behavioral studies). There is no doubt that, as of today, proteomics is mature enough to assist diagnosis as a secondary tool to clinical chemistry and the experience of the physician, and that the best of proteomics is yet to come. Other and younger disciplines – chemogenomics and chemoproteomics, which are based on different assumptions [77] – are joining the “-omics” field and adding complexity of an unprecedented proportion.

A critical review by Aebersold [78] comments on the methodological problems that arose from a large comparative study on the identification of 20 standard proteins [79]. Although he clearly states that such a number of standard proteins is not a proteomics sample, this experiment provides some important clues for further analysis of true biological mixtures. First of all, it shows that even the application of various modern MS instruments provides consistent data that can be compared between laboratories. A further conclusion, intuitively understandable, is that samples should be reduced in complexity as far as possible and analysis limited to a defined set of proteins/peptides or metabolites. However, this concept makes sample preparation one of the most important aspects of the entire analysis. Based on our experience, this is still the key step, where the majority of methodological errors begin.

This approach demands development of reliable guidelines for various procedures, preparation of internationally available standards and controls, broad blind studies, and meta-analyses of published data to reveal trends and clinical relevance of various results. We have already applied such an approach on a small scale using published data on proteomics in drug dependence [80]. This, in turn, will require more international initiatives to arrange and to evaluate inter-laboratory cross-validations in which identical samples are analyzed, thus improving the quality of the outputs from each particular research laboratory.

Such studies have already been conducted by the Association of Biomolecular Resource Facilities (ABRF) and implemented on a larger scale by HUPO [79]. The common problems identified by this multi-laboratory study (27 laboratories, 20 highly-purified recombinant proteins) concern missed identifications (false negatives), environmental contamination, database matching, and curation of protein identifications. The authors also postulate the need to improve search engines and databases for MS-based proteomics. Looking back into history, we faced similar challenges many years ago – in the 1970s – when radioimmunoassay tests were also producing data that could not be correlated between laboratories, unless several measures, including the above initiatives, were undertaken. It is important to note that radioimmunoassay was a far simpler task to standardize at the international level.

It is necessary to implement a set of simple, user-friendly rules and criteria to deal with data overload, quality control, the impatience of scientists “to know” and the pressure of sponsors to use their money more efficiently [81]. Such rules are urgently needed for clinical proteomics if we want it to play a bigger role in clinical diagnosis [82].

Although it is difficult, or even impossible, to predict where proteomics will be in five years, we are confident that it is very unlikely that proteomics will be an easy-to-use black box providing reliable, reproducible results. Large research groups with extensive instrumentation regularly renew or upgrade their technological capabilities by purchasing new instrumentation and software. They also comprise many researchers representing various disciplines, including bioinformatics, which will play a leading role in directing future developments. However, we cannot forget small research groups and facilities with much more limited technological and human resources but with great experimental experience, knowledge of topics and innovative ideas. Such laboratories will be able to perform most of the experimental proteomics work “in house,” but they too must be given access to cutting-edge technology through collaborative networks and initiatives to finish work at a competitive level. Clinical proteomics will not deliver the much-needed new biomarkers if effort is limited to a few groups.

Acknowledgments

This work was partially supported by the Polish Ministry for Science and Higher Education, Grant No. 3048/B/H03/2009/37 to JS, and National Institutes of Health 1 P20DA026146-01 (PC co-investigator), 2 P01 NS043985-05 (PC co-investigator). The authors are grateful to Samantha Mosley and Robin Taylor for their help with graphics, language and writing in preparing this manuscript.

Abbreviations

2DE

2-dimensional electrophoresis

ABRF

Association of Biomolecular Resource Facilities

CSF

Cerebrospinal fluid

CNS

Central nervous system

dl

deciliter

DIGE

Difference gel electrophoresis

ESI

Electrospray ionization

GST

Glutathione S-transferase

HAD

HIV-1-associated dementia

HUPO

Human Proteome Organization

IVTT

in vitro transcription-translation

ICAT

Isotope-coded affinity tags

LC-MS2

Liquid chromatography-tandem mass spectrometry

MALDI-TOF

Matrix-assisted laser desorption/ionization time-of-flight

MMP2

Matrix metalloproteinase 2

MDM

Monocyte-derived macrophages

PAGE

Polyacrylamide gel electrophoresis

MRM

Multiple-reaction monitoring

NAPPA

Nucleic-acid programmable protein array

RP-HPLC

Reversed-phase high-performance liquid chromatography

RPPA

Reverse-phase protein arrays

SDF-1α

Stromal cell-derived Factor 1α

SELDI-TOF

Surface-enhanced laser desorption/ionization time-of-flight

SILAC

Stable-isotope labeling with amino acids in culture

TSH

Thyroid-stimulating hormone

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Contributor Information

Jerzy Silberring, Department of Biochemistry and Neurobiology, Faculty of Materials Science and Ceramics, AGH University of Science and Technology, Kraków, Poland.

Pawel Ciborowski, Department of Pharmacology and Experimental Neuroscience, University of Nebraska Medical Center, 985800 University of Nebraska Medical Center, Omaha, NE 68198-5800, USA.

References

  • 1. O'Sullivan BP, Freedman SD. Lancet. 2009;373:1891. doi: 10.1016/S0140-6736(09)60327-5.
  • 2. Metcalfe KA, Finch A, Poll A, Horsman D, Kim-Sing C, Scott J, Royer R, Sun P, Narod SA. Br J Cancer. 2009;100:421. doi: 10.1038/sj.bjc.6604830.
  • 3. Dawood S, Leyland-Jones B. Cancer Invest. 2009;27:482. doi: 10.1080/07357900802574660.
  • 4. Anderson NL. Clin Chem. 2009 DOI: clinchem.2009.126706v1.
  • 5. Anderson NL, Anderson NG. Mol Cell Proteomics. 2002;1:845. doi: 10.1074/mcp.r200007-mcp200.
  • 6. Rozek W, Horning J, Anderson J, Ciborowski P. Proteomics Clin Appl. 2008;2:1484. doi: 10.1002/prca.200780114.
  • 7. Rozek W, Ricardo-Dukelow M, Holloway S, Gendelman HE, Wojna V, Melendez LM, Ciborowski P. J Proteome Res. 2007;6:4189. doi: 10.1021/pr070220c.
  • 8. Wiederin J, Rozek W, Duan F, Ciborowski P. Proteome Sci. 2009;7:8. doi: 10.1186/1477-5956-7-8.
  • 9. Bodzon-Kulakowska A, Bierczynska-Krzysik A, Dylag T, Drabik A, Suder P, Noga M, Jarzebinska J, Silberring J. J Chromatogr. 2007;B 849:1. doi: 10.1016/j.jchromb.2006.10.040.
  • 10. Jebrail MJ, Luk VN, Shih SC, Fobel R, Ng AH, Yang H, Freire SL, Wheeler AR. J Vis Exp. 2009;33:1603. doi: 10.3791/1603.
  • 11. Gast MC, Van Gils CH, Wessels LF, Harris N, Bonfrer JM, Rutgers EJ, Schellens JH, Beijnen JH. Oncol Rep. 2009;22:205. doi: 10.3892/or_00000426.
  • 12. Zeidan BA, Cutress RI, Murray N, Coulton GR, Hastie C, Packham G, Townsend PA. Cancer Genomics Proteomics. 2009;6:141.
  • 13. Ueda M, Misumi Y, Mizuguchi M, Nakamura M, Yamashita T, Sekijima Y, Ota K, Shinriki S, Jono H, Ikeda S, Suhr OB, Ando Y. Clin Chem. 2009;55:1223. doi: 10.1373/clinchem.2008.118505.
  • 14. Huang HL, Stasyk T, Morandell S, Mogg M, Schreiber M, Feuerstein I, Huck CW, Stecher G, Bonn GK, Huber LA. Electrophoresis. 2005;26:2843. doi: 10.1002/elps.200500167.
  • 15. Steel LF, Trotter MG, Nakajima PB, Mattu TS, Gonye G, Block T. Mol Cell Proteomics. 2003;2:262. doi: 10.1074/mcp.M300026-MCP200.
  • 16. Pieper R, Su Q, Gatlin CL, Huang ST, Anderson NL, Steiner S. Proteomics. 2003;3:422. doi: 10.1002/pmic.200390057.
  • 17. Gong Y, Li X, Yang B, Ying W, Li D, Zhang Y, Dai S, Cai Y, Wang J, He F, Qian X. J Proteome Res. 2006;5:1379. doi: 10.1021/pr0600024.
  • 18. Kumar SG, Rahman MA, Lee SH, Hwang HS, Kim HA, Yun JW. Proteomics. 2009;9:2149. doi: 10.1002/pmic.200800571.
  • 19. Gundry RL, White MY, Nogee J, Tchernyshyov I, Van Eyk JE. Proteomics. 2009;9:2021. doi: 10.1002/pmic.200800686.
  • 20. Dayarathna MK, Hancock WS, Hincapie M. J Sep Sci. 2008;31:1156. doi: 10.1002/jssc.200700271.
  • 21. Ciborowski P, Gendelman HE. Curr HIV Res. 2006;4:279. doi: 10.2174/157016206777709474.
  • 22. Wei J, Sun J, Yu W, Jones A, Oeller P, Keller M, Woodnutt G, Short JM. J Proteome Res. 2005;4:801. doi: 10.1021/pr0497632.
  • 23. Pruvost A, Becher F, Bardouille P, Guerrero C, Creminon C, Delfraissy JF, Goujard C, Grassi J, Benech H. Rapid Commun Mass Spectrom. 2001;15:1401. doi: 10.1002/rcm.384.
  • 24. Cox DM, Zhong F, Du M, Duchoslav E, Sakuma T, McDermott JC. J Biomol Tech. 2005;16:83.
  • 25. Mollah S, Wertz IE, Phung Q, Arnott D, Dixit VM, Lill JR. Rapid Commun Mass Spectrom. 2007;21:3357. doi: 10.1002/rcm.3227.
  • 26. Ciccimaro E, Hevko J, Blair IA. Rapid Commun Mass Spectrom. 2006;20:3681. doi: 10.1002/rcm.2783.
  • 27. Ciccimaro E, Hanks SK, Yu KH, Blair IA. Anal Chem. 2009;81:3304. doi: 10.1021/ac900204f.
  • 28. Schlautman JD, Rozek W, Stetler R, Mosley RL, Gendelman HE, Ciborowski P. Proteome Sci. 2008;6:26. doi: 10.1186/1477-5956-6-26.
  • 29. Findeisen P, Neumaier M. Expert Rev Proteomics. 2009;6:457. doi: 10.1586/epr.09.67.
  • 30. Friedman DB, Lilley KS. Methods Mol Biol. 2008;428:93. doi: 10.1007/978-1-59745-117-8_6.
  • 31. Hariharan D, Weeks ME, Crnogorac-Jurcevic T. Methods Mol Biol. 2010;576:197. doi: 10.1007/978-1-59745-545-9_11.
  • 32. Swaney DL, McAlister GC, Coon JJ. Nat Methods. 2008;5:959. doi: 10.1038/nmeth.1260.
  • 33. Wong CC, Cociorva D, Venable JD, Xu T, Yates JR., 3rd J Am Soc Mass Spectrom. 2009 doi: 10.1016/j.jasms.2009.04.007.
  • 34. Wu WW, Wang G, Baek SJ, Shen RF. J Proteome Res. 2006;5:651. doi: 10.1021/pr050405o.
  • 35. DeSouza L, Diehl G, Rodrigues MJ, Guo J, Romaschin AD, Colgan TJ, Siu KW. J Proteome Res. 2005;4:377. doi: 10.1021/pr049821j.
  • 36. Yu KH, Barry CG, Austin D, Busch CM, Sangar V, Rustgi AK, Blair IA. J Proteome Res. 2009;8:1565. doi: 10.1021/pr800904z.
  • 37. Keshishian H, Addona T, Burgess M, Kuhn E, Carr SA. Mol Cell Proteomics. 2007;6:2212. doi: 10.1074/mcp.M700354-MCP200.
  • 38. Ingalls AM, Dickie MM, Snell GD. J Hered. 1950;41:317. doi: 10.1093/oxfordjournals.jhered.a106073.
  • 39. Zhang Y, Proenca R, Maffei M, Barone M, Leopold L, Friedman JM. Nature (London) 1994;372:425. doi: 10.1038/372425a0.
  • 40. Mutt V. Ann N Y Acad Sci. 1988;527:1. doi: 10.1111/j.1749-6632.1988.tb26968.x.
  • 41. Goldstein A, Tachibana S, Lowney LI, Hunkapiller M, Hood L. Proc Natl Acad Sci USA. 1979;76:6666. doi: 10.1073/pnas.76.12.6666.
  • 42. Lazarus LH, Ling N, Guillemin R. Proc Natl Acad Sci USA. 1976;73:2156. doi: 10.1073/pnas.73.6.2156.
  • 43. Villanueva J, Shaffer DR, Philip J, Chaparro CA, Erdjument-Bromage H, Olshen AB, Fleisher M, Lilja H, Brogi E, Boyd J, Sanchez-Carbayo M, Holland EC, Cordon-Cardo C, Scher HI, Tempst P. J Clin Invest. 2006;116:271. doi: 10.1172/JCI26022.
  • 44. Westman-Brinkmalm A, Ruetschi U, Portelius E, Andreasson U, Brinkmalm G, Karlsson G, Hansson S, Zetterberg H, Blennow K. Front Biosci. 2009;14:1793. doi: 10.2741/3341.
  • 45. Houghten RA, Appel JR, Blondelle SE, Cuervo JH, Dooley CT, Pinilla C. Biotechniques. 1992;13:412.
  • 46. Houghten RA, Pinilla C, Blondelle SE, Appel JR, Dooley CT, Cuervo JH. Nature (London) 1991;354:84. doi: 10.1038/354084a0.
  • 47. Zebedee SL, Barbas CF, 3rd, Hom YL, Caothien RH, Graff R, DeGraw J, Pyati J, LaPolla R, Burton DR, Lerner RA, et al. Proc Natl Acad Sci USA. 1992;89:3175. doi: 10.1073/pnas.89.8.3175.
  • 48. Barbas CF, 3rd, Crowe JE, Jr, Cababa D, Jones TM, Zebedee SL, Murphy BR, Chanock RM, Burton DR. Proc Natl Acad Sci USA. 1992;89:10164. doi: 10.1073/pnas.89.21.10164.
  • 49. Kijanka G, Murphy D. J Proteomics. 2009;72:936. doi: 10.1016/j.jprot.2009.02.006.
  • 50. Burton DR, Barbas CF, 3rd, Persson MA, Koenig S, Chanock RM, Lerner RA. Proc Natl Acad Sci USA. 1991;88:10134. doi: 10.1073/pnas.88.22.10134.
  • 51. Barbas CF, 3rd, Bjorling E, Chiodi F, Dunlop N, Cababa D, Jones TM, Zebedee SL, Persson MA, Nara PL, Norrby E, et al. Proc Natl Acad Sci USA. 1992;89:9339. doi: 10.1073/pnas.89.19.9339.
  • 52. Smith GP, Scott JK. Methods Enzymol. 1993;217:228. doi: 10.1016/0076-6879(93)17065-d.
  • 53. Scott JK, Smith GP. Science (Washington, DC) 1990;249:386. doi: 10.1126/science.1696028.
  • 54. Haab BB, Dunham MJ, Brown PO. Genome Biol. 2001;2:RESEARCH0004. doi: 10.1186/gb-2001-2-2-research0004.
  • 55. Ramachandran N, Raphael JV, Hainsworth E, Demirkan G, Fuentes MG, Rolfs A, Hu Y, LaBaer J. Nat Methods. 2008;5:535. doi: 10.1038/nmeth.1210.
  • 56. Hsu HY, Wittemann S, Joos TO. Methods Mol Biol. 2008;428:247. doi: 10.1007/978-1-59745-117-8_14.
  • 57. Stemke-Hale K, Gonzalez-Angulo AM, Lluch A, Neve RM, Kuo WL, Davies M, Carey M, Hu Z, Guan Y, Sahin A, Symmans WF, Pusztai L, Nolden LK, Horlings H, Berns K, Hung MC, van de Vijver MJ, Valero V, Gray JW, Bernards R, Mills GB, Hennessy BT. Cancer Res. 2008;68:6084. doi: 10.1158/0008-5472.CAN-07-6854.
  • 58. Hurst R, Hook B, Slater M, Hartnett J, Storts DR, Nath N. Anal Biochem. 2009;392:45. doi: 10.1016/j.ab.2009.05.028.
  • 59. Iliopoulos D, Malizos KN, Oikonomou P, Tsezou A. PLoS ONE. 2008;3:e3740. doi: 10.1371/journal.pone.0003740.
  • 60. Lobke C, Laible M, Rappl C, Ruschhaupt M, Sahin O, Arlt D, Wiemann S, Poustka A, Sultmann H, Korf U. Proteomics. 2008;8:1586. doi: 10.1002/pmic.200700733.
  • 61.Sheehan KM, Calvert VS, Kay EW, Lu Y, Fishman D, Espina V, Aquino J, Speer R, Araujo R, Mills GB, Liotta LA, Petricoin EF, 3rd, Wulfkuhle JD. Mol Cell Proteomics. 2005;4:346. doi: 10.1074/mcp.T500003-MCP200. [DOI] [PubMed] [Google Scholar]
  • 62.Hattori S, Iida N, Kosako H. Expert Rev Proteomics. 2008;5:497. doi: 10.1586/14789450.5.3.497. [DOI] [PubMed] [Google Scholar]
  • 63.Kornblau SM, Tibes R, Qiu YH, Chen W, Kantarjian HM, Andreeff M, Coombes KR, Mills GB. Blood. 2009;113:154. doi: 10.1182/blood-2007-10-119438. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Fenner BJ, Scannell M, Prehn JH. Biochim Biophys Acta. 2009;1794:1010. doi: 10.1016/j.bbapap.2009.02.013. [DOI] [PubMed] [Google Scholar]
  • 65.Janzi M, Odling J, Pan-Hammarstrom Q, Sundberg M, Lundeberg J, Uhlen M, Hammarstrom L, Nilsson P. Mol Cell Proteomics. 2005;4:1942. doi: 10.1074/mcp.M500213-MCP200. [DOI] [PubMed] [Google Scholar]
  • 66.Caiazzo RJ, Maher AJ, Drummond MP, Lander CI, Tassinari OW, Nelson BP, Liu BCS. Proteomics Clin Appl. 2009;3:138. doi: 10.1002/prca.200800149. [DOI] [PubMed] [Google Scholar]
  • 67.Ramachandran N, Hainsworth E, Demirkan G, LaBaer J. Methods Mol Biol. 2006;328:1. doi: 10.1385/1-59745-026-X:1. [DOI] [PubMed] [Google Scholar]
  • 68.He M, Stoevesandt O, Taussig MJ. Curr Opin Biotechnol. 2008;19:4. doi: 10.1016/j.copbio.2007.11.009. [DOI] [PubMed] [Google Scholar]
  • 69.Eckel-Passow JE, Hoering A, Therneau TM, Ghobrial I. Cancer Res. 2005;65:2985. doi: 10.1158/0008-5472.CAN-04-3213. [DOI] [PubMed] [Google Scholar]
  • 70. Peng H, Wu Y, Duan Z, Ciborowski P, Zheng J. Submitted. 2009.
  • 71. Templin MF, Stoll D, Schwenk JM, Potz O, Kramer S, Joos TO. Proteomics. 2003;3:2155. doi: 10.1002/pmic.200300600.
  • 72. Poetz O, Schwenk JM, Kramer S, Stoll D, Templin MF, Joos TO. Mech Ageing Dev. 2005;126:161. doi: 10.1016/j.mad.2004.09.030.
  • 73. Daszykowski M, Stanimirova I, Bodzon-Kulakowska A, Silberring J, Lubec G, Walczak B. J Chromatogr A. 2007;1158:306. doi: 10.1016/j.chroma.2007.02.009.
  • 74. Biron DG, Brun C, Lefevre T, Lebarbenchon C, Loxdale HD, Chevenet F, Brizard JP, Thomas F. Proteomics. 2006;6:5577. doi: 10.1002/pmic.200600223.
  • 75. Service RF. Science (Washington, DC). 2008;321:1758. doi: 10.1126/science.321.5897.1758.
  • 76. Yan L, Tonack S, Smith R, Dodd S, Jenkins RE, Kitteringham N, Greenhalf W, Ghaneh P, Neoptolemos JP, Costello E. J Proteome Res. 2009;8:142. doi: 10.1021/pr800451h.
  • 77. Jacoby E. Mol Biosyst. 2006;2:218. doi: 10.1039/b603004c.
  • 78. Aebersold R. Nat Methods. 2009;6:412. doi: 10.1038/nmeth.f.255.
  • 79. Bell AW, Deutsch EW, Au CE, Kearney RE, Beavis R, Sechi S, Nilsson T, Bergeron JJ, Beardslee TA, Chappell T, Meredith G, Sheffield P, Gray P, Hajivandi M, Pope M, Predki P, Kullolli M, Hincapie M, Hancock WS, Jia W, Song L, Li L, Wei J, Yang B, Wang J, Ying W, Zhang Y, Cai Y, Qian X, He F, Meyer HE, Stephan C, Eisenacher M, Marcus K, Langenfeld E, May C, Carr SA, Ahmad R, Zhu W, Smith JW, Hanash SM, Struthers JJ, Wang H, Zhang Q, An Y, Goldman R, Carlsohn E, van der Post S, Hung KE, Sarracino DA, Parker K, Krastins B, Kucherlapati R, Bourassa S, Poirier GG, Kapp E, Patsiouras H, Moritz R, Simpson R, Houle B, Laboissiere S, Metalnikov P, Nguyen V, Pawson T, Wong CC, Cociorva D, Yates JR 3rd, Ellison MJ, Lopez-Campistrous A, Semchuk P, Wang Y, Ping P, Elia G, Dunn MJ, Wynne K, Walker AK, Strahler JR, Andrews PC, Hood BL, Bigbee WL, Conrads TP, Smith D, Borchers CH, Lajoie GA, Bendall SC, Speicher KD, Speicher DW, Fujimoto M, Nakamura K, Paik YK, Cho SY, Kwon MS, Lee HJ, Jeong SK, Chung AS, Miller CA, Grimm R, Williams K, Dorschel C, Falkner JA, Martens L, Vizcaino JA. Nat Methods. 2009;6:423.
  • 80. Suder P, Bodzon-Kulakowska A, Mak P, Bierczynska-Krzysik A, Daszykowski M, Walczak B, Lubec G, Kotlinska JH, Silberring J. J Proteome Res. 2009;8:4633. doi: 10.1021/pr900443r.
  • 81. McQueen MJ. Clin Chem. 2001;47:1536.
  • 82. Tan DS, Thomas GV, Garrett MD, Banerji U, de Bono JS, Kaye SB, Workman P. Cancer J. 2009;15:406. doi: 10.1097/PPO.0b013e3181bd0445.
