Author manuscript; available in PMC: 2023 Jul 1.
Published in final edited form as: Proteomics. 2022 May 31;22(13-14):e2100170. doi: 10.1002/pmic.202100170

Stability and reproducibility of proteomic profiles in epidemiological studies: comparing the Olink and SOMAscan platforms

Danielle E Haslam 1,2, Jun Li 2,3, Simon T Dillon 4, Xuesong Gu 4, Yin Cao 5,6,7, Oana A Zeleznik 1, Naoko Sasamoto 8, Xuehong Zhang 1,2, A Heather Eliassen 1,2,3, Liming Liang 3,9, Meir J Stampfer 1,2,3, Samia Mora 10, Zsu-Zsu Chen 11, Kathryn L Terry 3,8, Robert E Gerszten 12,13, Frank B Hu 1,2,3, Andrew T Chan 1,14,15, Towia A Libermann 4, Shilpa N Bhupathiraju 1,2
PMCID: PMC9923770  NIHMSID: NIHMS1864030  PMID: 35598103

Abstract

Limited data exist on the performance of high-throughput proteomics profiling in epidemiological settings, including the impact of specimen collection and within-person variability over time. Thus, the Olink (972 proteins) and SOMAscan7K v4.1 (7322 proteoforms of 6596 proteins) assays were utilized to measure protein concentrations in archived plasma samples from the Nurses’ Health Studies and Health Professionals Follow-Up Study. Spearman’s correlation coefficients (r) and intraclass correlation coefficients (ICCs) were used to assess agreement between (1) 42 triplicate samples processed immediately, 24-h or 48-h after blood collection from 14 participants; and (2) 80 plasma samples from 40 participants collected 1-year apart. When comparing samples processed immediately, 24-h, and 48-h later, 55% of assays had an ICC/r ≥ 0.75 and 87% had an ICC/r ≥ 0.40 in Olink compared to 44% with an ICC/r ≥ 0.75 and 72% with an ICC/r ≥ 0.40 in SOMAscan7K. For both platforms, >90% of the assays were stable (ICC/r ≥ 0.40) in samples collected 1-year apart. Among 817 proteins measured with both platforms, Spearman’s correlations were high (r > 0.75) for 14.7% and poor (r < 0.40) for 44.8% of proteins. High-throughput proteomics profiling demonstrated reproducibility in archived plasma samples and stability after delayed processing in epidemiological studies, yet correlations between proteins measured with the Olink and SOMAscan7K platforms were highly variable.

Keywords: Aptamers, biomarkers, epidemiology studies, laboratory methods and tools, multiplexing, systems biology

1 | INTRODUCTION

Protein biomarkers have been central to chronic disease diagnosis, prevention, and treatment in clinical medicine for over 30 years [1]. Advances in high-throughput proteomics technology now allow for simultaneous quantification of thousands of human proteins [1–3]. This provides new opportunities to accelerate biomarker discovery for improving accuracy of chronic disease risk prediction [4–10].

Olink Bioscience (Uppsala, Sweden) and SOMAscan (SomaLogic: Boulder, CO) assays have led the way in high-throughput, high multiplex affinity proteomics profiling. The Olink Bioscience proteomics platform provides multiplexed immune-based assay panels targeted toward various disease processes, while the SOMAscan platform provides modified oligonucleotide aptamer-based assays that cover a broad range of biological processes. Both assays require small sample volumes and capture a wide range of proteins across the whole dynamic range (>10 logs). Although the two assays quantify different sets of proteins, many proteins can be measured across both platforms. Integrating protein measures across platforms would facilitate research efforts to identify protein–disease relationships, yet recent studies have reported a wide range of correlations in protein levels measured with the two assays [11, 12].

Large epidemiological studies collect biospecimens from study participants and store them for several decades, and these archives can serve as valuable resources for biomarker discovery. However, sample processing and data collection techniques do not always follow standard clinical protocols and can vary across studies and individuals within studies. To identify robust clinical biomarkers, biomarker discovery should ideally leverage biological samples that are easy to collect, display little within-person variation over time (except for biomarkers specifically linked to particular clinical phenotypes), and come from samples processed using various methods [13, 14]. The Olink and SOMAscan proteomics assays include extensive quality control (QC) measures as well as normalization and calibration approaches to ensure the detection of technical laboratory errors and low-quality outlier samples. However, few data are available that rigorously assess the impact of preanalytical blood sample collection factors on protein expression levels measured by the Olink and the SOMAscan platforms. These experiments are essential in epidemiologic studies, where blood samples are often shipped overnight, resulting in delays in processing, and then are archived for many years. Further, little is known about how protein measures compare between the Olink and SOMAscan platforms.

To address these gaps, we first conducted three experiments to examine the interassay reproducibility, the impact of delayed sample processing, and within-person reproducibility over 1- and 10-years of proteins measured with the Olink proteomics platform among archived blood samples from men and women in three prospective cohort studies. Then, we applied similar methods for the latest version of the SOMAscan platform (SOMAscan7K v4.1), updating a previous assessment of the SOMAscan1.3K platform [15], and compared protein concentrations in Olink versus SOMAscan.

2 | MATERIALS AND METHODS

2.1 | Study population

This study utilized data collected from participants of the Nurses’ Health Study (NHS), NHSII, Health Professionals Follow-Up Study (HPFS), and local volunteers. The NHS began in 1976 and included 121,701 female registered nurses aged 30–55 years from 11 US states [16]. The NHSII began in 1989 and included 116,429 female registered nurses aged 25–42 years from 14 US states [16]. Samples from an NHSII substudy, the Mind–Body Study [17], were utilized here because these participants have two blood samples collected 1-year apart. The HPFS began in 1986 and included 51,529 male health professionals aged 40–75 years from all 50 US states [18]. Blood samples were provided by 32,826 women in NHS between 1989 and 1990, 29,611 women in NHSII between 1996 and 1999, and 18,225 men in HPFS from 1993 to 1995. In all cohorts, blood samples were collected and processed as reported previously [19, 20]. In NHS (mean age = 57 years), NHSII (mean age = 45 years), and HPFS (mean age = 64 years), participants were mailed a blood collection kit that contained a consent form, supplies, and instructions for the blood draw. Samples were returned via prepaid overnight courier in a Styrofoam container with an icepack. Blood samples from NHS and NHSII were collected in heparin and samples from HPFS were collected in EDTA. Upon arrival, samples were centrifuged, separated into buffy coat, red blood cells, and plasma, aliquoted, and stored in liquid nitrogen freezers at ≤−130°C. Processing was completed within several hours of specimen receipt and most samples (>90%) were frozen within 30–36 h after venipuncture. The time elapsed between sample processing and proteomic profiling was 28–31 years in NHS, 19–22 years in NHSII, and 23–25 years in HPFS. Participants from the NHS, NHSII, and HPFS were asked about their fasting status at the time of blood draw and provided information about age, height, weight, sex, and race/ethnicity during biennial questionnaires.
In NHS and HPFS, consent among the participants was implied by receipt of completed questionnaires and blood samples. Written informed consent was obtained for NHSII participants.

Local volunteers were recruited to test the effects of immediate versus delayed processing in 2009. At the time of blood draw, volunteers provided written consent and data on age, sex, race/ethnicity, and fasting status. Blood samples were deidentified and divided into three aliquots: one was processed immediately, one was processed 24-h later, and another was processed 48-h after blood collection. Samples were stored in the same manner as described above for NHS, NHSII, and HPFS. The time elapsed between sample processing and proteomic profiling was 10 years among the local volunteers. The Brigham and Women’s Hospital Institutional Review Board approved all study protocols.

2.2 | Study design

We conducted three separate experiments to assess different types of stability and reproducibility in archived cohort samples: (1) blinded replicates; (2) delayed processing; and (3) within-person stability over time.

Table 1 summarizes the number of participants and samples included in each of the three experiments. QC pools derived from discarded plasma from blood donation centers were also included across the three experiments. First, the Olink blinded replicates experiment evaluated interassay reproducibility in archived cohort samples by anticoagulant type using 28 samples from six NHS participants (six duplicates; heparin), six HPFS participants (six duplicates; EDTA), and two QC pools (two duplicates; one of each collected in heparin or EDTA). This study design is similar to previous studies aiming to evaluate the measurement error within a biological assay [21, 22]. Because a previous version of the blinded replicates experiment reported high reproducibility for the SOMAscan platform (1.3K) [15], an abbreviated blinded replicates experiment was repeated among three QC pools (two heparin duplicates; 1 EDTA duplicate) for SOMAscan7K v4.1. Second, we evaluated the impact of delayed processing in blood samples from 14 donors (seven in EDTA and seven in heparin) and two QC pools, with each sample allocated into triplicates and processed (separated into plasma and blood cells and then frozen) immediately, or 24-h or 48-h after blood draw. Third, within-person reproducibility over time was evaluated by comparing protein concentrations in 84 blood samples collected 1-year apart from 40 randomly selected NHSII participants and two QC pool samples. In Olink, an additional experiment was conducted utilizing 80 blood samples collected 10-years apart from 38 NHS participants from an ovarian cancer nested case–control study and two QC pool samples. One participant was excluded from the SOMAscan7K within-person reproducibility pilot due to limited availability of stored plasma for this sample. These sample sizes have proved adequate in previous reproducibility and stability studies to provide estimates of within-person variation and biomarker stability [14, 15, 21, 23, 24]. 
To avoid bias, all samples were blinded to laboratory personnel.

TABLE 1.

Characteristics of study participants and QC samples included in the proteomics profiling experiments: blinded replicates, delayed processing, and within-person reproducibility over time (1 and 10-years)

| Characteristic | Blinded replicates | Delayed processing | Within-person reproducibility over 1-year | Within-person reproducibility over 10-years |
|---|---|---|---|---|
| Total number of samples (QC and participant), n | 28 | 46 | 84 | 80 |
| Standard QC samples, n | 2 | 2 | 2 | 2 |
| Standard QC samples: number of replicates | 2 | 2 | 2 | 2 |
| Study participants, n | 12 | 14 | 40 | 38 |
| Study participants: plasma samples per participant | 2 | 3 | 2 | 2 |
| Plasma sample anticoagulant: EDTA | 6 | 7 | 0 | 0 |
| Plasma sample anticoagulant: heparin | 6 | 7 | 40 | 38 |
| Age (years), mean (range) | 46 (30–69) | 41 (24–57) | 45 (37–51) | 59 (48–67) |
| Female (%) | 50 | 79 | 100 | 100 |
| White (%) | 83 | 79 | 95 | 100 |
| BMI (kg/m²), mean (range) | 24.6 (19.5–32.0) | NA | 27.0 (18.3–50.0) | 24.0 (17.4–35.9) |
| Current smoking (%) | 0 | 7 | 0 | 3 |
| Time since last meal <8 h (%) | 50 | 64 | 29 | 21 |
| Time since last meal ≥8 h (%) | 50 | 36 | 71 | 79 |

BMI, body mass index; EDTA, ethylenediaminetetraacetic acid; QC, quality control.

2.3 | Proteomic profiling

First, we performed proteomic profiling with the immunoaffinity-based Olink Bioscience platform (Uppsala, Sweden), which offers several different targeted assays. We included proteins from the following 11 panels: cardiometabolic, cardiovascular II, cardiovascular III, cell regulation, development, immune response, inflammation, metabolism, neurology, oncology, and organ damage (specific versions included in each experiment are specified within corresponding tables). Each panel includes 92 proteins, with some proteins being measured in multiple panels. Thus, 972 unique proteins were measured. The immunoaffinity technique used by Olink leverages an extensive collection of nucleotide-labeled antibodies in an immune-PCR method referred to as a proximity extension assay (PEA) that is useful in multiplexing to reduce the problem of cross-reactivity [1, 25]. PEA is based on the incubation of the samples with two distinct antibodies targeting nonoverlapping epitopes of the analyte of interest. The antibodies are labeled with complementary DNA oligonucleotide sequences, which come into close proximity upon target binding and subsequently hybridize. Through the addition of DNA polymerase, the oligonucleotides are then extended over the complementary probe to form a PCR amplicon, which is finally quantified by microfluidic qPCR [25].

Second, we performed proteomic profiling using the oligonucleotide aptamer-based proteomics technology, SOMAscan, provided by SomaLogic (Boulder, CO; SomaScan Assay v4.1) according to the manufacturer’s standardized protocol. SomaScan Assay v4.1 uses 7322 high affinity, distinct aptamer reagents to measure expression of 6596 unique proteins. SomaLogic’s proprietary protein-capture reagents called SOMAmer (Slow Off-rate Modified Aptamer) leverage short single-stranded DNA sequences known as aptamers that form three-dimensional structures similar to antibodies and based on their unique nucleotide sequences are able to bind with high affinity and selectivity to distinct proteins [26].

Data from both proteomics platforms were normalized and transformed using internal controls and interplate controls to adjust for intra- and inter-run variation. In Olink, the final assay read-out is in Normalized Protein eXpression (NPX), which is an arbitrary unit on a log-scale, where a higher value corresponds to a higher protein expression. Each PEA measurement has a lower limit of detection (LOD) based on negative controls included in each run. Values below the LOD were set to missing. For SOMAscan, the final assay read-out is in relative fluorescence units (RFU), an arbitrary unit corresponding to higher or lower protein expression.
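The LOD handling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the example NPX values, and the LOD of 0.5 are hypothetical, and a real run would take the LOD for each assay from that run's negative controls.

```python
from typing import List, Optional, Sequence


def mask_below_lod(npx: Sequence[float], lod: float) -> List[Optional[float]]:
    """Return a copy of the NPX values with readings below the LOD set to None (missing)."""
    return [value if value >= lod else None for value in npx]


def fraction_missing(values: Sequence[Optional[float]]) -> float:
    """Share of samples whose reading is missing for one assay."""
    return sum(value is None for value in values) / len(values)


# Hypothetical NPX readings for one assay across four samples.
npx_values = [3.2, 0.4, 1.9, 0.1]
masked = mask_below_lod(npx_values, lod=0.5)
print(masked, fraction_missing(masked))
```

Keeping the masked values as explicit missing entries, rather than dropping them, preserves the sample order so that the per-assay missingness can be checked later.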

2.4 | Statistical analysis

Proteins were classified into groups based on their molecular function using Gene Ontology (GO) annotations [27] obtained through UniProt (www.uniprot.org) [28] (see Supplemental Methods), and Olink assays were also grouped by panel. Next, the blinded replicates experiment was utilized to identify assays that demonstrated good reproducibility (CVs < 20%) in the archived cohort samples (see Supplemental Methods).
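As a rough illustration of the CV screen, the sketch below computes a percent CV for each replicate pair and averages it per assay. The exact CV definition used in the study is given in the Supplemental Methods and is not reproduced here, so the formula (sample SD divided by mean, on a linear scale), the function names, and the example values are all assumptions.

```python
from statistics import mean, stdev


def percent_cv(replicates):
    """Coefficient of variation (%) for one set of replicate measurements."""
    return 100.0 * stdev(replicates) / mean(replicates)


def mean_cv(replicate_sets):
    """Average the per-set CVs for one assay."""
    return mean(percent_cv(reps) for reps in replicate_sets)


def passes_cv_screen(replicate_sets, threshold=20.0):
    """True when the assay's mean CV is below the threshold (20% in the text)."""
    return mean_cv(replicate_sets) < threshold


# Hypothetical duplicate measurements (linear scale) for two assays.
stable_assay = [(100.0, 110.0), (50.0, 52.0)]
noisy_assay = [(100.0, 200.0)]
print(passes_cv_screen(stable_assay), passes_cv_screen(noisy_assay))
```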

To assess delayed processing, we calculated intraclass correlation coefficients (ICCs) and Spearman’s r in samples processed immediately versus 24-h and 48-h after blood collection by protein class and panel (Olink only). To achieve a normal distribution for all proteins, relative protein concentrations were probit-transformed prior to ICC calculations. We excluded proteins with CVs > 20% in the blinded replicate experiments or with >75% of samples missing (n = 53 and n = 66 proteins in samples collected in EDTA and heparin for Olink, respectively; n = 211 aptamers for SOMAscan7K). The ICCs across processing times were calculated as the between-person variance divided by the sum of the within- and between-person variances to estimate the within-person variation relative to the total variation across processing times. We quantified the percentage of assays that achieved an ICC or Spearman r ≥ 0.75, which indicates excellent stability, and an ICC or Spearman r ≥ 0.40, which indicates fair to good stability according to previous studies [29].
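The ICC definition above (between-person variance over the sum of within- and between-person variances) can be sketched as follows. Two assumptions are made here and are not confirmed by the text: the probit transform is read as a rank-based inverse normal transform with a 0.5 offset, and the variance components come from a balanced one-way ANOVA rather than whatever model the authors fitted in R. All names and data are hypothetical.

```python
from statistics import NormalDist, mean


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def probit_transform(values):
    """Rank-based inverse normal transform (assumed form of the probit step)."""
    n = len(values)
    normal = NormalDist()
    return [normal.inv_cdf((rank - 0.5) / n) for rank in average_ranks(values)]


def icc_oneway(groups):
    """ICC from one list of k measurements per participant (balanced design)."""
    n, k = len(groups), len(groups[0])
    person_means = [mean(g) for g in groups]
    grand_mean = mean(person_means)
    ms_between = k * sum((m - grand_mean) ** 2 for m in person_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, person_means) for x in g
    ) / (n * (k - 1))
    var_between = max((ms_between - ms_within) / k, 0.0)  # truncate at zero
    total = var_between + ms_within
    return var_between / total if total > 0 else float("nan")


def icc_after_probit(groups):
    """Transform all measurements of one protein together, then compute the ICC."""
    k = len(groups[0])
    flat = probit_transform([x for g in groups for x in g])
    return icc_oneway([flat[i:i + k] for i in range(0, len(flat), k)])


# Hypothetical protein: three participants, each processed at 0, 24, and 48 h.
example = [[1.0, 1.1, 0.9], [5.0, 5.2, 4.9], [9.0, 9.1, 8.8]]
print(round(icc_after_probit(example), 2))
```

With this estimator, an ICC near 1 means that almost all variation lies between participants, so processing delay changes the ranking of individuals very little.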

To assess within-person stability over 1-year, we calculated Spearman r and ICCs by protein class and panel (Olink only). Among assays examined in the delayed processing experiment, we further excluded assays with ICC or Spearman r < 0.40 in the delayed processing experiment (n = 222 and 2072, respectively), resulting in a total of 684 and 5039 assays included in the within-person stability over 1-year experiment for Olink and SOMAscan, respectively. We examined the median, 10th percentile, and 90th percentile of ICCs and Spearman r. We also quantified the percentage of assays that achieved an ICC and/or Spearman r ≥ 0.40. This level of agreement has been shown to be an acceptable level of within-person stability for biomarkers in previous studies examining the influence of measurement error on risk estimates [14, 29]. Similar methods were applied to assess within-person stability over 10-years (see Supplemental Methods).
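The stability categories and summary statistics used in these experiments can be sketched as below. The cutoffs (0.75 and 0.40) come from the text; reading "ICC or Spearman r" as "either statistic meets the cutoff" is an assumption, as are the function names, the percentile convention, and the example values.

```python
from collections import Counter
from statistics import median, quantiles

EXCELLENT_CUTOFF = 0.75
FAIR_CUTOFF = 0.40


def stability_class(icc, r):
    """Classify one assay; either statistic may satisfy the cutoff (assumption)."""
    best = max(icc, r)
    if best >= EXCELLENT_CUTOFF:
        return "excellent"
    if best >= FAIR_CUTOFF:
        return "fair to good"
    return "poor"


def percent_by_class(assays):
    """assays: dict mapping assay name -> (ICC, Spearman r)."""
    counts = Counter(stability_class(icc, r) for icc, r in assays.values())
    return {
        label: 100.0 * counts[label] / len(assays)
        for label in ("excellent", "fair to good", "poor")
    }


def icc_summary(assays):
    """Median, 10th percentile, and 90th percentile of the ICCs."""
    iccs = [icc for icc, _ in assays.values()]
    deciles = quantiles(iccs, n=10, method="inclusive")
    return median(iccs), deciles[0], deciles[-1]


# Hypothetical results for four assays.
example = {
    "assay_A": (0.80, 0.70),
    "assay_B": (0.35, 0.45),
    "assay_C": (0.20, 0.10),
    "assay_D": (0.76, 0.90),
}
print(percent_by_class(example))
print(icc_summary(example))
```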

To assess agreement between the Olink and SOMAscan7K assays, we examined Spearman’s (r) correlation coefficients between proteins measured with both platforms (n = 817) among 39 NHSII participants included in the within-person reproducibility experiment at the time of the first and second blood draw. To account for repeated measures, we have reported the mean correlation from the two blood draws and utilized bootstrapping to obtain a 95% confidence interval around our estimate. We examined correlations by protein class and quantified the number of proteins that achieved high (r > 0.75), moderate (r = 0.40–0.75), or poor (r < 0.40) correlations. Because some of the SOMAscan7K assays include multiple aptamers that target the same protein, we reported all correlations between aptamers targeting the same protein in the Olink assay. To select a SOMAscan7K aptamer for the summary tables, we chose the assay with the highest correlation to the Olink assay. All statistical analyses were performed in R (version 4.0.3; https://cran.r-project.org) statistical software.
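The cross-platform comparison can be sketched as follows: Spearman's r is computed at each blood draw, the two values are averaged, and a percentile bootstrap over participants gives an approximate 95% interval. The resampling scheme (participants resampled with both draws kept together), the number of resamples, and all names and data are assumptions; the authors' R code may differ.

```python
import random
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's r as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5


def mean_r(draw1, draw2):
    """Each draw is (olink_values, somascan_values) for the same participants."""
    return (spearman(*draw1) + spearman(*draw2)) / 2


def bootstrap_ci(draw1, draw2, n_boot=1000, seed=0):
    """Percentile bootstrap interval, resampling participants with replacement."""
    rng = random.Random(seed)
    n = len(draw1[0])
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        pick = lambda draw: tuple([v[i] for i in idx] for v in draw)
        try:
            estimates.append(mean_r(pick(draw1), pick(draw2)))
        except ZeroDivisionError:
            continue  # degenerate resample with no variation in ranks
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return lower, upper


def best_aptamer(olink_values, aptamers):
    """Pick the aptamer (dict name -> values) most correlated with the Olink assay."""
    return max(aptamers, key=lambda name: spearman(olink_values, aptamers[name]))


# Hypothetical data: six participants, two blood draws, one protein.
first_draw = ([1.0, 2.1, 2.9, 4.2, 5.0, 6.3], [10.0, 33.0, 19.0, 41.0, 48.0, 65.0])
second_draw = ([1.2, 2.0, 3.1, 3.9, 5.2, 6.0], [12.0, 21.0, 30.0, 52.0, 39.0, 61.0])
print(mean_r(first_draw, second_draw), bootstrap_ci(first_draw, second_draw))
```

Resampling whole participants, rather than individual samples, keeps each person's two draws paired, which is one simple way to respect the repeated-measures structure.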

3 | RESULTS

The majority of participants were white female nonsmokers (Table 1). The mean age was similar in all experiments, although participants were oldest in the 10-year within-person reproducibility experiment. In the blinded replicates, delayed processing, and within-person reproducibility over 1- and 10-years experiments, 50%, 36%, 71%, and 79% of participants were fasting for ≥8 h before their blood draw, respectively. Results for individual proteins measured with Olink and/or SOMAscan7K platforms from all experiments are provided in Supplemental Tables S1–S7. A total of 99.7% and 97% of proteins passed the blinded replicates experiments (CV < 20%) within the archived cohort samples in Olink and SOMAscan7K, respectively (see Supplemental Results).

3.1 | Delayed processing

The ICCs and Spearman r for samples processed immediately, 24-h after blood draw, and 48-h after blood draw by protein class are shown in Supplemental Tables S8 and S9. Among all proteins passing the blinded replicates experiment, 54% and 39% in EDTA and 55% and 44% in heparin samples had an ICC or Spearman r ≥ 0.75 for the Olink and SOMAscan7K platforms, respectively (Figure 1). An additional 33% and 33% in EDTA and 21% and 27% in heparin samples had an ICC or Spearman r = 0.40–0.75 for the Olink and SOMAscan7K platforms, respectively. For the SOMAscan7K platform, correlations between protein concentrations in samples stored in EDTA, but not heparin, that were processed immediately versus after a 48-h delay (median r = 0.26) were lower than the corresponding correlations in samples processed immediately versus after a 24-h delay (median r = 0.54).

FIGURE 1.

Olink and SOMAscan7K proteomics profiling delayed processing experiment: histograms representing the percentage of proteins with ICCs or Spearman’s correlation coefficients (r) ≥0.75 (excellent stability), 0.40–0.75 (fair to good stability), or < 0.40 (poor stability) from samples processed immediately versus 24-h versus 48-h stratified by protein molecular function class and anticoagulant type: OE, Olink proteomics profiling in blood samples collected in EDTA; OH, Olink proteomics profiling in blood samples collected in heparin; N, number of proteins or assays; SE, SOMAscan7K proteomics profiling in blood samples collected in EDTA; SH, SOMAscan7K proteomics profiling in blood samples collected in heparin. EDTA, ethylenediaminetetraacetic acid; ICC, intraclass correlation

The proportion of proteins that displayed stability after delays in processing varied substantially by protein class for the Olink platform but was similar across protein classes for the SOMAscan7K platform (Figure 1). Within protein classes in the Olink platform, proteins with transferase activity were affected most by delayed processing: only 34% and 28% of proteins had an ICC or Spearman r ≥ 0.75 in EDTA or heparin samples, respectively. Hormones measured with the Olink platform displayed the most stability after delayed processing; 70% and 87% of proteins had an ICC or Spearman r ≥ 0.75 in EDTA or heparin samples, respectively. The largest variation by anticoagulant type was observed for transcription factors, where 44% and 29% of proteins had an ICC or Spearman r ≥ 0.75 after delayed processing when collected in EDTA or heparin, respectively, and for structural proteins measured with the Olink platform, where the corresponding figures were 36% and 65%. Similar variation in protein stability after delays in processing was also observed by Olink panels (see Supplemental Results).

3.2 | Within-person stability over time

The ICCs and Spearman r for samples collected in the same individuals approximately 1 year apart are shown in Table 2. Among all assays that passed the blinded replicate and delayed processing experiments in the Olink (n = 684) and SOMAscan7K (n = 5039) platforms, 92% and 91% had an ICC or Spearman r ≥ 0.40 when comparing samples that were collected 1 year apart. Across protein classes, stability was lowest among kinases, transcription factors, and transferases measured with the Olink platform and growth factors measured with SOMAscan7K, where 83%, 83%, 81%, and 80% of proteins had an ICC or Spearman r ≥ 0.40, respectively. Across Olink panels, the percentage of proteins with an ICC or Spearman r ≥ 0.40 was ≥85% for all panels (Supplemental Table S10). Further, 79% of a subset of proteins measured in the same individuals 10 years apart had an ICC or Spearman r ≥ 0.40 (Supplemental Table S11).

TABLE 2.

Olink and SOMAscan7K proteomics profiling within-person reproducibility over time experiment (1-year): Spearman correlations (r) and intraclass correlations (ICC) for protein concentrations of samples collected 1 year apart in 40 Nurses’ Health Study II participants stratified by protein molecular function class

| Protein molecular function class | n^a | ICC, median (10th to 90th percentile) | ICC ≥0.40 (%) | Spearman r, median (10th to 90th percentile) | Spearman r ≥0.40 (%) | ICC or r ≥0.40 (%) |
|---|---|---|---|---|---|---|
| Olink | | | | | | |
| Cytokine | 65 | 0.63 (0.48, 0.78) | 98 | 0.65 (0.49, 0.79) | 97 | 98 |
| Growth factor | 40 | 0.56 (0.42, 0.74) | 93 | 0.53 (0.35, 0.77) | 88 | 93 |
| Dimer-forming | 53 | 0.58 (0.38, 0.80) | 89 | 0.60 (0.42, 0.80) | 91 | 91 |
| Kinase | 36 | 0.54 (0.26, 0.74) | 75 | 0.55 (0.30, 0.76) | 81 | 83 |
| Transferase | 16 | 0.56 (0.17, 0.76) | 81 | 0.58 (0.20, 0.79) | 75 | 81 |
| Receptor | 179 | 0.67 (0.39, 0.83) | 90 | 0.65 (0.42, 0.85) | 91 | 92 |
| Protease | 72 | 0.65 (0.31, 0.81) | 88 | 0.64 (0.32, 0.80) | 88 | 88 |
| Protease inhibitor | 28 | 0.64 (0.45, 0.81) | 93 | 0.60 (0.42, 0.81) | 89 | 93 |
| Hormone | 21 | 0.71 (0.44, 0.89) | 90 | 0.72 (0.41, 0.90) | 95 | 95 |
| Structural | 28 | 0.61 (0.45, 0.81) | 96 | 0.62 (0.41, 0.81) | 89 | 96 |
| Carbohydrate binding | 46 | 0.69 (0.46, 0.84) | 98 | 0.66 (0.49, 0.84) | 96 | 98 |
| Peptide binding | 21 | 0.73 (0.51, 0.77) | 90 | 0.71 (0.50, 0.80) | 90 | 90 |
| Lipid binding | 12 | 0.63 (0.40, 0.83) | 92 | 0.64 (0.36, 0.84) | 83 | 92 |
| Transcription factor | 6 | 0.72 (0.42, 0.87) | 83 | 0.64 (0.35, 0.89) | 83 | 83 |
| Other/unclassified | 211 | 0.65 (0.38, 0.85) | 89 | 0.65 (0.38, 0.84) | 87 | 90 |
| All proteins | 684 | 0.64 (0.40, 0.83) | 91 | 0.64 (0.39, 0.84) | 89 | 92 |
| SOMAscan7K | | | | | | |
| Cytokine | 203 | 0.68 (0.18, 0.94) | 80 | 0.63 (0.31, 0.81) | 85 | 87 |
| Growth factor | 138 | 0.65 (0.13, 0.94) | 72 | 0.59 (0.25, 0.80) | 78 | 80 |
| Dimer-forming | 358 | 0.69 (0.31, 0.94) | 86 | 0.63 (0.39, 0.82) | 89 | 92 |
| Kinase | 230 | 0.63 (0.27, 0.93) | 81 | 0.56 (0.35, 0.82) | 86 | 89 |
| Transferase | 343 | 0.67 (0.28, 0.93) | 85 | 0.62 (0.39, 0.82) | 88 | 92 |
| Receptor | 559 | 0.73 (0.36, 0.95) | 88 | 0.67 (0.41, 0.88) | 90 | 93 |
| Protease | 278 | 0.74 (0.25, 0.94) | 83 | 0.68 (0.36, 0.85) | 87 | 90 |
| Protease inhibitor | 133 | 0.72 (0.39, 0.93) | 89 | 0.70 (0.47, 0.86) | 93 | 95 |
| Hormone | 101 | 0.68 (0.36, 0.94) | 87 | 0.66 (0.43, 0.81) | 92 | 93 |
| Structural | 122 | 0.68 (0.34, 0.92) | 84 | 0.65 (0.35, 0.86) | 86 | 89 |
| Carbohydrate binding | 135 | 0.79 (0.38, 0.95) | 89 | 0.72 (0.36, 0.91) | 87 | 89 |
| Peptide binding | 106 | 0.69 (0.34, 0.96) | 87 | 0.65 (0.37, 0.89) | 89 | 92 |
| Lipid binding | 150 | 0.70 (0.35, 0.89) | 85 | 0.64 (0.45, 0.84) | 94 | 95 |
| Transcription factor | 39 | 0.70 (0.37, 0.93) | 87 | 0.68 (0.35, 0.86) | 87 | 90 |
| Ion binding | 1143 | 0.70 (0.30, 0.95) | 85 | 0.65 (0.39, 0.83) | 89 | 92 |
| Dehydrogenase | 80 | 0.64 (0.26, 0.93) | 83 | 0.60 (0.37, 0.79) | 86 | 90 |
| Hydrolase | 107 | 0.68 (0.27, 0.93) | 85 | 0.65 (0.39, 0.85) | 88 | 92 |
| Cytoskeleton binding | 181 | 0.62 (0.34, 0.93) | 84 | 0.59 (0.36, 0.78) | 87 | 92 |
| Other/unclassified | 1978 | 0.69 (0.31, 0.95) | 85 | 0.64 (0.37, 0.83) | 88 | 91 |
| All assays | 5039 | 0.69 (0.30, 0.95) | 85 | 0.64 (0.38, 0.83) | 88 | 91 |
^a Total number of unique proteins or assays included in analyses after excluding those with mean CVs > 20% from the blinded replicates experiment, those that were not quantifiable in >75% of samples collected in heparin from the blinded replicates or within-person reproducibility over time experiments, or an ICC or Spearman r < 0.40 in the delayed processing experiment.

3.3 | Comparing proteins measured with both the Olink and SOMAscan7K proteomics platforms

Correlations between proteins measured with both the Olink and SOMAscan7K platforms (n = 817) are presented by protein class in Table 3 and for individual assays in Supplemental Table S12. The median Spearman (r) correlation was 0.45, ranging from a minimum of −0.43 to a maximum of 0.96. High (r > 0.75) and moderate (r = 0.40–0.75) correlations were observed for 14.7% and 40.5% of proteins, respectively. Similar trends were observed when examining correlations between the Olink platform and an older version of the SOMAscan1.3K platform among 455 overlapping proteins (Supplemental Tables S13 and S14).

TABLE 3.

Spearman correlations (r) by protein class for proteins measured with both the Olink and SOMAscan7K proteomics profiling platforms among 39 Nurses’ Health Study II participants^a

| Protein molecular function class | N | Median (Min, Max) | r < 0.40 (%) | r = 0.40–0.75 (%) | r > 0.75 (%) |
|---|---|---|---|---|---|
| Cytokine | 74 | 0.50 (−0.19, 0.92) | 35.1 | 50.0 | 14.9 |
| Growth factor | 53 | 0.40 (−0.10, 0.94) | 50.9 | 35.9 | 13.2 |
| Dimer-forming | 72 | 0.39 (−0.23, 0.85) | 51.4 | 38.9 | 9.7 |
| Kinase | 49 | 0.43 (−0.15, 0.83) | 44.9 | 49.0 | 6.1 |
| Transferase | 28 | 0.33 (−0.25, 0.76) | 60.7 | 35.7 | 3.6 |
| Receptor | 193 | 0.39 (−0.24, 0.90) | 50.8 | 37.3 | 11.9 |
| Protease | 79 | 0.58 (−0.33, 0.95) | 32.9 | 43.0 | 24.1 |
| Protease inhibitor | 37 | 0.59 (−0.41, 0.94) | 35.1 | 40.5 | 24.3 |
| Hormone | 21 | 0.42 (−0.14, 0.95) | 42.9 | 33.3 | 23.8 |
| Structural | 30 | 0.39 (−0.24, 0.90) | 53.3 | 40.0 | 6.7 |
| Carbohydrate binding | 49 | 0.53 (−0.25, 0.90) | 40.8 | 40.8 | 18.4 |
| Peptide binding | 24 | 0.43 (−0.23, 0.89) | 45.8 | 33.3 | 20.8 |
| Lipid binding | 18 | 0.41 (−0.08, 0.91) | 44.4 | 44.4 | 11.1 |
| Transcription factor | 10 | 0.31 (0.01, 0.77) | 50.0 | 40.0 | 10.0 |
| Other/unclassified | 261 | 0.43 (−0.43, 0.96) | 44.4 | 40.6 | 14.9 |
| All proteins | 817 | 0.45 (−0.43, 0.96) | 44.8 | 40.5 | 14.7 |

Min, minimum; Max, maximum.

^a Estimated correlations are derived from overlapping participants in the Olink and SOMAscan7K within-person reproducibility pilots, accounting for repeated measures of protein concentrations approximately 1 year apart.

4 | DISCUSSION

Proteomic profiling with the Olink and SOMAscan7K platform assays displayed excellent reproducibility and stability for many proteins measured in archived plasma samples collected in an epidemiologic setting. The utility of these high-throughput proteomics assays for the discovery of novel biomarkers was examined for up to 972 unique proteins measured with Olink and 7322 proteoforms for 6596 unique proteins measured with SOMAscan7K. We conducted three stability and reproducibility experiments: (1) blinded replicates; (2) delayed processing; and (3) within-person stability over time. Mean CVs were <20% for 99.7% and 97% of the proteins with the Olink and SOMAscan7K platforms, respectively, and were similar by plasma anticoagulant type. A total of 55% and 44% of proteins displayed excellent stability (ICC or Spearman r ≥ 0.75) and an additional 32% and 28% displayed fair to good stability (ICC or Spearman r = 0.40–0.75) with delays in sample processing up to 48-h for Olink and SOMAscan7K platforms, respectively. This suggests that delayed processing in epidemiological studies should be considered when conducting proteomics profiling with either platform. For both platforms, good stability and low variation (ICC or Spearman r ≥ 0.40) were observed for >90% of assays in samples collected 1-year apart. These stability and reproducibility measures are comparable to other types of high-throughput proteomics assays [3, 15, 30]. However, correlations between proteins measured with both the Olink and SOMAscan7K platforms ranged from high (14.7% with r > 0.75) to moderate (40.5% with r = 0.40–0.75) to low (44.8% with r < 0.40), replicating similar observations from previous studies [11, 12].

The delayed processing experiment demonstrated that up to 499 and 3105 Olink or SOMAscan7K assays, respectively, displayed excellent stability with delays in processing (24-h and 48-h), and 804 and 5152 displayed fair to good stability. In a prior study, we examined reproducibility and validity of the SOMAscan1.3K, an earlier version of the SOMAscan7K platform, that measures 1305 unique proteins [15]; 50% of the SOMAscan1.3K proteins were measured with Olink in the current study. Similar to the SOMAscan1.3K platform, we observed high variability in protein stability after delayed processing by protein class with the Olink platform. Among protein classes examined in both studies, hormones displayed the best stability and kinases displayed the least stability after delayed processing within the SOMAscan1.3K and Olink platforms. The additional categories examined here identified that few transferases displayed excellent stability after delayed processing (34% in EDTA and 28% in heparin) when using the Olink platform. Similar variation by protein class was not evident in the newer SOMAscan7K platform. However, delays in processing appeared to have a larger impact on protein correlations in samples that were processed after a 48-h delay versus processed immediately when stored in EDTA compared to heparin for the SOMAscan7K assays.

The results from our within-person stability experiment suggest that a single plasma measurement of many proteins measured with the Olink and SOMAscan7K platforms is representative of long-term measures, making them useful candidate biomarkers. Similar results were also observed for the SOMAscan1.3K platform, where 91% of proteins displayed good stability over time (ICC or Spearman r ≥ 0.40) [15], compared to the 92% and 91% observed for the Olink and SOMAscan7K platforms, respectively. For comparison, previous studies have shown that established biomarkers like serum glucose and cholesterol measured 1–2 years apart display correlations of r = 0.38–0.41 and r = 0.65–0.77, respectively [23, 31]. For the Olink and SOMAscan1.3K platforms, measurements of cytokines, receptors, hormones, and structural proteins were the most stable over time, with >90% of proteins in these classes displaying good stability over time. Notably, kinases displayed better stability over time when measured with the SOMAscan1.3K platform (91% with r ≥ 0.40) compared to the Olink platform (83% with r ≥ 0.40). The proportion of proteins with good stability over time in the SOMAscan7K platform tended to be slightly lower compared to the Olink and SOMAscan1.3K platforms, but the absolute number of proteins displaying good stability in each class was higher across all categories. We further demonstrate long-term stability of a large proportion of the proteins measured with the Olink platform (79%) using data from individuals with proteomics profiling in samples collected 10 years apart.

Among up to 40 NHSII participants with overlapping proteomics profiling using Olink, SOMAscan7K, and SOMAscan1.3K, correlations between assays for the same proteins tended to be moderate but ranged widely (see Supplemental Table S15 for example proteins). A recent study in a similar number of samples compared protein concentrations (n = 425) measured with the SOMAscan1.3K versus Olink platform and observed that 13% of proteins were well-correlated (r ≥ 0.70) and 42% were poorly correlated (r < 0.30) [12]. Our study extends this analysis to a larger number of overlapping proteins with the new SOMAscan7K platform (n = 817), where we observed similar variability in correlations between the assays (14.7% with r > 0.75 and 44.8% with r < 0.40). This variability in correlations highlights the need to better characterize the actual proteoforms measured with the SOMAscan and Olink platforms. There are various differences between the recognition molecules (antibodies, aptamers) for the two platforms that could result in low or no correlation. The antibodies and aptamers may recognize different epitopes on a given protein and, consequently, bind to different isoforms or variants of a protein. Alternatively, reagents may bind to the free form of a protein, to the protein interacting with itself, or to the protein in a complex with other proteins. The LOD may vary for each assay, resulting in decreased ability to rank individuals within certain protein concentration ranges. Only detailed characterization of each of the reagents and the specific proteoforms bound by them will help us gain a better understanding of the similarities and differences between the Olink and SOMAscan platforms.
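
The cross-platform comparison described above can be sketched in Python as follows: compute Spearman's correlation for each protein measured on both platforms, then report the share of proteins with high (r > 0.75) and poor (r < 0.40) correlations. This is a minimal sketch using only the standard library; the function names, data layout, and example values are hypothetical, and it assumes that each protein's values are listed in the same participant order on both platforms.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's r computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def summarize_agreement(platform_a, platform_b, high=0.75, poor=0.40):
    """Per-protein Spearman r plus percent of proteins with high or poor r.

    platform_a, platform_b: dicts mapping protein name to a list of values,
    one per participant, in the same participant order on both platforms.
    """
    shared = sorted(set(platform_a) & set(platform_b))
    if not shared:
        raise ValueError("no overlapping proteins")
    r_by_protein = {p: spearman(platform_a[p], platform_b[p]) for p in shared}
    n = len(r_by_protein)
    pct_high = 100 * sum(r > high for r in r_by_protein.values()) / n
    pct_poor = 100 * sum(r < poor for r in r_by_protein.values()) / n
    return r_by_protein, pct_high, pct_poor


# Hypothetical values for four participants (illustration only).
platform_a = {"P1": [1.0, 2.0, 3.0, 4.0], "P2": [1.0, 2.0, 3.0, 4.0], "P3": [5.0, 6.0, 7.0, 8.0]}
platform_b = {"P1": [10.0, 20.0, 30.0, 40.0], "P2": [4.0, 3.0, 2.0, 1.0]}
r_by_protein, pct_high, pct_poor = summarize_agreement(platform_a, platform_b)
```

Because Spearman's correlation depends only on ranks, it is unaffected by the different measurement scales of the two platforms, which is one reason rank-based agreement is a natural choice for this kind of comparison.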

There are several limitations to our study that should be considered in the interpretation of our results. We did not examine stability and reproducibility of some of the proteins (≤10%) because they were undetectable in sufficient numbers of samples or displayed high CVs in the blinded replicates experiment. The small number of blinded replicate samples limited the precision of our estimates. Anticoagulant type coincided with the sex of participants in the blinded replicates experiment, so we cannot eliminate the possibility that differences observed by anticoagulant type may be due to sex differences. The results from our delayed processing experiment may not directly apply to other cohort studies with different processing methods. The within-person reproducibility over time pilot was conducted only among women, which means the results of this experiment must be tested in men as well. Finally, data were not available to assess the validity and/or specificity of the high-throughput proteomics profiling platforms compared to conventional immunoassays. This underexplored area warrants investigation in future studies [12, 25].

Many of the proteins measured using the Olink and SOMAscan7K proteomics platforms appear to have good reproducibility and stability in archived cohort samples that would make them suitable candidate biomarkers. The assays demonstrated similar accuracy when plasma samples were collected in EDTA or heparin and good within-person reproducibility over time. Sample processing delays had variable impacts across proteins but had little effect on many protein measurements. These data provide an important resource for the investigative community in which an increasing number of studies are leveraging high-throughput proteomics profiling. Further validation and characterization of these assays will be necessary for potential integration of multiple proteomics platforms.

Supplementary Material

Supplemental 2
Supplemental 1

ACKNOWLEDGMENTS

This work is supported by National Institutes of Health (NIH) T32 CA009001 (DEH), Department of Defense OCRP W81XWH2110320 (NS), NIH P30 CA006516 (TAL), NIH U01 CA167552, R01 CA49449, R01 CA67262, UM1 CA186107, and U01 CA176726 (HPFS, NHS, and NHSII cohorts), a Catalyst Award from Cancer Research UK Aspirin for Cancer Prevention (AsCaP), and the J. Willard and Alice S. Marriott Foundation.

Funding information

Cancer Research UK; National Institutes of Health, Grant/Award Numbers: P30 CA006516, R01 CA49449, R01 CA67262, T32 CA009001, U01 CA167552, U01 CA176726, UM1 CA186107; U.S. Department of Defense, Grant/Award Number: OCRP

Abbreviations:

EDTA: ethylenediaminetetraacetic acid
HPFS: Health Professionals Follow-Up Study
ICC: intraclass correlation coefficient
NHS: Nurses’ Health Study
QC: quality control

Footnotes

SUPPORTING INFORMATION

Additional supporting information may be found online https://doi.org/10.1002/pmic.202100170 in the Supporting Information section at the end of the article.

CONFLICTS OF INTEREST

For work unrelated to this study: Dr. Samia Mora has served as a consultant to Quest Diagnostics and Pfizer; Dr. Andrew T. Chan has served as a consultant to Bayer Pharma AG, Pfizer Inc, and Boehringer Ingelheim; Dr. Yin Cao previously served as a consultant for Geneoscopy; Dr. Shilpa N. Bhupathiraju serves as a scientific consultant to LayerIV.

DATA AVAILABILITY STATEMENT

The summary data that support the findings of this study are available in the Supplementary Material. Individual-level data are available on request through external collaboration, subject to approval of a letter of intent and a research proposal (https://www.nurseshealthstudy.org/researchers and https://sites.sph.harvard.edu/hpfs/for-collaborators/).

REFERENCES

1. Smith JG, & Gerszten RE (2017). Emerging affinity-based proteomic technologies for large scale plasma profiling in cardiovascular disease. Circulation, 135, 1651–1664.
2. Candia J, Cheung F, Kotliarov Y, Fantoni G, Sellers B, Griesman T, Huang J, Stuccio S, Zingone A, Ryan BM, Tsang JS, & Biancotto A. (2017). Assessment of variability in the SOMAscan assay. Scientific Reports, 7, 14248.
3. Gold L, Ayers D, Bertino J, Bock C, Bock A, Brody EN, Carter J, Dalby AB, Eaton BE, Fitzwater T, Flather D, Forbes A, Foreman T, Fowler C, Gawande B, Goss M, Gunn M, Gupta S, Halladay D, & Zichi D. (2010). Aptamer-based multiplexed proteomic technology for biomarker discovery. PLoS ONE, 5, e15004.
4. Mallick P, & Kuster B. (2010). Proteomics: A pragmatic perspective. Nature Biotechnology, 28, 695–709.
5. Geyer PE, Holdt LM, Teupser D, & Mann M. (2017). Revisiting biomarker discovery by plasma proteomics. Molecular Systems Biology, 13, 942.
6. Lindholm D, James SK, Gabrysch K, Storey RF, Himmelmann A, Cannon CP, Mahaffey KW, Steg PG, Held C, Siegbahn A, & Wallentin L. (2018). Association of multiple biomarkers with risk of all-cause and cause-specific mortality after acute coronary syndromes. JAMA Cardiology, 3, 1160–1166.
7. Wallentin L, Eriksson N, Olszowka M, Grammer TB, Hagström E, Held C, Kleber ME, Koenig W, März W, Stewart RAH, White HD, Åberg M, & Siegbahn A. (2021). Plasma proteins associated with cardiovascular death in patients with chronic coronary heart disease: A retrospective study. PLoS Medicine, 18, e1003513.
8. Katz DH, Tahir UA, Bick AG, Pampana A, Ngo D, Benson MD, Yu Z, Robbins JM, Chen ZZ, Cruz DE, Deng S, Farrell L, Sinha S, Schmaier AA, Shen D, Gao Y, Hall ME, Correa A, Tracy RP, & Blood Institute TOPMed (Trans-Omics for Precision Medicine) Consortium. (2022). Whole genome sequence analysis of the plasma proteome in black adults provides novel insights into cardiovascular disease. Circulation, 145, 357–370.
9. Walker ME, Song RJ, Xu X, Gerszten RE, Ngo D, Clish CB, Corlin L, Ma J, Xanthakis V, Jacques PF, & Vasan RS (2020). Proteomic and metabolomic correlates of healthy dietary patterns: The Framingham Heart Study. Nutrients, 12, 1476.
10. Vasunilashorn SM, Dillon ST, Chan NY, & Fong TG (2021). Proteome-wide analysis using SOMAscan identifies and validates chitinase-3-like protein 1 as a risk and disease marker of delirium among older adults undergoing major elective surgery. The Journals of Gerontology: Series A, glaa326.
11. Chaturvedi AK, Kemp TJ, Pfeiffer RM, Biancotto A, Williams M, Munuo S, Purdue MP, Hsing AW, Pinto L, Mccoy JP, & Hildesheim A. (2011). Evaluation of multiplexed cytokine and inflammation marker measurements: A methodologic study. Cancer Epidemiology and Prevention Biomarkers, 20, 1902–1911.
12. Raffield LM, Dang H, Pratte KA, Jacobson S, Gillenwater LA, Ampleford E, Barjaktarevic I, Basta P, Clish CB, Comellas AP, Cornell E, Curtis JL, Doerschuk C, Durda P, Emson C, Freeman CM, Guo X, Hastie AT, Hawkins GA, & NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium. (2020). Comparison of proteomic assessment methods in multiple cohort studies. Proteomics, 20, e1900278.
13. Mayeux R. (2004). Biomarkers: Potential uses and limitations. NeuroRx, 1, 182–188.
14. White E. (2011). Measurement error in biomarkers: Sources, assessment, and impact on studies. IARC Scientific Publications, 143–161.
15. Kim CH, Tworoger SS, Stampfer MJ, Dillon ST, Gu X, Sawyer SJ, Chan AT, Libermann TA, & Eliassen HA (2018). Stability and reproducibility of proteomic profiles measured with an aptamer-based platform. Scientific Reports, 8, 8382.
16. Bao Y, Bertoia ML, Lenart EB, Stampfer MJ, Willett WC, Speizer FE, & Chavarro JE (2016). Origin, methods, and evolution of the three Nurses’ Health Studies. American Journal of Public Health, 106, 1573–1581.
17. Huang T, Trudel-Fitzgerald C, Poole EM, Sawyer S, Kubzansky LD, Hankinson SE, Okereke OI, & Tworoger SS (2019). The Mind–Body Study: Study design and reproducibility and interrelationships of psychosocial factors in the Nurses’ Health Study II. Cancer Causes & Control, 30, 779–790.
18. Rimm EB, Giovannucci EL, Willett WC, Colditz GA, Ascherio A, Rosner B, & Stampfer MJ (1991). Prospective study of alcohol consumption and risk of coronary disease in men. Lancet, 338, 464–468.
19. Hankinson SE, Willett WC, Manson JE, Hunter DJ, Colditz GA, Stampfer MJ, Longcope C, & Speizer FE (1995). Alcohol, height, and adiposity in relation to estrogen and prolactin levels in postmenopausal women. JNCI: Journal of the National Cancer Institute, 87, 1297–1302.
20. Hankinson SE, Willett WC, Manson JE, Colditz GA, Hunter DJ, Spiegelman D, Barbieri RL, & Speizer FE (1998). Plasma sex steroid hormone levels and risk of breast cancer in postmenopausal women. JNCI: Journal of the National Cancer Institute, 90, 1292–1299.
21. Townsend MK, Clish CB, Kraft P, Wu C, Souza AL, Deik AA, Tworoger SS, & Wolpin BM (2013). Reproducibility of metabolomic profiles among men and women in two large cohort studies. Clinical Chemistry, 59, 1657–1667.
22. Bland JM, & Altman DG (1996). Statistics notes: Measurement error proportional to the mean. BMJ, 313, 106.
23. Rosner B, Willett WC, & Spiegelman D. (1989). Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Statistics in Medicine, 8, 1051–1069; discussion 1071–1073.
24. Al-Delaimy WK, Natarajan L, Sun X, Rock CL, & Pierce JJ (2008). Reliability of plasma carotenoid biomarkers and its relation to study power. Epidemiology, 19, 338–344.
25. Assarsson E, Lundberg M, Holmquist G, Björkesten J, Thorsen SB, Ekman D, Eriksson A, Dickens ER, Ohlsson S, Edfeldt G, Andersson A-C, Lindstedt P, Stenvang J, Gullberg M, & Fredriksson S. (2014). Homogenous 96-Plex PEA immunoassay exhibiting high sensitivity, specificity, and excellent scalability. PLoS ONE, 9, e95192.
26. SomaLogic. (2022). SomaScan Assay v4.1. SomaLogic.
27. The Gene Ontology Consortium. (2019). The gene ontology resource: 20 years and still going strong. Nucleic Acids Research, 47, D330–D338.
28. The UniProt Consortium. (2019). UniProt: A worldwide hub of protein knowledge. Nucleic Acids Research, 47, D506–D515.
29. Rosner B. (2005). Fundamentals of biostatistics. Duxbury Press.
30. Ngo D, Sinha S, Shen D, Kuhn EW, Keyes MJ, Shi X, Benson MD, O’Sullivan JF, Keshishian H, Farrell LA, Fifer MA, Vasan RS, Sabatine MS, Larson MG, Carr SA, Wang TJ, & Gerszten RE (2016). Aptamer-based proteomic profiling reveals novel candidate biomarkers and pathways in cardiovascular disease. Circulation, 134, 270–285.
31. Shekelle RB, Shryock AM, Paul O, Lepper M, Stamler J, Liu S, & Raynor WJ Jr. (1981). Diet, serum cholesterol, and death from coronary heart disease. New England Journal of Medicine, 304, 65–70.
