Summary
The COVID-19 pandemic is an unprecedented global challenge, and point-of-care diagnostic classifiers are urgently required. Here, we present a platform for ultra-high-throughput serum and plasma proteomics that builds on ISO13485 standardization to facilitate simple implementation in regulated clinical laboratories. Our low-cost workflow handles up to 180 samples per day, enables high precision quantification, and reduces batch effects for large-scale and longitudinal studies. We use our platform on samples collected from a cohort of early hospitalized cases of the SARS-CoV-2 pandemic and identify 27 potential biomarkers that are differentially expressed depending on the WHO severity grade of COVID-19. They include complement factors, the coagulation system, inflammation modulators, and pro-inflammatory factors upstream and downstream of interleukin 6. All protocols and software for implementing our approach are freely available. In total, this work supports the development of routine proteomic assays to aid clinical decision making and generate hypotheses about potential COVID-19 therapeutic targets.
Keywords: high-throughput proteomics, mass spectrometry, SWATH-MS, COVID-19 infection, clinical classifiers, antiviral immune response
Graphical Abstract
Highlights
-
•
A standardized, ultra-high-throughput clinical platform for serum and plasma proteomics
-
•
Platform enables high precision quantification of 180 human proteomes per day at low cost
-
•
27 biomarkers are differentially expressed between WHO severity grades for COVID-19
-
•
Biomarkers include proteins not previously associated with COVID-19 infection
Messner et al. present a standardized, low-cost, ultra-high-throughput platform for serum and plasma proteomics designed for clinical use and apply it to a cohort of hospitalized COVID-19 patients. They identify 27 potential biomarkers that are differentially expressed depending on the WHO severity grade of COVID-19.
Introduction
The ongoing SARS-CoV-2 pandemic has highlighted the pressing need for technologies that can accelerate our understanding of emerging diseases in order to (1) find markers that define disease severity, have prognostic value, or define a specific phase of the disease; (2) identify preventive strategies; and (3) discover therapeutic targets. PCR-based diagnostics can be implemented and scaled quickly but do not provide information about severity of the disease as well as likely illness trajectories (Chen et al., 2020; Corman et al., 2020). Furthermore, conventional biomarker as well as serological assays depend on affinity reagents, such as antibodies. Developing these takes time and requires prior knowledge of epitopes and the disease mechanisms (Petherick, 2020). Indeed, the host response to each viral infection is significantly different, specifically, as several viruses can evade the host immune system (Bussey and Brinkmann, 2018; Kikkert, 2020). Each novel infective agent requires a new assessment of the host response, as well as requires a unique set of biomarkers for predicting disease trajectories.
Mass spectrometry (MS)-based proteomics does not depend on affinity reagents and can be set up in an untargeted fashion, such that it does not depend on prior knowledge of the disease. It can quickly deliver substantial amounts of clinical and biological information from accessible biological material, such as blood plasma or serum. MS-based proteomics hence has the potential to become an ideal technology to be applied in situations when rapid responses are required. Currently, MS-based proteomic workflows are well established in research laboratories, where they are routinely used for biomarker discovery and profiling (Bruderer et al., 2019; Geyer et al., 2016a, 2016b, 2017, 2019; Liu et al., 2015; Niu et al., 2019; Wewer Albrechtsen et al., 2018).
Increasingly, MS-based proteomics is also entering regulated clinical and diagnostic environments (He, 2019, Mischak et al., 2012, Crutchfield et al., 2016). It has the potential to yield complex and predictive biomarker signatures that support clinical decision making, as well as to enable the prediction of patient trajectories via machine learning methods with datasets of sufficient depth and size (Ahadi et al., 2020). However, in clinical practice, its potential is yet to be completely realized. For routine applications, MS-based proteomic methods must combine precision, reproducibility, and robustness with low cost and high throughput, such that results can be routinely compared within and between clinical studies and laboratories (Geyer et al., 2017, 2019; Nilsson et al., 2010; Wright and Van Eyk, 2017). These requirements might necessitate a compromise with proteomic depth, which has often been a key objective of proteomics in research settings, but which has been, in turn, often achieved through long measurement times and high cost (Bruderer et al., 2019; Geyer et al., 2016a; Liu et al., 2015). A further hurdle is that current MS-based proteomic workflows require, partially due to their dependency on chromatographic flow rates in the range of nanoliters or low microliters per minute, a high level of expert knowledge to achieve the level of necessary robustness for implementation in the clinical laboratory.
Here, we present a redesigned high-throughput MS platform that enables the cost-effective (less than 10€ for consumables per sample) in-depth analysis of disease susceptibility and progression in patients as well as biomarker discovery. Our platform is optimized at all steps from sample preparation, chromatography, and data acquisition to data processing, in comparison to existing research pipelines (Bache et al., 2018; Bekker-Jensen et al., 2020; Bian et al., 2020; Bluemlein and Ralser, 2011; Bruderer et al., 2019; Geyer et al., 2016a; Liu et al., 2015; Müller et al., 2020; Vowinckel et al., 2018). It includes a semi-automated sample preparation workflow that scales to high sample numbers through the use of liquid handling robotics and minimum hands-on-time and includes effective strategies to mitigate longitudinal batch effects. It also makes use of short-gradient high-flow liquid chromatography (LC), a technology that is the basis of several FDA-approved clinical assays (Grebe and Singh, 2011; Nair and Clarke, 2016), applicable to high-throughput proteomic experiments. Using this approach, we were able to reduce measurements to 5-min gradient length, inter-runtime to 3 min or less, as well as to use flow rates of 800 μL/min, thereby substantially increasing sample turnover and reducing costs, while increasing stability and precision.
We first benchmarked the platform on a cohort-based epidemiological study, Generation Scotland (GS) (Smith et al., 2013), and at considerably higher throughput, demonstrated a level of precision and consistency that, to our knowledge, is yet unachieved in comparable large-scale proteomic studies. We then employed this workflow in an immediate response to the SARS-CoV-2 pandemic outbreak in Germany, by applying it to a cohort that includes the first COVID-19 patients hospitalized at the Charité Universitätsmedizin Berlin. We measured samples from a primary exploratory cohort comprising thirty-one COVID-19 patients to identify clinical classifiers, candidate biomarkers as well as potential targets that picture the host immune response specific to a SARS-CoV-2 infection (Figure 4A). We validated these on a smaller cohort of seventeen independent patients and fifteen healthy volunteers (Table S1). We identified protein expression signatures that can classify COVID-19 patients according to WHO grading, introduced as of April 2020 (World Health Organization, 2020).
Figure 4.
Protein Signatures Indicate Clinical Severity in COVID-19
(A) Study design. 199 random individuals from the GS study were measured to assess the performance of the platform and to obtain a population baseline. Protein responses based on COVID-19 severity were obtained from a cohort of 31 hospitalized SARS-CoV-2 infected patients. Severity of COVID-19 was graded using the WHO ordinal outcome scale of clinical improvement (World Health Organization, 2020).
(B) PCA based on proteins found differentially expressed depending on COVID-19 severity. Median quantities across all time points were calculated for each patient and 29 proteins without missing values were used to generate the PCA plot (quantities were standardized). Cases with the severity “3” on the WHO scale (hospitalized, no oxygen therapy) are well separated from cases with the severity “7” along the first principle component, with “4”–“6” cases in between.
(C) Heatmap shows protein signatures that report on COVID-19 severity. Visualization was performed using the ComplexHeatmap R package (Gu et al., 2016). Black “squares” indicate missing values. Patients labeled with an asterisk (∗) had a fatal outcome of the disease.
(D) Proteins upregulated (top panel) and downregulated (lower panel) depending on COVID-19 severity (WHO grade; SS, standard serum; GS, Generation Scotland), as well as the population spread of the protein abundance in 199 randomly selected individuals of an independent cohort (Generation Scotland; GS). As the absolute quantities from the COVID-19 and GS studies cannot be compared directly (samples were obtained in a different manner), we simplified the visual assessment of the population spread, by normalizing by the median of GS quantities to the median of WHO grade 3 (no oxygen support) COVID-19 cases (the normalized values were used for illustration purposes only and not used for testing for statistical significance). The boxes show first and third quartile as well as the median (middle) and the whiskers extend to the most extreme data point, which is no more than 1.5 times the interquartile range from the box. Proteins upregulated with increasing severity of COVID-19: A1BG, ACTB;ACTG1, C1R (complement C1r), C1S (complement C1s), C8A (complement C8 alpha chain), CD14 (monocyte differentiation antigen CD14), CFB (complement factor B), CFH (complement factor H), CFI, CRP, FGA, FGB and FGG, HP, ITIH3, ITIH4, LBP, LGALS3BP, LRG1, SAA1, SAA1;SAA2, and SERPINA10; proteins downregulated with increasing severity of COVID-19: ALB, APOA1, APOC1, GSN, and TF.
Our analysis associates several proteins with COVID-19 severity that have not been previously associated with the infection and the host response. These are alpha-1B-glycoprotein (A1BG), beta and gamma-1 actin (ACTB;ACTG1), monocyte differentiation antigen and lipopolysaccharide co-receptor CD14, lipopolysaccharide-binding protein (LBP), galectin 3-binding protein (LGALS3BP), leucine-rich alpha-2-glycoprotein (LRG1), haptoglobin (HP), protein Z-dependent protease inhibitor (SERPINA10), apolipoprotein C1 (APOC1), gelsolin (GSN), and transferrin (TF). Our results highlight the role of complement factors, the coagulation system, several inflammation modulators as well as pro-inflammatory signaling both upstream and downstream of interleukin (IL)-6. In addition, our study provided evidence that proteomic signatures have the potential to outperform conventional clinical assays. Two individuals with differing proteomic signatures were identified through a clustering approach. In one case, a clinical re-assessment changed the diagnosis (the patient was in fact suffering from an influenza type B infection) and revealed in the other case a strong comorbidity caused by anti-cancer chemotherapy. None of the currently applied clinical tests spotted this situation.
In total, our study demonstrates the value and power of robust high-throughput MS in a global public health crisis. Very fast and reliable proteome technologies can play a vital role both in clinical classification as well as in the rapid identification of therapeutic targets against arising infecting agents.
Results and Discussion
A Platform for Clinical Proteomics Yields High Quantitative Precision at Low Costs and High Throughput
We addressed throughput, precision, costs, and practical hurdles in clinical implementation of MS-based proteomics, by designing a proteomics platform, in which we refined sample preparation, chromatography, mass spectrometric acquisition, and data analysis. Our workflow reaches a high level of standardization and documentation, for which ISO 13485 was used as a reference (Figure S1). After the transfer of the clinical plasma or serum samples, obtained with standard operating procedures, to 96-well plates, all pipetting and mixing steps are conducted with liquid handling robots. The workflow has a total hands-on time of around 3.5 h and is designed so that a single person using a single liquid handling unit can start and complete every day one or two 4-plate batches. Effectively four to eight plates containing up to 768 proteome samples exit the workflow every day and are ready for the mass spectrometric analysis.
Among several improvements in handling (see STAR Methods for details), the preparation workflow includes a simple but effective handling improvement that mitigates batch variation in sample preparation reagents, which has proven so far a major and prohibitive contributor to quantification inconsistencies in (large-scale) proteomic experiments (Fu et al., 2018; Lowenthal et al., 2014; Piehowski et al., 2013). Instead of pipetting reagents on the samples at each step, initial common stock solutions (urea and ammonium bicarbonate buffer, dithiothreitol, iodoacetamide, formic acid, and trypsin) are pre-filled into multi-well plates and are then stored at −80°C for whole projects in perpetuity. These plates enter the workflow at different stages (Figure 1), thereby not only reducing hands-on time but maintaining the exact same reagents for projects of, in theory, unlimited scale. A second key step concerns the cleanup of the digested peptides, which is done with 96-well solid-phase extraction plates (Bennike et al., 2018; Bruderer et al., 2019). In our workflow, four are processed in parallel to reduce technical variability. Finally, the inclusion of sample preparation controls on each plate enables cross-batch normalization, to correct batch effects in case these emerge at the sample preparation or the acquisition step of the workflow (Figure 1).
Figure 1.
A High-Throughput Proteomics Platform for Large-Scale and Longitudinal Clinical Proteomic Studies
(A) Experimental part of the workflow. Receipt and storage (green boxes): clinical or epidemiological samples are collected using a standard operating procedure, received, and stored at −80°C, then aliquoted to 96-well plates alongside control samples. For plasma and serum, 5 μL are processed and yield sufficient tryptic digest for five measurements on the high-flow rate LC-MS platform. Sample preparation (yellow boxes): the sample preparation workflow is designed for handling 384 samples per batch (four 96-well plates). Batch effects are mitigated by using pre-aliquoted stock solution plates—prepared for whole projects and stored at −80°C—that enter the workflow at different steps, as well as by using a liquid handling robot for pipetting and mixing. Sample cleanup is done with 384 samples/batch by using 96-well solid-phase extraction plates (BioPureSPE, the Nest Group) and a liquid handler for pipetting. The hands-on time for cleanup is <2 h and although the digestion is done overnight, the total hands-on time for the sample preparation is <3.5 h. Data acquisition (blue boxes): ultra-fast measurements of the digested samples are facilitated in 300-s chromatographic gradients using high-flow chromatography (800 μL/min) with a short reversed phase C18 column (50 × 2.1 mm, 1.8 μm particle size) to accelerate equilibration and washing steps. A 700 ms duty cycle, required to record sufficient data points per chromatographic peaks that elute at FWHM of about 3 s is achieved with an optimized SWATH data acquisition method. The theoretical throughput of data acquisition for one mass spectrometer is 180 samples/day.
(B) Data processing (red boxes): the analysis of the highly complex short-gradient DIA data is achieved with an optimized version (1.7.10) of DIA-NN (Demichev et al., 2020). DIA-NN is based on neural networks to enable confident peptide identification with fast gradients and achieves a throughput of >2,000 samples/day on a conventional PC. First, a spectral library is automatically “refined” using the dataset in question: only detectable peptide precursors are retained, and their reference spectra and retention times are replaced with empirically observed. Reanalysis with this refined library is then followed by batch correction and, finally, protein quantification using MaxLFQ (Cox et al., 2014). Abbreviations: ABC, ammonium bicarbonate; DTT, dithiothreitol; IAA, iodoacetamide; FA, formic acid.
Next, we developed a data acquisition scheme that could be implemented in regulated environments without major hurdles and maintains very low variability across large sample series. Here, we focused on the implementation of a high-flow chromatographic regime in a proteomic workflow. The current standard technique for bottom-up proteomics, nano-flow LC, used owing to its high sensitivity, is a main contributor to batch variability in LC-MS experiments (Gama et al., 2013; Shishkova et al., 2016). Several recent studies have shown that run-to-run variability improves by switching from nano-flow to micro-flow regimes or to specialized chromatographic devices that operate with pre-formed gradients. These allow faster runtimes and sample turnover, show better retention time stability, and improve column lifetime (Bache et al., 2018; Bian et al., 2020; Bruderer et al., 2019; Vowinckel et al., 2018).
Alternatively, high-flow LC (also known as analytical LC) in conjunction with very fast chromatographic gradients, a technology that reaches the requirements of regulated clinical laboratories, could further and substantially improve throughput and chromatographic properties. However, typically it has not been applied to short-gradient proteomics for two important reasons. First, on this type of fast chromatography, conventional mass spectrometric acquisition schemes do not reach sufficient sampling velocity in data-dependent mode (as peaks elute too fast). Second, when using data-independent acquisition (DIA) schemes, which do not sample each peak individually, conventional software cannot deconvolute the interference-rich short-gradient data produced (Demichev et al., 2020; Messner et al., 2019).
We overcame these issues and developed an acquisition scheme based on 5-min water to acetonitrile chromatographic gradients at a flow rate of 800 μL/min. Separating tryptic digests of non-depleted human plasma using an 1290 Infinity II UPLC (Agilent) or H Class UPLC systems (Waters) coupled to a TripleTOF 6600 (Sciex) mass spectrometer illustrates that total ion chromatograms (TICs) are virtually unchanged over repeated injections (Figure 2A). The nearly complete overlap of the TICs indicates not only the stability of the applied chromatographic system but also the stability of the electrospray, which is facilitated by the gases that assist the desolvation process in high-flow ion sources. By using a short 5-cm column and by increasing the flow rate post-gradient to 1,200 μL/min and 1,000 μL/min during washing and equilibration, respectively, we were able to reduce the total runtime, including overhead, to less than 8 min. In this particular test, the setup allows a theoretical throughput of 180 samples/day on a single mass spectrometer, an at least 5-fold improvement compared with microLC or nanoLC platforms optimized for throughput (Bruderer et al., 2019; Geyer et al., 2016a; Vowinckel et al., 2018). Moreover, the columns used in high-flow chromatography have higher capacities and thus are less prone to carryover. Indeed, blank injection after 10 acquisitions of non-depleted plasma tryptic digests shows no significant carryover even with an applied wash time of less than 1 min (Figure 2A).
Figure 2.
High-Flow LC and Its Application to Short-Gradient MS-Based Proteomics
(A) A tryptic digest of human blood plasma was injected 10 times. The peptides were separated with a 300 s linear water to acetonitrile chromatographic gradient using an Agilent 1290 Infinity II LC system coupled to a TripleTOF 6600 mass analyzer. The TICs of the first and last injection were overlaid and colored with blue and red, respectively. The time from the start of one run to the next was reduced to 8 min (including instrument overheads), which enables a throughput of ~180 samples/day. After the 10 plasma injections, water was injected and the TIC (black line) shows no significant carryover despite the short washing time.
(B) Extracted ion chromatograms of 5 synthetic peptides (AETSELHTSLK [m/z 408.55, black line], LDSTSIPVAK [m/z 519.80, orange line], ALENDIGVPSDATVK [m/z 768.90, blue line], AVYFYAPQIPLYANK [m/z = 883.47, green line], and TVESLFPEEAETPGSAVR [m/z 964.97741, red line]) from a synthetic peptide mixture (Pepcal, Sciex) as separated on the 300 s linear gradient. Chromatograms were extracted from TOF MS data, width = 0.1 Da.
(C) A tryptic digest of K562 human cell lines was separated with a 20-min linear gradient ramping from 3% ACN 0.1% FA to 36% ACN, 0.1% FA on high-flow (800 μL/min; C18 column 50 mm × 2.1, column length). Peak widths at FWHM of the eluting peptides were compared to a 20-min micro-flow run (5 μL/min; 15-cm column; Demichev et al., 2020), analyzed on the same mass spectrometer (Sciex TripleTOF 6600).
(D) Peak capacities (gradient length divided by FWHM) for 3, 5, 10, and 20-min linear gradients (3% ACN/0.1% FA to 36% ACN/0.1% FA) on high-flow compared with 20-min micro-flow chromatographic gradients (red dashed line).
The separation of a K562 cell line tryptic digest (Promega) at different gradient lengths illustrates the chromatographic properties achieved. The high-flow setup achieved a median peak full width at half maximum (FWHM) of 3 s with a 20-min gradient length. For comparison, an extensively optimized micro-flow LC (Demichev et al., 2020; Messner et al., 2019), achieved a FWHM of 5 s, at the same gradient length (Figure 2C). Furthermore, high-flow gradients as fast as 5 min resulted in peak capacities comparable to the highly optimized 20-min micro-flow setup (Demichev et al., 2020; Messner et al., 2019) (Figure 2D). In order to achieve a sufficiently fast mass spectrometric duty cycle, we used a q-TOF instrument with a very fast sampling rate (Schilling et al., 2017) and applied SWATH-MS, a DIA method specifically developed to minimize stochastic elements in data acquisition (Gillet et al., 2012; Ludwig et al., 2018). To record sufficient data points per chromatographic peak, we optimized the method for duty cycles of 700 ms and scan a precursor mass range of m/z 450–850 using 25 windows with variable window size (Table S6) and with 25-ms accumulation time.
In order to deconvolute the complex data recorded, we built on our recent developments of DIA-NN software, that includes several algorithms that boost the number of true positive precursor identifications in the short-gradient DIA-MS runs. DIA-NN can handle complex short-gradient data as it contains algorithms that correct for signal interferences and uses deep neural networks to assign confidence scores to peptide-spectrum matches and identify true positive signals (Demichev et al., 2020). Applied to analyzing 5 μg of human cell line (K562) tryptic digest, the short-gradient high-flow method yielded 2,829 unique proteins (at 1% FDR) in triplicate injections, while the numbers of unique proteins quantified with a coefficient of variation (CV) less than 20% and less than 10% were 1,873 and 1,349, respectively (Figure S2E). Hence, despite the ultra-high throughput and the use of high-flow chromatography, the analytical method is able to achieve proteomic depth even on complex samples. Finally, we improved the DIA-NN (version 1.7.10) workflow for the high-throughput processing of plasma and serum proteomes, through integration of the MaxLFQ protein quantification algorithm (Cox et al., 2014) in a DIA-NN R-package. Originally, MaxLFQ was designed for shotgun-MS studies but was recently introduced to DIA proteomics (Pham et al., 2020) and increased the quantification precision for serum and plasma proteomes.
Benchmarking Acquisition Depth, Quality, and Data Consistency in an Epidemiological Study
To assess the suitability of our high-throughput platform for human blood plasma and serum proteomics, we generated proteomes for undepleted serum samples derived from 199 random individuals that participated in the GS epidemiological study (Smith et al., 2013). GS is a family-based cohort of approximately 24,000 individuals in 7,000 family groups from across Scotland, aged between 18 and 98 (Smith et al., 2013). We also included a large number of commercial plasma (tebu-bio, 91 total) and serum (tebu-bio, 79 total) samples as quality controls for the sample preparation workflow, as well as repeated injections of a single sample every 11 samples (pooled from 32 prepared commercial serum samples, 39 total) as a control for the LC-MS performance. The sample preparation was done in four 96-well plates and the experiment involved 409 non-blank injections.
For interrogating the raw data, we made use of a high-quality spectral library (Bruderer et al., 2019) that was refined with DIA-NN based on the actual data (Figure 1B). Upon protein extraction and batch effect correction (STAR Methods), we assessed the robustness and consistency of protein identification achieved and herein illustrate the key quality parameters. The high-flow LC setup yielded exceptional retention time stability across the whole experiment (Figure 3A). As expected, due to the short gradients, the total proteomic depth (total number of peptides quantified) is lower than achieved with MS workflows that use pre-fractionation and longer gradients with lower flow rates and have a slower duty cycle to scan over a larger mass range. In undepleted plasma, these typically detect 250 to 450 proteins per injection (Bian et al., 2020; Bruderer et al., 2019; Geyer et al., 2016a; Liu et al., 2015). However, with an average of ~270 protein groups detected per injection and 311 in total, our much faster platform still covers most of the typical plasma proteins (Anderson and Anderson, 2002) also seen by the other methods.
Figure 3.
Robustness and Quantitative Precision of the Proteomic Platform Applied to a Population-Based Epidemiological Cohort
409 serum proteomes were analyzed for characterizing 199 participants of the GS study. The sample series are composed of 39 repeat injections (“QC”), 79 serum and 91 plasma commercial sample preparation controls, and 200 serum samples derived from the 199 participants of the GS study (“GS”).
(A) Overlaid aligned retention times (Biognosys iRT scale) of all peptide identifications in the whole experiment. Median iRT standard deviation (SD) was 0.22 (relative SD = 0.0009) and correlation between the observed iRT and library iRT was 0.99995, indicating very high retention time stability.
(B and C) (B) Numbers of peptide precursors and (C) unique proteins identified in control samples.
(D) Data completeness in the whole experiment plotted against the number of proteins identified. The data completeness for all 245 unique proteins was 87%, whereas 182 proteins were identified with data completeness 99%.
(E) PCA using consistently identified proteins (log-transformed quantities).
(F) The “serum” cluster on the PCA plot, with samples prepared on different 96-well plates colored differently. No bias between the plates can be detected.
(G) CV. After accounting for instrument drift, median CV values are 5.4% for replicate injections (“QC”), 7.6% for serum controls, 7.3% for plasma controls, and 25.6% for the participants' samples.
Indeed, this compendium of proteins quantified includes at least 44 FDA-approved protein biomarkers (Table S5). Moreover, for large-scale experiments, the numbers of consistently quantified peptides and proteins are more relevant than the maximum number of protein groups identified, as only consistent detection allows for quantitative comparison between individuals and is suitable for the development of clinical assays. We consistently identified around 3,000 peptide precursors (i.e., peptides ionized to a specific charge; Figure 3B) and 200 unique proteins (i.e., gene products identified with specific proteotypic peptides; Figure 3C) across all 409 proteome acquisitions. In total, we detected 311 protein groups, out of which 245 uniquely identified proteins were measured with 87% data completeness and with at least five peptides. Among these, 182 unique proteins were quantified with 99% data completeness (Figure 3D).
To assess the quantitative precision, we first illustrated the proteomic data using principal component analysis (PCA) (Figure 3E). The PCA data fully separate in PC1 all serum from plasma samples, and in PC2, the control serum samples from the GS serum samples. The difference between the GS samples and the commercial samples might be explained by different serum collection and/or storage procedures. Moreover, the biological variability across the randomly chosen individuals is much higher than the technical variability (spread of GS samples versus serum or plasma samples) and hence is detected with high confidence by our platform. We further examined in detail the “serum” cluster of points and did not detect any bias between different sample preparation plates (Figure 3F).
Finally, we evaluated the quantification precision by calculating the CV of protein quantities across the studies. Median values obtained are 5.4% for repeat reference sample injections (“QC”) after instrument drift correction, with high abundant proteins being measured with a less than 2% CV of protein quantities (Figure S2B). Serum controls and plasma controls were measured with 7.6% and 7.3% CV, reflecting the precision of the entire workflow including sample preparation, acquisition, and data analysis. These values are much lower than the biological variation detected across the randomly chosen GS participants; when this biological variation is expressed as a CV value, it corresponds to a variation of 25.6% (Figure 3G). The platform hence confidently identifies biological variability in large-scale serum proteomic experiments of randomly chosen and presumed healthy individuals. Indeed, to our knowledge, such high precision values (<2% for high abundant proteins, 5.4% CV for all proteins [in the LC-MS part of the workflow], 7.3% for the entire workflow including sample preparation over processing 409 proteomes) have not been achieved to date in comparable large-scale proteomic studies.
Rapid and Precise High-Flow Rate Proteomics Identifies Biomarkers for COVID-19
We applied the developed workflow for the analysis of serum and citrate plasma samples for two independent COVID-19 cohorts that included patients who were among the first that were hospitalized at Charité Universitätsmedizin Berlin, between March 1st and March 26th 2020. Thirty-one SARS-CoV-2 infected patients were included in the exploratory cohort to identify biomarkers (Figure 4A; Table S1). 11/31 (35%) patients were female and 20/31 (65%) were male, median age was 54 years (range 21–81). Severity of COVID-19 was graded using the WHO ordinal outcome scale of clinical improvement (score 3 = hospitalized, no oxygen therapy; score 4 = oxygen by mask or nasal prongs; score 5 = non-invasive ventilation or high-flow oxygen; score 6 = intubation and mechanical ventilation; score 7 = ventilation and additional organ support) (World Health Organization, 2020) (Table S2). Four patients (13%) died from COVID-19, and 4 patients remain hospitalized at the time of writing. All other patients have been discharged in good health from hospital. Successively, a control group, consisting of 15 healthy volunteers and 17 further patients suffering from COVID-19, was recruited at the same hospital and used for validation of the biomarkers discovered (Table S1).
Because of the rapid action required in an early phase of a pandemic, we sampled depending on the first patients hospitalized (i.e., there was no other inclusion criteria than a hospitalization due to a SARS-CoV-2 infection). Such a cohort certainly differs from a long-term planned epidemiological cohort such as the GS study. With samples collected as part of the hospital routine by different medical professionals, the level of sample variability is expected to be higher. Moreover, it is difficult to assemble a control cohort that is matched for the key confounding factors, like age. Nonetheless, our proteomic platform yielded only slightly inferior values compared to the ideal-case scenario of the GS study. In the exploratory cohort, which included 104 serum samples obtained from the 31 of the earliest COVID-19 patients, we quantified 297 protein groups among which 229 unique proteins were detected with 75% data completeness. We surmise that this somewhat lower data completeness was caused by the massive changes in levels of a number of proteins upon severe SARS-CoV-2 infection as well as by the decrease in total serum protein content, also observed previously in patients hospitalized in ICU units (Nie et al., 2020). To account for this biological limitation, we applied very strict filtering to the dataset, namely, we only tested for differential abundance of proteins which had at least five different peptide precursors identified at least in one of the acquisitions.
We identified 37 protein groups with either increasing or decreasing levels, depending on the severity of the disease (0.05 significance, multiple testing corrected, Theil-Sen test against the WHO severity score; see STAR Methods for testing methodology) (Figure S7). Next, to validate the biomarkers, we processed the validation cohort (Table S1) and recorded 96 proteomes in triplicates for 15 healthy volunteers and 17 COVID-19 patients (Table S1). The experiment quantified 319 protein groups among which 248 unique proteins were detected with 85% data completeness. Despite being conducted on a different matrix (citrate plasma), this independent study confirmed 27 of the protein groups with either increasing or decreasing levels (A1BG, ACTB;ACTG1, ALB, APOA1, APOC1, C1R, C1S, C8A, CD14, CFB, CFH, CFI [complement factor I], CRP, FGA, FGB, FGG [fibrinogen alpha, beta, and gamma chains], GSN, HP, ITIH3 [inter-alpha-trypsin inhibitor heavy chain 3], ITIH4 [inter-alpha-trypsin inhibitor heavy chain 4], LBP, LGALS3BP, LRG1, SAA1 [serum amyloid A1], SAA1;SAA2 [serum amyloid A1 and A2 protein group], SERPINA10, TF; 0.05 significance, multiple testing corrected, Figure S4). This set of proteins thus represents potential biomarkers of disease severity. Out of the remaining 10 proteins, 9 (AGT [Angiotensinogen], AZGP1 [Zinc-alpha-2-glycoprotein], C2 [Complement C2], C7 [Complement component C7], C8B [Complement component C8 beta chain], CLU [Clusterin], CPN1 [Carboxypeptidase N catalytic chain], PLG [Plasminogen], and VTN [Vitronectin]) did not reach statistical significance in the smaller validation group, whereas the IGHG2;IGHG3 (Immunoglobulin Heavy Constant Gamma 2 and 3) protein group showed the opposite trend. We illustrate the quantitative variability of the validated biomarkers for COVID-19 severity on a heatmap (Figure 4C) and as boxplots (Figures 4D) as well as summarize their potential connection to COVID-19 in Table S3.
To exclude the possibility that concentration changes in these markers are due to frequent confounders, like age, we have plotted the variability of the same proteins in the GS cohort, with age spanning from 37 to 79, on samples that have been collected before the COVID-19 outbreak. We note that for identified proteins the change between mild and severe COVID-19 substantially exceeds the variation seen in the general population (Figure 4D). Moreover, plotting the protein abundance values against the age did not reveal significant correlations across the GS population baseline (Figure S8).
A PCA categorizes the individuals according to the severity of COVID-19 (Figure 4B). This shows that plasma proteomes, as measured with our platform, suit as clinical classifiers. Moreover, we illustrate the concentration changes of the biomarkers validated in the control cohort, upon grouping of the patients according the WHO severity criteria, ranging from scale 3 (hospitalized, no oxygen therapy) to the most critical (scale 7) in a heatmap (Table S1 for the grading of each patient), which graphically illustrates how level changes in these proteins reflect a progression from mild to severe COVID-19 (Figure 4C). Of note, WHO criteria consider patients up and including category 4 as “mild,” a classification that orients on the situation, that until this point, clinical care does not include invasive treatments that are difficult to provide, like intubation. However, our proteome data indicate the most substantial changes between categories 3 and 4, upon which a patient is put on oxygen supply. Our unbiased analysis hence indicates that at a molecular level, the requirement of oxygen supply coincides with a progression to severe disease.
Case studies indicate the clinical utility of the proteome signatures. First, two patients (#13 and #32) in the group of critical patients, which later died from COVID-19, clearly clustered in the heatmap and indeed had one of the most pronounced proteomic signatures (Figure 4C), implying that proteomics bears the potential to support predictions of clinical trajectories. Moreover, two case studies indicate that systematically recorded proteomes provide information that is beyond that of currently applied clinical assays. When we first obtained the dataset, two individuals (patients 4 and 6) initially clinically assessed as severe (Kurth et al., 2020) clustered with patients that suffered from the mild form of COVID-19 (Figure S3). These results triggered a retrospective assessment, which revealed that patient 4 turned out to be suffering from type B influenza rather than a SARS-CoV-2 infection, whereas patient 6 was classified as severe due to recent chemo-immunotherapy due to a hematological malignancy, applied just ten days before his COVID-19-related hospitalization. We have, as a consequence, excluded patient 4 from the COVID-19 dataset (all results shown in Figure 4). Both clinical case findings indicate however a high prognostic precision of the proteomic biomarker signatures and demonstrate that proteomes may very well outperform conventional clinical assays: no current clinical assays would have identified this type of outliers in the clinical assessment of the two individuals.
Proteome Signatures of Inflammation and Acute-Phase Response in Severe COVID-19
Each virus triggers its own host response, with some effectively evading the innate immune system (Kikkert, 2020). Some viral infections are hence causing only minor inflammatory responses, for example, HIV or many herpesviruses (Beachboard and Horner, 2016; Ongrádi, 2016; Sauter and Kirchhoff, 2016). Other viruses elicit dramatic inflammation and dysregulated coagulation, for example hemorrhagic fevers, e.g., Ebola virus (Baseler et al., 2017). The host immune response to SARS-CoV-2 is so far largely unknown. We detected consistent activation of both the classical complement pathway (C1R, C1S, and C8A) as well as the alternative pathway factor B (CFB) and the complement modulators: factors I (CFI) and H (CFH). Other differentially expressed proteins included the common acute-phase reactants, such as C-reactive protein (CRP) (upregulated), albumin (ALB) (downregulated), or serum amyloid proteins SAA1 and SAA2 (upregulated).
We observed upregulation of a number of proteins implicated in IL-6 signaling (Figure 4; Table S3). In addition to SAA1 and SAA2, these include inter-α-trypsin inhibitor heavy chain 4 (ITIH4), which plays an important role in extracellular matrix organization and is implicated in inflammation (Bost et al., 1998; Yang et al., 2012), HP, an acute-phase response protein (Jain et al., 2011), LRG1, a promoter of cell proliferation and angiogenesis, implicated in local inflammation and fibrosis (Honda et al., 2017), monocyte differentiation antigen CD14, primarily involved in bacterial LPS recognition (Kielian and Blecha, 1995), and the LBP, as well as LGALS3BP, a pro-inflammatory factor, which is known to induce IL-6 expression (Silverman et al., 2012). Thus, the proteomic approach surprisingly revealed a very IL-6-centered response. With the caveat that the depth of plasma proteomes is limiting, we found little evidence of involvement of other common inflammatory mediators (e.g., tumor necrosis factor (TNF), interferon (IFN) gamma, and their targets). Our study thus puts further emphasis on the importance of studying IL-6 function in relationship to new proteins and as therapeutic and diagnostic candidates.
We also observe upregulation of fibrinogen, a coagulation factor, and SERPINA10, an inhibitor of the F10a coagulation factor, further highlighting the importance of coagulation in SARS-CoV-2 infection, established by previous observations of elevated coagulation in severe COVID-19 cases (Zhou et al., 2020a).
A parallel study by (Shen et al., 2020) used a more conventional and time consuming (120× pre-fractionation consolidated in 40 fractions with TMT-16 plex) proteomics method to characterize plasma samples from 99 study participants, including 46 samples from patients with COVID-19 diagnosed in China. Despite the different cohorts and different technologies used, the proteomes implicate similar biological mechanisms in the differentiation of mild, severe, or critical disease progression. Many of the proteins that differentiate the groups in both studies belong to the complement system, acute-phase, and inflammatory response. For example, both studies agree with independently conducted clinical investigations on a number of differentially expressed proteins, in particular, ALB, the complement factors, serum amyloid proteins, ITIH3 and ITIH4 (Nie et al., 2020; Shen et al., 2020). Despite these similarities, we note important differences. For instance, we cannot confirm the downregulation of pro-platelet basic protein (PPBP) and platelet factor 4 (PF4) in severe COVID-19 as highlighted by Shen et al. We can offer several potential explanations. We note that different SARS-CoV-2 clades might exhibit different degrees of pathogenicity (Yao et al., 2020) and thus elicit different physiological responses, especially in different populations. However, when examining the response of PF4 and PPBP at the peptide level, we discovered that although several peptides maintain a relatively stable level, other PF4- and PPBP-specific peptides increase or decrease in concentration (Figure S5). This situation might indicate that instead of being differentially expressed, PF4 and PPBP might be differentially post-translationally modified.
Plasma Proteomes Provide Insights into the Virulence Mechanisms and Potential Therapeutic Targets for COVID-19
Extensive worldwide efforts have been directed recently into finding drug targets for COVID-19 treatment. As most of the damage associated with severe SARS-CoV-2 infection appears to be indirect and caused by excessive inflammation in the lungs, it is of crucial importance to seek opportunities not only to target the pathway of entry and the replication mechanism of the virus but to also identify and examine the possibilities for targeting host factors responsible for harmful inflammatory responses to both alleviate the severity of the infection and to reduce the chance of long-lasting complications (Huang et al., 2020). Some preliminary results in that direction appear promising. For example, pro-inflammatory signaling via IL-6 has been determined to be a marker of severe COVID-19 (Chen et al., 2020; Ruan et al., 2020) and preliminary results on the inhibition of IL-6 receptor (IL-6R) with tocilizumab seem to indicate clinical improvement (Coomes and Haghbayan, 2020). Here, we discovered several proteins that are differentially expressed with the severity of COVID-19 that are linked to IL-6-mediated proinflammatory cytokine signaling: (1) the CD14-LBP LPS recognition system, (2) upregulation of LRG1, an angiogenesis and anti-apoptotic factor associated with inflammation, and (3) upregulation of the LGALS3BP, an inducer of IL-6. Below, we discuss the potential significance of these findings.
Coronaviruses are known to actively disrupt the host immune response (Enjuanes et al., 2016). For example, their papain-like proteases (PLPs) act as interferon antagonists (Niemeyer et al., 2018), causing delayed type-I interferon response, macrophage-mediated inflammation, and lung damage (Channappanavar et al., 2016). Here, we observed upregulation of both monocyte differentiation antigen CD14 (~2.7×) and LBP (~3.9×) in severe COVID-19. As the response to bacterial LPS is one of the primary functions of both CD14 and LBP, which can act in complex to sensitize toll-like receptor-mediated LPS recognition (Ranoa et al., 2013), this observation reflects the dysregulation of the innate immune response by SARS-CoV-2, leading to the activation of the anti-bacterial defense and sensitization to LPS, thus contributing to excessive inflammation, with effects likely more pronounced in case of a concomitant secondary bacterial infection. Interestingly, CD14 and LBP upregulation has been observed in viral pneumonia before (Van Gucht et al., 2005), while CD14 is one of the primary mediators of lung inflammation (Anas et al., 2010). Of note, the glycosylphosphatidylinositol (GPI)-anchored form of CD14 is primarily displayed by monocytes and macrophages (Marcos et al., 2010), while the proportion of CD14+CD16+ inflammatory monocytes in the peripheral blood increases along with COVID-19 severity (Zhou et al., 2020b). At the same time, IL-6 induces soluble CD14 production in the liver (Bas et al., 2004) as well as, along with other cytokines, release of CD14 from monocytes upon their activation (Shive et al., 2015). Given that CD14 is a potent activator of pro-inflammatory cytokine signaling (Zanoni and Granucci, 2013), it might present a potential therapeutic target for COVID-19.
LRG1 is another pro-inflammatory factor induced by IL-6 (Shirai et al., 2009), which is known to promote angiogenesis and cell proliferation, while inhibiting apoptosis (Meng et al., 2016; Naka and Fujimoto, 2018; Wang et al., 2013). Some recent findings indicate its role in promoting skin fibrosis and lung fibrosis in transforming growth factor beta (TGF-β)-mediated fashion (Gao et al., 2019; Honda et al., 2017). Given the previously reported association of the Middle East respiratory syndrome (MERS) infection with lung fibrosis (Zhao et al., 2008) and emerging reports (British Thoracic Society, 2020) of lung fibrosis in a substantial proportion of COVID-19 survivors, we hypothesize that the ~2.1× elevation of serum LRG1 levels we observe in critical COVID-19 cases in comparison with the mild cases might indicate the increased risk of fibrosis, highlighting LRG1 as yet another potential therapeutic candidate for COVID-19 treatment. Furthermore, we detected about ~3.4× upregulation of LGALS3BP, which is known to induce the expression of IL-6 by stromal cells in galectin-3-dependent manner (Silverman et al., 2012). Of note, Galectin-3 has long been considered an attractive drug target in combating various forms of TGF-β-mediated fibrosis and pathological inflammatory conditions (Brinchmann et al., 2018; Mackinnon et al., 2012; Shen et al., 2018; Yu et al., 2013). Inhibition of galectin-3-mediated signaling pathways hence represents another potential therapeutic target against COVID-19.
Tissue Injury and Dysregulation of Modulators of Inflammation
We report substantially decreased (~2.6×) levels of GSN (Figures 4C and 4D). Plasma GSN is a part of the extracellular actin scavenger system (EASS), which removes toxic F-actin filaments that have been released from necrotic cells to the bloodstream (Piktel et al., 2018). Low levels of plasma GSN are associated with inflammation: it is believed that GSN is recruited to the sites of tissue injury to handle the released actin, depleting its plasma levels. Interestingly, we do observe the increase in serum actin concentration (beta and gamma-1 actin, ~2×), indicative of tissue injury (DiNubile, 2008), which could explain the GSN depletion from the blood. Importantly, plasma GSN is a powerful modulator of inflammation, which carries a protective function (DiNubile, 2008; Li et al., 2012). Low plasma GSN is a marker of poor prognosis in various pathological conditions, including diabetes (Khatri et al., 2014), cancers (Asare-Werehene et al., 2019; Stock et al., 2015), and sepsis (Lee et al., 2007), leading to suggestions and animal tests for its therapeutic use. Going forward, it will be important to assess GSN levels in at-risk populations for severe COVID-19, e.g., patients with diabetes. Of note, treatment with GSN has been observed to decrease IL-6 levels in mice (Cheng et al., 2017) and has been suggested to promote epithelial repair (Wittmann et al., 2018). The development of therapies to stabilize the GSN levels could hence be of direct therapeutic value for treating COVID-19.
Notably, we observed a decrease in the expression levels of APOA1 (APOA1; ~3×). APOA1 is a major component of the high-density lipoprotein (HDL) complex, which is a modulator of innate immune response and inflammation (Fotakis et al., 2019; Gordon et al., 2011; Macpherson et al., 2019; White et al., 2017). We also observe decreased levels of APOC1 (~3.2×), a component of several lipoprotein complexes (Fuior and Gafencu, 2019). Based on the GS study, we note that the serum levels of APOA1 are correlated with those of HDL cholesterol (Figure S6). Although decreased APOA1 levels have been observed in systemic inflammatory response (Kumaraswamy et al., 2012; Sirniö et al., 2017), including in COVID-19 (Nie et al., 2020), a potential explanation of the downregulation we observe here would also be provided, if naturally lower APOA1 and hence a different metabolic condition of the individual were associated with a higher risk of severe SARS-CoV-2 disease progression.
In conclusion, SARS-CoV-2, SARS, and MERS constitute a class of emerging coronaviruses of high public health concern. It is likely that other viruses will emerge in the future for which at time of outbreak insufficient biochemical knowledge will be available to identify biomarkers and to define point-of-care clinical classifiers. Serum and plasma proteomics can present valuable and unbiased information about disease progression and therapeutic candidates, without prior knowledge about the etiologies and biomolecules involved. We present a workflow for rapid and large-scale clinical proteomics that is re-designed in comparison to previous platforms. The sample preparation workflow scales to high sample numbers, enables high quantification precision, and reduces batch effects for large-scale and longitudinal studies, while the data acquisition and processing workflow is able to exploit the advantages of high-flow chromatography in short-gradient proteomics. Our platform improves throughput, data quality, and greatly simplifies implementation in regulated laboratories, as it builds on ISO13485 standardization as a reference. We demonstrate a quantification precision and acquisition robustness that, to our knowledge, has not previously been shown in large-scale proteomic experiments. We then applied the technology to a cohort of early hospitalized cases of the SARS-CoV-2 pandemic. We identify a series of proteins that are differentially expressed depending on the severity of COVID-19 and demonstrate the potential of proteome signatures to act as clinical classifiers. The proteome signatures capture the host response to COVID-19 infection, highlighting the role of complement factors, the coagulation system, and indicate a high specificity of several inflammation modulators as well as pro-inflammatory signaling both upstream and downstream of IL-6. The proteomic signatures and biomarkers identified pave the way for the development of routine assays to support clinical decision making, as well as provide hypotheses about potential COVID-19 therapeutic targets.
STAR★Methods
Key Resources Table
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Biological Samples | ||
Human Serum | Sigma-Aldrich | Cat# S7023-50MB |
Human Plasma (EDTA, Pooled Donor) | Genetex | Cat# GTX73265 |
Human Serum, Normal off-the-Clot, Frozen | tebu-bio | Cat# 088SER |
Human Recovered Plasma, Pooled- frozen K2EDTA |
tebu-bio | Cat# 088SER-PLP200-EDTA |
Chemicals, Peptides, and Recombinant Proteins | ||
Water (Optima LC-MS Grade, Fisher Chemical) | Fisher Scientific | Cat# W6500 |
Acetonitrile (Optima LC-MS Grade, Fisher Chemical) | Fisher Scientific | Cat# A955-500 |
Methanol (Optima LC-MS Grade, Fisher Chemical) | Fisher Scientific | Cat# A456-212 |
DL-Dithiothreitol (BioUltra) | Sigma-Aldrich | Cat# 43815 |
Iodoacetamide (BioUltra) | Sigma-Aldrich | Cat# I1149 |
Ammonium Bicarbonate (Eluent additive for LC-MS) | Sigma-Aldrich | Cat# 40867 |
Urea (puriss. P.a., reag. Ph. Eur.) | Honeywell Research Chemicals | Cat# 33247H |
Acetic Acid (Eluent additive for LC-MS) | Honeywell Research Chemicals | Cat# 49199 |
Trypsin (Sequence grade) | Promega | Cat# V511X |
Mass Spec-Compatible Human Extract | Promega | Cat# V6951 |
Retention time peptides Biognosys iRT kit | Biognosys | Cat# Ki-30002-b |
MS synthetic peptide calibration kit | SCIEX | Cat# 5045759 |
Deposited Data | ||
Raw data (commercial plasma and serum control samples within the GS study) | This study | Pride:PXD018874 |
Software and Algorithms | ||
Proteomics data analysis via Deep Neural Networks, DIA-NN |
Demichev et al., 2020 | https://github.com/vdemichev/DiaNN |
Spectronaut 13 (Version 13.12.200217.43644) | Biognosys | Product number: Sw-3001 |
PeakView (Version 2.2) | SCIEX | N/A |
DIA-NN R package | Demichev et al., 2020 | https://github.com/vdemichev/diann-rpackage |
ComplexHeatmap R package | Gu et al., 2016 | https://github.com/jokergoo/ComplexHeatmap |
EnvStats R package | Millard, 2014 | https://CRAN.R-project.org/package=EnvStats |
Zoo R package | https://CRAN.R-project.org/package=zoo | |
Other | ||
Zorbax RRHD Eclipse Plus 95A C18, 2.1 x 50mm, 1.8 um, 1200 bar | Agilent | Cat# 959757-902 |
BioPureSPE Macro 96-Well, 100mg PROTO 300 C18 |
The Nest Group, Inc. | HNS S18V-L |
Resource Availability
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Markus Ralser (markus.ralser@charite.de).
Materials Availability
This study did not generate new materials.
Data and Code Availability
The raw data of the acquired commercial plasma and serum control samples within the GS study have been deposited to the ProteomeXchange Consortium via PRIDE (Perez-Riverol et al., 2019) partner repository with the dataset identifier Pride:PXD018874. According to the terms of consent for GenerationScotland participants, access to individual-level data (omics and phenotypes) must be reviewed by the GS Access Committee. Applications should be made to access@generationscotland.org. The DIA-NN software suite and DiaNN R package are open source and are freely available for download at https://github.com/vdemichev/DiaNN and https://github.com/vdemichev/diann-rpackage respectively.
Experimental Model and Subject Details
Clinical Samples of COVID-19 Patients
Sampling was performed as part of the Pa-COVID-19 study, a prospective observational cohort study assessing pathophysiology and clinical characteristics of patients with COVID-19 at Charité Universitätsmedizin Berlin (Kurth et al., 2020). All patients with SARS-CoV-2 infection proven by positive PCR from respiratory specimens and willing to provide written informed consent are eligible for inclusion. Exclusion criteria are refusal to participate in the clinical study by patient or legal representative or clinical conditions that do not allow for blood sampling. The study assesses epidemiological and demographic parameters, medical history, clinical course, morbidity and quality of life during hospital stay of COVID-19 patients. Moreover, serial high-quality bio-sampling consisting of various sample types with deep molecular, immunological and virological phenotyping is performed. Treatment and medical interventions follow standard of care as recommended by current international and German guidelines for COVID-19. Severity of illness in the present study follows the WHO ordinal outcome scale (Tables S1 and S2). The Pa-COVID-19 study is carried out according to the Declaration of Helsinki and the principles of Good Clinical Practice (ICH 1996) where applicable and was approved by the ethics committee of Charité- Universitätsmedizin Berlin (EA2/066/20).
Generation Scotland Study
199 serum samples from random individuals that participated in the Generation Scotland (GS) epidemiological study (Smith et al., 2013) were used. GS is a family-based cohort of approximately 24,000 individuals in 7,000 family groups from across Scotland, aged between 18 and 98 (Smith et al., 2013). All components of Generation Scotland received ethical approval from the NHS Tayside Committee on Medical Research Ethics (REC Reference Number: 05/S1401/89). All participants provided broad and enduring written informed consent for biomedical research. Generation Scotland has also been granted Research Tissue Bank status by the East of Scotland Research Ethics Service (REC Reference Number: 15/0040/ES), providing generic ethical approval for a wide range of uses within medical research. This study was performed in accordance with the Helsinki declaration.
Method Details
Plasma and Serum Sample Preparation
The protocol was designed for preparing four 96-well plates in parallel and that a single person using a single liquid handling unit can start and complete every day up to two 4-plate batches. The total hands-on time per batch is 3.5 hrs only and the workflow fits within the 8hr time-window. All liquid transfer and mixing except the addition of serum/plasma to the denaturing buffer was carried out by the liquid handling robot, either a Beckman Coulter Biomek NXp (Crick laboratory) or Biomek i7 liquid handling robot (Charité Universitätsmedizin Berlin). There are slight differences between the protocols due to the two different laboratories. To our knowledge, these have no detectable influence on the results. Where applicable these differences are indicated as “Biomek NXp” or “Biomek i7 protocol”, respectively.
Before starting the sample preparation, 96-well plates were prefilled with Trypsin (12.5μl, 0.1μg/μl solution; four plates/batch), denaturation/reduction buffer (55μl 8M Urea, 100mM ammonium bicarbonate (ABC) and 4.5mM dithiothreitol (DTT) (Biomek NXp protocol) or 50mM DTT (Biomek i7 protocol); four plates/batch) and iodoacetamide (IAA) (100mM, > 20 μl, one plate/batch) and stored sealed at -80°C until the day of the experiment. These stock solutions are thawed/brought to room temperature just before adding them to the sample, which prevents evaporation.
5μl of thawed serum/plasma samples were transferred to the pre-made denaturation/reduction stock solution plates. Subsequently the plates were centrifuged for 15s at pulse setting (Eppendorf Centrifuge 5810R), mixed and incubated at 30°C for 60 minutes. The mixing in this step was done either 30s at 1000rpm on a Thermomixer (Eppendorf Thermomixer C) (Biomek NXp protocol) or by resuspension (Biomek i7 protocol). 5μl IAA was then transferred from the respective stock solution plate to the sample plate and incubated in the dark at 23°C for 30 minutes before dilution with 100mM ABC buffer (340μl). 220μl of this solution was transferred to the pre-made trypsin stock solution plate and incubated at 37°C for 17 h (Memmert IPP55 incubator). The trypsin/total protein ratio was ~1/40 as this provided high and reproducible identification numbers (Figure S2D). The digestion was quenched by addition of formic acid (10% v/v, 25μl). The digestion mixture was cleaned-up using C18 96-well plates (BioPureSPE Macro 96-Well, 100mg PROTO C18, The Nest Group). For the solid phase extraction, 1 minute of centrifugation at the described speeds (Eppendorf Centrifuge 5810R) was used to push the liquids through the stationary phase and the liquid handler was used to pipette the liquids onto the material in order to make four 96-well plates/batch feasible. The plates were conditioned with methanol (200μl, centrifuged at 50g), washed twice with 50% ACN (200μl, centrifuged at 150g and flow through discarded), equilibrated twice with 0.1% FA (200μl, centrifuged at 150g and flow through discarded). Then 200μl of digested and quenched samples were loaded (centrifuged at 150g), washed twice with 0.1% FA (200μl, centrifuged at 150g). After the last washing step, the plates were centrifuged another time at 200g before the peptides were eluted in 3 steps with 110μl 50% ACN (200g) into a collection plate (1.1ml, Square well, V-bottom). Collected material was completely dried on a vacuum concentrator (Eppendorf Concentrator Plus (Biomek NXp protocol) or Fisher Scientific, SPD300P1 (Biomek i7 protocol) and redissolved in 50μl 1% ACN, 0.1% formic acid (Biomek NXp protocol) or 50μl 0.1% formic acid (Biomek i7 protocol), then stored at -80°C until data acquisition. The samples for the SARS-CoV-2 studies were analysed without freezing. QC samples for repeat injections were prepared by pooling commercial serum samples and were spiked with iRT peptides (Biognosys).
Liquid Chromatography-Mass Spectrometry Setup
Liquid chromatography was established on two complementary and exchangeable ultra-high-pressure high-flow LC-MS systems, an Agilent 1290 Infinity II (Crick laboratory) and Waters H-Class (Charité Universitätsmedizin Berlin) system, both coupled to a TripleTOF 6600 mass spectrometer (SCIEX) equipped with IonDrive Tubo V Source (Sciex). In both cases, the peptides were separated in reversed phase mode using a C18 ZORBAX Rapid Resolution High Definition (RRHD) column 2.1mm x 50mm, 1.8μm particles at a column temperature of 30°C. A linear gradient was applied which ramps from 3% B to 36% B in 5 minutes (Buffer A: 0.1% FA; Buffer B: ACN/0.1% FA) with a flow rate of 800μl/min. For washing the column, the organic solvent was increased to 80% B in 0.5 minutes and was kept for 0.2 minutes at this composition before going back to 1% B in 0.3 min. The equilibration times were 2.8 minutes (Water H Class protocol) or 4.2 minutes (Agilent Infinity II protocol). Data was acquired in high sensitivity mode and the amount of total proteins injected was 5μg (GS study) and 10μg (SARS-CoV-2). The sample load was optimised (Figure S2A) and is a balance between how frequently one needs to clean the instrument and the identification numbers. The DIA/SWATH method consisted of an MS1 scan from m/z 100 to m/z 1500 (20ms accumulation time) and 25 MS2 scans (25ms accumulation time) with variable precursor isolation width covering the mass range from m/z 450 to m/z 850 (Table S6). Ion source gas 1 (nebulizer gas), ion source gas 2 (heater gas) and curtain gas were set to 50, 40 and 25 respectively. The source temperature was set to 450°C and the ion spray voltage to 5500V.
Quantification and Statistical Analysis
Mass Spectrometry Data Processing, Batch Correction and Quality Control
The raw data were processed using DIA-NN 1.7.10 in the “robust LC (high precision)” mode with RT-dependent median-based cross-run normalisation enabled. MS2 and MS1 mass accuracies were set to 20 and 12 ppm, respectively, and scan window size set to 6. Although DIA-NN can optimise such parameters automatically, we fixed them to these values to ensure comparability. For all the experiments, we used a project-independent public spectral library (Bruderer et al., 2019). Human UniProt (UniProt Consortium, 2019) isoform sequence database (3AUP000005640) was used to annotate the library. The library was first automatically refined based on the dataset in question at 0.01 global q-value (using the “Generate spectral library” option in DIA-NN). DIA-NN performs such refinement by finding the highest-scoring identification for each library precursor, across all runs in the experiment, and then replacing the library data with the empirically observed spectrum and retention time. The purpose of the refinement step is twofold: (i) retain only those peptide precursors in the library that are detectable in the experiment of interest; (ii) make sure that the library spectra and retention times are optimised specifically for the experimental setup in question, thus improving identification performance. The refined library was then used to reanalyse the data. The resulting report was stringently filtered at 0.01 precursor-level q-value, 0.005 precursor-level library q-value and 0.05 protein group-level q-value. Intra-batch correction was performed for each peptide precursor separately: based on repeat injection controls in “GS” and sample preparation controls in the coronavirus cohorts. Log-transformed peptide precursor quantities were adjusted using linear regression (“GS”) or running median smoothing (coronavirus cohorts). Linear regression was applied only for at least 10 data points. If the p-value for non-zero slope was below 0.01, the slope multiplied by the centered injection number (i.e. the injection number in the given run minus its mean value in all runs) was subtracted from the precursor quantities for all runs. Running median smoothing was performed in two steps using the ‘runmed’ function with the algorithm set to ‘Stuetzle’. First, a 5-point running median smoothing was performed on the quantities in the control samples, to remove outliers. Second, interpolation of the resulting values to all runs was performed using the ‘na.approx’ function (with ‘rule’ set to 2) from the ‘zoo’ R package. Finally 41-point running median smoothing was applied. Protein quantification was performed using the MaxLFQ algorithm (Cox et al., 2014) as implemented in the diann R package (https://github.com/vdemichev/diann-rpackage, version 1.0, commit “eb4607a”). Data completeness was defined as the proportion of non-missing values in the proteins x samples quantities matrix. The coefficient of variation (CV) was calculated for each protein as its empirical standard deviation divided by its empirical mean. PCA analysis was always performed only on ubiquitously identified proteins: imputation was not used.
Differential Expression Analysis
In the exploratory cohort, differential expression was tested only for proteins quantified using at least five peptide precursors in one of the acquisitions. Further, protein groups were quantified using only precursors detected in at least 10% of patient samples (both cohorts). As we were specifically interested in proteins which could serve as biomarkers of COVID-19 severity, the test was performed using the Kendall’s Tau test for the Theil-Sen trend estimator (as implemented in the EnvStats R package (Millard, 2014)) against the disease severity as classified according to the WHO ordinal scale (Table S1). The input for the test was obtained by calculating, for each patient, median protein levels across the timepoints or replicates measured (Table S4 for timepoint information). Multiple testing correction was performed using the Holm-Bonferroni method for FWER control, as implemented in the p.adjust R function, and the significance threshold was set to 0.05. The choice of a nonparametric test (Theil-Sen) was dictated by the fact that such widespread methods as ANOVA or linear regression are only valid under the assumption of Gaussian errors with the same variance across all conditions. In the case of this dataset, however, we observed very significant differences in the variance, e.g. many proteins seem a lot more variable between patients with severe and critical COVID-19 than mild COVID-19. For such a situation a nonparametric test is ideal: although it would typically have less power (less proteins detected as differentially expressed), the p-values produced are reliable.
Chromatographic Peaks Full Width at Half Maximum (FWHM) Estimations
Median peak FWHM was estimated using Spectronaut 13 (version 13.12.200217.43644; Biognosys). Only precursors ubiquitously identified in all runs (3 minutes, 5 minutes, 10 minutes and 20-minute high-flow as well as with the 20-minute micro-flow run) and with a q-value of < 0.001 were considered (804 precursors total).
Total Ion and Extracted Ion Chromatograms
Total ion chromatograms and extracted ion chromatograms were generated with the PeakView software (Version 2.2, SCIEX), exported and plotted in R (R core team, www.R-project.org).
Acknowledgments
We thank Trefor Mitchell for his help with the quality management system; Stephan Kamrad and Simran Aulakh for proofreading the manuscript; Jan-David Manntz (Beckman, Germany) for help with the Biomek i7; and Robert Lane, Jean-Baptiste Vincedent, and Nick Morrice (SCIEX) for help with the TripleTOF 6600. This work was supported by the Ministry of Education and Research (BMBF), as part of the National Research Node "Mass spectrometry in Systems Medicine" (MSCoresys), under grant agreement 031L0220A. The study was further supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001134), the UK Medical Research Council (FC001134), and the Wellcome Trust (FC001134) and received specific funding from the BBSRC (BB/ N015215/1 and BB/N015282/1) and the Wellcome Trust (200829/Z/16/Z) as well as a Crick Idea to Innovation (i2i) initiative (grant number: 10658) as well as the Crick lifeArc Project 1290305 (to M.R.). The Generation Scotland study received core support from the Chief Scientist Office of the Scottish Government Health Directorates (CZD/16/6) and the Scottish Funding Council (HR03006) and is now supported by the Wellcome Trust (216767/Z/19/Z). A. C. is funded by HDR UK and the Wellcome Trust (216767/Z/19/Z). C. H. is supported by an MRC University Unit Programme grant (MC_UU_00007/10) (QTL in Health and Disease). R.E.M. is supported by an Alzheimer’s Research UK project grant (ARUK-PG2017B-10). L.E.S. is supported by the German Research Foundation (DFG, SFB-TR84 114933180) and by the Berlin Institute of Health (BIH), which receives funding from the Ministry of Education and Research (BMBF). M.Witzenrath is supported by grants from the German Research Foundation, SFB-TR84 C6 and C9 and by the German Ministry of Education and Research (BMBF) in the framework of the CAPSyS (01ZX1304B) SYMPATH (FKZ 01ZX1906A) and PROVID project (FKZ 01KI20160A)
Author Contributions
M.R., L.E.S., F.K., M. Witzenrath, N.S., C.D., M.M., W.M.K., K.S.L., C.L., R.E.M., D.J.P., C.H., A.C., and C.v.K. designed and supervised the study; L.E.S., F.K., M. Witzenrath, C.v.K., A.H., N.S., S.H., M.P., L.M., and D.W. conducted clinical assessments and supervised clinical sample handling; C.B.M., V.D., A.Z., and M.R. designed and developed the proteomics high-throughput platform as well as the proteomic quality control procedures; A.F., M.M., M. White, C.B.M., K.T.-T., S.I.V., A.-S.E., M.K., D.L.,C.K., and F.A. carried out sample preparation and measurements; V.D. and A.Z. designed and developed data processing and analysis workflow and analyzed the proteomic data; M.R., L.E.S, F.K., V.D., C.B.M., and S.V. wrote the paper with contributions from all authors.
Declaration of Interests
The authors declare no competing interests.
Published: June 2, 2020
Footnotes
Supplemental Information can be found online at https://doi.org/10.1016/j.cels.2020.05.012.
Supplemental Information
References
- Ahadi S., Zhou W., Schüssler-Fiorenza Rose S.M., Sailani M.R., Contrepois K., Avina M., Ashland M., Brunet A., Snyder M. Personal aging markers and ageotypes revealed by deep longitudinal profiling. Nat. Med. 2020;26:83–90. doi: 10.1038/s41591-019-0719-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Anas A., van der Poll T., de Vos A.F. Role of CD14 in lung inflammation and infection. Crit. Care. 2010;14:209. doi: 10.1186/cc8850. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Anderson N.L., Anderson N.G. The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell. Proteomics. 2002;1:845–867. doi: 10.1074/mcp.r200007-mcp200. [DOI] [PubMed] [Google Scholar]
- Asare-Werehene M., Communal L., Carmona E., Le T., Provencher D., Mes-Masson A.M., Tsang B.K. Pre-operative circulating plasma gelsolin predicts residual disease and detects early stage ovarian cancer. Sci. Rep. 2019;9:13924. doi: 10.1038/s41598-019-50436-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bache N., Geyer P.E., Bekker-Jensen D.B., Hoerning O., Falkenby L., Treit P.V., Doll S., Paron I., Müller J.B., Meier F. A novel LC system embeds analytes in pre-formed gradients for rapid, ultra-robust proteomics. Mol. Cell. Proteomics. 2018;17:2284–2296. doi: 10.1074/mcp.TIR118.000853. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bas S., Gauthier B.R., Spenato U., Stingelin S., Gabay C. CD14 is an acute-phase protein. J. Immunol. 2004;172:4470–4479. doi: 10.4049/jimmunol.172.7.4470. [DOI] [PubMed] [Google Scholar]
- Baseler L., Chertow D.S., Johnson K.M., Feldmann H., Morens D.M. The pathogenesis of Ebola virus disease. Annu. Rev. Pathol. 2017;12:387–418. doi: 10.1146/annurev-pathol-052016-100506. [DOI] [PubMed] [Google Scholar]
- Beachboard D.C., Horner S.M. Innate immune evasion strategies of DNA and RNA viruses. Curr. Opin. Microbiol. 2016;32:113–119. doi: 10.1016/j.mib.2016.05.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bekker-Jensen D.B., Martínez-Val A., Steigerwald S., Rüther P., Fort K.L., Arrey T.N., Harder A., Makarov A., Olsen J.V. A compact quadrupole-Orbitrap mass spectrometer with FAIMS interface improves proteome coverage in short LC gradients. Mol. Cell. Proteomics. 2020;19:716–729. doi: 10.1074/mcp.TIR119.001906. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bennike T.B., Bellin M.D., Xuan Y., Stensballe A. Journal of proteome research. 2018;17:1983–1992. doi: 10.1021/acs.jproteome.8b00111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bian Y., Zheng R., Bayer F.P., Wong C., Chang Y.C., Meng C., Zolg D.P., Reinecke M., Zecha J., Wiechmann S. Robust, reproducible and quantitative analysis of thousands of proteomes by micro-flow LC–MS/MS. Nat. Commun. 2020;11:157. doi: 10.1038/s41467-019-13973-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bluemlein K., Ralser M. Monitoring protein expression in whole-cell extracts by targeted label- and standard-free LC-MS/MS. Nat. Protoc. 2011;6:859–869. doi: 10.1038/nprot.2011.333. [DOI] [PubMed] [Google Scholar]
- Bost F., Diarra-Mehrpour M., Martin J.P. Inter-alpha-trypsin inhibitor proteoglycan family– a group of proteins binding and stabilizing the extracellular matrix. Eur. J. Biochem. 1998;252:339–346. doi: 10.1046/j.1432-1327.1998.2520339.x. [DOI] [PubMed] [Google Scholar]
- Brinchmann M.F., Patel D.M., Iversen M.H. The role of galectins as modulators of metabolism and inflammation. Mediators Inflamm. 2018;2018:9186940. doi: 10.1155/2018/9186940. [DOI] [PMC free article] [PubMed] [Google Scholar]
- British Thoracic Society British thoracic society advice for managing interstitial lung disease patients during COVID-19 pandemic. 2020. https://www.brit-thoracic.org.uk/media/455101/bts-management-advice-for-ild-patients-v10-23-march-2020.pdf
- Bruderer R., Muntel J., Müller S., Bernhardt O.M., Gandhi T., Cominetti O., Macron C., Carayol J., Rinner O., Astrup A. Analysis of 1508 plasma samples by capillary-flow data-independent acquisition profiles proteomics of weight loss and maintenance. Mol. Cell. Proteomics. 2019;18:1242–1254. doi: 10.1074/mcp.RA118.001288. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bussey K.A., Brinkmann M.M. Strategies for immune evasion by human tumor viruses. Curr. Opin. Virol. 2018;32:30–39. doi: 10.1016/j.coviro.2018.08.015. [DOI] [PubMed] [Google Scholar]
- Channappanavar R., Fehr A.R., Vijay R., Mack M., Zhao J., Meyerholz D.K., Perlman S. Dysregulated type I interferon and inflammatory monocyte-macrophage responses cause lethal pneumonia in SARS-CoV-infected mice. Cell Host Microbe. 2016;19:181–193. doi: 10.1016/j.chom.2016.01.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chen N., Zhou M., Dong X., Qu J., Gong F., Han Y., Qiu Y., Wang J., Liu Y., Wei Y. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. 2020;395:507–513. doi: 10.1016/S0140-6736(20)30211-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cheng Y., Hu X., Liu C., Chen M., Wang J., Wang M., Gao F., Han J., Zhang C., Sun D., Min R. Gelsolin inhibits the inflammatory process induced by LPS. Cell. Physiol. Biochem. 2017;41:205–212. doi: 10.1159/000456043. [DOI] [PubMed] [Google Scholar]
- Coomes E.A., Haghbayan H. Interleukin-6 in COVID-19: a systematic review and meta-analysis. medRxiv. 2020 doi: 10.1101/2020.03.30.20048058. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Corman V.M., Landt O., Kaiser M., Molenkamp R., Meijer A., Chu D.K., Bleicker T., Brünink S., Schneider J., Schmidt M.L. Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR. EURO Surveill. 2020;25:2000045. doi: 10.2807/1560-7917.ES.2020.25.3.2000045. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cox J., Hein M.Y., Luber C.A., Paron I., Nagaraj N., Mann M. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol. Cell. Proteomics. 2014;13:2513–2526. doi: 10.1074/mcp.M113.031591. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Crutchfield C.A., Thomas S.N., Sokoll L.J., Chan D.W. Advances in mass spectrometry-based clinical biomarker discovery. Clin. Proteomics. 2016;13:1. doi: 10.1186/s12014-015-9102-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Demichev V., Messner C.B., Vernardis S.I., Lilley K.S., Ralser M. DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput. Nat. Methods. 2020;17:41–44. doi: 10.1038/s41592-019-0638-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- DiNubile M.J. Plasma gelsolin as a biomarker of inflammation. Arthritis Res. Ther. 2008;10:124. doi: 10.1186/ar2547. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Enjuanes L., Zuñiga S., Castaño-Rodriguez C., Gutierrez-Alvarez J., Canton J., Sola I. Molecular basis of coronavirus virulence and vaccine development. Coronaviruses. Adv Virus Res. 2016;96:245–286. doi: 10.1016/bs.aivir.2016.08.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fotakis P., Kothari V., Thomas D.G., Westerterp M., Molusky M.M., Altin E., Abramowicz S., Wang N., He Y., Heinecke J.W. Anti-inflammatory effects of HDL (high-density lipoprotein) in macrophages predominate over proinflammatory effects in atherosclerotic plaques. Arterioscler. Thromb. Vasc. Biol. 2019;39:e253–e272. doi: 10.1161/ATVBAHA.119.313253. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fu Q., Kowalski M.P., Mastali M., Parker S.J., Sobhani K., van den Broek I., Hunter C.L., Van Eyk J.E. Highly reproducible automated proteomics sample preparation workflow for quantitative mass spectrometry. J. Proteome Res. 2018;17:420–428. doi: 10.1021/acs.jproteome.7b00623. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fuior E.V., Gafencu A.V. Apolipoprotein C1: its pleiotropic effects in lipid metabolism and beyond. Int. J. Mol. Sci. 2019;20:5939. doi: 10.3390/ijms20235939. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gama M.R., Collins C.H., Bottoli C.B.G. Nano-liquid chromatography in pharmaceutical and biomedical research. J. Chromatogr. Sci. 2013;51:694–703. doi: 10.1093/chromsci/bmt023. [DOI] [PubMed] [Google Scholar]
- Gao Y., Zhou J., Xie Z., Wang J., Ho C.K., Zhang Y., Li Q. Mechanical strain promotes skin fibrosis through LRG-1 induction mediated by ELK1 and ERK signalling. Commun. Biol. 2019;2:359. doi: 10.1038/s42003-019-0600-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Geyer P.E., Holdt L.M., Teupser D., Mann M. Revisiting biomarker discovery by plasma proteomics. Mol. Syst. Biol. 2017;13:942. doi: 10.15252/msb.20156297. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Geyer P.E., Kulak N.A., Pichler G., Holdt L.M., Teupser D., Mann M. Plasma proteome profiling to assess human health and disease. Cell Syst. 2016;2:185–195. doi: 10.1016/j.cels.2016.02.015. [DOI] [PubMed] [Google Scholar]
- Geyer P.E., Voytik E., Treit P.V., Doll S., Kleinhempel A., Niu L., Müller J.B., Buchholtz M.L., Bader J.M., Teupser D. Plasma Proteome Profiling to detect and avoid sample-related biases in biomarker studies. EMBO Mol. Med. 2019;11:e10427. doi: 10.15252/emmm.201910427. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Geyer P.E., Wewer Albrechtsen N.J.W., Tyanova S., Grassl N., Iepsen E.W., Lundgren J., Madsbad S., Holst J.J., Torekov S.S., Mann M. Proteomics reveals the effects of sustained weight loss on the human plasma proteome. Mol. Syst. Biol. 2016;12:901. doi: 10.15252/msb.20167357. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gillet L.C., Navarro P., Tate S., Röst H., Selevsek N., Reiter L., Bonner R., Aebersold R. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol. Cell. Proteomics. 2012;11 doi: 10.1074/mcp.O111.016717. O111.016717. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gordon S.M., Hofmann S., Askew D.S., Davidson W.S. High density lipoprotein: it’s not just about lipid transport anymore. Trends Endocrinol. Metab. 2011;22:9–15. doi: 10.1016/j.tem.2010.10.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Grebe S.K.G., Singh R.J. LC-MS/MS in the clinical laboratory--where to from here? Clin. Biochem. Rev. 2011;32:5–31. [PMC free article] [PubMed] [Google Scholar]
- Gu Z., Eils R., Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016;32:2847–2849. doi: 10.1093/bioinformatics/btw313. [DOI] [PubMed] [Google Scholar]
- He T. PROTEOMICS- Clinical Applications. 2019;13:1800198. doi: 10.1002/prca.201800198. [DOI] [PubMed] [Google Scholar]
- Honda H., Fujimoto M., Serada S., Urushima H., Mishima T., Lee H., Ohkawara T., Kohno N., Hattori N., Yokoyama A., Naka T. Leucine-rich α-2 glycoprotein promotes lung fibrosis by modulating TGF-β signaling in fibroblasts. Physiol. Rep. 2017;5:e13556. doi: 10.14814/phy2.13556. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huang C., Wang Y., Li X., Ren L., Zhao J., Hu Y., Zhang L., Fan G., Xu J., Gu X. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. 2020;395:497–506. doi: 10.1016/S0140-6736(20)30183-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jain S., Gautam V., Naseem S. Acute-phase proteins: as diagnostic tool. J. Pharm. Bioallied Sci. 2011;3:118–127. doi: 10.4103/0975-7406.76489. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Khatri N., Sagar A., Peddada N., Choudhary V., Chopra B.S., Garg V., Garg R., Ashish Plasma gelsolin levels decrease in diabetic state and increase upon treatment with F-actin depolymerizing versions of gelsolin. J. Diabetes Res. 2014;2014:152075. doi: 10.1155/2014/152075. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kielian T.L., Blecha F. CD14 and other recognition molecules for lipopolysaccharide: a review. Immunopharmacology. 1995;29:187–205. doi: 10.1016/0162-3109(95)00003-c. [DOI] [PubMed] [Google Scholar]
- Kikkert M. Innate immune evasion by human respiratory RNA viruses. J. Innate Immun. 2020;12:4–20. doi: 10.1159/000503030. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kumaraswamy S.B., Linder A., Åkesson P., Dahlbäck B. Decreased plasma concentrations of apolipoprotein M in sepsis and systemic inflammatory response syndromes. Crit. Care. 2012;16:R60. doi: 10.1186/cc11305. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kurth F., Roennefarth M., Thibeault C., Corman V.M., Mueller-Redezky H., Mittermaier M., Ruwwe-Glosenkamp C., Heim K.M., Krannich A., Zvorc S. Infection. 2020 doi: 10.1007/s15010-020-01464-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lee P.S., Waxman A.B., Cotich K.L., Chung S.W., Perrella M.A., Stossel T.P. Plasma gelsolin is a marker and therapeutic agent in animal sepsis. Crit. Care Med. 2007;35:849–855. doi: 10.1097/01.CCM.0000253815.26311.24. [DOI] [PubMed] [Google Scholar]
- Li G.H., Arora P.D., Chen Y., McCulloch C.A., Liu P. Multifunctional roles of gelsolin in health and diseases. Med. Res. Rev. 2012;32:999–1025. doi: 10.1002/med.20231. [DOI] [PubMed] [Google Scholar]
- Liu Y., Buil A., Collins B.C., Gillet L.C.J., Blum L.C., Cheng L.Y., Vitek O., Mouritsen J., Lachance G., Spector T.D. Quantitative variability of 342 plasma proteins in a human twin population. Mol. Syst. Biol. 2015;11:786. doi: 10.15252/msb.20145728. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lowenthal M.S., Liang Y., Phinney K.W., Stein S.E. Quantitative bottom-up proteomics depends on digestion conditions. Anal. Chem. 2014;86:551–558. doi: 10.1021/ac4027274. [DOI] [PubMed] [Google Scholar]
- Ludwig C., Gillet L., Rosenberger G., Amon S., Collins B.C., Aebersold R. Data-independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial. Mol. Syst. Biol. 2018;14:e8126. doi: 10.15252/msb.20178126. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mackinnon A.C., Gibbons M.A., Farnworth S.L., Leffler H., Nilsson U.J., Delaine T., Simpson A.J., Forbes S.J., Hirani N., Gauldie J., Sethi T. Regulation of transforming growth factor-β1-driven lung fibrosis by galectin-3. Am. J. Respir. Crit. Care Med. 2012;185:537–546. doi: 10.1164/rccm.201106-0965OC. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Macpherson M.E., Halvorsen B., Yndestad A., Ueland T., Mollnes T.E., Berge R.K., Rashidi A., Otterdal K., Gregersen I., Kong X.Y. Impaired HDL function amplifies systemic inflammation in common variable immunodeficiency. Sci. Rep. 2019;9:9427. doi: 10.1038/s41598-019-45861-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Marcos V., Latzin P., Hector A., Sonanini S., Hoffmann F., Lacher M., Koller B., Bufler P., Nicolai T., Hartl D., Griese M. Expression, regulation and clinical significance of soluble and membrane CD14 receptors in pediatric inflammatory lung diseases. Respir. Res. 2010;11:32. doi: 10.1186/1465-9921-11-32. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Meng H., Song Y., Zhu J., Liu Q., Lu P., Ye N., Zhang Z., Pang Y., Qi J., Wu H. LRG1 promotes angiogenesis through upregulating the TGF-β1 pathway in ischemic rat brain. Mol. Med. Rep. 2016;14:5535–5543. doi: 10.3892/mmr.2016.5925. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Messner C., Demichev V., Bloomfield N., Ivosev G., Wasim F., Zelezniak A., Lilley K., Tate S., Ralser M. ScanningSWATH enables ultra-fast proteomics using high-flow chromatography and minute-scale gradients. bioRxiv. 2019 doi: 10.1101/656793. [DOI] [Google Scholar]
- Millard S.P. EnvStats, an RPackage for environmental statistics. In: Balakrishnan N., Colton T., Everitt B., Piegorsch W., Ruggeri F., Teugels J.L., editors. Wiley StatsRef: Statistics Reference Online. Wiley; 2014. [DOI] [Google Scholar]
- Mischak H., Ioannidis J.P.A., Argilis A., Attwood T.K. European Journal of Clinical Investigation. 2012;42:1027–1036. doi: 10.1111/j.1365-2362.2012.02674.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Müller T., Kalxdorf M., Longuespée R., Kazdal D.N., Stenzinger A., Krijgsveld J. Automated sample preparation with SP3 for low-input clinical proteomics. Mol. Syst. Biol. 2020;16:e9111. doi: 10.15252/msb.20199111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nair H., Clarke W. Academic Press; 2016. Mass spectrometry for the Clinical Laboratory. [Google Scholar]
- Naka T., Fujimoto M. LRG is a novel inflammatory marker clinically useful for the evaluation of disease activity in rheumatoid arthritis and inflammatory bowel disease. Immunol. Med. 2018;41:62–67. doi: 10.1080/13497413.2018.1481582. [DOI] [PubMed] [Google Scholar]
- Nie S., Zhao X., Zhao K., Zhang Z., Zhang Z., Zhang Z. Metabolic disturbances and inflammatory dysfunction predict severity of coronavirus disease 2019 (COVID-19): a retrospective study. medRxiv. 2020 doi: 10.1101/2020.03.24.20042283. [DOI] [Google Scholar]
- Niemeyer D., Mösbauer K., Klein E.M., Sieberg A., Mettelman R.C., Mielech A.M., Dijkman R., Baker S.C., Drosten C., Müller M.A. The papain-like protease determines a virulence trait that varies among members of the SARS-coronavirus species. PLoS Pathog. 2018;14:e1007296. doi: 10.1371/journal.ppat.1007296. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nilsson T., Mann M., Aebersold R., Yates J.R., 3rd, Bairoch A., Bergeron J.J.M. Mass spectrometry in high-throughput proteomics: ready for the big time. Nat. Methods. 2010;7:681–685. doi: 10.1038/nmeth0910-681. [DOI] [PubMed] [Google Scholar]
- Niu L., Geyer P.E., Wewer Albrechtsen N.J.W., Gluud L.L., Santos A., Doll S., Treit P.V., Holst J.J., Knop F.K., Vilsbøll T. Plasma proteome profiling discovers novel proteins associated with non-alcoholic fatty liver disease. Mol. Syst. Biol. 2019;15:e8793. doi: 10.15252/msb.20188793. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ongrádi J. InTechOpen; 2016. Herpesviridae. [Google Scholar]
- Perez-Riverol Y., Csordas A., Bai J., Bernal-Llinares M., Hewapathirana S., Kundu D.J., Inuganti A., Griss J., Mayer G., Eisenacher M. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 2019;47:D442–D450. doi: 10.1093/nar/gky1106. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Petherick A. Developing antibody tests for SARS-CoV-2. Lancet. 2020;395:1101–1102. doi: 10.1016/S0140-6736(20)30788-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pham T.V., Henneman A.A., Jimenez C.R. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics. 2020;36:2611–2613. doi: 10.1093/bioinformatics/btz961. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Piehowski P.D., Petyuk V.A., Orton D.J., Xie F., Moore R.J., Ramirez-Restrepo M., Engel A., Lieberman A.P., Albin R.L., Camp D.G. Sources of technical variability in quantitative LC–MS proteomics: human brain tissue sample analysis. J. Proteome Res. 2013;12:2128–2137. doi: 10.1021/pr301146m. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Piktel E., Levental I., Durnaś B., Janmey P.A., Bucki R. Plasma gelsolin: indicator of inflammation and its potential as a diagnostic tool and therapeutic target. Int. J. Mol. Sci. 2018;19:2516. doi: 10.3390/ijms19092516. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ranoa D.R.E., Kelley S.L., Tapping R.I. Human lipopolysaccharide-binding protein (LBP) and CD14 independently deliver triacylated lipoproteins to toll-like receptor 1 (TLR1) and TLR2 and enhance formation of the ternary signaling complex. J. Biol. Chem. 2013;288:9729–9741. doi: 10.1074/jbc.M113.453266. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ruan Q., Yang K., Wang W., Jiang L., Song J. Correction to: clinical predictors of mortality due to COVID-19 based on an analysis of data of 150 patients from Wuhan, China. Intensive Care Med. 2020:1–3. doi: 10.1007/s00134-020-06028-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sauter D., Kirchhoff F. HIV replication: a game of hide and sense. Curr. Opin. HIV AIDS. 2016;11:173–181. doi: 10.1097/COH.0000000000000233. [DOI] [PubMed] [Google Scholar]
- Schilling B., Gibson B.W., Hunter C.L. Generation of high-quality SWATH® acquisition data for label-free quantitative proteomics studies using TripleTOF® mass spectrometers. Methods Mol. Biol. 2017;1550:223–233. doi: 10.1007/978-1-4939-6747-6_16. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Shen H., Wang J., Min J., Xi W., Gao Y., Yin L., Yu Y., Liu K., Xiao J., Zhang Y.F., Wang Z.-N. Activation of TGF-β1/α-SMA/Col I Profibrotic pathway in fibroblasts by galectin-3 contributes to atrial fibrosis in experimental models and patients. Cell. Physiol. Biochem. 2018;47:851–863. doi: 10.1159/000490077. [DOI] [PubMed] [Google Scholar]
- Shen B., Yi X., Sun Y., Bi X., Du J., Zhang C., Quan S., Zhang F., Sun R., Qian L. Proteomic and metabolomic characterization of COVID-19 patient sera. Cell. 2020;182 doi: 10.1016/j.cell.2020.05.032. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Shirai R., Hirano F., Ohkura N., Ikeda K., Inoue S. Up-regulation of the expression of leucine-rich alpha(2)-glycoprotein in hepatocytes by the mediators of acute-phase response. Biochem. Biophys. Res. Commun. 2009;382:776–779. doi: 10.1016/j.bbrc.2009.03.104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Shishkova E., Hebert A.S., Coon J.J. Now, more than ever, proteomics needs better chromatography. Cell Syst. 2016;3:321–324. doi: 10.1016/j.cels.2016.10.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Shive C.L., Jiang W., Anthony D.D., Lederman M.M. Soluble CD14 is a nonspecific marker of monocyte activation. AIDS. 2015;29:1263–1265. doi: 10.1097/QAD.0000000000000735. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Silverman A.M., Nakata R., Shimada H., Sposto R., DeClerck Y.A. A galectin-3–dependent pathway upregulates interleukin-6 in the microenvironment of human neuroblastoma. Cancer Res. 2012;72:2228–2238. doi: 10.1158/0008-5472.CAN-11-2165. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sirniö P., Väyrynen J.P., Klintrup K., Mäkelä J., Mäkinen M.J., Karttunen T.J., Tuomisto A. Decreased serum apolipoprotein A1 levels are associated with poor survival and systemic inflammatory response in colorectal cancer. Sci. Rep. 2017;7:5374. doi: 10.1038/s41598-017-05415-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith B.H., Campbell A., Linksted P., Fitzpatrick B., Jackson C., Kerr S.M., Deary I.J., MacIntyre D.J., Campbell H., McGilchrist M. Cohort profile: generation Scotland: Scottish family health study (GS:SFHS). The study, its participants and their potential for genetic research on health and illness. Int. J. Epidemiol. 2013;42:689–700. doi: 10.1093/ije/dys084. [DOI] [PubMed] [Google Scholar]
- Stock A.M., Klee F., Edlund K., Grinberg M., Hammad S., Marchan R., Cadenas C., Niggemann B., Zänker K.S., Rahnenführer J. Gelsolin is associated with longer metastasis-free survival and reduced cell migration in estrogen receptor-positive breast cancer. Anticancer Res. 2015;35:5277–5285. [PubMed] [Google Scholar]
- UniProt Consortium UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019;47:D506–D515. doi: 10.1093/nar/gky1049. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Van Gucht S., Van Reeth K., Nauwynck H., Pensaert M. Porcine reproductive and respiratory syndrome virus infection increases CD14 expression and lipopolysaccharide-binding protein in the lungs of pigs. Viral Immunol. 2005;18:116–126. doi: 10.1089/vim.2005.18.116. [DOI] [PubMed] [Google Scholar]
- Vowinckel J., Zelezniak A., Bruderer R., Mülleder M., Reiter L., Ralser M. Cost-effective generation of precise label-free quantitative proteomes in high-throughput by microLC and data-independent acquisition. Sci. Rep. 2018;8:4346. doi: 10.1038/s41598-018-22610-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang X., Abraham S., McKenzie J.A.G., Jeffs N., Swire M., Tripathi V.B., Luhmann U.F.O., Lange C.A.K., Zhai Z., Arthur H.M. LRG1 promotes angiogenesis by modulating endothelial TGF-β signalling. Nature. 2013;499:306–311. doi: 10.1038/nature12345. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wewer Albrechtsen N.J., Geyer P.E., Doll S., Treit P.V., Bojsen-Møller K.N., Martinussen C., Jørgensen N.B., Torekov S.S., Meier F., Niu L. Plasma proteome profiling reveals dynamics of inflammatory and lipid homeostasis markers after Roux-en-Y gastric bypass surgery. Cell Syst. 2018;7:601–612.e3. doi: 10.1016/j.cels.2018.10.012. [DOI] [PubMed] [Google Scholar]
- White R., Giordano S., Datta G. Role of HDL-associated proteins and lipids in the regulation of inflammation. In: Isbir T., editor. Advances in lipoprotein research. 2017. p. 53. [Google Scholar]
- Wittmann J., Dieckow J., Schröder H., Hampel U., Garreis F., Jacobi C., Milczarek A., Hsieh K.L., Pulli B., Chen J.W. Plasma gelsolin promotes re-epithelialization. Sci. Rep. 2018;8:13140. doi: 10.1038/s41598-018-31441-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- World Health Organization . WHO; 2020. R&D blueprint novel coronavirus COVID-19 therapeutic trial synopsis.https://www.who.int/teams/blueprint/covid-19 [Google Scholar]
- Wright I., Van Eyk J.E. A roadmap to successful clinical proteomics. Clin. Chem. 2017;63:245–247. doi: 10.1373/clinchem.2016.254664. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yang M.H., Yang Y.H., Lu C.Y., Jong S.B., Chen L.J., Lin Y.F., Wu S.J., Chu P.Y., Chung T.W., Tyan Y.C. Activity-dependent neuroprotector homeobox protein: a candidate protein identified in serum as diagnostic biomarker for Alzheimer’s disease. J. Proteomics. 2012;75:3617–3629. doi: 10.1016/j.jprot.2012.04.017. [DOI] [PubMed] [Google Scholar]
- Yao H., Lu X., Chen Q., Xu K., Chen Y., Cheng L., Liu F., Wu Z., Wu H., Jin C. Patient-derived mutations impact pathogenicity of SARS-CoV-2. medRxiv. 2020 doi: 10.1101/2020.04.14.20060160. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yu L., Ruifrok W.P., Meissner M., Bos E.M., van Goor H., Sanjabi B., van der Harst P., Pitt B., Goldstein I.J., Koerts J.A. Genetic and pharmacological inhibition of galectin-3 prevents cardiac remodeling by interfering with myocardial fibrogenesis. Circ. Heart Fail. 2013;6:107–117. doi: 10.1161/CIRCHEARTFAILURE.112.971168. [DOI] [PubMed] [Google Scholar]
- Zanoni I., Granucci F. Role of CD14 in host protection against infections and in metabolism regulation. Front. Cell. Infect. Microbiol. 2013;3:32. doi: 10.3389/fcimb.2013.00032. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhao X., Nicholls J.M., Chen Y.G. Severe acute respiratory syndrome-associated coronavirus nucleocapsid protein interacts with Smad3 and modulates transforming growth factor-β signaling. J. Biol. Chem. 2008;283:3272–3280. doi: 10.1074/jbc.M708033200. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhou F., Yu T., Du R., Fan G., Liu Y., Liu Z., Xiang J., Wang Y., Song B., Gu X. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet. 2020;395:1054–1062. doi: 10.1016/S0140-6736(20)30566-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhou Y., Fu B., Zheng X., Wang D., Zhao C., Qi Y., Sun R. Aberrant pathogenic GM-CSF+ T cells and inflammatory CD14+ CD16+ monocytes in severe pulmonary syndrome patients of a new coronavirus. bioRxiv. 2020 doi: 10.1101/2020.02.12.945576. [DOI] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The raw data of the acquired commercial plasma and serum control samples within the GS study have been deposited to the ProteomeXchange Consortium via PRIDE (Perez-Riverol et al., 2019) partner repository with the dataset identifier Pride:PXD018874. According to the terms of consent for GenerationScotland participants, access to individual-level data (omics and phenotypes) must be reviewed by the GS Access Committee. Applications should be made to access@generationscotland.org. The DIA-NN software suite and DiaNN R package are open source and are freely available for download at https://github.com/vdemichev/DiaNN and https://github.com/vdemichev/diann-rpackage respectively.