HeapMS: An Automatic Peak-Picking Pipeline for Targeted Proteomic Data Powered by 2D Heatmap Transformation and Convolutional Neural Networks

Chi-Ching Lee; Yu-Chieh Lin; Teng Yu Pan; Cheng Hann Yang; Pei-Hsuan Li; Sin You Chen; Jhih Jie Gao; Chi Yang; Lichieh Julie Chu; Po-Jung Huang; Yuan-Ming Yeh; Petrus Tang; Yu-Sun Chang; Jau-Song Yu; Yung-Chin Hsiao

doi:10.1021/acs.analchem.3c01011

2023 Oct 11;95(42):15486–15496. doi: 10.1021/acs.analchem.3c01011

PMCID: PMC10603604 PMID: 37820297

Abstract

The process of peak picking and quality assessment for multiple reaction monitoring (MRM) data demands significant human effort, especially for signals with low abundance and high interference. Although multiple peak-picking software packages are available, they often fail to detect peaks with low quality and do not report cases with low confidence. Furthermore, visual examination of all chromatograms is still necessary to identify uncertain or erroneous cases. This study introduces HeapMS, a web service that uses artificial intelligence to assist with peak picking and the quality assessment of MRM chromatograms. HeapMS applies a rule-based filter to remove chromatograms with low interference and high-confidence peak boundaries detected by Skyline. Additionally, it transforms two histograms (representing light and heavy peptides) into a single encoded heatmap and performs a two-step evaluation (quality detection and peak picking) using image convolutional neural networks. HeapMS offers three categories of peak picking: uncertain peak picking that requires manual inspection, deletion peak picking that requires removal or manual re-examination, and automatic peak picking. HeapMS acquires the chromatogram and peak-picking boundaries directly from Skyline output. The output results are imported back into Skyline for further manual inspection, which facilitates integration with Skyline. HeapMS offers the benefit of detecting chromatograms that should be deleted or require human inspection. Based on the defined categories, it can significantly reduce human workload and provide consistent results. Furthermore, by using heatmaps instead of histograms, HeapMS can adapt to future updates in image recognition models. HeapMS is available at https://github.com/ccllabe/HeapMS.

Introduction

Targeted proteomic approaches have been widely applied in biomolecular studies of proteins and metabolites. The liquid chromatography-multiple reaction monitoring-mass spectrometry (LC-MRM-MS) assay is a multiplexed and targeted quantification approach and has been demonstrated as a robust technology platform for disease biomarker discovery and verification.¹⁻⁴ The MRM assay is capable of precisely quantifying multiple candidates of interest in a single analysis by employing the corresponding stable isotope-labeled standard (SIS) peptides with known amounts as the internal standards. Recently, the MRM assay has enabled sensitive, reproducible, and specific quantification of 100 or more peptide features in complex biological matrices.⁵ Due to the multiple quantification results obtained from the MRM assay, data analysis of hundreds of samples is a time- and labor-consuming process.

Analysis of multiple reaction monitoring (MRM) data typically involves two major steps: peak picking and quality assessment. Because chromatographic peaks exhibit a variety of patterns, data processing and examination heavily rely on human experience. Generally, MRM analysis requires weeks or even months of data processing and manual evaluation, especially for peak quality inspection, selected peak correction, and sample comparison. Therefore, identifying methods that improve the efficiency, effectiveness, and accuracy of this process is critical. MRM analysis software such as Skyline⁶ provides peak-picking capabilities through software prediction or manual operation. Skyline is a powerful tool for visualizing chromatograms obtained from the MRM assay, which facilitates human inspection. Nevertheless, human inspection remains the primary step in the workflow for peak quality assessment and boundary picking.

Several mass spectrometry (MS) data algorithms and tools focus on the quality assessment of peak boundary picking. Two types of strategies are typically adopted: scores and prediction models. Feature scores are used to build the criteria for manual examinations, whereas prediction models are developed from predefined rules, machine learning methods, and deep learning techniques. Table 1 summarizes the supported MS instrument types, targets, input formats, and prediction abilities of these tools. TargetedMSQC⁷ calculates metrics from different peak features of targeted MS data and uses them to filter peaks by quality. These features include full width at half-maximum, peak jaggedness, and symmetry. PeakOnly⁸ uses convolutional neural networks (CNNs) to detect regions of interest (ROIs), classify the ROIs into three categories, and integrate peak areas. NeatMS⁹ applies CNNs to peaks selected by tools such as XCMS¹⁰ and MZmine 2¹¹ and filters them by quality to reduce the number of false peaks. AutomRm¹² uses machine learning to train two models, namely, a peak-picking model and a peak-reporting model, to preprocess MS peak data. Current software gives users the prediction results directly but does not flag the cases in which the predictions are inaccurate, so users have to verify the prediction results one by one. Additionally, differences in how MRM is applied, such as for quantifier or qualifier purposes, lead to diverse output contents. Even with the same technology, the data structure differs with the experimental design and parameter settings, which complicates the comparison of tools built for different purposes.

Table 1. Comparison of Recently Published MS Peak-Picking Solutions.

tool	MS type	target^a	peak quality	peak picking	uncertain & deletion	input format
HeapMS	MRM	P	V	V	V	.csv
NeatMS	LC/MS	M	V			.mzML
peakonly	LC/MS	M		V		.mzML
DNN	LC/MS	M	V			.mzML
TargetedMSQC	MRM	P	V			.csv
AutomRm	MRM	M	V	V		.mzML

^a P, proteome; M, metabolome.

Technicians tend to examine ion transition graphs one by one by using visualization tools, which makes MRM data processing time-consuming. In this study, we developed HeapMS, a user-friendly web service that serves as an MRM chromatogram quality assessment and automatic peak-picking pipeline. The primary goal of HeapMS is to reduce the duration of manual postprocessing and to work alongside current MRM data processing software. HeapMS reports uncertain chromatograms that require a manual examination. It also provides accurate peak boundary predictions, so technicians can focus on reexamining the chromatograms that are marked as uncertain and those marked as deleted. Three types of HeapMS outputs, namely, peak boundaries, uncertain chromatograms, and marks for deletion, are imported into Skyline for manual examination and verification. This shortens the manual comparison and inspection required for peak selection and increases the efficiency of the entire process. We also developed an accuracy of quantification (AQ) score as a standard peak-picking benchmark based on a comparison of manual picking results for different tools and technicians. Because HeapMS takes the automatic picking results exported from the analysis software as input and returns predictions that can be imported back, it can be integrated into the current MRM workflow.

Methods

Sample Collection and Preparation

The biofluid samples (first morning urine and unstimulated whole saliva) were collected from chronic kidney disease (CKD) patients (our unpublished data) and oral squamous cell carcinoma (OSCC) patients and prepared as described in our previous studies.^13,14 Briefly, the collected samples were centrifuged to remove cells and debris, and the clarified supernatants were stored at −80 °C. For urine samples, urinary proteins were further concentrated using a 10 kDa centrifugal filter. Based on the protein concentration measured by the BCA Protein Assay kit, triplicate aliquots of each sample containing 30 μg of proteins were diluted with an appropriate amount of 25 mM ammonium bicarbonate, denatured with 10% sodium deoxycholate (DOC), reduced with 50 mM Tris (2-carboxyethyl) phosphine (TCEP) at 60 °C for 30 min, alkylated with 100 mM iodoacetamide at 37 °C for 30 min, and acidified with 10% formic acid (FA) and 10% trifluoroacetic acid (TFA) to precipitate DOC. The resulting samples were mixed with stable isotope-labeled standard (SIS) peptides and subsequently subjected to trypsin-mediated digestion.

LC-MRM-MS Analysis and Data Acquisition

A nano-ACQUITY UPLC system equipped with a C18 column (100 μm × 100 mm, 1.7 μm particle size; Waters) was coupled with a triple quadrupole mass spectrometer (QTRAP 5500; AB Sciex, Redwood, CA). Four-microliter samples containing 1 μg of peptides were injected into the LC-MRM-MS system, and a 42 min linear gradient of buffer A (0.1% FA in H₂O) combined with buffer B (0.1% FA in acetonitrile) was applied as described in our previous studies.^13,14 The acquisition parameters (CE, DP, EP, CEP, CXP) for each target peptide and the instrument settings were experimentally determined in order to obtain sensitive detection of peptide targets in biological samples. Raw files (.wiff) of MRM analysis were processed using Skyline software based on the defined precursor (Q1) and fragment (Q3) mass list of the target peptides. All spectra were manually checked to ensure confident quantitative results with adjustments or deletions as needed.

Data Sets for Deep Learning

Four data sets were used to establish the data processing and prediction pipeline: one from the urine samples of CKD patients and three from the saliva samples of OSCC patients. All samples underwent triplicate measurements of LC-MRM-MS for technical replicates. The four data sets differ in the number of monitored peptides: data set CKD comprises 20 peptides, data set OSCC-1 comprises 53 peptides, data set OSCC-2 comprises 28 peptides, and data set OSCC-3 comprises 45 peptides. Each chromatogram consists of the light ion (target peptide) and the heavy ion (SIS peptide), and the identities of the peptides are represented by three unique transitions (paired Q1/Q3). Output files of Skyline software in text format, including chromatograms and peak boundaries, were utilized for subsequent processes.

AQ Score: An Objective Score for Evaluating Peak-Picking Results

Generally, time boundaries for peak picking cannot be fully correlated to quantification results. Moreover, chromatograms labeled with the same time boundaries may have varying intensities, resulting in different quantification results. In this study, we developed a relative score, called the AQ score, to evaluate the prediction results accurately by comparing the area ratio of the predicted results with that of human-evaluated results for the same chromatogram. The AQ score primarily evaluates the area ratio of selected peak boundaries, which can be utilized to calculate the quantification results by cross-referencing with an internal standard. The AQ score provides a more precise evaluation than simply measuring the time boundaries for peak picking. Figure S1 illustrates how the AQ score is calculated and how it can serve as an evaluation standard for any peak-picking method.
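The exact AQ formula is given in Figure S1, which is not reproduced here. As a minimal sketch, assuming the AQ score is the light/heavy area ratio inside the predicted boundaries divided by the same ratio inside the manually picked boundaries, the calculation could look as follows. The function names, the trapezoidal integration, and the use of one summed trace per ion are assumptions made for illustration. The 0.8 to 1.2 acceptance window is the one used to define AQ-passed chromatograms in Table 2.

```python
from bisect import bisect_left, bisect_right


def peak_area(times, intensities, start, end):
    """Trapezoidal area over the samples whose retention time lies in [start, end]."""
    lo = bisect_left(times, start)
    hi = bisect_right(times, end)
    area = 0.0
    for i in range(lo, hi - 1):
        area += 0.5 * (intensities[i] + intensities[i + 1]) * (times[i + 1] - times[i])
    return area


def area_ratio(times, light, heavy, start, end):
    """Light/heavy area ratio inside one pair of peak boundaries."""
    heavy_area = peak_area(times, heavy, start, end)
    if heavy_area == 0:
        raise ValueError("heavy (SIS) area is zero, so the ratio is undefined")
    return peak_area(times, light, start, end) / heavy_area


def aq_score(times, light, heavy, predicted, manual):
    """Assumed definition: predicted area ratio divided by manual area ratio."""
    manual_ratio = area_ratio(times, light, heavy, *manual)
    if manual_ratio == 0:
        raise ValueError("manual area ratio is zero, so the AQ score is undefined")
    return area_ratio(times, light, heavy, *predicted) / manual_ratio


def is_acceptable(score, low=0.8, high=1.2):
    """Acceptance window taken from the AQ-passed definition in Table 2."""
    return low <= score <= high


# Toy chromatogram: the light trace carries interference after the true peak.
times = list(range(11))
heavy = [0, 0, 0, 2, 6, 10, 6, 2, 0, 0, 0]
light = [0, 0, 0, 1, 3, 5, 3, 1, 0, 4, 4]
same = aq_score(times, light, heavy, predicted=(3, 7), manual=(3, 7))
wide = aq_score(times, light, heavy, predicted=(3, 10), manual=(3, 7))
```

In this toy case, boundaries identical to the manual ones give a score of 1, whereas boundaries that swallow the interference inflate the light/heavy ratio and push the score outside the acceptance window.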

Identifying Low-Interference Chromatograms

The MRM analytical software provides precise peak boundaries for low-interference chromatograms observed in the data set. Low-interference chromatograms have high signal-to-noise ratios and exhibit comparable patterns and peak time boundaries for three ion transitions. In this study, we utilized a Savitzky–Golay filter¹⁵ to obtain smoothed chromatograms. We identified low-interference chromatograms on the basis of the following rules: (1) the root-mean-square error of ion transitions should be less than 0.25, (2) each chromatogram must have only one high peak defined by intensities higher than 20% of the highest intensity of the three ion transitions, and (3) the retention bias of the six highest intensities of both heavy and light ions should be less than 0.3 times the retention time period.
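The three rules can be sketched in code under interpretive assumptions that the text does not settle. Here, each transition is scaled to its own maximum and compared with the mean of all six scaled traces for rule 1, the per-ion envelope (pointwise maximum of the three transitions) is used for rule 2, and the retention bias of rule 3 is taken as the spread of the six apex retention times relative to the length of the recorded time window. Traces are assumed to be already smoothed, for example with a Savitzky–Golay filter. All function names are hypothetical.

```python
from math import sqrt


def scale_to_max(trace):
    """Scale a trace so that its highest point equals 1 (all-zero traces stay zero)."""
    top = max(trace)
    return [v / top if top > 0 else 0.0 for v in trace]


def rmse(a, b):
    """Root-mean-square error between two equally long traces."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def count_high_regions(trace, fraction):
    """Count contiguous runs of points above `fraction` of the trace maximum."""
    threshold = fraction * max(trace)
    runs, inside = 0, False
    for value in trace:
        above = value > threshold
        if above and not inside:
            runs += 1
        inside = above
    return runs


def apex_time(times, trace):
    """Retention time at which the trace reaches its maximum."""
    return times[max(range(len(trace)), key=trace.__getitem__)]


def is_low_interference(times, light, heavy,
                        rmse_max=0.25, high_fraction=0.2, bias_max=0.3):
    """Apply the three rules to three light and three heavy smoothed transitions."""
    traces = list(light) + list(heavy)
    scaled = [scale_to_max(t) for t in traces]
    mean_shape = [sum(col) / len(col) for col in zip(*scaled)]
    # Rule 1 (assumed form): every transition stays close to the common shape.
    rule1 = all(rmse(s, mean_shape) < rmse_max for s in scaled)
    # Rule 2: exactly one region above 20% of the highest intensity per ion.
    envelopes = [[max(col) for col in zip(*group)] for group in (light, heavy)]
    rule2 = all(count_high_regions(e, high_fraction) == 1 for e in envelopes)
    # Rule 3 (assumed form): the six apexes fall within 0.3 of the time window.
    apexes = [apex_time(times, t) for t in traces]
    rule3 = (max(apexes) - min(apexes)) < bias_max * (times[-1] - times[0])
    return rule1 and rule2 and rule3


times = [i * 0.1 for i in range(21)]
base = [0] * 6 + [1, 4, 9, 14, 16, 14, 9, 4, 1] + [0] * 6
clean_light = [base, [v * 0.5 for v in base], [v * 0.25 for v in base]]
clean_heavy = [[v * 2 for v in base] for _ in range(3)]
interfered = list(base)
interfered[17:20] = [12, 15, 12]  # a second high peak in one light transition
```

With these toy traces, the clean chromatogram satisfies all three rules, while the one with a second high peak fails rule 2.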

We used strict rules to identify low-interference chromatograms to ensure highly accurate software prediction results. After filtering low-interference chromatograms, we used two artificial intelligence (AI) models to perform a quality assessment and automatic peak picking using the following transformations.

Converting MRM Spectra into AI Predictable Chromatographic Heatmaps

We used a spectrum-to-image generator to convert the spectra of both heavy- and light-ion intensities, coupled with retention time, into a two-dimensional (2D) heatmap designed for AI recognition and object detection. Figure 1 illustrates the concept of this conversion. The intensities of the three ions were converted into relative intensities, summed into a single value, and mapped to the “nipy_spectral” colormap using the matplotlib package in Python. We also tested a grayscale setting (Gist_gray) and two other color settings (Gnuplot and Spring). The results are shown in Figure S2. This heatmap conversion strategy preserves the information in the input spectra, including the relative intensities of both heavy and light ions and the pattern of the quantified ions. In addition, it conserves the patterns of correct peak boundaries and high-quality chromatograms, making the chromatographic heatmaps suitable for AI prediction.
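The encoding step can be illustrated with a short sketch. It assumes that each transition is scaled to its own maximum, that the heatmap has one row for the light ion and one for the heavy ion, and that the color scale runs from 0 to the number of transitions. None of these details are specified in the text, and the function names are hypothetical. The `nipy_spectral` colormap and `matplotlib.pyplot.imsave` are existing matplotlib features.

```python
def relative(trace):
    """Scale one transition to its own maximum (assumed meaning of 'relative')."""
    top = max(trace)
    return [v / top if top > 0 else 0.0 for v in trace]


def summed_relative(transitions):
    """Sum the relative intensities of all transitions at each time point."""
    return [sum(col) for col in zip(*(relative(t) for t in transitions))]


def encode_heatmap(light, heavy, max_transitions=3):
    """Return a two-row matrix: row 0 for the light ion, row 1 for the heavy ion.

    Only the first `max_transitions` transitions of each ion are used.
    """
    return [
        summed_relative(light[:max_transitions]),
        summed_relative(heavy[:max_transitions]),
    ]


def save_heatmap(rows, path, cmap="nipy_spectral", max_transitions=3):
    """Render the matrix as an image; matplotlib is imported only when needed."""
    import numpy as np
    import matplotlib.pyplot as plt

    plt.imsave(path, np.asarray(rows, dtype=float),
               cmap=cmap, vmin=0.0, vmax=float(max_transitions))


light = [[0, 5, 10, 5, 0], [0, 2, 4, 2, 0], [0, 1, 2, 1, 0], [9, 9, 9, 9, 9]]
heavy = [[0, 10, 20, 10, 0]] * 3
rows = encode_heatmap(light, heavy)
```

Calling `save_heatmap(rows, "example.png")` would write the image. The fourth light transition in the toy input is ignored, which mirrors the three-transition limit of the current HeapMS version.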

Figure 1. Chromatographic heatmap transformation. The heatmap is created by superimposing the relative intensities of both light and heavy ions. Manually marked peak boundaries are highlighted by a red dotted line, and the red square highlights the 2D heatmap of the optimal peak boundaries. Examples of chromatographic heatmaps are shown in Figure S3.

Implementation of HeapMS

The web interface of HeapMS was developed using PHP, JavaScript, and HTML. Users simply need to convert and select files using Skyline before uploading them to HeapMS for processing. We used a Docker environment and the Python language to implement a backend pipeline with a queuing system to handle all user requests. The HeapMS platform is composed of multiple independent, connected computing nodes, and any computer capable of running the HeapMS container can serve as a computing node. Once the user’s files are uploaded, the system executes the tasks sequentially based on the upload time, one task at a time. When the computation starts, a log file is generated to record the execution time for each step. If another user submits a task in the meantime, it waits until the previous task is completed before proceeding. The source code and a step-by-step installation guide are available on GitHub: https://github.com/ccllabe/HeapMS.
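The queuing behavior described here (first come, first served, one task at a time, with per-step timing) can be sketched as follows. This is an illustrative in-memory model, not the HeapMS implementation. The class name, the job representation, and the log layout are all hypothetical.

```python
import time
from collections import deque


class SequentialJobQueue:
    """Run submitted jobs one at a time, in order of submission."""

    def __init__(self):
        self._pending = deque()
        self.log = []  # entries of (job_id, step_name, elapsed_seconds)

    def submit(self, job_id, steps):
        """Queue a job; `steps` is a list of (step_name, callable) pairs."""
        self._pending.append((time.time(), job_id, steps))

    def run_all(self):
        """Execute queued jobs sequentially and record the time of each step."""
        finished = []
        while self._pending:
            _, job_id, steps = self._pending.popleft()
            for name, func in steps:
                started = time.perf_counter()
                func()
                self.log.append((job_id, name, time.perf_counter() - started))
            finished.append(job_id)
        return finished


calls = []
queue = SequentialJobQueue()
queue.submit("job-a", [("heatmap", lambda: calls.append("a-heatmap")),
                       ("qc", lambda: calls.append("a-qc"))])
queue.submit("job-b", [("heatmap", lambda: calls.append("b-heatmap"))])
order = queue.run_all()
```

The second job starts only after every step of the first has finished, which is the waiting behavior the text describes for concurrent users.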

Deep Learning Architectures

To build the prediction modules for HeapMS, we utilized two convolutional neural network architectures: CoAtNet-4¹⁶ as a quality control (QC) checker and Faster R-CNN¹⁷ as a boundary picker. CoAtNet combines convolution and self-attention to achieve outstanding performance on various data sets, while Faster R-CNN uses region proposals to detect objects in images by identifying bounding boxes. Both CoAtNet-4 and Faster R-CNN are compatible with GPU acceleration. We trained the deep learning models from chromatographic heatmaps and implemented the training and prediction components of HeapMS using Python.

Performance Evaluation of the QC Checker

We used precision, recall, and F1 score to evaluate the performance of the QC checker by comparing the predicted results to the manually evaluated results. A true positive (TP) is defined as a chromatogram that is predicted to be positive and is also labeled as positive in manual evaluation. A true negative (TN) is defined as a chromatogram that is predicted to be negative and is also labeled as negative in the manual evaluation. A false positive (FP) is defined as a chromatogram that is predicted to be positive but is actually negative, according to the manual evaluation. A false negative (FN) is defined as a chromatogram that is predicted to be negative but is actually positive, according to the manual evaluation.
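These definitions translate directly into the standard formulas precision = TP/(TP + FP), recall = TP/(TP + FN), and F1 = 2 × precision × recall/(precision + recall). A minimal sketch follows. The function names are hypothetical, and which class counts as "positive" (QC-passed or QC-failed) is not stated in the text, so the labels are treated as plain booleans.

```python
def confusion_counts(predicted, manual):
    """Count TP, TN, FP, and FN from paired boolean labels."""
    tp = tn = fp = fn = 0
    for p, m in zip(predicted, manual):
        if p and m:
            tp += 1
        elif not p and not m:
            tn += 1
        elif p and not m:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1); undefined ratios are reported as 0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from(precision, recall)


predicted = [True, True, False, True, False]
manual = [True, False, False, True, True]
tp, tn, fp, fn = confusion_counts(predicted, manual)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

TN does not enter any of the three metrics, so a QC checker is not rewarded here for the many easy negatives it may see.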

Results and Discussion

Workflow of HeapMS

HeapMS consists of a front-end web interface and a backend data processing pipeline. Figure 2a depicts the data processing pipeline. Users are advised to use Skyline to perform automatic software picking and export the data into time boundary comma-separated value (CSV) and ion transition tab-separated value (TSV) files. An interference detector is then used to identify low-interference chromatograms, for which the picking results of Skyline are retained. The QC checker uses a pretrained CoAtNet model to divide the remaining data into QC-failed and QC-passed groups. The QC-failed chromatograms are either deleted or manually reexamined. The boundary picker then uses a pretrained Faster R-CNN model to perform automatic time boundary picking. Users are required to upload only their automatic picking and ion transition files exported from Skyline. The queuing system then automatically performs heatmap conversion, QC checks, and boundary picking. Users then download the picked time boundary files in CSV format and intensities in TSV format and import them into Skyline to perform manual inspection. Figure 2b depicts the user interface of HeapMS. Users can select computing nodes with fewer queuing jobs and can upload their manually picked data. HeapMS compares the results of AI and Skyline picking by using manual picking as the standard. The website is available at http://heapms.cgu.edu.tw. The source code of HeapMS and the preinstalled Docker image with instructions are also available at https://github.com/ccllabe/HeapMS.
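The order of decisions in this pipeline can be summarized in a small routing sketch. The three model stages are passed in as plain callables so that the example stays self-contained. The category names, the `min_score` threshold, and the rule that a missing or low-scoring detection is labeled "uncertain" are assumptions for illustration. The text states only that uncertain cases are sent to manual inspection and that a detection threshold was chosen experimentally.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Bounds = Tuple[float, float]


@dataclass
class Decision:
    category: str  # "low_interference", "deletion", "uncertain", or "automatic"
    boundaries: Optional[Bounds]


def route(chromatogram,
          skyline_bounds: Bounds,
          is_low_interference: Callable[[object], bool],
          qc_passes: Callable[[object], bool],
          pick_boundary: Callable[[object], Tuple[Optional[Bounds], float]],
          min_score: float = 0.5) -> Decision:
    """Apply the interference detector, QC checker, and boundary picker in order."""
    if is_low_interference(chromatogram):
        return Decision("low_interference", skyline_bounds)  # keep Skyline's pick
    if not qc_passes(chromatogram):
        return Decision("deletion", None)  # delete or re-examine manually
    bounds, score = pick_boundary(chromatogram)
    if bounds is None or score < min_score:
        return Decision("uncertain", bounds)  # needs manual inspection
    return Decision("automatic", bounds)


# Stand-in stages for demonstration only.
clean = route("c1", (10.0, 11.0), lambda c: True, lambda c: True,
              lambda c: ((10.1, 10.9), 0.9))
failed = route("c2", (10.0, 11.0), lambda c: False, lambda c: False,
               lambda c: ((10.1, 10.9), 0.9))
unsure = route("c3", (10.0, 11.0), lambda c: False, lambda c: True,
               lambda c: ((10.1, 10.9), 0.3))
picked = route("c4", (10.0, 11.0), lambda c: False, lambda c: True,
               lambda c: ((10.1, 10.9), 0.9))
```

Ordering the checks this way means the more expensive image models run only on chromatograms that the rule-based filter could not clear.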

Figure 2. (a) Data processing pipeline of HeapMS. MRM raw data are first picked and exported in text format by Skyline. The interference detector then selects chromatograms whose optimal boundaries can be picked by Skyline. Subsequently, the QC checker identifies the QC-failed chromatograms, and the boundary picker performs time boundary picking based on the object detection capability of transformed 2D heatmaps. (b) Web interface of HeapMS. Users can acquire (A) the number of computing nodes and (B) the job status (queued/running/finished) of each computing node. (C) Users can retrieve the submitted job reports by entering their job ID. (D) The job titles, Skyline boundaries, and transitions of each ion can be specified and uploaded. (E) Users can upload manually picked time boundaries as a reference, and then, a comparison is made, as shown in (F, G), after our automatic picking pipeline is run. (H) AI-predicted, QC-failed, and low-interference chromatograms can be downloaded and imported into Skyline.

Interplay between Traditional Workflows

In traditional workflows, analytical software and human technicians collaborate. The first step involves identifying and visualizing peak boundaries using software such as Skyline. Technicians then manually examine all chromatograms according to the spectrum of transitional ions. Chromatograms with low intensities, low signal-to-noise ratios, and no coelution properties are eliminated (marked as “Deletion” in Figure 3a). Technicians use their experience to accept the time boundary detected by analytic software (marked as “Computation” in Figure 3a) or to manually adjust the time boundaries (marked as “Manually picked” in Figure 3a). Figure 3a summarizes the standard workflow results for four data sets. In practical cases, all software-predicted boundaries must be manually validated, which is the most time-consuming step. The decision and adjustment to delete or repick the boundaries might generally take 30 s per chromatogram, while review of the acceptable boundaries would take 5 s. The labor working-hours were 53.3, 23.9, 125.8, and 414.0 h for the CKD, OSCC-1, OSCC-2, and OSCC-3 data sets, respectively (Figure 3a).
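The working-hour estimate follows from the two per-chromatogram times stated above. A small sketch of the arithmetic is shown below. The function name and the chromatogram counts in the example are hypothetical, because the per-category counts behind the reported hours appear only in Figure 3a.

```python
def review_hours(n_deleted_or_repicked, n_accepted,
                 seconds_per_adjustment=30, seconds_per_review=5):
    """Estimated manual labor in hours for one data set."""
    total_seconds = (n_deleted_or_repicked * seconds_per_adjustment
                     + n_accepted * seconds_per_review)
    return total_seconds / 3600


# Hypothetical data set: 360 chromatograms adjusted or deleted, 720 accepted.
example_hours = review_hours(360, 720)
```

Because an adjustment costs six times as much as a simple review under these figures, the share of chromatograms that are repicked or deleted dominates the total.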

Figure 3. (a) “Deletion” category (black) represents the intensities of transitions that are too low to be used. The “Computation” category (pink) represents the time regions selected by automatic software, whose results are manually verified by technicians. The “manually picked” category (purple) contains manually picked chromatograms. (b) AQ score distribution of manually repicked chromatograms summarized from the purple area in (a). AQ scores between the two dotted lines represent the acceptable deviation of regular peak picking. From the four data sets, the chromatograms that required repicking were CKD (12.6%), OSCC-1 (23.3%), OSCC-2 (7.7%), and OSCC-3 (10.5%). Table S1 lists the percentages of all of the AQ ranges.

Advantage of the AQ Score

The AQ score can help identify which chromatograms truly need to be repicked by evaluating the similarity between manually and software-predicted results. Figure 3b summarizes the AQ scores of the chromatograms from the four data sets that were manually repicked (purple areas in Figure 3a). The acceptable AQ range is represented by two dashed lines in Figure 3b, which are 0.2 units above and below the baseline. When we evaluated the AQ scores of the Skyline predictions before manual adjustment, we found that most chromatograms that technicians considered in need of adjustment did not actually require it, as the predicted results from Skyline were acceptable. Only around 7–23% of the chromatograms did require manual adjustment. In proteomic MRM analysis, software-based boundary predictions are still judged manually due to the lack of assistant tools. Nonetheless, as depicted in the figure, manual judgments tend to be too strict, so many acceptable predicted results end up in the category requiring manual adjustment, which increases labor costs and makes the analytical workflow for MRM data time-consuming.

Need for AI-Assisted Peak Picking

As a significant portion of the data required manual examination and repicking, we conducted an experiment to determine the variation in examination outcomes among the individual technicians. We randomly selected 20 peptides, each containing 130 chromatograms, to investigate the deviation in manual operations. Three well-trained technicians independently examined the data, either accepting the software predictions or performing manual repicking. We then plotted the coefficient of variation of the peak-picking time (time CV) against that of the area ratio of light- and heavy-ion transitions (area CV) on a 2D plot. The results showed similar area CVs, indicating that the quantification of human-validated data was consistent. However, the time CVs were more diverse than the area CVs, indicating that transitions near peak boundaries had low intensities and were challenging to define manually. As Figure 4 shows, relying solely on peak-picking time boundaries to evaluate peak-picking results may be insufficient. The quantification result, represented by the ratio of peak areas between the picked light- and heavy-ion boundaries, may provide a more reasonable measure of adjustment. Figure 4b,c shows two data points from Figure 4a with highly diverse area and time coefficients of variation (CVs). Although the technicians used different judgment criteria, which resulted in varied peak-picking times, the area ratios of the selected regions were similar.
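The two coefficients of variation can be computed per chromatogram from the three technicians' picks. The sketch below assumes that the time CV is taken over the picked peak widths (end minus start) and that the sample standard deviation is used. The text does not specify either choice, and the function names are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def technician_cvs(picks):
    """Return (time CV, area CV) for one chromatogram.

    `picks` holds one (start, end, light_to_heavy_area_ratio) tuple per technician.
    """
    widths = [end - start for start, end, _ in picks]
    ratios = [ratio for _, _, ratio in picks]
    return coefficient_of_variation(widths), coefficient_of_variation(ratios)


# Three technicians pick very different boundaries but obtain the same area ratio.
picks = [(10.0, 12.0, 0.50), (9.0, 13.0, 0.50), (8.0, 14.0, 0.50)]
time_cv, area_cv = technician_cvs(picks)
```

This toy case reproduces the pattern reported in the text: a large time CV alongside an area CV near zero, because the extra time picked near the peak edges contributes little area.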

Figure 4. Summary of chromatogram picking results and division of manual picking. (a) Deviation of quantification results in a traditional workflow with three technicians. Highly diverse chromatograms, labeled as (1, 2), are depicted in (b, c).

Accuracy of Low-Interference Detectors

We tested our low-interference detector on the four data sets by comparing its output with the manual inspection results (Table 2). The accuracy for the low-interference chromatograms was >98% in every data set, indicating that our low-interference detector was functional. Filtering out low-interference chromatograms reduced the computational time for chromatographic heatmap generation and AI prediction.

Table 2. Accuracy of Low-Interference Detectors.

data set	AQ failed	AQ passed^a	accuracy (%)
CKD	8	4435	99.82
OSCC-1	14	942	98.53
OSCC-2	104	7799	98.68
OSCC-3	82	5942	98.64

^a AQ-passed chromatograms are defined as chromatograms that are manually marked as nondeletion with 0.8 ≤ AQ score ≤ 1.2.
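The accuracy column appears to be the share of AQ-passed chromatograms among all chromatograms flagged as low interference. This reading is an inference from the numbers rather than a stated definition. The short check below uses a hypothetical function name and reproduces the tabulated values for CKD, OSCC-2, and OSCC-3 after rounding to two decimals. For OSCC-1 it yields 98.54, which differs from the tabulated 98.53 in the last digit.

```python
def detector_accuracy(aq_failed, aq_passed):
    """Percentage of flagged chromatograms whose Skyline pick passed the AQ check."""
    return 100.0 * aq_passed / (aq_failed + aq_passed)


table2 = {
    "CKD": (8, 4435),
    "OSCC-1": (14, 942),
    "OSCC-2": (104, 7799),
    "OSCC-3": (82, 5942),
}
accuracies = {name: round(detector_accuracy(*counts), 2)
              for name, counts in table2.items()}
```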

QC Checker Performance

We used CNNs to assess the quality of the chromatograms. Figure 5a,b depicts typical QC-passed and QC-failed chromatograms, respectively. We compared the accuracies of four commonly used CNN architectures.^16,18,19 The results are depicted in Figure 5c. Across the four data sets, CoAtNet-4 exhibited the highest accuracy. After removing the QC-failed chromatograms, we performed peak time boundary identification. HeapMS uses Faster R-CNN to detect peak boundaries on heatmaps generated from light- and heavy-ion transitions. Therefore, we cropped all manually picked time boundaries, which are represented as a square area in the chromatographic heatmap, and used them as a training data set to build an object detection model.

The QC checker of HeapMS distinguishes between QC-passed and QC-failed chromatograms by identifying heatmap patterns. Discernible variations can be observed in the color-pixel permutations, with QC-passed cases exhibiting greater symmetry than QC-failed cases. Table 3 lists the prediction results of the four data sets. To eliminate the artifacts across samples, we performed experiments employing a 5-fold cross-validation methodology. In this approach, we randomly divided our data sets of 72,220 chromatograms into a test set comprising 20% (14,444 chromatograms) and a training set consisting of 80% (57,776 chromatograms). This was repeated five times, and the mean and standard deviation were obtained. The F1 score of the four data sets was approximately 0.8, with OSCC-3 exhibiting a high false deletion rate. Compared with the data shown in Figure 6, the results of HeapMS and Skyline contained a higher proportion of deleted chromatograms. The workflow allowed users to manually examine the predicted QC-failed chromatograms, including true and false deletion. Compared to traditional workflows, integrating HeapMS can reduce the work time by approximately 5–10 times. For experiments that allow for the possibility of chromatogram deletion, manual inspection of those marked as “Uncertain” is sufficient, resulting in a 10× increase in efficiency compared to traditional workflows. Additionally, the team can manually review chromatograms marked as “deletion” based on their workload and available resources.
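The split sizes quoted above follow from dividing 72,220 chromatograms into five equal parts. The sketch below shows one common way to generate such splits, assuming disjoint folds in which each chromatogram is tested exactly once. The text could also be read as five independent random 80/20 splits, so this is one interpretation. The function name and the fixed seed are hypothetical.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train, test) index lists for k disjoint folds after one shuffle."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    fold_size = n_items // k
    for fold in range(k):
        start = fold * fold_size
        stop = start + fold_size if fold < k - 1 else n_items
        yield order[:start] + order[stop:], order[start:stop]


folds = list(k_fold_indices(72220))
sizes = [(len(train), len(test)) for train, test in folds]
```

Metrics such as precision, recall, and F1 would then be computed on each test fold and summarized by their mean and standard deviation, as in the cross-sampling row of Table 3.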

Table 3. Performance Comparison of the QC Checker of HeapMS with the Four Test Data Sets^a,b.

data set	precision	recall	F1
CKD	0.94	0.85	0.89
OSCC-1	0.90	0.84	0.87
OSCC-2	0.91	0.86	0.88
OSCC-3	0.93	0.66	0.78
cross-sampling	0.91 ± 0.01	0.87 ± 0.03	0.89 ± 0.01

^a The number of prediction results is listed in Table S2.

^b The value after the symbol ± is the standard deviation.

Figure 6. Comparison of peak-picking results between HeapMS and Skyline with those of manual inspection. (a) The top row shows our AI prediction results for the four data sets, (b) the second row shows the Skyline results, and (c) the third row lists the predicted deletion ratios of all target peptides for the four data sets. Each cell contains a purple bar whose height represents the deletion ratio marked by the QC checker. When the purple bar fills the entire cell, the QC checker marked all chromatograms of that peptide for deletion. Conversely, when there is no purple bar in the cell, the QC checker marked no chromatograms for deletion. (d) The final row illustrates the comparison of working hours between the traditional workflow (using Skyline for autopicking and manual verification) and our proposed workflow (manual verification only of chromatograms marked as deletion and uncertain after prediction with HeapMS). The computer specifications used for the performance test were as follows: Intel(R) Core(TM) i7-7740X CPU @ 4.30 GHz, 64 GB of memory, and an Nvidia GeForce GTX 108 graphics card. GPU acceleration was also utilized during the testing.

Performance and Time Cost of HeapMS

To objectively evaluate the overall performance of HeapMS, we developed two models: Model 1 and Model 2. We developed Model 1 from OSCC-3 and used it to predict CKD, OSCC-1, and OSCC-2. Because the number of chromatograms used for training in Model 1 was lower than that in OSCC-3, we developed Model 2 by combining CKD, OSCC-1, and OSCC-2 and used this model to predict OSCC-3. Figure 6 summarizes all of the results, including the calculation time, number of chromatograms, and predicted categories, and compares them with the results of manual peak picking. Table 4 lists the definitions of six categories compared to manual inspection.

Table 4. Categories of Prediction Results.

Because Skyline can pick peaks from any input chromatogram, only three categories are labeled in its output results. Although the overall accuracy of human technicians is lower than that of Skyline, human technicians can focus only on the “Uncertain” category, which represents only approximately 2% of all input chromatograms. For laboratories that require automatically acquired MRM data, a false deletion rate of approximately 5–15% is observed. However, for laboratories that prefer the collection of a large amount of data, technicians can manually inspect all of the chromatograms labeled in the “Deletion” category. HeapMS guarantees an error rate of less than 2% and substantially reduces the human inspection workload. The comparisons of AQ scores from HeapMS and Skyline are shown in Figure S5. We present the experimental result of choosing an appropriate threshold of the object detection model in Figure S6. HeapMS yields a higher number of chromatograms within the acceptable AQ score range. Moreover, because Skyline predicts peak boundaries in all chromatograms without quality filtering, technicians must manually examine all chromatograms.

We plotted the proportion of deleted chromatograms that were marked by the QC checker for each target peptide because the proportion of true deletions is much higher in OSCC samples than in CKD samples. As shown in Figure 6, each purple cell in the third row represents a single target peptide, sorted in a descending order by the number of marked deletions. Almost all chromatograms of the first 10 peptides of OSCC-1 were marked as deleted. In the CKD samples, the cells marked as deleted are relatively lower compared to other data sets. In comparison to human results, the proportions of true and false deletions are lower than the other three data sets. The proportions of true and false deletions are high, as well. Therefore, technicians can use the HeapMS QC checker as a quality estimator for target peptides and delete those whose chromatograms are mostly marked as deleted.

Universal AI Model for Predicting Various Sample Types

Generally, the low-interference detector and AI models of HeapMS can be used on different sample types. In our test, we used the model developed by OSCC-3, a saliva sample, to perform peak picking on saliva samples from patients with oral cancer. This is possible because manual inspection uses the same criteria across sample types, and technicians use relative intensities to evaluate the quality of transitions from three heavy and light pairs. For example, the signal-to-noise ratio, the pattern of each ion, and the coelution of target ions are generally used to adjust the peak-picking results in manual inspections. These features are conserved across tissues and samples, and they are preserved by our chromatographic heatmap conversion. Based on the previous cross-tissue exchange experiments, HeapMS demonstrates robust transferability of pretrained AI models across various types of samples. The use of HeapMS in MRM workflows can dramatically reduce the time cost from months to only a few days. In addition, the output data can be reimported into Skyline or another MRM analytic tool for further examination. Thus, HeapMS can be integrated into existing MRM workflows without requiring any deliberate adjustments or modifications to incorporate the AI prediction tool.

Limitation of HeapMS and Future Work

The present study focuses on developing HeapMS, a tool specifically designed for the precise quantification of multiple reaction monitoring (MRM) data. By integrating heavy and light peptides, HeapMS enables more precise data analysis from an experimental perspective. The current version of HeapMS allows for only three transitions per chromatogram. If more than three transitions are uploaded, the pipeline selects the first three to generate the heatmap. Additionally, because our neural networks take two-dimensional images as input, only target peptides with paired light and heavy transitions can be processed by HeapMS. HeapMS has the potential to handle more than three ion transitions by converting intensities into relative percentages, summing them, and representing the total as color blocks on a heatmap. In the future, we will fine-tune the 2D heatmap transformation specifically for chromatograms with multiple transitions, aiming to enhance the applicability of HeapMS. This extension also shows potential for processing parallel reaction monitoring (PRM) data, which include multiple transitions within a single chromatogram.
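One possible reading of the proposed multi-transition encoding (relative percentages, summed, then drawn as color blocks) is sketched below. The text does not specify the normalization or the number of color levels, so everything here is an assumption for illustration: each transition is scaled to its own maximum, the summed percentages are divided by the number of transitions to stay within 0 to 100, and the result is binned into a hypothetical `n_levels` color indices.

```python
def to_relative_percent(trace):
    """Scale one transition's intensities to 0-100% of its own maximum."""
    peak = max(trace)
    if peak <= 0:
        return [0.0 for _ in trace]
    return [100.0 * v / peak for v in trace]


def summed_relative_profile(transitions):
    """Combine any number of transitions into one 0-100 profile.

    Relative percentages are summed per time point and divided by the
    number of transitions (an assumption, so that the range does not
    depend on how many transitions are supplied). All traces are
    assumed to have the same length; zip() would silently truncate
    longer traces to the shortest one.
    """
    rel = [to_relative_percent(t) for t in transitions]
    n = len(rel)
    return [sum(column) / n for column in zip(*rel)]


def to_color_blocks(profile, n_levels=4):
    """Quantize a 0-100 profile into integer color-level indices."""
    return [min(int(p / 100.0 * n_levels), n_levels - 1) for p in profile]
```

For example, four transitions that share the same shape at different absolute intensities collapse to a single profile, which is the property that would let a fixed-size heatmap row represent more than three transitions.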

Conclusions

MRM peak picking and quality assessment rely heavily on the experience of technicians. In some cases, technicians obtain varying picking results. In other cases, manual adjustments are highly similar to the original software predictions. Although automatic peak-picking tools, such as Skyline, provide reliable peak results, the output data often contain hidden errors or uncertainties. Therefore, to ensure high data quality, manual inspection of the data is required, which is a highly time-consuming process. HeapMS combines rule-based and machine learning approaches to provide flexible and accurate MRM peak picking. The input and output file formats are the same as those provided by the current software. Thus, HeapMS can be easily integrated into existing MRM workflows.

Acknowledgments

This study was supported by grants from the National Science and Technology Council, Taiwan (110-2221-E-182-048, 111-2221-E-182-056, 111-2320-B-182-030, and 112-2221-E-182-049) and Chang Gung Memorial Hospital (CORPD2J0051, CMRPD1J0343). Publication of this article was sponsored by the Chang Gung University Funding (BMRPF59 and UERPD2M0101).

Supporting Information Available

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.analchem.3c01011.

Concept of the AQ score (Figure S1); color setting experiment (Figure S2); chromatographic heatmaps (Figure S3) and examples of QC-passed and QC-failed chromatograms and their representative chromatographic heatmaps (Figure S4); AQ score comparisons of HeapMS and Skyline boundary picking results (Figure S5); AQ scores of Skyline and manually repicking results (Table S1); and QC checker results of HeapMS (Table S2) (PDF)

Author Contributions

The manuscript was written through contributions of all authors. Web site and software, Y.C.L., Y.T.P., P.H.L., J.J.G., and C.C.L.; data curation, L.J.C. and Y.C.H.; data validation, Y.C.L., L.J.C., and Y.C.H.; program investigation, Y.C.L., Y.T.P., C.H.Y., S.Y.C., and C.C.L.; resources, C.C.L., J.S.Y., and Y.S.C.; original draft preparation, C.C.L. and Y.C.H.; review and editing, Y.M.Y., L.J.C., P.J.H., C.Y., Y.C.H., P.T., J.S.Y., and Y.S.C.; supervision, C.C.L. and Y.C.H. All authors have given approval to the final version of the manuscript. The authors wrote all of the text and contents, and we hired Wallace Academic Editing (https://www.editing.tw) to proofread the English grammar. Additionally, online tools such as Grammarly.com, BingAI, and ChatGPT were used for proofreading the English grammar of some text.

The authors declare no competing financial interest.

Supplementary Material

ac3c01011_si_001.pdf (1.8 MB, PDF)

References

Paulovich A. G.; Whiteaker J. R. Quantifying the human proteome. Nat. Biotechnol. 2016, 34 (10), 1033–1034. 10.1038/nbt.3695.
Percy A. J.; Yang J.; Hardie D. B.; Chambers A. G.; Tamura-Wells J.; Borchers C. H. Precise quantitation of 136 urinary proteins by LC/MRM-MS using stable isotope labeled peptides as internal standards for biomarker discovery and/or verification studies. Methods 2015, 81, 24–33. 10.1016/j.ymeth.2015.04.001.
Percy A. J.; Hardie D. B.; Jardim A.; Yang J.; Elliott M. H.; Zhang S.; Mohammed Y.; Borchers C. H. Multiplexed panel of precisely quantified salivary proteins for biomarker assessment. Proteomics 2017, 17 (6), 1600230. 10.1002/pmic.201600230.
Percy A. J.; Chambers A. G.; Yang J.; Hardie D. B.; Borchers C. H. Advances in multiplexed MRM-based protein biomarker quantitation toward clinical utility. Biochim. Biophys. Acta 2014, 1844 (5), 917–926. 10.1016/j.bbapap.2013.06.008.
Addona T. A.; Abbatiello S. E.; Schilling B.; Skates S. J.; Mani D. R.; Bunk D. M.; Spiegelman C. H.; Zimmerman L. J.; Ham A. J.; Keshishian H.; et al. Multi-site assessment of the precision and reproducibility of multiple reaction monitoring-based measurements of proteins in plasma. Nat. Biotechnol. 2009, 27 (7), 633–641. 10.1038/nbt.1546.
MacLean B.; Tomazela D. M.; Shulman N.; Chambers M.; Finney G. L.; Frewen B.; Kern R.; Tabb D. L.; Liebler D. C.; MacCoss M. J. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 2010, 26 (7), 966–968. 10.1093/bioinformatics/btq054.
Eshghi S. T.; Auger P.; Mathews W. R. Quality assessment and interference detection in targeted mass spectrometry data using machine learning. Clin. Proteomics 2018, 15, 33. 10.1186/s12014-018-9209-x.
Melnikov A. D.; Tsentalovich Y. P.; Yanshole V. V. Deep Learning for the Precise Peak Detection in High-Resolution LC-MS Data. Anal. Chem. 2020, 92 (1), 588–592. 10.1021/acs.analchem.9b04811.
Gloaguen Y.; Kirwan J. A.; Beule D. Deep Learning-Assisted Peak Curation for Large-Scale LC-MS Metabolomics. Anal. Chem. 2022, 94 (12), 4930–4937. 10.1021/acs.analchem.1c02220.
Domingo-Almenara X.; Siuzdak G. Metabolomics Data Processing Using XCMS. In Computational Methods and Data Analysis for Metabolomics; Springer, 2020; Vol. 2104, pp 11–24.
Du X.; Smirnov A.; Pluskal T.; Jia W.; Sumner S. Metabolomics Data Preprocessing Using ADAP and MZmine 2. In Computational Methods and Data Analysis for Metabolomics; Springer, 2020; Vol. 2104, pp 25–48.
Eilertz D.; Mitterer M.; Buescher J. M. automRm: An R Package for Fully Automatic LC-QQQ-MS Data Preprocessing Powered by Machine Learning. Anal. Chem. 2022, 94 (16), 6163–6171. 10.1021/acs.analchem.1c05224.
Chi L. M.; Hsiao Y. C.; Chien K. Y.; Chen S. F.; Chuang Y. N.; Lin S. Y.; Wang W. S.; Chang I. Y.; Yang C.; Chu L. J.; et al. Assessment of candidate biomarkers in paired saliva and plasma samples from oral cancer patients by targeted mass spectrometry. J. Proteomics 2020, 211, 103571. 10.1016/j.jprot.2019.103571.
Yu J. S.; Chen Y. T.; Chiang W. F.; Hsiao Y. C.; Chu L. J.; See L. C.; Wu C. S.; Tu H. T.; Chen H. W.; Chen C. C.; et al. Saliva protein biomarkers to detect oral squamous cell carcinoma in a high-risk population in Taiwan. Proc. Natl. Acad. Sci. U.S.A. 2016, 113 (41), 11549–11554. 10.1073/pnas.1612368113.
Savitzky A.; Golay M. J. E. Smoothing and Differentiation of Data by Simplified Least Squares Procedures. Anal. Chem. 1964, 36 (8), 1627–1639. 10.1021/ac60214a047.
Dai Z.; Liu H.; Le Q. V.; Tan M. Coatnet: Marrying convolution and attention for all data sizes. Adv. Neural Inf. Process. Syst. 2021, 34, 3965–3977.
Ren S. Q.; He K. M.; Girshick R.; Sun J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39 (6), 1137–1149. 10.1109/TPAMI.2016.2577031.
Simonyan K.; Zisserman A. Very deep convolutional networks for large-scale image recognition. 2014, arXiv:1409.1556. arXiv.org e-Print archive. https://arxiv.org/abs/1409.1556.
Tan M.; Le Q. In Efficientnet: Rethinking Model Scaling for Convolutional Neural Networks, International Conference on Machine Learning; PMLR, 2019; pp 6105–6114.

