ProteomicsBrowser: MS/proteomics data visualization and investigation

Gang Peng; Rashaun Wilson; Yishuo Tang; TuKiet T Lam; Angus C Nairn; Kenneth Williams; Hongyu Zhao

doi:10.1093/bioinformatics/bty958

Bioinformatics. 2018 Nov 21;35(13):2313–2314. doi: 10.1093/bioinformatics/bty958

Editor: John Hancock

PMCID: PMC6596887 PMID: 30462190

Abstract

Summary

Large-scale, quantitative proteomics data are being generated at ever-increasing rates by high-throughput mass spectrometry technologies. However, due to the complexity of these large datasets and the increasing number of post-translational modifications (PTMs) being identified, developing effective methods for proteomic visualization has been challenging. ProteomicsBrowser was designed to meet this need for comprehensive data visualization. Using peptide information files exported from mass spectrometry search engines or quantitative tools as input, the peptide sequences are aligned to an internal protein database such as UniProtKB. Each identified peptide ion, including those with PTMs, is then visualized along the parent protein in the Browser. A unique property of ProteomicsBrowser is the ability to combine overlapping peptides in different ways to focus the analysis on sequence coverage, charge state or PTMs. ProteomicsBrowser includes other useful functions, such as a data filtering tool and basic statistical analyses to qualify quantitative data.

Availability and implementation

ProteomicsBrowser is implemented in Java 8 and is available at https://medicine.yale.edu/keck/nida/proteomicsbrowser.aspx and https://github.com/peng-gang/ProteomicsBrowser.

Supplementary information

Supplementary data are available at Bioinformatics online.

1 Introduction

Rapid advances in high-throughput MS instrumentation have driven exponential growth in large-scale, quantitative proteomics studies over the past decade (Efstathiou et al., 2017). Demand for high-performance data processing and proteomics analysis software is greater than ever, as many more research groups can now generate complex proteomics datasets on a daily basis. As noted by Gatto et al. (2015), visualization plays an essential role in high-throughput biology, so software used to interpret MS/MS proteomics data must be able to display key findings both quantitatively and visually. Despite enormous advances in LC-MS/MS instrumentation and in the automation of data acquisition, software solutions for ‘bottom-up’ proteomics data processing, analysis and visualization have lagged behind (Avtonomov et al., 2016; Efstathiou et al., 2017). In particular, overlapping peptides resulting from partial proteolytic cleavage, multiple charge states and sub-stoichiometric post-translational modifications (PTMs) can easily result in >20 peptides covering a PTM site of interest, so it can be helpful to combine overlapping peptides based on their sequence, charge state or PTMs. Since a review of ∼30 programs (Supplementary Table S1) for the analysis and visualization of MS/proteomics datasets did not identify any with these capabilities, we designed ProteomicsBrowser to meet these needs.

2 Materials and methods

ProteomicsBrowser is written in Java 8 (https://www.java.com/) and runs on multiple platforms, including Windows, macOS and Linux. As input, ProteomicsBrowser takes comma-separated values (CSV) files containing peptide information exported from a mass spectrometry search engine, or quantitative peptide output from third-party vendors. ProteomicsBrowser then searches its internal database to locate the position of each peptide ion within its parent protein. The internal database stores protein sequences for different organisms in FASTA format and can be updated by downloading protein information from UniProt (http://www.uniprot.org).
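The peptide-to-protein lookup described above can be illustrated with a minimal Python sketch. This is not the ProteomicsBrowser implementation (which is written in Java); it assumes a plain exact-substring search, and the names `parse_fasta`, `locate_peptide` and the `P_EXAMPLE` record are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {record_id: sequence} mapping."""
    proteins: Dict[str, str] = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the first whitespace-delimited token is the record ID.
            current_id = line[1:].split()[0]
            proteins[current_id] = ""
        elif current_id is not None:
            proteins[current_id] += line
    return proteins


def locate_peptide(peptide: str, protein_seq: str) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of every exact match."""
    if not peptide:
        return []
    hits: List[Tuple[int, int]] = []
    pos = protein_seq.find(peptide)
    while pos != -1:
        hits.append((pos + 1, pos + len(peptide)))
        # Advance by one residue so overlapping occurrences are also found.
        pos = protein_seq.find(peptide, pos + 1)
    return hits


# Hypothetical two-line FASTA record used only for this example.
FASTA = """>P_EXAMPLE hypothetical protein
MKTAYIAKQR
QISFVKSHFS
"""

proteins = parse_fasta(FASTA)
positions = locate_peptide("AYIAK", proteins["P_EXAMPLE"])
print(positions)  # [(4, 8)]
```

Reporting 1-based inclusive coordinates matches the residue numbering conventionally used for protein sequences; a real tool would also need to handle peptides that match more than one protein.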

3 Results

As described in the User Guide in Supplementary Material and Supplementary Figure S1, after importing LC-MS/MS data from a study that typically contains multiple ‘control’ versus ‘experimental’ samples, ProteomicsBrowser organizes the project into two sections: Data and Browser. The Data section allows users to view the peptide and protein data in a table and to perform some statistical analyses. The key function of ProteomicsBrowser is its Browser, which was designed to help users visualize the peptides in a protein of interest in any one of the samples in the project being analyzed. As shown in Figure 1, the Browser depicts each peptide ion as a horizontal gray bar (box) aligned above the corresponding sequence of the parent protein, with overlapping peptides stacked vertically. The intensity of the gray color of each bar indicates the relative peptide abundance, and color-coded vertical lines in the bars indicate the positions and types of PTMs that were identified in the selected protein. A particularly novel feature of the Browser is its ability to simplify the depiction of related peptides by combining peptide ions (Fig. 1) based on their sequence, charge and/or specific PTMs. In addition, the unique ‘Quantify PTM’ option can be used to rapidly determine the extent of PTM at a selected residue in a protein of interest by combining all of the overlapping peptide ions that contain the PTM of interest into one group and all of the overlapping peptide ions that do not contain that PTM into a second group. The integrated peak areas of the peptides with the selected PTM are summed and depicted as a single peptide box. Similarly, the overlapping peptides without the PTM of interest are combined into a different peptide box. The extent of modification can then be calculated as the ratio of the summed integrated peak areas of the combined PTM peptides to the summed areas of the combined PTM + non-PTM peptides.
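The ‘Quantify PTM’ ratio described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the tool's actual code: the `PeptideIon` record, the `ptm_extent` function and the example peak areas are all hypothetical, and the sketch assumes each peptide ion carries its protein coordinates, an integrated peak area and a set of (position, PTM name) annotations.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class PeptideIon:
    """Hypothetical record for one quantified peptide ion."""
    start: int                              # 1-based start in the parent protein
    end: int                                # 1-based inclusive end
    area: float                             # integrated peak area
    ptms: FrozenSet[Tuple[int, str]] = frozenset()  # (protein position, PTM name)


def ptm_extent(ions: Iterable[PeptideIon], site: int, ptm: str) -> float:
    """Fraction of signal at `site` that carries `ptm`.

    extent = sum(areas with PTM) / (sum(areas with PTM) + sum(areas without PTM)),
    counting only peptide ions that cover the site.
    """
    modified = 0.0
    unmodified = 0.0
    for ion in ions:
        if not (ion.start <= site <= ion.end):
            continue  # this peptide does not cover the residue of interest
        if (site, ptm) in ion.ptms:
            modified += ion.area
        else:
            unmodified += ion.area
    total = modified + unmodified
    if total == 0:
        raise ValueError("no quantified peptide ions cover the selected site")
    return modified / total


# Made-up example data: two modified and one unmodified ion cover residue 15.
ions = [
    PeptideIon(10, 20, 300.0, frozenset({(15, "Phospho")})),
    PeptideIon(12, 22, 100.0, frozenset({(15, "Phospho")})),
    PeptideIon(10, 20, 600.0),
    PeptideIon(30, 40, 1000.0),  # does not cover residue 15, so it is ignored
]
extent = ptm_extent(ions, site=15, ptm="Phospho")
print(extent)  # 0.4
```

Peptide ions that do not cover the selected residue are excluded from both sums, since they carry no information about its modification state.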
Another unique feature of ProteomicsBrowser is its ability to export a text file or figure that shows the amino acid frequencies around a PTM of interest, similar to the BlockLogo tool (Olsen et al., 2013). All of the results and Browser views can be exported to text files or figures to facilitate publication.
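One way to tabulate amino acid frequencies around a modified residue is sketched below in Python. The source does not describe how ProteomicsBrowser computes these frequencies, so this is only an assumed approach: it counts residues at each offset within a fixed flanking window around every modified site. The function name `residue_frequencies`, the `flank` parameter and the example sequences are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def residue_frequencies(
    sites: List[Tuple[str, int]], flank: int
) -> Dict[int, Dict[str, float]]:
    """Per-offset residue frequencies around modified sites.

    `sites` holds (protein_sequence, 1-based site position) pairs.
    Offsets run from -flank to +flank, with 0 being the modified residue.
    Offsets that fall outside a sequence are skipped for that site.
    """
    counts: Dict[int, Counter] = {
        offset: Counter() for offset in range(-flank, flank + 1)
    }
    for sequence, site in sites:
        for offset in range(-flank, flank + 1):
            index = site - 1 + offset  # convert 1-based site to 0-based index
            if 0 <= index < len(sequence):
                counts[offset][sequence[index]] += 1

    freqs: Dict[int, Dict[str, float]] = {}
    for offset, counter in counts.items():
        total = sum(counter.values())
        freqs[offset] = (
            {aa: n / total for aa, n in counter.items()} if total else {}
        )
    return freqs


# Made-up sequences, each with a modified residue at position 4.
example_sites = [("AAKSPEL", 4), ("GGRSPQL", 4), ("TTKTPEL", 4)]
freqs = residue_frequencies(example_sites, flank=1)
print(freqs[1])  # {'P': 1.0}
```

A table of this form (offset by residue) is the kind of input from which a sequence-logo-style figure could be drawn.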

Fig. 1. GUI of ProteomicsBrowser, showing the peptide ions identified in the ACTG protein from the Disease-1 sample. The left sidebar allows the user to customize and control the analysis procedure, including using the Selection Options to choose a particular sample and a particular protein in that sample, and changing the scale of the visualization with the Zoom control. The center panel presents the overall visualization of the alignment of the identified and quantified peptides. As shown, with the Zoom control near the mid position, it is possible to visualize residues 150–313, as designated by the labels ‘Start 150’ on the left and ‘End 313’ on the right.

4 Discussion

ProteomicsBrowser can clearly and easily visualize the positions and relative abundances of all of the identified and quantified peptides and their PTMs from a selected protein. Its simplified depiction of peptides based on sequence, charge and/or PTM should greatly aid the analysis and interpretation of complex LC-MS/MS datasets. In particular, ProteomicsBrowser will leverage recent advances in LC-MS/MS instrumentation by facilitating the optimal selection of proteotypic peptide ions for designing targeted proteomics analyses such as Parallel Reaction Monitoring (PRM) (Rauniyar, 2015), and by determining the overall extent of modification of individual PTMs. These capabilities are augmented by the inclusion in ProteomicsBrowser of selected statistical analysis tools and a wide range of options for filtering the data. We will continue to improve ProteomicsBrowser and to add new features based on user feedback, and will post the latest version at https://medicine.yale.edu/keck/nida/proteomicsbrowser.aspx.

Supplementary Material

bty958_Supplementary_Data


Acknowledgements

We thank Drs Melissa Monsey and Jane Taylor for use of their data. ProteomicsBrowser is copyrighted by Yale University.

Funding

This work was supported by National Institutes of Health grant [P30 DA018343]. The LC-MS/MS data utilized for testing were collected on a National Institutes of Health SIG-supported mass spectrometer [1S10ODOD018034].

Conflict of Interest: none declared.

References

Avtonomov D.M. et al. (2016) BatMass: a Java software platform for LC-MS data visualization in proteomics and metabolomics. J. Proteome Res., 15, 2500–2509.
Efstathiou G. et al. (2017) ProteoSign: an end-user online differential proteomics statistical analysis platform. Nucleic Acids Res., 45, W300–W306.
Gatto L. et al. (2015) Visualization of proteomics data using R and Bioconductor. Proteomics, 15, 1375–1389.
Olsen L.R. et al. (2013) BlockLogo: visualization of peptide and sequence motif conservation. J. Immunol. Methods, 400–401, 37–44.
Rauniyar N. (2015) Parallel reaction monitoring: a targeted experiment performed using high resolution and high mass accuracy mass spectrometry. Int. J. Mol. Sci., 16, 28566–28581.

