Abstract
As proteomics continues to advance, the value that proteomic data adds to both cancer research and clinical studies has become widely accepted. This, combined with faster, higher-resolution mass spectrometers from multiple vendors, has led to a proliferation of large-scale proteomics projects. These projects, whether research or clinical, often require quantitation of thousands of proteins, either by label-free quantification (LFQ), by isobaric tags such as TMT, or by data-independent acquisition (DIA). PEAKS Online is a new multi-user, cluster-based, high-throughput protein sequencing software solution that runs on a shared resource, scales flexibly, and is fully parallelized, with the ability to run on any cluster or multi-cluster CPU machine. Herein we describe the use of PEAKS Online to analyze several datasets. LFQ data with match between runs is notorious for taking a significant amount of time to search. With the new build of PEAKS Online X, we are able to search more than 28 million MS2 spectra in less than a day. Similar results were obtained with TMT datasets. Furthermore, our de novo-based approach, which generates de novo sequencing tags, allows us to investigate clinical datasets further by identifying variants. Integrated seamlessly with PEAKS Online is the SPIDER algorithm, which identifies these variants by matching de novo sequencing tags to database proteins while allowing for homologous peptide mutations. We demonstrate that we can detect amino acid substitutions in patient samples that may ultimately be disease-causing or disease-progressing mutations. Finally, DIA mass spectrometry has become increasingly popular because it acquires, in parallel, all fragment ions for all precursors within a selected m/z range. To support such data, we have also incorporated spectral library generation and searching.
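The idea of matching a de novo sequencing tag to a database protein while tolerating amino acid substitutions can be illustrated with a deliberately simplified sketch. This is not the SPIDER algorithm, whose details are not given here; it is a toy sliding-window comparison that assumes tags and proteins are plain one-letter amino acid strings. The function name `find_substitution_matches`, the `max_substitutions` parameter, and the example sequences are all hypothetical.

```python
from typing import List, Tuple

# One reported difference: (position in protein, database residue, observed residue)
Substitution = Tuple[int, str, str]


def find_substitution_matches(
    tag: str, protein: str, max_substitutions: int = 1
) -> List[Tuple[int, List[Substitution]]]:
    """Slide the tag along the protein and keep windows that differ from
    the tag at no more than `max_substitutions` positions.

    Toy illustration only: no scoring, no insertions or deletions,
    and no handling of de novo sequencing errors.
    """
    hits = []
    for start in range(len(protein) - len(tag) + 1):
        window = protein[start:start + len(tag)]
        # Collect every position where the database and the tag disagree.
        diffs = [
            (start + i, db_res, obs_res)
            for i, (db_res, obs_res) in enumerate(zip(window, tag))
            if db_res != obs_res
        ]
        if len(diffs) <= max_substitutions:
            hits.append((start, diffs))
    return hits


# Hypothetical example: the tag differs from the protein by one residue (I -> V).
protein = "MKTAYIAKQRQISFVKSHFSRQ"
tag = "AKQRQVSF"
for start, diffs in find_substitution_matches(tag, protein):
    print(start, diffs)
```

Each hit reports where the tag aligns and which database residues appear to have been replaced, which is the kind of candidate substitution that a variant search would then need to score and validate against the spectra.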
