Abstract
Data-independent mass spectrometry activates all ion species isolated within a given mass-to-charge (m/z) window, regardless of their abundance. This acquisition strategy overcomes the limitations of traditional data-dependent ion selection, boosting data reproducibility and sensitivity. However, several tandem mass (MS/MS) spectra of the same precursor ion are acquired during chromatographic elution, resulting in large data redundancy. In addition, the significant number of chimeric spectra and the absence of accurate precursor ion masses hamper peptide identification. Here, we describe an algorithm that preprocesses data-independent MS/MS spectra by filtering out noise peaks and clustering the spectra according to both their chromatographic elution profiles and their spectral similarity. We also developed an approach to estimate the m/z value of precursor ions from clustered MS/MS spectra in order to improve database search performance. Data acquired using a small precursor mass window of 3 m/z units and multiple injections to cover an m/z range of 400–1400 were processed with our algorithm. Processing improved the number of both peptide and protein identifications by 8 % while reducing the number of submitted spectra by 18 % and the number of peaks by 55 %. We conclude that our clustering method is a valid approach for the analysis of these data-independent fragmentation spectra. The software, including the source code, is available to the scientific community.
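The preprocessing idea summarized above (remove noise peaks, then group MS/MS spectra that are both close in elution time and similar in fragment content) can be illustrated with a minimal sketch. This is not the authors' implementation: the relative-intensity noise filter, the fixed-width m/z binning, the cosine similarity measure, the scan-gap criterion used as a stand-in for elution-profile matching, and all names and thresholds (`filter_noise`, `cluster`, `rel_threshold`, `min_sim`, `max_scan_gap`) are assumptions chosen for illustration.

```python
from collections import defaultdict
from math import sqrt


def filter_noise(peaks, rel_threshold=0.05):
    """Keep peaks at or above rel_threshold times the base-peak intensity (assumed rule)."""
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    return [(mz, i) for mz, i in peaks if i >= rel_threshold * base]


def binned(peaks, bin_width=1.0):
    """Sum intensities into fixed-width m/z bins (assumed representation)."""
    vec = defaultdict(float)
    for mz, intensity in peaks:
        vec[int(mz / bin_width)] += intensity
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse binned spectra."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(spectra, min_sim=0.7, max_scan_gap=2):
    """Single-linkage grouping of (scan_index, peaks) pairs from one isolation window.

    Two spectra are linked when their scans are at most max_scan_gap apart
    (a crude proxy for a shared elution profile) and their cosine similarity
    is at least min_sim. Returns a sorted list of scan-index lists.
    """
    parent = list(range(len(spectra)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    vectors = [binned(filter_noise(peaks)) for _, peaks in spectra]
    for i in range(len(spectra)):
        for j in range(i + 1, len(spectra)):
            close = abs(spectra[i][0] - spectra[j][0]) <= max_scan_gap
            if close and cosine(vectors[i], vectors[j]) >= min_sim:
                parent[find(j)] = find(i)

    groups = defaultdict(list)
    for idx, (scan, _) in enumerate(spectra):
        groups[find(idx)].append(scan)
    return sorted(groups.values())


# Hypothetical spectra: scans 1 and 2 share fragments, scan 3 does not,
# and scan 10 matches scan 1 but elutes too far away to be linked.
example = [
    (1, [(200.1, 100.0), (300.2, 50.0), (150.0, 1.0)]),
    (2, [(200.2, 90.0), (300.1, 60.0)]),
    (3, [(500.3, 80.0), (610.4, 40.0)]),
    (10, [(200.1, 100.0), (300.2, 50.0)]),
]
clusters = cluster(example)
```

In this toy input the similar, adjacent scans 1 and 2 merge into one cluster, which is how clustering reduces the number of spectra submitted to a database search. Fixed-width binning can split nearly identical peaks that fall on either side of a bin boundary, so a real implementation would likely need a tolerance-based peak match.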
Acknowledgment
The authors thank the Swiss National Science Foundation (SNSF), grant 315230_130830, for support of this work. The authors declare no conflict of interest.
Pak, H., Nikitin, F., Gluck, F. et al. Clustering and Filtering Tandem Mass Spectra Acquired in Data-Independent Mode. J. Am. Soc. Mass Spectrom. 24, 1862–1871 (2013). https://doi.org/10.1007/s13361-013-0720-z
DOI: https://doi.org/10.1007/s13361-013-0720-z