Integrating shotgun proteomics and mRNA expression data to improve protein identification

  • Smriti R. Ramakrishnan
  • Christine Vogel
  • John T. Prince
  • Zhihua Li
  • Luiz O. Penalva
  • Margaret Myers
  • Edward M. Marcotte
  • Daniel P. Miranker
  • Rong Wang

Research output: Contribution to journal › Article › peer-review

59 Scopus citations

Abstract

Motivation: Tandem mass spectrometry (MS/MS) offers fast and reliable characterization of complex protein mixtures, but suffers from low sensitivity in protein identification. In a typical shotgun proteomics experiment, it is assumed that all proteins are equally likely to be present. However, there is often other information available; for example, the probability of a protein's presence is likely to correlate with its mRNA concentration.

Results: We develop a Bayesian score that estimates the posterior probability of a protein's presence in the sample given its identification in an MS/MS experiment and its mRNA concentration measured under similar experimental conditions. Our method, MSpresso, substantially increases the number of proteins identified in an MS/MS experiment at the same error rate; for example, in yeast, MSpresso increases the number of proteins identified by ∼40%. We apply MSpresso to data from different MS/MS instruments, experimental conditions and organisms (Escherichia coli, human), and predict 19-63% more proteins across the different datasets. MSpresso demonstrates that incorporating prior knowledge of protein presence into shotgun proteomics experiments can substantially improve protein identification scores.
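The idea of replacing a flat prior with an mRNA-informed one can be illustrated with plain Bayes' rule for a binary "protein present" event. The sketch below is not the MSpresso implementation: the function name, the likelihood values, and the prior values are all hypothetical, and it assumes that the MS/MS evidence is conditionally independent of the mRNA measurement given the protein's presence or absence.

```python
def posterior_presence(prior_present, lik_present, lik_absent):
    """Posterior P(present | MS/MS evidence) via Bayes' rule.

    prior_present: P(present), e.g. informed by mRNA abundance (hypothetical).
    lik_present:   P(MS/MS evidence | present) (hypothetical).
    lik_absent:    P(MS/MS evidence | absent) (hypothetical).
    """
    numerator = lik_present * prior_present
    denominator = numerator + lik_absent * (1.0 - prior_present)
    if denominator == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / denominator


# Same weak MS/MS evidence, two different priors (all numbers are made up).
flat = posterior_presence(prior_present=0.5, lik_present=0.6, lik_absent=0.4)
informed = posterior_presence(prior_present=0.9, lik_present=0.6, lik_absent=0.4)
```

With identical MS/MS evidence, the protein with the higher prior receives the higher posterior, which is how a prior correlated with mRNA concentration could lift weakly supported but genuinely present proteins above a fixed score threshold.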

Original language: English (US)
Pages (from-to): 1397-1403
Number of pages: 7
Journal: Bioinformatics
Volume: 25
Issue number: 11
State: Published - Jun 2009

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
