ABRF Proteome Informatics Research Group (iPRG) 2016 Study: Inferring Proteoforms from Bottom-up Proteomics Data

Joon Yong Lee, Hyungwon Choi, Christopher M. Colangelo, Darryl Davis, Michael R. Hoopmann, Lukas Käll, Henry Lam, Samuel H. Payne, Yasset Perez-Riverol, Matthew The, Ryan Wilson, Susan E Weintraub, Magnus Palmblad

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

This report presents the results from the 2016 Association of Biomolecular Resource Facilities Proteome Informatics Research Group (iPRG) study on proteoform inference and false discovery rate (FDR) estimation from bottom-up proteomics data. For this study, 3 replicate Q Exactive Orbitrap liquid chromatography-tandem mass spectrometry datasets were generated from each of 4 Escherichia coli samples spiked with different equimolar mixtures of small recombinant proteins selected to mimic pairs of homologous proteins. Participants were given raw data and a sequence file and asked to identify the proteins and provide estimates on the FDR at the proteoform level. As part of this study, we tested a new submission system with a format validator running on a virtual private server (VPS) and allowed methods to be provided as executable R Markdown or IPython Notebooks. The task was perceived as difficult, and only eight unique submissions were received, although those who participated did well with no one method performing best on all samples. However, none of the submissions included a complete Markdown or Notebook, even though examples were provided. Future iPRG studies need to be more successful in promoting and encouraging participation. The VPS and submission validator easily scale to much larger numbers of participants in these types of studies. The unique "ground-truth" dataset for proteoform identification generated for this study is now available to the research community, as are the server-side scripts for validating and managing submissions.

Original language: English (US)
Pages (from-to): 39-45
Number of pages: 7
Journal: Journal of biomolecular techniques : JBT
Volume: 29
Issue number: 2
DOI: https://doi.org/10.7171/jbt.18-2902-003
State: Published - Jul 1 2018

Fingerprint

Informatics
Proteome
Proteomics
Research
Recombinant Proteins
Liquid Chromatography
Mass Spectrometry
Proteins
Escherichia coli
Datasets

Keywords

  • best practice
  • community study
  • false discovery rate
  • inference

ASJC Scopus subject areas

  • Molecular Biology

Cite this

ABRF Proteome Informatics Research Group (iPRG) 2016 Study : Inferring Proteoforms from Bottom-up Proteomics Data. / Lee, Joon Yong; Choi, Hyungwon; Colangelo, Christopher M.; Davis, Darryl; Hoopmann, Michael R.; Käll, Lukas; Lam, Henry; Payne, Samuel H.; Perez-Riverol, Yasset; The, Matthew; Wilson, Ryan; Weintraub, Susan E; Palmblad, Magnus.

In: Journal of biomolecular techniques : JBT, Vol. 29, No. 2, 01.07.2018, p. 39-45.

Lee, JY, Choi, H, Colangelo, CM, Davis, D, Hoopmann, MR, Käll, L, Lam, H, Payne, SH, Perez-Riverol, Y, The, M, Wilson, R, Weintraub, SE & Palmblad, M 2018, 'ABRF Proteome Informatics Research Group (iPRG) 2016 Study: Inferring Proteoforms from Bottom-up Proteomics Data', Journal of biomolecular techniques : JBT, vol. 29, no. 2, pp. 39-45. https://doi.org/10.7171/jbt.18-2902-003
Lee, Joon Yong ; Choi, Hyungwon ; Colangelo, Christopher M. ; Davis, Darryl ; Hoopmann, Michael R. ; Käll, Lukas ; Lam, Henry ; Payne, Samuel H. ; Perez-Riverol, Yasset ; The, Matthew ; Wilson, Ryan ; Weintraub, Susan E ; Palmblad, Magnus. / ABRF Proteome Informatics Research Group (iPRG) 2016 Study : Inferring Proteoforms from Bottom-up Proteomics Data. In: Journal of biomolecular techniques : JBT. 2018 ; Vol. 29, No. 2. pp. 39-45.
@article{325150b2a4e4491d92726121c2e3b48b,
title = "ABRF Proteome Informatics Research Group (iPRG) 2016 Study: Inferring Proteoforms from Bottom-up Proteomics Data",
abstract = "This report presents the results from the 2016 Association of Biomolecular Resource Facilities Proteome Informatics Research Group (iPRG) study on proteoform inference and false discovery rate (FDR) estimation from bottom-up proteomics data. For this study, 3 replicate Q Exactive Orbitrap liquid chromatography-tandem mass spectrometry datasets were generated from each of 4 Escherichia coli samples spiked with different equimolar mixtures of small recombinant proteins selected to mimic pairs of homologous proteins. Participants were given raw data and a sequence file and asked to identify the proteins and provide estimates on the FDR at the proteoform level. As part of this study, we tested a new submission system with a format validator running on a virtual private server (VPS) and allowed methods to be provided as executable R Markdown or IPython Notebooks. The task was perceived as difficult, and only eight unique submissions were received, although those who participated did well with no one method performing best on all samples. However, none of the submissions included a complete Markdown or Notebook, even though examples were provided. Future iPRG studies need to be more successful in promoting and encouraging participation. The VPS and submission validator easily scale to much larger numbers of participants in these types of studies. The unique {"}ground-truth{"} dataset for proteoform identification generated for this study is now available to the research community, as are the server-side scripts for validating and managing submissions.",
keywords = "best practice, community study, false discovery rate, inference",
author = "Lee, {Joon Yong} and Hyungwon Choi and Colangelo, {Christopher M.} and Darryl Davis and Hoopmann, {Michael R.} and Lukas K{\"a}ll and Henry Lam and Payne, {Samuel H.} and Yasset Perez-Riverol and Matthew The and Ryan Wilson and Weintraub, {Susan E} and Magnus Palmblad",
year = "2018",
month = "7",
day = "1",
doi = "10.7171/jbt.18-2902-003",
language = "English (US)",
volume = "29",
pages = "39--45",
journal = "Journal of Biomolecular Techniques",
issn = "1524-0215",
publisher = "Association of Biomolecular Resource Facilities",
number = "2",

}

TY  - JOUR
T1  - ABRF Proteome Informatics Research Group (iPRG) 2016 Study
T2  - Inferring Proteoforms from Bottom-up Proteomics Data
AU  - Lee, Joon Yong
AU  - Choi, Hyungwon
AU  - Colangelo, Christopher M.
AU  - Davis, Darryl
AU  - Hoopmann, Michael R.
AU  - Käll, Lukas
AU  - Lam, Henry
AU  - Payne, Samuel H.
AU  - Perez-Riverol, Yasset
AU  - The, Matthew
AU  - Wilson, Ryan
AU  - Weintraub, Susan E
AU  - Palmblad, Magnus
PY  - 2018/7/1
Y1  - 2018/7/1
N2  - This report presents the results from the 2016 Association of Biomolecular Resource Facilities Proteome Informatics Research Group (iPRG) study on proteoform inference and false discovery rate (FDR) estimation from bottom-up proteomics data. For this study, 3 replicate Q Exactive Orbitrap liquid chromatography-tandem mass spectrometry datasets were generated from each of 4 Escherichia coli samples spiked with different equimolar mixtures of small recombinant proteins selected to mimic pairs of homologous proteins. Participants were given raw data and a sequence file and asked to identify the proteins and provide estimates on the FDR at the proteoform level. As part of this study, we tested a new submission system with a format validator running on a virtual private server (VPS) and allowed methods to be provided as executable R Markdown or IPython Notebooks. The task was perceived as difficult, and only eight unique submissions were received, although those who participated did well with no one method performing best on all samples. However, none of the submissions included a complete Markdown or Notebook, even though examples were provided. Future iPRG studies need to be more successful in promoting and encouraging participation. The VPS and submission validator easily scale to much larger numbers of participants in these types of studies. The unique "ground-truth" dataset for proteoform identification generated for this study is now available to the research community, as are the server-side scripts for validating and managing submissions.
AB  - This report presents the results from the 2016 Association of Biomolecular Resource Facilities Proteome Informatics Research Group (iPRG) study on proteoform inference and false discovery rate (FDR) estimation from bottom-up proteomics data. For this study, 3 replicate Q Exactive Orbitrap liquid chromatography-tandem mass spectrometry datasets were generated from each of 4 Escherichia coli samples spiked with different equimolar mixtures of small recombinant proteins selected to mimic pairs of homologous proteins. Participants were given raw data and a sequence file and asked to identify the proteins and provide estimates on the FDR at the proteoform level. As part of this study, we tested a new submission system with a format validator running on a virtual private server (VPS) and allowed methods to be provided as executable R Markdown or IPython Notebooks. The task was perceived as difficult, and only eight unique submissions were received, although those who participated did well with no one method performing best on all samples. However, none of the submissions included a complete Markdown or Notebook, even though examples were provided. Future iPRG studies need to be more successful in promoting and encouraging participation. The VPS and submission validator easily scale to much larger numbers of participants in these types of studies. The unique "ground-truth" dataset for proteoform identification generated for this study is now available to the research community, as are the server-side scripts for validating and managing submissions.
KW  - best practice
KW  - community study
KW  - false discovery rate
KW  - inference
UR  - http://www.scopus.com/inward/record.url?scp=85059915162&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85059915162&partnerID=8YFLogxK
U2  - 10.7171/jbt.18-2902-003
DO  - 10.7171/jbt.18-2902-003
M3  - Article
C2  - 29977167
AN  - SCOPUS:85059915162
VL  - 29
SP  - 39
EP  - 45
JO  - Journal of Biomolecular Techniques
JF  - Journal of Biomolecular Techniques
SN  - 1524-0215
IS  - 2
ER  - 