TY - JOUR
T1 - The proBAM and proBed standard formats
T2 - Enabling a seamless integration of genomics and proteomics data
AU - Menschaert, Gerben
AU - Wang, Xiaojing
AU - Jones, Andrew R.
AU - Ghali, Fawaz
AU - Fenyö, David
AU - Olexiouk, Volodimir
AU - Zhang, Bing
AU - Deutsch, Eric W.
AU - Ternent, Tobias
AU - Vizcaíno, Juan Antonio
N1 - Publisher Copyright: © 2018 The Author(s).
PY - 2018/1/31
Y1 - 2018/1/31
N2 - On behalf of The Human Proteome Organization (HUPO) Proteomics Standards Initiative, we introduce here two novel standard data formats, proBAM and proBed, that have been developed to address the current challenges of integrating mass spectrometry-based proteomics data with genomics and transcriptomics information in proteogenomics studies. proBAM and proBed are adaptations of the well-defined, widely used file formats SAM/BAM and BED, respectively, and both have been extended to meet the specific requirements entailed by proteomics data. Therefore, existing popular genomics tools such as SAMtools and Bedtools, and several widely used genome browsers, can already be used to manipulate and visualize these formats "out-of-the-box." We also highlight that a number of specific additional software tools, properly supporting the proteomics information available in these formats, are now available providing functionalities such as file generation, file conversion, and data analysis. All the related documentation, including the detailed file format specifications and example files, are accessible at http://www.psidev.info/probam and at http://www.psidev.info/probed.
AB - On behalf of The Human Proteome Organization (HUPO) Proteomics Standards Initiative, we introduce here two novel standard data formats, proBAM and proBed, that have been developed to address the current challenges of integrating mass spectrometry-based proteomics data with genomics and transcriptomics information in proteogenomics studies. proBAM and proBed are adaptations of the well-defined, widely used file formats SAM/BAM and BED, respectively, and both have been extended to meet the specific requirements entailed by proteomics data. Therefore, existing popular genomics tools such as SAMtools and Bedtools, and several widely used genome browsers, can already be used to manipulate and visualize these formats "out-of-the-box." We also highlight that a number of specific additional software tools, properly supporting the proteomics information available in these formats, are now available providing functionalities such as file generation, file conversion, and data analysis. All the related documentation, including the detailed file format specifications and example files, are accessible at http://www.psidev.info/probam and at http://www.psidev.info/probed.
UR - http://www.scopus.com/inward/record.url?scp=85041482719&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85041482719&partnerID=8YFLogxK
U2 - 10.1186/s13059-017-1377-x
DO - 10.1186/s13059-017-1377-x
M3 - Letter
C2 - 29386051
AN - SCOPUS:85041482719
SN - 1474-7596
VL - 19
JO - Genome Biology
JF - Genome Biology
IS - 1
M1 - 12
ER -