Background: Methylated RNA Immunoprecipatation combined with RNA sequencing (MeRIP-seq) is revolutionizing the de novo study of RNA epigenomics at a higher resolution. However, this new technology poses unique bioinformatics problems that call for novel and sophisticated statistical computational solutions, aiming at identifying and characterizing transcriptome-wide methyltranscriptome. Results: We developed HEP, a Hidden Markov Model (HMM)-based Exome Peak-finding algorithm for predicting transcriptome methylation sites using MeRIP-seq data. In contrast to exomePeak, our previously developed MeRIP-seq peak calling algorithm, HEPeak models the correlation between continuous bins in an m6A peak region and it is a model-based approach, which admits rigorous statistical inference. HEPeak was evaluated on a simulated MeRIP-seq dataset and achieved higher sensitivity and specificity than exomePeak. HEPeak was also applied to real MeRIP-seq datasets from human HEK293T cell line and mouse midbrain cells and was shown to be able to recapitulate known m6A distribution in transcripts and identify novel m6A sites in long non-coding RNAs. Conclusions: In this paper, a novel HMM-based peak calling algorithm, HEPeak, was developed for peak calling for MeRIP-seq data. HEPeak is written in R and is publicly available.
ASJC Scopus subject areas