A Bayesian hidden Markov model for motif discovery through joint modeling of genomic sequence and ChIP-chip data

Jonathan A.L. Gelfond, Mayetri Gupta, Joseph G. Ibrahim

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

We propose a unified framework for the analysis of chromatin (Ch) immunoprecipitation (IP) microarray (ChIP-chip) data for detecting transcription factor binding sites (TFBSs) or motifs. ChIP-chip assays are used to focus the genome-wide search for TFBSs by isolating a sample of DNA fragments with TFBSs and applying this sample to a microarray with probes corresponding to tiled segments across the genome. Present analytical methods use a two-step approach: (i) analyze array data to estimate IP-enrichment peaks, then (ii) analyze the corresponding sequences independently of intensity information. The proposed model integrates peak finding and motif discovery through a unified Bayesian hidden Markov model (HMM) framework that accommodates the inherent uncertainty in both measurements. A Markov chain Monte Carlo algorithm is formulated for parameter estimation, adapting recursive techniques used for HMMs. In simulations and applications to a yeast RAP1 dataset, the proposed method has favorable TFBS discovery performance compared to currently available two-stage procedures in terms of both sensitivity and specificity.
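
The abstract does not spell out which recursions the sampler adapts. One standard choice inside MCMC for HMMs is forward filtering followed by backward sampling, and the sketch below illustrates that idea only. It is not the authors' model: the two hidden states (background and IP-enriched), the Gaussian emissions on probe log-intensities, the function names, and all parameter values are assumptions made for illustration, and the sequence (motif) component of the joint model is left out entirely.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd^2) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def forward_filter(obs, init, trans, means, sds):
    """Scaled forward recursion for an HMM with Gaussian emissions.

    Returns the filtered probabilities P(state_t | obs_1..t) for every t
    and the log-likelihood of the full observation sequence.
    """
    n_states = len(init)
    filtered = []
    log_lik = 0.0
    prev = None
    for t, y in enumerate(obs):
        if t == 0:
            prior = list(init)
        else:
            # One-step-ahead prediction from the previous filtered distribution.
            prior = [
                sum(prev[i] * trans[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        joint = [prior[j] * normal_pdf(y, means[j], sds[j]) for j in range(n_states)]
        norm = sum(joint)
        log_lik += math.log(norm)
        prev = [p / norm for p in joint]
        filtered.append(prev)
    return filtered, log_lik


def backward_sample(filtered, trans, rng):
    """Draw one hidden-state path from P(states | obs) using the filtered probabilities."""
    n_states = len(trans)
    n_obs = len(filtered)
    path = [0] * n_obs
    path[-1] = rng.choices(range(n_states), weights=filtered[-1])[0]
    for t in range(n_obs - 2, -1, -1):
        nxt = path[t + 1]
        # P(state_t = i | state_{t+1} = nxt, obs_1..t) is proportional to
        # filtered[t][i] * trans[i][nxt].
        weights = [filtered[t][i] * trans[i][nxt] for i in range(n_states)]
        path[t] = rng.choices(range(n_states), weights=weights)[0]
    return path


# Hypothetical toy data: log-intensity ratios for 10 tiled probes.
obs = [0.1, -0.2, 0.0, 2.1, 2.4, 1.9, 0.2, -0.1, 0.3, 0.0]
init = [0.9, 0.1]                  # state 0 = background, state 1 = enriched
trans = [[0.9, 0.1], [0.2, 0.8]]   # assumed transition probabilities
means = [0.0, 2.0]                 # assumed emission means per state
sds = [0.5, 0.5]                   # assumed emission standard deviations

filtered, log_lik = forward_filter(obs, init, trans, means, sds)
path = backward_sample(filtered, trans, random.Random(0))
```

In a full sampler, a path drawn this way would be one data-augmentation step, alternated with updates of the transition and emission parameters conditional on the sampled states.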

Original language: English (US)
Pages (from-to): 1087-1095
Number of pages: 9
Journal: Biometrics
Volume: 65
Issue number: 4
State: Published - Dec 2009

Keywords

  • Data augmentation
  • Gene regulation
  • Tiling array
  • Transcription factor binding site

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
  • Agricultural and Biological Sciences (all)
  • Applied Mathematics
