Comparing multiple ChIP-sequencing experiments.

Hatice Gulcin Ozer, Yi Wen Huang, Jiejun Wu, Jeffrey D. Parvin, Hui-ming Huang, Kun Huang

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

New high-throughput sequencing technologies can generate millions of short sequences in a single experiment. As data sizes grow, comparing multiple experiments on different cell lines under different experimental conditions becomes a major challenge. In this paper, we investigate ways to compare multiple ChIP-sequencing experiments. We specifically study epigenetic regulation of breast cancer and the effect of estrogen using 50 ChIP-sequencing data sets from the Illumina Genome Analyzer II. First, we evaluate the correlation among different experiments, focusing on the total number of reads in transcribed and promoter regions of the genome. Then, we adopt the method used to identify the most stable genes in RT-PCR experiments to understand the background signal across all of the experiments and to identify the most variable transcribed and promoter regions of the genome. We observe that the most variable genes for transcribed regions and for promoter regions are very distinct. Gene ontology and function enrichment analyses of these most variable genes demonstrate the biological relevance of the results. In this study, we present a method that can effectively select differential regions of the genome based on protein-binding profiles over multiple experiments, using real data points without any normalization among the samples.
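
The two analysis steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the RT-PCR stability method, so a geNorm-style pairwise-variation score (often called the M value) is assumed here. The function names, the pseudocount, and the toy count matrix are all hypothetical. The input is assumed to be a matrix of read counts with one row per region (transcribed or promoter) and one column per experiment.

```python
import numpy as np


def pairwise_correlation(counts):
    """Pearson correlation between experiments.

    counts: array of shape (regions, experiments) holding read counts.
    Returns an (experiments, experiments) correlation matrix.
    """
    return np.corrcoef(np.asarray(counts, dtype=float), rowvar=False)


def stability_scores(counts, pseudocount=1.0):
    """geNorm-style stability score per region (assumed method, see lead-in).

    For regions j and k, V[j, k] is the standard deviation across experiments
    of log2(count_j / count_k). The score of region j is the mean of V[j, k]
    over all other regions k. Low scores indicate stable (background-like)
    regions; high scores indicate variable regions.

    Note: this builds a regions x regions x experiments array, so it is only
    practical for small inputs as written.
    """
    logc = np.log2(np.asarray(counts, dtype=float) + pseudocount)
    # ratios[j, k, e] = log2 ratio of region j to region k in experiment e
    ratios = logc[:, None, :] - logc[None, :, :]
    v = ratios.std(axis=2, ddof=1)
    n_regions = v.shape[0]
    # The diagonal of v is zero, so the row sum covers only k != j.
    return v.sum(axis=1) / (n_regions - 1)


if __name__ == "__main__":
    # Toy data: 4 regions x 5 experiments with different sequencing depths.
    base = np.array([100.0, 200.0, 400.0, 50.0])
    depth = np.array([1.0, 2.0, 0.5, 3.0, 1.5])
    counts = np.outer(base, depth)
    # Make the last region vary independently of sequencing depth.
    counts[3] *= np.array([1.0, 4.0, 0.25, 8.0, 0.5])

    print(pairwise_correlation(counts).round(2))
    scores = stability_scores(counts)
    # Regions ordered from most variable to most stable.
    print(np.argsort(scores)[::-1])
```

Because the score is built from log ratios taken within each experiment, multiplying all counts of one experiment by a constant (for example, a different sequencing depth) leaves it unchanged when no pseudocount is applied. That property is one way a ranking of variable regions could work without normalization among the samples.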

Original language: English (US)
Pages (from-to): 269-282
Number of pages: 14
Journal: Journal of Bioinformatics and Computational Biology
Volume: 9
Issue number: 2
DOI: 10.1142/S0219720011005483
State: Published - Apr 2011
Externally published: Yes

Fingerprint

Genes
Genetic Promoter Regions
Genome
Experiments
Gene Ontology
Protein Binding
Epigenomics
Estrogens
Breast Neoplasms
Technology
Cell Line
Polymerase Chain Reaction
Ontology
Cells
Throughput

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications

Cite this

Comparing multiple ChIP-sequencing experiments. / Ozer, Hatice Gulcin; Huang, Yi Wen; Wu, Jiejun; Parvin, Jeffrey D.; Huang, Hui-ming; Huang, Kun.

In: Journal of Bioinformatics and Computational Biology, Vol. 9, No. 2, 04.2011, p. 269-282.

@article{970665f441e049e3b007264596f67824,
title = "Comparing multiple ChIP-sequencing experiments.",
abstract = "New high-throughput sequencing technologies can generate millions of short sequences in a single experiment. As the size of the data increases, comparison of multiple experiments on different cell lines under different experimental conditions becomes a big challenge. In this paper, we investigate ways to compare multiple ChIP-sequencing experiments. We specifically studied epigenetic regulation of breast cancer and the effect of estrogen using 50 ChIP-sequencing data from Illumina Genome Analyzer II. First, we evaluate the correlation among different experiments focusing on the total number of reads in transcribed and promoter regions of the genome. Then, we adopt the method that is used to identify the most stable genes in RT-PCR experiments to understand background signal across all of the experiments and to identify the most variable transcribed and promoter regions of the genome. We observed that the most variable genes for transcribed regions and promoter regions are very distinct. Gene ontology and function enrichment analysis on these most variable genes demonstrate the biological relevance of the results. In this study, we present a method that can effectively select differential regions of the genome based on protein-binding profiles over multiple experiments using real data points without any normalization among the samples.",
author = "Ozer, {Hatice Gulcin} and Huang, {Yi Wen} and Wu, Jiejun and Parvin, {Jeffrey D.} and Huang, Hui-ming and Huang, Kun",
year = "2011",
month = "4",
doi = "10.1142/S0219720011005483",
language = "English (US)",
volume = "9",
pages = "269--282",
journal = "Journal of Bioinformatics and Computational Biology",
issn = "0219-7200",
publisher = "World Scientific Publishing Co. Pte Ltd",
number = "2",
}

TY - JOUR

T1 - Comparing multiple ChIP-sequencing experiments.

AU - Ozer, Hatice Gulcin

AU - Huang, Yi Wen

AU - Wu, Jiejun

AU - Parvin, Jeffrey D.

AU - Huang, Hui-ming

AU - Huang, Kun

PY - 2011/4

Y1 - 2011/4

N2 - New high-throughput sequencing technologies can generate millions of short sequences in a single experiment. As the size of the data increases, comparison of multiple experiments on different cell lines under different experimental conditions becomes a big challenge. In this paper, we investigate ways to compare multiple ChIP-sequencing experiments. We specifically studied epigenetic regulation of breast cancer and the effect of estrogen using 50 ChIP-sequencing data from Illumina Genome Analyzer II. First, we evaluate the correlation among different experiments focusing on the total number of reads in transcribed and promoter regions of the genome. Then, we adopt the method that is used to identify the most stable genes in RT-PCR experiments to understand background signal across all of the experiments and to identify the most variable transcribed and promoter regions of the genome. We observed that the most variable genes for transcribed regions and promoter regions are very distinct. Gene ontology and function enrichment analysis on these most variable genes demonstrate the biological relevance of the results. In this study, we present a method that can effectively select differential regions of the genome based on protein-binding profiles over multiple experiments using real data points without any normalization among the samples.

UR - http://www.scopus.com/inward/record.url?scp=80052045458&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80052045458&partnerID=8YFLogxK

U2 - 10.1142/S0219720011005483

DO - 10.1142/S0219720011005483

M3 - Article

C2 - 21523932

AN - SCOPUS:80052045458

VL - 9

SP - 269

EP - 282

JO - Journal of Bioinformatics and Computational Biology

JF - Journal of Bioinformatics and Computational Biology

SN - 0219-7200

IS - 2

ER -