Identifying differentially methylated genes using mixed effect and generalized least square models

Shuying Sun, Pearlly S. Yan, Hui-ming Huang, Shili Lin

Research output: Contribution to journal, Article

7 Citations (Scopus)

Abstract

Background: DNA methylation plays an important role in tumorigenesis. Identifying genes, or the CpG islands (CGIs) associated with genes, that are differentially methylated between two tumor subtypes is thus an important biological question. The methylation status of all CGIs in the whole genome can be assayed with differential methylation hybridization (DMH) microarrays. However, patient samples and cell lines are heterogeneous, so their methylation patterns may differ substantially. In addition, neighboring probes at each CGI are correlated. How these factors affect the analysis of DMH data is unknown.

Results: We propose a new method for identifying differentially methylated (DM) genes by identifying the associated DM CGI(s). At each CGI, we implement four different mixed effect and generalized least square models to identify DM genes between two groups. We compare these four models with a simple least square regression model to study the impact of incorporating random effects and correlations.

Conclusions: We demonstrate that the inclusion (or exclusion) of random effects and the choice of correlation structure can significantly affect the results of the data analysis. We also assess the false discovery rate of the different models using CGIs associated with housekeeping genes.
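To illustrate why the correlation between neighboring probes matters, the following is a minimal sketch, not the authors' implementation. It assumes one particular correlation structure, AR(1) with a known parameter `rho` between adjacent probes, and a model with only a group effect; the abstract does not say which structures the four models use. The function names and the toy data are hypothetical. For this two-group design the generalized least squares estimate reduces to a difference of weighted probe means, where the weights are the row sums of the inverse AR(1) correlation matrix; setting `rho = 0` gives ordinary least squares.

```python
def ar1_weights(n_probes, rho):
    """Row sums of the inverse AR(1) correlation matrix (the GLS weights)."""
    if n_probes == 1:
        return [1.0]
    end = 1.0 / (1.0 + rho)            # first and last probe
    inner = (1.0 - rho) / (1.0 + rho)  # probes with two neighbours
    return [end if j in (0, n_probes - 1) else inner for j in range(n_probes)]


def group_difference(samples, groups, rho=0.0):
    """Estimate the group 1 minus group 0 difference at one CGI.

    samples: list of per-sample probe intensity lists (equal length)
    groups:  list of 0/1 labels, one per sample
    rho:     assumed AR(1) correlation between neighbouring probes;
             rho = 0 reduces to ordinary least squares
    Returns (estimate, variance_factor); multiply variance_factor by the
    residual variance to get the variance of the estimate.
    """
    n_probes = len(samples[0])
    w = ar1_weights(n_probes, rho)
    total_w = sum(w)
    means = {0: [], 1: []}
    for y, g in zip(samples, groups):
        # Weighted mean of the probes within one sample.
        means[g].append(sum(wj * yj for wj, yj in zip(w, y)) / total_w)
    n0, n1 = len(means[0]), len(means[1])
    estimate = sum(means[1]) / n1 - sum(means[0]) / n0
    variance_factor = (1.0 / n0 + 1.0 / n1) / total_w
    return estimate, variance_factor


# Hypothetical log-ratios: 4 probes at one CGI, 2 samples per group.
samples = [[1.0, 1.2, 0.8, 1.0], [0.9, 1.1, 1.0, 1.0],
           [2.0, 2.2, 1.8, 2.0], [2.1, 1.9, 2.0, 2.0]]
groups = [0, 0, 1, 1]

ols_est, ols_var = group_difference(samples, groups, rho=0.0)
gls_est, gls_var = group_difference(samples, groups, rho=0.5)
print("OLS:", ols_est, ols_var)
print("GLS:", gls_est, gls_var)
```

With four probes and `rho = 0.5`, the variance factor is twice the ordinary least squares value (0.5 versus 0.25), so a test that ignores positive probe correlation would understate the uncertainty of the group difference. A mixed effect model with a random sample effect would change the covariance structure further and is not covered by this sketch.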

Original language: English (US)
Article number: 404
Journal: BMC Bioinformatics
Volume: 10
DOI: 10.1186/1471-2105-10-404
State: Published - Dec 9 2009
Externally published: Yes

Fingerprint

Mixed Effects
Generalized Least Squares
CpG Islands
Least-Squares Analysis
Methylation
Genes
Gene
Random Effects
Model
Least Squares Regression
Correlation Structure
Microarray
Essential Genes
DNA Methylation
Microarrays
Tumor
Regression Model
Data analysis
Genome
Probe

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Structural Biology
  • Applied Mathematics

Cite this

Identifying differentially methylated genes using mixed effect and generalized least square models. / Sun, Shuying; Yan, Pearlly S.; Huang, Hui-ming; Lin, Shili.

In: BMC Bioinformatics, Vol. 10, 404, 09.12.2009.

@article{29b1000a33204133a8326d4c372aeb47,
title = "Identifying differentially methylated genes using mixed effect and generalized least square models",
author = "Shuying Sun and Yan, {Pearlly S.} and Hui-ming Huang and Shili Lin",
year = "2009",
month = "12",
day = "9",
doi = "10.1186/1471-2105-10-404",
language = "English (US)",
volume = "10",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Identifying differentially methylated genes using mixed effect and generalized least square models

AU - Sun, Shuying

AU - Yan, Pearlly S.

AU - Huang, Hui-ming

AU - Lin, Shili

PY - 2009/12/9

Y1 - 2009/12/9

UR - http://www.scopus.com/inward/record.url?scp=74049159064&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=74049159064&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-10-404

DO - 10.1186/1471-2105-10-404

M3 - Article

C2 - 20003206

AN - SCOPUS:74049159064

VL - 10

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 404

ER -