TY - JOUR
T1 - Preprocessing differential methylation hybridization microarray data
AU - Sun, Shuying
AU - Huang, Yi Wen
AU - Yan, Pearlly S.
AU - Huang, Tim Hm
AU - Lin, Shili
N1 - Funding Information:
This work was supported by the National Science Foundation [0112050] while SS was a postdoctoral researcher in the Mathematical Biosciences Institute, The Ohio State University. The authors thank Drs. Terry Speed, Greg Singers and Dustin Potter for valuable suggestions and discussions. In particular, we appreciate that Dr. Potter shared the methylation-sensitive restriction enzyme cutting sites data (from his previous DMH publications) with us.
PY - 2011
Y1 - 2011
N2 - Background: DNA methylation plays an important role in the silencing of tumor suppressor genes in various tumor types. To gain a genome-wide understanding of how changes in methylation affect tumor growth, the differential methylation hybridization (DMH) protocol has been developed and large amounts of DMH microarray data have been generated. However, it is still unclear how to preprocess this type of microarray data and how the background correction and normalization methods used for two-color gene expression arrays perform on methylation microarray data. In this paper, we report a set of internal control probes whose log ratios (M) are theoretically equal to zero under this DMH protocol. Using this set of control probes, we propose two LOESS (or LOWESS, locally weighted scatter-plot smoothing) normalization methods that are novel and specific to DMH microarray data. Together with global LOESS and no normalization, this gives four normalization methods to compare. In addition, we compare five different background correction methods. Results: We study 20 preprocessing methods, formed by combining the five background correction methods with the four normalization methods. To compare these 20 methods, we evaluate how well they identify known methylated and unmethylated housekeeping genes, based on two statistics. Comparison details are illustrated using breast cancer cell line and ovarian cancer patient methylation microarray data. Our results show that the different background correction methods perform similarly, whereas the four normalization methods perform very differently. In particular, all three LOESS normalization methods perform better than no normalization.
Conclusions: Within-array normalization is necessary, and the two LOESS normalization methods based on the DMH-specific internal control probes produce more stable and relatively better results than the global LOESS normalization method.
UR - http://www.scopus.com/inward/record.url?scp=79955921176&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79955921176&partnerID=8YFLogxK
U2 - 10.1186/1756-0381-4-13
DO - 10.1186/1756-0381-4-13
M3 - Article
C2 - 21575229
AN - SCOPUS:79955921176
SN - 1756-0381
VL - 4
JO - BioData Mining
JF - BioData Mining
IS - 1
M1 - 13
ER -