Motivation: There are two general methods for measuring gene expression with microarrays: one is to hybridize a single test set of labeled targets to the probes and measure the background-subtracted intensity at each probe site; the other is to hybridize both a test and a reference set of differentially labeled targets to a single detector array and measure the ratio of the background-subtracted intensities at each probe site. Which method is better depends on the variability in the cell system and on the random factors introduced by the microarray technology. It also depends on the purpose for which the microarray is used. Classification is a fundamental application, and it is the one considered here.
Results: This paper describes a model-based simulation paradigm that compares the classification accuracy provided by these two methods over a variety of noise types, and presents the results of a study modeled on noise typical of cDNA microarray data. The model consists of four parts: (1) the measurement equation for genes in the reference state; (2) the measurement equation for genes in the test state; (3) the ratio and normalization procedure for a dual-channel system; and (4) the intensity and normalization procedure for a single-channel system. In the reference state, the mean intensities are modeled by a shifted exponential distribution, and the intensity of a particular gene is modeled by a normal distribution, Normal(l, αl), about its mean intensity l, where α is the coefficient of variation of the cell system. In the test state, some genes have their intensities up-regulated by a random factor. The model includes several random factors that affect intensity measurement: deposition gain d, labeling gain, and post-image-processing residual noise.
The key conclusion of the study is that the coefficient of variation α, which governs the randomness of the intensities, and the deposition gain d are the most important factors in determining whether a single-channel or a dual-channel system provides superior classification, and that the boundary of the decision region in the α-d plane is approximately linear.
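As a rough illustration of the measurement model described above, the following Python sketch draws reference-state mean intensities from a shifted exponential, samples each gene's intensity from Normal(l, αl), and forms single-channel intensities and dual-channel ratios. It is only a sketch under stated assumptions, not the simulation used in the study: the function names, the shift and scale values, the log-normal form of the deposition gain, and the assumption that both channels of a spot share the same deposition gain are all hypothetical choices made here. Labeling gain, residual noise, and normalization are omitted.

```python
import random

def reference_means(n_genes, rng, shift=100.0, scale=1000.0):
    # Shifted exponential for the mean intensities.
    # shift and scale are illustrative values, not taken from the paper.
    return [shift + rng.expovariate(1.0 / scale) for _ in range(n_genes)]

def intensity(mean, alpha, rng):
    # Normal(l, alpha * l): standard deviation is alpha times the mean l,
    # so alpha acts as the coefficient of variation.
    return rng.gauss(mean, alpha * mean)

def spot_gains(n_genes, d, rng):
    # Assumed form of the deposition gain: one multiplicative log-normal
    # factor per spot, with log-scale spread d (d = 0 gives a gain of 1).
    return [rng.lognormvariate(0.0, d) for _ in range(n_genes)]

def single_channel(test_means, alpha, d, rng):
    # Single-channel measurement: the spot gain multiplies the test intensity.
    gains = spot_gains(len(test_means), d, rng)
    return [g * intensity(m, alpha, rng) for g, m in zip(gains, test_means)]

def dual_channel(test_means, ref_means, alpha, d, rng):
    # Dual-channel measurement: test and reference are assumed to share the
    # spot gain, so the gain cancels when the ratio is taken.
    gains = spot_gains(len(test_means), d, rng)
    return [
        (g * intensity(t, alpha, rng)) / (g * intensity(r, alpha, rng))
        for g, t, r in zip(gains, test_means, ref_means)
    ]

if __name__ == "__main__":
    rng = random.Random(0)
    ref = reference_means(5, rng)
    # Up-regulate every gene by a fixed factor of 2 (the paper uses a random factor).
    test = [2.0 * m for m in ref]
    print(single_channel(test, alpha=0.1, d=0.3, rng=rng))
    print(dual_channel(test, ref, alpha=0.1, d=0.3, rng=rng))
```

Under these assumptions, the single-channel values carry the spot-to-spot deposition variability directly, while the dual-channel ratios do not but combine the biological variability of two states. This is one way to see why α and d could jointly decide which system classifies better.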
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics