Abstract
An unsupervised data clustering method, called the local maximum clustering (LMC) method, is proposed for identifying clusters in experimental data sets based on research interest. A magnitude property is defined according to the research purpose, and the data are clustered around each local maximum of that magnitude property. With a properly defined magnitude property, this method can overcome many difficulties in microarray data clustering, such as reduced projection in similarities, noise, and arbitrary gene distribution. To critically evaluate the performance of this clustering method in comparison with other methods, we designed three model data sets with known cluster distributions and applied the LMC method, as well as the hierarchical clustering method, the K-means clustering method, and the self-organizing map method, to these model data sets. The results show that the LMC method produces the most accurate clustering results. As an example application, we applied the method to cluster the leukemia samples reported in the microarray study of Golub et al. (1999).
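The idea of clustering data around local maxima of a magnitude property can be illustrated with a minimal sketch. The abstract does not specify how neighbourhoods are defined or how points are assigned to a maximum, so everything below is an assumption made for illustration: the function name `local_maximum_clustering`, the fixed Euclidean `radius`, and the hill-climbing assignment rule are hypothetical and may differ from the procedure in the paper.

```python
import numpy as np


def local_maximum_clustering(points, magnitude, radius):
    """Assign each point to a local maximum of `magnitude` (illustrative only).

    Assumptions not taken from the abstract: neighbours are points within a
    Euclidean distance `radius`, and each point climbs to the neighbour with
    the largest magnitude until it reaches a point that is its own maximum.
    Returns an array where labels[i] is the index of the local maximum
    that point i ends up at.
    """
    points = np.asarray(points, dtype=float)
    magnitude = np.asarray(magnitude, dtype=float)
    n = len(points)

    # Pairwise Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Each point's parent is the highest-magnitude point in its neighbourhood
    # (the neighbourhood always includes the point itself).
    parent = np.empty(n, dtype=int)
    for i in range(n):
        neighbours = np.flatnonzero(dist[i] <= radius)
        parent[i] = neighbours[np.argmax(magnitude[neighbours])]

    # Follow parent links uphill; a point that is its own parent is a local maximum.
    labels = np.empty(n, dtype=int)
    for i in range(n):
        j = i
        while parent[j] != j:
            j = parent[j]
        labels[i] = j
    return labels


# Toy 1-D data: two well-separated groups, each with one magnitude peak.
toy_points = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
toy_magnitude = [1, 3, 2, 2, 5, 1]
toy_labels = local_maximum_clustering(toy_points, toy_magnitude, radius=1.5)
print(toy_labels.tolist())  # [1, 1, 1, 4, 4, 4]
```

In this toy example the points at indices 1 and 4 are the local maxima, and every other point is assigned to the maximum in its own group. In a microarray setting the magnitude property would be chosen according to the research purpose, as the abstract describes; the choice used here is arbitrary.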
Original language | English (US) |
---|---|
Pages (from-to) | 53-63 |
Number of pages | 11 |
Journal | Eurasip Journal on Applied Signal Processing |
Volume | 2004 |
Issue number | 1 |
State | Published - Jan 1 2004 |
Externally published | Yes |
Keywords
- Classification
- Clustering method
- Data cluster
- Gene expression
- Microarray
- Model data sets
ASJC Scopus subject areas
- Signal Processing
- Hardware and Architecture
- Electrical and Electronic Engineering