Assessing the significance of consistently mis-regulated genes in cancer associated gene expression matrices

Mattias Wahde, Gregory T. Klus, Michael L. Bittner, Yidong Chen, Zoltan Szallasi

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Motivation: The simplest level of statistical analysis of cancer associated gene expression matrices is aimed at finding consistently up- or down-regulated genes within a given set of tumor samples. Considering the high level of gene expression diversity detected in cancer, one needs to assess the probability that the consistent mis-regulation of a given gene is due to chance. Furthermore, it is important to determine the number of samples required for a meaningful statistical analysis of massively parallel gene expression measurements.

Results: The probability of consistent mis-regulation is calculated in this paper for binarized gene expression data, using combinatorial considerations. For practical purposes, we also provide a set of accurate approximate formulas for determining the same probability in a computationally less intensive way. When the pool of mis-regulatable genes is restricted, the probability of consistent mis-regulation can be overestimated. We show, however, that this effect has few practical consequences for cancer associated gene expression measurements published in the literature. Finally, in order to aid experimental design, we provide estimates of the sample number required to ensure that the detected consistent mis-regulation is not due to chance. Our results suggest that fewer than 20 sufficiently diverse tumor samples may be enough to identify consistently mis-regulated genes in a statistically significant manner.
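To illustrate the kind of combinatorial calculation the abstract describes, the following is a minimal sketch under a simplified model. It is not the formula set from the paper. The assumptions are: each of N samples has exactly k mis-regulated genes out of G, drawn uniformly at random and independently across samples. The names `G`, `k`, `N`, `p_single_gene`, `p_any_consistent` and `min_samples` are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from math import comb


def p_single_gene(G, k, N):
    """Chance that one specific gene is mis-regulated in all N samples.

    Assumed model: each sample marks k of G genes uniformly at random.
    """
    return Fraction(k, G) ** N


def p_any_consistent(G, k, N):
    """Chance that at least one gene is mis-regulated in all N samples.

    Exact under the assumed model, via inclusion-exclusion over sets of
    j genes that are all mis-regulated in every sample.
    """
    denom = comb(G, k)
    total = Fraction(0)
    for j in range(1, k + 1):
        # Probability that a fixed set of j genes is mis-regulated in one sample.
        q = Fraction(comb(G - j, k - j), denom)
        total += (-1) ** (j + 1) * comb(G, j) * q ** N
    return total


def min_samples(G, k, alpha=0.05):
    """Smallest N for which chance-level consistency drops below alpha."""
    if not 0 < k < G:
        raise ValueError("need 0 < k < G, otherwise the probability never falls")
    N = 1
    while p_any_consistent(G, k, N) >= alpha:
        N += 1
    return N


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    G, k = 1000, 100
    N = min_samples(G, k)
    print(N, float(p_any_consistent(G, k, N)))
    # First-order approximation: expected number of consistent genes.
    print(float(G * p_single_gene(G, k, N)))
```

Exact rational arithmetic is used because the alternating inclusion-exclusion sum can lose precision in floating point. For large G the first-order term, G times (k/G)^N, is a much cheaper estimate, in the spirit of the approximate formulas the abstract mentions.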

Original language: English (US)
Pages (from-to): 389-394
Number of pages: 6
Issue number: 3
State: Published - 2002
Externally published: Yes

ASJC Scopus subject areas

  • Computational Mathematics
  • Molecular Biology
  • Biochemistry
  • Statistics and Probability
  • Computer Science Applications
  • Computational Theory and Mathematics


