TY - JOUR
T1 - Discovering disease-specific biomarker genes for cancer diagnosis and prognosis
AU - Huang, Hung Chung
AU - Zheng, Siyuan
AU - VanBuren, Vincent
AU - Zhao, Zhongming
PY - 2010/6
Y1 - 2010/6
N2 - The large amounts of microarray data provide us a great opportunity to identify gene expression profiles (GEPs) in different tissues or disease states. Disease-specific biomarker genes likely share GEPs that are distinct in disease samples as compared with normal samples. The similarity of the GEPs may be evaluated by Pearson Correlation Coefficient (PCC) and the distinctness of GEPs may be assessed by Kolmogorov-Smirnov distance (KSD). In this study, we used the PCC and KSD metrics for GEPs to identify disease-specific (cancer-specific) biomarkers. We first analyzed and compared GEPs using microarray datasets for smoking and lung cancer. We found that the number of genes with highly different GEPs between comparing groups in smoking dataset was much larger than that in lung cancer dataset; this observation was further verified when we compared GEPs in smoking dataset with prostate cancer datasets. Moreover, our Gene Ontology analysis revealed that the top ranked biomarker candidate genes for prostate cancer were highly enriched in molecular function categories such as 'cytoskeletal protein binding' and biological process categories such as 'muscle contraction'. Finally, we used two genes, ACTC1 (encoding an actin subunit) and HPN (encoding hepsin), to demonstrate the feasibility of diagnosing and monitoring prostate cancer using the expression intensity histograms of marker genes. In summary, our results suggested that this approach might prove promising and powerful for diagnosing and monitoring the patients who come to the clinic for screening or evaluation of a disease state including cancer.
AB - The large amounts of microarray data provide us a great opportunity to identify gene expression profiles (GEPs) in different tissues or disease states. Disease-specific biomarker genes likely share GEPs that are distinct in disease samples as compared with normal samples. The similarity of the GEPs may be evaluated by Pearson Correlation Coefficient (PCC) and the distinctness of GEPs may be assessed by Kolmogorov-Smirnov distance (KSD). In this study, we used the PCC and KSD metrics for GEPs to identify disease-specific (cancer-specific) biomarkers. We first analyzed and compared GEPs using microarray datasets for smoking and lung cancer. We found that the number of genes with highly different GEPs between comparing groups in smoking dataset was much larger than that in lung cancer dataset; this observation was further verified when we compared GEPs in smoking dataset with prostate cancer datasets. Moreover, our Gene Ontology analysis revealed that the top ranked biomarker candidate genes for prostate cancer were highly enriched in molecular function categories such as 'cytoskeletal protein binding' and biological process categories such as 'muscle contraction'. Finally, we used two genes, ACTC1 (encoding an actin subunit) and HPN (encoding hepsin), to demonstrate the feasibility of diagnosing and monitoring prostate cancer using the expression intensity histograms of marker genes. In summary, our results suggested that this approach might prove promising and powerful for diagnosing and monitoring the patients who come to the clinic for screening or evaluation of a disease state including cancer.
KW - Cancer biomarker
KW - Cancer diagnosis and prognosis
KW - Gene expression profile
KW - Kolmogorov-Smirnov distance
KW - Pearson correlation coefficient
UR - http://www.scopus.com/inward/record.url?scp=77953248367&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77953248367&partnerID=8YFLogxK
U2 - 10.1177/153303461000900301
DO - 10.1177/153303461000900301
M3 - Article
C2 - 20441232
AN - SCOPUS:77953248367
SN - 1533-0346
VL - 9
SP - 219
EP - 229
JO - Technology in Cancer Research and Treatment
JF - Technology in Cancer Research and Treatment
IS - 3
ER -