A personalized committee classification approach to improving prediction of breast cancer metastasis

Md Jamiul Jahid, Hui-ming Huang, Jianhua Ruan

Research output: Contribution to journal (Article)

10 Citations (Scopus)

Abstract

Motivation: Metastasis prediction is a well-known problem in breast cancer research. As breast cancer is a complex and heterogeneous disease with many molecular subtypes, predictive models trained for one cohort often perform poorly on other cohorts, and a combined model may be suboptimal for individual patients. Furthermore, attempting to develop subtype-specific models is hindered by the ambiguity and stereotypical definitions of subtypes.

Results: Here, we propose a personalized approach by relaxing the definition of breast cancer subtypes. We assume that each patient belongs to a distinct subtype, defined implicitly by a set of patients with similar molecular characteristics, and construct a different predictive model for each patient, using as training data only the patients that define the subtype. To increase robustness, we also develop a committee-based prediction method by pooling together multiple personalized models. Using both intra- and inter-dataset validations, we show that our approach can significantly improve the prediction accuracy of breast cancer metastasis compared with several popular approaches, especially on hard-to-learn cases. Furthermore, we find that breast cancer patients belonging to different canonical subtypes tend to have different predictive models and gene signatures, suggesting that metastasis in different canonical subtypes is likely governed by different molecular mechanisms.

Availability and implementation: Source code implemented in MATLAB and Java is available at www.cs.utsa.edu/~jruan/PCC/.

Supplementary information: Supplementary data are available at Bioinformatics online.
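The personalized committee idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' MATLAB/Java implementation: the abstract does not name the similarity measure, the base classifier, or the pooling rule, so the sketch assumes Euclidean distance, a nearest-centroid classifier, and a majority vote. All function names, the neighborhood sizes in `ks`, and the toy data are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroid_predict(train_x, train_y, query):
    """Fit a nearest-centroid classifier on the given samples and classify query.

    Assumption: the base learner is a stand-in chosen for brevity.
    """
    groups = {}
    for x, y in zip(train_x, train_y):
        groups.setdefault(y, []).append(x)
    best_label, best_dist = None, float("inf")
    for label in sorted(groups):
        rows = groups[label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = euclidean(centroid, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def personalized_committee_predict(train_x, train_y, query, ks=(3, 5, 7)):
    """Predict a label for one patient from a committee of personalized models.

    Each committee member is trained only on the k training patients most
    similar to the query (the patient's implicit "subtype"); the members'
    predictions are pooled by majority vote (an assumed pooling rule).
    """
    # Rank training patients by similarity to the query patient.
    order = sorted(range(len(train_x)), key=lambda i: euclidean(train_x[i], query))
    votes = []
    for k in ks:
        idx = order[:k]
        votes.append(
            nearest_centroid_predict(
                [train_x[i] for i in idx], [train_y[i] for i in idx], query
            )
        )
    counts = Counter(votes)
    # Majority vote; ties are broken toward the smaller label for determinism.
    return min(counts, key=lambda label: (-counts[label], label))


# Toy data (hypothetical): two features per patient, label 1 = metastasis.
train_x = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0],
           [0.9, 1.1], [1.1, 0.9], [0.5, 0.4]]
train_y = [0, 0, 0, 1, 1, 1, 0]

if __name__ == "__main__":
    for patient in ([0.95, 1.0], [0.1, 0.1]):
        print(patient, personalized_committee_predict(train_x, train_y, patient))
```

Varying the neighborhood size across committee members is one simple way to obtain multiple personalized models for the same patient; other choices, such as resampling the neighbors or varying the feature set, would fit the same description.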

Original language: English (US)
Pages (from-to): 1858-1866
Number of pages: 9
Journal: Bioinformatics
Volume: 30
Issue number: 13
DOI: 10.1093/bioinformatics/btu128
State: Published - Jul 1 2014

Fingerprint

Metastasis
Breast Cancer
Breast Neoplasms
Neoplasm Metastasis
Predictive Model
Prediction
Pooling
Java
MATLAB
Bioinformatics
Computational Biology
Signature
Availability
Likely
Model
Contact
Tend
Robustness
Gene
Distinct

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability
  • Medicine (all)

Cite this

A personalized committee classification approach to improving prediction of breast cancer metastasis. / Jahid, Md Jamiul; Huang, Hui-ming; Ruan, Jianhua.

In: Bioinformatics, Vol. 30, No. 13, 01.07.2014, p. 1858-1866.

@article{6f4239ba3824425682c3592904176c00,
title = "A personalized committee classification approach to improving prediction of breast cancer metastasis",
abstract = "Motivation: Metastasis prediction is a well-known problem in breast cancer research. As breast cancer is a complex and heterogeneous disease with many molecular subtypes, predictive models trained for one cohort often perform poorly on other cohorts, and a combined model may be suboptimal for individual patients. Furthermore, attempting to develop subtype-specific models is hindered by the ambiguity and stereotypical definitions of subtypes. Results: Here, we propose a personalized approach by relaxing the definition of breast cancer subtypes. We assume that each patient belongs to a distinct subtype, defined implicitly by a set of patients with similar molecular characteristics, and construct a different predictive model for each patient, using as training data only the patients that define the subtype. To increase robustness, we also develop a committee-based prediction method by pooling together multiple personalized models. Using both intra- and inter-dataset validations, we show that our approach can significantly improve the prediction accuracy of breast cancer metastasis compared with several popular approaches, especially on hard-to-learn cases. Furthermore, we find that breast cancer patients belonging to different canonical subtypes tend to have different predictive models and gene signatures, suggesting that metastasis in different canonical subtypes is likely governed by different molecular mechanisms. Availability and implementation: Source code implemented in MATLAB and Java is available at www.cs.utsa.edu/{\textasciitilde}jruan/PCC/. Supplementary information: Supplementary data are available at Bioinformatics online.",
author = "Jahid, Md Jamiul and Huang, Hui-ming and Ruan, Jianhua",
year = "2014",
month = "7",
day = "1",
doi = "10.1093/bioinformatics/btu128",
language = "English (US)",
volume = "30",
pages = "1858--1866",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "13",

}

TY - JOUR

T1 - A personalized committee classification approach to improving prediction of breast cancer metastasis

AU - Jahid, Md Jamiul

AU - Huang, Hui-ming

AU - Ruan, Jianhua

PY - 2014/7/1

Y1 - 2014/7/1

AB - Motivation: Metastasis prediction is a well-known problem in breast cancer research. As breast cancer is a complex and heterogeneous disease with many molecular subtypes, predictive models trained for one cohort often perform poorly on other cohorts, and a combined model may be suboptimal for individual patients. Furthermore, attempting to develop subtype-specific models is hindered by the ambiguity and stereotypical definitions of subtypes. Results: Here, we propose a personalized approach by relaxing the definition of breast cancer subtypes. We assume that each patient belongs to a distinct subtype, defined implicitly by a set of patients with similar molecular characteristics, and construct a different predictive model for each patient, using as training data only the patients that define the subtype. To increase robustness, we also develop a committee-based prediction method by pooling together multiple personalized models. Using both intra- and inter-dataset validations, we show that our approach can significantly improve the prediction accuracy of breast cancer metastasis compared with several popular approaches, especially on hard-to-learn cases. Furthermore, we find that breast cancer patients belonging to different canonical subtypes tend to have different predictive models and gene signatures, suggesting that metastasis in different canonical subtypes is likely governed by different molecular mechanisms. Availability and implementation: Source code implemented in MATLAB and Java is available at www.cs.utsa.edu/~jruan/PCC/. Supplementary information: Supplementary data are available at Bioinformatics online.

UR - http://www.scopus.com/inward/record.url?scp=84903699295&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84903699295&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btu128

DO - 10.1093/bioinformatics/btu128

M3 - Article

C2 - 24618465

AN - SCOPUS:84903699295

VL - 30

SP - 1858

EP - 1866

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 13

ER -