Abstract
Interest is growing in predicting patients' survival after therapy from gene expression microarray data. In regression and classification models for high-dimensional genomic data, boosting has been successfully applied to build accurate predictive models and perform variable selection simultaneously. We propose Buckley-James boosting for semiparametric accelerated failure time models with right-censored survival data, which can be used to predict the survival of future patients from high-dimensional genomic data. In the spirit of the adaptive LASSO, twin boosting is also incorporated to fit sparser models. The proposed methods provide a unified approach to fitting linear models and non-linear effects models with possible interactions, and they perform variable selection and parameter estimation simultaneously. The methods are evaluated by simulations and applied to a recent microarray gene expression data set for patients with diffuse large B-cell lymphoma under the current gold standard therapy.
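To make the idea concrete, the following is a minimal sketch of how Buckley-James imputation can be combined with componentwise least-squares boosting for a linear accelerated failure time model. It is not the authors' implementation: the function names (`km_jump_masses`, `bj_impute`, `bj_boost`), the step size `nu`, the fixed number of steps, and the choice of a linear base learner are all assumptions made for illustration. It also assumes no tied residuals and no constant columns in `X`, and it omits twin boosting and any stopping rule.

```python
import numpy as np

def km_jump_masses(resid, delta):
    """Kaplan-Meier jump masses of the residuals, in sorted order.

    The largest residual is treated as uncensored (a common convention)
    so that the masses sum to one. Assumes no ties.
    """
    order = np.argsort(resid, kind="stable")
    e = resid[order]
    d = delta[order].astype(float)
    d[-1] = 1.0
    n = len(e)
    at_risk = n - np.arange(n)                      # risk set size at each residual
    surv = np.cumprod(1.0 - d / at_risk)            # S(e_(i))
    surv_prev = np.concatenate(([1.0], surv[:-1]))  # S(e_(i)-)
    return order, e, d, surv_prev - surv, surv

def bj_impute(y, delta, fitted):
    """Replace censored log-times by estimated conditional expectations."""
    order, e, d, mass, surv = km_jump_masses(y - fitted, delta)
    weighted = e * mass
    suffix = np.cumsum(weighted[::-1])[::-1]
    tail = np.append(suffix[1:], 0.0)               # sum over k > i of e_k * mass_k
    cens = d == 0
    idx = order[cens]
    y_star = y.astype(float).copy()
    # E[eps | eps > e_i] estimated as tail / S(e_i) for censored cases
    y_star[idx] = fitted[idx] + tail[cens] / surv[cens]
    return y_star

def bj_boost(X, y, delta, n_steps=200, nu=0.1):
    """Componentwise L2 boosting on Buckley-James imputed responses."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    col_ss = (Xc ** 2).sum(axis=0)
    coef = np.zeros(p)
    intercept = y.mean()
    for _ in range(n_steps):
        fitted = intercept + Xc @ coef
        y_star = bj_impute(y, delta, fitted)        # re-impute with current fit
        intercept = y_star.mean()
        u = y_star - intercept - Xc @ coef          # working residuals
        slopes = Xc.T @ u / col_ss
        j = np.argmax(slopes ** 2 * col_ss)         # largest drop in squared error
        coef[j] += nu * slopes[j]                   # update one coefficient only
    return intercept, coef
```

Here `y` holds the observed log survival or censoring times and `delta` the event indicators (1 for an observed event, 0 for censoring). Because only one coefficient is updated per step, covariates that are never selected keep a zero coefficient, which is how boosting performs variable selection and estimation together.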
Original language | English (US) |
---|---|
Article number | 24 |
Journal | Statistical Applications in Genetics and Molecular Biology |
Volume | 9 |
Issue number | 1 |
State | Published - 2010 |
Externally published | Yes |
Keywords
- Buckley-James estimator
- LASSO
- accelerated failure time model
- boosting
- censored survival data
- variable selection
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Genetics
- Computational Mathematics