It is unclear to what extent simulated versions of real data can be used to assess the potential value of new biomarkers added to prognostic risk models. Using data on 4522 women and 3969 men who contributed information to the Framingham CVD risk prediction tool, we develop a simulation model that allows assessment of the added contribution of new biomarkers. The simulated model closely matches the one obtained using real data: the discrimination area under the curve (AUC) on simulated vs actual data is 0.800 vs 0.799 in women and 0.778 vs 0.776 in men. Positive correlation with standard risk factors decreases the impact of new biomarkers (ΔAUC 0.002-0.024), but negative correlation leads to stronger effects (ΔAUC 0.026-0.101) than no correlation (ΔAUC 0.003-0.051). We suggest that researchers construct simulation models similar to the one proposed here before embarking on larger, expensive biomarker studies based on actual data.
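As a rough illustration of the kind of experiment described above, the following toy sketch simulates a single standard risk score and one new biomarker with a chosen correlation, then reports the change in AUC when the biomarker is added. It is not the Framingham-based simulation model from the paper. The logistic outcome model, the coefficients `beta_x` and `beta_z`, the intercept, and the helper names `auc` and `delta_auc` are all assumptions made for illustration.

```python
import numpy as np


def auc(scores, labels):
    """Mann-Whitney estimate of the AUC (assumes no tied scores)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def delta_auc(rho, beta_x=1.0, beta_z=0.5, intercept=-2.5, n=200_000, seed=0):
    """Gain in AUC from adding a biomarker z with correlation rho to score x.

    All parameter values are hypothetical; they are not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Standard risk score (a stand-in for the established risk factors).
    x = rng.standard_normal(n)
    # New biomarker with unit variance and correlation rho with x.
    z = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    # Assumed logistic outcome model on the combined linear predictor.
    lp = intercept + beta_x * x + beta_z * z
    y = (rng.random(n) < 1 / (1 + np.exp(-lp))).astype(int)
    # AUC of the full linear predictor minus AUC of the risk score alone.
    return auc(lp, y) - auc(x, y)


for rho in (-0.5, 0.0, 0.5):
    print(f"rho = {rho:+.1f}  delta AUC = {delta_auc(rho):.4f}")
```

In this toy setup the baseline score is `x` itself rather than a refitted model, which is adequate for a rank-based measure as long as `beta_x + beta_z * rho` stays positive. The exact ΔAUC values depend on the assumed coefficients and event rate and should not be compared directly with the ranges reported in the abstract.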
|Original language||English (US)|
|Number of pages||4|
|Journal||Journal of the American Medical Informatics Association|
|State||Published - Oct 1 2018|
Keywords
- synthetic cohorts
ASJC Scopus subject areas
- Health Informatics