The onset of a disease can be prevented or delayed if a data-mining technique can identify people who are at greater risk of developing it at a later stage. This study aimed to identify the biomarkers responsible for the progression of diabetes mellitus (DM) to cardiovascular disease (CVD), and then to apply data-driven techniques to predict type 2 diabetes (T2D) and CVD in advance. The proposed approach comprises novel feature extraction and selection, followed by ensembling and stacking of three data-mining models: support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGB). The framework was evaluated on oral glucose tolerance test (OGTT) data from the San Antonio Heart Study. The model achieved 92.54% prediction accuracy in differentiating healthy individuals from those who developed T2D leading to CVD.
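As a rough illustration of what stacking SVM, RF, and gradient-boosted models can look like, the following is a minimal sketch and not the study's implementation. It assumes scikit-learn, uses synthetic data from `make_classification` in place of the OGTT features, substitutes scikit-learn's `GradientBoostingClassifier` for XGB, and assumes a logistic-regression meta-learner; the function name `build_stacked_model` and all hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_stacked_model(seed=0):
    """Stack three base classifiers under a logistic-regression meta-learner.

    Assumption: GradientBoostingClassifier stands in for XGB, and the
    meta-learner choice is illustrative only.
    """
    base_models = [
        # SVMs are scale-sensitive, so standardize features first.
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=seed))),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
        ("gb", GradientBoostingClassifier(random_state=seed)),
    ]
    # cv=5: the meta-learner is trained on out-of-fold base-model predictions.
    return StackingClassifier(
        estimators=base_models,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


# Synthetic placeholder data; NOT the San Antonio Heart Study OGTT data.
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_stacked_model()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The accuracy printed here reflects only the synthetic data and has no bearing on the 92.54% reported for the actual framework.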