Network Modeling Analysis in Health Informatics and Bioinformatics, vol. 15, no. 1, 2026 (ESCI, Scopus)
Cardiovascular diseases remain the leading cause of global mortality, yet conventional risk models often fail to capture complex, non-linear dependencies in patient data, which limits their clinical applicability. We designed a comprehensive ensemble and meta-learning framework that integrates multiple boosting algorithms (including AdaBoost, Gradient Boosting, XGBoost, and LightGBM) and bagging approaches (Random Forest, Bagged Logistic Regression). These models were trained both independently and as base learners within a Super Learner framework, with meta-learners including Logistic Regression and HistBoost used to capture non-linear feature relationships. Two custom Blending strategies were also applied for practical implementation. Models were trained and validated on harmonized data from five heterogeneous cohorts, using hyperparameter optimization and k-fold cross-validation to support robust performance. Ensemble approaches achieved strong predictive accuracy, with meta-learning consistently outperforming base learners. The Comprehensive Blending model achieved the highest AUC (0.972) and average precision (96.9%), exceeding LightGBM (AUC: 0.96). Super Learners using Logistic Regression as a meta-learner provided balanced, generalizable predictions (AUC up to 0.97; F1-score up to 96%). Overall, carefully tuned ensemble and meta-learning frameworks achieved state-of-the-art cardiovascular risk prediction: RF Boosting excelled in classification, Super Learners provided balance, and Blending models offered the highest AUC, supporting early detection and precision cardiovascular care.
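The abstract does not specify how the two custom Blending strategies combine their base learners, so the following is only a minimal sketch of one common form of blending: a weighted average of base-learner probabilities on a hold-out set, scored by AUC. The helper names `roc_auc` and `blend`, the equal weights, and the toy labels and probabilities are all hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def blend(prob_columns: Sequence[Sequence[float]],
          weights: Sequence[float]) -> List[float]:
    """Weighted average of per-model predicted probabilities (assumed scheme)."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, row)) / total
        for row in zip(*prob_columns)
    ]


# Toy hold-out labels and hypothetical base-learner outputs (not study data).
y_holdout = [0, 0, 1, 1, 0, 1]
model_a = [0.2, 0.6, 0.7, 0.9, 0.1, 0.4]
model_b = [0.3, 0.2, 0.4, 0.8, 0.5, 0.7]

blended = blend([model_a, model_b], weights=[0.5, 0.5])

auc_a = roc_auc(y_holdout, model_a)
auc_b = roc_auc(y_holdout, model_b)
auc_blend = roc_auc(y_holdout, blended)
```

On this toy data each model alone misranks one positive-negative pair (AUC 8/9), while the equal-weight blend ranks every positive above every negative (AUC 1.0), because the two models make different mistakes. This only illustrates the mechanism; it does not imply that blending always improves AUC, and in practice the weights would be chosen on data kept separate from the final evaluation set.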