A Comparative Study of Preprocessing Techniques for Stroke Prediction using XGBoost Classifier

3rd International Conference On Advanced Engineering, Technology And Applications, Catania, Italy, 24 - 25 May 2024

Publication Type: Conference Paper / Full-Text Paper
City of Publication: Catania
Country of Publication: Italy
Affiliated with İstanbul Üniversitesi: Yes

Abstract

Stroke is a condition in which blood flow to a region of the brain is cut off or bleeding occurs within the brain. Early diagnosis and treatment reduce the risk of permanent damage and death and improve the likelihood of recovery, so timely diagnosis is essential for planning effective treatment and preventing complications. Machine learning models are frequently used in the literature as tools for stroke diagnosis. This study presents a comparative analysis of methodologies that have been used successfully in the literature, combined with the Extreme Gradient Boosting (XGBoost) machine learning method, to overcome the challenges that missing values and imbalanced datasets pose in stroke prediction. In the experiments, the Cerebral Stroke Prediction (CSP) dataset was used to evaluate the performance of these methodologies with model evaluation metrics. The findings highlight the effectiveness of SMOTEENN in addressing the class imbalance and missing data challenges across various imputation methods. This underlines the importance of choosing suitable sampling and imputation strategies to improve the performance of stroke prediction models.
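
The preprocessing chain described above (imputing missing values, then resampling with SMOTEENN before a classifier is trained) can be illustrated with a minimal, dependency-free sketch. SMOTEENN combines SMOTE, which creates synthetic minority samples by interpolating between minority-class neighbours, with Edited Nearest Neighbours (ENN) cleaning. Everything below is an assumption made for illustration and not the authors' implementation: the toy two-feature data, mean imputation as the imputation method, k = 3, the majority-vote ENN rule, and all helper names are hypothetical. The XGBoost training step is omitted to keep the sketch free of third-party libraries.

```python
import math
import random


def mean_impute(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def nearest(point, pool, k):
    """Indices of the k points in pool closest to point (Euclidean distance)."""
    order = sorted(range(len(pool)), key=lambda i: math.dist(point, pool[i]))
    return order[:k]


def smote(X, y, minority, k, rng):
    """Balance a binary dataset by interpolating between minority-class neighbours."""
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_pts)) - len(minority_pts)
    X_new, y_new = list(X), list(y)
    for _ in range(n_needed):
        base = rng.choice(minority_pts)
        # Rank k + 1 points and drop the first, which is the base point itself.
        neighbours = nearest(base, minority_pts, k + 1)[1:]
        other = minority_pts[rng.choice(neighbours)]
        gap = rng.random()
        X_new.append([b + gap * (o - b) for b, o in zip(base, other)])
        y_new.append(minority)
    return X_new, y_new


def edited_nearest_neighbours(X, y, k):
    """Keep only samples whose label matches the majority of their k nearest neighbours."""
    keep_X, keep_y = [], []
    for i, (x, label) in enumerate(zip(X, y)):
        others = [j for j in range(len(X)) if j != i]
        ranked = sorted(others, key=lambda j: math.dist(x, X[j]))[:k]
        agreeing = sum(1 for j in ranked if y[j] == label)
        if agreeing * 2 > k:
            keep_X.append(x)
            keep_y.append(label)
    return keep_X, keep_y


# Toy imbalanced data (assumption): 40 majority samples, 8 minority samples.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
X += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(8)]
y = [0] * 40 + [1] * 8

# Knock out a few values to simulate missing data.
for i in range(0, len(X), 7):
    X[i][i % 2] = None

X_imp = mean_impute(X)                                  # step 1: imputation
X_sm, y_sm = smote(X_imp, y, minority=1, k=3, rng=rng)  # step 2: oversampling
X_res, y_res = edited_nearest_neighbours(X_sm, y_sm, k=3)  # step 3: cleaning

print("minority count: original", y.count(1),
      "| after SMOTE", y_sm.count(1),
      "| after ENN", y_res.count(1))
```

The resampled `X_res` and `y_res` would then be passed to the classifier. Resampling of this kind is normally applied to the training split only, so that evaluation metrics are computed on data with the original class distribution.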