CYBERNETICS AND SYSTEMS, vol. 55, no. 4, pp. 940-960, 2024 (SCI-Expanded)
This study aims to predict secondary school students' academic achievement using machine learning methods, to determine the features that most influence achievement, and to develop a system for predicting academic achievement. To this end, the prediction of year-end weighted grade point averages from students' socioeconomic and demographic characteristics and course grade data is treated as a classification problem. The data set was collected from students at a secondary school in Istanbul province, Turkey. To classify the target variable, seven machine learning algorithms (K-Nearest Neighbor, Decision Trees, Random Forest, Support Vector Machines, Multilayer Perceptron, Logistic Regression, and Naive Bayes) were applied and their performances were compared. According to the model performance evaluation, the most successful model was Random Forest (accuracy: 80.73%). The sample system developed on the basis of the Random Forest model can be accessed at the following address: https://model-tahmin.herokuapp.com . The features describing students' academic background (grade point average of past years, Turkish course 1st-semester average, and Math 1st-semester average) were found to have the greatest effect on achievement.
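The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn as the toolkit (the abstract does not name one). The data here is a synthetic stand-in generated with `make_classification`, not the study's data set, and the names `MODELS`, `compare_models`, and `top_features`, the three-class target, the 5-fold cross-validation, and all hyperparameters are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the student records (NOT the study's data set).
# The class count and feature count are assumptions.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, n_classes=3, random_state=0
)

# The seven algorithm families named in the abstract, with default-ish
# settings. Scale-sensitive models are wrapped in a scaling pipeline.
MODELS = {
    "K-Nearest Neighbor": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Multilayer Perceptron": make_pipeline(
        StandardScaler(), MLPClassifier(max_iter=1000, random_state=0)
    ),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Naive Bayes": GaussianNB(),
}


def compare_models(X, y, cv=5):
    """Return the mean cross-validated accuracy of each model."""
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in MODELS.items()
    }


scores = compare_models(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:24s} {acc:.3f}")

# Rank features by the impurity-based importance of a fitted Random Forest.
forest = RandomForestClassifier(random_state=0).fit(X, y)
top_features = np.argsort(forest.feature_importances_)[::-1][:3]
print("Indices of the three most important features:", top_features)
```

On real student data, the feature indices would map to named columns such as past grade point average or first-semester course averages. Impurity-based importances are only one possible ranking method; the abstract does not state which one the study used.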