
Explainable AI-Based Student Result Prediction Using CatBoost and Logistic Regression | IJET Volume 12 – Issue 3 | IJET-V12I3P82

International Journal of Engineering and Techniques (IJET)
Open Access • Peer Reviewed • High Citation & Impact Factor • ISSN: 2395-1303
Volume 12, Issue 3 | Published: June 2026
Authors: Heena M. Pathan, Pournima E. Gawade
DOI: https://doi.org/{{doi}}
Abstract
The percentage of students passing a course depends heavily on how early at-risk learners can be identified and supported. This paper presents an Explainable Artificial Intelligence (XAI) based hybrid model for predicting student academic results using CatBoost and Logistic Regression. CatBoost, a gradient-boosting algorithm, is used to capture complex, non-linear relationships among academic attributes such as attendance, internal marks, study hours and previous GPA, while Logistic Regression contributes a transparent, probability-based decision boundary. The two classifiers are combined through a Soft-Voting ensemble that averages their predicted class probabilities, and SHAP (SHapley Additive exPlanations) is applied to the trained model to explain individual and global feature contributions. The system was evaluated on a dataset of 10,000 student records using an 80:20 train-test split. The proposed ensemble achieved an accuracy of 99.10%, precision of 99.37%, recall of 99.02%, F1-score of 99.19% and ROC-AUC of 99.98%, outperforming the individual Logistic Regression and CatBoost models on the combined set of metrics while retaining model interpretability. SHAP analysis identified Study Hours, Previous GPA and Internal Marks as the most influential predictors of student outcome. The resulting system offers a practical, transparent and computationally inexpensive decision-support tool that institutions can use for early identification of academically at-risk students.
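The soft-voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `soft_vote` and `predict`, the equal weights, and the probability values are all hypothetical, and the two probability lists merely stand in for the outputs of trained CatBoost and Logistic Regression models.

```python
def soft_vote(prob_sets, weights=None):
    """Average per-class probabilities from several classifiers.

    prob_sets: one list per model; each holds a [P(class 0), P(class 1), ...]
    row per sample. weights default to equal (plain averaging).
    """
    if weights is None:
        weights = [1.0] * len(prob_sets)
    total = sum(weights)
    n_samples = len(prob_sets[0])
    averaged = []
    for i in range(n_samples):
        n_classes = len(prob_sets[0][i])
        row = [
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets)) / total
            for c in range(n_classes)
        ]
        averaged.append(row)
    return averaged


def predict(averaged):
    """Pick the class with the highest averaged probability for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in averaged]


# Hypothetical [P(fail), P(pass)] outputs for three students (illustrative only).
catboost_probs = [[0.10, 0.90], [0.70, 0.30], [0.45, 0.55]]
logreg_probs = [[0.20, 0.80], [0.60, 0.40], [0.65, 0.35]]

avg = soft_vote([catboost_probs, logreg_probs])
labels = predict(avg)  # 1 = pass, 0 = fail
print(labels)
```

For the third student the two models disagree; averaging lets the more confident model decide, which is the practical difference between soft voting and a majority vote over hard labels.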
Keywords
Explainable AI, CatBoost, Logistic Regression, SHAP, Soft Voting Ensemble, Student Result Prediction, Educational Data Mining.
Conclusion
This paper presented an Explainable AI-based hybrid model for student result prediction that combines CatBoost and Logistic Regression through a Soft-Voting ensemble, with SHAP used to explain both global and individual predictions. On a 10,000-record academic dataset, the ensemble achieved 99.10% accuracy, 99.37% precision, 99.02% recall, 99.19% F1-score and 99.98% ROC-AUC, while SHAP analysis identified Study Hours, Previous GPA and Internal Marks as the most influential factors behind student outcomes. The system is computationally light, handles categorical and numerical academic attributes without extensive pre-processing, and produces explanations that non-technical academic staff can act on directly.
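As a quick consistency check on the reported figures, the F1-score is the harmonic mean of precision and recall. The short sketch below recomputes it from the stated precision and recall; the helper name `f1_from` is hypothetical and the code is only an arithmetic illustration, not part of the authors' evaluation pipeline.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Values reported in the paper, expressed as fractions.
precision = 0.9937
recall = 0.9902

f1_percent = round(100 * f1_from(precision, recall), 2)
print(f1_percent)  # 99.19, matching the reported F1-score
```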
Cite this article
APA
Heena M. Pathan, Pournima E. Gawade (June 2026). Explainable AI-Based Student Result Prediction Using CatBoost and Logistic Regression. International Journal of Engineering and Techniques (IJET), 12(3). https://doi.org/{{doi}}
Heena M. Pathan, Pournima E. Gawade, "Explainable AI-Based Student Result Prediction Using CatBoost and Logistic Regression," International Journal of Engineering and Techniques (IJET), vol. 12, no. 3, June 2026, doi: {{doi}}.
