Explainable AI-Based Student Result Prediction: A Review of Machine Learning Approaches Using CatBoost and Logistic Regression | IJET Volume 12 – Issue 3 | IJET-V12I3P33

International Journal of Engineering and Techniques (IJET)

Open Access • Peer Reviewed • High Citation & Impact Factor • ISSN: 2395-1303

Volume 12, Issue 3  |  Published: May 2026

Authors: Heena M. Pathan, Pournima E. Gawade

Abstract

Student academic performance prediction has become a critical priority for educational institutions seeking to improve learning outcomes and reduce dropout rates. Traditional evaluation methods are inadequate for handling the large volumes of student data generated by modern digital academic environments. While numerous machine learning models have been proposed for result prediction, a persistent gap exists between predictive accuracy and model interpretability. This paper presents a comprehensive review of machine learning approaches for student result prediction, culminating in the proposal of a hybrid framework combining CatBoost and Logistic Regression. The reviewed models are assessed for accuracy, interpretability, computational cost, and suitability for real educational deployment. Key limitations — including the absence of explainability in high-accuracy models, poor handling of categorical student data, and class imbalance — are analyzed in depth. The paper concludes by identifying open research problems and proposing directions toward scalable, transparent, and computationally efficient student performance prediction systems.

Keywords

Student Result Prediction, Explainable AI, CatBoost, Logistic Regression, Educational Data Mining, Machine Learning, Hybrid Model, Class Imbalance, SHAP.

Conclusion

This paper has presented a structured review of machine learning approaches for student result prediction, tracing the evolution from simple statistical baselines to complex deep learning architectures. The reviewed body of work demonstrates clear technical progress; however, a persistent gap between predictive accuracy and model interpretability continues to limit practical deployment in real educational environments. The proposed hybrid framework combining CatBoost and Logistic Regression addresses this gap by delivering strong classification performance without sacrificing the transparency that educational stakeholders require. By enabling early identification of at-risk students and supporting data-driven academic planning through interpretable predictions, the proposed approach holds the potential to meaningfully improve student outcomes and institutional decision-making. Addressing the remaining challenges of real-world deployment, fairness, and scalability through continued research will be essential to realising these benefits at scale.
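As a rough illustration of the interpretability and class-imbalance points above, the following minimal sketch fits a class-weighted logistic regression on synthetic data and reads its coefficients as odds ratios. It covers only the Logistic Regression side of the discussion and is not the authors' framework: the feature names (`attendance`, `prior_grade`), the data-generating process, the balanced class weights, and every function name are assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_students(n=400, seed=0):
    # Hypothetical data: two standardized features and an "at risk" label (1)
    # that is the minority class. None of this comes from the reviewed papers.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        attendance = rng.gauss(0, 1)
        prior_grade = rng.gauss(0, 1)
        logit = -2.0 - 1.5 * attendance - 1.0 * prior_grade
        X.append([attendance, prior_grade])
        y.append(1 if rng.random() < sigmoid(logit) else 0)
    return X, y


def fit_weighted_logreg(X, y, lr=0.5, epochs=500):
    # Batch gradient descent on class-weighted log loss. Each class gets
    # weight n / (2 * class_count), so both classes carry equal total weight.
    n, d = len(X), len(X[0])
    n_pos = sum(y)
    n_neg = n - n_pos
    w_pos, w_neg = n / (2.0 * n_pos), n / (2.0 * n_neg)
    coef, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        g_coef, g_bias = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(bias + sum(c * v for c, v in zip(coef, xi)))
            err = (w_pos if yi == 1 else w_neg) * (p - yi)
            g_bias += err
            for j in range(d):
                g_coef[j] += err * xi[j]
        bias -= lr * g_bias / n
        coef = [c - lr * g / n for c, g in zip(coef, g_coef)]
    return coef, bias


feature_names = ["attendance", "prior_grade"]  # hypothetical names
X, y = make_synthetic_students()
coef, bias = fit_weighted_logreg(X, y)

# exp(coefficient) is the multiplicative change in the odds of being at risk
# for a one-standard-deviation increase in that feature.
odds_ratios = [math.exp(c) for c in coef]

# Recall on the minority (at-risk) class at a 0.5 threshold.
preds = [
    1 if sigmoid(bias + sum(c * v for c, v in zip(coef, xi))) >= 0.5 else 0
    for xi in X
]
recall = sum(1 for p, t in zip(preds, y) if p == 1 and t == 1) / sum(y)

for name, c, ratio in zip(feature_names, coef, odds_ratios):
    print(f"{name}: coefficient={c:.3f}, odds ratio={ratio:.3f}")
print(f"at-risk share={sum(y) / len(y):.2f}, at-risk recall={recall:.2f}")
```

An odds ratio below 1 means that higher values of the feature lower the predicted odds of being at risk, which is the kind of statement an instructor can check against domain knowledge. Class weighting is only one way to handle imbalance; resampling methods such as SMOTE are an alternative.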

Cite this article

APA
Heena M. Pathan, Pournima E. Gawade (May 2026). Explainable AI-Based Student Result Prediction: A Review of Machine Learning Approaches Using CatBoost and Logistic Regression. International Journal of Engineering and Techniques (IJET), 12(3).

IEEE
Heena M. Pathan, Pournima E. Gawade, “Explainable AI-Based Student Result Prediction: A Review of Machine Learning Approaches Using CatBoost and Logistic Regression,” International Journal of Engineering and Techniques (IJET), vol. 12, no. 3, May 2026.