Students' performance prediction employing Decision Tree
Abstract
An optimized educational community is essential in the modern era. Educational Data Mining (EDM) sits at the intersection of educational activity and data analysis, and its transformative potential motivates this work. Predicting a student's academic standing in advance can help them make better-informed decisions. This study combines EDM with machine learning models to predict students' academic performance accurately. It uses a dataset containing academic, demographic, and social data of undergraduate students, and it comprehensively analyzes the features that drive academic performance. It also compares the predictive impact of non-academic data alone against that of non-academic and academic data combined. Traditional machine learning algorithms perform well overall: SVM achieves the best accuracy of around 95% with academic data, while training and testing the model without academic data still yields 93%. The hierarchical tree produced by the Decision Tree model visualizes the key features, which include past results, family members' qualification levels and jobs, the student's hobbies, commute time, and more.
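As a rough illustration of how a decision tree surfaces key features, the sketch below ranks features by Gini impurity reduction, a criterion commonly used to choose a split. The toy records, the feature names (`past_result`, `parent_degree`, `commute`), and the helper functions are hypothetical; they are not taken from the study's dataset or code.

```python
from collections import Counter

# Hypothetical toy records (NOT the study's data): one dict per student.
ROWS = [
    {"past_result": "high", "commute": "short", "parent_degree": "yes", "label": "pass"},
    {"past_result": "high", "commute": "short", "parent_degree": "yes", "label": "pass"},
    {"past_result": "high", "commute": "long", "parent_degree": "no", "label": "pass"},
    {"past_result": "high", "commute": "long", "parent_degree": "yes", "label": "pass"},
    {"past_result": "low", "commute": "short", "parent_degree": "no", "label": "fail"},
    {"past_result": "low", "commute": "long", "parent_degree": "no", "label": "fail"},
    {"past_result": "low", "commute": "short", "parent_degree": "yes", "label": "fail"},
    {"past_result": "low", "commute": "long", "parent_degree": "no", "label": "fail"},
]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_gain(rows, feature, target="label"):
    """Impurity reduction obtained by splitting rows on one categorical feature."""
    parent = gini([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    weighted = sum(len(g) / len(rows) * gini(g) for g in groups.values())
    return parent - weighted


def rank_features(rows, features):
    """Order features from most to least informative for the first split."""
    return sorted(features, key=lambda f: gini_gain(rows, f), reverse=True)


# Compare rankings with and without the academic feature.
all_features = ["commute", "parent_degree", "past_result"]
non_academic = ["commute", "parent_degree"]
print(rank_features(ROWS, all_features))
print(rank_features(ROWS, non_academic))
```

On this toy data the academic feature ranks first when it is available, and `parent_degree` becomes the top split when only non-academic features are offered. This mirrors the kind of with/without comparison described in the abstract, not its reported numbers.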
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.