Koushik Roy*, Huu-Hoa Nguyen, and Dewan Md. Farid

* Corresponding author: Koushik Roy (email: rkoushikroy2@gmail.com)

Main Article Content

Abstract

This study addresses the problem of predicting student performance in educational data mining (EDM) by proposing an Adaptive Dimensionality Reduction Algorithm (ADRA). ADRA reduces the dimensionality of student data that spans academic, demographic, behavioral, social, and health-related features. It does so by iteratively selecting the most relevant features based on a combined normalized mean rank computed from five feature ranking methods. This reduction in dimensionality improves the performance of predictive models and provides insight into the key factors influencing student performance. The study evaluates ADRA on four student performance datasets with six machine learning algorithms and compares it with three existing dimensionality reduction methods. The results show that ADRA achieves an average dimensionality reduction factor of 6.2 while maintaining accuracy comparable to that of the other methods.
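The rank-combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the three placeholder ranking methods (the study uses five), the feature names, and the scores are all hypothetical. The sketch also assumes that a higher score means a more relevant feature, that ranks are normalized to [0, 1], and that the reduction factor is the original feature count divided by the selected count. The iterative selection loop of ADRA is not reproduced here.

```python
from statistics import mean


def normalized_ranks(scores):
    """Map scores to ranks in [0, 1]; 0 is the most relevant (highest score)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    denom = max(len(scores) - 1, 1)  # avoid division by zero for one feature
    for position, feature_index in enumerate(order):
        ranks[feature_index] = position / denom
    return ranks


def combined_mean_rank(score_table):
    """Average each feature's normalized rank across all ranking methods."""
    per_method = [normalized_ranks(scores) for scores in score_table.values()]
    return [mean(column) for column in zip(*per_method)]


def select_top_k(feature_names, score_table, k):
    """Return the k features with the lowest (best) combined mean rank."""
    combined = combined_mean_rank(score_table)
    ranked = sorted(zip(combined, feature_names))
    return [name for _, name in ranked[:k]]


# Hypothetical feature names and scores, for illustration only.
features = ["absences", "study_time", "age", "health"]
scores = {
    "method_a": [0.9, 0.5, 0.1, 0.3],
    "method_b": [0.8, 0.6, 0.2, 0.1],
    "method_c": [0.7, 0.9, 0.3, 0.2],
}

selected = select_top_k(features, scores, k=2)
reduction_factor = len(features) / len(selected)  # assumed definition
print(selected)          # ['absences', 'study_time']
print(reduction_factor)  # 2.0
```

Averaging normalized ranks rather than raw scores keeps ranking methods with different score scales comparable, which is one plausible reason for combining them this way.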

Keywords: Backward elimination, dimensionality reduction, forward selection, recursive feature elimination, student performance prediction

Article Details

Author Biographies

Koushik Roy, United International University, Dhaka, Bangladesh

Koushik Roy is a proficient Data Analyst at NEXT Ventures, based in Dhaka, Bangladesh, known for his adeptness in processing, analyzing, and visualizing data, alongside crafting automated data processing systems. With a background as a Machine Learning Engineer at NybSys Ltd., Koushik has applied his skills in areas such as facial recognition, surveillance solutions, and handwritten digit recognition. Currently pursuing a Master's degree in Computer Science and Engineering at the United International University, his academic research focuses on Student Performance Prediction using Education Data Mining. Koushik's technical repertoire spans Python, R, TensorFlow, and PyTorch, and his diverse skill set includes deep learning, computer vision, and robotics. His research contributions, certifications, and personal projects demonstrate his dedication to advancing data science and artificial intelligence, positioning him as a valuable contributor to the field with a keen interest in real-world problem-solving and innovation.

Huu-Hoa Nguyen, College of Information and Communication Technology, Can Tho University, 3/2 Street, Ninh Kieu District, Can Tho City, Vietnam

Nguyen Huu Hoa received his Engineering Degree in Informatics from Can Tho University (Vietnam) in 1996, his MSc Degree in Information Systems from HAN University (The Netherlands) in 2004, and his PhD Degree in Informatics from Lyon University (France) in 2012. Dr. Hoa served as Vice Dean from 2015 to 2018 and as Dean from 2018 to 2022, and has been Rector of the College of Information and Communication Technology at Can Tho University since October 2022. He has 25 years of experience in teaching and research covering many ICT topics, including artificial intelligence, computer networks, data mining, big data, information visualization, and knowledge management systems. His expertise in these topics is especially valued and offers distinctive perspectives for research on digital asset management. Dr. Hoa has also secured multiple research projects, such as “High dimensional heterogeneous data based animation techniques for Southeast Asian intangible cultural heritage digital content”, funded by the Horizon-2020 European Commission, as Co-Partner, and “Smart City”, funded by the Newton Fund in Vietnam, as Co-Investigator. He has organized several international and national conferences and has served as program chair or technical program member at many conferences during his career.

Dewan Md. Farid, United International University, Dhaka, Bangladesh

Prof. Dr. Dewan Md. Farid is a Professor of Computer Science and Engineering at United International University. He is an IEEE Senior Member and a member of ACM. Prof. Farid worked as a Postdoctoral Fellow/Staff at the following research labs/groups: (1) Computational Intelligence Group (CIG), Department of Computer Science and Digital Technology, University of Northumbria at Newcastle, UK in 2013, (2) Computational Modelling Lab (CoMo) and Artificial Intelligence Research Group, Department of Computer Science, Vrije Universiteit Brussel, Belgium in 2015-2016, and (3) Decision and Information Systems for Production systems (DISP) Laboratory, IUT Lumière – Université Lyon 2, France in 2020. Prof. Farid was a Visiting Faculty at the Faculty of Engineering, University of Porto, Portugal in June 2016. He received his PhD in Computer Science and Engineering from Jahangirnagar University, Bangladesh in 2012. Part of his PhD research was carried out at ERIC Laboratory, University Lumière Lyon 2, France through the Erasmus-Mundus ECW eLink PhD Exchange Program. His PhD was fully funded by the Ministry of Science & Information and Communication Technology, Government of the People’s Republic of Bangladesh and the European Union (EU) eLink project. Prof. Farid has published 118 peer-reviewed scientific articles, including 31 in highly esteemed journals such as Expert Systems with Applications, Journal of Theoretical Biology, Journal of Neuroscience Methods, Bioinformatics, Scientific Reports (Nature), and Proteins, in the fields of Machine Learning, Data Mining, and Big Data. Prof. Farid received the following awards: (1) Dr. Fatema Rashid Best Paper Award (2nd Position) for the paper titled “KNNTree: A new method to ameliorate k-nearest neighbour classification using decision tree” at the 3rd International Conference on Electrical Computer and Communication Engineering (ECCE 2023), CUET, Chittagong, Bangladesh, (2) JuliaCon 2019 Travel Award for attending the Julia Conference at the University of Maryland, Baltimore, USA, and (3) United Group Research Award 2016 in the field of Science and Engineering. He received the following research funds as Principal Investigator: (1) a2i Innovation Fund of Innov-A-Thon 2018 (Ideabank ID No.: 12502) from a2i-Access to Information Program – II, Information and Communication Technology (ICT) Division, Government of the People’s Republic of Bangladesh, and (2) Project Code: UIU/IAR/01/2021/SE/23 from the Institute for Advanced Research (IAR), United International University. Prof. Farid received the following Erasmus Mundus scholarships: (1) LEADERS (Leading mobility between Europe and Asia in Developing Engineering Education and Research) to undertake a staff-level mobility at the Faculty of Engineering, University of Porto, Portugal in 2015, (2) cLink (Centre of excellence for Learning, Innovation, Networking and Knowledge) for pursuing a postdoc at the University of Northumbria at Newcastle, UK in 2013, and (3) eLink (east west Link for Innovation, Networking and Knowledge exchange) for pursuing a Ph.D. at University Lumière Lyon 2, France in 2009. Prof. Farid also received Senior Fellowship I and II awards from National Science & Information and Communication Technology (NSICT), Ministry of Science & Information and Communication Technology, Government of the People’s Republic of Bangladesh, in 2008 and 2011 respectively, for pursuing a Ph.D. at Jahangirnagar University. He has visited 18 countries for international conferences, research, and higher education. Prof. Farid has delivered several invited/keynote talks, including an invited research talk at the Data to AI Group (DAI), Laboratory for Information and Decision Systems (LIDS), Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, USA.

