Prediction model of teacher candidate student graduation status: Decision Tree C4.5, Naive Bayes, and k-NN

Kartianom Kartianom* - Institut Agama Islam Negeri Bone, Indonesia
Arpandi Arpandi - Institut Agama Islam Negeri Bone, Indonesia
Gulzhaina K. Kassymova - Abai Kazakh National Pedagogical University, Kazakhstan
Oscar Ndayizeye - Hebei Foreign Studies University, China

DOI: 10.30863/ekspose.v21i2.3407

This study aims to determine a prediction model for the graduation status of prospective teacher students at IAIN Bone in terms of the attributes used, the accuracy achieved, and the differences in accuracy among the Decision Tree C4.5, Naïve Bayes, and k-NN data mining algorithms. The research takes a quantitative approach using data mining methods and was conducted at IAIN Bone. Data were collected through documentation, in the form of records of alumni of the Tarbiyah Faculty of IAIN Bone. The data were analyzed descriptively using the Decision Tree C4.5, Naïve Bayes, and k-NN algorithms with the aid of the RapidMiner application. The results show that (1) the attributes in the models produced by Decision Tree C4.5 and Naïve Bayes are gender, age, Semester 1 IP, Semester 2 IP, Semester 3 IP, Semester 4 IP, and GPA, while the attributes in the k-NN model are gender, regional origin, number of siblings, age, Semester 1 IP, Semester 2 IP, Semester 3 IP, Semester 4 IP, and GPA; (2) the accuracy rate is 93.90% for Decision Tree C4.5, 90.24% for Naïve Bayes, and 92.07% for k-NN; and (3) there is no significant difference in accuracy between Decision Tree C4.5 and Naïve Bayes (p-value = 1.00), between Decision Tree C4.5 and k-NN (p-value = 1.00), or between Naïve Bayes and k-NN (p-value = 1.00) in predicting the graduation status of prospective teacher students at IAIN Bone.
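As a rough illustration of how one of the three classifiers works, the following is a minimal k-NN sketch in plain Python. It is not the RapidMiner workflow used in the study: the feature layout (four semester IP values plus GPA), the class labels `on_time` and `late`, the choice of `k=3`, the Euclidean distance, and every data row are hypothetical values invented for this example, so the accuracy it computes says nothing about the reported results.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training rows nearest to query."""
    distances = sorted(
        (euclidean(row, query), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy records: [IP sem 1, IP sem 2, IP sem 3, IP sem 4, GPA].
# These rows and labels are invented for illustration only.
train_X = [
    [3.6, 3.7, 3.5, 3.8, 3.65],
    [3.4, 3.5, 3.6, 3.5, 3.50],
    [3.8, 3.9, 3.7, 3.8, 3.80],
    [2.6, 2.4, 2.7, 2.5, 2.55],
    [2.9, 2.7, 2.8, 2.6, 2.75],
    [2.3, 2.5, 2.4, 2.6, 2.45],
]
train_y = ["on_time", "on_time", "on_time", "late", "late", "late"]

test_X = [
    [3.5, 3.6, 3.6, 3.7, 3.60],
    [2.7, 2.6, 2.5, 2.7, 2.62],
]
test_y = ["on_time", "late"]

predictions = [knn_predict(train_X, train_y, q, k=3) for q in test_X]
acc = accuracy(test_y, predictions)
print(predictions, acc)
```

In practice, categorical attributes such as gender or regional origin would need to be encoded numerically, and attributes on different scales (for example age versus GPA) would usually be normalized before computing distances.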
Keywords
Data Mining; Decision Tree C4.5; Naïve Bayes; k-NN; Student Graduation Status

Article Info
Submitted: 2022-12-12
Published: 2022-12-16
Section: Articles