Comparative Analysis of Hybrid Model Performance Using Stacking and Blending Techniques for Student Drop Out Prediction In MOOC

  • Muhammad Ricky Perdana Putra, Universitas Amikom Yogyakarta
  • Ema Utami, Universitas Amikom Yogyakarta
Keywords: machine learning, classification, stacking, blending, mooc

Abstract

Massive Open Online Courses (MOOCs) are in high demand as a resource for lifelong learning and as a supplement to academic material, but their implementation faces problems, one of which is a student dropout (DO) rate that reaches 93%. As one solution, machine learning can serve as a risk management and early warning system for students at risk of dropping out. Ensemble techniques can improve model performance, but previous research has not examined which ensemble technique is best suited to this case study. As its contribution, this study compares the performance of models built with stacking and blending techniques. The base models use KNN, Decision Tree, and Naïve Bayes, while the meta-model uses XGBoost; the same algorithms are used for both the stacking and the blending models. In the experiments, stacking achieved 82.53% accuracy, 84.48% precision, 94.12% recall, and an 89.04% F1-score, while blending achieved 83.39% accuracy, 85.31% precision, 94.21% recall, and an 89.54% F1-score. These results are supported by model testing with k-fold cross-validation and confusion matrices, which show the same results. Blending's accuracy is thus 0.86 percentage points higher than stacking's, so it can be concluded that blending performs better than stacking in the MOOC student dropout prediction case study.
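
The reported figures can be cross-checked with a little arithmetic. The sketch below is not the authors' code; it only recomputes each F1-score as the harmonic mean of the reported precision and recall, and the gap between the two techniques. The 83.39% figure is treated as blending's accuracy, which is the reading consistent with the stated 0.86 gap. The names `REPORTED`, `f1_score`, `recomputed_f1`, and `accuracy_gap` are illustrative and do not come from the paper.

```python
# Metrics as reported in the abstract, in percent.
# Assumption: 83.39 is taken as blending's accuracy, the reading that is
# consistent with the stated 0.86 gap over stacking's 82.53.
REPORTED = {
    "stacking": {"accuracy": 82.53, "precision": 84.48, "recall": 94.12, "f1": 89.04},
    "blending": {"accuracy": 83.39, "precision": 85.31, "recall": 94.21, "f1": 89.54},
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


def recomputed_f1(technique: str) -> float:
    """F1 recomputed from the reported precision and recall, rounded to 2 places."""
    metrics = REPORTED[technique]
    return round(f1_score(metrics["precision"], metrics["recall"]), 2)


def accuracy_gap() -> float:
    """Blending accuracy minus stacking accuracy, in percentage points."""
    gap = REPORTED["blending"]["accuracy"] - REPORTED["stacking"]["accuracy"]
    return round(gap, 2)


if __name__ == "__main__":
    for name, metrics in REPORTED.items():
        print(name, "reported F1:", metrics["f1"], "recomputed F1:", recomputed_f1(name))
    print("accuracy gap (percentage points):", accuracy_gap())
```

Under these assumptions the recomputed F1-scores agree with the reported ones to two decimal places, and the 0.86 difference is a gap in accuracy measured in percentage points, not a relative improvement.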

Published
2024-06-01
How to Cite
Muhammad Ricky Perdana Putra, & Ema Utami. (2024). Comparative Analysis of Hybrid Model Performance Using Stacking and Blending Techniques for Student Drop Out Prediction In MOOC. Jurnal RESTI (Rekayasa Sistem Dan Teknologi Informasi), 8(3), 346 - 354. https://doi.org/10.29207/resti.v8i3.5760
Section
Information Technology Articles