Credit Risk Detection in Peer-to-Peer Lending Using CatBoost
Abstract
Peer-to-peer (P2P) lending is widely used by private borrowers, small businesses, and MSMEs because it allows individuals and businesses to borrow money directly from lenders without the stringent requirements and criteria of traditional banks and financial institutions. However, P2P lending carries credit risk, characterized by a high rate of borrowers failing to repay their loans. A system that detects credit risk is therefore needed to minimize the risk of P2P lending. In this study, such a system was built using the CatBoost method on the Bondora loan dataset. The performance of the CatBoost algorithm was evaluated using ROC (Receiver Operating Characteristic) curves and the AUC (Area Under the Curve) metric. The experiment consists of three scenarios that differ in the data split. Scenario 2, with a 90:10 data split, gave the best result with an AUC of 0.80329, compared with 0.789583 for Scenario 1 (80:20 data split) and 0.781066 for Scenario 3 (70:30 data split).
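The evaluation protocol described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the helper names (`roc_auc`, `holdout_split`, `SCENARIOS`) are hypothetical, the data are synthetic rather than taken from the Bondora dataset, and the reading of each ratio as train:test is an assumption. No model is trained here; in practice the scores would be the default probabilities predicted by the fitted classifier. AUC is computed as the fraction of (defaulted, repaid) pairs in which the defaulted loan receives the higher score, which equals the area under the ROC curve.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form equals the area under
    the ROC curve and runs in O(n_pos * n_neg) time.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def holdout_split(rows, test_fraction, seed=0):
    """Shuffle a copy of rows and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical scenario table: test fractions for the 80:20, 90:10 and
# 70:30 splits, assuming each ratio is written as train:test.
SCENARIOS = {"scenario_1": 0.20, "scenario_2": 0.10, "scenario_3": 0.30}

# Synthetic stand-in data: (score, label) pairs where label 1 means default.
# These scores imitate a model's output; they are NOT results from the study.
rng = random.Random(42)
rows = []
for _ in range(1000):
    label = 1 if rng.random() < 0.3 else 0
    score = rng.gauss(0.6 if label else 0.4, 0.15)
    rows.append((score, label))

for name, test_fraction in SCENARIOS.items():
    train, test = holdout_split(rows, test_fraction)
    test_scores = [s for s, _ in test]
    test_labels = [y for _, y in test]
    auc = roc_auc(test_labels, test_scores)
    print(f"{name}: train={len(train)} test={len(test)} AUC={auc:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the three scenarios above are compared.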
Copyright (c) 2023 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author.
- The author acknowledges that Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is the first publisher of the article, under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate arrangements for the non-exclusive distribution of the manuscript published in this journal in other versions (e.g., depositing it in the author's institutional repository or publishing it in a book), with an acknowledgement that the manuscript was first published in Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi).