Performance Analysis of a Random Forest Model with the Manhattan-SMOTE Technique for Fraud Detection in Imbalanced Credit Card Transactions
DOI:
https://doi.org/10.30812/corisindo.v1.5257
Keywords:
fraud detection, imbalanced data, random forest, Manhattan-SMOTE, credit card transactions
Abstract
Class imbalance is one of the main challenges in developing credit card fraud detection systems. Machine learning models tend to be biased toward the majority class, so they struggle to detect fraudulent transactions, which form the minority class. This study aims to improve fraud detection performance by applying the Manhattan-SMOTE oversampling technique to balance the data before training a Random Forest model. Manhattan-SMOTE is an extension of conventional SMOTE that uses Manhattan distance when generating synthetic samples by interpolation, which makes it more stable and accurate for high-dimensional data. The evaluation shows that the Random Forest model without oversampling achieves an accuracy of 81.18% but a low recall of 36.26%. After Manhattan-SMOTE is applied, recall rises to 67%, the F1-score reaches 0.50, and ROC AUC jumps from 0.75 to 0.96, although accuracy falls to 70%. These results indicate that Manhattan-SMOTE substantially improves the model's ability to recognize fraudulent transactions while maintaining overall discriminative performance as measured by ROC AUC, despite the lower accuracy. The combination of Random Forest and Manhattan-SMOTE proves effective at handling class imbalance and is suitable for use in artificial-intelligence-based fraud detection systems.
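The oversampling step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the implementation details, so the following is an assumption about how a Manhattan-distance variant of SMOTE might look, not the authors' code: neighbours of each minority sample are ranked by Manhattan (L1) distance, and each synthetic point is a linear interpolation between a sample and one of its k nearest neighbours. The function names `manhattan` and `manhattan_smote`, the parameters `k` and `seed`, and the toy data are all hypothetical.

```python
import random


def manhattan(a, b):
    """L1 (Manhattan) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def manhattan_smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points from minority samples (a list of lists).

    Assumed procedure: pick a random minority sample, choose one of its k
    nearest minority neighbours under Manhattan distance, and interpolate
    linearly between the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Manhattan distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: manhattan(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class (hypothetical values, not the paper's data).
fraud = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = manhattan_smote(fraud, n_synthetic=6, k=2)
print(len(new_points))  # 6
```

In a full pipeline, such synthetic samples would be added to the training split only, before fitting the Random Forest, so that the test set keeps the original class distribution.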