Comparison of Support Vector Machine Performance with Oversampling and Outlier Handling in Diabetic Disease Detection Classification
DOI:
https://doi.org/10.30812/matrik.v22i3.2979
Keywords:
Accuracy, Diabetes Mellitus, Support Vector Machine, Synthetic Minority Over-Sampling Technique
Abstract
Diabetes mellitus is a chronic metabolic disease characterized by the body's inability to process carbohydrates and fats, which results in high glucose levels. It is the sixth leading cause of death in the world. Classifying diabetes mellitus data makes the disease easier to predict, and as technology develops, diabetes mellitus can be detected using machine learning methods. One such method is the support vector machine (SVM). The advantage of SVM is that it is very effective at classification and can quickly separate positive and negative points. It does so by producing a function that acts as a dividing line, called a hyperplane, fitted to all input data with the smallest possible error. This study aimed to obtain the best SVM classification model, judged by accuracy, sensitivity, and precision, for detecting diabetes by adding the Synthetic Minority Over-Sampling Technique (SMOTE) and handling outliers. SMOTE was applied to handle class imbalance. The data studied were indications of diabetes, consisting of eight factor variables and one class variable. The test results show that the SVM-SMOTE scenario produced the best accuracy. With the RBF kernel, this scenario reached an accuracy of 88% (an error of 12%), obtained with a 90:10 division of the data into training and test sets, along with a precision of 0.880 and a sensitivity of 0.880. The results indicate that classification is more accurate when SVM is combined with imbalanced-data handling (SMOTE), and that the proportion of training to test data influences the outcome of a test scenario.
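The two building blocks named in the abstract, SMOTE oversampling and the accuracy, precision, and sensitivity metrics, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, the function names `smote_oversample` and `classification_metrics` are hypothetical, and the points and labels are toy values rather than the study's data. It also omits the SVM itself and the outlier handling, whose details the abstract does not give.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and their nearest minority neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the chosen sample, excluding itself.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and sensitivity (recall) from predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy data (assumed for illustration only, not taken from the study).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2)]
new_points = smote_oversample(minority, n_new=4, k=2)
metrics = classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```

Each synthetic point lies on the line segment between two existing minority samples, which is how SMOTE balances the classes without duplicating records. In practice one would more likely rely on established libraries (for example, the `SMOTE` class in imbalanced-learn and `SVC` with `kernel="rbf"` in scikit-learn) than on hand-written code like this.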