Comparison of Support Vector Machine Performance with Oversampling and Outlier Handling in Diabetic Disease Detection Classification
DOI:
https://doi.org/10.30812/matrik.v22i3.2979
Keywords:
Accuracy, Diabetes Mellitus, Support Vector Machine, Synthetic Minority Over-Sampling Technique
Abstract
Diabetes mellitus is a chronic metabolic disease characterized by the body’s inability to process carbohydrates and fats, which leads to high glucose levels. It is the sixth leading cause of death in the world. Classifying data on diabetes mellitus makes the disease easier to predict, and as technology develops, diabetes mellitus can be detected using machine learning methods. One such method is the support vector machine (SVM). The advantage of SVM is that it is very effective at classification, so it can quickly separate positive and negative points. This study aimed to obtain the best SVM classification model, based on accuracy, sensitivity, and precision, for detecting diabetes by adding the Synthetic Minority Over-Sampling Technique (SMOTE) and handling outliers. SMOTE was applied to handle class imbalance. The SVM method aims to produce a function that acts as a dividing line, called a hyperplane, that fits all input data with the smallest possible error. The data studied were indications of diabetes, consisting of eight factor variables and one class variable. The test results show that the SVM-SMOTE scenario produced the best accuracy: with the RBF kernel it reached an accuracy of 88% (an error of 12%), obtained with a 90:10 division between training and test data. This scenario produced a precision of 0.880 and a sensitivity of 0.880. The results show that classification is more accurate when carried out using SVM with imbalanced-data handling (SMOTE), and it can be concluded that the division between training and test data influences a test scenario.
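The abstract relies on two ideas that a short sketch may make more concrete: how SMOTE creates synthetic minority samples, and how accuracy, precision, and sensitivity are computed. The pure-Python code below is an illustration only and is not the authors' implementation. The function names `smote` and `scores`, the neighbour count `k`, and the toy data are all assumptions made for this example. In practice, a study like this would more likely use a library such as imbalanced-learn for SMOTE and scikit-learn's `SVC` with `kernel="rbf"` for the classifier.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper: each synthetic point lies on the segment between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours. Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def scores(y_true, y_pred, positive=1):
    """Accuracy, precision, and sensitivity (recall) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy data, made up for illustration; this is not the study's dataset.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2], [0.9, 1.9]]
new_points = smote(minority, n_new=6, k=2)

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
result = scores(y_true, y_pred)
print(result)
```

Because every synthetic point is an interpolation between two existing minority samples, the new points stay inside the region the minority class already occupies. With a 90:10 split, oversampling would normally be applied to the training portion only, so that the test metrics are computed on unmodified data; whether the study did this is not stated in the abstract.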