Comparison of Distance Measurements Based on k-Numbers and Its Influence to Clustering
DOI:
https://doi.org/10.30812/matrik.v23i1.3078
Keywords:
Distance Measurements, Clustering, Numerical Measurements, Optimal Cluster
Abstract
Heuristic data requires an appropriate clustering method so that the information produced by the grouping process can be trusted. Choosing the optimal cluster from the grouping results remains challenging. This study aimed to analyze four numerical distance measurements against the patterns found in the available categorical data, in order to give users of heuristic data recommendations on how to derive knowledge or information from the best clusters. The method used was clustering with four distance measurements (Euclidean, Canberra, Manhattan, and Dynamic Time Warping), with the Elbow approach for optimization. The Elbow method with the Sum of Squared Errors (SSE) was employed to determine the optimal cluster. The number of clusters tested ranged from k = 2 to k = 10. Testing used data on students' use of social media to help them achieve higher GPAs. The data were collected from 300 completed questionnaires. The results showed that the Manhattan distance was the best numerical measurement, with the largest SSE of 45.359 and optimal clustering at k = 5. The optimal cluster generated with the Manhattan distance consisted of students with GPAs above 3.00 from the mathematics and computer department who used websites/vlogs as learning tools. Varying the number of clusters changes how close the attributes within each cluster are, which can affect the information each cluster is able to produce.
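The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`euclidean`, `manhattan`, `canberra`, `dtw`, `kmeans_sse`), the toy `points`, and the fixed iteration count are all assumptions made for illustration. The sketch also updates centroids with the arithmetic mean for every distance, which is a simplification (a median update is the usual choice for the Manhattan distance).

```python
import math
import random


def euclidean(a, b):
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def canberra(a, b):
    # Sum of |x - y| / (|x| + |y|); terms where both values are 0 are skipped.
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom:
            total += abs(x - y) / denom
    return total


def dtw(a, b):
    # Classic O(n*m) dynamic time warping with |x - y| as the local cost.
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def kmeans_sse(points, k, dist, iters=20, seed=0):
    # Hypothetical helper: Lloyd-style k-means using `dist` for assignment,
    # returning the SSE (sum of squared distances to the nearest centroid).
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return sum(min(dist(p, c) for c in centroids) ** 2 for p in points)


# Made-up toy data (not the study's questionnaire data): two obvious groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# SSE per candidate k; the study scans k = 2..10, the toy data only allows k <= 6.
sse_by_k = {k: kmeans_sse(points, k, manhattan) for k in range(2, 6)}
```

Plotting `sse_by_k` against k and looking for the point where the curve bends is the usual way to read off the Elbow; swapping `manhattan` for another distance function repeats the scan for that measurement.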