Comparison of Distance Measurements Based on k-Numbers and Its Influence to Clustering
DOI:
https://doi.org/10.30812/matrik.v23i1.3078
Keywords:
Distance Measurements, Clustering, Numerical Measurements, Optimal Cluster
Abstract
Heuristic data requires appropriate clustering methods so that the information produced by the grouping process is not called into doubt. Choosing the optimal cluster from the grouping results remains challenging. This study aimed to analyze four numerical distance measurements in light of the patterns found in the available categorical data, in order to give users of heuristic data recommendations on how to derive knowledge or information from the best clusters. The method was clustering with four distance measurements (Euclidean, Canberra, Manhattan, and Dynamic Time Warping), combined with the Elbow approach for optimization. The Elbow method with the Sum of Squared Errors (SSE) was used to determine the optimal cluster, with the number of test clusters ranging from k = 2 to k = 10. The test data concerned students' use of social media to help achieve higher GPAs and were collected from 300 completed questionnaires. The results showed that the Manhattan distance was the best numerical measurement, with the largest SSE of 45.359 and optimal clustering at k = 5. The optimal cluster generated with the Manhattan distance consisted of students with GPAs above 3.00 from the mathematics and computer department who used websites/vlogs as learning tools. Variations in the number of clusters change the proximity of attributes, which can affect each cluster's ability to produce information.
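The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes synthetic Likert-style data in place of the questionnaire responses, uses mean centroids for every distance (a simplification for the non-Euclidean measures), and defines SSE as the sum of squared distances to the nearest centroid. The names `kmeans_sse`, `METRICS`, and `sse_curves` are hypothetical.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def canberra(a, b):
    # Terms whose denominator is zero are skipped (a common convention).
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom:
            total += abs(x - y) / denom
    return total


def dtw(a, b):
    # Classic O(n*m) dynamic time warping with absolute-difference local cost.
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def kmeans_sse(points, k, dist, iters=20, seed=0):
    """Run a simple k-means loop with a pluggable distance and return the SSE.

    Assumption: centroids are coordinate-wise means regardless of the distance.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # SSE: squared distance from each point to its nearest centroid.
    return sum(min(dist(p, c) for c in centroids) ** 2 for p in points)


METRICS = {
    "Euclidean": euclidean,
    "Manhattan": manhattan,
    "Canberra": canberra,
    "Dynamic Time Warping": dtw,
}

# Synthetic stand-in data: 60 respondents, 4 questions scored 1 to 5.
_rng = random.Random(42)
data = [tuple(_rng.randint(1, 5) for _ in range(4)) for _ in range(60)]

# SSE for k = 2..10 under each distance; the "elbow" is read off each curve.
sse_curves = {
    name: {k: kmeans_sse(data, k, fn) for k in range(2, 11)}
    for name, fn in METRICS.items()
}

for name, curve in sse_curves.items():
    print(name, {k: round(v, 3) for k, v in curve.items()})
```

The elbow is usually identified by inspecting where each SSE curve stops dropping sharply as k grows. The sketch leaves that step to the reader rather than automating it, since the abstract does not state which selection rule was used.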