Clustering of BPJS National Health Insurance Participants Using the DBSCAN Algorithm

Wiwit Pura Nurmayanti; Dewi Juliah Ratnaningsih; Sausan Nisrina; Abdul Rahim; Muhammad Malthuf; Wirajaya Kusuma

doi:10.30812/varian.v6i1.1886

Authors

Wiwit Pura Nurmayanti Universitas Hamzanwadi, Indonesia
Dewi Juliah Ratnaningsih Universitas Hamzanwadi, Indonesia
Sausan Nisrina Universitas Terbuka, Indonesia
Abdul Rahim Universitas Mulawarman, Indonesia
Muhammad Malthuf Mataram State Islamic University, Indonesia
Wirajaya Kusuma Universitas Bumigora, Indonesia

Keywords:

Clustering, DBSCAN, JKN BPJS Health, Noise, Outlier, Spatial

Abstract

In the current era of Big Data, obtaining data is no longer difficult because it can be accessed easily through open-access sources on the internet. Large volumes of data can bring many problems, such as observations that deviate too far from the average (outliers). One method for handling outliers is DBSCAN, a density-based clustering algorithm. DBSCAN can be applied in various fields, including the social sector, in this case participation in JKN BPJS Health in West Nusa Tenggara. This study examines the distribution of BPJS Health participation groups and detects outliers so that noise objects are not included in any cluster. The results of the DBSCAN algorithm show that the optimal epsilon value is about 0.37, chosen by observing the knee of a curve, with MinPts of 3 and the highest silhouette value of 0.2763. The highest JKN BPJS participation is in cluster 1 (5 sub-districts), the second highest is in cluster 3 (5 sub-districts), and the lowest is in cluster 2 (93 sub-districts). Thirteen sub-districts are not assigned to any cluster because they are noise.
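To illustrate how DBSCAN keeps noise out of clusters, the following is a minimal pure-Python sketch, not the authors' implementation. The toy coordinates are invented for illustration and are not the study's data; only the epsilon (0.37) and MinPts (3) values are taken from the abstract. The function names `region_query` and `dbscan` are hypothetical, and MinPts is assumed to count the point itself.

```python
import math

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet processed


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (1, 2, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: reachable but not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_neighbors)
    return labels


# Invented toy data: two dense groups and one isolated point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (2.0, 2.0), (2.1, 2.0), (2.0, 2.1),
    (5.0, 5.0),
]
labels = dbscan(points, eps=0.37, min_pts=3)
print(labels)  # [1, 1, 1, 1, 2, 2, 2, -1]
```

The isolated point receives the label -1 and joins no cluster, which mirrors how the 13 noise sub-districts are excluded from the three clusters reported above.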
