Comparison of K-Means, Fuzzy C-Means, Fuzzy Gustafson Kessel, and DBSCAN for Village Grouping in Surabaya Based on Poverty Indicators
Abstract
Population growth rates are rising in many countries, including Indonesia. Rapid population growth can harm the socio-economic life of a community through higher unemployment, food shortages, and high poverty rates. Local governments therefore pursue a range of policies to address poverty, including in Surabaya, East Java, Indonesia. This study groups villages in Surabaya by poverty indicators using four non-hierarchical clustering methods: K-Means, Fuzzy C-Means, Fuzzy Gustafson Kessel, and DBSCAN (Density-Based Spatial Clustering of Applications with Noise). The groupings produced by each method were compared using the within-cluster sum of squares and the average silhouette width. Judged by the within-cluster sum of squares, K-Means is the best method for grouping the villages. Judged by the average silhouette width, DBSCAN is the best method because its value is closer to 1 than those of the other methods. The study concludes that K-Means and DBSCAN are the best methods for grouping villages in Surabaya, East Java, Indonesia in relation to poverty.
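The two comparison criteria named in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance and hard cluster labels, and it uses invented toy points rather than the study's poverty indicators. The function names are hypothetical, and the source does not say how DBSCAN noise points or fuzzy memberships were handled when the scores were computed, so this sketch simply treats every label as an ordinary cluster.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def within_cluster_ss(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum(euclidean(p, centroid) ** 2 for p in members)
    return total


def average_silhouette_width(points, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all points.

    a is the mean distance to the other members of the point's own cluster,
    b is the smallest mean distance to the members of any other cluster.
    Values near 1 indicate compact, well-separated clusters.
    """
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")
    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in idx) / len(idx)
            for other, idx in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Toy data (assumption): two obvious groups in two dimensions.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible groups
poor_labels = [0, 1, 0, 1, 0, 1]  # mixes the groups

wcss_good = within_cluster_ss(points, good_labels)
wcss_poor = within_cluster_ss(points, poor_labels)
sil_good = average_silhouette_width(points, good_labels)
sil_poor = average_silhouette_width(points, poor_labels)
```

On this toy data the labelling that matches the visible groups has the lower within-cluster sum of squares and the higher average silhouette width. The two criteria measure different things, which is consistent with the abstract naming a different best method under each one.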
DOI: http://dx.doi.org/10.21043/jpmk.v5i2.16552
This work is licensed under a Creative Commons Attribution 4.0 International License.
Editorial and Administration Office:
Jurnal Pendidikan Matematika (Kudus)
Tadris Matematika, Tarbiyah Faculty, Institut Agama Islam Negeri Kudus
Jl. Conge Ngembalrejo Po Box 51, Kudus, Jawa Tengah, Indonesia, Kode Pos: 59322