Evaluation of K-Means, DBSCAN, and Hierarchical Clustering for Strategic Segmentation of Tourism SMEs in Rembang, Indonesia

Authors

  • Ardiansyah Ramadhan, Computer Engineering, Telkom University, Indonesia
  • Fandi Achmad, Industrial Engineering, Telkom University, Indonesia
  • Ibnu Zulkarnain, Industrial Engineering, National Taiwan University of Science and Technology, Taiwan
  • Masayoshi Aritsugi, Faculty of Advanced Science and Technology, Kumamoto University, Japan

DOI:

https://doi.org/10.52436/1.jutif.2025.6.3.4602

Keywords:

Clustering Algorithm, DBSCAN, Hierarchical Clustering, K-Means, Tourism Industry

Abstract

Small and Medium Enterprises (SMEs) play a crucial role in job creation, regional competitiveness, and economic equity. In the tourism sector, particularly in ecotourism and cultural tourism, clustering SMEs is challenging because the data variables are complex and interrelated. This study evaluates the effectiveness of three clustering algorithms (K-Means, DBSCAN, and Hierarchical Clustering) in segmenting SMEs based on real-world tourism datasets. Data were collected through purposive sampling from 203 valid respondents at SMEs in Rembang Regency, Central Java. Clustering quality was assessed using the Silhouette Coefficient and the Davies-Bouldin Index, while computational efficiency and scalability were analyzed through execution time and memory usage. The results show that DBSCAN achieved the best clustering quality (Silhouette Coefficient: 0.5496, Davies-Bouldin Index: 0.3298), effectively managing noise and irregular cluster shapes. Hierarchical Clustering offered moderate quality and helped reveal relationships between SMEs. In contrast, K-Means produced the lowest quality (Silhouette Coefficient: 0.2321) because of its limitation in handling non-spherical clusters. In terms of computational efficiency, Hierarchical Clustering required the least memory (0.14 MB) and the shortest execution time (5.73 seconds), while K-Means took the longest (26.00 seconds). DBSCAN consumed more memory because of its density-based processing. In scalability testing with increasing dataset sizes, K-Means was the most stable, whereas Hierarchical Clustering became inefficient. The findings support selecting a clustering method according to data complexity and size. This study enhances data-driven tourism development strategies and advances clustering methodology for applied informatics. Future work may explore hybrid clustering and predictive models for deeper insights.
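
The two quality indices named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the study's code or data: the toy points, the function names, the choice to skip points labeled -1 (a common convention for DBSCAN noise), and the use of time.perf_counter and tracemalloc for execution time and peak memory are all assumptions made for illustration. A higher Silhouette Coefficient and a lower Davies-Bouldin Index both indicate compact, well-separated clusters.

```python
import math
import time
import tracemalloc


def group_by_label(points, labels, noise_label=-1):
    """Group points by cluster label, skipping noise (assumed to be -1)."""
    clusters = {}
    for point, label in zip(points, labels):
        if label != noise_label:
            clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("both indices need at least two clusters")
    return clusters


def silhouette_coefficient(points, labels):
    """Mean of (b - a) / max(a, b) over all clustered points."""
    clusters = group_by_label(points, labels)
    scores = []
    for label, members in clusters.items():
        for p in members:
            if len(members) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance from p to the other members of its own cluster
            a = sum(math.dist(p, q) for q in members) / (len(members) - 1)
            # b: mean distance from p to the nearest other cluster
            b = min(
                sum(math.dist(p, q) for q in other) / len(other)
                for other_label, other in clusters.items()
                if other_label != label
            )
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def davies_bouldin_index(points, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid gap."""
    clusters = group_by_label(points, labels)
    centroids = {
        label: tuple(sum(axis) / len(members) for axis in zip(*members))
        for label, members in clusters.items()
    }
    # scatter: mean distance of a cluster's points to its centroid
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }
    ratios = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(ratios) / len(ratios)


def profile(fn, *args):
    """Return (result, elapsed seconds, peak traced memory in MB) for fn(*args)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak_bytes / 1024 ** 2


# Hypothetical toy data: two tight, well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = [0, 0, 0, 1, 1, 1]

score, seconds, peak_mb = profile(silhouette_coefficient, points, labels)
print(f"silhouette={score:.4f} time={seconds:.6f}s peak={peak_mb:.4f} MB")
print(f"davies_bouldin={davies_bouldin_index(points, labels):.4f}")
```

In practice the labels would come from a clustering library rather than being written by hand, and how noise points are treated when scoring DBSCAN output is a design choice that can shift the reported indices.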

Published

2025-07-09

How to Cite

[1]
A. Ramadhan, F. Achmad, I. Zulkarnain, and M. Aritsugi, “Evaluation of K-Means, DBSCAN, and Hierarchical Clustering for Strategic Segmentation of Tourism SMEs in Rembang, Indonesia”, J. Tek. Inform. (JUTIF), vol. 6, no. 3, pp. 1605–1630, Jul. 2025.