Hybrid Unsupervised-Supervised Learning for Housing Submarket Segmentation and Price Prediction in Surabaya Urban Areas

Authors

  • Rinabi Tanamal, Information Systems, Universitas Ciputra, Indonesia
  • Satria Adi Nugraha, Information Systems, Universitas Ciputra, Indonesia
  • Nathalia Minoque Kusuma Salma Rasyid Jr, Information Systems, Universitas Ciputra, Indonesia
  • Livanty Efatania Dendy, Information Systems, Universitas Ciputra, Indonesia
  • Jessica Theijer, Information Systems, Universitas Ciputra, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2026.7.3.5517

Keywords:

Clustering Algorithms, Housing Price Prediction, Hybrid Machine Learning, Submarket Segmentation, Surabaya Real Estate

Abstract

Surabaya’s rapid population growth, reaching 3.02 million residents, has intensified housing affordability challenges and increased structural variability in residential markets. This study proposes a hybrid machine learning framework that combines unsupervised clustering with supervised classification to identify submarket segments and predict housing price categories. A dataset of 490 properties containing structural, land, ownership, and contextual features was preprocessed and analyzed using K-Means. Cluster quality assessment through elbow inspection and a silhouette score of 0.45 indicated the presence of five meaningful market segments. These segments served as targets for a supervised classification stage that evaluated seven models, optimized via randomized hyperparameter search within a standardized preprocessing pipeline.
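
As a rough illustration of the segmentation step, the sketch below runs K-Means for several candidate values of k and scores each partition with the mean silhouette coefficient. It is not the authors' code: the eight toy points, the candidate range (2 to 4), and the helper names `kmeans` and `silhouette` are assumptions made for this example, and it uses only the Python standard library. In practice a library such as scikit-learn (`KMeans`, `silhouette_score`) would typically be used on the preprocessed property features.

```python
import math
import random


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centroids = rng.sample(points, k)  # k distinct data points as starting centroids
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0  # the score is undefined for a single cluster
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        total += (b - a) / max(a, b)
    return total / len(points)


# Toy data (assumption): two tight, well-separated groups of four points.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
rng = random.Random(0)
scores = {k: silhouette(points, kmeans(points, k, rng)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 2))
```

On this toy data the highest silhouette falls at k = 2 because the groups are cleanly separated. Real housing data is noisier, which is consistent with the more moderate score of 0.45 reported for five segments.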

The RBF-SVM achieved the strongest performance, reaching 97 percent accuracy and a macro-F1 score of 0.97, representing an 8 percent improvement over non-hybrid baselines and outperforming boosted ensembles such as XGBoost. Permutation importance analysis identified the number of floors, building orientation, position rank, and ownership status as the dominant drivers of segment differentiation. The integration of clustering and classification enhances predictive reliability while improving interpretability, offering a transparent analytical toolkit for housing market assessment.
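
The two evaluation ideas named above, macro-F1 and permutation importance, can be sketched in a few lines. This is an illustrative, standard-library-only example rather than the study's implementation: `toy_predict` is a hypothetical stand-in for a fitted classifier, the feature matrix is synthetic, and accuracy is assumed as the scoring metric for the importance calculation. scikit-learn offers equivalent functionality through `f1_score(average="macro")` and `sklearn.inspection.permutation_importance`.

```python
import random


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = accuracy(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical fitted model: the label depends only on feature 0."""
    return 1 if row[0] >= 2 else 0


# Synthetic data (assumption): feature 0 drives the label, feature 1 is noise.
X = [[i % 4, (i * 7) % 5] for i in range(20)]
y = [toy_predict(row) for row in X]

importances = permutation_importance(toy_predict, X, y)
score = macro_f1([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2])
print(importances, round(score, 4))
```

Because the toy model ignores feature 1, shuffling that column leaves every prediction unchanged and its importance is exactly zero, while shuffling feature 0 lowers accuracy. The same reading applies to the reported drivers: a feature ranks highly when permuting it degrades the classifier's score the most.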

The proposed framework provides actionable insights for developers, appraisers, and policymakers in Surabaya, enabling data-driven identification of submarkets and supporting more equitable housing strategies aligned with SDG 11 on sustainable urban development. The approach is scalable to other Indonesian cities and establishes a foundation for future work incorporating spatial, socioeconomic, or temporal predictors.

Published

2026-06-15

How to Cite

[1]
R. Tanamal, S. A. Nugraha, N. M. K. S. Rasyid Jr, L. E. Dendy, and J. Theijer, “Hybrid Unsupervised-Supervised Learning for Housing Submarket Segmentation and Price Prediction in Surabaya Urban Areas”, J. Tek. Inform. (JUTIF), vol. 7, no. 3, pp. 2092–2113, Jun. 2026.