OPTIMAL STUDY OF REAL-ESTATE PRICE PREDICTION MODELS USING MACHINE LEARNING
Abstract
Everyone wants a place to live, preferably one close to work and shopping centers, with easy access to transportation and a low crime rate. Pricing therefore has to account for such external factors, not just the house itself, and determining a fair price can be difficult. The aim of this research is to predict real-estate prices by taking these factors into account. The predictions are useful both for sellers who have difficulty setting a price and for prospective buyers who need to plan their finances before buying a house in the desired neighborhood. The dataset used in this research was obtained from Kaggle and consists of 506 samples with 14 attributes. Several machine learning algorithms were used to predict real-estate prices: Extra Trees (ET), Support Vector Regression (SVR), Random Forest (RF), eXtreme Gradient Boosting (XGB), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LGBM), and CatBoost. Principal Component Analysis (PCA) is applied as the feature selection step, after the preprocessing phase and before model building. The best-performing model is CatBoost tuned with GridSearchCV; because this model was cross-validated, the risk of overfitting on new data is low. The SVR model with a polynomial kernel, 10 principal components (PCs), and GridSearchCV achieves an R² score of 0.87, close to the score of CatBoost with GridSearchCV.
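The PCA, SVR, and GridSearchCV combination described above can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' code: the data is a synthetic stand-in generated with make_regression (506 samples, 13 predictors plus one target, matching only the shape of the dataset), and the scaling step, the parameter grid, the 80/20 split, and the 5-fold cross-validation are assumptions, so the scores it prints are unrelated to the reported 0.87.

```python
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the housing data (assumption): 506 samples,
# 13 predictors + 1 target = 14 attributes.
X, y = make_regression(n_samples=506, n_features=13, n_informative=10,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Preprocessing -> PCA with 10 components -> SVR with a polynomial kernel.
pipeline = Pipeline([
    ("scale", StandardScaler()),      # assumed preprocessing step
    ("pca", PCA(n_components=10)),    # 10 principal components, as in the abstract
    ("svr", SVR(kernel="poly")),
])

# Hypothetical search grid; the grid used in the study is not given.
param_grid = {
    "svr__C": [1.0, 10.0, 100.0],
    "svr__degree": [1, 2, 3],
}

# Cross-validated grid search scored by R^2 (5 folds is an assumption).
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

test_r2 = search.score(X_test, y_test)  # R^2 on the held-out split
print("best parameters:", search.best_params_)
print("cross-validated R2: %.3f" % search.best_score_)
print("held-out R2: %.3f" % test_r2)
```

Placing PCA inside the pipeline means the components are re-fitted on each training fold, so the cross-validated score is not inflated by information from the validation fold.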
Copyright (c) 2024 Ikhsan Maulana, Amril Mutoi Siregar, Santi Arum Puspita Lestari, Sutan Faisal
This work is licensed under a Creative Commons Attribution 4.0 International License.