Regression Based Prediction of Roblox Game Popularity Using Extreme Gradient Boosting with Hyperparameter Optimization
DOI:
https://doi.org/10.52436/1.jutif.2026.7.1.5648
Keywords:
Game Analytics, Hyperparameter Tuning, Machine Learning, Popularity Prediction, Roblox, XGBoost
Abstract
The rapid growth of the digital gaming industry has made predicting game popularity increasingly important on user-generated content platforms such as Roblox, where the diversity of games and highly variable user engagement make long-term popularity trends difficult to model. This study develops a regression-based popularity prediction model using the Extreme Gradient Boosting (XGBoost) algorithm based on user interaction indicators, including visits, likes, dislikes, favorites, and active players. To investigate the effect of model optimization, hyperparameter tuning is performed using GridSearchCV. Model performance is evaluated using Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the Coefficient of Determination (R²). Experimental results show that the baseline XGBoost model achieves an R² of 0.8074 (80.74%), indicating that it captures non-linear popularity patterns well. The optimized model, however, yields a lower R² of 0.7771 (77.71%) together with slightly higher prediction errors, showing that hyperparameter optimization does not always improve performance on highly skewed popularity data. Feature importance analysis further indicates that interaction-based attributes, particularly likes and dislikes, are the most influential predictors. These findings contribute to Informatics research by demonstrating the effectiveness of ensemble regression models for digital entertainment analytics, and they highlight the need to evaluate optimization strategies critically rather than assume universal performance gains.
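The four evaluation metrics named in the abstract can be illustrated with a minimal standard-library sketch. The function name `regression_metrics` and the sample values are hypothetical and chosen only for illustration; they are not the study's data or code. The definitions used are the standard ones, and R² is returned as a fraction, so a reported 80.74% corresponds to 0.8074.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for two equal-length numeric sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root mean squared error

    # R^2 = 1 - SS_res / SS_tot, undefined when the targets are constant.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical targets and predictions (illustrative values, not from the study).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Comparing a baseline and a tuned model then amounts to calling the function once per model on the same held-out targets and comparing the resulting dictionaries. A lower R² together with higher MAE, MSE, and RMSE for the tuned model is the pattern the abstract reports.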
License
Copyright (c) 2026 Inna Nur Amalina, Norhikmah, Dony Ariyus, Muhammad Koprawi, Rafli Ilham Prasetyo

This work is licensed under a Creative Commons Attribution 4.0 International License.





