Performance Comparison of XGBoost, LightGBM, and LSTM for E-Commerce Repeat Buyer Prediction

Authors

  • Lustiyono Prasetyo Nugroho, Computer Science, Universitas Amikom Purwokerto, Indonesia
  • Rujianto Eko Saputro, Computer Science, Universitas Amikom Purwokerto, Indonesia
  • Fandy Setyo Utomo, Computer Science, Universitas Amikom Purwokerto, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2026.7.1.5746

Keywords:

E-commerce, LightGBM, LSTM, Repeat Buyer Prediction, XGBoost

Abstract

Repeat buyer behavior is a key indicator of customer retention on e-commerce platforms. Accurately predicting repeat buyers remains difficult, however, because user behavior patterns are complex and interaction data carry temporal structure. Existing studies often rely on a single modeling approach or explore behavioral sequences only to a limited extent, which leaves little comparative insight between ensemble-based machine learning and sequence-based deep learning models. This study therefore systematically compares tree-based ensemble models (XGBoost and LightGBM) with a sequence-based deep learning model (LSTM) for predicting repeat buyers from user behavior data. To ensure a fair evaluation, data preprocessing and feature engineering were designed to prevent data leakage by using user behavior prior to the first purchase. Model performance was evaluated using accuracy, F1-score, and ROC–AUC. Experimental results show that XGBoost and LightGBM achieve stable classification performance, with accuracies of 86.11% and 85.84%, respectively, while the LSTM model attains the highest ROC–AUC, 0.937, indicating a superior capability to capture temporal behavioral patterns. These findings can help e-commerce platforms choose predictive models for repeat buyers and support more effective customer retention strategies.
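
The abstract names two technical steps: building features from behavior prior to the first purchase, and scoring models with accuracy, F1-score, and ROC–AUC. The following is a minimal sketch of both, not the authors' implementation. The event schema (user, timestamp, action), the action names, and the choice of simple action counts as features are all assumptions made for illustration; the metric functions follow the standard definitions.

```python
from collections import defaultdict


def pre_purchase_features(events):
    """Count each user's actions strictly before their first purchase.

    `events` is a hypothetical log of (user_id, timestamp, action) tuples.
    Users who never purchase are skipped, since they have no cutoff.
    """
    first_purchase = {}
    for user, ts, action in events:
        if action == "purchase":
            first_purchase[user] = min(ts, first_purchase.get(user, ts))

    feats = {user: defaultdict(int) for user in first_purchase}
    for user, ts, action in events:
        cutoff = first_purchase.get(user)
        # Anything at or after the cutoff could leak the label, so drop it.
        if cutoff is not None and ts < cutoff:
            feats[user][action] += 1
    return {user: dict(counts) for user, counts in feats.items()}


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Requires at least one positive and one negative.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    events = [
        ("u1", 1, "click"), ("u1", 2, "cart"), ("u1", 3, "purchase"),
        ("u1", 4, "click"), ("u1", 5, "purchase"), ("u2", 1, "click"),
    ]
    print(pre_purchase_features(events))

    y_true = [1, 0, 1, 0]
    scores = [0.9, 0.4, 0.35, 0.2]
    y_pred = [int(s >= 0.5) for s in scores]  # 0.5 threshold is an assumption
    print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy and F1-score depend on the decision threshold, while ROC–AUC depends only on how the scores rank positives against negatives. This is one reason a model can lead on ROC–AUC without leading on accuracy.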

Published

2026-03-08

How to Cite

[1] L. P. Nugroho, R. E. Saputro, and F. S. Utomo, “Performance Comparison of XGBoost, LightGBM, and LSTM for E-Commerce Repeat Buyer Prediction,” J. Tek. Inform. (JUTIF), vol. 7, no. 1, pp. 730–742, Mar. 2026.
