Comparative Analysis of Supervised Learning Algorithms for Delivery Status Prediction in Big Data Supply Chain Management

Authors

  • Riri Damayanti Apnena, Industrial Mechanics and Design, Politeknik TEDC, Indonesia
  • Gerinata Ginting, Computerized Accounting, Politeknik TEDC, Indonesia
  • Ari Sudrajat, Informatics Engineering, Politeknik TEDC, Indonesia
  • Hussain Md Mehedul Islam, Software Engineer, The Matlab Inc., United States

DOI:

https://doi.org/10.52436/1.jutif.2025.6.3.4689

Keywords:

Delivery Status Prediction, Handling Big Data, Machine Learning Models, Supervised Learning, Supply Chain Analysis, XGBoost

Abstract

This study addresses the problem of predicting delivery status from supply chain data, a critical task for optimizing logistics and operations. The dataset, which includes features such as order details, product specifications, and customer information, was pre-processed with oversampling to address class imbalance, so that the models could handle the rare cases of late or canceled deliveries. Data cleaning involved handling missing values, removing irrelevant columns, and transforming categorical variables into numerical form. After pre-processing and cleaning, five machine learning models were applied: Logistic Regression, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and XGBoost. Each model was evaluated on accuracy, precision, recall, and F1-score. XGBoost outperformed the other models, achieving the highest accuracy and the most reliable delivery status predictions, which makes it the best choice among the five for supply chain data analysis in this context. The study contributes to the growing application of machine learning in supply chain optimization by identifying XGBoost as a robust model for delivery status prediction on large datasets. Future research could explore hybrid models and advanced feature engineering techniques to further improve prediction accuracy and to address additional challenges in supply chain optimization, especially real-time data processing and dynamic supply chain environments.
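
The oversampling step and the evaluation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the abstract does not say which oversampling variant was used, so plain random oversampling (duplicating minority-class rows) is assumed here, and the class names, the toy rows, and the helper names `random_oversample` and `per_class_report` are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (assumed variant)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def per_class_report(y_true, y_pred):
    """Return overall accuracy and per-class precision, recall and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    report = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report


# Hypothetical imbalanced labels: 6 on-time, 3 late, 1 canceled delivery.
labels = ["on_time"] * 6 + ["late"] * 3 + ["canceled"] * 1
rows = [[i] for i in range(len(labels))]  # placeholder feature vectors
bal_rows, bal_labels = random_oversample(rows, labels)
balanced_counts = Counter(bal_labels)

# Hypothetical predictions from some classifier on a held-out set.
y_true = ["on_time", "on_time", "late", "late", "canceled", "on_time"]
y_pred = ["on_time", "late", "late", "late", "canceled", "on_time"]
accuracy, report = per_class_report(y_true, y_pred)

print(balanced_counts)
print(round(accuracy, 3), report["late"])
```

In practice, oversampling is usually applied to the training split only, so that duplicated rows do not leak into the data used for evaluation. Per-class precision and recall are worth reporting alongside accuracy because accuracy alone can look high on imbalanced data even when the rare classes are predicted poorly.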

Published

2025-06-23

How to Cite

[1]
R. D. Apnena, G. Ginting, A. Sudrajat, and H. M. M. Islam, “Comparative Analysis of Supervised Learning Algorithms for Delivery Status Prediction in Big Data Supply Chain Management”, J. Tek. Inform. (JUTIF), vol. 6, no. 3, pp. 1443–1456, Jun. 2025.