Efficient Waste Classification in Cisadane River Using Vision Transformer and Swin Transformer Architectures

Authors

  • Asep Surahmat, Technology and Design Faculty, Universitas Utpadaka Swastika, Indonesia
  • Rezza Anugrah Mutiarawan, Technology and Design Faculty, Universitas Utpadaka Swastika, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2025.6.6.4451

Keywords:

AI, Swin Transformer, Vision Transformer, Waste Classification, Waste Sorting

Abstract

The growing volume of waste in rivers has become a serious environmental problem. This study applies two Artificial Intelligence (AI) models, Vision Transformer (ViT) and Swin Transformer, to an automatic waste sorting system for the Cisadane River in Tangerang. The dataset combines public sources with field data and is preprocessed and augmented to improve robustness. To assess generalization and efficiency, the models were trained with k-fold cross-validation, pruned, and tested in deployment on edge devices. Two architectural innovations were introduced: Dynamic Patch Size, which adapts to the varied shapes and sizes of waste, and Spatial-Aware Attention, which sharpens the focus on waste objects against complex river backgrounds. The evaluation used a confusion matrix and a paired t-test to validate the statistical significance of the results. Swin Transformer achieved the highest accuracy, 94.2% versus 91.8% for ViT, along with precision of 93.5%, recall of 92.7%, and an F1-score of 93.1%. Swin Transformer also proved more reliable under dynamic lighting and in cluttered environments. The study demonstrates the potential of Transformer-based architectures for automatic waste classification and contributes to smarter, more efficient AI-based environmental management technologies.
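
The evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the per-fold accuracies are hypothetical and serve only to show how precision, recall, and F1 follow from a confusion matrix and how a paired t statistic is formed from per-fold results. The only figures taken from the abstract are the reported precision (93.5%) and recall (92.7%), whose harmonic mean matches the reported F1-score of 93.1%.

```python
import math
from statistics import mean, stdev


def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_metrics(cm):
    """Macro-averaged precision and recall from a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    n = len(cm)
    precisions, recalls = [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precisions.append(tp / predicted_k if predicted_k else 0.0)
        recalls.append(tp / actual_k if actual_k else 0.0)
    return mean(precisions), mean(recalls)


def paired_t_statistic(a, b):
    """t statistic of a paired t-test on matched samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Consistency check on the figures reported in the abstract.
reported_f1 = f1_score(0.935, 0.927)
print(round(reported_f1 * 100, 1))  # 93.1

# HYPOTHETICAL per-fold accuracies (5 folds), not taken from the paper.
swin_folds = [0.940, 0.945, 0.939, 0.944, 0.942]
vit_folds = [0.917, 0.920, 0.915, 0.921, 0.917]
t_stat = paired_t_statistic(swin_folds, vit_folds)

# With 5 folds there are 4 degrees of freedom; the two-sided 5% critical
# value of Student's t for df = 4 is 2.776.
print(t_stat > 2.776)
```

The pairing matters here: both models are evaluated on the same folds, so the test operates on the per-fold differences rather than treating the two sets of accuracies as independent samples.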


Published

2025-12-23

How to Cite

[1]
A. Surahmat and R. A. Mutiarawan, “Efficient Waste Classification in Cisadane River Using Vision Transformer and Swin Transformer Architectures”, J. Tek. Inform. (JUTIF), vol. 6, no. 6, pp. 5450–5461, Dec. 2025.