Performance Evaluation of Transformer Models: Scratch, Bart, and Bert for News Document Summarization

Authors

  • Khadijah Fahmi Hayati Holle, Informatics Engineering, Universitas Islam Negeri Maulana Malik Ibrahim, Indonesia
  • Daurin Nabilatul Munna, Informatics Engineering, Universitas Islam Negeri Maulana Malik Ibrahim, Indonesia
  • Enggarani Wahyu Ekaputri, Informatics Engineering, Universitas Islam Negeri Maulana Malik Ibrahim, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2025.6.2.2534

Keywords:

BART, BERT, Document Summarization, NLP, ROUGE, Transformer

Abstract

This study evaluates the performance of three Transformer-based models on the task of news document summarization: a Transformer trained from scratch, BART (Bidirectional and Auto-Regressive Transformers), and BERT (Bidirectional Encoder Representations from Transformers). The evaluation shows that BERT excels at capturing the bidirectional context of a text, achieving a ROUGE-1 score of 0.2471, ROUGE-2 of 0.1597, and ROUGE-L of 0.1597. BART demonstrates strong denoising ability and produces coherent summaries, with a ROUGE-1 score of 0.5239, ROUGE-2 of 0.3517, and ROUGE-L of 0.3683. The Transformer trained from scratch, although it requires large amounts of training data and computational resources, performs well when trained optimally, reaching a ROUGE-1 score of 0.7021, ROUGE-2 of 0.5652, and ROUGE-L of 0.6383. This evaluation provides insight into the strengths and weaknesses of each model in the context of news document summarization.
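
The paper's evaluation code is not published on this page; the sketch below only illustrates how summaries of this kind are typically generated and scored with ROUGE-1, ROUGE-2, and ROUGE-L, the metrics reported in the abstract. It uses the Hugging Face transformers summarization pipeline with the public facebook/bart-large-cnn checkpoint and Google's rouge-score package; the checkpoint, the sample texts, and the generation parameters are illustrative assumptions, not the authors' setup.

```python
# Minimal, illustrative sketch (not the authors' code): summarize one news
# article with a public BART checkpoint, then score the output with
# ROUGE-1/2/L. Requires: pip install transformers rouge-score torch

from transformers import pipeline
from rouge_score import rouge_scorer

# Hypothetical article/reference pair; the paper's dataset is not reproduced here.
article = (
    "The city council approved a new flood-mitigation budget on Monday. "
    "Officials said the funds will be used to upgrade drainage canals and "
    "install early-warning sensors in low-lying districts before the rainy season."
)
reference = "City council approves flood budget for drainage upgrades and warning sensors."

# facebook/bart-large-cnn is a publicly released BART model fine-tuned for news
# summarization; the paper does not state which checkpoint was used.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
generated = summarizer(article, max_length=40, min_length=10, do_sample=False)[0]["summary_text"]

# scorer.score(target, prediction) returns precision, recall, and F-measure
# per metric; published ROUGE numbers are usually the F-measure.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
for name, s in scorer.score(reference, generated).items():
    print(f"{name}: F1 = {s.fmeasure:.4f}")
```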

Published

2025-04-26

How to Cite

[1] K. F. H. Holle, D. N. Munna, and E. W. Ekaputri, “Performance Evaluation of Transformer Models: Scratch, Bart, and Bert for News Document Summarization”, J. Tek. Inform. (JUTIF), vol. 6, no. 2, pp. 787–802, Apr. 2025.