Information Retrieval Related to Information Regarding Covid-19 Using Transformers Architecture

Wiktasari Wiktasari; Prayitno Prayitno; Vinda Setya Kartika; Eri Eli Lavindi; Naufal Reky Ardhana; Rucirasatti Nariswana

doi:10.52436/1.jutif.2025.6.2.2606

Authors

Wiktasari: Computer Engineering Technology, Electrical Engineering, Semarang State Polytechnic, Indonesia
Prayitno: Computer Engineering Technology, Electrical Engineering, Semarang State Polytechnic, Indonesia
Vinda Setya Kartika: Computer Engineering Technology, Electrical Engineering, Semarang State Polytechnic, Indonesia
Eri Eli Lavindi: Computer Engineering Technology, Electrical Engineering, Semarang State Polytechnic, Indonesia
Naufal Reky Ardhana: Computer Engineering Technology, Electrical Engineering, Semarang State Polytechnic, Indonesia
Rucirasatti Nariswana: Computer Engineering Technology, Electrical Engineering, Semarang State Polytechnic, Indonesia

Keywords:

BERT, Bi-Encoder, Cross-Encoder, Information Retrieval, Transformers

Abstract

COVID-19 spread exponentially, creating a need for search technologies that return accurate information. Searching for COVID-19 information is difficult because the data are diverse and change rapidly, and because queries must be interpreted in a specific medical context. Unstructured sources such as research articles, news reports, and social media discussions make it harder still to retrieve relevant, up-to-date information. As the volume of pandemic-related data grows, so does the need for effective and accurate information retrieval systems. The Transformer architecture, known for its strength in natural language processing and in handling complex context, has great potential to improve search quality in the healthcare domain. In this study, BERT, a deep learning model, is used to retrieve documents for a given query and to sort the results by relevance. The ranking stage uses the BERT architecture to compare two transformer encoder designs: bi-encoders and cross-encoders. A bi-encoder uses two separate encoders to process two different inputs, such as a query and a document. A cross-encoder, in contrast, processes both texts together in a single encoder, which lets the model capture contextual interactions between them. The results indicate that the cross-encoder outperforms the bi-encoder on relatively small data sets, most visibly in mAP. The NDCG score is 0.89 for the bi-encoder and 0.9 for the cross-encoder. The mAP score is 0.7 for the bi-encoder and 0.89 for the cross-encoder. Both the bi-encoder and the cross-encoder achieved an MRR score of 1.0.
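The three ranking metrics reported in the abstract can be illustrated with a minimal sketch. It assumes binary relevance labels listed in ranked order and assumes every relevant document appears somewhere in the list; the labels below are hypothetical and are not the study's data. The function names are illustrative, and mAP and MRR would be obtained by averaging the per-query values over all queries.

```python
import math


def reciprocal_rank(relevance):
    """Return 1/rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevance, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def average_precision(relevance):
    """Mean of precision@k over the ranks k that hold a relevant item.

    Assumption: all relevant documents are present in the ranked list.
    """
    hits = 0
    precisions = []
    for rank, rel in enumerate(relevance, start=1):
        if rel > 0:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def ndcg(relevance):
    """DCG of the given order divided by the DCG of the ideal order."""
    def dcg(scores):
        return sum(s / math.log2(i + 1) for i, s in enumerate(scores, start=1))

    ideal = dcg(sorted(relevance, reverse=True))
    return dcg(relevance) / ideal if ideal > 0 else 0.0


# Hypothetical relevance labels for one query under two rankers.
ranking_a = [1, 0, 1, 0, 0]  # second relevant document placed at rank 3
ranking_b = [1, 1, 0, 0, 0]  # both relevant documents placed at the top

for name, ranking in (("a", ranking_a), ("b", ranking_b)):
    print(
        name,
        round(reciprocal_rank(ranking), 3),
        round(average_precision(ranking), 3),
        round(ndcg(ranking), 3),
    )
```

Both hypothetical rankings place a relevant document first, so their reciprocal rank is identical, while average precision and NDCG still separate them. This is one way two systems can share an MRR of 1.0 and differ in mAP.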

