Implementation of IndoBERT for Sustainability Impact Assessment in University Collaboration Information Systems

Authors

  • Ryan Hamonangan, Informatics Engineering, STMIK IKMI Cirebon, Indonesia
  • Raditya Danar Dana, Informatics Management, STMIK IKMI Cirebon, Indonesia
  • Yudhistira Arie Wijaya, Information System, STMIK IKMI Cirebon, Indonesia
  • Odi Nurdiawan, Informatics Management, STMIK IKMI Cirebon, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2026.7.3.5330

Keywords:

Information Systems, IndoBERT, Natural Language Processing, Sustainability Impact Assessment, University Collaboration

Abstract

University collaboration plays a critical role in enhancing institutional quality and supporting global sustainability agendas. However, many higher education institutions struggle to manage Memorandum of Understanding (MoU), Memorandum of Agreement (MoA), and Implementation Agreement (IA) documents, particularly when monitoring their implementation and assessing their alignment with sustainability goals. This study introduces a University Collaboration Information System enhanced with IndoBERT-based Natural Language Processing (NLP) to automate sustainability impact assessment. A synthetic corpus of 30 annotated collaboration documents was developed, covering multi-label Sustainable Development Goals (SDG) classification and span-level Named Entity Recognition (NER). Two approaches were evaluated: (1) a baseline that combines TF-IDF with a Support Vector Machine (SVM) for SDG classification and uses rule-based NER, and (2) fine-tuned IndoBERT for both tasks. Experimental results show that IndoBERT significantly outperforms the baselines, achieving an average F1-score of 0.93 for SDG classification (+16.3% over the baseline) and 0.96 for NER (+18.5% over the baseline). The system integrates these models to provide automated entity extraction, sustainability dashboards, and document monitoring features. This work contributes to informatics by demonstrating the effectiveness of Transformer-based NLP for processing institutional documents and by providing an integrated information-system framework that strengthens the role of NLP within computer science.
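
The abstract reports average F1-scores for two tasks: multi-label SDG classification and span-level NER. The sketch below illustrates one common way such scores can be computed. It is not the authors' evaluation code. It assumes macro-averaging over SDG labels and exact-match scoring of (start, end, type) spans, neither of which the abstract specifies. The function names and the toy gold/predicted annotations are hypothetical and exist only for illustration.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, entity type); assumed format


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; returns 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1_multilabel(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Per-label F1 averaged over every label seen in gold or pred (macro average)."""
    labels = sorted(set().union(*gold, *pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        scores.append(f1(tp, fp, fn))
    return sum(scores) / len(scores) if scores else 0.0


def span_f1(gold: List[Set[Span]], pred: List[Set[Span]]) -> float:
    """Exact-match span F1: a predicted span counts only if offsets and type all match."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    return f1(tp, fp, fn)


# Hypothetical toy annotations for three documents (not data from the study).
gold_sdg = [{"SDG4", "SDG17"}, {"SDG9"}, {"SDG4"}]
pred_sdg = [{"SDG4", "SDG17"}, {"SDG9", "SDG17"}, {"SDG4"}]

# Hypothetical spans for one document; the second prediction has a wrong end offset.
gold_ner = [{(0, 19, "ORG"), (25, 35, "DATE")}]
pred_ner = [{(0, 19, "ORG"), (25, 34, "DATE")}]

print(f"SDG macro F1: {macro_f1_multilabel(gold_sdg, pred_sdg):.3f}")
print(f"NER span F1:  {span_f1(gold_ner, pred_ner):.3f}")
```

Macro-averaging weights every SDG label equally, which matters on a small corpus where some goals appear in only a few documents. A micro average would instead pool the counts across labels before computing F1, and the two can differ noticeably.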

Published

2026-06-15

How to Cite

R. Hamonangan, R. Danar Dana, Y. Arie Wijaya, and O. Nurdiawan, “Implementation of IndoBERT for Sustainability Impact Assessment in University Collaboration Information Systems”, J. Tek. Inform. (JUTIF), vol. 7, no. 3, pp. 2506–2516, Jun. 2026.