Enhancing Accessibility in Local Government Data Portals via Retrieval-Augmented Generation: A Case Study on Satu Data Indonesia in Banyumas Regency

Authors

  • Agus Nur Hadie, Magister of Computer Science, Faculty of Computer Science, Universitas Amikom Purwokerto, Indonesia
  • Imam Tahyudin, Magister of Computer Science, Faculty of Computer Science, Universitas Amikom Purwokerto, Indonesia
  • Taqwa Hariguna, Magister of Computer Science, Faculty of Computer Science, Universitas Amikom Purwokerto, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2025.6.4.5153

Keywords:

Accessibility, Data Portal, Government Data, Large Language Model, n8n, pgvector, Retrieval-Augmented Generation, Satu Data Indonesia

Abstract

Public access to local government data in Indonesia, such as that in the Satu Data Indonesia portal for Banyumas Regency, is severely hampered by outdated search interfaces and the technical complexity of handling heterogeneous data formats like PDF, Excel, and CSV. This research directly addresses this accessibility gap by designing, developing, and evaluating an intelligent question-answering system. We introduce a novel application of a Retrieval-Augmented Generation (RAG) architecture tailored for Indonesian local government data. The core novelty lies in our methodology for handling heterogeneous data formats (PDF, Excel, CSV) by integrating a low-code orchestrator (n8n) with a high-performance vector database (pgvector), a practical solution for a common public sector challenge. The system utilizes the text-embedding-3-large model for semantic understanding and gpt-4.1 for generating grounded, factual answers. The system's effectiveness was rigorously validated, achieving a perfect 100% score across accuracy, precision, recall, and F1-score on defined test cases. Crucially, usability testing with end-users confirmed the system is perceived as significantly more efficient and user-friendly than manual data searching. The primary impact of this work is a validated, replicable blueprint for local governments to democratize public information. By transforming complex data retrieval into an intuitive conversation, this research offers a practical AI solution to enhance governmental transparency and citizen engagement.
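
The retrieve-then-generate flow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' n8n workflow: the in-memory list stands in for a pgvector table, the three-dimensional vectors stand in for text-embedding-3-large embeddings, and the file names, chunk texts, and prompt wording are all hypothetical. Ranking by cosine similarity is also an assumption; pgvector offers a cosine-distance operator (`<=>`) that could play this role in a real deployment.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunks, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, c["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, retrieved):
    """Assemble a grounded prompt that restricts the model to retrieved context."""
    context = "\n".join(f"- ({c['source']}) {c['text']}" for c in retrieved)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Hypothetical chunks from heterogeneous sources (CSV, Excel, PDF).
# Toy 3-dimensional vectors replace real embeddings for illustration only.
CHUNKS = [
    {"source": "population.csv", "text": "placeholder population row",
     "embedding": [0.9, 0.1, 0.0]},
    {"source": "budget.xlsx", "text": "placeholder budget row",
     "embedding": [0.1, 0.9, 0.0]},
    {"source": "health_report.pdf", "text": "placeholder health paragraph",
     "embedding": [0.0, 0.2, 0.9]},
]

query_vec = [0.8, 0.2, 0.1]  # stand-in for the embedded user question
top = retrieve(query_vec, CHUNKS, k=2)
prompt = build_prompt("hypothetical question about population", top)
print(prompt)
```

In a full system the prompt would be sent to the generation model (gpt-4.1 in the paper), and carrying the source label with each chunk lets the answer be traced back to the originating file.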

Published

2025-08-21

How to Cite

[1]
A. N. Hadie, I. Tahyudin, and T. Hariguna, “Enhancing Accessibility in Local Government Data Portals via Retrieval-Augmented Generation: A Case Study on Satu Data Indonesia in Banyumas Regency”, J. Tek. Inform. (JUTIF), vol. 6, no. 4, pp. 2420–2433, Aug. 2025.