An Integrated Pipeline with Hierarchical Segmentation and CNN for Automated KTP-el Data Extraction on the e-Magang Platform
DOI:
https://doi.org/10.52436/1.jutif.2025.6.5.5279
Keywords:
character segmentation, CNN, flask API, K-Fold Cross Validation, KTP-el, OCR
Abstract
In alignment with Indonesia's digital transformation agenda, this research addresses the inefficient and error-prone manual data entry on the Foreign Policy Strategy Agency's (BSKLN) e-Magang platform. This study introduces an end-to-end Optical Character Recognition (OCR) pipeline designed for structured identity documents and for integration with a real-world government platform. The proposed workflow comprises image preprocessing with histogram matching, hierarchical segmentation using vertical projection, and postprocessing that structures the output. To overcome the limitations of a small dataset, three specialized Convolutional Neural Network (CNN) models were trained and validated using stratified 5-fold cross-validation. The final system was integrated by connecting a Flask-based model engine with the existing Laravel and React platform. End-to-end testing achieved an average character-reading accuracy of 93.31% with a mean processing time of 14.48 seconds per image. The primary contribution of this research to the field of informatics is a complete and deployable system architecture that ensures data interoperability and reliability, providing a practical blueprint for integrating intelligent automation into digital public services.
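The abstract names vertical projection as the basis of the segmentation stage but does not show it. The following is a minimal sketch of that general technique, not the authors' implementation: it assumes a text line has already been binarized into nested lists where 1 marks an ink pixel, and the function names `vertical_projection` and `segment_columns` as well as the `min_width` parameter are hypothetical.

```python
from typing import List, Tuple


def vertical_projection(binary: List[List[int]]) -> List[int]:
    """Count foreground (1) pixels in each column of a binary image."""
    if not binary:
        return []
    width = len(binary[0])
    return [sum(row[x] for row in binary) for x in range(width)]


def segment_columns(
    binary: List[List[int]], min_width: int = 1
) -> List[Tuple[int, int]]:
    """Split a text line into column ranges separated by empty columns.

    Returns (start, end) pairs with `end` exclusive. Runs narrower than
    `min_width` are discarded as noise (an assumed heuristic).
    """
    profile = vertical_projection(binary)
    segments: List[Tuple[int, int]] = []
    start = None  # column where the current run of ink began, if any
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x  # entering a character
        elif count == 0 and start is not None:
            if x - start >= min_width:
                segments.append((start, x))  # leaving a character
            start = None
    # Close a run that touches the right edge of the image.
    if start is not None and len(profile) - start >= min_width:
        segments.append((start, len(profile)))
    return segments


# Toy 3x10 "text line" containing three glyph-like blobs.
line = [
    [0, 1, 1, 0, 0, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 0, 1, 1, 1],
]
print(segment_columns(line))  # [(1, 3), (5, 6), (7, 10)]
```

Each returned range could then be cropped and passed to a character classifier. Applying the same idea with row sums instead of column sums would separate lines before characters, which is one plausible reading of "hierarchical" here, though the abstract does not spell out the hierarchy.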
License
Copyright (c) 2025 Nuansa Syafrie Rahardian, Eddy Maryanto, Devi Astri Nawangnugraeni

This work is licensed under a Creative Commons Attribution 4.0 International License.