IMPLEMENTATION OF SUPPORT VECTOR MACHINE METHOD IN CLASSIFYING SCHOOL LIBRARY BOOKS WITH COMBINATION OF TF-IDF AND WORD2VEC

  • Salsabila Nida Cahyani, Informatics Engineering, Computer Science Faculty, Universitas Dian Nuswantoro, Indonesia
  • Galuh Wilujeng Saraswati, Informatics Engineering, Computer Science Faculty, Universitas Dian Nuswantoro, Indonesia
Keywords: classification, dewey decimal classification, library, support vector machine, TF-IDF, Word2Vec

Abstract

Technology plays an integral role in improving the quality of education, for example through the use of information technology in school libraries. Searching for books in school libraries is time-consuming because books are still classified conventionally and are not organized by a systematic classification scheme. Implementing information technology in school libraries is therefore crucial to making library management more effective. One solution is to apply artificial intelligence, particularly machine learning. In this study, a Support Vector Machine (SVM) learns the patterns and characteristics of book titles and assigns each title to a Dewey Decimal Classification (DDC) class. The dataset comprises 10 classes aligned with the DDC. The data are randomly split into training and testing sets at an 80:20 ratio. Preprocessing is the initial stage of the research, and class imbalance is addressed through oversampling. The SVM, with a linear kernel and parameter C = 1, is tested three times using different feature extraction methods: TF-IDF alone, Word2Vec alone, and a combination of TF-IDF and Word2Vec. Model performance is evaluated with K-Fold Cross-Validation. Of the three tests, the combination of TF-IDF and Word2Vec produced the most accurate book classification. The study concludes that SVM can be applied to book classification, with the combined features achieving the highest accuracy of 73%, along with a precision of 83%, a recall of 72%, and an F1-score of 76%, outperforming the other feature extraction methods.
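The abstract does not say how the TF-IDF and Word2Vec features are combined. As a minimal sketch of one plausible reading, the Python below concatenates a TF-IDF vector with the mean word embedding of each tokenized title. The function names, the toy titles, the two-dimensional `toy_embeddings` table (a stand-in for a trained Word2Vec model), and the smoothed IDF variant are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per tokenized document)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of titles that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (one common variant, assumed here): ln((1 + N) / (1 + df)) + 1.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc) or 1  # avoid division by zero for an empty title
        vectors.append([counts[tok] / length * idf[tok] for tok in vocab])
    return vocab, vectors


def mean_embedding(doc, embeddings, dim):
    """Average the embeddings of the known tokens; zeros if none are known."""
    known = [embeddings[tok] for tok in doc if tok in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def combine_features(docs, embeddings, dim):
    """Concatenate each title's TF-IDF vector with its mean embedding."""
    _, tfidf = tfidf_vectors(docs)
    return [tf + mean_embedding(doc, embeddings, dim)
            for tf, doc in zip(tfidf, docs)]


# Hypothetical tokenized book titles and a toy embedding table.
titles = [
    ["history", "indonesia"],
    ["basic", "mathematics"],
    ["world", "history"],
]
toy_embeddings = {
    "history": [1.0, 0.0],
    "indonesia": [0.0, 1.0],
    "mathematics": [0.5, 0.5],
}

features = combine_features(titles, toy_embeddings, dim=2)
for title, row in zip(titles, features):
    print(title, [round(x, 3) for x in row])
```

Each resulting row could then be passed to a linear-kernel SVM, for example scikit-learn's `SVC(kernel="linear", C=1)`, which matches the kernel and C value stated in the abstract. The oversampling and K-Fold Cross-Validation steps are not shown here.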

Published
2023-12-30
How to Cite
[1]
S. N. Cahyani and G. W. Saraswati, “IMPLEMENTATION OF SUPPORT VECTOR MACHINE METHOD IN CLASSIFYING SCHOOL LIBRARY BOOKS WITH COMBINATION OF TF-IDF AND WORD2VEC,” J. Tek. Inform. (JUTIF), vol. 4, no. 6, pp. 1555–1566, Dec. 2023.