Analyzing the Application of Optical Character Recognition: A Case Study in International Standard Book Number Detection
DOI:
https://doi.org/10.30812/matrik.v24i2.4367
Keywords:
Book Number Detection, International Standard, Optical Character Recognition
Abstract
In the era of advanced education, assessing lecturer performance is crucial to maintaining educational quality. One aspect of this assessment involves evaluating the textbooks authored by lecturers. This study addresses the problem of efficiently detecting International Standard Book Numbers (ISBNs) within these textbooks, using optical character recognition (OCR) as a potential solution. The objective is to determine how effectively OCR, specifically the Tesseract platform, facilitates ISBN detection in support of lecturer performance assessments. The research method involves automated data collection and ISBN detection using Tesseract OCR on various sections of textbooks, including covers, tables of contents, and identity pages, across different file formats (JPG and PDF) and orientations. The study evaluates OCR performance with respect to image quality, rotation, and file type. The results indicate that Tesseract performs effectively on high-quality, low-noise JPG images, achieving an F1 score of 0.97 for JPG files and 0.99 for PDF files. However, its performance decreases with rotated images and under certain PDF conditions, highlighting specific limitations of OCR in ISBN detection. These findings suggest that OCR can be a valuable tool for enhancing lecturer performance assessments through efficient ISBN detection in textbooks.
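The abstract does not say how ISBNs are parsed from the OCR output or how matches are scored, so the following is only a minimal sketch under stated assumptions. It assumes the page text has already been extracted (for example with `pytesseract.image_to_string`), and it uses a hypothetical regular expression plus the standard ISBN-10 and ISBN-13 check-digit rules to filter candidates. The names `ISBN_PATTERN`, `extract_isbns`, and `f1_score` are illustrative and do not come from the study.

```python
import re

# Hypothetical pattern (an assumption, not the study's method): the label
# "ISBN", an optional "-10"/"-13" suffix, then digits, hyphens, spaces, or X.
ISBN_PATTERN = re.compile(
    r"ISBN(?:-1[03])?[:\s]*([0-9Xx][0-9Xx\- ]{8,15}[0-9Xx])"
)


def is_valid_isbn10(digits):
    """ISBN-10 check: weighted sum (weights 10..1) must be divisible by 11."""
    if len(digits) != 10 or not digits[:9].isdigit():
        return False
    if not (digits[9].isdigit() or digits[9] == "X"):
        return False
    total = sum(
        (10 - i) * (10 if ch == "X" else int(ch))
        for i, ch in enumerate(digits)
    )
    return total % 11 == 0


def is_valid_isbn13(digits):
    """ISBN-13 check: alternating weights 1 and 3, sum divisible by 10."""
    if len(digits) != 13 or not digits.isdigit():
        return False
    total = sum(
        int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(digits)
    )
    return total % 10 == 0


def extract_isbns(ocr_text):
    """Return checksum-valid ISBNs found in OCR text, separators removed."""
    found = []
    for match in ISBN_PATTERN.finditer(ocr_text):
        digits = re.sub(r"[\s\-]", "", match.group(1)).upper()
        if is_valid_isbn10(digits) or is_valid_isbn13(digits):
            found.append(digits)
    return found


def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall, from raw counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative OCR output; the last line has a deliberately wrong check digit.
sample_text = (
    "ISBN 978-0-306-40615-7\n"
    "ISBN: 0-306-40615-2\n"
    "ISBN 978-0-306-40615-8\n"
)
detected = extract_isbns(sample_text)
```

Validating the check digit is one way to reject candidates where OCR has misread a character, which matters for noisy or rotated scans. The pattern is greedy, so digits that follow an ISBN on the same line can cause a valid number to be rejected; a production version would need a stricter pattern.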
License
Copyright (c) 2025 Imam Fahrur Rozi, Ahmadi Yuli Ananta, Endah Septa Sintiya, Astrifidha Rahma Amalia, Yuri Ariyanto, Arin Kistia Nugraeni

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.