Analyzing the Application of Optical Character Recognition: A Case Study in International Standard Book Number Detection
DOI:
https://doi.org/10.30812/matrik.v24i2.4367
Keywords:
Book Number Detection, International Standard, Optical Character Recognition
Abstract
In the era of advanced education, assessing lecturer performance is crucial to maintaining educational quality. One aspect of this assessment is the evaluation of textbooks authored by lecturers. This study addresses the problem of efficiently detecting International Standard Book Numbers (ISBNs) in these textbooks, using optical character recognition (OCR) as a potential solution. The objective is to determine how effectively OCR, specifically the Tesseract engine, supports ISBN detection for lecturer performance assessment. The research method involves automated data collection and ISBN detection with Tesseract OCR on several sections of each textbook (covers, tables of contents, and identity pages), across two file formats (JPG and PDF) and multiple orientations. The study evaluates OCR performance with respect to image quality, rotation, and file type. The results indicate that Tesseract performs effectively on high-quality, low-noise JPG images, with F1 scores of 0.97 for JPG files and 0.99 for PDF files. However, its performance decreases on rotated images and under certain PDF conditions, which highlights specific limitations of OCR for ISBN detection. These findings suggest that OCR can be a valuable tool for supporting lecturer performance assessment through efficient ISBN detection in textbooks.
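The abstract does not describe how ISBNs are located in the recognized text, so the following is only a minimal sketch of one plausible post-OCR step, not the authors' method. It assumes the OCR output is already available as a plain string (for example from `pytesseract.image_to_string`, which is not called here), and it uses a hypothetical helper name, `extract_isbns`. Candidates are found with a regular expression and kept only if they pass the standard ISBN-10 or ISBN-13 check-digit rule, which filters out many OCR misreads.

```python
import re

# Assumed candidate pattern: an optional 978/979 prefix, nine digits, and a
# final digit or X, with optional spaces or hyphens between characters.
_CANDIDATE = re.compile(r"(?<!\d)(?:97[89][\s-]?)?(?:\d[\s-]?){9}[\dXx](?!\d)")


def _is_valid_isbn10(code: str) -> bool:
    """ISBN-10 rule: weights 10..1, weighted sum divisible by 11 (X = 10)."""
    if len(code) != 10 or not code[:9].isdigit() or code[9] not in "0123456789X":
        return False
    total = sum((10 - i) * int(d) for i, d in enumerate(code[:9]))
    total += 10 if code[9] == "X" else int(code[9])
    return total % 11 == 0


def _is_valid_isbn13(code: str) -> bool:
    """ISBN-13 rule: alternating weights 1 and 3, sum divisible by 10."""
    if len(code) != 13 or not code.isdigit():
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(code))
    return total % 10 == 0


def extract_isbns(ocr_text: str) -> list[str]:
    """Return checksum-valid ISBNs found in OCR text, normalized and de-duplicated."""
    found: list[str] = []
    for match in _CANDIDATE.finditer(ocr_text):
        # Strip separators so "978-0-306-40615-7" becomes "9780306406157".
        code = re.sub(r"[\s-]", "", match.group()).upper()
        if (_is_valid_isbn13(code) or _is_valid_isbn10(code)) and code not in found:
            found.append(code)
    return found
```

Validating the check digit is a design choice for this sketch: a single misrecognized digit, which is more likely on rotated or noisy pages, usually breaks the checksum, so such candidates are rejected rather than reported as detections.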
License
Copyright (c) 2025 Imam Fahrur Rozi, Ahmadi Yuli Ananta, Endah Septa Sintiya, Astrifidha Rahma Amalia, Yuri Ariyanto, Arin Kistia Nugraeni

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.