Analyzing the Application of Optical Character Recognition: A Case Study in International Standard Book Number Detection
DOI: https://doi.org/10.30812/matrik.v24i2.4367
Keywords: Book Number Detection, International Standard, Optical Character Recognition
Abstract
Assessing lecturer performance is crucial to maintaining educational quality, and one aspect of this assessment is evaluating the textbooks that lecturers author. This study addresses the problem of efficiently detecting International Standard Book Numbers (ISBNs) in these textbooks, using optical character recognition (OCR) as a potential solution. The objective is to determine how effectively OCR, specifically the Tesseract engine, supports ISBN detection for lecturer performance assessment. The research method involves automated data collection and ISBN detection using Tesseract OCR on several sections of each textbook (covers, tables of contents, and identity pages), across two file formats (JPG and PDF) and multiple orientations. OCR performance is evaluated with respect to image quality, rotation, and file type. The results indicate that Tesseract performs effectively on high-quality, low-noise JPG images, achieving an F1 score of 0.97 for JPG files and 0.99 for PDF files. However, its performance decreases on rotated images and under certain PDF conditions, which highlights specific limitations of OCR in ISBN detection. These findings suggest that OCR can be a valuable tool for supporting lecturer performance assessment through efficient ISBN detection in textbooks.
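The detection and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the page text has already been produced by an OCR engine such as Tesseract, and the names `CANDIDATE`, `is_valid_isbn`, `find_isbns`, and `f1_score` are hypothetical helpers introduced here. Candidates are located with a regular expression and accepted only if the standard ISBN-10 or ISBN-13 check digit is correct, which filters out many OCR misreads and unrelated digit runs.

```python
import re

# Candidate: optional 978/979 prefix, then 9 digits and a final digit or X,
# with optional hyphens or spaces between characters (hypothetical pattern).
CANDIDATE = re.compile(r"(?<!\d)(?:97[89][- ]?)?(?:\d[- ]?){9}[\dXx](?!\d)")


def is_valid_isbn(code: str) -> bool:
    """Verify the check digit of a separator-free ISBN-10 or ISBN-13 string."""
    if len(code) == 10 and re.fullmatch(r"\d{9}[\dX]", code):
        # ISBN-10: weights 10..1, final 'X' counts as 10, sum divisible by 11.
        values = [10 if ch == "X" else int(ch) for ch in code]
        return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0
    if len(code) == 13 and code.isdigit():
        # ISBN-13: alternating weights 1 and 3, sum divisible by 10.
        total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(code))
        return total % 10 == 0
    return False


def find_isbns(ocr_text: str) -> list[str]:
    """Return valid ISBNs found in OCR text, normalized and de-duplicated."""
    found = []
    for match in CANDIDATE.finditer(ocr_text):
        code = re.sub(r"[- ]", "", match.group()).upper()
        if is_valid_isbn(code) and code not in found:
            found.append(code)
    return found


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Illustrative OCR output (made-up page text, not data from the study).
sample = "ISBN 978-0-306-40615-7\nTelp 0812-3456-7890\nISBN: 0-306-40615-2"
print(find_isbns(sample))
print(f1_score(tp=8, fp=2, fn=2))
```

In this sketch the phone-like number is ignored because it has the wrong length, and a candidate with a misread digit would be rejected by the check-digit test. The counts passed to `f1_score` are arbitrary illustration values and do not correspond to the results reported in the study.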
License
Copyright (c) 2025 Imam Fahrur Rozi, Ahmadi Yuli Ananta, Endah Septa Sintiya, Astrifidha Rahma Amalia, Yuri Ariyanto, Arin Kistia Nugraeni

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.