Characterizing Hardware Utilization on Edge Devices when Inferring Compressed Deep Learning Models
DOI:
https://doi.org/10.30812/matrik.v24i1.3938
Keywords:
Deep Learning, Edge Devices, Hardware Utilization, Memory Allocation, Post-training Quantization
Abstract
Implementing edge AI involves running AI algorithms near the sensors. Deep learning (DL) models have tackled image classification tasks with remarkable performance. However, their demand for large computing resources hinders their deployment on edge devices. Compressing the model is therefore essential to allow a DL model to run on edge devices. Post-training quantization (PTQ) is a compression technique that reduces the bit representation of the model weight parameters. This study examines the impact of memory allocation on the latency of compressed DL models on the Raspberry Pi 4 Model B (RPi4B) and the NVIDIA Jetson Nano (J. Nano). The research aims to understand hardware utilization in central processing units (CPU), graphics processing units (GPU), and memory. The study uses a quantitative method that controls memory allocation and measures warm-up time, latency, and CPU and GPU utilization. It compares the inference speed of DL models on RPi4B and J. Nano and observes the correlation between hardware utilization and the inference latency of the various DL models. According to our experiments, smaller memory allocation led to higher latency on both RPi4B and J. Nano. CPU utilization on RPi4B increases along with the memory allocation; however, the opposite is shown on J. Nano, since the GPU carries out the main computation on that device. Regarding computation, a smaller DL model size and a smaller bit representation lead to faster inference (lower latency), while a bigger bit representation on the same DL model leads to higher latency.
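The measurements named in the abstract (warm-up time, latency, and the correlation between utilization and latency) can be illustrated with a minimal sketch. The abstract does not give the authors' measurement code or name the correlation coefficient, so everything here is an assumption: `measure_latency`, `pearson`, and `dummy_infer` are hypothetical names, the first call is treated as the warm-up run, and a Pearson coefficient is assumed. The stand-in `dummy_infer` would be replaced by a real model call on the device.

```python
import time
from math import sqrt


def measure_latency(infer, sample, runs=20):
    """Return (warm_up_seconds, mean_latency_seconds) for a callable `infer`.

    Assumption: the first call is the warm-up run and is timed separately
    from the steady-state runs that follow.
    """
    start = time.perf_counter()
    infer(sample)
    warm_up = time.perf_counter() - start

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(sample)
        timings.append(time.perf_counter() - start)
    return warm_up, sum(timings) / len(timings)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def dummy_infer(sample):
    """Hypothetical stand-in for a real model inference call."""
    return sum(v * v for v in sample)


if __name__ == "__main__":
    warm_up, mean_latency = measure_latency(dummy_infer, [1.0] * 1000, runs=5)
    print(f"warm-up: {warm_up:.6f} s, mean latency: {mean_latency:.6f} s")
```

In use, one would collect a mean latency and a utilization reading for each memory allocation setting and pass the two lists to `pearson`; how utilization is sampled on RPi4B or J. Nano is outside this sketch.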