LOW-LATENCY DEEP NEURAL NETWORK COMPRESSION FOR REAL-TIME IOT APPLICATIONS
Keywords: DNN Compression, IoT, Low-Latency Computing, Model Pruning, Quantization, Edge AI, Real-Time Processing

Abstract
The rapid growth of Internet of Things (IoT) systems has emphasized the need for efficient deep neural network (DNN) processing under stringent latency and resource constraints. Conventional DNNs require significant computational power, making them unsuitable for real-time IoT deployments with limited memory, bandwidth, and processing capability. This paper proposes a low-latency DNN compression framework that combines structured pruning, quantization-aware training, and lightweight model reparameterization. The proposed method reduces computational complexity while maintaining competitive accuracy, enabling faster inference on edge IoT devices. Experimental evaluations demonstrate up to a 62% reduction in model size and a 48% improvement in inference speed. The approach provides a scalable and energy-efficient solution for real-time IoT applications.
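Two of the techniques the abstract names, structured pruning and low-bit quantization, can be illustrated with a minimal sketch. This is not the paper's framework (which additionally uses quantization-aware training and reparameterization); it is a generic, hypothetical example in plain Python, assuming a weight matrix stored as a list of rows, where structured pruning keeps the rows with the largest L1 norms and quantization maps weights symmetrically to 8-bit integers.

```python
def prune_channels(weights, keep_ratio):
    """Structured pruning sketch: keep the output channels (rows)
    with the largest L1 norms. `weights` is a list of rows of floats."""
    norms = [sum(abs(x) for x in row) for row in weights]
    k = max(1, int(len(weights) * keep_ratio))
    # Indices of the k highest-norm rows, restored to original order.
    keep = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)[:k]
    return [weights[i] for i in sorted(keep)]

def quantize_int8(weights):
    """Symmetric per-tensor quantization to the int8 range [-127, 127].
    Returns the quantized rows and the float scale for dequantization."""
    max_abs = max(abs(x) for row in weights for x in row)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [[max(-127, min(127, round(x / scale))) for x in row] for row in weights]
    return q, scale

# Toy 4x3 weight matrix: pruning at keep_ratio=0.5 retains the two
# highest-norm rows, which are then quantized to int8.
w = [[0.5, -1.0, 0.2],
     [0.1, 0.05, -0.02],
     [2.0, 1.5, -0.7],
     [0.3, -0.4, 0.6]]
pruned = prune_channels(w, keep_ratio=0.5)
q, scale = quantize_int8(pruned)
```

Storing `q` plus a single float `scale` instead of full-precision weights is what yields the kind of model-size reduction the abstract reports; the pruned rows also shrink the matrix multiplications performed at inference time.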
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.