Publication detail
KLHŮFEK, J.; ŠAFÁŘ, M.; MRÁZEK, V.; VAŠÍČEK, Z.; SEKANINA, L.
Original Title
Exploiting Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators
Type
conference paper
Language
English
Original Abstract
Energy efficiency and memory footprint of a convolutional neural network (CNN) implemented on a CNN inference accelerator depend on many factors, including a weight quantization strategy (i.e., data types and bit-widths) and mapping (i.e., placement and scheduling of DNN elementary operations on hardware units of the accelerator). We show that enabling rich mixed quantization schemes during the implementation can open a previously hidden space of mappings that utilize the hardware resources more effectively. CNNs utilizing quantized weights and activations and suitable mappings can significantly improve trade-offs among the accuracy, energy, and memory requirements compared to less carefully optimized CNN implementations. To find, analyze, and exploit these mappings, we: (i) extend a general-purpose state-of-the-art mapping tool (Timeloop) to support mixed quantization, which is not currently available; (ii) propose an efficient multi-objective optimization algorithm to find the most suitable bit-widths and mapping for each DNN layer executed on the accelerator; and (iii) conduct a detailed experimental evaluation to validate the proposed method. On two CNNs (MobileNetV1 and MobileNetV2) and two accelerators (Eyeriss and Simba) we show that for a given quality metric (such as the accuracy on ImageNet), energy savings are up to 37% without any accuracy drop.
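The abstract's idea of trading accuracy against resource cost by choosing a bit-width per layer can be illustrated with a small Pareto-front filter. This is only a minimal sketch under stated assumptions, not the algorithm from the paper: the layer sizes, the `toy_error` proxy, and all function names are hypothetical, and energy is left out because estimating it would need a mapping tool such as Timeloop.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Candidate:
    bitwidths: tuple   # weight bit-width chosen for each layer (hypothetical encoding)
    error: float       # stand-in for accuracy loss; lower is better
    memory_bits: int   # total weight storage in bits; lower is better


def weight_memory_bits(weights_per_layer, bitwidths):
    """Weight memory of a mixed-precision network: sum of count * bit-width per layer."""
    return sum(n * b for n, b in zip(weights_per_layer, bitwidths))


def toy_error(bitwidths):
    """Placeholder quality model (an assumption): fewer bits means larger error."""
    return sum(1.0 / b for b in bitwidths)


def dominates(a, b):
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a.error <= b.error and a.memory_bits <= b.memory_bits
    better = a.error < b.error or a.memory_bits < b.memory_bits
    return no_worse and better


def pareto_front(candidates):
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical weight counts for three layers and the allowed bit-widths.
weights_per_layer = [864, 288, 2048]
choices = (2, 4, 8)

# Exhaustively enumerate all per-layer bit-width assignments (3^3 = 27 here).
candidates = [
    Candidate(bw, toy_error(bw), weight_memory_bits(weights_per_layer, bw))
    for bw in product(choices, repeat=len(weights_per_layer))
]
front = pareto_front(candidates)

for c in sorted(front, key=lambda c: c.memory_bits):
    print(c.bitwidths, c.memory_bits, round(c.error, 3))
```

Exhaustive enumeration is only feasible for a toy example like this one; with many layers and a mapping choice per layer the space grows exponentially, which is why the abstract calls for an efficient multi-objective search.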
Keywords
Quantization, Neural networks, Hardware accelerator
Authors
KLHŮFEK, J.; ŠAFÁŘ, M.; MRÁZEK, V.; VAŠÍČEK, Z.; SEKANINA, L.
Released
18 February 2024
Publisher
Institute of Electrical and Electronics Engineers
Location
Kielce
ISBN
979-8-3503-5934-3
Book
2024 27th International Symposium on Design & Diagnostics of Electronic Circuits & Systems (DDECS)
Pages from
1
Pages to
6
Pages count
6
URL
https://arxiv.org/abs/2404.05368
BibTeX
@inproceedings{BUT188463,
  author="Jan {Klhůfek} and Miroslav {Šafář} and Vojtěch {Mrázek} and Zdeněk {Vašíček} and Lukáš {Sekanina}",
  title="Exploiting Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators",
  booktitle="2024 27th International Symposium on Design \& Diagnostics of Electronic Circuits \& Systems (DDECS)",
  year="2024",
  pages="1--6",
  publisher="Institute of Electrical and Electronics Engineers",
  address="Kielce",
  doi="10.1109/DDECS60919.2024.10508920",
  isbn="979-8-3503-5934-3",
  url="https://arxiv.org/abs/2404.05368"
}