Publication detail

Exploiting Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators

KLHŮFEK, J.; ŠAFÁŘ, M.; MRÁZEK, V.; VAŠÍČEK, Z.; SEKANINA, L.

Type

conference paper

Language

English

Original Abstract

Energy efficiency and memory footprint of a convolutional neural network (CNN) implemented on a CNN inference accelerator depend on many factors, including a weight quantization strategy (i.e., data types and bit-widths) and mapping (i.e., placement and scheduling of DNN elementary operations on hardware units of the accelerator). We show that enabling rich mixed quantization schemes during the implementation can open a previously hidden space of mappings that utilize the hardware resources more effectively. CNNs utilizing quantized weights and activations and suitable mappings can significantly improve trade-offs among the accuracy, energy, and memory requirements compared to less carefully optimized CNN implementations. To find, analyze, and exploit these mappings, we: (i) extend a general-purpose state-of-the-art mapping tool (Timeloop) to support mixed quantization, a capability it currently lacks; (ii) propose an efficient multi-objective optimization algorithm to find the most suitable bit-widths and mapping for each DNN layer executed on the accelerator; and (iii) conduct a detailed experimental evaluation to validate the proposed method. On two CNNs (MobileNetV1 and MobileNetV2) and two accelerators (Eyeriss and Simba), we show that for a given quality metric (such as the accuracy on ImageNet), energy savings of up to 37% can be achieved without any accuracy drop.
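
The abstract does not spell out the optimization algorithm, so the following is only a generic sketch of the kind of trade-off it refers to: keeping the candidate configurations that are Pareto-optimal in accuracy, energy, and memory. All names (`Candidate`, `dominates`, `pareto_front`, `weight_memory_bits`) and all numbers are hypothetical and are not taken from the paper or from Timeloop.

```python
from typing import List, NamedTuple, Sequence


class Candidate(NamedTuple):
    """One hypothetical (bit-widths, mapping) configuration of a network."""
    name: str
    accuracy: float  # higher is better
    energy: float    # lower is better (arbitrary units)
    memory: float    # lower is better (arbitrary units)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` in every objective and better in one."""
    no_worse = (a.accuracy >= b.accuracy
                and a.energy <= b.energy
                and a.memory <= b.memory)
    better = (a.accuracy > b.accuracy
              or a.energy < b.energy
              or a.memory < b.memory)
    return no_worse and better


def pareto_front(cands: Sequence[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


def weight_memory_bits(param_counts: Sequence[int],
                       bit_widths: Sequence[int]) -> int:
    """Weight storage in bits for a per-layer bit-width assignment."""
    return sum(n * b for n, b in zip(param_counts, bit_widths))


# Illustrative values only; they do not reproduce any reported result.
CANDIDATES = [
    Candidate("uniform-8bit", accuracy=0.70, energy=10.0, memory=8.0),
    Candidate("mixed-a", accuracy=0.70, energy=7.0, memory=6.0),
    Candidate("mixed-b", accuracy=0.68, energy=5.0, memory=4.0),
    Candidate("mixed-c", accuracy=0.67, energy=6.0, memory=5.0),
]

if __name__ == "__main__":
    for c in pareto_front(CANDIDATES):
        print(c)
```

In this toy data, `mixed-a` matches the accuracy of `uniform-8bit` at lower energy and memory, so the uniform configuration drops out of the front. A real search would obtain the energy figures from a mapper and the accuracy from evaluating the quantized network, rather than from fixed constants.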

Keywords

Quantization, Neural networks, Hardware accelerator

Authors

KLHŮFEK, J.; ŠAFÁŘ, M.; MRÁZEK, V.; VAŠÍČEK, Z.; SEKANINA, L.

Released

18 February 2024

Publisher

Institute of Electrical and Electronics Engineers

Location

Kielce

ISBN

979-8-3503-5934-3

Book

2024 27th International Symposium on Design & Diagnostics of Electronic Circuits & Systems (DDECS)

Pages from

1

Pages to

6

Pages count

6

URL

https://arxiv.org/abs/2404.05368

BibTeX

@inproceedings{BUT188463,
  author="Jan {Klhůfek} and Miroslav {Šafář} and Vojtěch {Mrázek} and Zdeněk {Vašíček} and Lukáš {Sekanina}",
  title="Exploiting Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators",
  booktitle="2024 27th International Symposium on Design \& Diagnostics of Electronic Circuits \& Systems (DDECS)",
  year="2024",
  pages="1--6",
  publisher="Institute of Electrical and Electronics Engineers",
  address="Kielce",
  doi="10.1109/DDECS60919.2024.10508920",
  isbn="979-8-3503-5934-3",
  url="https://arxiv.org/abs/2404.05368"
}