Dynamically adapting floating-point precision to accelerate deep neural network training

Cite as:
hdl:2117/364945
Document type: Conference report
Defense date: 2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work is prohibited without permission of the copyright holder.
Abstract
Mixed-precision (MP) arithmetic combining both single- and half-precision operands has been successfully applied to train deep neural networks. Despite its advantages in terms of reducing the need for key resources like memory bandwidth or register file size, it has a limited capacity for diminishing computing costs further, as it requires 32 bits to represent its output. On the other hand, full half-precision arithmetic fails to deliver state-of-the-art training accuracy. We design SERP, a binary instrumentation tool based on Intel Pin, which allows us to characterize and analyze computer arithmetic usage in machine learning frameworks (PyTorch, Caffe, TensorFlow) and to emulate different floating-point formats. Based on empirical observations about precision needs on representative deep neural networks, this paper proposes a seamless approach to dynamically adapt floating-point arithmetic. Our dynamically adaptive methodology enables the use of full half-precision arithmetic for up to 96.4% of the computations when training state-of-the-art neural networks, while delivering accuracy comparable to 32-bit floating-point arithmetic. Microarchitectural simulations indicate that our dynamic approach accelerates training of deep convolutional and recurrent networks with respect to FP32 by 1.39× and 1.26×, respectively.
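The core idea of dynamic precision adaptation can be illustrated with a minimal sketch: attempt an operation in half precision, and fall back to single precision only when the half-precision result degrades (here, detected via overflow to infinity). This is an illustrative simplification, not the paper's actual mechanism; the function name and fallback criterion are hypothetical.

```python
import numpy as np

def adaptive_dot(a, b):
    """Dot product that prefers FP16 and falls back to FP32 on overflow.

    Returns (result_as_float32, precision_used). Illustrative sketch of
    dynamic precision adaptation; the overflow-only trigger is a
    simplified stand-in for the paper's adaptation policy.
    """
    # Fast path: perform the computation entirely in half precision.
    out16 = np.dot(a.astype(np.float16), b.astype(np.float16))
    if not np.all(np.isfinite(out16)):
        # FP16 overflowed (values above ~65504 become inf): redo in FP32.
        return np.dot(a.astype(np.float32), b.astype(np.float32)), "fp32"
    return out16.astype(np.float32), "fp16"
```

In a training loop, such a check would let most operations run in half precision while selectively promoting the few that exceed the FP16 dynamic range, which is the intuition behind reaching a high fraction of FP16 computations without losing accuracy.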
Citation: Osorio, J. [et al.]. Dynamically adapting floating-point precision to accelerate deep neural network training. A: IEEE International Conference on Machine Learning and Applications. "20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021: 13-16 December 2021, virtual event: proceedings". Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 980-987. ISBN 978-1-6654-4337-1. DOI 10.1109/ICMLA52953.2021.00161.
ISBN: 978-1-6654-4337-1
Publisher version: https://ieeexplore.ieee.org/document/9680041
Collections
- Doctorat en Arquitectura de Computadors - Conference papers/communications [251]
- Computer Sciences - Conference papers/communications [530]
- CAP - Grup de Computació d'Altes Prestacions - Conference papers/communications [784]
- Departament d'Arquitectura de Computadors - Conference papers/communications [1,874]
| Files | Description | Size | Format | View |
|---|---|---|---|---|
| Dynamically Ada ... eural Network Training.pdf | | 637.0 KB | PDF | View/Open |