Understanding power consumption and reliability of high-bandwidth memory with voltage underscaling
Document type: Conference report
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Open Access
European Commission's project: LEGaTO - Low Energy Toolset for Heterogeneous Computing (EC-H2020-780681)
Modern computing devices employ High-Bandwidth Memory (HBM) to meet their memory bandwidth requirements. An HBM-enabled device consists of multiple DRAM layers stacked on top of one another next to a compute chip (e.g., a CPU, GPU, or FPGA) in the same package. Although such HBM structures provide high bandwidth at a small form factor, the stacked memory layers consume a substantial portion of the package's power budget. Therefore, power-saving techniques that preserve the performance of HBM are desirable. Undervolting is one such technique: it reduces the supply voltage to decrease power consumption while keeping the device's operating frequency unchanged, so that performance is not lost. Undervolting takes advantage of the voltage guardbands that manufacturers put in place to ensure correct operation under all environmental conditions. However, reducing voltage without changing frequency can lead to reliability issues that manifest as unwanted bit flips. In this paper, we provide the first experimental study of real HBM chips under reduced-voltage conditions. We show that the guardband region for our HBM chips constitutes 19% of the nominal voltage. Lowering the supply voltage within the guardband region reduces power consumption by a factor of 1.5X for all bandwidth utilization rates. Lowering the voltage by a further 11% yields a total of 2.3X power savings at the cost of unwanted bit flips. We explore and characterize the rate and types of these reduced-voltage-induced bit flips and present a fault map that enables a three-factor trade-off among power, memory capacity, and fault rate.
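As a rough sanity check on the figures in the abstract, the sketch below compares the reported savings with a textbook first-order model in which power at a fixed frequency scales with the square of the supply voltage. This model is an assumption made here for illustration, not the paper's measurement method, and the sketch also assumes that both percentages are fractions of the nominal voltage. The function name `quadratic_power_saving` is hypothetical.

```python
def quadratic_power_saving(voltage_reduction: float) -> float:
    """Estimate the power-saving factor for a given supply-voltage reduction.

    voltage_reduction is a fraction of the nominal voltage (0.19 means 19%).
    ASSUMPTION: power is proportional to V^2 at a fixed frequency. This is a
    first-order textbook model, not the model used in the paper.
    """
    if not 0.0 <= voltage_reduction < 1.0:
        raise ValueError("voltage_reduction must be in [0, 1)")
    scaled_voltage = 1.0 - voltage_reduction  # voltage relative to nominal
    return 1.0 / scaled_voltage ** 2


# Figures taken from the abstract; both are ASSUMED to be fractions of nominal.
GUARDBAND = 0.19         # guardband region
EXTRA_REDUCTION = 0.11   # further reduction below the guardband

scenarios = [
    ("edge of guardband", GUARDBAND, 1.5),
    ("below guardband", GUARDBAND + EXTRA_REDUCTION, 2.3),
]

for label, reduction, reported in scenarios:
    estimate = quadratic_power_saving(reduction)
    print(
        f"{label}: V = {1.0 - reduction:.2f} x nominal, "
        f"quadratic estimate {estimate:.2f}X, reported {reported}X"
    )
```

Under these assumptions the estimate is about 1.52X at the edge of the guardband, close to the reported 1.5X, and about 2.04X after the further reduction, below the reported 2.3X. The gap at the lower voltage suggests that a purely quadratic model does not capture everything the measurements do.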
Citation: Nabavilarimi, S. [et al.]. Understanding power consumption and reliability of high-bandwidth memory with voltage underscaling. In: Design, Automation and Test in Europe Conference and Exhibition. "Proceedings of the 2021 Design, Automation & Test in Europe (DATE 2021): 01-05 February 2021, virtual conference". Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 517-522. ISBN 978-3-9819263-5-4. DOI 10.23919/DATE51398.2021.9474024.
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder.