A two level neural approach combining off-chip prediction with adaptive prefetch filtering
Cite as:
hdl:2117/406359
Document type: Conference report
Defense date: 2024
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder
Project: BSC - COMPUTACION DE ALTAS PRESTACIONES VIII (AEI-PID2019-107255GB-C21)
REDES DE INTERCONEXION, ACELERADORES HARDWARE Y OPTIMIZACION DE APLICACIONES (AEI-PID2019-105660RB-C22)
Abstract
To alleviate the performance and energy overheads of contemporary applications with large data footprints, we propose the Two Level Perceptron (TLP) predictor, a neural mechanism that effectively combines predicting whether an access will be off-chip with adaptive prefetch filtering at the first-level data cache (L1D). TLP is composed of two connected microarchitectural perceptron predictors, named the First Level Predictor (FLP) and the Second Level Predictor (SLP). FLP performs accurate off-chip prediction by using several program features based on virtual addresses and a novel selective delay component. The novelty of SLP lies in leveraging off-chip prediction to drive L1D prefetch filtering, using physical addresses and the FLP prediction as features. TLP constitutes the first hardware proposal targeting both off-chip prediction and prefetch filtering with a multilevel perceptron hardware approach, and requires only 7 KB of storage. To demonstrate the benefits of TLP, we compare its performance with state-of-the-art approaches using off-chip prediction and prefetch filtering on a wide range of single-core and multi-core workloads. Our experiments show that TLP reduces average DRAM transactions by 30.7% and 17.7% across the single-core and multi-core workloads, respectively, compared to a baseline that uses state-of-the-art cache prefetchers but no off-chip prediction mechanism, whereas recent work significantly increases DRAM transactions. As a result, TLP achieves geometric mean performance speedups of 6.2% and 11.8% across single-core and multi-core workloads, respectively. In addition, our evaluation demonstrates that TLP is effective independently of the L1D prefetching logic.
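The FLP and SLP described above are instances of the hashed-perceptron technique: each predictor keeps one small weight table per program feature, sums the weights selected by the current feature values, and compares the sum against zero, training the weights only on mispredictions or low-confidence correct predictions. The sketch below illustrates that general technique in Python; the feature names, table sizes, and thresholds are illustrative assumptions, not the paper's actual TLP configuration.

```python
class HashedPerceptron:
    """Minimal sketch of a hashed-perceptron binary predictor, the
    general technique behind TLP's FLP and SLP. All parameters here
    (table size, threshold, weight range) are illustrative, not the
    paper's configuration."""

    def __init__(self, feature_names, table_size=256, threshold=8,
                 w_max=31, w_min=-32):
        self.table_size = table_size
        self.threshold = threshold          # confidence margin for training
        self.w_max, self.w_min = w_max, w_min
        # one saturating-counter weight table per feature
        self.tables = {f: [0] * table_size for f in feature_names}

    def _index(self, value):
        # simple deterministic fold of an integer feature value
        return (value ^ (value >> 8)) % self.table_size

    def predict(self, features):
        """features: {feature_name: int_value}. Returns (prediction, sum);
        True could mean e.g. 'this load will go off-chip'."""
        s = sum(self.tables[f][self._index(v)] for f, v in features.items())
        return s >= 0, s

    def train(self, features, outcome):
        """Perceptron update rule: adjust weights only on a mispredict
        or a low-confidence (|sum| <= threshold) correct prediction."""
        pred, s = self.predict(features)
        if pred != outcome or abs(s) <= self.threshold:
            delta = 1 if outcome else -1
            for f, v in features.items():
                i = self._index(v)
                w = self.tables[f][i] + delta
                self.tables[f][i] = max(self.w_min, min(self.w_max, w))


# Hypothetical usage: an FLP-like predictor over virtual-address features.
flp = HashedPerceptron(["pc", "vpage"])
feats = {"pc": 0x400123, "vpage": 0x7F}
for _ in range(20):
    flp.train(feats, True)          # this context keeps going off-chip
went_off_chip, _ = flp.predict(feats)
```

In the paper's two-level arrangement, the second predictor would additionally take the first predictor's output as one of its input features, which in this sketch would simply be one more entry in the `features` dictionary.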
Citation: Alexandre, J. [et al.]. A two level neural approach combining off-chip prediction with adaptive prefetch filtering. A: IEEE International Symposium on High-Performance Computer Architecture. "2024 IEEE International Symposium on High-Performance Computer Architecture, HPCA 2024: 2-6 March 2024, Edinburgh, United Kingdom". Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 528-542. ISBN 979-8-3503-9313-2. DOI 10.1109/HPCA57654.2024.00046.
ISBN: 979-8-3503-9313-2
Publisher version: https://ieeexplore.ieee.org/abstract/document/10476485
| Files | Description | Size | Format |
|---|---|---|---|
| HPCA30_Paper___Camera_Ready-3.pdf | | 1,405Mb | PDF |