Software-managed power reduction in Infiniband links
Document type: Conference report
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Open Access
The backbone of a large-scale supercomputer is the interconnection network. As compute nodes become more energy-efficient, the interconnect accounts for an increasing proportion of the total system energy consumption. The interconnect's energy consumption is, however, only starting to receive serious attention. Some hardware-based schemes have been proposed that exploit idle periods or low utilisation, either by turning off the links or by lowering the frequency and voltage. Although these schemes are effective in certain cases, they do not have enough global information about the application's communication behaviour to efficiently manage the network power consumption. This paper proposes an alternative approach: moving the intelligence into the PMPI layer of the MPI library, and using prediction to discover repetitive patterns in the application's communication behaviour. The core of the prediction algorithm is an n-gram extraction technique, which can accurately predict not only when a link will become unused but also when it will become active again, allowing lanes to be switched off during the idle periods and switched back on again in time to avoid incurring a significant performance degradation. Many HPC applications benefit from prediction, since they have repetitive computation and communication phases. Because the energy-saving mechanism is implemented inside the MPI library, existing MPI programs do not need to be modified. Using an event-driven simulator, driven by representative HPC workloads, we demonstrate average energy savings in Infiniband switches of up to around 33%, while the average execution time increases by at most 1%.
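The n-gram idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `NGramPredictor`, the helper `should_power_down`, the microsecond units, and the safety `margin` are all hypothetical names and parameters chosen for illustration. The sketch assumes a stream of MPI call names, such as a PMPI wrapper might record, and predicts both the next call and the idle time that precedes it, so that lanes could be switched off only when the predicted idle period comfortably exceeds the lane reactivation time.

```python
from collections import Counter, defaultdict


class NGramPredictor:
    """Hypothetical n-gram predictor over a stream of MPI call names."""

    def __init__(self, n=3):
        if n < 2:
            raise ValueError("n must be at least 2")
        self.n = n
        self.history = []                        # MPI call names seen so far
        self.next_counts = defaultdict(Counter)  # context -> counts of next call
        self.gap_sum = defaultdict(float)        # (context, call) -> total idle time
        self.gap_count = defaultdict(int)        # (context, call) -> observations

    def _context(self):
        # The last n-1 calls form the context of the n-gram.
        return tuple(self.history[-(self.n - 1):])

    def observe(self, call, idle_before_us):
        """Record a call and the idle time (in microseconds) that preceded it."""
        context = self._context()
        if len(context) == self.n - 1:
            self.next_counts[context][call] += 1
            self.gap_sum[(context, call)] += idle_before_us
            self.gap_count[(context, call)] += 1
        self.history.append(call)

    def predict(self):
        """Return (next_call, mean_idle_us), or None if the context is unseen."""
        context = self._context()
        if len(context) < self.n - 1 or context not in self.next_counts:
            return None
        call, _ = self.next_counts[context].most_common(1)[0]
        key = (context, call)
        return call, self.gap_sum[key] / self.gap_count[key]


def should_power_down(predicted_idle_us, reactivation_us, margin=2.0):
    """Assumed policy: power down only if the predicted idle period
    exceeds the reactivation time by a safety margin."""
    return predicted_idle_us > margin * reactivation_us
```

Fed an iterative pattern such as `MPI_Isend`, `MPI_Irecv`, `MPI_Waitall`, `MPI_Allreduce` repeated several times, the predictor learns that a long compute phase follows `MPI_Allreduce`, which is the kind of idle period the abstract describes exploiting. A real mechanism would also need to schedule reactivation ahead of the predicted next call, which this sketch leaves out.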
Citation: Dickov, B. [et al.]. Software-managed power reduction in Infiniband links. A: International Conference on Parallel Processing. "43rd International Conference on Parallel Processing, ICPP 2014: 9-12 September 2014, Minneapolis, Minnesota, USA: proceedings". Minneapolis: Institute of Electrical and Electronics Engineers (IEEE), 2014, p. 311-320.