A simulation framework to automatically analyze the communication-computation overlap in scientific applications
Document typeConference report
PublisherInstitute of Electrical and Electronics Engineers (IEEE)
Rights accessOpen Access
Overlapping communication and computation has been devised as an attractive technique to alleviate the huge application's network requirements at large scale. Overlapping will allow to fully or partially hide the long communication delays suffered when transferring messages through the network. This will relax the application's network requirements, and hence allow to deploy more cost-effective network designs. However, today's scientific applications make little use of overlapping. In addition, there is no support to analyze how overlap could impact the performance of real scientific applications. In this paper we address this issue by presenting a simulation framework to automatically analyze the benefits of communication-computation overlap. The simulation framework consists of a binary translation tool (Valgrind), a distributed machine simulator (Dimemas), and a visualization tool (Paraver). Valgrind instruments the legacy MPI application and generates the execution traces, then Dimemas uses the obtained traces and reconstructs the application's time-behavior on a configurable parallel platform, and finally Paraver visualizes the obtained time-behaviors. Our simulation methodology brings two new features into the study of overlap: 1) automatic simulation of the overlapped execution - as there is no need for code restructuring in applications; and 2) visualization of simulated time behaviors, that further allows useful comparisons of the non-overlapped and the overlapped executions.
CitationSubotic, V., Sancho, J.C., Labarta, J., Valero, M. A simulation framework to automatically analyze the communication-computation overlap in scientific applications. A: International Conference on Cluster Computing. "2010 IEEE International Conference on Cluster Computing: CLUSTER 2010: 20-24 September 2010, Heraklion, Crete, Greece: proceedings". Crete: Institute of Electrical and Electronics Engineers (IEEE), 2010, p. 275-283.