Using graph partitioning to accelerate task-based parallel applications
Document type: Conference report
Publisher: Barcelona Supercomputing Center
Rights access: Open Access
Current high-performance computing architectures are composed of large shared-memory NUMA nodes, among other components. Such nodes are becoming increasingly complex, as they comprise several NUMA domains whose access latencies differ depending on the core that issues the access. In this work, we propose techniques based on graph partitioning to efficiently mitigate the negative impact of NUMA effects on parallel application performance. These techniques improve the execution time of OpenMP parallel codes by 2.02× on average when run on architectures with strong NUMA effects.
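
To illustrate the general idea of partitioning a task graph across NUMA domains, the following is a minimal sketch and not the method of this work. It assumes a hypothetical task graph whose edge weights stand for the amount of data two tasks share, and it uses a simple greedy heuristic in place of a real graph partitioner, which would typically use a multilevel algorithm. The names `greedy_partition`, `edge_cut`, and `EDGES` are illustrative only.

```python
from collections import defaultdict


def greedy_partition(num_tasks, edges, num_domains):
    """Greedily assign tasks to NUMA domains (illustrative heuristic only).

    edges is a list of (task_u, task_v, weight) tuples, where weight is an
    assumed measure of the data shared by the two tasks.
    """
    # Adjacency map: task -> {neighbor: accumulated shared weight}
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0) + w
        adj[v][u] = adj[v].get(u, 0) + w

    # Balance constraint: at most ceil(num_tasks / num_domains) tasks per domain.
    capacity = -(-num_tasks // num_domains)
    load = [0] * num_domains
    domain_of = {}

    for task in range(num_tasks):
        # Weight of edges from this task to tasks already placed in each domain.
        affinity = [0] * num_domains
        for neighbor, w in adj[task].items():
            if neighbor in domain_of:
                affinity[domain_of[neighbor]] += w
        # Among domains with free capacity, prefer the highest affinity,
        # then the lowest load, then the lowest domain index.
        candidates = [d for d in range(num_domains) if load[d] < capacity]
        best = max(candidates, key=lambda d: (affinity[d], -load[d], -d))
        domain_of[task] = best
        load[best] += 1
    return domain_of


def edge_cut(edges, domain_of):
    """Total weight of edges whose endpoints sit in different domains."""
    return sum(w for u, v, w in edges if domain_of[u] != domain_of[v])


# Hypothetical task graph: two tightly coupled groups joined by one light edge.
EDGES = [
    (0, 1, 10), (1, 2, 10), (0, 2, 10),
    (3, 4, 10), (4, 5, 10), (3, 5, 10),
    (2, 3, 1),
]

placement = greedy_partition(6, EDGES, 2)
round_robin = {task: task % 2 for task in range(6)}

print("partitioned cut:", edge_cut(EDGES, placement))
print("round-robin cut:", edge_cut(EDGES, round_robin))
```

In this toy case the greedy placement keeps each tightly coupled group in one domain, so only the light edge crosses domains, whereas a round-robin placement splits both groups. A lower cut weight is used here as a rough proxy for fewer remote memory accesses; how well that proxy tracks execution time on a real machine is an assumption of the sketch.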