Speeding up distributed MapReduce applications using hardware accelerators
Document typeConference report
Rights accessOpen Access
In an attempt to increase the performance/cost ratio, large compute clusters are becoming heterogeneous at multiple levels: from asymmetric processors, to different system architectures, operating systems and networks. Exploiting the intrinsic multi-level parallelism present in such a complex execution environment has become a challenging task using traditional parallel and distributed programming models. As a result, an increasing need for novel approaches to exploiting parallelism has arisen in these environments. MapReduce is a data-driven programming model originally proposed by Google back in 2004 as a flexible alternative to the existing models, specially devoted to hiding the complexity of both developing and running massively distributed applications in large compute clusters. In some recent works, the MapReduce model has been also used to exploit parallelism in other non-distributed environments, such as multi-cores, heterogeneous processors and GPUs. In this paper we introduce a novel approach for exploiting the heterogeneity of a Cell BE cluster linking an existing MapReduce runtime implementation for distributed clusters and one runtime to exploit the parallelism of the Cell BE nodes. The novel contribution of this work is the design and evaluation of a MapReduce execution environment that effectively exploits the parallelism existing at both the Cell BE cluster level and the heterogeneous processors level.
CitationBecerra, Y. [et al.]. Speeding up distributed MapReduce applications using hardware accelerators. A: International Conference on Parallel Processing. "38th International Conference on Parallel Processing". Viena: 2009, p. 42-49.