Big data deployment in containerized infrastructures through the interconnection of network namespaces
manuscript.pdf (476,4Kb) (Restricted access) Request copy
Què és aquest botó?
Aquest botó permet demanar una còpia d'un document restringit a l'autor. Es mostra quan:
- Disposem del correu electrònic de l'autor
- El document té una mida inferior a 20 Mb
- Es tracta d'un document d'accés restringit per decisió de l'autor o d'un document d'accés restringit per política de l'editorial
Rights accessRestricted access - publisher's policy (embargoed until 2021-01-27)
Big Data applications tackle the challenge of fast handling of large streams of data. Their performance is not only dependent on the data frameworks implementation and the underlying hardware but also on the deployment scheme and its potential for fast scaling. Consequently, several efforts have focused on the ease of deployment of Big Data applications, notably through the use of containerization. This technology was indeed raised to bring multitenancy and multiprocessing out of clusters, providing high deployment flexibility through lightweight container images. Recent studies have focused mostly on Docker containers. Notwithstanding, this article is actually interested in recent Singularity containers as they provide more security and support high-performance computing (HPC) environments and, in this way, they can make Big Data applications benefit from the specialized hardware of HPC. Singularity 2.x, however, does not isolate network resources as required by most Big Data components. Singularity 3.x allows allocating each container with isolated network resources, but their interconnection requires a nontrivial amount of configuration effort. In this context, this article makes a functional contribution in the form of a deployment scheme based on the interconnection of network namespaces, through underlay and overlay networking approaches, to make Big Data applications easily deployable inside Singularity containers. We provide detailed account of our deployment scheme when using both interconnection approaches in the form of a “how-to-do-it” report, and we evaluate it by comparing three Big Data applications based on Hadoop when performing on a bare-metal infrastructure and on scenarios involving Singularity and Docker instances.
CitationSauvanaud, C. [et al.]. Big data deployment in containerized infrastructures through the interconnection of network namespaces. "Software. Practice and experience", Juliol 2020, vol. 50, núm. 7, p. 1087-1113.
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder