Evaluating the benefits of key-value databases for scientific applications
Document type: Conference report
Rights access: Open Access
The convergence of Big Data applications with High-Performance Computing requires new methodologies to store, manage and process large amounts of information. Traditional storage solutions do not scale well, which forces applications into complex coding strategies. For example, the brain atlas of the Human Brain Project faces the challenge of processing large amounts of high-resolution brain images. Given these computing needs, we study the effects of replacing a traditional storage system with a distributed Key-Value database in a cell segmentation application. The original code uses HDF5 files on GPFS through an intricate interface that imposes synchronizations. In contrast, using Apache Cassandra or ScyllaDB through Hecuba greatly simplifies the application code. Thanks to the Key-Value data model, the number of synchronizations is reduced and the time dedicated to I/O scales as the number of nodes increases.
Citation: Santamaría, P. [et al.]. Evaluating the benefits of key-value databases for scientific applications. In: International Conference on Computational Science. "Computational Science: ICCS 2019, 19th International Conference: Faro, Portugal, June 12–14, 2019: proceedings, part I". Berlin: Springer, 2019, p. 412-426.