Automatic query driven data modelling in Cassandra

Hernandez, Roger; Becerra Fontal, Yolanda; Torres Viñals, Jordi; Ayguadé Parra, Eduard

Automatic_query_DS2105-BoAv11-7.pdf (60.43 KB)

Document type: Conference proceedings text

Publication date: 2015-05-05

Publisher: Barcelona Supercomputing Center

Access conditions: Open access

Attribution-NonCommercial-NoDerivs 3.0 Spain

Except where otherwise noted, the content of this work is licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain

Abstract

Non-relational databases have recently become the preferred choice for dealing with Big Data challenges, but their performance is very sensitive to the chosen data organisation. We have seen response times differ by a factor of more than 70 for the same query on different models. Users therefore need to be fully aware of the queries they intend to serve when designing their data model. The common practice, then, is to replicate data into different models designed to fit different query requirements. In this scenario, the user is responsible for implementing the code required to keep the different data replicas consistent. Manually replicating data at such a high layer of the database wastes a great deal of storage, because the underlying system's replication mechanisms, which are designed for availability and reliability, replicate every manual copy again. We propose and design a mechanism and a prototype that provide users with transparent management, where queries are matched with a well-performing model option. Additionally, we propose to do so by transforming the replication mechanism into a heterogeneous one, in order to avoid wasting disk space while keeping the availability and reliability features. The result is a system where, regardless of the query or model the user specifies, the response time will always be that of an affine query.
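The idea of matching a query with a well-performing model can be illustrated with a minimal Python sketch. It is not the authors' prototype: the `Model` class, the `affinity` scoring rule, and the `readings_by_sensor` / `readings_by_day` models are hypothetical names introduced here. The sketch assumes only the general Cassandra rule that a query is served efficiently when every partition key column is fixed by equality, and clustering columns are restricted in their declared order.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Model:
    """One replica layout of the same data (hypothetical illustration)."""
    name: str
    partition_key: tuple
    clustering_key: tuple = ()


def affinity(model: Model, equality_cols: Sequence[str],
             range_col: Optional[str] = None) -> int:
    """Count the query restrictions the model's key can serve directly.

    Returns 0 when the partition key is not fully fixed by equality,
    which in this sketch means the model is not a good fit for the query.
    """
    eq = set(equality_cols)
    if not set(model.partition_key) <= eq:
        return 0
    score = len(model.partition_key)
    # Clustering columns only help as an ordered prefix; a range
    # restriction may close the prefix but nothing after it counts.
    for col in model.clustering_key:
        if col in eq:
            score += 1
        else:
            if col == range_col:
                score += 1
            break
    return score


def choose_model(models: Sequence[Model], equality_cols: Sequence[str],
                 range_col: Optional[str] = None) -> Optional[Model]:
    """Pick the model with the highest affinity, or None if none fits."""
    best = max(models, key=lambda m: affinity(m, equality_cols, range_col),
               default=None)
    if best is None or affinity(best, equality_cols, range_col) == 0:
        return None
    return best


# Two hypothetical layouts of the same sensor readings.
MODELS = [
    Model("readings_by_sensor", ("sensor_id",), ("day",)),
    Model("readings_by_day", ("day",), ("sensor_id",)),
]
```

Under these assumptions, a query filtering only on `day` would be routed to `readings_by_day`, while a query fixing `sensor_id` and ranging over `day` would be routed to `readings_by_sensor`. A query on a column that no key covers returns `None`, standing in for the case where no affine model exists.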

Citation: Hernandez, Roger [et al.]. Automatic query driven data modelling in Cassandra. In: "BSC Doctoral Symposium (2nd: 2015: Barcelona)". 2nd ed. Barcelona: Barcelona Supercomputing Center, 2015, p. 114.

URI: http://hdl.handle.net/2099/16571

Collections

BSC International Doctoral Symposium - 2nd BSC International Doctoral Symposium, Barcelona, 5th-7th May, 2015 [47]
