An integration-oriented ontology to govern evolution in big data ecosystems
Abstract
Big Data architectures allow heterogeneous data from multiple sources to be stored and processed flexibly in their original format. The structure of those data, commonly supplied by means of REST APIs, is continuously evolving. Thus, data analysts need to adapt their analytical processes after each API release. This becomes more challenging when performing an integrated or historical analysis. To cope with such complexity, in this paper we present the Big Data Integration ontology, the core construct to govern the data integration process under schema evolution by systematically annotating it with information regarding the schema of the sources. We present a query rewriting algorithm that, using the annotated ontology, converts queries posed over the ontology into queries over the sources. To cope with syntactic evolution in the sources, we present an algorithm that semi-automatically adapts the ontology upon new releases. This guarantees that ontology-mediated queries correctly retrieve data from the most recent schema version, and that historical queries remain correct. A functional and performance evaluation on real-world APIs is performed to validate our approach.
Collections
inSSIDE - integrated Software, Service, Information and Data Engineering - Journal articles
Departament d'Enginyeria de Serveis i Sistemes d'Informació - Journal articles
GESSI - Grup d'Enginyeria del Software i dels Serveis - Journal articles
IMP - Information Modeling and Processing - Journal articles


