An integration-oriented ontology to govern evolution in big data ecosystems
Document type: Conference lecture
Rights access: Open Access
European Commission's project: SUPERSEDE - SUpporting evolution and adaptation of PERsonalized Software by Exploiting contextual Data and End-user feedback (EC-H2020-644018)
Big Data architectures allow heterogeneous data from multiple sources to be stored and processed flexibly in their original format. The structure of those data, commonly supplied by means of REST APIs, is continuously evolving, forcing the data analysts who use them to adapt their analytical processes after each release. This becomes more challenging when aiming to perform an integrated or historical analysis of multiple sources. To cope with such complexity, in this paper we present the Big Data Integration ontology, the core construct for a data governance protocol that systematically annotates and integrates data from multiple sources in their original format. To cope with syntactic evolution in the sources, we present an algorithm that semi-automatically adapts the ontology upon new releases. A functional evaluation on real-world APIs is performed to validate our approach.
Citation: Nadal, S., Romero, O., Abelló, A., Vassiliadis, P., Vansummeren, S. An integration-oriented ontology to govern evolution in big data ecosystems. In: International Workshop On Design, Optimization, Languages and Analytical Processing of Big Data. "Proceedings of the Workshops of the EDBT/ICDT 2017 Joint Conference (EDBT/ICDT 2017): Venice, Italy, March 21-24, 2017". Venice: CEUR-WS.org, 2017, p. 1-10.