dc.contributor.author | Li, Yalei |
dc.contributor.author | Nadal Francesch, Sergi |
dc.contributor.author | Romero Moral, Óscar |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació |
dc.date.accessioned | 2022-10-27T09:02:09Z |
dc.date.available | 2023-08-29T00:28:08Z |
dc.date.issued | 2022 |
dc.identifier.citation | Li, Y.; Nadal, S.; Romero, O. A data quality framework for graph-based virtual data integration systems. In: European Conference on Advances in Databases and Information Systems. "Advances in Databases and Information Systems: 26th European Conference, ADBIS 2022: Turin, Italy, September 5-8, 2022: proceedings". Berlin: Springer, 2022, p. 104-117. ISBN 978-3-031-15740-0. DOI 10.1007/978-3-031-15740-0_9. |
dc.identifier.isbn | 978-3-031-15740-0 |
dc.identifier.uri | http://hdl.handle.net/2117/375124 |
dc.description.abstract | Data Quality (DQ) plays a critical role in data integration. Up to now, DQ has mostly been addressed from a single-database perspective. Popular DQ frameworks rely on Integrity Constraints (ICs) to enforce valid application semantics, which led to the Denial Constraint (DC) formalism, which models a broad range of ICs in real-world applications. Yet current approaches are rather monolithic: they consider a single database and do not suit data integration scenarios. In this paper, we address DQ for data integration systems. Specifically, we extend virtual data integration systems to elicit DCs from the disparate data sources to be integrated, using state-of-the-art DC techniques, and propagate them to the integrated schema (global DCs). Then, we propose a method to manage global DCs and identify (i) minimal DCs and (ii) potential clashes between them. |
dc.description.sponsorship | This work was partly supported by the DOGO4ML project, funded by the Spanish Ministerio de Ciencia e Innovación under project PID2020-117191RB-I00. Sergi Nadal is partly supported by the Spanish Ministerio de Ciencia e Innovación, as well as the European Union - NextGenerationEU, under project FJC2020-045809-I. |
dc.format.extent | 14 p. |
dc.language.iso | eng |
dc.publisher | Springer |
dc.subject | Àrees temàtiques de la UPC::Informàtica::Sistemes d'informació |
dc.subject.lcsh | Decision-making |
dc.subject.lcsh | Big data |
dc.subject.other | Data quality |
dc.subject.other | Data integration |
dc.subject.other | Denial constraints |
dc.title | A data quality framework for graph-based virtual data integration systems |
dc.type | Conference report |
dc.subject.lemac | Decisió, Presa de |
dc.subject.lemac | Dades massives |
dc.contributor.group | Universitat Politècnica de Catalunya. inSSIDE - integrated Software, Services, Information and Data Engineering |
dc.identifier.doi | 10.1007/978-3-031-15740-0_9 |
dc.description.peerreviewed | Peer Reviewed |
dc.relation.publisherversion | https://link.springer.com/chapter/10.1007/978-3-031-15740-0_9 |
dc.rights.access | Open Access |
local.identifier.drac | 34229433 |
dc.description.version | Postprint (author's final draft) |
dc.relation.projectid | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-117191RB-I00/ES/DESARROLLO, OPERATIVA Y GOBERNANZA DE DATOS PARA SISTEMAS SOFTWARE BASADOS EN APRENDIZAJE AUTOMATICO/ |
local.citation.author | Li, Y.; Nadal, S.; Romero, O. |
local.citation.contributor | European Conference on Advances in Databases and Information Systems |
local.citation.pubplace | Berlin |
local.citation.publicationName | Advances in Databases and Information Systems: 26th European Conference, ADBIS 2022: Turin, Italy, September 5-8, 2022: proceedings |
local.citation.startingPage | 104 |
local.citation.endingPage | 117 |