
dc.contributor.author: Nakuçi, Emona
dc.contributor.author: Theodorou, Vasileios
dc.contributor.author: Jovanovic, Petar
dc.contributor.author: Abelló Gamazo, Alberto
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació
dc.date.accessioned: 2015-01-28T11:39:00Z
dc.date.created: 2014
dc.date.issued: 2014
dc.identifier.citation: Nakuçi, E. [et al.]. Bijoux : data generator for evaluating ETL process quality. A: International Workshop On Data Warehousing and OLAP. "Proceedings of the 17th International Workshop on Data Warehousing and OLAP". Shanghai: 2014, p. 23-32.
dc.identifier.isbn: 978-1-4503-0999-8
dc.identifier.uri: http://hdl.handle.net/2117/26130
dc.description.abstract: Obtaining the right set of data for evaluating the fulfillment of different quality standards in the extract-transform-load (ETL) process design is rather challenging. First, the real data might be out of reach due to different privacy constraints, while providing a synthetic set of data is known as a labor-intensive task that needs to take various combinations of process parameters into account. Additionally, having a single dataset usually does not represent the evolution of data throughout the complete process lifespan, hence missing the plethora of possible test cases. To facilitate such a demanding task, in this paper we propose an automatic data generator (i.e., Bijoux). Starting from a given ETL process model, Bijoux extracts the semantics of data transformations, analyzes the constraints they imply over data, and automatically generates testing datasets. At the same time, it considers different dataset and transformation characteristics (e.g., size, distribution, selectivity, etc.) in order to cover a variety of test scenarios. We report our experimental findings showing the effectiveness and scalability of our approach.
dc.format.extent: 10 p.
dc.language.iso: eng
dc.subject: Àrees temàtiques de la UPC::Informàtica::Sistemes d'informació
dc.subject.lcsh: Data warehousing
dc.subject.other: Process quality
dc.subject.other: Data generator
dc.subject.other: ETL
dc.title: Bijoux : data generator for evaluating ETL process quality
dc.type: Conference report
dc.subject.lemac: Repositoris
dc.subject.lemac: Gestors de dades
dc.contributor.group: Universitat Politècnica de Catalunya. MPI - Modelització i Processament de la Informació
dc.identifier.doi: 10.1145/2666158.2666183
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://dl.acm.org/citation.cfm?doid=2666158.2666183
dc.rights.access: Restricted access - publisher's policy
local.identifier.drac: 15392604
dc.description.version: Postprint (published version)
dc.date.lift: 10000-01-01
local.citation.author: Nakuçi, E.; Theodorou, V.; Jovanovic, P.; Abello, A.
local.citation.contributor: International Workshop On Data Warehousing and OLAP
local.citation.pubplace: Shanghai
local.citation.publicationName: Proceedings of the 17th International Workshop on Data Warehousing and OLAP
local.citation.startingPage: 23
local.citation.endingPage: 32



All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder.