dc.contributor: Theodorou, Vasileios
dc.contributor: Jovanovic, Petar
dc.contributor: Abelló Gamazo, Alberto
dc.contributor.author: Nakuçi, Emona
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació
dc.date.accessioned: 2015-01-29T08:34:11Z
dc.date.available: 2015-01-29T08:34:11Z
dc.date.issued: 2014-09-04
dc.identifier.uri: http://hdl.handle.net/2099.1/24817
dc.description.abstract: The increasing need for application benchmarking and testing requires large amounts of data. However, obtaining realistic data from industry for testing purposes is often impossible due to confidentiality issues and the cost of transferring data over the network (i.e., the Internet). Hence, there is a gap between the need to benchmark and the lack of a common testing environment in which to do it. The scope of this thesis is to contribute to narrowing this gap by introducing a theoretical framework of data generation for the simulation of data processes. We aim to generate input data and thereby provide a common testing environment for testing and evaluating data processes. Specifically, we focus on generating data for ETL data processes by analyzing the semantics of the flow. The motivation comes from the fact that ETL processes are often time-consuming and error-prone. Therefore, it is of high importance to evaluate and benchmark them in order to identify bottlenecks and constantly improve their performance. Moreover, we introduce a layered architecture design for developing a prototype of the ETL data generation framework. In addition, we present a pilot tool that implements the ETL data generation framework following the proposed architecture and the ETL semantics principle. In conclusion, we introduce the data generation approach and show its feasibility for generating workload scenarios useful for testing and benchmarking ETL processes.
dc.language.iso: eng
dc.publisher: Universitat Politècnica de Catalunya
dc.subject: Àrees temàtiques de la UPC::Informàtica::Sistemes d'informació
dc.subject.lcsh: Database management
dc.subject.lcsh: Data mining
dc.subject.other: tester
dc.subject.other: generator
dc.subject.other: testing
dc.subject.other: benchmark
dc.title: Data Generation for the Simulation of Artifact-Centric Processes
dc.title.alternative: Detection of ETL bottlenecks by using process mining
dc.type: Master thesis
dc.subject.lemac: Bases de dades--Gestió
dc.subject.lemac: Mineria de dades
dc.identifier.slug: 100342
dc.rights.access: Open Access
dc.date.updated: 2015-01-23T16:37:31Z
dc.audience.educationlevel: Màster
dc.audience.mediator: Facultat d'Informàtica de Barcelona



All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder