Data Generation for the Simulation of Artifact-Centric Processes
dc.contributor | Theodorou, Vasileios |
dc.contributor | Jovanovic, Petar |
dc.contributor | Abelló Gamazo, Alberto |
dc.contributor.author | Nakuçi, Emona |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació |
dc.date.accessioned | 2015-01-29T08:34:11Z |
dc.date.available | 2015-01-29T08:34:11Z |
dc.date.issued | 2014-09-04 |
dc.identifier.uri | http://hdl.handle.net/2099.1/24817 |
dc.description.abstract | The growing need for application benchmarking and testing requires large amounts of data. However, obtaining realistic data from industry for testing purposes is often impossible because of confidentiality issues and the cost of transferring data over a network such as the Internet. Hence, there is a gap between the need to benchmark and the lack of a common testing environment in which to do so. This thesis contributes to narrowing that gap by introducing a theoretical framework of data generation for the simulation of data processes. We aim to generate input data and thereby provide a common environment for testing and evaluating data processes. Specifically, we focus on generating data for ETL data processes by analyzing the semantics of the flow. The motivation is that ETL processes are often time-consuming and error-prone, so it is important to evaluate and benchmark them in order to identify bottlenecks and continually improve their performance. Moreover, we introduce a layered architecture design for developing a prototype of the ETL data generation framework, and we present a pilot tool that implements the framework following the proposed architecture and the ETL semantics principle. In conclusion, we introduce the data generation approach and show that it can feasibly generate workload scenarios useful for testing and benchmarking ETL processes. |
dc.language.iso | eng |
dc.publisher | Universitat Politècnica de Catalunya |
dc.subject | UPC subject areas::Computer science::Information systems |
dc.subject.lcsh | Database management |
dc.subject.lcsh | Data mining |
dc.subject.other | tester |
dc.subject.other | generator |
dc.subject.other | testing |
dc.subject.other | benchmark |
dc.title | Data Generation for the Simulation of Artifact-Centric Processes |
dc.title.alternative | Detection of ETL bottlenecks by using process mining |
dc.type | Master thesis |
dc.subject.lemac | Databases--Management |
dc.subject.lemac | Data mining |
dc.identifier.slug | 100342 |
dc.rights.access | Open Access |
dc.date.updated | 2015-01-23T16:37:31Z |
dc.audience.educationlevel | Master's |
dc.audience.mediator | Facultat d'Informàtica de Barcelona |
dc.audience.degree | ERASMUS MUNDUS MASTER'S DEGREE IN INFORMATION TECHNOLOGIES FOR BUSINESS INTELLIGENCE (2012 Curriculum) |