Data Generation for the Simulation of Artifact-Centric Processes

Nakuçi, Emona

Tutor / director: Theodorou, Vasileios; Jovanovic, Petar; Abelló Gamazo, Alberto

Document type: Master thesis

Date: 2014-09-04

Access conditions: Open access

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, its reproduction, distribution, public communication or transformation is prohibited without the authorization of the rights holder.

Abstract

The growing need for application benchmarking and testing requires large amounts of data. However, obtaining realistic data from industry for testing purposes is often impossible due to confidentiality issues and the cost of transferring data over a network, i.e., the Internet. Hence, there is a gap between the need to benchmark and the lack of a common testing environment in which to do so. This thesis contributes to narrowing that gap by introducing a theoretical framework of data generation for the simulation of data processes. We aim at generating input data and thereby providing a common testing environment for testing and evaluating data processes. Specifically, we focus on generating data for ETL data processes by analyzing the semantics of the flow. The motivation comes from the fact that ETL processes are often time-consuming and error-prone. It is therefore important to evaluate and benchmark them in order to identify bottlenecks and continually improve their performance. Moreover, we introduce a layered architecture design for developing a prototype of the ETL data generation framework. In addition, we present a pilot tool that implements the ETL data generation framework following the proposed architecture and the ETL semantics principle. In conclusion, we introduce the data generation approach and show its feasibility for generating workload scenarios useful for testing and benchmarking ETL processes.

Subjects: Database management, Data mining

Degree: Erasmus Mundus Master's Degree in Information Technologies for Business Intelligence (2012 plan)

URI: http://hdl.handle.net/2099.1/24817

Collections

Official master's degrees - Erasmus Mundus Master's Degree in Information Technologies for Business Intelligence (IT4BI) [18]

Files	Description	Size	Format
100342.pdf		2,062Mb	PDF

UPCommons. UPC open knowledge portal
