SETA: A suite-independent analytical framework
Cite as:
hdl:2117/106397
Document type: Official Master's Final Project
Date: 2016-10-20
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to existing legal exemptions, its reproduction, distribution,
public communication or transformation is prohibited without the authorization of the rights holder.
Abstract
Nowadays, business analytical users need agile processes spanning from the selection
of relevant data from raw data sources to the generation of data structures
prepared to serve as input for OLAP, Data Mining and/or other analytical tools.
However, the wide range of analytical needs and the increasing need for adaptive
business strategies discourage the use of existing ‘All-in-One’ suites (i.e.,
end-to-end solutions from a single vendor). Instead, an agile, suite-independent
approach is advisable to boost both the user’s independence from a specific vendor and
the analytical capabilities enabled by combining several suites and tools according to
the user’s needs. In this thesis we present and develop ‘SETA’, a suite-independent
agile analytical framework based on a novel approach that combines rich metadata
definitions and automation components. As proof of validity, we instantiate
the developed framework in a real-world project for the WHO Chagas Programme.
This thesis introduces two main contributions. The first is an approach to store and
integrate a set of heterogeneous data sources into a flexible data store that sits at an
intermediate point between classical Data Warehouse (DW) approaches and
recent Data Lake strategies. We argue that classical DW systems are too
rigid to accommodate agile analytical pipelines, whereas Data Lakes and Big Data
technologies are not suitable for many of today’s organizations. Thus, a novel
approach combining both is presented. The second is a rich definitional
system to represent 1) the data components at the Source, Global Schema and Domain
levels, 2) the data mappings between these levels and 3) the end users’ analytical
requirements. This definitional system provides a flexible view of the data schema
at different levels and enables the automated generation of the target data schemas and
the ETL processes that feed them.
Degree: MÀSTER UNIVERSITARI EN INNOVACIÓ I RECERCA EN INFORMÀTICA (Pla 2012)
Files | Description | Size | Format | View |
---|---|---|---|---|
118236.pdf | | 4.904 MB | PDF | View/Open |