Browse by author "Lehner, Wolfgang"
Showing items 1-15 of 15
-
A cost-based storage format selector for materialized results in big data frameworks
Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2019-05-08)
Article
Open access
Modern big data frameworks (such as Hadoop and Spark) allow multiple users to do large-scale analysis simultaneously, by deploying data-intensive workflows (DIWs). These DIWs of different users share many common tasks (i.e., ...
-
A framework for user-centered declarative ETL
Theodorou, Vasileios; Abelló Gamazo, Alberto; Thiele, Maik; Lehner, Wolfgang (2014)
Conference report
Open access
As business requirements evolve with increasing information density and velocity, there is a growing need for efficiency and automation of Extract-Transform-Load (ETL) processes. Current approaches for the modeling and ...
-
A machine learning approach for layout inference in spreadsheets
Koci, Elvis; Thiele, Maik; Romero Moral, Óscar; Lehner, Wolfgang (SciTePress, 2016)
Conference report
Open access
Spreadsheet applications are one of the most used tools for content generation and presentation in industry and the Web. In spite of this success, there does not exist a comprehensive approach to automatically extract and ...
-
ATUN-HL: auto tuning of hybrid layouts using workload and data characteristics
Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2018)
Conference report
Open access
Ad-hoc analysis implies processing data in near real-time. Thus, raw data (i.e., neither normalized nor transformed) is typically dumped into a distributed engine, where it is generally stored into a hybrid layout. Hybrid ...
-
Automatically configuring parallelism for hybrid layouts
Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (Springer, 2019)
Conference lecture
Open access
Distributed processing frameworks process data in parallel by dividing it into multiple partitions and each partition is processed in a separate task. The number of tasks is always created based on the total file size. ...
-
Configuring parallelism for hybrid layouts using multi-objective optimization
Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2020-06-01)
Article
Open access
Modern organizations typically store their data in a raw format in data lakes. These data are then processed and usually stored under hybrid layouts, because they allow projection and selection operations. Thus, they allow ...
-
Frequent patterns in ETL workflows: An empirical approach
Theodorou, Vasileios; Abelló Gamazo, Alberto; Thiele, Maik; Lehner, Wolfgang (Elsevier, 2017-09-05)
Article
Open access
The complexity of Business Intelligence activities has driven the proposal of several approaches for the effective modeling of Extract-Transform-Load (ETL) processes, based on the conceptual abstraction of their operations. ...
-
Intermediate results materialization selection and format for data-intensive flows
Munir, Rana Faisal; Nadal Francesch, Sergi; Romero Moral, Óscar; Abelló Gamazo, Alberto; Jovanovic, Petar; Thiele, Maik; Lehner, Wolfgang (2018-05-01)
Article
Restricted access (publisher's policy)
Data-intensive flows deploy a variety of complex data transformations to build information pipelines from data sources to different end users. As data are processed, these workflows generate large intermediate results, ...
-
POIESIS: A tool for quality-aware ETL process redesign
Theodorou, Vasileios; Abelló Gamazo, Alberto; Thiele, Maik; Lehner, Wolfgang (2015)
Conference report
Open access
We present a tool, called POIESIS, for automatic ETL process enhancement. ETL processes are essential data-centric activities in modern business intelligence environments and they need to be examined through a viewpoint ...
-
Quality measures for ETL processes
Theodorou, Vasileios; Abelló Gamazo, Alberto; Lehner, Wolfgang (Springer, 2014)
Conference report
Restricted access (publisher's policy)
ETL processes play an increasingly important role for the support of modern business operations. These business processes are centred around artifacts with high variability and diverse lifecycles, which correspond to key ...
-
Quality measures for ETL processes: from goals to implementation
Theodorou, Vasileios; Abelló Gamazo, Alberto; Lehner, Wolfgang; Thiele, Maik (2016-10-01)
Article
Open access
Extraction transformation loading (ETL) processes play an increasingly important role for the support of modern business operations. These business processes are centred around artifacts with high variability and diverse ...
-
Resilient store: a heuristic-based data format selector for intermediate results
Munir, Rana Faisal; Romero Moral, Óscar; Abelló Gamazo, Alberto; Bilalli, Besim; Thiele, Maik; Lehner, Wolfgang (2016)
Journal issue
Open access
Large-scale data analysis is an important activity in many organizations that typically requires the deployment of data-intensive workflows. As data is processed, these workflows generate large intermediate results, which ...
-
Table identification and reconstruction in spreadsheets
Koci, Elvis; Thiele, Maik; Romero Moral, Óscar; Lehner, Wolfgang (Springer, 2017)
Conference report
Open access
Spreadsheets are one of the most successful content generation tools, used in almost every enterprise to perform data transformation, visualization, and analysis. The high degree of freedom provided by these tools results ...
-
Table recognition in spreadsheets via a graph representation
Koci, Elvis; Thiele, Maik; Lehner, Wolfgang; Romero Moral, Óscar (2018)
Conference report
Open access
Spreadsheet software are very popular data management tools. Their ease of use and abundant functionalities equip novices and professionals alike with the means to generate, transform, analyze, and visualize data. As a ...
-
XLIndy: interactive recognition and information extraction in spreadsheets
Koci, Elvis; Kuban, Dana; Luetting, Nico; Olwig, Dominik; Thiele, Maik; Gonsior, Julius; Lehner, Wolfgang; Romero Moral, Óscar (Association for Computing Machinery (ACM), 2019)
Conference report
Open access
Over the years, spreadsheets have established their presence in many domains, including business, government, and science. However, challenges arise due to spreadsheets being partially-structured and carrying implicit ...