Browsing by Author "Munir, Rana Faisal"
A cost-based storage format selector for materialized results in big data frameworks
Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2019-05-08)
Article
Open Access
Modern big data frameworks (such as Hadoop and Spark) allow multiple users to do large-scale analysis simultaneously, by deploying data-intensive workflows (DIWs). These DIWs of different users share many common tasks (i.e., ...
A framework for assessing the peer review duration of journals: case study in computer science
Bilalli, Besim; Munir, Rana Faisal; Abelló Gamazo, Alberto (Springer Nature, 2021-01)
Article
Open Access
In various fields, scientific article publication is a measure of productivity and on many occasions it is used as a critical factor for evaluating researchers. Therefore, a lot of time is dedicated to writing articles ...
ATUN-HL: auto tuning of hybrid layouts using workload and data characteristics
Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2018)
Conference report
Open Access
Ad-hoc analysis implies processing data in near real-time. Thus, raw data (i.e., neither normalized nor transformed) is typically dumped into a distributed engine, where it is generally stored into a hybrid layout. Hybrid ...
Automatically configuring parallelism for hybrid layouts
Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (Springer, 2019)
Conference lecture
Open Access
Distributed processing frameworks process data in parallel by dividing it into multiple partitions, each of which is processed in a separate task. The number of tasks is always determined by the total file size. ...
Configuring parallelism for hybrid layouts using multi-objective optimization
Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2020-06-01)
Article
Open Access
Modern organizations typically store their data in a raw format in data lakes. These data are then processed and usually stored under hybrid layouts, because they allow projection and selection operations. Thus, they allow ...
Intermediate results materialization selection and format for data-intensive flows
Munir, Rana Faisal; Nadal Francesch, Sergi; Romero Moral, Óscar; Abelló Gamazo, Alberto; Jovanovic, Petar; Thiele, Maik; Lehner, Wolfgang (2018-05-01)
Article
Restricted access - publisher's policy
Data-intensive flows deploy a variety of complex data transformations to build information pipelines from data sources to different end users. As data are processed, these workflows generate large intermediate results, ...
PRESISTANT: data pre-processing assistant
Bilalli, Besim; Abelló Gamazo, Alberto; Aluja Banet, Tomàs; Munir, Rana Faisal; Wrembel, Robert (Springer, 2019)
Conference lecture
Open Access
A concrete classification algorithm may perform differently on datasets with different characteristics, e.g., it might perform better on a dataset with continuous attributes rather than with categorical attributes, or the ...
Resilient store: a heuristic-based data format selector for intermediate results
Munir, Rana Faisal; Romero Moral, Óscar; Abelló Gamazo, Alberto; Bilalli, Besim; Thiele, Maik; Lehner, Wolfgang (2016)
Journal issue
Open Access
Large-scale data analysis is an important activity in many organizations that typically requires the deployment of data-intensive workflows. As data is processed, these workflows generate large intermediate results, which ...
Storage format selection and optimization for materialized intermediate results in data-intensive flows
Munir, Rana Faisal (Universitat Politècnica de Catalunya, 2019-12-05)
Doctoral thesis
Open Access
Modern organizations produce and collect large volumes of data that need to be processed repeatedly and quickly to gain business insights. For such processing, typically, Data-intensive Flows (DIFs) are deployed on ...