  • A requirement-driven approach to the design and evolution of data warehouses 

    Jovanovic, Petar; Romero Moral, Óscar; Simitsis, Alkis; Abelló Gamazo, Alberto; Mayorova, Daria (2014-08-01)
    Article
    Restricted access - publisher's policy
    Designing data warehouse (DW) systems in highly dynamic enterprise environments is not an easy task. At each moment, the multidimensional (MD) schema needs to satisfy the set of information requirements posed by the business ...
  • A unified view of data-intensive flows in business intelligence systems : a survey 

    Jovanovic, Petar; Romero Moral, Óscar; Abelló Gamazo, Alberto (2016-12)
    Article
    Open Access
    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. ...
  • BabbleFlow : a translator for analytic data flow programs 

    Jovanovic, Petar; Simitsis, Alkis; Wilkinson, Kevin (Association for Computing Machinery (ACM), 2014)
    Conference lecture
    Restricted access - publisher's policy
    A complex analytic data flow may perform multiple, inter-dependent tasks where each task uses a different processing engine. Such a multi-engine flow, termed a hybrid flow, may comprise subflows written in more than one ...
  • Bijoux : data generator for evaluating ETL process quality 

    Nakuçi, Emona; Theodorou, Vasileios; Jovanovic, Petar; Abelló Gamazo, Alberto (2014)
    Conference report
    Restricted access - publisher's policy
    Obtaining the right set of data for evaluating the fulfillment of different quality standards in the extract-transform-load (ETL) process design is rather challenging. First, the real data might be out of reach due to ...
  • Data generator for evaluating ETL process quality 

    Theodorou, Vasileios; Jovanovic, Petar; Abelló Gamazo, Alberto; Nakuçi, Emona (Elsevier, 2017-01-01)
    Article
    Open Access
    Obtaining the right set of data for evaluating the fulfillment of different quality factors in the extract-transform-load (ETL) process design is rather challenging. First, the real data might be out of reach due to different ...
  • H-word: Supporting job scheduling in Hadoop with workload-driven data redistribution 

    Jovanovic, Petar; Romero Moral, Óscar; Calders, Toon; Abelló Gamazo, Alberto (2016)
    Conference report
    Open Access
    Today’s distributed data processing systems typically follow a query shipping approach and exploit data locality for reducing network traffic. In such systems the distribution of data over the cluster resources plays a ...
  • Incremental consolidation of data-intensive multi-flows 

    Jovanovic, Petar; Romero Moral, Óscar; Simitsis, Alkis; Abelló Gamazo, Alberto (2016-05-01)
    Article
    Open Access
    Business intelligence (BI) systems depend on efficient integration of disparate and often heterogeneous data. The integration of data is governed by data-intensive flows and is driven by a set of information requirements. ...
  • Integrating ETL processes from information requirements 

    Romero Moral, Óscar; Jovanovic, Petar; Simitsis, Alkis; Abelló Gamazo, Alberto (Springer, 2012)
    Conference report
    Restricted access - publisher's policy
    Data warehouse (DW) design is based on a set of requirements expressed as service level agreements (SLAs) and business level objects (BLOs). Populating a DW system from a set of information sources is realized with ...
  • Integration of Multidimensional and ETL design 

    Jovanovic, Petar (Universitat Politècnica de Catalunya, 2011-06-23)
    Master thesis
    Open Access
    This project represents the master thesis and final project of the Master in Computing program at the Technical University of Catalonia. Led by the motivations and goals previously expressed, this project consists of the ...
  • Intermediate results materialization selection and format for data-intensive flows 

    Munir, Rana Faisal; Nadal Francesch, Sergi; Romero Moral, Óscar; Abelló Gamazo, Alberto; Jovanovic, Petar; Thiele, Maik; Lehner, Wolfgang (2018-05-01)
    Article
    Restricted access - publisher's policy
    Data-intensive flows deploy a variety of complex data transformations to build information pipelines from data sources to different end users. As data are processed, these workflows generate large intermediate results, ...
  • MapReduce performance model for Hadoop 2.x 

    Glushkova, Daria; Jovanovic, Petar; Abelló Gamazo, Alberto (Elsevier, 2019-01)
    Article
    Restricted access - publisher's policy
    MapReduce is a popular programming model for distributed processing of large data sets. Apache Hadoop is one of the most common open-source implementations of such paradigm. Performance analysis of concurrent job executions ...
  • MapReduce performance models for Hadoop 2.x 

    Glushkova, Daria; Jovanovic, Petar; Abelló Gamazo, Alberto (CEUR-WS.org, 2017)
    Conference report
    Open Access
    MapReduce is a popular programming model for distributed processing of large data sets. Apache Hadoop is one of the most common open-source implementations of such paradigm. Performance analysis of concurrent job executions ...
  • Quarry 

    Abelló Gamazo, Alberto; Romero Moral, Óscar; Jovanovic, Petar; Nadal Francesch, Sergi; Bilalli, Besim; Candón Arenas, Héctor; Mayorova, Daria; Thavornun, Varunya; Gil González, Daniel (2015-07-01)
    Computer program
    Restricted access - confidentiality agreement
  • Quarry : digging up the gems of your data treasury 

    Jovanovic, Petar; Romero Moral, Óscar; Simitsis, Alkis; Abelló Gamazo, Alberto; Candón Arenas, Héctor; Nadal Francesch, Sergi (2015)
    Conference lecture
    Open Access
    The design lifecycle of a data warehousing (DW) system is primarily led by requirements of its end-users and the complexity of underlying data sources. The process of designing a multidimensional (MD) schema and back-end ...
  • Requirement-driven creation and deployment of multidimensional and ETL designs 

    Jovanovic, Petar; Romero Moral, Óscar; Simitsis, Alkis; Abelló Gamazo, Alberto (Springer, 2012)
    Conference report
    Open Access
    We present our tool for assisting designers in the error-prone and time-consuming tasks carried out at the early stages of a data warehousing project. Our tool semi-automatically produces multidimensional (MD) and ETL ...
  • Requirement-driven design and optimization of data-intensive flows 

    Jovanovic, Petar (2016-09-26)
    Doctoral thesis
    Open Access
    Covenantee:  Université libre de Bruxelles
    Data have become the number one asset of today's business world. Thus, their exploitation and analysis have attracted the attention of people from different fields and with different technical backgrounds. Data-intensive flows are ...
  • Supporting data integration tasks with semi-automatic ontology construction 

    Touma, Rizkallah; Romero Moral, Óscar; Jovanovic, Petar (Association for Computing Machinery (ACM), 2015)
    Conference report
    Restricted access - publisher's policy
    Data integration aims to facilitate the exploitation of heterogeneous data by providing the user with a unified view of data residing in different sources. Currently, ontologies are commonly used to represent this unified ...
  • Towards automated data integration in software analytics 

    Martínez Fernández, Silverio; Jovanovic, Petar; Franch Gutiérrez, Javier; Jedlitschka, Andreas (Association for Computing Machinery (ACM), 2018)
    Conference lecture
    Open Access
    Software organizations want to be able to base their decisions on the latest set of available data and the real-time analytics derived from them. In order to support "real-time enterprise" for software organizations and ...
  • xPAD: A platform for analytic data flows 

    Simitsis, Alkis; Wilkinson, Kevin; Jovanovic, Petar (2013)
    Conference report
    Restricted access - publisher's policy
    As enterprises become more automated, real-time, and data-driven, they need to integrate new data sources and specialized processing engines. The traditional business intelligence architecture of Extract-Transform-Load ...