    • A cost-based storage format selector for materialized results in big data frameworks 

      Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2019-05-08)
      Article
      Open Access
      Modern big data frameworks (such as Hadoop and Spark) allow multiple users to do large-scale analysis simultaneously, by deploying data-intensive workflows (DIWs). These DIWs of different users share many common tasks (i.e., ...
    • A Data-driven approach to improve the process of data-intensive API creation and evolution 

      Abelló Gamazo, Alberto; Ayala Martínez, Claudia Patricia; Farré Tost, Carles; Gómez Seoane, Cristina; Oriol Hilari, Marc; Romero Moral, Óscar (CEUR-WS.org, 2017)
      Conference report
      Open Access
      The market of data-intensive Application Programming Interfaces (APIs) has recently experienced an exponential growth, but the creation and evolution of such APIs is still done ad-hoc, with little automated support and ...
    • A framework for building OLAP cubes on graphs 

      Ghrab, Amine; Romero Moral, Óscar; Skhiri, Sabri; Vaisman, Alejandro; Zimányi, Esteban (Springer, 2015)
      Conference report
      Open Access
      Graphs are widespread structures providing a powerful abstraction for modeling networked data. Large and complex graphs have emerged in various domains such as social networks, bioinformatics, and chemical data. However, ...
    • A framework for multidimensional design of data warehouses from ontologies 

      Romero Moral, Óscar; Abelló Gamazo, Alberto (2009-07)
      External research report
      Open Access
      Some research efforts have proposed the automation of the data warehouse design in order to free this task of being (completely) performed by an expert and facilitate the whole process. Most advanced approaches exclusively ...
    • A machine learning approach for layout inference in spreadsheets 

      Koci, Elvis; Thiele, Maik; Romero Moral, Óscar; Lehner, Wolfgang (SciTePress, 2016)
      Conference report
      Open Access
      Spreadsheet applications are one of the most used tools for content generation and presentation in industry and the Web. In spite of this success, there does not exist a comprehensive approach to automatically extract and ...
    • A requirement-driven approach to the design and evolution of data warehouses 

      Jovanovic, Petar; Romero Moral, Óscar; Simitsis, Alkis; Abelló Gamazo, Alberto; Mayorova, Daria (2014-08-01)
      Article
      Restricted access - publisher's policy
      Designing data warehouse (DW) systems in highly dynamic enterprise environments is not an easy task. At each moment, the multidimensional (MD) schema needs to satisfy the set of information requirements posed by the business ...
    • A software reference architecture for semantic-aware big data systems 

      Nadal Francesch, Sergi; Herrero Otal, Víctor; Romero Moral, Óscar; Abelló Gamazo, Alberto; Franch Gutiérrez, Javier; Vansummeren, Stijn; Valerio, Danilo (2016-06-13)
      Article
      Open Access
      Context: Big Data systems are a class of software systems that ingest, store, process and serve massive amounts of heterogeneous data, from multiple sources. Despite their undisputed impact in current society, their ...
    • A software tool for e-assessment of relational database skills 

      Abelló Gamazo, Alberto; Burgués Illa, Xavier; Casany Guerrero, María José; Martín Escofet, Carme; Quer, Carme; Rodríguez González, M. Elena; Romero Moral, Óscar; Urpí Tubella, Antoni (Tempus Publications, 2016-01-01)
      Article
      Restricted access - publisher's policy
      The objective of this paper is to present a software tool for the e-assessment of relational database skills. The tool is referred to as LearnSQL (Learning Environment for Automatic Rating of Notions of SQL). LearnSQL is ...
    • A unified view of data-intensive flows in business intelligence systems : a survey 

      Jovanovic, Petar; Romero Moral, Óscar; Abelló Gamazo, Alberto (2016-12)
      Article
      Open Access
      Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. ...
    • An integration-oriented ontology to govern evolution in big data ecosystems 

      Nadal Francesch, Sergi; Romero Moral, Óscar; Abelló Gamazo, Alberto; Vassiliadis, Panos; Vansummeren, Stijn (CEUR-WS.org, 2017)
      Conference lecture
      Open Access
      Big Data architectures allow to flexibly store and process heterogeneous data, from multiple sources, in its original format. The structure of those data, commonly supplied by means of REST APIs, is continuously evolving, ...
    • An integration-oriented ontology to govern evolution in big data ecosystems 

      Nadal Francesch, Sergi; Romero Moral, Óscar; Abelló Gamazo, Alberto; Vassiliadis, Panos; Vansummeren, Stijn (2019-01)
      Article
      Open Access
      Big Data architectures allow to flexibly store and process heterogeneous data, from multiple sources, in their original format. The structure of those data, commonly supplied by means of REST APIs, is continuously evolving. ...
    • Analytical metadata modeling for next generation BI systems 

      Varga, Jovan; Romero Moral, Óscar; Bach Pedersen, Torben; Thomsen, Christian (2018-10)
      Article
      Open Access
      Business Intelligence (BI) systems are extensively used as in-house solutions to support decision-making in organizations. Next generation BI 2.0 systems claim for expanding the use of BI solutions to external data sources ...
    • ARDI: automatic generation of RDFS models from heterogeneous data sources 

      Nigatu, Shumet Tadesse; Gómez Seoane, Cristina; Romero Moral, Óscar; Hose, Katja; Rabbani, Kashif (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference report
      Open Access
      The current wealth of information, typically known as Big Data, generates a large amount of available data for organisations. Data Integration provides foundations to query disparate data sources as if they were integrated ...
    • ATUN-HL: auto tuning of hybrid layouts using workload and data characteristics 

      Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2018)
      Conference report
      Open Access
      Ad-hoc analysis implies processing data in near real-time. Thus, raw data (i.e., neither normalized nor transformed) is typically dumped into a distributed engine, where it is generally stored into a hybrid layout. Hybrid ...
    • Automatic multidimensional design of data warehouses from requirements 

      Romero Moral, Óscar; Abelló Gamazo, Alberto (2009-07)
      External research report
      Open Access
      The ideal scenario to derive the multidimensional conceptual schema of a data warehouse would entail a hybrid approach (i.e. a combined data-driven and requirement-driven approach). Thus, the resulting multidimensional ...
    • Automatically configuring parallelism for hybrid layouts 

      Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (Springer, 2019)
      Conference lecture
      Open Access
      Distributed processing frameworks process data in parallel by dividing it into multiple partitions and each partition is processed in a separate task. The number of tasks is always created based on the total file size. ...
    • Automating the multidimensional design of data warehouses 

      Romero Moral, Óscar (Universitat Politècnica de Catalunya, 2010-02-09)
      Doctoral thesis
      Open Access
      Previous experience in the field of data warehousing shows that the multidimensional schema of the data warehouse must be the result of a hybrid approach; that is, a proposal that considers both the ...
    • Big data management challenges in SUPERSEDE 

      Nadal Francesch, Sergi; Abelló Gamazo, Alberto; Romero Moral, Óscar; Varga, Jovan (CEUR-WS.org, 2017)
      Conference report
      Open Access
      The H2020 SUPERSEDE (www.supersede.eu) project aims to support decision-making in the evolution and adaptation of software services and applications by exploiting end-user feedback and runtime data, with the overall goal ...
    • Configuring parallelism for hybrid layouts using multi-objective optimization 

      Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2020-06-01)
      Article
      Open Access
      Modern organizations typically store their data in a raw format in data lakes. These data are then processed and usually stored under hybrid layouts, because they allow projection and selection operations. Thus, they allow ...
    • Data engineering for data science: two sides of the same coin 

      Romero Moral, Óscar; Wrembel, Robert (Springer, 2020)
      Conference report
      Open Access
      A de facto technological standard of data science is based on notebooks (e.g., Jupyter), which provide an integrated environment to execute data workflows in different languages. However, from a data engineering point of ...