
    • A cost-based storage format selector for materialized results in big data frameworks 

      Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2019-05-08)
      Article
      Open Access
      Modern big data frameworks (such as Hadoop and Spark) allow multiple users to do large-scale analysis simultaneously, by deploying data-intensive workflows (DIWs). These DIWs of different users share many common tasks (i.e., ...
    • A Data-driven approach to improve the process of data-intensive API creation and evolution 

      Abelló Gamazo, Alberto; Ayala Martínez, Claudia Patricia; Farré Tost, Carles; Gómez Seoane, Cristina; Oriol Hilari, Marc; Romero Moral, Óscar (CEUR-WS.org, 2017)
      Conference report
      Open Access
      The market of data-intensive Application Programming Interfaces (APIs) has recently experienced an exponential growth, but the creation and evolution of such APIs is still done ad-hoc, with little automated support and ...
    • A framework for multidimensional design of data warehouses from ontologies 

      Romero Moral, Óscar; Abelló Gamazo, Alberto (2009-07)
      External research report
      Open Access
      Some research efforts have proposed the automation of the data warehouse design in order to free this task of being (completely) performed by an expert and facilitate the whole process. Most advanced approaches exclusively ...
    • A framework for user-centered declarative ETL 

      Theodorou, Vasileios; Abelló Gamazo, Alberto; Thiele, Maik; Lehner, Wolfgang (2014)
      Conference report
      Open Access
      As business requirements evolve with increasing information density and velocity, there is a growing need for efficiency and automation of Extract-Transform-Load (ETL) processes. Current approaches for the modeling and ...
    • A requirement-driven approach to the design and evolution of data warehouses 

      Jovanovic, Petar; Romero Moral, Óscar; Simitsis, Alkis; Abelló Gamazo, Alberto; Mayorova, Daria (2014-08-01)
      Article
      Restricted access - publisher's policy
      Designing data warehouse (DW) systems in highly dynamic enterprise environments is not an easy task. At each moment, the multidimensional (MD) schema needs to satisfy the set of information requirements posed by the business ...
    • A situational approach for the definition and tailoring of a data-driven software evolution method 

      Franch Gutiérrez, Javier; Ralyté, Jolita; Perini, Anna; Abelló Gamazo, Alberto; Ameller, David; Gorroñogoitia, Jesús; Nadal Francesch, Sergi; Oriol Hilari, Marc; Seyff, Norbert; Siena, Alberto; Susi, Angelo (Springer, 2018)
      Conference report
      Open Access
      Successful software evolution heavily depends on the selection of the right features to be included in the next release. Such selection is difficult, and companies often report bad experiences about user acceptance. To ...
    • A software reference architecture for semantic-aware big data systems 

      Nadal Francesch, Sergi; Herrero Otal, Víctor; Romero Moral, Óscar; Abelló Gamazo, Alberto; Franch Gutiérrez, Javier; Vansummeren, Stijn; Valerio, Danilo (2016-06-13)
      Article
      Open Access
      Context: Big Data systems are a class of software systems that ingest, store, process and serve massive amounts of heterogeneous data, from multiple sources. Despite their undisputed impact in current society, their ...
    • A software tool for e-assessment of relational database skills 

      Abelló Gamazo, Alberto; Burgués Illa, Xavier; Casany Guerrero, María José; Martín Escofet, Carme; Quer, Carme; Rodríguez González, M. Elena; Romero Moral, Óscar; Urpí Tubella, Antoni (Tempus Publications, 2016-01-01)
      Article
      Restricted access - publisher's policy
      The objective of this paper is to present a software tool for the e-assessment of relational database skills. The tool is referred to as LearnSQL (Learning Environment for Automatic Rating of Notions of SQL). LearnSQL is ...
    • A unified view of data-intensive flows in business intelligence systems : a survey 

      Jovanovic, Petar; Romero Moral, Óscar; Abelló Gamazo, Alberto (2016-12)
      Article
      Open Access
      Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. ...
    • Adaptació d'assignatures de bases de dades a l'EEES 

      Martín Escofet, Carme; Abelló Gamazo, Alberto; Burgués Illa, Xavier; Casany Guerrero, María José; Quer, Carme; Rodríguez González, María Elena; Urpí Tubella, Antoni (2010)
      Conference report
      Open Access
      Recent changes to the curricula of the UPC and the UOC take into account the new European Higher Education Area (EHEA; EEES in Catalan). One direct consequence of these changes is the need to delimit and optimize the ...
    • Adapting LEARN-SQL to database computer supported cooperative learning 

      Burgués Illa, Xavier; Martín Escofet, Carme; Quer, Carme; Abelló Gamazo, Alberto; Casany Guerrero, María José; Urpí Tubella, Antoni; Rodríguez González, María Elena (2010)
      Conference report
      Open Access
      LEARN-SQL is a tool that we have been using for three years in several database courses, and that has shown positive effects on the learning of different database issues. This tool allows proposing remote questionnaires ...
    • Aggregating energy flexibilities under constraints 

      Valsomatzis, Emmanouil; Bach Pedersen, Torben; Abelló Gamazo, Alberto; Hose, Katja (Institute of Electrical and Electronics Engineers (IEEE), 2016)
      Conference report
      Open Access
      The flexibility of individual energy prosumers (producers and/or consumers) has drawn a lot of attention in recent years. Aggregation of such flexibilities provides prosumers with the opportunity to directly participate ...
    • An integration-oriented ontology to govern evolution in big data ecosystems 

      Nadal Francesch, Sergi; Romero Moral, Óscar; Abelló Gamazo, Alberto; Vassiliadis, Panos; Vansummeren, Stijn (CEUR-WS.org, 2017)
      Conference lecture
      Open Access
      Big Data architectures allow to flexibly store and process heterogeneous data, from multiple sources, in its original format. The structure of those data, commonly supplied by means of REST APIs, is continuously evolving, ...
    • An integration-oriented ontology to govern evolution in big data ecosystems 

      Nadal Francesch, Sergi; Romero Moral, Óscar; Abelló Gamazo, Alberto; Vassiliadis, Panos; Vansummeren, Stijn (2019-01)
      Article
      Open Access
      Big Data architectures allow to flexibly store and process heterogeneous data, from multiple sources, in their original format. The structure of those data, commonly supplied by means of REST APIs, is continuously evolving. ...
    • Approximating the DTD of a set of XML documents 

      Abelló Gamazo, Alberto; Palol Arregui, Xavier de; Hacid, Mohand-Saïd (2005-03)
      External research report
      Open Access
      The WWW contains a huge amount of documents. Some of them share the subject, but are generated by different people or even organizations. To guarantee the interchange of such documents, we can use XML. This allows to share ...
    • Approximating the schema of a set of documents by means of resemblance 

      Abelló Gamazo, Alberto; Palol, Xavier de; Hacid, Mohand-Saïd (Springer, 2018-06-02)
      Article
      Open Access
      The WWW contains a huge amount of documents. Some of them share the same subject, but are generated by different people or even by different organizations. A semi-structured model allows to share documents that do not have ...
    • ATUN-HL: auto tuning of hybrid layouts using workload and data characteristics 

      Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2018)
      Conference report
      Open Access
      Ad-hoc analysis implies processing data in near real-time. Thus, raw data (i.e., neither normalized nor transformed) is typically dumped into a distributed engine, where it is generally stored into a hybrid layout. Hybrid ...
    • Automated data pre-processing via meta-learning 

      Bilalli, Besim; Abelló Gamazo, Alberto; Aluja Banet, Tomàs; Wrembel, Robert (2016)
      Conference report
      Open Access
      A data mining algorithm may perform differently on datasets with different characteristics, e.g., it might perform better on a dataset with continuous attributes rather than with categorical attributes, or the other way ...
    • Automatic multidimensional design of data warehouses from requirements 

      Romero Moral, Óscar; Abelló Gamazo, Alberto (2009-07)
      External research report
      Open Access
      The ideal scenario to derive the multidimensional conceptual schema of a data warehouse would entail a hybrid approach (i.e. a combined data-driven and requirement-driven approach). Thus, the resulting multidimensional ...
    • Automatically configuring parallelism for hybrid layouts 

      Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (Springer, 2019)
      Conference lecture
      Open Access
      Distributed processing frameworks process data in parallel by dividing it into multiple partitions and each partition is processed in a separate task. The number of tasks is always created based on the total file size. ...