Recent submissions

  • Graph-driven federated data management 

    Nadal Francesch, Sergi; Abelló Gamazo, Alberto; Romero Moral, Óscar; Vansummeren, Stijn; Vassiliadis, Panos (2023-01-01)
    Article
    Open access
    Modern data analysis applications require the ability to provide on-demand integration of data sources while offering a flexible and user-friendly query interface. Traditional techniques for answering queries using views, ...
  • Effective and scalable data discovery with NextiaJD 

    Flores Herrera, Javier de Jesús; Nadal Francesch, Sergi; Romero Moral, Óscar (OpenProceedings, 2021)
    Conference paper
    Open access
    We present NextiaJD, a data discovery system with high predictive performance and computational efficiency. NextiaJD aids data scientists in the discovery of datasets that can be crossed. To that end, it proposes a ranking ...
  • DocDesign 2.0: Automated database design for document stores with multi-criteria optimization 

    Hewasinghage, Moditha Lakshan Dharmasir; Nadal Francesch, Sergi; Abelló Gamazo, Alberto (OpenProceedings, 2021)
    Conference paper
    Open access
    We present DocDesign 2.0, a novel system that supports database design for document stores. DocDesign 2.0 automatically generates a document store design driven by a query workload and a set of optimization objectives. In ...
  • Towards scalable data discovery 

    Flores Herrera, Javier de Jesús; Nadal Francesch, Sergi; Romero Moral, Óscar (OpenProceedings, 2021)
    Conference paper
    Open access
    We study the problem of discovering joinable datasets at scale. We approach the problem from a learning perspective relying on profiles. These are succinct representations that capture the underlying characteristics of the ...
  • A framework for assessing the peer review duration of journals: case study in computer science 

    Bilalli, Besim; Munir, Rana Faisal; Abelló Gamazo, Alberto (Springer Nature, 2021-01)
    Article
    Open access
    In various fields, scientific article publication is a measure of productivity, and on many occasions it is used as a critical factor for evaluating researchers. Therefore, a lot of time is dedicated to writing articles ...
  • Configuring parallelism for hybrid layouts using multi-objective optimization 

    Munir, Rana Faisal; Abelló Gamazo, Alberto; Romero Moral, Óscar; Thiele, Maik; Lehner, Wolfgang (2020-06-01)
    Article
    Open access
    Modern organizations typically store their data in a raw format in data lakes. These data are then processed and usually stored under hybrid layouts, because they allow projection and selection operations. Thus, they allow ...
  • On the performance impact of using JSON, beyond impedance mismatch 

    Hewasinghage, Moditha Lakshan Dharmasir; Nadal Francesch, Sergi; Abelló Gamazo, Alberto (Springer, 2020)
    Text in conference proceedings
    Open access
    NoSQL database management systems adopt semi-structured data models, such as JSON, to easily accommodate schema evolution and overcome the overhead generated from transforming internal structures to tabular data (i.e., ...
  • Quarry: A user-centered big data integration platform 

    Jovanovic, Petar; Nadal Francesch, Sergi; Romero Moral, Óscar; Abelló Gamazo, Alberto; Bilalli, Besim (2021-02)
    Article
    Open access
    Obtaining valuable insights and actionable knowledge from data requires cross-analysis of domain data typically coming from various sources. Doing so inevitably imposes burdensome processes of unifying different data ...
  • TopoGraph: an end-to-end framework to build and analyze graph cubes 

    Ghrab, Amine; Romero Moral, Óscar; Skhiri, Sabri; Zimányi, Esteban (2021-02)
    Article
    Open access
    Graphs are a fundamental structure that provides an intuitive abstraction for modeling and analyzing complex and highly interconnected data. Given the potential complexity of such data, some approaches proposed extending ...
  • Keeping the data lake in form: proximity mining for pre-filtering schema matching 

    Al-serafi, Ayman Mounir Mohamed; Abelló Gamazo, Alberto; Romero Moral, Óscar; Calders, Toon (2020-05)
    Article
    Open access
    Data Lakes (DLs) are large repositories of raw datasets from disparate sources. As more datasets are ingested into a DL, there is an increasing need for efficient techniques to profile them and to detect the relationships ...
  • Multidimensional integration of RDF datasets 

    Behan, Jam Jahanzeb Khan; Romero Moral, Óscar; Zimányi, Esteban (Springer, 2019)
    Conference paper
    Open access
    Data providers have been uploading RDF datasets on the web to aid researchers and analysts in finding insights. These datasets, made available by different data providers, contain common characteristics that enable their ...
  • XLIndy: interactive recognition and information extraction in spreadsheets 

    Koci, Elvis; Kuban, Dana; Luetting, Nico; Olwig, Dominik; Thiele, Maik; Gonsior, Julius; Lehner, Wolfgang; Romero Moral, Óscar (Association for Computing Machinery (ACM), 2019)
    Text in conference proceedings
    Open access
    Over the years, spreadsheets have established their presence in many domains, including business, government, and science. However, challenges arise due to spreadsheets being partially structured and carrying implicit ...
