Erasmus Mundus Doctorate in Information Technologies for Business Intelligence
Recent Submissions
Graph-driven federated data management
(2023-01-01)
Article
Open Access
Modern data analysis applications require the ability to provide on-demand integration of data sources while offering a flexible and user-friendly query interface. Traditional techniques for answering queries using views, ...
Effective and scalable data discovery with NextiaJD
(OpenProceedings, 2021)
Conference lecture
Open Access
We present NextiaJD, a data discovery system with high predictive performance and computational efficiency. NextiaJD aids data scientists in the discovery of datasets that can be crossed. To that end, it proposes a ranking ...
DocDesign 2.0: Automated database design for document stores with multi-criteria optimization
(OpenProceedings, 2021)
Conference lecture
Open Access
We present DocDesign 2.0, a novel system that supports database design for document stores. DocDesign 2.0 automatically generates a document store design driven by a query workload and a set of optimization objectives. In ...
Towards scalable data discovery
(OpenProceedings, 2021)
Conference lecture
Open Access
We study the problem of discovering joinable datasets at scale. We approach the problem from a learning perspective relying on profiles. These are succinct representations that capture the underlying characteristics of the ...
A framework for assessing the peer review duration of journals: case study in computer science
(Springer Nature, 2021-01)
Article
Open Access
In various fields, scientific article publication is a measure of productivity, and on many occasions it is used as a critical factor for evaluating researchers. Therefore, a lot of time is dedicated to writing articles ...
Configuring parallelism for hybrid layouts using multi-objective optimization
(2020-06-01)
Article
Open Access
Modern organizations typically store their data in a raw format in data lakes. These data are then processed and usually stored under hybrid layouts, because they allow projection and selection operations. Thus, they allow ...
On the performance impact of using JSON, beyond impedance mismatch
(Springer, 2020)
Conference report
Open Access
NOSQL database management systems adopt semi-structured data models, such as JSON, to easily accommodate schema evolution and overcome the overhead generated from transforming internal structures to tabular data (i.e., ...
Quarry: A user-centered big data integration platform
(2021-02)
Article
Open Access
Obtaining valuable insights and actionable knowledge from data requires cross-analysis of domain data typically coming from various sources. Doing so inevitably imposes burdensome processes of unifying different data ...
TopoGraph: an end-to-end framework to build and analyze graph cubes
(2021-02)
Article
Open Access
Graphs are a fundamental structure that provides an intuitive abstraction for modeling and analyzing complex and highly interconnected data. Given the potential complexity of such data, some approaches proposed extending ...
Keeping the data lake in form: proximity mining for pre-filtering schema matching
(2020-05)
Article
Open Access
Data Lakes (DLs) are large repositories of raw datasets from disparate sources. As more datasets are ingested into a DL, there is an increasing need for efficient techniques to profile them and to detect the relationships ...
Multidimensional integration of RDF datasets
(Springer, 2019)
Conference lecture
Open Access
Data providers have been uploading RDF datasets on the web to aid researchers and analysts in finding insights. These datasets, made available by different data providers, contain common characteristics that enable their ...
XLIndy: interactive recognition and information extraction in spreadsheets
(Association for Computing Machinery (ACM), 2019)
Conference report
Open Access
Over the years, spreadsheets have established their presence in many domains, including business, government, and science. However, challenges arise due to spreadsheets being partially-structured and carrying implicit ...