    • An alternative view on data processing pipelines from the DOLAP 2019 perspective 

      Romero Moral, Óscar; Wrembel, Robert; Song, Il-Yeol (Elsevier, 2020-09)
      Article
      Open Access
      Data science requires constructing data processing pipelines (DPPs), which span diverse phases such as data integration, cleaning, pre-processing, and analysis. However, current solutions lack a strong data engineering ...
    • Automated data pre-processing via meta-learning 

      Bilalli, Besim; Abelló Gamazo, Alberto; Aluja Banet, Tomàs; Wrembel, Robert (2016)
      Conference report
      Open Access
      A data mining algorithm may perform differently on datasets with different characteristics, e.g., it might perform better on a dataset with continuous attributes rather than with categorical attributes, or the other way ...
    • Data engineering for data science: two sides of the same coin 

      Romero Moral, Óscar; Wrembel, Robert (Springer, 2020)
      Conference report
      Open Access
      A de facto technological standard of data science is based on notebooks (e.g., Jupyter), which provide an integrated environment to execute data workflows in different languages. However, from a data engineering point of ...
    • Intelligent assistance for data pre-processing 

      Bilalli, Besim; Abelló Gamazo, Alberto; Aluja Banet, Tomàs; Wrembel, Robert (Elsevier, 2017-06-03)
      Article
      Open Access
      A data mining algorithm may perform differently on datasets with different characteristics, e.g., it might perform better on a dataset with continuous attributes rather than with categorical attributes, or the other way ...
    • moduli: A disaggregated data management architecture for data-intensive workflows 

      Ceravolo, Paolo; Catarci, Tiziana; Console, Marco; Cudré-Mauroux, Philippe; Groppe, Sven; Hose, Katja; Pokorný, Jaroslav; Romero Moral, Óscar; Wrembel, Robert (2024-02)
      Article
      Open Access
      As companies store, process, and analyse bigger and bigger volumes of highly heterogeneous data, novel research and technological challenges are emerging. Traditional and rigid data integration and processing techniques ...
    • PRESISTANT: data pre-processing assistant

      Bilalli, Besim; Abelló Gamazo, Alberto; Aluja Banet, Tomàs; Munir, Rana Faisal; Wrembel, Robert (Springer, 2019)
      Conference lecture
      Open Access
      A concrete classification algorithm may perform differently on datasets with different characteristics, e.g., it might perform better on a dataset with continuous attributes rather than with categorical attributes, or the ...
    • PRESISTANT: Learning based assistant for data pre-processing 

      Bilalli, Besim; Abelló Gamazo, Alberto; Aluja Banet, Tomàs; Wrembel, Robert (Elsevier, 2019-09)
      Article
      Open Access
      Data pre-processing is one of the most time-consuming and relevant steps in a data analysis process (e.g., classification task). A given data pre-processing operator can have positive, negative, or zero impact on the final ...
    • Towards intelligent data analysis: the metadata challenge

      Bilalli, Besim; Abelló Gamazo, Alberto; Aluja Banet, Tomàs; Wrembel, Robert (2016)
      Conference lecture
      Restricted access - publisher's policy
      Once analyzed correctly, data can yield substantial benefits. The process of analyzing the data and transforming it into knowledge is known as Knowledge Discovery in Databases (KDD). The plethora and subtleties of algorithms ...