dc.contributor.author: Al-serafi, Ayman Mounir Mohamed
dc.contributor.author: Abelló Gamazo, Alberto
dc.contributor.author: Romero Moral, Óscar
dc.contributor.author: Calders, Toon
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació
dc.date.accessioned: 2017-02-08T16:20:42Z
dc.date.available: 2017-02-08T16:20:42Z
dc.date.issued: 2016
dc.identifier.citation: Al-serafi, A., Abello, A., Romero, O., Calders, T. Towards information profiling: data lake content metadata management. In: Workshop on Data Integration and Applications. "ICDMW 2016: IEEE 16th International Conference on Data Mining Workshops, 12-15 December 2016, Barcelona, Catalonia, Spain". Barcelona: Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 178-185.
dc.identifier.isbn: 978-1-5090-5472-5
dc.identifier.uri: http://hdl.handle.net/2117/100712
dc.description.abstract: There is currently a burst of Big Data (BD) processed and stored in huge raw data repositories, commonly called Data Lakes (DL). These BD require new techniques of data integration and schema alignment in order to make the data usable by its consumers and to discover the relationships linking their content. This can be provided by metadata services which discover and describe their content. However, there is currently a lack of a systematic approach for such kind of metadata discovery and management. Thus, we propose a framework for the profiling of informational content stored in the DL, which we call information profiling. The profiles are stored as metadata to support data analysis. We formally define a metadata management process which identifies the key activities required to effectively handle this. We demonstrate the alternative techniques and performance of our process using a prototype implementation handling a real-life case-study from the OpenML DL, which showcases the value and feasibility of our approach.
dc.format.extent: 8 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: Àrees temàtiques de la UPC::Informàtica::Enginyeria del software
dc.subject.lcsh: Metadata
dc.subject.lcsh: Content management
dc.subject.other: Metadata
dc.subject.other: Information profiling
dc.subject.other: Data lake
dc.subject.other: Content management
dc.title: Towards information profiling: data lake content metadata management
dc.type: Conference report
dc.subject.lemac: Metadades
dc.subject.lemac: Tecnologia de la informació -- Gestió
dc.contributor.group: Universitat Politècnica de Catalunya. MPI - Modelització i Processament de la Informació
dc.identifier.doi: 10.1109/ICDMW.2016.0033
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://ieeexplore.ieee.org/document/7836664/
dc.rights.access: Open Access
local.identifier.drac: 19680707
dc.description.version: Postprint (author's final draft)
local.citation.author: Al-serafi, A.; Abello, A.; Romero, O.; Calders, T.
local.citation.contributor: Workshop on Data Integration and Applications
local.citation.pubplace: Barcelona
local.citation.publicationName: ICDMW 2016: IEEE 16th International Conference on Data Mining Workshops, 12-15 December 2016, Barcelona, Catalonia, Spain
local.citation.startingPage: 178
local.citation.endingPage: 185
