dc.contributor.author | Bilalli, Besim |
dc.contributor.author | Abelló Gamazo, Alberto |
dc.contributor.author | Aluja Banet, Tomàs |
dc.contributor.author | Wrembel, Robert |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament d'Estadística i Investigació Operativa |
dc.date.accessioned | 2018-01-26T09:24:30Z |
dc.date.available | 2019-06-04T02:30:46Z |
dc.date.issued | 2017-06-03 |
dc.identifier.citation | Bilalli, B., Abello, A., Aluja, T., Wrembel, R. Intelligent assistance for data pre-processing. "Computer standards & interfaces", 3 June 2017, vol. 57, p. 101-109. |
dc.identifier.issn | 0920-5489 |
dc.identifier.uri | http://hdl.handle.net/2117/113239 |
dc.description.abstract | A data mining algorithm may perform differently on datasets with different characteristics, e.g., it might perform better on a dataset with continuous attributes rather than with categorical attributes, or the other way around. Typically, a dataset needs to be pre-processed before being mined. Taking into account all the possible pre-processing operators, there exists a staggeringly large number of alternatives. As a consequence, non-experienced users become overwhelmed with pre-processing alternatives. In this paper, we show that the problem can be addressed by automating the pre-processing with the support of meta-learning. To this end, we analyzed a wide range of data pre-processing techniques and a set of classification algorithms. For each classification algorithm that we consider and a given dataset, we are able to automatically suggest the transformations that improve the quality of the results of the algorithm on the dataset. Our approach will help non-expert users to more effectively identify the transformations appropriate to their applications, and hence to achieve improved results. |
dc.format.extent | 9 p. |
dc.language.iso | eng |
dc.publisher | Elsevier |
dc.subject | Àrees temàtiques de la UPC::Informàtica::Enginyeria del software |
dc.subject.lcsh | Data mining |
dc.subject.other | Data pre-processing; Data mining; Meta-learning |
dc.title | Intelligent assistance for data pre-processing |
dc.type | Article |
dc.subject.lemac | Mineria de dades |
dc.contributor.group | Universitat Politècnica de Catalunya. inSSIDE - integrated Software, Service, Information and Data Engineering |
dc.contributor.group | Universitat Politècnica de Catalunya. LIAM - Laboratori de Modelització i Anàlisi de la Informació |
dc.identifier.doi | 10.1016/j.csi.2017.05.004 |
dc.relation.publisherversion | https://www.sciencedirect.com/science/article/pii/S0920548916302306?via%3Dihub |
dc.rights.access | Open Access |
local.identifier.drac | 21872723 |
dc.description.version | Postprint (author's final draft) |
local.citation.author | Bilalli, B.; Abello, A.; Aluja, T.; Wrembel, R. |
local.citation.publicationName | Computer standards & interfaces |
local.citation.volume | 57 |
local.citation.startingPage | 101 |
local.citation.endingPage | 109 |