dc.contributor.author: Gibert, Karina
dc.contributor.author: Sànchez-Marrè, Miquel
dc.contributor.author: Izquierdo, Joaquín
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Estadística i Investigació Operativa
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
dc.date.accessioned: 2018-11-05T09:16:13Z
dc.date.available: 2018-11-05T09:16:13Z
dc.date.issued: 2016-12
dc.identifier.citation: Gibert, Karina, Sànchez-Marrè, M., Izquierdo, J. A survey on pre-processing techniques: relevant issues in the context of environmental data mining. "AI communications: the european journal of artificial intelligence", December 2016, vol. 29, no. 6, p. 627-663.
dc.identifier.issn: 0921-7126
dc.identifier.uri: http://hdl.handle.net/2117/123530
dc.description.abstract: Data quality is an important issue in every type of data analysis, whether statistical data analysis, machine learning, data mining, data science, or any other form of data-driven modeling. The more complex the reality to be analyzed, the higher the risk of obtaining low-quality data. Unfortunately, real data often contain noise, uncertainty, errors, redundancies, or even irrelevant information. Models built on incorrect or incomplete data will be useless. As a consequence, the quality of decisions based on these models also depends on data quality. This is why pre-processing is one of the most critical steps of data analysis in any of its forms. However, pre-processing has not yet been properly systematized, and little research focuses on it. This paper presents a survey of the most popular pre-processing steps required in environmental data analysis, together with a proposal to systematize them. Rather than providing technical details on specific pre-processing techniques, the paper focuses on providing general ideas to non-expert users, who, after reading them, can decide which technique is most suitable for solving their problem.
dc.format.extent: 37 p.
dc.language.iso: eng
dc.publisher: IOS Press
dc.subject: Àrees temàtiques de la UPC::Matemàtiques i estadística::Matemàtica aplicada a les ciències
dc.subject: Àrees temàtiques de la UPC::Matemàtiques i estadística::Estadística matemàtica
dc.subject: Àrees temàtiques de la UPC::Matemàtiques i estadística::Anàlisi numèrica
dc.subject.lcsh: Artificial intelligence
dc.subject.lcsh: Survival analysis (Biometry)
dc.subject.lcsh: Numerical analysis--Simulation methods
dc.subject.other: Pre-processing
dc.subject.other: data quality
dc.subject.other: data mining
dc.subject.other: knowledge discovery from databases
dc.subject.other: multidisciplinary approach
dc.subject.other: environmental systems
dc.title: A survey on pre-processing techniques: relevant issues in the context of environmental data mining
dc.type: Article
dc.subject.lemac: Intel·ligència artificial
dc.subject.lemac: Anàlisi de supervivència (Biometria)
dc.subject.lemac: Anàlisi numèrica
dc.contributor.group: Universitat Politècnica de Catalunya. KEMLG - Grup d'Enginyeria del Coneixement i Aprenentatge Automàtic
dc.identifier.doi: 10.3233/AIC-160710
dc.description.peerreviewed: Peer Reviewed
dc.subject.ams: Classificació AMS::68 Computer science::68T Artificial intelligence
dc.subject.ams: Classificació AMS::62 Statistics::62N Survival analysis and censored data
dc.subject.ams: Classificació AMS::65 Numerical analysis::65C Probabilistic methods, simulation and stochastic differential equations
dc.relation.publisherversion: http://content.iospress.com/articles/ai-communications/aic710
dc.rights.access: Open Access
local.identifier.drac: 19259942
dc.description.version: Postprint (author's final draft)
local.citation.author: Gibert, Karina; Sànchez-Marrè, M.; Izquierdo, J.
local.citation.publicationName: AI communications: the european journal of artificial intelligence
local.citation.volume: 29
local.citation.number: 6
local.citation.startingPage: 627
local.citation.endingPage: 663

