A survey on pre-processing techniques: relevant issues in the context of environmental data mining

Gibert, Karina; Sànchez-Marrè, Miquel; Izquierdo, Joaquín

doi:10.3233/AIC-160710

Document type: Article

Publication date: 2016-12

Publisher: IOS Press

Access conditions: Open access

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication or transformation is prohibited without the authorization of the rights holder.

Abstract

One of the important issues related to all types of data analysis, whether statistical data analysis, machine learning, data mining, data science or any other form of data-driven modeling, is data quality. The more complex the reality to be analyzed, the higher the risk of obtaining low-quality data. Unfortunately, real data often contain noise, uncertainty, errors, redundancies or even irrelevant information. Models built on incorrect or incomplete data will be useless. As a consequence, the quality of decisions based on these models also depends on data quality. This is why pre-processing is one of the most critical steps of data analysis in any of its forms. However, pre-processing has not yet been properly systematized, and little research focuses on it. This paper presents a survey of the most popular pre-processing steps required in environmental data analysis, together with a proposal to systematize them. Rather than providing technical details on specific pre-processing techniques, the paper focuses on providing general ideas to non-expert users who, after reading them, can decide which technique is most suitable for solving their problem.

Citation: Gibert, Karina, Sànchez-Marrè, M., Izquierdo, J. A survey on pre-processing techniques: relevant issues in the context of environmental data mining. "AI communications: the european journal of artificial intelligence", December 2016, vol. 29, no. 6, p. 627-663.

URI: http://hdl.handle.net/2117/123530

DOI: 10.3233/AIC-160710

ISSN: 0921-7126

Publisher's version: http://content.iospress.com/articles/ai-communications/aic710

Files	Description	Size	Format	View
AIC710def.pdf		876.6 Kb	PDF	View/Open
