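Anàlisi i millora de l'algorisme Las Vegas Filter per a la selecció de variables en models predictius (Analysis and improvement of the Las Vegas Filter algorithm for feature selection in predictive models)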

Martínez Escobar, Rubén

Tutor / director: Belanche Muñoz, Luis Antonio

Document type: Bachelor's thesis (Treball Final de Grau)

Date: 2021-01-19

Access conditions: Open access

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication or transformation is prohibited without the authorisation of the rights holder.

Abstract

Since prehistoric times, human beings have been set apart from other animals by their rationality, which drives the species to seek knowledge. This unique behaviour has been the main catalyst of our evolution, an evolution marked by milestones that changed us forever: the control of fire, writing, light and many other human inventions. Today we face another discovery that is having a great impact on the whole population: data science. This new paradigm, which at first sight may seem less significant, is completely changing our pursuit of knowledge and giving us an unprecedented capacity for analysis. This project, carried out at the Universitat Politècnica de Catalunya, deals with the problem of feature selection for building predictive models. It is one of the most relevant problems in data science, and one that is becoming increasingly important with the rise of Big Data. We study it with an unusual approach: we start from a probabilistic algorithm and try to improve its performance through a series of optimisations. The algorithm to be optimised is the Las Vegas Filter (LVF), which is mainly used as a filter method and relies on an evaluation measure called inconsistency. Over the course of the project, optimisations already proposed by other authors are studied and several new optimisations are proposed. Because of its probabilistic nature, the algorithm has a very high level of randomness, which in some respects worsens its performance. The optimisations proposed in this project aim to solve this problem while preserving the benefits that this randomness brings.
We mainly treat the LVF as a filter method. However, to address the loss of accuracy in predictive models built with the feature subsets that the LVF returns, we pioneer a hybrid-method approach to the LVF, which also requires us to study previously proposed optimisations that use the LVF as a wrapper method.
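
The abstract names the ingredients of the LVF (random subset generation, a filter setting, and the inconsistency measure) without spelling them out. The following is a minimal sketch of the basic LVF loop as it is commonly described in the feature selection literature, not the optimised or hybrid variants developed in the thesis. The function names, the `max_tries` and `threshold` parameters, and the toy data are illustrative assumptions, and the features are assumed to be discrete.

```python
import random
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Fraction of rows that disagree with the majority class of their group.

    Rows are grouped by their values on the selected features; inside each
    group, every row outside the majority class counts as inconsistent.
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        groups[key][label] += 1
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


def lvf(rows, labels, max_tries=1000, threshold=None, seed=0):
    """Basic Las Vegas Filter sketch (hypothetical helper, not the thesis code).

    Repeatedly draws a random feature subset smaller than the best one found
    so far and keeps it if its inconsistency rate stays within the threshold.
    """
    rng = random.Random(seed)
    all_features = list(range(len(rows[0])))
    if threshold is None:
        # Assumption: tolerate no more inconsistency than the full feature set.
        threshold = inconsistency_rate(rows, labels, all_features)
    best = all_features
    for _ in range(max_tries):
        if len(best) == 1:
            break  # cannot shrink any further
        size = rng.randint(1, len(best) - 1)
        candidate = sorted(rng.sample(all_features, size))
        if inconsistency_rate(rows, labels, candidate) <= threshold:
            best = candidate
    return best


# Toy data: the label equals feature 0; features 1 and 2 are noise.
rows = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
labels = [0, 0, 1, 1]
selected = lvf(rows, labels)
```

Because candidates are drawn blindly, many tries are spent on subsets that cannot be accepted, which is the kind of randomness cost the abstract refers to.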

Subjects: Data mining, Machine learning (Mineria de dades, Aprenentatge automàtic)

Degree: GRAU EN ENGINYERIA INFORMÀTICA (Pla 2010)

URI: http://hdl.handle.net/2117/343792

Collections

Facultat d'Informàtica de Barcelona - Grau en Enginyeria Informàtica (Pla 2010)

Files	Description	Size	Format
156493.pdf		2.851 MB	PDF

UPCommons. Open knowledge portal of the UPC
