An investigation into new kernels for categorical variables
Cite as:
hdl:2099.1/17172
Document type: Official Master's Final Project
Date: 2013-01
Access conditions: Open access
Except where otherwise noted, the contents of this work are licensed under
a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain
Abstract
Kernel-based methods first appeared in the form of support vector
machines. Since the first Support Vector Machine (SVM) formulation in
1995, we have seen how the number of proposed kernel
functions has quickly grown, and how these kernels have approached a
wide range of problems and domains. The most common and direct
applications of these methods focus on continuous numeric data,
given that training an SVM ultimately involves solving an optimization
problem. Additionally, some kernel functions have been oriented to more
symbolic data, in problems like text analysis or handwritten digit
recognition. But surprisingly, there is a gap in the area of kernel
functions devoted to handling datasets with qualitative variables. One of
the most common practices to overcome this gap consists of recoding the
source qualitative information to make it suitable for numeric
kernel functions.
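The recoding practice mentioned above can be illustrated with a minimal sketch. It assumes indicator (one-hot) coding followed by a Gaussian (RBF) kernel, which is one common choice and not necessarily the baseline used in the thesis; the toy variables, the helper names and the value of `gamma` are hypothetical.

```python
import math

def one_hot(value, categories):
    """Recode one categorical value as a 0/1 indicator vector."""
    return [1.0 if value == c else 0.0 for c in categories]

def encode(row, categories_per_variable):
    """Concatenate the indicator vectors of every variable in a row."""
    vec = []
    for value, categories in zip(row, categories_per_variable):
        vec.extend(one_hot(value, categories))
    return vec

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel on numeric vectors; gamma is an arbitrary choice."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

# Hypothetical toy data: two categorical variables (colour, size).
categories_per_variable = [["red", "green", "blue"], ["small", "large"]]
x = encode(("red", "small"), categories_per_variable)
y = encode(("blue", "small"), categories_per_variable)
k_xy = rbf_kernel(x, y)
```

With this coding, two rows that differ in exactly one variable are always at squared distance 2, regardless of how frequent or rare the differing categories are, so the numeric kernel cannot use that information.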
This thesis presents the development of new kernel functions that can
better model symbolic information presented as categorical variables,
directly and without the need for data preprocessing. The
proposal is based on the use of probabilistic information (the probability
mass distribution) to compare the different modalities of a variable.
Additionally, the idea is formulated through a modular scheme that combines
a set of components to obtain the kernel functions, which makes it easier
to modify and extend individual components.
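The abstract does not state the kernel's formula, so the following is only a sketch of how a probability-based, modular kernel of this kind could look. Everything in it is an assumption made for illustration: the weighting function `inverting`, the rule that different modalities score zero, the aggregation by averaging, and the toy data. It should not be read as the definitions used in the thesis.

```python
from collections import Counter

def estimate_pmf(column):
    """Estimate the probability mass function of one categorical variable."""
    counts = Counter(column)
    total = len(column)
    return {modality: n / total for modality, n in counts.items()}

def inverting(p, alpha=1.0):
    """Assumed weighting h(p) = (1 - p**alpha)**(1/alpha): rarer modalities weigh more."""
    return (1.0 - p ** alpha) ** (1.0 / alpha)

def univariate_kernel(a, b, pmf, alpha=1.0):
    """Component 1 (assumed): compare two modalities of a single variable."""
    if a != b:
        return 0.0
    # Modalities never seen when estimating the pmf are treated as probability 0.
    return inverting(pmf.get(a, 0.0), alpha)

def categorical_kernel(x, y, pmfs, alpha=1.0):
    """Component 2 (assumed): aggregate per-variable similarities by averaging."""
    scores = [univariate_kernel(a, b, pmf, alpha) for a, b, pmf in zip(x, y, pmfs)]
    return sum(scores) / len(scores)

# Hypothetical toy dataset with two categorical variables (colour, size).
data = [("red", "small"), ("red", "large"), ("red", "small"), ("blue", "small")]
pmfs = [estimate_pmf(column) for column in zip(*data)]

k_common = categorical_kernel(("red", "small"), ("red", "small"), pmfs)
k_rare = categorical_kernel(("blue", "small"), ("blue", "small"), pmfs)
k_diff = categorical_kernel(("red", "small"), ("blue", "large"), pmfs)
```

In this sketch, a match on the rare modality `blue` contributes more than a match on the common modality `red`, so `k_rare` is larger than `k_common`. Because each piece is a separate function, the weighting or the aggregation can be swapped without touching the rest, which is the kind of modularity the abstract describes.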
The experimental results suggest a slight improvement in classification
accuracy over traditional kernel functions. This improvement is clearer on
datasets with known probabilistic structure.
Subjects: Support vector machines, Kernel functions, Computer algorithms
Degree: MÀSTER UNIVERSITARI EN COMPUTACIÓ (Pla 2006)
Files | Size |
---|---|
MarcoVillegas.pdf | 635,5Kb |