An investigation into new kernels for categorical variables
dc.contributor | Belanche Muñoz, Luis Antonio |
dc.contributor.author | Villegas García, Marco Antonio |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics |
dc.date.accessioned | 2013-02-11T13:49:07Z |
dc.date.available | 2013-02-11T13:49:07Z |
dc.date.issued | 2013-01 |
dc.identifier.uri | http://hdl.handle.net/2099.1/17172 |
dc.description.abstract | Kernel-based methods first appeared in the form of support vector machines. Since the first Support Vector Machine (SVM) formulation in 1995, the number of proposed kernel functions has grown quickly, and these kernels have been applied to a wide range of problems and domains. The most common and direct applications of these methods focus on continuous numeric data, given that SVMs ultimately involve the solution of an optimization problem. Additionally, some kernel functions have been oriented to more symbolic data, in problems like text analysis or handwritten digit recognition. Surprisingly, there is a gap in the area of kernel functions devoted to handling datasets with qualitative variables. One of the most common practices to overcome this gap consists of recoding the source qualitative information to make it suitable for numeric kernel functions. This thesis presents the development of new kernel functions that can better model symbolic information presented as categorical variables, directly and without the need for data preprocessing methods. The proposal is based on the use of probabilistic information (the probability mass distribution) to compare the different modalities of a variable. Additionally, the idea is formulated through a modular scheme that combines a set of components to obtain the kernel functions, which facilitates the modification and extension of single components. The experimental results suggest a slight improvement in classification accuracy with respect to traditional kernel functions. This improvement is clearer on datasets with known probabilistic structure. |
dc.language.iso | eng |
dc.publisher | Universitat Politècnica de Catalunya |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Spain |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ |
dc.subject | UPC subject areas::Computer science::Theoretical computer science::Algorithmics and complexity theory |
dc.subject.lcsh | Support vector machines |
dc.subject.lcsh | Kernel functions |
dc.subject.lcsh | Computer algorithms |
dc.title | An investigation into new kernels for categorical variables |
dc.type | Master thesis |
dc.subject.lemac | Kernel functions |
dc.subject.lemac | Computer algorithms |
dc.rights.access | Open Access |
dc.audience.educationlevel | Master's |
dc.audience.mediator | Facultat d'Informàtica de Barcelona |
dc.audience.degree | Master's degree in Computing (2006 curriculum) |
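
The abstract describes comparing the modalities of a categorical variable through their probability mass, but this record does not give the thesis's formulas. The following is therefore only a minimal sketch of one possible instantiation, not the author's method. It assumes that two equal modalities score higher when the modality is rare, that unequal modalities score zero, and that the per-variable scores are averaged. The function names, the `inverting` function and the `alpha` parameter are hypothetical and chosen for illustration only.

```python
from collections import Counter


def modality_probabilities(column):
    """Estimate the probability mass function of one categorical variable."""
    counts = Counter(column)
    total = len(column)
    return {value: n / total for value, n in counts.items()}


def inverting(p, alpha=1.0):
    """Hypothetical weighting (an assumption): rarer modalities weigh more."""
    return (1.0 - p ** alpha) ** (1.0 / alpha)


def univariate_kernel(x, y, pmf, alpha=1.0):
    """Score a match by the rarity of the modality; a mismatch scores zero."""
    if x != y:
        return 0.0
    return inverting(pmf[x], alpha)


def categorical_kernel_matrix(rows, alpha=1.0):
    """Gram matrix: mean of the univariate kernels over all variables."""
    columns = list(zip(*rows))
    pmfs = [modality_probabilities(col) for col in columns]
    n = len(rows)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            values = [
                univariate_kernel(a, b, pmf, alpha)
                for a, b, pmf in zip(rows[i], rows[j], pmfs)
            ]
            gram[i][j] = gram[j][i] = sum(values) / len(values)
    return gram


# Toy data (made up for illustration): two categorical variables, no recoding.
rows = [
    ("red", "small"),
    ("red", "large"),
    ("red", "small"),
    ("blue", "small"),
]
gram = categorical_kernel_matrix(rows)
```

Because each univariate kernel is non-negative on matches and zero elsewhere, the resulting matrix is symmetric and positive semidefinite. It could in principle be passed to an SVM implementation that accepts precomputed kernels, such as scikit-learn's `SVC(kernel="precomputed")`.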