Designing mixture of deep experts
Cite as: hdl:2117/115295
Carried out at/with: Université Laval
Document type: Official master's degree final project
Date: 2017-03-07
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication or transformation without the authorization of the rights holder is prohibited.
Abstract
Mixture of Experts (MoE) is a classical ensemble architecture in which each
member is specialised in a given part of the input space, its area of expertise.
Working in this manner, we aim to specialise the experts on smaller problems,
solving the original problem through a divide-and-conquer approach.
The goal of our research is first to reproduce the work of Collobert et
al. [1] (2002) and then to extend it by using neural networks as experts
on different datasets. Specialised representations are learned over different
aspects of the problem, and the outputs of the different members are merged
according to their specific expertise. This expertise can itself be learned by a
network acting as a gating function.
The MoE architecture is composed of N expert networks. These experts are combined
via a gating network, which partitions the input space accordingly, a
divide-and-conquer strategy supervised by the gating network. Using a specialised
cost function, each expert specialises in its sub-space. Exploiting the discriminative
power of the experts in this way works much better than simply clustering the data.
The gating network, in turn, needs to learn how to assign examples to the different specialists.
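A minimal sketch of this soft-gated combination is shown below in PyTorch; the layer sizes, activations, and the softmax gate are illustrative assumptions, not the exact configuration used in the thesis.

```python
import torch
import torch.nn as nn

class MixtureOfExperts(nn.Module):
    """Soft mixture of experts: output = sum_i gate_i(x) * expert_i(x)."""

    def __init__(self, in_dim, hidden_dim, out_dim, num_experts):
        super().__init__()
        # Each expert is a small MLP that can specialise on part of the input space.
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(in_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, out_dim),
            )
            for _ in range(num_experts)
        ])
        # The gating network produces one softmax weight per expert.
        self.gate = nn.Sequential(
            nn.Linear(in_dim, num_experts),
            nn.Softmax(dim=-1),
        )

    def forward(self, x):
        weights = self.gate(x)                                   # (batch, num_experts)
        outputs = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, num_experts, out_dim)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)      # (batch, out_dim)


# Example: 4 experts classifying 784-dimensional inputs into 10 classes.
model = MixtureOfExperts(in_dim=784, hidden_dim=64, out_dim=10, num_experts=4)
logits = model(torch.randn(32, 784))  # shape: (32, 10)
```

Trained end-to-end with an ordinary classification loss, the gate learns which expert to trust on which region of the input, which is the divide-and-conquer behaviour described above.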
Such models show promise for building larger networks that are still cheap to
compute at test time and more parallelizable at training time. We were able to
reproduce the authors' work and implemented a multi-class gater to classify
images.
We know that neural networks perform best with lots of data. However,
some of our experiments require us to divide the dataset and train multiple neural
networks. We observe that, in this data-deprived condition, our MoE is almost on
par with, and competes with, ensembles trained on the complete data.
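One way to read "dividing the dataset" is a hard-assignment step in which each example is routed to the expert the gate ranks highest and the experts are then retrained on their own subsets. The helper below is a hypothetical sketch of that step, reusing the `MixtureOfExperts` class and `model` from the earlier snippet; it is not the exact reassignment schedule used in the thesis or in Collobert et al. [1].

```python
import torch

def partition_by_gate(model, x):
    """Hard assignment: route every example to its highest-scoring expert.

    Returns one tensor of examples per expert, so that each expert can be
    retrained on its own subset. Illustrative only; the actual schedule in
    the thesis may differ.
    """
    with torch.no_grad():
        assignment = model.gate(x).argmax(dim=-1)  # (batch,) chosen expert per example
    return [x[assignment == i] for i in range(len(model.experts))]

subsets = partition_by_gate(model, torch.randn(256, 784))
print([len(s) for s in subsets])  # number of examples each expert would receive
```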
Keywords: Machine Learning, Multi-Layer Perceptrons, Mixture of Experts, Support
Vector Machines, Divide and Conquer, Stochastic Gradient Descent, Optimization.
Degree: Master's Degree in Innovation and Research in Informatics (2012 curriculum)
Files | Description | Size | Format |
---|---|---|---|
127266.pdf | | 2.587 MB | PDF |