dc.contributor.author: Agostini, Alejandro Gabriel
dc.contributor.author: Celaya Llover, Enric
dc.contributor.other: Institut de Robòtica i Informàtica Industrial
dc.date.accessioned: 2011-11-30T14:25:06Z
dc.date.available: 2011-11-30T14:25:06Z
dc.date.issued: 2011
dc.identifier.uri: http://hdl.handle.net/2117/14123
dc.description.abstract: In this work we propose an approach for generalization in continuous domain Reinforcement Learning that, instead of using a single function approximator, tries many different function approximators in parallel, each one defined in a different region of the domain. Associated with each approximator is a relevance function that locally quantifies the quality of its approximation, so that, at each input point, the approximator with highest relevance can be selected. The relevance function is defined using parametric estimations of the variance of the q-values and the density of samples in the input space, which are used to quantify the accuracy and the confidence in the approximation, respectively. These parametric estimations are obtained from a probability density distribution represented as a Gaussian Mixture Model embedded in the input-output space of each approximator. In our experiments, the proposed approach required a lesser number of experiences for learning and produced more stable convergence profiles than when using a single function approximator.
dc.format.extent: 6 p.
dc.language.iso: eng
dc.subject: UPC subject areas::Computer science::Artificial intelligence
dc.subject.lcsh: Q-Learning
dc.subject.lcsh: Reinforcement learning
dc.subject.other: generalisation (artificial intelligence); learning (artificial intelligence); author keyword: reinforcement learning
dc.title: A competitive strategy for function approximation in Q-learning
dc.type: Conference report
dc.subject.lemac: Learning -- Techniques
dc.contributor.group: Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI
dc.contributor.group: Universitat Politècnica de Catalunya. VIS - Visió Artificial i Sistemes Intel·ligents
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://ijcai.org/papers11/Papers/IJCAI11-196.pdf
dc.rights.access: Open Access
drac.iddocument: 5961220
dc.description.version: Preprint

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder.