Use this identifier to cite or link to this item: http://hdl.handle.net/2117/10368

Item not available in open access due to publisher policy

File: draccelaya.pdf
Size: 563.18 kB
Format: Adobe PDF (restricted access)

Citation: Agostini, A. G.; Celaya, E. Reinforcement learning for robot control using probability density estimations. In: International Conference on Informatics in Control, Automation and Robotics. "7th International Conference on Informatics in Control, Automation and Robotics". Funchal: INSTICC Press. Institute for Systems and Technologies of Information, Control and Communication, 2010, pp. 160-168.
Title: Reinforcement learning for robot control using probability density estimations
Authors: Agostini, Alejandro Gabriel; Celaya Llover, Enric
Publisher: INSTICC Press. Institute for Systems and Technologies of Information, Control and Communication
Date: 2010
Document type: Conference report
Abstract: The successful application of Reinforcement Learning (RL) techniques to robot control is limited by the fact that, in most robotic tasks, the state and action spaces are continuous, multidimensional, and, in essence, too large for conventional RL algorithms to work with. The well-known curse of dimensionality makes it infeasible to use a tabular representation of the value function, which is the classical approach providing convergence guarantees. When a function approximation technique is used to generalize among similar states, the convergence of the algorithm is compromised, since updates unavoidably affect an extended region of the domain; that is, some situations are modified in a way that has not actually been experienced, and the update may degrade the approximation. We propose an RL algorithm that uses a probability density estimation in the joint space of states, actions, and Q-values as a means of function approximation. This allows us to devise an updating approach that, by taking the local sampling density into account, avoids excessive modification of the approximation far from the observed sample. (A minimal illustrative sketch of this density-based Q-value estimate follows the record below.)
URI: http://hdl.handle.net/2117/10368
Publisher's version: http://www.icinco.org/Abstracts/2010/ICINCO_2010_Abstracts.htm
Appears in collections:
VIS - Visió Artificial i Sistemes Intel.ligents. Ponències/Comunicacions de congressos
Institut de Robòtica i Informàtica Industrial, CSIC-UPC. Ponències/Comunicacions de congressos
Altres. Enviament des de DRAC
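
The abstract above describes representing the Q-function through a joint density over states, actions, and Q-values. As a rough illustration of that idea (not the authors' implementation), the following Python sketch fits a Gaussian mixture to samples (s, a, q) and estimates Q(s, a) as the conditional expectation E[q | s, a] under the fitted mixture; the choice of mixture model, the number of components, and the toy task are all assumptions made here for illustration.

import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def fit_joint_density(samples, n_components=5, seed=0):
    # Fit a Gaussian mixture to rows of the form [state..., action..., q].
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full", random_state=seed)
    gmm.fit(samples)
    return gmm

def q_value(gmm, sa, n_sa):
    # Q(s, a) as the conditional mean of q given the first n_sa
    # dimensions (the state-action part) of the joint density.
    resp, cond_means = [], []
    for w, mu, cov in zip(gmm.weights_, gmm.means_, gmm.covariances_):
        mu_x, mu_q = mu[:n_sa], mu[n_sa:]
        Sxx = cov[:n_sa, :n_sa]
        Sqx = cov[n_sa:, :n_sa]
        # Responsibility of this component for the query point (s, a).
        resp.append(w * multivariate_normal.pdf(sa, mean=mu_x, cov=Sxx))
        # Conditional mean of q under this component.
        cond_means.append(mu_q + Sqx @ np.linalg.solve(Sxx, sa - mu_x))
    resp = np.asarray(resp)
    resp = resp / resp.sum()
    return float(resp @ np.asarray(cond_means).ravel())

# Toy usage: 1-D state, 1-D action, q = -(s - a)^2 plus noise.
rng = np.random.default_rng(0)
s = rng.uniform(-1.0, 1.0, 500)
a = rng.uniform(-1.0, 1.0, 500)
q = -(s - a) ** 2 + 0.05 * rng.normal(size=500)
model = fit_joint_density(np.column_stack([s, a, q]), n_components=5)
print(q_value(model, np.array([0.3, 0.3]), n_sa=2))  # estimated Q at s = a = 0.3 (true value is 0)

The conditional-expectation step is what lets the density act as a function approximator: each query reweights the mixture components by how well they explain the observed (s, a), so regions with little sampling density contribute little to the estimate, in the spirit of the update locality discussed in the abstract.
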

This item (except for texts and images not created by the author) is subject to a Creative Commons license.

 
