Browse by author "Agostini, Alejandro Gabriel"
Showing items 16-20 of 20
-
Probability density estimation of the Q Function for reinforcement learning
Agostini, Alejandro Gabriel; Celaya Llover, Enric (2009)
Research report
Open access
Performing Q-Learning in continuous state-action spaces is a problem still unsolved for many complex applications. The Q function may be rather complex and cannot be expected to fit into a predefined parametric model. In ...
-
Quick learning of cause-effects relevant for robot action
Agostini, Alejandro Gabriel; Wörgötter, Florentin; Torras, Carme (2010)
Research report
Open access
In this work we propose a new paradigm for the rapid learning of cause-effect relations relevant for task execution. Learning occurs automatically from action experiences by means of a novel constructive learning approach ...
-
Reinforcement learning for robot control using probability density estimations
Agostini, Alejandro Gabriel; Celaya Llover, Enric (INSTICC Press. Institute for Systems and Technologies of Information, Control and Communication, 2010)
Conference proceedings paper
Restricted access due to publisher policy
The successful application of Reinforcement Learning (RL) techniques to robot control is limited by the fact that, in most robotic tasks, the state and action spaces are continuous, multidimensional, and in essence, too ...
-
Reinforcement learning with a Gaussian mixture model
Agostini, Alejandro Gabriel; Celaya Llover, Enric (2010)
Conference proceedings paper
Open access
Recent approaches to Reinforcement Learning (RL) with function approximation include Neural Fitted Q Iteration and the use of Gaussian Processes. They belong to the class of fitted value iteration algorithms, which use a ...
-
Stochastic approximations of average values using proportions of samples
Agostini, Alejandro Gabriel; Celaya Llover, Enric (2011)
Research report
Open access
In this work we explain how the stochastic approximation of the average of a random variable is carried out when the observations used in the updates consist of proportions of samples rather than complete samples.