Now showing items 12-20 of 20

    • Object-action complexes: grounded abstractions of sensory-motor processes 

      Krüger, Norbert; Geib, Christopher; Piater, Justus; Petrick, Ronald; Steedman, Mark; Wörgötter, Florentin; Ude, Aleš; Asfour, Tamim; Kraft, Dirk; Omrčen, Damir; Agostini, Alejandro Gabriel; Dillmann, Rüdiger (2011)
      Article
      Open access
      This paper formalises Object-Action Complexes (OACs) as a basis for symbolic representations of sensorimotor experience and behaviours. OACs are designed to capture the interaction between objects and associated actions in ...
    • On-line learning of macro planning operators using probabilistic estimations of cause-effects 

      Agostini, Alejandro Gabriel; Wörgötter, Florentin; Celaya Llover, Enric; Torras, Carme (2008)
      Research report
      Open access
      In this work we propose an on-line learning method for learning action rules for planning. The system uses a probabilistic approach of a constructive induction method that combines a beam search with an example-based search ...
    • Online EM with weight-based forgetting 

      Celaya Llover, Enric; Agostini, Alejandro Gabriel (2015)
      Article
      Open access
      In the on-line version of the EM algorithm introduced by Sato and Ishii (2000), a time-dependent discount factor is introduced for forgetting the effect of the old posterior values obtained with an earlier, inaccurate ...
    • Online reinforcement learning using a probability density estimation 

      Agostini, Alejandro Gabriel; Celaya Llover, Enric (The MIT Press. Massachusetts Institute of Technology, 2017-01-01)
      Article
      Open access
      Function approximation in online, incremental, reinforcement learning needs to deal with two fundamental problems: biased sampling and nonstationarity. In this kind of task, biased sampling occurs because samples are ...
    • Probability density estimation of the Q Function for reinforcement learning 

      Agostini, Alejandro Gabriel; Celaya Llover, Enric (2009)
      Research report
      Open access
      Performing Q-Learning in continuous state-action spaces is a problem still unsolved for many complex applications. The Q function may be rather complex and can not be expected to fit into a predefined parametric model. In ...
    • Quick learning of cause-effects relevant for robot action 

      Agostini, Alejandro Gabriel; Wörgötter, Florentin; Torras, Carme (2010)
      Research report
      Open access
      In this work we propose a new paradigm for the rapid learning of cause-effect relations relevant for task execution. Learning occurs automatically from action experiences by means of a novel constructive learning approach ...
    • Reinforcement learning for robot control using probability density estimations 

      Agostini, Alejandro Gabriel; Celaya Llover, Enric (INSTICC Press. Institute for Systems and Technologies of Information, Control and Communication, 2010)
      Conference paper
      Restricted access - publisher's policy
      The successful application of Reinforcement Learning (RL) techniques to robot control is limited by the fact that, in most robotic tasks, the state and action spaces are continuous, multidimensional, and in essence, too ...
    • Reinforcement learning with a Gaussian mixture model 

      Agostini, Alejandro Gabriel; Celaya Llover, Enric (2010)
      Conference paper
      Open access
      Recent approaches to Reinforcement Learning (RL) with function approximation include Neural Fitted Q Iteration and the use of Gaussian Processes. They belong to the class of fitted value iteration algorithms, which use a ...
    • Stochastic approximations of average values using proportions of samples 

      Agostini, Alejandro Gabriel; Celaya Llover, Enric (2011)
      Research report
      Open access
      In this work we explain how the stochastic approximation of the average of a random variable is carried out when the observations used in the updates consist in proportion of samples rather than complete samples.
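
The sketches below illustrate, under stated assumptions, some of the techniques mentioned in the entries above. For the entries on learning planning operators and cause-effect relations ("On-line learning of macro planning operators..." and "Quick learning of cause-effects relevant for robot action"), a minimal sketch of an action rule with an on-line probability estimate of its effect is given here. The rule structure, the field names, and the Laplace-smoothed estimator are illustrative assumptions, not the papers' exact formulation.

from dataclasses import dataclass

@dataclass
class CauseEffectRule:
    precondition: frozenset   # symbolic facts that must hold before acting
    action: str               # grounded action name
    effect: frozenset         # facts expected to hold after acting
    successes: int = 0        # times the predicted effect was observed
    trials: int = 0           # times the rule was applied

    def update(self, effect_observed: bool) -> None:
        # On-line update from a single action experience.
        self.trials += 1
        if effect_observed:
            self.successes += 1

    @property
    def effect_probability(self) -> float:
        # Laplace-smoothed estimate of P(effect | precondition, action).
        return (self.successes + 1) / (self.trials + 2)

rule = CauseEffectRule(frozenset({"clear(A)", "handempty"}), "pickup(A)",
                       frozenset({"holding(A)"}))
rule.update(effect_observed=True)
rule.update(effect_observed=False)
print(rule.effect_probability)   # 0.5 after one success and one failure

A planner can then prefer operators whose estimated cause-effect probability is high, refining the estimates as more experiences arrive.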
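
For the entry "Online EM with weight-based forgetting" (Celaya and Agostini, 2015), the following is a minimal sketch of an on-line EM step for a 1-D Gaussian mixture using the kind of time-dependent discount factor attributed above to Sato and Ishii (2000); the weight-based forgetting proposed in the paper replaces this schedule. The discount schedule, initial parameters, and toy data are illustrative assumptions only.

import numpy as np

rng = np.random.default_rng(0)
K = 2
pi  = np.full(K, 1.0 / K)        # mixing coefficients
mu  = np.array([-1.0, 1.0])      # component means
var = np.ones(K)                 # component variances
S0, S1, S2 = pi.copy(), pi * mu, pi * (var + mu**2)   # sufficient statistics

def online_em_step(x, t, S0, S1, S2, eps=1e-8):
    eta = 1.0 / (t + 2)          # time-dependent discount factor (assumed schedule)
    pi, mu, var = S0 / S0.sum(), S1 / S0, S2 / S0 - (S1 / S0)**2
    # E-step: responsibilities of the new sample under the current model
    lik = pi * np.exp(-0.5 * (x - mu)**2 / var) / np.sqrt(2 * np.pi * var)
    r = lik / (lik.sum() + eps)
    # Discounted accumulation: the effect of old statistics is gradually forgotten
    S0 = (1 - eta) * S0 + eta * r
    S1 = (1 - eta) * S1 + eta * r * x
    S2 = (1 - eta) * S2 + eta * r * x**2
    return S0, S1, S2

for t, x in enumerate(rng.normal(2.0, 0.5, size=500)):
    S0, S1, S2 = online_em_step(x, t, S0, S1, S2)
print(S1 / S0)   # component means drift toward the data as old estimates are forgotten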
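
For the entries on density estimation of the Q function and "Reinforcement learning with a Gaussian mixture model", the sketch below shows the general idea of fitting a joint density over (state, action, q-value) samples and reading Q(s, a) off as the conditional mean E[q | s, a]. The batch fit with scikit-learn, the toy task, and the component count are illustrative stand-ins; the papers describe on-line, incremental estimation rather than this offline fit.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
s = rng.uniform(-1, 1, size=(1000, 1))            # toy 1-D state
a = rng.uniform(-1, 1, size=(1000, 1))            # toy 1-D action
q = -(s - a)**2 + rng.normal(0, 0.05, (1000, 1))  # toy returns: best action is a = s
data = np.hstack([s, a, q])

gmm = GaussianMixture(n_components=5, covariance_type="full", random_state=0)
gmm.fit(data)

def q_estimate(state, action, gmm, d=2):
    # E[q | s, a] from the mixture: Gaussian conditioning per component,
    # weighted by each component's (unnormalised) responsibility for (s, a).
    x = np.array([state, action])
    means, covs, w = gmm.means_, gmm.covariances_, gmm.weights_
    num, den = 0.0, 0.0
    for k in range(gmm.n_components):
        mx, mq = means[k, :d], means[k, d]
        Sxx, Sqx = covs[k, :d, :d], covs[k, d, :d]
        diff = x - mx
        inv = np.linalg.inv(Sxx)
        resp = w[k] * np.exp(-0.5 * diff @ inv @ diff) / np.sqrt(
            (2 * np.pi) ** d * np.linalg.det(Sxx))
        cond_mean = mq + Sqx @ inv @ diff         # E[q | s, a] under component k
        num += resp * cond_mean
        den += resp
    return num / den

print(q_estimate(0.3, 0.3, gmm))   # near the maximum of the toy Q for s = 0.3

Because the model is a density over the whole joint space, the same mixture also supports greedy action selection by maximising the conditional mean over a, which is what makes it usable inside a reinforcement learning loop.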
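
For the last entry, "Stochastic approximations of average values using proportions of samples", the sketch below gives one reading of the abstract: an incremental estimate of an average in which each observation carries a fractional weight (a proportion of a sample) rather than a full unit weight. The weighted running-mean update and the example weights are illustrative assumptions, not the report's exact scheme.

def update_average(mean, weight, x, proportion):
    # Fold a fractional observation x with weight `proportion` (0 < p <= 1)
    # into the running weighted mean; a complete sample is the case p = 1.
    weight += proportion
    mean += (proportion / weight) * (x - mean)
    return mean, weight

mean, weight = 0.0, 0.0
for x, p in [(4.0, 1.0), (10.0, 0.5), (6.0, 0.25)]:
    mean, weight = update_average(mean, weight, x, p)
print(mean)  # weighted average (4*1 + 10*0.5 + 6*0.25) / 1.75 = 6.0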