Sample-efficient robot motion learning using Gaussian process latent variable models
10.1109/ICRA40945.2020.9196658
Cite as: hdl:2117/339877
Document type: Text in conference proceedings
Publication date: 2020
Access conditions: Open access
Except where otherwise noted, the contents of this work are subject to the Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain
Abstract
Robotic manipulators are reaching a state where we could see them in household environments within the coming decade. Nevertheless, such robots need to be easy for lay people to instruct. This is why kinesthetic teaching, in which the robot is taught a motion that is encoded as a parametric function (usually a Movement Primitive, MP), has become very popular in recent years. This approach produces trajectories that are usually suboptimal, and the robot needs to be able to improve them through trial and error. Such optimization is often done with Policy Search (PS) reinforcement learning, using a given reward function. PS algorithms can be classified as model-free, where neither the environment nor the reward function is modelled, or model-based, which can use a surrogate model of the reward function and/or a model of the task dynamics. However, MPs can become very high-dimensional in terms of parameters, which constitute the search space, so their optimization often requires too many samples. In this paper, we assume we have a robot motion task, characterized by an MP, whose dynamics we cannot model. We build a surrogate model for the reward function that maps an MP parameter latent space (obtained through a Mutual-information-weighted Gaussian Process Latent Variable Model) into a reward. While we do not model the task dynamics, using mutual information to shrink the task space makes it more consistent with the reward, so policy improvement is faster in terms of sample efficiency.
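The idea of fitting a surrogate reward model on a low-dimensional latent space of MP parameters can be illustrated with a minimal sketch. It is not the method of the paper: plain PCA stands in for the mutual-information-weighted GPLVM, the reward is a made-up toy function, and all names (`fit_linear_latent`, `gp_posterior_mean`, the dimensions and noise levels) are assumptions chosen for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def fit_linear_latent(theta, n_latent):
    """PCA stand-in for the latent variable model (NOT an MI-weighted GPLVM)."""
    mean = theta.mean(axis=0)
    _, _, vt = np.linalg.svd(theta - mean, full_matrices=False)
    return mean, vt[:n_latent].T  # shapes (D,) and (D, n_latent)

def gp_posterior_mean(Z_train, r_train, Z_query, noise=1e-3):
    """Posterior mean of a GP regression from latent points to rewards."""
    r_mean = r_train.mean()
    K = rbf_kernel(Z_train, Z_train) + noise * np.eye(len(Z_train))
    alpha = np.linalg.solve(K, r_train - r_mean)
    return r_mean + rbf_kernel(Z_query, Z_train) @ alpha

rng = np.random.default_rng(0)
D, n_latent, n_rollouts = 20, 2, 40  # hypothetical sizes

# Toy MP parameters that vary mostly along two hidden directions.
basis = rng.normal(size=(n_latent, D))
theta = rng.normal(size=(n_rollouts, n_latent)) @ basis
theta += 0.01 * rng.normal(size=(n_rollouts, D))

# Hypothetical reward: negative distance to an arbitrary target parameter vector.
rewards = -np.linalg.norm(theta - theta[0], axis=1)

# 1) Compress the parameters, 2) fit the surrogate on the latent coordinates.
mean, W = fit_linear_latent(theta, n_latent)
Z = (theta - mean) @ W

# 3) Score perturbed latent candidates with the surrogate instead of the robot.
candidates = Z[rng.integers(0, n_rollouts, size=200)]
candidates = candidates + 0.3 * rng.normal(size=(200, n_latent))
predicted = gp_posterior_mean(Z, rewards, candidates)

# 4) Map the most promising latent point back to full MP parameters.
best_theta = mean + candidates[np.argmax(predicted)] @ W.T
```

In a real policy-search loop, `best_theta` would be executed on the robot, its observed reward appended to the data, and the latent space and surrogate refitted; searching in two latent dimensions instead of twenty is what is meant to reduce the number of rollouts.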
Citation: Delgado, J.; Colomé, A.; Torras, C. Sample-efficient robot motion learning using Gaussian process latent variable models. In: IEEE International Conference on Robotics and Automation. "ICRA 2020 - 2020 IEEE International Conference on Robotics and Automation: Paris, France (VIRTUAL), May 31- Aug 31, 2020: proceedings book". 2020, p. 314-320. DOI 10.1109/ICRA40945.2020.9196658.
Publisher version: https://ieeexplore.ieee.org/document/9196658
Collections
- IRI - Institut de Robòtica i Informàtica Industrial, CSIC-UPC - Conference papers/presentations [575]
- ROBiri - Grup de Percepció i Manipulació Robotitzada de l'IRI - Conference papers/presentations [251]
- Doctorat en Automàtica, Robòtica i Visió - Conference papers/presentations [165]
Files | Description | Size | Format | View |
---|---|---|---|---|
2320-Sample-eff ... latent-variable-models.pdf | | 640.1 KB | | View/Open |