Using symmetries in reinforcement learning of bimanual robotic tasks
Document type: Master thesis
Rights access: Open Access
Learning bimanual robotic tasks, i.e., tasks executed by two manipulators together, can be particularly important in the new scenarios opened by the rise of humanoid robotics, one of the most interesting current trends in the field. This work builds a method to reduce the dimensionality of the parameter space in this context by exploiting the symmetries between the movements executed by the two arms. The aim is to develop a reduced-order representation of the bimanual motion in order to speed up the learning process. Chapter 1 studies the kinematics of the robots used, in order to know how to correctly command their positions while executing a task. Robotic movements are then modeled using Probabilistic Movement Primitives (ProMPs), a stochastic representation of robot movements (details in chapter 2). The first objective is to develop a symmetrization method for this kind of policy, which is treated in chapter 3. This makes it possible to represent the movement of two robotic arms with a single ProMP (instead of two, one for each arm), from which the second policy is obtained by applying symmetrization. In this way the number of parameters representing the motion can be halved. The most common kind of symmetry is the one defined by a plane, but other cases can also be explored, e.g., spherical or cylindrical symmetry. If the symmetry surface is not explicitly given in the bimanual task description, a reliable method to estimate it is critical in order to exploit it in the learning process. Chapter 4 reports a way to estimate the parameters describing the symmetry surface from the initially demonstrated trajectories. Finally, chapter 5 defines a symmetric policy representation for bimanual tasks that depends only on a single ProMP and a symmetry surface.
The effectiveness of this parameter reduction was tested by applying it to reinforcement learning of several tasks and comparing the results with those of the standard approach, which models the bimanual task with two separate ProMPs, one for each robotic arm.
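To make the planar case concrete, the following minimal sketch mirrors sampled end-effector positions of one arm across a plane, and estimates such a plane from a pair of time-aligned demonstrated trajectories. It is an illustration under stated assumptions, not the method developed in the thesis: it operates on Cartesian positions only (not on ProMP weights or orientations), it assumes the two demonstrations are sampled at the same time steps, and the function names `reflect_points` and `estimate_plane` are hypothetical.

```python
import numpy as np


def reflect_points(points, normal, offset):
    """Mirror 3-D points across the plane {x : normal . x = offset}."""
    n = np.asarray(normal, dtype=float)
    scale = np.linalg.norm(n)
    n = n / scale                      # unit normal
    d = offset / scale                 # offset rescaled to match the unit normal
    pts = np.asarray(points, dtype=float)
    signed_dist = pts @ n - d          # signed distance of each point to the plane
    return pts - 2.0 * signed_dist[:, None] * n


def estimate_plane(left, right):
    """Estimate a mirror plane from time-aligned left/right arm positions.

    Assumption: right[t] is (approximately) the mirror image of left[t].
    Then right[t] - left[t] is parallel to the plane normal and the
    midpoint of each pair lies on the plane.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    diffs = right - left
    mids = 0.5 * (left + right)
    # Dominant direction of the difference vectors (insensitive to their sign).
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    normal = vt[0]
    offset = float(np.mean(mids @ normal))
    return normal, offset


# Hypothetical demonstration: three samples of a left-arm trajectory,
# mirrored across the plane x = 0.2 to produce the right-arm trajectory.
left = np.array([[0.0, 0.1, 0.2],
                 [-0.1, 0.3, 0.4],
                 [-0.3, 0.2, 0.1]])
right = reflect_points(left, normal=[1.0, 0.0, 0.0], offset=0.2)

normal, offset = estimate_plane(left, right)
right_from_estimate = reflect_points(left, normal, offset)
```

With a plane described by a unit normal and an offset, only one arm's trajectory needs to be stored; the other is recovered by reflection. The sign of the estimated normal is arbitrary, but the offset flips with it, so the recovered plane is the same either way.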