Policy transfer via modularity

Clavera Gilaberte, Ignasi


memoria.pdf (4,564Mb)


Tutor / director: Abbeel, Pieter; Torras, Carme

Carried out at/with: University of California Berkeley. Electrical Engineering and Computer Science

Document type: Bachelor's thesis

Date: 2017-05

Access conditions: Open access

Attribution-NonCommercial-NoDerivs 3.0 Spain

Except where otherwise noted, the contents of this work are subject to the Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain

Abstract

Non-prehensile manipulation, such as pushing, is an important way for robots to move objects and is sometimes preferred as an alternative to grasping. However, due to unknown frictional forces, pushing has proven to be a difficult task for robots. We explore the use of reinforcement learning to train a robot to robustly push an object. To deal with the sample complexity of training such a method, we train the pushing policy in simulation and then transfer this policy to the real world. To ease the transfer from simulation, we propose to use modularity to separate the learned policy from the raw inputs and outputs; rather than training "end-to-end," we decompose our system into modules and train only a subset of these modules in simulation. We further demonstrate that we can incorporate prior knowledge about the task into the state space and the reward function to speed up convergence. Finally, we introduce "reward guiding" to modify the reward function and further reduce the training time. We demonstrate, in both simulation and real-world experiments, that such an approach can be used to reliably push an object from many initial positions and orientations.

Subjects: Artificial intelligence

Degree: Bachelor's Degree in Mathematics (2009 Plan)

URI: http://hdl.handle.net/2117/106257

Collections

Facultat de Matemàtiques i Estadística - Grau en Matemàtiques (Pla 2009) [325]


Files	Description	Size	Format	View
memoria.pdf		4,564Mb	PDF	View/Open

UPCommons. Open knowledge portal of the UPC
