

Use this identifier to cite or link to this item: http://hdl.handle.net/2099.1/11320

File: Master thesis_ Xavier Pererz Sala.pdf
Size: 13.48 MB
Format: Adobe PDF

Title: Vision-based Navigation and Reinforcement Learning Path Finding for Social Robots
Author: Pérez Sala, Xavier
Supervisor/director/evaluator: Angulo Bahón, Cecilio
University: Universitat Politècnica de Catalunya
Subjects: UPC subject areas::Computer science::Artificial intelligence::Machine learning
UPC subject areas::Computer science::Robotics
Reinforcement learning
Artificial Vision module
Behavior control module
Reinforcement Learning module
Robot Navigation
Aprenentatge per reforç (reinforcement learning)
Date: 3 September 2010
Document type: Master thesis
Abstract: We propose a robust system for automatic robot navigation in uncontrolled environments. The system is composed of three main modules: the Artificial Vision module, the Reinforcement Learning module, and the behavior control module. The aim of the system is to allow a robot to automatically find a path that reaches a predefined goal. Turn and straight movements in uncontrolled environments are automatically estimated and controlled using the proposed modules. The Artificial Vision module is responsible for obtaining a quantified representation of the robot's vision. This is done by automatically detecting and describing image interest points using state-of-the-art strategies. Once an image is described with a set of local feature vectors, the view is codified as a vector of visual word frequencies computed from a previous scene representation, which robustly discriminate among the different possible views of the robot in the environment. Changes in local features over time are also used to estimate robot movement and consequently control robot behavior by means of the analysis of the computed vanishing points. The Reinforcement Learning (RL) module receives a vector quantified by the Artificial Vision module plus robot sensor estimations. The RL strategy computes the required state and reward. The state corresponds to the normalized received quantified vector together with the robot proximity sensor quantifications. The reward value is computed using the distance between the robot and the goal. Given the high dimensionality of the problem, conventional RL strategies make the search problem infeasible. For this reason, we propose the use of an algorithm from the articulation control field, named Natural Actor-Critic, which can deal with high-dimensional problems. We tested the proposed methodology in uncontrolled environments using the Sony Aibo robot.
The results showed that the robot looked for the goal, producing behavior changes based on experience, but without finding the optimal route.
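The state and reward construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything specific here is an assumption: descriptors are assigned to their nearest visual word by Euclidean distance, the histogram is normalized to sum to one, and the reward is taken as the negative Euclidean distance to the goal. The function names `visual_word_histogram`, `build_state`, and `reward` are hypothetical and do not come from the thesis.

```python
import math


def visual_word_histogram(descriptors, vocabulary):
    """Count, for each visual word, how many local descriptors are closest to it.

    Assumption: nearest-word assignment by Euclidean distance.
    """
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        nearest = min(
            range(len(vocabulary)),
            key=lambda k: math.dist(descriptor, vocabulary[k]),
        )
        counts[nearest] += 1
    return counts


def build_state(histogram, proximity_levels):
    """Concatenate the normalized word frequencies with quantized proximity readings."""
    total = sum(histogram)
    if total:
        normalized = [count / total for count in histogram]
    else:
        normalized = [0.0] * len(histogram)
    return normalized + [float(level) for level in proximity_levels]


def reward(robot_xy, goal_xy):
    """Assumption: the reward is the negative distance, so closer to the goal is better."""
    return -math.dist(robot_xy, goal_xy)


# Toy data: three visual words and four local descriptors in a 2-D feature space.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (3.5, 0.2)]

histogram = visual_word_histogram(descriptors, vocabulary)
state = build_state(histogram, proximity_levels=[0, 2])
print(histogram)                       # [1, 2, 1]
print(state)                           # [0.25, 0.5, 0.25, 0.0, 2.0]
print(reward((0.0, 0.0), (3.0, 4.0)))  # -5.0
```

A real vocabulary would have far more visual words than this toy example, which is the high dimensionality the abstract cites as the motivation for Natural Actor-Critic.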
URI: http://hdl.handle.net/2099.1/11320
Access conditions: Open Access
Appears in collections: Master in Artificial Intelligence - MAI (Pla 2006)


This item (except texts and images not created by the author) is subject to a Creative Commons license.

