V-MIN: efficient reinforcement learning through demonstrations and relaxed reward demands
dc.contributor.author | Martínez Martínez, David |
dc.contributor.author | Alenyà Ribas, Guillem |
dc.contributor.author | Torras, Carme |
dc.contributor.other | Institut de Robòtica i Informàtica Industrial |
dc.date.accessioned | 2016-03-30T17:25:59Z |
dc.date.issued | 2015 |
dc.identifier.citation | Martínez, D., Alenyà, G., Torras, C. V-MIN: efficient reinforcement learning through demonstrations and relaxed reward demands. In: AAAI Conference on Artificial Intelligence. "Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence". Austin: 2015, p. 2857-2863. |
dc.identifier.uri | http://hdl.handle.net/2117/84917 |
dc.description.abstract | Reinforcement learning (RL) is a common paradigm for learning tasks in robotics. However, a lot of exploration is usually required, making RL too slow for high-level tasks. We present V-MIN, an algorithm that integrates teacher demonstrations with RL to learn complex tasks faster. The algorithm combines active demonstration requests and autonomous exploration to find policies yielding rewards higher than a given threshold Vmin. This threshold sets the degree of quality with which the robot is expected to complete the task, thus allowing the user to either opt for very good policies that require many learning experiences, or to be more permissive with sub-optimal policies that are easier to learn. The threshold can also be increased online to force the system to improve its policies until the desired behavior is obtained. Furthermore, the algorithm generalizes previously learned knowledge, adapting well to changes. The performance of V-MIN has been validated through experimentation, including domains from the international planning competition. Our approach achieves the desired behavior where previous algorithms failed. |
dc.format.extent | 7 p. |
dc.language.iso | eng |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ |
dc.subject | UPC subject areas::Computer science::Robotics |
dc.subject.other | learning (artificial intelligence) |
dc.subject.other | uncertainty handling |
dc.subject.other | reinforcement learning |
dc.subject.other | active learning |
dc.subject.other | model-based reinforcement learning |
dc.title | V-MIN: efficient reinforcement learning through demonstrations and relaxed reward demands |
dc.type | Conference report |
dc.contributor.group | Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI |
dc.description.peerreviewed | Peer Reviewed |
dc.subject.inspec | INSPEC classification::Cybernetics::Artificial intelligence |
dc.relation.publisherversion | http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9634/9952 |
dc.rights.access | Restricted access - publisher's policy |
local.identifier.drac | 17087237 |
dc.description.version | Postprint (author's final draft) |
dc.date.lift | 10000-01-01 |
local.citation.author | Martínez, D.; Alenyà, G.; Torras, C. |
local.citation.contributor | AAAI Conference on Artificial Intelligence |
local.citation.pubplace | Austin |
local.citation.publicationName | Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence |
local.citation.startingPage | 2857 |
local.citation.endingPage | 2863 |