Reports de recerca (Research reports)
http://hdl.handle.net/2117/3754
2016-04-30T17:28:17Z
http://hdl.handle.net/2117/28454
Competitive function approximation for reinforcement learning
Agostini, Alejandro Gabriel; Celaya Llover, Enric
Applying reinforcement learning to problems with continuous domains requires representing the value function by means of function approximation. We identify two aspects of reinforcement learning that make function approximation hard: non-stationarity of the target function and biased sampling. Non-stationarity results from the bootstrapping nature of dynamic programming, where the value function is estimated using its own current approximation. Biased sampling occurs when some regions of the state space are visited too often, causing repeated updates with similar values that fade out the occasional updates of infrequently sampled regions.
We propose a competitive approach to function approximation in which many different local approximators are available at a given input and the one expected to give the best approximation is selected by means of a relevance function. The local nature of the approximators allows them to adapt quickly to non-stationary changes and mitigates the biased sampling problem. The coexistence of multiple approximators, updated and tried in parallel, yields a good estimate much faster than would be possible with a single approximator. Experiments on different benchmark problems show that the competitive strategy provides faster and more stable learning than non-competitive approaches.
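The selection idea described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the approximators or the relevance function, so the piecewise-constant `LocalApproximator`, the running squared-error estimate used as relevance, and the names `predict` and `update_all` are all assumptions made for illustration.

```python
class LocalApproximator:
    """Constant-valued approximator on [lo, hi) (illustrative choice, not from the report)."""

    def __init__(self, lo, hi, lr=0.2):
        self.lo, self.hi, self.lr = lo, hi, lr
        self.value = 0.0
        self.sq_error = 1.0  # running squared-error estimate, starts pessimistic

    def covers(self, x):
        return self.lo <= x < self.hi

    def update(self, target):
        err = target - self.value
        # Track how well this approximator has been predicting its targets.
        self.sq_error += self.lr * (err * err - self.sq_error)
        self.value += self.lr * err


def predict(approximators, x):
    """Return the value of the covering approximator with the lowest error estimate."""
    candidates = [a for a in approximators if a.covers(x)]
    if not candidates:
        raise ValueError("no approximator covers x")
    # Assumed relevance: lower running squared error means more relevant.
    return min(candidates, key=lambda a: a.sq_error).value


def update_all(approximators, x, target):
    """Update every approximator that covers x, so all candidates learn in parallel."""
    for a in approximators:
        if a.covers(x):
            a.update(target)
```

With one coarse approximator over the whole domain and two finer ones over each half, the finer ones end up being selected for a step-shaped target, because their error estimates shrink while the coarse one keeps mispredicting.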
2015-06-29T18:57:10Z
http://hdl.handle.net/2117/23235
Modelling of tidal power with EasyJava simulations
Baranger, Pierre; Grau Saldes, Antoni; Bolea Monte, Yolanda
This report studies a mathematical model that simulates the behaviour of a power generation plant driven by ocean tides. The simulation has been implemented in the EasyJava Simulations (EJS) environment.
Research on a model of electricity generation from ocean tides
2014-06-17T07:38:44Z
http://hdl.handle.net/2117/16938
Etude d'un canal d'irrigation MIMO (Study of a MIMO irrigation canal)
Chefdor, Nicolas; Bolea Monte, Yolanda; Grau Saldes, Antoni
Research work carried out by the student Nicolas Chefdor during his stay at the VIS-ESAII research laboratory at UPC, supervised by Dr. Yolanda Bolea and Dr. Antoni Grau. The stay ran from 1/06/2011 to 30/09/2011.
2012-11-16T11:44:48Z
http://hdl.handle.net/2117/14112
Stochastic approximations of average values using proportions of samples
Agostini, Alejandro Gabriel; Celaya Llover, Enric
In this work we explain how the stochastic approximation of the average of a random variable is carried out when the observations used in the updates consist of proportions of samples rather than complete samples.
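One plausible reading of this setting is an incremental average in which each observation counts as a fraction of a sample. The sketch below is a standard weighted running mean under that assumption, not the derivation from the report. The function name and the `(value, proportion)` input format are hypothetical.

```python
def weighted_running_mean(observations):
    """Incrementally average (value, proportion) pairs.

    Each proportion in (0, 1] is assumed to say how much of a full sample
    the observation represents; this interpretation is an assumption.
    """
    mean = 0.0
    total = 0.0  # accumulated sample mass seen so far
    for value, proportion in observations:
        if proportion <= 0.0:
            continue  # an empty observation carries no information
        total += proportion
        # Step size shrinks with accumulated mass, as in a stochastic approximation.
        mean += (proportion / total) * (value - mean)
    return mean
```

When every proportion equals 1, the update reduces to the usual running average with step size 1/n.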
IRI Technical Report
2011-11-29T14:54:00Z
http://hdl.handle.net/2117/13936
Registration of 3d point clouds for urban robot mapping
Teniente Avilés, Ernesto; Andrade-Cetto, Juan
We consider the task of mapping pedestrian urban areas for a robotic guidance and surveillance application. This mapping is performed by registering three-dimensional laser range scans acquired with two different robots.
To solve this task we use the Iterative Closest Point (ICP) algorithm proposed in [8], but for the minimization step we use the metric proposed by Biota et al. [10] to take advantage of the compensation between translation and rotation that they mention. To reduce the computational cost of matching in the original ICP, the correspondence search is done with the Approximate Nearest Neighbor (ANN) library. Finally, we propose a new hierarchical correspondence search strategy that uses a point-to-plane strategy at the highest level and the point-to-point metric at finer levels. At the highest level, the fitting error between a plane and the n adjacent points that describe it is computed; if this error exceeds a threshold, we change the level.
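The basic ICP loop that this report builds on can be sketched in a few lines. This is a simplified illustration and not the method of the report: it works in 2D rather than 3D, uses brute-force nearest-neighbour search instead of the ANN library, and uses the plain point-to-point least-squares metric rather than the metric of [10] or the hierarchical strategy. All function names are hypothetical.

```python
import math


def nearest(p, points):
    """Brute-force nearest neighbour of p (stand-in for an ANN query)."""
    return min(points, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def best_rigid_transform(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    s_dot = s_cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy
        bx, by = dx - cdx, dy - cdy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)  # closed-form optimum in 2D
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty


def transform(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp(src, dst, iterations=20):
    """Alternate matching and alignment; returns the moved copy of src."""
    current = list(src)
    for _ in range(iterations):
        matched = [nearest(p, dst) for p in current]
        theta, tx, ty = best_rigid_transform(current, matched)
        current = transform(current, theta, tx, ty)
    return current
```

As with any ICP variant, this sketch only converges to the right alignment when the initial displacement is small enough for most nearest-neighbour matches to be correct.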
2011-11-16T13:18:24Z
http://hdl.handle.net/2117/12449
Path planning with pose SLAM
Valencia Carreño, Rafael; Andrade-Cetto, Juan; Porta Pleite, Josep Maria
The probabilistic belief networks that result from standard feature-based simultaneous localization and map building (SLAM) approaches cannot be directly used to plan trajectories. The reason is that they produce a sparse graph of landmark estimates and their probabilistic relations, which is of little value for finding collision-free paths for navigation. In contrast, we argue in this paper that Pose SLAM graphs can be directly used as belief roadmaps (BRMs). The original BRM algorithm assumes a known model of the environment from which probabilistic sampling generates a roadmap. In our work, the roadmap is built on-line by the Pose SLAM algorithm. The result is a hybrid BRM-Pose SLAM method that devises optimal navigation strategies on-line by searching for the path with the lowest accumulated uncertainty for the robot pose. The method is validated on synthetic data and standard SLAM datasets.
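The search for a path with lowest accumulated uncertainty can be pictured as a shortest-path query over the pose graph. The sketch below makes a strong simplifying assumption: each edge carries a non-negative scalar uncertainty gain and gains simply add up, so Dijkstra's algorithm applies. The actual method presumably propagates pose covariances along the path, which this sketch does not model. The graph format and function name are hypothetical.

```python
import heapq


def lowest_uncertainty_path(graph, start, goal):
    """Dijkstra search over {node: [(neighbor, uncertainty_gain), ...]}.

    Gains are assumed non-negative and additive (a simplification).
    Returns (accumulated_uncertainty, path) or (inf, []) if unreachable.
    """
    best = {start: 0.0}
    parent = {start: None}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return cost, path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, gain in graph.get(node, []):
            new_cost = cost + gain
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf"), []
```

In this picture, nodes would be robot poses stored by Pose SLAM and edges the traversable links between them.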
2011-05-03T09:44:15Z
http://hdl.handle.net/2117/12013
Estudi de la transformació de l'espai de color RGB a l'espai de color HSV (Study of the transformation from the RGB color space to the HSV color space)
Grau Gotés, Mª Ángela; Grau Sánchez, Miguel; Montseny Masip, Eduard; Sobrevilla Frisón, Pilar
Classical error-propagation techniques are applied to the transformation from the RGB color space to the HSV color space, using a set of 1098 test images. The test set consists of 183 color palettes under six different illumination levels. The results presented show how the mean and the variance change under the transformation.
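As a rough illustration of what classical error propagation through this transformation looks like, the sketch below applies the first-order formula var(y_j) ≈ Σ_i (∂y_j/∂x_i)² var(x_i) to Python's standard `colorsys.rgb_to_hsv`. The report's exact procedure is not given in the abstract. The assumptions here are independent channel errors, channel values in [0, 1], and a numerical Jacobian. The function name is hypothetical.

```python
import colorsys


def propagate_rgb_variance(rgb, rgb_var, step=1e-5):
    """First-order variance propagation through colorsys.rgb_to_hsv.

    rgb: (r, g, b) with components in [0, 1].
    rgb_var: per-channel variances, assumed independent (no covariances).
    Returns (hsv, hsv_var). The linearization is unreliable near hue
    wrap-around and where the largest or smallest channel changes.
    """
    hsv_var = [0.0, 0.0, 0.0]
    for i in range(3):
        plus, minus = list(rgb), list(rgb)
        plus[i] += step
        minus[i] -= step
        f_plus = colorsys.rgb_to_hsv(*plus)
        f_minus = colorsys.rgb_to_hsv(*minus)
        for j in range(3):
            # Central-difference estimate of the partial derivative d(hsv_j)/d(rgb_i).
            d = (f_plus[j] - f_minus[j]) / (2 * step)
            hsv_var[j] += d * d * rgb_var[i]
    return colorsys.rgb_to_hsv(*rgb), tuple(hsv_var)
```

For a color whose red channel is the largest, V equals R, so the propagated variance of V is simply the variance of R, which gives a quick sanity check.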
2011-03-22T10:51:28Z
http://hdl.handle.net/2117/11543
Audio localization for mobile robots
de Guillebon, Thibaut; Grau Saldes, Antoni; Bolea Monte, Yolanda
The university department for which I worked is developing a project based on the interaction of robots with their environment. My work was to define an audio system for the robot. The audio system I had to build consists of a mobile head that is able to follow sound in its environment. The subject was treated as a research problem, with the freedom to find and develop different solutions and to make them evolve in the chosen direction.
2011-02-25T10:59:41Z