Learning complex manipulation skills with causal structure and multi-sensory fusion
Document type: Bachelor thesis
Rights access: Restricted access - author's decision
This thesis explores efficient learning, inference, control, and planning for complex manipulation skills based on integrating tactile and visual information. Motivated by how humans integrate tactile and visual stimuli to execute complex manipulation, we emulate human causal reasoning in a robot that learns to play Jenga, a game that requires physical interaction to be played. Unlike most current robotic learning methodologies, which exploit recent progress in computer vision and deep learning to acquire data-hungry pixel-to-action policies, we exploit force sensing and basic intuitive structure such as causality to learn probabilistic models. The game mechanics are formulated as a generative process using a temporal hierarchical Bayesian model, with latent representations for both behavioral archetypes and noisy block states. This model captures causal relationships in the force and visual domains after a short exploration phase. Once learning is complete, the robot uses the learned representations to infer block behavior patterns and states, and then adjusts its behavior in both action and game strategy, emulating the way humans play the game.
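To make the model described in the abstract concrete, below is a minimal, hypothetical sketch (in Python with NumPy) of a temporal hierarchical Bayesian generative model of this kind: each block draws a latent behavioral archetype, the archetype governs how the block's latent state evolves under pushes, the robot receives noisy force and visual observations, and the posterior over archetypes is recovered by enumeration. All names, priors, and noise parameters here are illustrative assumptions, not the thesis's actual model.

```python
# Hypothetical sketch of a temporal hierarchical Bayesian model for Jenga
# block behavior. Archetypes, priors, and noise levels are assumed for
# illustration only.
import numpy as np

rng = np.random.default_rng(0)

N_ARCHETYPES = 2                      # e.g. 0 = "loose", 1 = "stuck"
PRIOR = np.array([0.5, 0.5])          # prior over behavioral archetypes
DRIFT = np.array([0.8, 0.1])          # displacement per push, per archetype
FORCE_MEAN = np.array([0.5, 3.0])     # expected contact force, per archetype
OBS_NOISE = 0.3                       # std of force/visual observation noise


def simulate(archetype, n_pushes):
    """Generate noisy (force, displacement) observations for one block."""
    state = 0.0                        # latent block displacement
    obs = []
    for _ in range(n_pushes):
        state += DRIFT[archetype] + rng.normal(0.0, 0.05)  # state transition
        force = FORCE_MEAN[archetype] + rng.normal(0.0, OBS_NOISE)
        disp = state + rng.normal(0.0, OBS_NOISE)          # noisy visual cue
        obs.append((force, disp))
    return obs


def log_gauss(x, mu, sigma):
    """Log density of a Gaussian observation model."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))


def infer_archetype(obs):
    """Posterior over archetypes by enumeration, from force readings."""
    log_post = np.log(PRIOR).copy()
    for force, _ in obs:
        for k in range(N_ARCHETYPES):
            log_post[k] += log_gauss(force, FORCE_MEAN[k], OBS_NOISE)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()


if __name__ == "__main__":
    obs = simulate(archetype=0, n_pushes=5)   # simulate a "loose" block
    print("P(archetype | observations) =", infer_archetype(obs))
```

In a full system of the kind the abstract describes, such a posterior over block archetypes and states would drive both low-level action adaptation and high-level game strategy.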
The details of the work will be defined once the student arrives at the destination institution.