Simulation to reality: Study and demonstration of domain adaptation techniques applied to Reinforcement and Supervised learning algorithms
Cite as:
hdl:2117/333264
Document type: Master thesis
Date: 2020-07-09
Rights access: Open Access
Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-NonCommercial-ShareAlike 3.0 Spain
Abstract
This report describes my 5.5-month end-of-studies internship as an AI Research Intern, which focused on implementing and demonstrating techniques from the Domain Adaptation literature on the problem of autonomous driving. Machine Learning algorithms are famously brittle to shifts in the data distribution, and Domain Adaptation is a sub-field of Machine Learning that aims to mitigate this issue. We replicate five Domain Adaptation techniques and apply them to the task of autonomous driving in simulated and real environments, using a representative platform which we assemble, install, and debug: the Duckietown simulator and robot, created and distributed by the eponymous foundation. We create a simple containerised solution for robot control that is cleaner than the original one. We also extend the simulator with numerous additional methods to train and benchmark Supervised and Reinforcement Learning methods in default and altered environments.
The report is divided into the present summary and 6 chapters. Chapter 1 provides the project's context. Chapter 2 introduces Duckietown: the robot, the circuit it drives in, the simulator where our AI agents are trained, and the community built around this ecosystem. Chapter 3 presents the theoretical foundations of the Machine Learning algorithms used in the project, including Domain Adaptation: what it is, why it is needed, what techniques exist, and which algorithms we have used. Chapter 4 presents the conditions under which we carried out our experiments: how results are evaluated, what architecture we have used, and what we tested on. Chapter 5 presents the results of our research and explores the performance of our different techniques under a variety of scenarios. Lastly, Chapter 6 draws conclusions from the project and reflects on potential future work.
Regrettably, we did not obtain good results in the passage from simulation to reality: none of our methods managed to consistently complete a whole lap on the real Duckietown. We suspect a few key causes for this result: a gap between simulation and reality that was too wide; an over-reliance of our methods on adapting observations, which neglects gaps in dynamics; and a lack of generality within the Domain Adaptation literature. We also believe several avenues for improving performance remain: bridging the dynamics gap with robust control techniques, using extreme augmentation techniques as others have done on similar problems, and simplifying the problem by giving up end-to-end control.
Given the lack of results in the simulation-to-reality task, we switch to an intermediate task of moderate difficulty to benchmark our approaches: using Domain Adaptation to obtain satisfactory performance across several simulated circuits of increasing difficulty. Two findings emerge. Firstly, Domain Adaptation techniques are remarkably brittle. While they provide significant improvements on the reference benchmarks used for their publication, they fail to provide any improvement on this specific problem, and in fact their underlying mechanisms break down entirely in our setting. Secondly, data augmentation, a widely used technique renowned for its broad applicability as a way to improve generality, was helpful only in specific circumstances and was a hindrance in others. Nevertheless, it was the only method that provided any gains. Overall, it would seem that the current state of the Domain Adaptation field does not allow for easy application of developed techniques across domains. One should exercise caution in assuming that good results on a given data-set will carry over to another task, which renders the field's findings less useful and of limited validity.
Subjects: Machine learning, Artificial intelligence
Degree: MÀSTER UNIVERSITARI EN ENGINYERIA INDUSTRIAL (Pla 2014)
Files | Description | Size | Format |
---|---|---|---|
rapport-sfe.pdf | | 6,840Mb | |