Self-supervised video object segmentation using generative adversarial networks

Covenantee: Northeastern University
Document type: Master thesis
Date: 2020-06
Rights access: Open Access
Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain
Abstract
Video Object Segmentation is arguably one of the most challenging tasks in computer vision. Training a model for this task in a supervised manner requires a large amount of manually labelled data, which is extremely time-consuming and expensive to produce. In this thesis, we propose a self-supervised method that leverages the spatiotemporal nature of video to perform Video Object Segmentation using Generative Adversarial Networks. To this end, we design a novel framework composed of two generators and two discriminators that are trained to reach an equilibrium in which the task is solved. At both training and test time, the model needs only the first mask of the video and can be trained end to end; this is possible because it exploits the temporal consistency of videos to self-supervise its training. In addition, we refine the masks predicted by the model with the Sum of Squares polynomial, a tool adopted from the convex optimization community. Although the approach is ambitious, the model achieves promising results on the DAVIS2016 dataset, which we report both qualitatively and quantitatively.
Degree: MÀSTER UNIVERSITARI EN ENGINYERIA DE TELECOMUNICACIÓ (Pla 2013)
Files | Description | Size | Format | View |
---|---|---|---|---|
Master Thesis Ponç Palau.pdf | | 6,834Mb | PDF | View/Open |