Spatio-Temporal networks for few-shot video segmentation with annotation guidance
Cite as: hdl:2117/370658
Tutor / director: Giorgos Tolias
Carried out at/with: České vysoké učení technické v Praze
Document type: Bachelor's thesis
Date: 2022-06-26
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to the existing legal exemptions, its reproduction, distribution, public communication or transformation without the authorization of the rights holder is prohibited.
Abstract
This project addresses video semantic segmentation guided by user-provided annotations that indicate the underlying semantic classes. The current paradigm for segmentation methods and benchmark datasets is to segment objects in a video given a single annotation in the first frame. We extend this setup to multiple annotated data points; specifically, two scenarios are proposed: having two annotated frames, and having pixel-level annotations. For each of these settings, solutions inspired by active learning are explored to offer guidance on which data, whether frames or pixels, should be selected for annotation. To achieve this, we rely on previous work on spatio-temporal networks for video object segmentation, a current state-of-the-art approach. For each approach, the inference procedure is adapted so that the new data can be exploited. Finally, different selection criteria are explored based on prediction confidence and uncertainty. When applying a selection criterion to choose which frame to annotate, performance improves considerably: we reach up to 89% segmentation performance on the DAVIS benchmark. When dealing with pixels, results do not increase as much, reaching over 87% on DAVIS17 when annotating around 100 to 200 pixels. Comparing both methods, we see that in some cases annotating pixels is preferable, considering the trade-off between annotation cost and the segmentation improvement obtained.
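The abstract does not specify the exact selection criterion, but a common uncertainty-based choice is to annotate the frame whose softmax predictions have the highest mean per-pixel entropy. A minimal sketch of that idea (function names and NumPy-based interface are illustrative assumptions, not the thesis implementation):

```python
import numpy as np

def frame_entropy(probs: np.ndarray) -> float:
    """Mean per-pixel entropy of a softmax prediction map.

    probs: array of shape (H, W, C) holding per-pixel class probabilities.
    Higher values mean the model is less confident about this frame.
    """
    eps = 1e-12  # avoid log(0)
    per_pixel = -np.sum(probs * np.log(probs + eps), axis=-1)  # (H, W)
    return float(per_pixel.mean())

def select_frame_to_annotate(video_probs: list[np.ndarray]) -> int:
    """Return the index of the most uncertain frame in the video."""
    scores = [frame_entropy(p) for p in video_probs]
    return int(np.argmax(scores))
```

The same score can be computed per pixel instead of per frame, which would support the pixel-annotation scenario: rank all pixels by entropy and annotate the top 100-200.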
Degree: Bachelor's Degree in Data Science and Engineering (2017 curriculum)
Files | Description | Size | Format
---|---|---|---
170739.pdf |  | 6,960Mb |