A closer look at referring expressions for video object segmentation
Cite as: hdl:2117/373642
Document type: Article
Publication date: 2023-01
Access conditions: Open access
Except where otherwise noted, the contents of this work are licensed under a Creative Commons license: Attribution 4.0 International
Project: UPC-COMPUTACION DE ALTAS PRESTACIONES VIII (AEI-PID2019-107255GB-C22)
APRENDIZAJE PROFUNDO EFICIENTE PARA SECUENCIAS DE VIDEO Y NUBES DE PUNTOS (AEI-PID2020-117142GB-I00)
MODELOS DE DEEP LEARNING CONSIDERANDO EXPLICABILIDAD Y INCERTIDUMBRE. APLICACIONES A LA PERCEPCION EMOCIONAL. (AEI-RTI2018-095232-B-C22)
Abstract
The task of Language-guided Video Object Segmentation (LVOS) aims at generating binary masks for an object referred to by a linguistic expression. When this expression unambiguously describes an object in the scene, it is called a referring expression (RE). Our work argues that existing benchmarks used for LVOS are mainly composed of trivial cases, in which referents can be identified with simple phrases. Our analysis relies on a new categorization of the referring expressions in the DAVIS-2017 and Actor-Action datasets into trivial and non-trivial REs, where the non-trivial REs are further annotated with seven RE semantic categories. We leverage these data to analyze the performance of RefVOS, a novel neural network that obtains competitive results for the task of language-guided image segmentation and state-of-the-art results for LVOS. Our study indicates that the major challenges for the task are related to understanding motion and static actions.
Citation: Bellver, M. [et al.]. A closer look at referring expressions for video object segmentation. "Multimedia tools and applications", January 2023, vol. 82, no. 3, p. 4419-4438.
ISSN: 1380-7501
Publisher version: https://link.springer.com/article/10.1007/s11042-022-13413-x
Files | Description | Size | Format |
---|---|---|---|
s11042-022-13413-x.pdf | | 5.777Mb | PDF |