Can our TV robustly understand human gestures? Real-Time Gesture Localization in Range Data
Document type: Conference report
Publisher: ACM Press. Association for Computing Machinery
Rights access: Restricted access - publisher's policy
The 'old' remote control falls short of requirements when confronted with digital convergence for living room displays. Enriched options to watch, manage and interact with content on large displays demand improved means of interaction. Concurrently, gesture recognition is increasingly present in human-computer interaction for gaming applications. In this paper we propose a gesture localization framework for interactive display of audio-visual content. The proposed framework works with range data captured from a single consumer depth camera. We focus on still gestures because they are generally user friendly (users do not have to make complex and tiring movements) and allow formulating the problem in terms of object localization. Our method is based on random forests, which have shown excellent performance on classification and regression tasks. In this work, however, we aim at a specific class of localization problems involving highly unbalanced data: positive examples occupy only a small fraction of space and time. We study the impact of this natural imbalance on random forest learning and we propose a framework to robustly detect gestures in range images in real applications. Our experiments with offline data show the effectiveness of our approach. We also present a real-time application where users can control the TV display with a reduced set of still gestures.
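The abstract highlights random forest learning on highly unbalanced data, where gesture (positive) examples are rare compared to background. The paper's own framework is not public, so the sketch below only illustrates the general imbalance issue with scikit-learn's generic `RandomForestClassifier` and its `class_weight="balanced"` option; the synthetic data, feature dimensions, and class ratio are all invented for illustration.

```python
# Illustrative sketch only: NOT the authors' method. Shows one standard way
# to counter class imbalance in a random forest (class_weight="balanced").
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Highly unbalanced synthetic data: 1000 background windows vs. 20 gesture
# windows, mimicking positives that occupy a small fraction of space/time.
X_neg = rng.normal(loc=0.0, scale=1.0, size=(1000, 3))
X_pos = rng.normal(loc=5.0, scale=1.0, size=(20, 3))
X = np.vstack([X_neg, X_pos])
y = np.array([0] * 1000 + [1] * 20)

# "balanced" reweights each class inversely to its frequency, so the rare
# gesture class is not swamped by background during tree induction.
forest = RandomForestClassifier(n_estimators=50,
                                class_weight="balanced",
                                random_state=0)
forest.fit(X, y)

# A clearly gesture-like sample should still be detected despite imbalance.
print(forest.predict([[5.0, 5.0, 5.0]])[0])  # → 1
```

Without reweighting (or an equivalent sampling strategy), a forest trained on such skewed data tends to favor the majority background class, which is the kind of degradation the paper studies for gesture localization.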
Google Best Student Paper Award CVMP 2012
Citation: López-Méndez, A.; Casas, J. Can our TV robustly understand human gestures? Real-Time Gesture Localization in Range Data. In: European Conference on Visual Media Production. "Proceedings of the 9th European Conference on Visual Media Production". London: ACM Press. Association for Computing Machinery, 2012, p. 18-25.