Accelerating boosting-based face detection on GPUs
Cite as:
hdl:2117/18498
Document type: Conference report
Defense date: 2012
Rights access: Restricted access - publisher's policy
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder.
Abstract
The goal of face detection is to determine the presence of faces in arbitrary images, along with their locations and dimensions. As with any graphics workload, these algorithms benefit from data-level parallelism. Existing parallelization efforts focus strictly on mapping different divide-and-conquer strategies onto multicore CPUs and GPUs. However, even the most advanced single-chip many-core processors to date still struggle to handle real-time face detection effectively under high-definition video workloads. To address this challenge, face detection algorithms typically avoid computations by dynamically evaluating a boosted cascade of classifiers. Unfortunately, this technique yields low ALU occupancy on architectures such as GPUs, which rely heavily on large SIMD widths to maximize data-level parallelism. In this paper we present several techniques to increase the performance of the cascade evaluation kernel, which is the most resource-intensive part of the face detection pipeline. In particular, the use of concurrent kernel execution in combination with cascades generated with the GentleBoost algorithm solves the problem of GPU underutilization and achieves a 5X average speedup on 1080p videos over the fastest known implementations, while slightly improving accuracy. Finally, we also study the parallelization of the cascade training process and its scalability on SMP platforms. The proposed parallelization strategy exploits both task- and data-level parallelism and achieves a 3.5X speedup over single-threaded implementations.
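The low-occupancy problem described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `stages_evaluated` and `simd_utilization`, the per-stage scores and thresholds, and the lockstep execution model are all assumptions made for illustration. The sketch shows how early rejection in a cascade saves work for a single window, yet leaves SIMD lanes idle when windows that exit at different stages share the same lockstep group.

```python
def stages_evaluated(stage_scores, thresholds):
    """Count the cascade stages a window goes through before it is
    rejected; an accepted window goes through every stage.
    Scores and thresholds are hypothetical illustration values."""
    for index, (score, threshold) in enumerate(zip(stage_scores, thresholds)):
        if score < threshold:
            return index + 1  # rejected here: later stages are skipped
    return len(thresholds)


def simd_utilization(depths, width=32):
    """Fraction of issued lane-steps that do useful work when `width`
    windows execute in lockstep (assumed model: a group stays busy
    until its deepest window finishes, and finished lanes sit idle)."""
    useful = 0
    issued = 0
    for start in range(0, len(depths), width):
        group = depths[start:start + width]
        useful += sum(group)
        issued += max(group) * width
    return useful / issued if issued else 0.0


if __name__ == "__main__":
    thresholds = [0.5, 0.5, 0.5, 0.5]
    # Three windows rejected at the first stage, one accepted by all four.
    windows = [
        [0.1, 0.0, 0.0, 0.0],
        [0.2, 0.0, 0.0, 0.0],
        [0.3, 0.0, 0.0, 0.0],
        [0.9, 0.8, 0.7, 0.6],
    ]
    depths = [stages_evaluated(scores, thresholds) for scores in windows]
    print(depths, simd_utilization(depths, width=4))
```

Under this model a single deep window keeps the whole group occupied, which is the kind of underutilization the abstract attributes to dynamically evaluated cascades on wide-SIMD hardware. The default width of 32 matches the warp size of NVIDIA GPUs, but any width can be passed.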
Citation: Oro, D. [et al.]. Accelerating boosting-based face detection on GPUs. In: International Conference on Parallel Processing. "ICPP 2012: the 41st International Conference on Parallel Processing: Pittsburgh, Pennsylvania, USA, 10-13 September 2012". Pittsburgh, Pennsylvania: 2012, p. 309-318.
Publisher version: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6337592
Collections
- CAP - Grup de Computació d'Altes Prestacions - Ponències/Comunicacions de congressos [784]
- VEU - Grup de Tractament de la Parla - Ponències/Comunicacions de congressos [437]
- Departament d'Arquitectura de Computadors - Ponències/Comunicacions de congressos [1.986]
- Departament de Teoria del Senyal i Comunicacions - Ponències/Comunicacions de congressos [3.395]
Files | Description | Size | Format | View |
---|---|---|---|---|
Paper ICPP 2012.pdf | Paper ICPP 2012 | 769.2 KB | PDF | Restricted access |