Work-efficient parallel non-maximum suppression for embedded GPU architectures
dc.contributor.author | Oro Garcia, David |
dc.contributor.author | Fernandez Tena, Carles |
dc.contributor.author | Martorell Bofill, Xavier |
dc.contributor.author | Hernando Pericás, Francisco Javier |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions |
dc.date.accessioned | 2016-11-02T14:01:14Z |
dc.date.issued | 2016 |
dc.identifier.citation | Oro, D., Fernandez, C., Martorell, X., Hernando, J. Work-efficient parallel non-maximum suppression for embedded GPU architectures. In: IEEE International Conference on Acoustics, Speech, and Signal Processing. "2016 IEEE International Conference on Acoustics, Speech, and Signal Processing: proceedings: March 20-25, 2016: Shanghai International Convention Center: Shanghai, China". Shanghai: Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 1026-1030. |
dc.identifier.isbn | 978-1-4799-9988-0 |
dc.identifier.uri | http://hdl.handle.net/2117/91351 |
dc.description.abstract | With the emergence of GPU computing, deep neural networks have become a widely used technique for advancing research in the field of image and speech processing. In the context of object and event detection, sliding-window classifiers must choose the best among all positively discriminated candidate windows. In this paper, we introduce the first GPU-based non-maximum suppression (NMS) algorithm for embedded GPU architectures. The obtained results show that the proposed parallel algorithm reduces the NMS latency by a wide margin when compared to CPUs, even when clocking the GPU at 50% of its maximum frequency on an NVIDIA Tegra K1. We show results for object detection in images; the proposed technique is also directly applicable to speech segmentation tasks such as speaker diarization. |
dc.format.extent | 5 p. |
dc.language.iso | eng |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) |
dc.subject | UPC subject areas::Computer science::Information systems |
dc.subject | UPC subject areas::Computer science |
dc.subject.lcsh | Embedded computer systems |
dc.subject.lcsh | Information display systems |
dc.subject.other | Embedded systems |
dc.subject.other | Graphics processing units |
dc.subject.other | Parallel algorithms |
dc.subject.other | Work-efficient parallel nonmaximum suppression |
dc.subject.other | Embedded GPU architectures |
dc.subject.other | Image processing |
dc.subject.other | Speech processing |
dc.subject.other | Deep neural networks |
dc.subject.other | NMS latency |
dc.subject.other | Positively discriminated candidate windows |
dc.subject.other | Parallel algorithm |
dc.subject.other | NVIDIA Tegra K1 |
dc.subject.other | Speech segmentation tasks |
dc.subject.other | Speaker diarization |
dc.title | Work-efficient parallel non-maximum suppression for embedded GPU architectures |
dc.type | Conference report |
dc.subject.lemac | Sistemes incrustats (Informàtica) |
dc.subject.lemac | Visualització (Informàtica) |
dc.contributor.group | Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions |
dc.contributor.group | Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla |
dc.identifier.doi | 10.1109/ICASSP.2016.7471831 |
dc.description.peerreviewed | Peer Reviewed |
dc.relation.publisherversion | http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7471831 |
dc.rights.access | Restricted access - publisher's policy |
local.identifier.drac | 18765151 |
dc.description.version | Postprint (published version) |
dc.relation.projectid | info:eu-repo/grantAgreement/EC/H2020/644312/EU/Heterogeneous Secure Multi-level Remote Acceleration Service for Low-Power Integrated Systems and Devices/RAPID |
dc.date.lift | 10000-01-01 |
local.citation.author | Oro, D.; Fernandez, C.; Martorell, X.; Hernando, J. |
local.citation.contributor | IEEE International Conference on Acoustics, Speech, and Signal Processing |
local.citation.pubplace | Shanghai |
local.citation.publicationName | 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing: proceedings: March 20-25, 2016: Shanghai International Convention Center: Shanghai, China |
local.citation.startingPage | 1026 |
local.citation.endingPage | 1030 |