GPU-accelerated sparse matrix-vector product for a hybridizable discontinuous Galerkin method
dc.contributor.author | Roca Navarro, Francisco Javier |
dc.contributor.author | Nguyen, N.C. |
dc.contributor.author | Peraire Guitart, Jaume |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament de Matemàtica Aplicada III |
dc.date.accessioned | 2011-12-01T09:15:52Z |
dc.date.available | 2011-12-01T09:15:52Z |
dc.date.created | 2011 |
dc.date.issued | 2011 |
dc.identifier.citation | Roca, X.; Nguyen, N.; Peraire, J. GPU-accelerated sparse matrix-vector product for a hybridizable discontinuous Galerkin method. In: AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition. "49th AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition". Orlando, Florida: 2011, p. 1-12. |
dc.identifier.uri | http://hdl.handle.net/2117/14128 |
dc.description.abstract | The iterative solution of the large systems of equations that result from discontinuous Galerkin (DG) discretizations requires the ability to carry out fast matrix-vector products. DG matrices have a sparse block structure with a constant number of non-zero, equal-sized, non-overlapping blocks per row. General-purpose sparse matrix-vector product algorithms are not designed to exploit the specific structure of DG matrices and, as a consequence, deliver sub-optimal performance. To address this issue, we propose a sparse matrix-vector product for DG discretizations based on a dense tensor contraction. A GPU implementation of the proposed algorithm for a hybridizable discontinuous Galerkin (HDG) method is tested on the NVIDIA GeForce GTX 285. The results show that the tensor contraction performs at about 20 to 25 GFLOP/s in double precision with a sustained efficiency of more than 40% (60 GBytes/s) of the peak memory bandwidth (160 GBytes/s). Moreover, for HDG matrices in double precision, the proposed method is 2 times faster than the general sparse matrix-vector products provided by the GPU library CUSPARSE and about 30 times faster than MATLAB running on a CPU. |
dc.format.extent | 12 p. |
dc.language.iso | eng |
dc.subject | UPC subject areas::Mathematics and statistics::Applied mathematics for the sciences |
dc.subject.lcsh | Galerkin methods |
dc.title | GPU-accelerated sparse matrix-vector product for a hybridizable discontinuous Galerkin method |
dc.type | Conference report |
dc.subject.lemac | Mètodes de Galerkin (Galerkin methods) |
dc.contributor.group | Universitat Politècnica de Catalunya. LACÀN - Mètodes Numèrics en Ciències Aplicades i Enginyeria |
dc.description.peerreviewed | Peer Reviewed |
dc.rights.access | Restricted access - publisher's policy |
local.identifier.drac | 5755158 |
dc.description.version | Postprint (published version) |
local.citation.author | Roca, X.; Nguyen, N.; Peraire, J. |
local.citation.contributor | AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition |
local.citation.pubplace | Orlando, Florida |
local.citation.publicationName | 49th AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition |
local.citation.startingPage | 1 |
local.citation.endingPage | 12 |
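
The abstract describes matrices with a constant number of equal-sized non-zero blocks per row, and a matrix-vector product expressed as a dense contraction over those blocks. The following is a minimal CPU sketch of that idea in plain Python, not the paper's GPU implementation. The function names (`block_matvec`, `dense_matvec`), the storage layout (`blocks[i][k]` for the k-th block of block row i, `cols[i][k]` for its block-column index), and the sample data are all assumptions made for illustration.

```python
def block_matvec(blocks, cols, x, m):
    """Return y = A x for a block-sparse A stored as dense m-by-m blocks.

    blocks[i][k] : k-th non-zero m-by-m block of block row i (hypothetical layout)
    cols[i][k]   : block-column index of that block
    x            : flat input vector, length = (number of block columns) * m
    """
    n_rows = len(blocks)
    y = [0.0] * (n_rows * m)
    for i in range(n_rows):                  # block rows are independent of each other
        for k, j in enumerate(cols[i]):      # same number of blocks in every row
            block = blocks[i][k]
            xj = x[j * m:(j + 1) * m]        # slice of x multiplied by this block
            for a in range(m):               # dense m-by-m block times m-vector
                y[i * m + a] += sum(block[a][b] * xj[b] for b in range(m))
    return y


def dense_matvec(blocks, cols, x, m, n_block_cols):
    """Reference result: expand A to a full dense matrix, then multiply."""
    n_rows = len(blocks)
    dense = [[0.0] * (n_block_cols * m) for _ in range(n_rows * m)]
    for i in range(n_rows):
        for k, j in enumerate(cols[i]):
            for a in range(m):
                for b in range(m):
                    dense[i * m + a][j * m + b] = blocks[i][k][a][b]
    return [sum(row[c] * x[c] for c in range(len(x))) for row in dense]


# Illustrative data: 3 block rows, 2 blocks per row, block size m = 2.
example_blocks = [
    [[[1, 2], [3, 4]], [[0, 1], [1, 0]]],    # block row 0: block columns 0 and 1
    [[[2, 0], [0, 2]], [[1, 1], [1, 1]]],    # block row 1: block columns 1 and 2
    [[[1, 0], [0, 1]], [[5, 6], [7, 8]]],    # block row 2: block columns 0 and 2
]
example_cols = [[0, 1], [1, 2], [0, 2]]
example_x = [1, 2, 3, 4, 5, 6]

print(block_matvec(example_blocks, example_cols, example_x, 2))
# prints [9.0, 14.0, 17.0, 19.0, 62.0, 85.0]
```

Because every block row holds the same number of blocks of the same size, `blocks` is a regular four-index array and `cols` a regular two-index array, so no per-row pointer array is needed, unlike general compressed sparse row storage. How the paper maps this contraction onto GPU threads and memory is not stated in this record and is not assumed here.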