Batched QR and SVD algorithms on GPUs with applications in hierarchical matrix compression

Wajih Halim Boukaram; George Turkiyyah; Hatem Ltaief; David E. Keyes

doi:10.1016/j.parco.2017.09.001

Journal article

Peer reviewed

Parallel Computing, Vol. 74, pp. 19-33

05/2018

DOI: https://doi.org/10.1016/j.parco.2017.09.001

Keywords: Batched operations; Compression; GPU; Hierarchical; SVD

Highlights

• High performance GPU hosted batched QR decomposition kernels are developed and outperform current implementations for small and rectangular matrices.
• Various GPU hosted batched singular value decomposition kernels are developed and used as building blocks of a batched randomized SVD kernel for numerically low rank matrix blocks.
• Batched QR, SVD, and GEMM kernels are used to compress hierarchical matrices entirely on the GPU.

Abstract

We present high performance implementations of the QR and the singular value decomposition of a batch of small matrices hosted on the GPU with applications in the compression of hierarchical matrices. The one-sided Jacobi algorithm is used for its simplicity and inherent parallelism as a building block for the SVD of low rank blocks using randomized methods. We implement multiple kernels based on the level of the GPU memory hierarchy in which the matrices can reside and show substantial speedups against streamed cuSOLVER SVDs. The resulting batched routine is a key component of hierarchical matrix compression, opening up opportunities to perform H-matrix arithmetic efficiently on GPUs.
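
For readers unfamiliar with the one-sided Jacobi algorithm named in the abstract, the following is a minimal CPU sketch in plain Python. It is not the authors' GPU kernel: the function names `one_sided_jacobi_svd` and `batched_svd`, the tolerance, and the sweep limit are all assumptions chosen for illustration, and the "batch" is simply a serial loop over small matrices. The method repeatedly applies plane rotations to pairs of columns until all columns are mutually orthogonal; the column norms are then the singular values.

```python
import math

def one_sided_jacobi_svd(A, tol=1e-12, max_sweeps=30):
    """One-sided (Hestenes) Jacobi SVD of an m x n list-of-lists A.

    Returns (U, sigma, V) with A ~= U * diag(sigma) * V^T.
    Singular values are NOT sorted. Illustrative sketch only.
    """
    m, n = len(A), len(A[0])
    G = [[float(x) for x in row] for row in A]  # working copy, G = A * V
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # Squared column norms and the inner product of columns p, q.
                alpha = sum(G[i][p] * G[i][p] for i in range(m))
                beta = sum(G[i][q] * G[i][q] for i in range(m))
                gamma = sum(G[i][p] * G[i][q] for i in range(m))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # columns already orthogonal to tolerance
                rotated = True
                # Rotation angle that zeroes the inner product of the pair.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same rotation to columns p, q of G and of V.
                for M in (G, V):
                    for row in M:
                        rp, rq = row[p], row[q]
                        row[p] = c * rp - s * rq
                        row[q] = s * rp + c * rq
        if not rotated:
            break  # a full sweep without rotations: converged
    sigma = [math.sqrt(sum(G[i][j] ** 2 for i in range(m))) for j in range(n)]
    U = [[G[i][j] / sigma[j] if sigma[j] > 0.0 else 0.0 for j in range(n)]
         for i in range(m)]
    return U, sigma, V

def batched_svd(batch):
    """Hypothetical 'batched' driver: a serial loop over independent small matrices."""
    return [one_sided_jacobi_svd(A) for A in batch]
```

Each column pair is processed independently of the data outside that pair, which is the kind of parallelism the abstract refers to; on a GPU, the matrices in the batch and the column pairs within a matrix could be distributed across thread blocks and threads, whereas this sketch runs everything sequentially.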

Details

Title: Batched QR and SVD algorithms on GPUs with applications in hierarchical matrix compression
Creators - without role: Wajih Halim Boukaram - King Abdullah University of Science and Technology
George Turkiyyah - American University of Beirut
Hatem Ltaief - King Abdullah University of Science and Technology
David E. Keyes - King Abdullah University of Science and Technology
Publication Details: Parallel Computing, Vol. 74, pp. 19-33
Publisher: Elsevier B.V.
Identifiers: 9941364408331
Academic Unit: King Abdullah University of Science & Technology
Language: English
Resource Type: Journal article