Use MAGMA asynchronously with CUDA streams for Cholesky app?

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)

Post by judits » Tue May 27, 2014 10:45 am


I'm testing a CUDA framework that I'm developing, and I'd like to use MAGMA to implement a blocked Cholesky factorization, so I need the dpotrf, dtrsm, dgemm and dsyrk kernels. For my experiments, these functions must be asynchronous, and I must be able to choose the CUDA stream they are launched on, since I will later synchronize using CUDA events. Ideally, the kernels should run exclusively on the GPU.

However, I read in a previous post that the dpotrf function is synchronous because it runs partially on the CPU, and that setting the CUDA stream in MAGMA is not thread-safe. Is this still true for the latest MAGMA release?

I know I can use CUBLAS for dtrsm, dgemm and dsyrk (they run asynchronously in the CUDA stream that I set), but I also need dpotrf. Is it possible to use the MAGMA kernels in the way I need?

Posts: 1
Joined: Tue May 27, 2014 9:29 am

Re: Use MAGMA asynchronously with CUDA streams for Cholesky

Post by mgates3 » Tue May 27, 2014 4:59 pm

Yes, most MAGMA functions, including dpotrf, are hybrid: they do some work on the CPU (for dpotrf, the panel factorization, dpotf2), so they won't be asynchronous. MAGMA does have a Cholesky panel, magma_dpotf2_gpu, that runs completely on the GPU, and you could use it to build a dpotrf that runs completely on the GPU. Panel operations tend to be slow on the GPU, though, so the entire factorization may end up slower.

You can set MAGMA's stream beforehand. However, if other threads are also setting MAGMA's stream, you would currently need to modify dpotf2 to take a stream argument in order to be thread-safe. Eventually, the MAGMA API will change so that a stream is passed into each function.
Posts: 782
Joined: Fri Jan 06, 2012 2:13 pm
