MAGMA 1.1 Released

Open discussion for MAGMA

Post by Stan Tomov » Fri Nov 18, 2011 4:52 pm


Re: MAGMA 1.1 Released

Post by maranhao » Thu Nov 24, 2011 8:03 pm

The release announcement (http://icl.cs.utk.edu/magma/news/news.html?id=278) seems to indicate that version 1.1 can perform QR factorization on a single GPU as well as on multiple GPUs, but the MAGMA 1.1 computational routines table (http://icl.cs.utk.edu/projectsfiles/magma/magma-routines-2.png) lists QR factorization as a CPU-only function. Which is it?

Re: MAGMA 1.1 Released

Post by Stan Tomov » Fri Nov 25, 2011 12:05 am

"CPU" does not mean that only CPUs are used - it means "CPU interface": the input data and the output result are expected to be in CPU memory. "GPU", or "GPU interface", means that the input matrix and the output are both in GPU memory. In either case both the GPUs and the CPUs are used.
For QR, if you have more than one GPU, you can set the environment variable MAGMA_NUM_GPUS to the number of GPUs you would like to use. For example, setting
setenv MAGMA_NUM_GPUS 4

will result in using 4 GPUs in subsequent calls to magma_{s,d,c,z}geqrf.
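
The setenv line above is csh syntax; in a Bourne-style shell such as bash the equivalent is export MAGMA_NUM_GPUS=4. As a rough illustration of how a program can pick up such a setting, here is a minimal C sketch. It is not MAGMA's own code: parse_gpu_count and gpu_count_from_env are hypothetical helper names, and falling back to a default on a missing or malformed value is an assumption, since the thread does not say how MAGMA treats those cases.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not part of MAGMA): parse a GPU-count string such as "4".
   Returns fallback when the text is missing, malformed, or not positive. */
int parse_gpu_count(const char *text, int fallback)
{
    if (text == NULL || *text == '\0')
        return fallback;

    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);

    /* Reject trailing junk ("4x"), out-of-range values, and counts below 1. */
    if (errno != 0 || *end != '\0' || value < 1 || value > INT_MAX)
        return fallback;
    return (int)value;
}

/* Hypothetical helper (not part of MAGMA): read the GPU count from the
   named environment variable, e.g. "MAGMA_NUM_GPUS". */
int gpu_count_from_env(const char *name, int fallback)
{
    assert(name != NULL);
    /* getenv returns NULL when the variable is unset; the parser handles that. */
    return parse_gpu_count(getenv(name), fallback);
}
```

With MAGMA_NUM_GPUS set to 4 in the environment, gpu_count_from_env("MAGMA_NUM_GPUS", 1) would return 4; with the variable unset it would return the fallback of 1.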

Re: MAGMA 1.1 Released

Post by maranhao » Sat Nov 26, 2011 1:57 pm

Thanks Stan. If that is the case, it would seem that all input data to the magma_sgeqrf functions has to reside in CPU memory, but in testing_sgeqrf_gpu.cpp it appears to me that d_A is in device memory. So does SGEQRF also have a GPU interface, in which case the "Computation Routines in Magma 1.1" table I linked earlier needs to be updated?

Also, what is the difference between sgeqrf_gpu, sgeqrf2_gpu, and sgeqrf3_gpu?

Re: MAGMA 1.1 Released

Post by Stan Tomov » Sun Nov 27, 2011 2:14 am

Yes, this is a typo - QR has both CPU and GPU interfaces. Thanks for pointing this out. We will fix it.

Regarding the different versions, sgeqrf2_gpu is LAPACK-consistent in terms of input and output data layout. The sgeqrf_gpu version stores the triangular matrices used in the factorization. sgeqrf3_gpu also stores the triangular matrices, and in addition modifies the storage for the Householder vectors used in the factorization: 0s are put in the upper triangular parts of the panels, 1s on the diagonal, and the upper triangular parts are stored separately. See also this discussion topic.
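
To make the sgeqrf3_gpu description concrete, here is a small C sketch of that storage change applied to a single panel in CPU memory. It is only an illustration of the layout described above, not MAGMA's implementation: split_panel is a hypothetical helper, and the column-major panel with R on and above the diagonal and the Householder vectors below it is assumed to be in the usual LAPACK geqrf layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a MAGMA routine.
   panel: m x nb, column-major, leading dimension ldp, in LAPACK geqrf layout
          (R on and above the diagonal, Householder vectors below it).
   r:     nb x nb, column-major, leading dimension ldr; receives the upper
          triangle of the panel, with zeros below its diagonal.
   On return the panel holds explicit Householder vectors:
   0 above the diagonal, 1 on the diagonal, original entries below. */
void split_panel(float *panel, size_t m, size_t nb, size_t ldp,
                 float *r, size_t ldr)
{
    assert(m >= nb && ldp >= m && ldr >= nb);

    for (size_t j = 0; j < nb; ++j) {
        for (size_t i = 0; i < nb; ++i) {
            if (i <= j) {
                /* Save the R entry, then overwrite it with 0 or 1. */
                r[i + j * ldr] = panel[i + j * ldp];
                panel[i + j * ldp] = (i == j) ? 1.0f : 0.0f;
            } else {
                /* Strictly lower part of R is zero; the panel entry is kept. */
                r[i + j * ldr] = 0.0f;
            }
        }
    }
}
```

For a 3 x 2 panel with columns (10, 2, 3) and (4, 50, 6), the call leaves the panel as columns (1, 2, 3) and (0, 1, 6) and stores the upper triangle with columns (10, 0) and (4, 50) in r. Having the 0s and 1s stored explicitly means the vectors can be used as an ordinary dense matrix without special-casing the triangle.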

