Search found 276 matches

by Stan Tomov
Wed Dec 04, 2019 12:36 am
Forum: User discussion
Topic: magma_init returns MAGMA_SUCCESS with no GPU
Replies: 1
Views: 465

Re: magma_init returns MAGMA_SUCCESS with no GPU

One of the functions of magma_init() is to determine how many devices are out there. If there are none, the number of devices is initialized as 0. The code that checks this looks like this: err = cudaGetDeviceCount( &g_magma_devices_cnt ); if ( err != 0 && err != cudaErrorNoDevice ) { info = MAGMA_E...
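The check described in this reply can be sketched in plain C as follows. This is only an illustration of the logic, not the MAGMA source: `SKETCH_OK`, `SKETCH_ERR_NO_DEVICE` and `sketch_init_devices` are hypothetical stand-ins for the CUDA runtime values and the device query, chosen so the example compiles without CUDA.

```c
/* Hypothetical stand-ins for the CUDA runtime status codes (assumed values). */
enum { SKETCH_OK = 0, SKETCH_ERR_NO_DEVICE = 1 };

/*
 * Mimics the decision described above: a "no device" answer is not an
 * error, it just leaves the device count at 0. Any other failure is fatal.
 * query_err : status returned by the (stand-in) device-count query
 * reported  : count the query reported, meaningful only on SKETCH_OK
 * count     : output, number of usable devices
 * Returns 0 if initialization may proceed, -1 on a real failure.
 */
int sketch_init_devices(int query_err, int reported, int *count)
{
    if (query_err != SKETCH_OK && query_err != SKETCH_ERR_NO_DEVICE) {
        *count = 0;
        return -1;
    }
    *count = (query_err == SKETCH_ERR_NO_DEVICE) ? 0 : reported;
    return 0;
}
```

Under this logic a successful initialization does not by itself mean a GPU is present, so a caller that needs one would check the device count afterwards rather than rely on the return code alone.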
by Stan Tomov
Thu Nov 21, 2019 8:58 pm
Forum: User discussion
Topic: GPU_TARGET selection affects performance?
Replies: 2
Views: 546

Re: GPU_TARGET selection affects performance?

MAGMA queries the GPU architecture through CUDA function calls, and tunes the code based on that. Thus, tuning is not based on the specified GPU_TARGET. GPU_TARGET is used for the compilation to generate code that is compatible with various GPUs. A disadvantage of specifying all is longer compilatio...
by Stan Tomov
Fri Dec 28, 2018 1:34 pm
Forum: User discussion
Topic: (d/s)potrf_batched has some kind of memory leak
Replies: 2
Views: 790

Re: (d/s)potrf_batched has some kind of memory leak

Thank you for reporting it.
The leak has been fixed; you can update from Bitbucket.
by Stan Tomov
Tue Oct 16, 2018 3:15 pm
Forum: User discussion
Topic: Create distributed matrix on gpus with no cpu to gpu copy
Replies: 9
Views: 2069

Re: Create distributed matrix on gpus with no cpu to gpu copy

The routine described is actually in MAGMA, called dgegqr_gpu version 4, along with a few other versions also described there. You can test them and see how they are called, e.g., with ./testing_dgegqr_gpu --version 4 -N 10000,64 -c The implementation itself is very simple using the magma building b...
by Stan Tomov
Sun Oct 14, 2018 11:53 pm
Forum: User discussion
Topic: Create distributed matrix on gpus with no cpu to gpu copy
Replies: 9
Views: 2069

Re: Create distributed matrix on gpus with no cpu to gpu copy

The magma_dpotrf_mgpu function requires that the matrix is distributed among the GPUs in 1D block cyclic way, where nb is obtained by magma_get_dpotrf_nb(n). The magma_dsetmatrix_1D_col_bcyclic function is just one example on how one can get to this distribution starting from CPU memory. If you alre...
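For readers who already have the data on the GPUs, the 1D block-cyclic column layout mentioned here can be sketched with a little index arithmetic. This is a minimal illustration assuming 0-based column indices, block width `nb`, and `ngpu` devices; `col_location`, `locate_column` and `local_column_count` are hypothetical helper names, not MAGMA routines.

```c
/* Where a global column lives in a 1D block-cyclic column distribution. */
typedef struct {
    int dev;        /* device that owns the column           */
    int local_col;  /* column index within that device's part */
} col_location;

/* Block k = j / nb goes to device k % ngpu, as that device's (k / ngpu)-th block. */
col_location locate_column(int j, int nb, int ngpu)
{
    int block = j / nb;
    col_location loc;
    loc.dev = block % ngpu;
    loc.local_col = (block / ngpu) * nb + j % nb;
    return loc;
}

/* Number of columns of an n-column matrix that end up on device dev. */
int local_column_count(int n, int nb, int ngpu, int dev)
{
    int nblocks = n / nb;                 /* number of full blocks      */
    int cols = (nblocks / ngpu) * nb;     /* complete rounds of blocks  */
    int rem = nblocks % ngpu;             /* full blocks in last round  */
    if (dev < rem)
        cols += nb;                       /* one extra full block       */
    else if (dev == rem)
        cols += n % nb;                   /* trailing partial block     */
    return cols;
}
```

For example, with n = 10, nb = 2 and two GPUs, device 0 holds block columns 0, 2 and 4 (6 columns) and device 1 holds block columns 1 and 3 (4 columns).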
by Stan Tomov
Fri Oct 12, 2018 5:46 pm
Forum: User discussion
Topic: Matrix formats for lobpcg
Replies: 4
Views: 1372

Re: Matrix formats for lobpcg

We can easily enable SpMM for any sparse matrix, e.g., just do SpMV for each vector. However, special optimizations for SpMM can do much better than SpMV. We have done this only for the Magma_SELLP format, and have disabled the others - just as a reminder to add them (but we can also enable the slow ...
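The "SpMV for each vector" fallback mentioned here can be sketched on the CPU as follows. This is an illustration only, assuming a CSR matrix and column-major dense blocks; `csr_spmv` and `csr_spmm_naive` are hypothetical names and do not correspond to MAGMA's sparse API.

```c
#include <stddef.h>

/* y = A * x for a CSR matrix A with nrows rows. */
void csr_spmv(int nrows, const int *rowptr, const int *colind,
              const double *val, const double *x, double *y)
{
    for (int i = 0; i < nrows; ++i) {
        double sum = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; ++k)
            sum += val[k] * x[colind[k]];
        y[i] = sum;
    }
}

/*
 * Y = A * X computed as one SpMV per column of X.
 * X is ncols x nvec and Y is nrows x nvec, both stored column-major.
 */
void csr_spmm_naive(int nrows, int ncols, int nvec,
                    const int *rowptr, const int *colind, const double *val,
                    const double *X, double *Y)
{
    for (int v = 0; v < nvec; ++v)
        csr_spmv(nrows, rowptr, colind, val,
                 X + (size_t)v * ncols, Y + (size_t)v * nrows);
}
```

This version reads the whole matrix once per vector, which is one reason a dedicated SpMM kernel that reuses each matrix entry across all vectors can do much better.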
by Stan Tomov
Wed Sep 19, 2018 10:09 am
Forum: User discussion
Topic: Batched dpotri?
Replies: 2
Views: 2046

Re: Batched dpotri?

Hi Rene, This sounds very good! We are interested in any improvement in these kernels. I will contact you with the procedure for contributing it - we will have some software engineering requirements and we have to see if the code can be generalized (and tuned easily for other sizes and precisions)....
by Stan Tomov
Thu Jul 12, 2018 7:09 pm
Forum: User discussion
Topic: Magma Example gives wrong results
Replies: 3
Views: 1060

Re: Magma Example gives wrong results

Dear Martin,

Very good! So the problem was not in MAGMA?

Yes, we recommend using MAGMA. We support multiple right-hand sides for CG
and have a highly optimized SpMM product that is used in this case! Tell us how it goes.

Best regards,
Stan
by Stan Tomov
Mon Sep 18, 2017 1:11 pm
Forum: User discussion
Topic: magma_dgeqp3 parameter 2 incorrect
Replies: 4
Views: 1133

Re: magma_dgeqp3 parameter 2 incorrect

You can just recompile - remove testing_dgeqp3 and testing_dgeqp3.o, and do make testing_dgeqp3 to see how it is compiled and linked through the magma testers. The way you have it is a little different but still worked on the system that I tested. Compiling and linking the same way indeed sounds lik...
by Stan Tomov
Fri Sep 15, 2017 10:39 pm
Forum: User discussion
Topic: magma_dgeqp3 parameter 2 incorrect
Replies: 4
Views: 1133

Re: magma_dgeqp3 parameter 2 incorrect

This is interesting. I took your example and the way you compile and link it,
and managed to run the code without a problem. I would have guessed there is
a problem with MKL, but you say the testing_dgeqp3 in magma-2.2.0 runs fine.
Stan