Search found 8 matches

by wsawyer
Thu Jan 25, 2018 10:14 am
Forum: User discussion
Topic: MAGMA 2.3.0: Tests segfaulting
Replies: 0
Views: 1408

MAGMA 2.3.0: Tests segfaulting

I've installed MAGMA versions in the past w/o problems, but 2.3.0 is recalcitrant... Overview: CUDA8.0, Pascal cards, Cray wrappers (CC and ftn). gcc/5.3.0 (also tried 6.1.0) dom101/apps/daint/UES/6.0.UP04/sandboxes/wsawyer/magma-2.3.0> srun nvidia-smi Thu Jan 25 15:00:18 2018 +----------------------...
by wsawyer
Mon Dec 05, 2011 5:18 am
Forum: User discussion
Topic: Possible to run magma kernel on only one thread-block?
Replies: 8
Views: 3839

Re: Possible to run magma kernel on only one thread-block?

Stan, Thanks for your answer. The first part (the user would provide the computation streams as input) would be an entirely reasonable way to go. We only constrained the problem to one multiprocessor for simplicity's sake. So if I understand correctly, since cublas_v2 allows streams already, it ...
by wsawyer
Mon Dec 05, 2011 5:02 am
Forum: User discussion
Topic: Possible to run magma kernel on only one thread-block?
Replies: 8
Views: 3839

Re: Possible to run magma kernel on only one thread-block?

Dear Nostra, I just finished integrating a blocking factor into the QR code, so now we can look at systems of very large size. The code is part of CUSP, and therefore it is publicly available. It has, however, been written by a student and is not very clean. The performance reaches an asymptote of about 3...
by wsawyer
Thu Nov 17, 2011 9:16 am
Forum: User discussion
Topic: Possible to run magma kernel on only one thread-block?
Replies: 8
Views: 3839

Re: Possible to run magma kernel on only one thread-block?

Stan, I am only now getting back to this subject. A student wrote a CUDA implementation to perform QR on multiple (small) matrices with variable size, with each matrix mapped to a separate thread block. The performance increases dramatically from 1 to 10 matrices and reaches an asymptote at about 50...
by wsawyer
Thu Jul 28, 2011 5:24 am
Forum: User discussion
Topic: Possible to run magma kernel on only one thread-block?
Replies: 8
Views: 3839

Possible to run magma kernel on only one thread-block?

Our application requires a large number of independent least squares minimizations, which result in QR decompositions of varying size, generally not more than 200x100. This is the low end of performance if we try to utilize the whole GPU. We would like to invoke multiple magma_sgeqrf each occupying ...
by wsawyer
Thu Jul 28, 2011 5:17 am
Forum: User discussion
Topic: MAGMA 1.0rc5 with cuda/4.0
Replies: 1
Views: 1652

MAGMA 1.0rc5 with cuda/4.0

I am now trying to move all our software to cuda/4.0. I've noticed that at least one testing routine does not compile: gcc -O3 -DADD_ -DGPUSHMEM=130 -fPIC -nofor_main -Xlinker -zmuldefs -DGPUSHMEM=130 testing_zgemm.o -o testing_zgemm lin/liblapacktest.a -L../lib \ -lcuda -lmagma -lmagmablas -lmagma ...
by wsawyer
Tue Jun 22, 2010 10:06 am
Forum: User discussion
Topic: MAGMA 0.2 does not seem to work with CUDA 3.0
Replies: 2
Views: 8568

Re: MAGMA 0.2 does not seem to work with CUDA 3.0

Thanks for the reply. Here are some more details: OS: Novell SUSE Linux Enterprise Server 11 Operating System; GPUs: 11xGTX285, 4xTesla; MKL: 11.1; Compilers: Intel icc/11.1, ifort/11.1. sm_20 seems to apply only to Fermi cards and is therefore inappropriate for our GTX285 and Tesla cards. We tried compiling with ...
by wsawyer
Wed Jun 09, 2010 2:47 am
Forum: User discussion
Topic: MAGMA 0.2 does not seem to work with CUDA 3.0
Replies: 2
Views: 8568

MAGMA 0.2 does not seem to work with CUDA 3.0

We have successfully tested MAGMA 0.2 with CUDA 2.3, but have since upgraded to CUDA 3.0. MAGMA 0.2 does not seem to be compatible with 3.0. We can post error messages on request.

Will this problem be addressed in MAGMA 0.3? When will that be released?

Thanks much!

--Will and Vincenzo