Search found 271 matches

by Stan Tomov
Tue Sep 23, 2014 3:03 pm
Forum: User discussion
Topic: MAGMA on ARM
Replies: 5
Views: 5038

Re: MAGMA on ARM

Hi Rob, Thanks for the info. We are targeting a mid-November release, but can provide specific routines in advance if you want to test. There were no problems with the compilation - we just used LAPACK with the reference BLAS for ARM, with a make.inc looking like this: #/////////////////////////...
by Stan Tomov
Sat Sep 20, 2014 8:22 pm
Forum: User discussion
Topic: MAGMA on ARM
Replies: 5
Views: 5038

Re: MAGMA on ARM

Hi, We have been able to compile on ARM, and in particular on the TK1 development board that you also mentioned. We now also compile directly on the TK1, and everything works out of the box. Performance can be further optimized, though, and we are developing a MAGMA Embedded version to address th...
by Stan Tomov
Wed Jun 04, 2014 2:41 pm
Forum: User discussion
Topic: linking with intel mkl
Replies: 2
Views: 2678

Re: linking with intel mkl

MAGMA requires linking with these libraries:

Code:

-lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lcublas -lcudart -lm -liomp5
How is your linking different?
Functions like '_intel_fast_memcpy' are in libirc, so adding '-lirc' may help. Usually this is included automatically by the compiler.
Stan
by Stan Tomov
Tue Apr 22, 2014 6:26 pm
Forum: User discussion
Topic: zgemm for matrices that don't fit
Replies: 2
Views: 1725

Re: zgemm for matrices that don't fit

You can try the new NVIDIA cuBLAS-XT library. See:
https://developer.nvidia.com/cublasxt
Stan
by Stan Tomov
Fri Sep 13, 2013 1:14 am
Forum: User discussion
Topic: the error when I compile magma1.4.0 in vs2010
Replies: 3
Views: 2550

Re: the error when I compile magma1.4.0 in vs2010

You should be able to specify compiler options in vs2010, but I haven't used it recently and I am not sure exactly where. Another way is to edit magma_types.h, e.g., by adding the following after the include statements:

Code:

#define HAVE_CUBLAS
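
If the define is added by hand, a guarded form avoids a redefinition when -DHAVE_CUBLAS is also passed on the command line. The following is only a minimal sketch in plain C: the helper have_cublas_defined is hypothetical and not part of MAGMA, and it exists just to show that the macro is visible after the guard.

```c
#include <stdio.h>

/* Guarded definition: does nothing if HAVE_CUBLAS was already
 * defined, e.g. through a -DHAVE_CUBLAS compiler option. */
#ifndef HAVE_CUBLAS
#define HAVE_CUBLAS
#endif

/* Hypothetical helper (not a MAGMA function): returns 1 when the
 * macro is visible at this point in the translation unit, else 0. */
int have_cublas_defined(void)
{
#ifdef HAVE_CUBLAS
    return 1;
#else
    return 0;
#endif
}
```

Placing the guard right after the include statements, as suggested above, means every later use of the macro in the header sees it.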
by Stan Tomov
Mon Sep 09, 2013 9:20 am
Forum: User discussion
Topic: the error when I compile magma1.4.0 in vs2010
Replies: 3
Views: 2550

Re: the error when I compile magma1.4.0 in vs2010

HAVE_CUBLAS has to be defined, e.g., by adding -DHAVE_CUBLAS to the compiler options. On Linux, the compiler options are set in Makefile.internal based on user input from make.inc.
by Stan Tomov
Thu Aug 22, 2013 5:55 pm
Forum: User discussion
Topic: Error: BLAS/LAPACK routine 'magma_' gave error code -7
Replies: 2
Views: 4042

Re: Error: BLAS/LAPACK routine 'magma_' gave error code -7

I see that the work space is indeed not large enough. You have

Code:

lwork = max( lwork, max( nb, 2*nb*nb ));
but it should be

Code:

lwork = max( lwork, max( n_col*nb, 2*nb*nb ));
Alternatively, you could have called magma_dgeqrf directly with a workspace size query (instead of lapackf77_dgeqrf).
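
To make the difference between the two formulas concrete, here is a small sketch in plain C. It assumes that n_col is the number of matrix columns and nb is the block size, which the post does not spell out. The helper names max_int and required_lwork are hypothetical and not part of MAGMA or LAPACK.

```c
#include <assert.h>

/* Larger of two ints; stands in for the max() used in the post. */
static int max_int(int a, int b)
{
    return a > b ? a : b;
}

/*
 * Hypothetical helper (not a MAGMA routine): applies the corrected
 * lower bound from the post to a workspace size.
 *   lwork - size obtained so far, e.g. from a workspace query
 *   n_col - assumed here to be the number of matrix columns
 *   nb    - assumed here to be the block size
 */
int required_lwork(int lwork, int n_col, int nb)
{
    assert(n_col >= 0 && nb >= 0);
    return max_int(lwork, max_int(n_col * nb, 2 * nb * nb));
}
```

For example, with n_col = 1000 and nb = 128 the corrected bound is 128000 entries, while the original formula, max( nb, 2*nb*nb ), gives only 32768.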
by Stan Tomov
Sat Jun 22, 2013 10:33 am
Forum: User discussion
Topic: magma-1.4.0-beta1 does not compile...
Replies: 3
Views: 1963

Re: magma-1.4.0-beta1 does not compile...

Sorry about this - we use release generation scripts and we had a bug there. To fix it, please add

Code:

.DEFAULT_GOAL :=
at the end of the file Makefile.internal. Thanks.
by Stan Tomov
Fri Jun 14, 2013 1:46 pm
Forum: User discussion
Topic: GPU interface to dgetrf with streams
Replies: 4
Views: 3889

Re: GPU interface to dgetrf with streams

Austin, I see. This sounds good. We have been asked by users to provide this type of stream interface, so any experimental results on performance would be very useful for us to know. I can check with NVIDIA developers if routine arguments are always sent asynchronously or if there are cases that the...
by Stan Tomov
Thu Jun 13, 2013 11:18 pm
Forum: User discussion
Topic: GPU interface to dgetrf with streams
Replies: 4
Views: 3889

Re: GPU interface to dgetrf with streams

The current code uses stream 0 for the GPU BLAS, which does not allow concurrent BLAS execution on the GPU from different threads. Regarding communication, I think magmablas_dpermute_long2s does not use synchronous communication. The routine does not have any explicit communication, only ...