Search found 266 matches

by Stan Tomov
Thu Feb 28, 2013 1:55 pm
Forum: User discussion
Topic: Variable stack allocations, Microsoft Visual Studio 2010
Replies: 4
Views: 3258

Re: Variable stack allocations, Microsoft Visual Studio 2010

Tom,
It will be in 1.4. We are incorporating some new eigensolvers for 1.4, and along with that we will include bug fixes and other small changes (to be released within a month).
Stan
by Stan Tomov
Tue Feb 26, 2013 8:32 pm
Forum: User discussion
Topic: Functions implemented in magmablas interface (MAGMA 1.3.0)
Replies: 1
Views: 1489

Re: Functions implemented in magmablas interface (MAGMA 1.3.

This is correct. At some point we had implementations of all the routines in the header, but as CUBLAS improved they were no longer needed and were removed. In general, we prefer not to maintain a complete BLAS implementation, only certain routines that are important for MAGMA and that we see how to accelerate (and they ...
by Stan Tomov
Tue Feb 26, 2013 8:21 pm
Forum: User discussion
Topic: About lack of documentation in some interfaces in MAGMA 1.3.0
Replies: 1
Views: 1603

Re: About lack of documentation in some interfaces in MAGMA 1

Thanks for these documentation bug reports! We fixed them in SVN. Regarding the hvector question, you are right, it is in CPU memory. An example of how to use the routine is in testing_dsgeqrsv_gpu.cpp. This routine tests mixed precision, as well as working precision solvers like the magma_...
by Stan Tomov
Tue Feb 26, 2013 8:01 pm
Forum: User discussion
Topic: testing_*blas core
Replies: 1
Views: 1083

Re: testing_*blas core

Hello, I can confirm there is an issue. There is a bug in the error checking: the size of the pivot array piv allocated in err = magma_malloc_cpu( (void**) &piv, N*sizeof(magma_int_t) ); assert( err == 0 ); should have been Ak, as in err = magma_malloc_cpu( (void**) &piv, Ak*sizeof(ma...
by Stan Tomov
Tue Feb 26, 2013 7:31 pm
Forum: User discussion
Topic: Variable stack allocations, Microsoft Visual Studio 2010
Replies: 4
Views: 3258

Re: Variable stack allocations, Microsoft Visual Studio 2010

Thanks for pointing this out! We will fix it for the next release.
by Stan Tomov
Tue Feb 26, 2013 7:28 pm
Forum: User discussion
Topic: Testing bugs
Replies: 1
Views: 1250

Re: Testing bugs

Thanks for pointing this out. We received other user requests and feedback on the testers, and we redesigned them. They now look more uniform and support a -h option to print help. Different options give the user a way to specify specific sizes, ranges, error checking, etc. This will be available wit...
by Stan Tomov
Tue Feb 26, 2013 7:14 pm
Forum: User discussion
Topic: Deploying apps which use Magma
Replies: 3
Views: 1402

Re: Deploying apps which use Magma

The ultimate goal is for the user to specify which hardware to use (with the default being to use everything available). This is not possible right now, but we keep adding functionality that would make it possible, e.g., we released clMAGMA for enabling MAGMA use on any accelerator through...
by Stan Tomov
Tue Oct 23, 2012 2:43 am
Forum: User discussion
Topic: Installation error with MAGMA1.2.1
Replies: 1
Views: 1930

Re: Installation error with MAGMA1.2.1

Try linking with these libraries:

LIB       = -lmkl_gf_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lcublas -lm -fopenmp
or consult the MKL link advisor at http://software.intel.com/sites/products/mkl/
Stan
by Stan Tomov
Thu Sep 13, 2012 10:25 am
Forum: User discussion
Topic: Problem with testing_zgesv
Replies: 12
Views: 9067

Re: Problem with testing_zgesv

Upon further investigation we found that the problem is with Intel's compiler. This CUDA release note summarizes the issue: There is a known bug in ICC with respect to passing 16-byte aligned types by value to GCC-built code such as the CUDA Toolkit libraries (e.g., CUBLAS). At this time, passing a ...
by Stan Tomov
Wed Sep 05, 2012 11:02 am
Forum: User discussion
Topic: wishlist
Replies: 4
Views: 2602

Re: wishlist

This routine becomes more compute intensive when eigenvectors are needed. In that case most of the flops are in gemm, and that is the part that is GPU accelerated.
Stan
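
To give a rough sense of the gemm cost mentioned above, here is a minimal sketch. It assumes the usual count of 2*m*n*k flops for multiplying an m-by-k matrix by a k-by-n matrix. The names gemm_flops and naive_gemm are hypothetical helpers for illustration only, not MAGMA or CUBLAS routines, and the loop is a plain CPU reference, not how the GPU kernel is written.

```c
#include <stddef.h>

/* Hypothetical helper (not a MAGMA/BLAS API): flop count for C = A*B,
   A is m-by-k, B is k-by-n; one multiply and one add per inner step. */
double gemm_flops(size_t m, size_t n, size_t k)
{
    return 2.0 * (double)m * (double)n * (double)k;
}

/* Hypothetical reference gemm on column-major arrays: computes C = A*B
   and returns the number of flops it actually performed. */
double naive_gemm(size_t m, size_t n, size_t k,
                  const double *A, const double *B, double *C)
{
    double flops = 0.0;
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p) {
                sum += A[i + p * m] * B[p + j * k];  /* 1 multiply + 1 add */
                flops += 2.0;
            }
            C[i + j * m] = sum;
        }
    }
    return flops;
}
```

The count grows cubically with the matrix dimension while the data grows only quadratically, which is why a step dominated by gemm is a natural candidate for GPU acceleration.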