Search found 911 matches

by mgates3
Wed May 02, 2012 7:35 pm
Forum: User discussion
Topic: magma_zgetri_gpu segfaults in OpenMP parallel for
Replies: 14
Views: 10919

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Yes, we're working on moving to the newer CUBLAS v2 API, which provides thread safety and better support for running multiple kernels in parallel.
-mark
by mgates3
Wed May 02, 2012 7:27 pm
Forum: User discussion
Topic: MAGMA 1.0
Replies: 16
Views: 19280

Re: MAGMA 1.0

See CUBLAS, provided by NVIDIA. It implements all Level 1, 2, and 3 BLAS functions, including syrk. The MAGMA BLAS library supplements CUBLAS only when we need functionality not covered by BLAS (e.g., LACPY) or when we have developed a faster implementation.
-mark
by mgates3
Wed May 02, 2012 7:18 pm
Forum: User discussion
Topic: potrs : does it do host-gpu memory transfers
Replies: 2
Views: 1751

Re: potrs : does it do host-gpu memory transfers

Yes, magma_potrs_gpu does data transfers between the CPU and the GPU. It factors the matrix by blocks. Each diagonal block is copied to the CPU, factored there, then copied back to the GPU. On the GPU, the rest of the panel below the diagonal block is updated. If your system is tridiagonal, you will...
by mgates3
Wed May 02, 2012 7:02 pm
Forum: User discussion
Topic: MAGMA on windows or linux
Replies: 6
Views: 3309

Re: MAGMA on windows or linux

I think you need to add LAPACK. The ATLAS library provides the BLAS routines, not the higher-level LAPACK routines. For what it's worth, here is how I link a generic program with ATLAS (but not using MAGMA). Your libraries may differ. # -lifcore resolves undefined reference to `for_write_seq_fmt' # ...
by mgates3
Wed May 02, 2012 4:27 pm
Forum: User discussion
Topic: Magma compiling with MPF failed
Replies: 2
Views: 2337

Re: Magma compiling with MPF failed

MAGMA does not replace the existing BLAS and LAPACK libraries on the CPU. It supplements them with functions to run on the GPU. In fact, MAGMA routines use both the existing BLAS on the CPU (e.g., ATLAS, MKL) and BLAS on the GPU (e.g., CUBLAS, MAGMABLAS). Certainly you could write a wrapper to call ...
by mgates3
Wed May 02, 2012 4:07 pm
Forum: User discussion
Topic: Problem building Magma 1.1.0
Replies: 3
Views: 1946

Re: Problem building Magma 1.1.0

What version of CUDA are you using? Versions 4.0 and 4.1 both define cublasStatus_t in cublas.h. If you can't upgrade, a workaround is to define it yourself, as John suggests.
-mark
by mgates3
Wed May 02, 2012 3:53 pm
Forum: User discussion
Topic: Dynamic library (.so) compilation on Linux
Replies: 3
Views: 3814

Re: Dynamic library (.so) compilation on Linux

You can safely use the cublas functions. The magmablas functions are faster in some cases but otherwise provide the same functionality. magmablas_dtrsm should be defined in magmablas/dtrsm_tesla.cu.
-mark
by mgates3
Wed May 02, 2012 3:31 pm
Forum: User discussion
Topic: Problem with "magma_zhegvx" function
Replies: 3
Views: 2366

Re: Problem with "magma_zhegvx" function

No, that's the correct function to use. I didn't realize that function had only the complex version.
-mark
by mgates3
Wed May 02, 2012 3:14 pm
Forum: User discussion
Topic: clMAGMA 0.1 Beta Released
Replies: 8
Views: 19962

Re: clMAGMA 0.1 Beta Released

All four precisions (single, double, single complex, double complex) are available in clMAGMA for QR and Cholesky. However, LU factorization is currently available only in single precision. We intend to add support for the other precisions but don't have a definite timetable yet.
-mark
by mgates3
Tue Apr 03, 2012 1:28 pm
Forum: User discussion
Topic: Tridiagonal Solver
Replies: 3
Views: 3150

Re: Tridiagonal Solver

MAGMA does not have tridiagonal (or banded) solvers for the GPU. My guess is that the tridiagonal solver in LAPACK (dgtsv or dptsv) on the CPU is faster than transferring a tridiagonal matrix to the GPU, solving, and transferring the results back. This is because a tridiagonal solve has O(n) operations o...
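To make the O(n) operation count concrete, here is a minimal sketch of the Thomas algorithm for a tridiagonal system. It is an illustration only, not the LAPACK routine: dgtsv uses partial pivoting, while this version does no pivoting and so assumes a matrix for which that is safe (for example, diagonally dominant). The function name tridiag_solve and its argument layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Solve a tridiagonal system with the Thomas algorithm (no pivoting).
 *   dl[0..n-2]  sub-diagonal
 *   d [0..n-1]  diagonal            (overwritten)
 *   du[0..n-2]  super-diagonal
 *   b [0..n-1]  right-hand side     (overwritten with the solution)
 * Each loop makes one pass over the n rows, so the cost is O(n).
 * Returns 0 on success, -1 if a zero pivot is encountered. */
int tridiag_solve(size_t n, const double *dl, double *d,
                  const double *du, double *b)
{
    assert(n > 0);

    /* Forward elimination: remove the sub-diagonal entry of each row. */
    for (size_t i = 1; i < n; ++i) {
        if (d[i - 1] == 0.0) return -1;
        double m = dl[i - 1] / d[i - 1];
        d[i] -= m * du[i - 1];
        b[i] -= m * b[i - 1];
    }
    if (d[n - 1] == 0.0) return -1;

    /* Back substitution, from the last row up to the first. */
    b[n - 1] /= d[n - 1];
    for (size_t i = n - 1; i-- > 0; ) {
        b[i] = (b[i] - du[i] * b[i + 1]) / d[i];
    }
    return 0;
}
```

The solve touches each of the roughly 3n matrix entries a constant number of times, so the arithmetic is on the same order as the data that would have to cross the bus to the GPU and back.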