MAGMA provides implementations for CUDA, Intel Xeon Phi, and OpenCL. The latest releases are MAGMA 2.0, MAGMA MIC 1.4.0, and clMAGMA 1.3, respectively. The libraries available for download are listed below in the order of their release dates.

MAGMA 2.0 release

MAGMA 2.0 is now available. This release includes a major interface change for all MAGMA BLAS functions; most higher-level functions, such as magma_zgetrf, keep their existing interface. Significant changes:

  • Added queue argument to magmablas routines, and deprecated magmablas{Set,Get}KernelStream. This resolves a thread safety issue with using global magmablas{Set,Get}KernelStream.
  • Fixed bugs related to relying on CUDA NULL stream implicit synchronization.
  • Fixed memory leaks (zunmqr_m, zheevdx_2stage, etc.). Added -DDEBUG_MEMORY option to catch leaks.
  • Fixed geqrf*_gpu bugs for m == nb, n >> m (ex: -N 64,10000); and m >> n, n == nb+i (ex: -N 10000,129).
  • Fixed zunmql2_gpu for rectangular sizes.
  • Fixed zhegvdx_m itype 3.
  • Added zunglq, zungbr, zgeadd2 (which takes both alpha and beta).   

MAGMA sparse

  • Added QMR, TFQMR, preconditioned TFQMR
  • Added CGS, preconditioned CGS
  • Added kernel-fused versions for CGS/PCGS, QMR, TFQMR/PTFQMR
  • Changed relative stopping criterion to be relative to RHS
  • Fixed bug in complex version of CG
  • Accelerated version of Jacobi-CG
  • Added very efficient IDR
  • Performance tuning for SELLP SpMV
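
The stopping criterion relative to the RHS, mentioned in the list above, can be illustrated with a small conjugate gradient solver. This is only a minimal sketch in plain Python, not MAGMA's implementation: the function names (cg, matvec, dot), the rtol parameter, and the choice of the 2-norm are assumptions made for illustration. The iteration stops once ||r|| <= rtol * ||b||, so rescaling the right-hand side does not change when the solver stops.

```python
import math


def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def cg(A, b, rtol=1e-10, max_iter=1000):
    """Conjugate gradient for a symmetric positive definite matrix A.

    Hypothetical helper for illustration only. Stops when the residual
    satisfies ||r||_2 <= rtol * ||b||_2, i.e. relative to the RHS.
    Returns the approximate solution and the number of iterations taken.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual b - A*x for the zero initial guess
    p = list(r)            # first search direction
    rr = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0        # zero RHS: the zero vector is the exact solution
    for it in range(max_iter):
        if math.sqrt(rr) <= rtol * b_norm:
            return x, it
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter
```

With an absolute criterion (||r|| <= tol), multiplying b by a large factor would demand more iterations for the same tolerance; with the relative form, cg(A, b) and cg(A, [1e6 * v for v in b]) stop after the same number of steps.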

2.0.1 (Feb 26, 2016)

  • Fixes a minor issue with "make install". No other source code changes from 2.0.0.

2.0.2 (May 2, 2016)

  • Adds MAGMA_NO_V1 option to disable MAGMA v1.x compatibility.
  • Updates the testers to use MAGMA v2.0. 
  • Fixes use of NULL stream when using MAGMA v1.x interface.

Please take this survey to help improve MAGMA, LAPACK, and other dense linear algebra libraries. It should take about 10 minutes to complete. Thank you very much.

magma-2.0.2.tar.gz   Download View License


MAGMA MIC 1.4.0 is now available. This release provides implementations of MAGMA's one-sided (LU, QR, and Cholesky) and two-sided (Hessenberg, bi- and tridiagonal reductions) dense matrix factorizations, as well as linear and eigenproblem solvers, for Intel Xeon Phi Coprocessors. More information on the approach is given in this presentation.

magmamic-1.4.0.tar.gz   Download View License

clMAGMA 1.3

clMAGMA is an OpenCL port of MAGMA. It supports AMD GPUs. The clMAGMA library dependencies, in particular optimized GPU OpenCL BLAS and CPU-optimized BLAS and LAPACK for AMD hardware, can be found in the AMD clMath Libraries (formerly APPML).

Included in the clMAGMA 1.3 release are routines for the following algorithms:

  • LU, QR, and Cholesky factorizations in both real and complex arithmetic (single and double);
  • Linear and least squares solvers based, respectively, on the LU/Cholesky and QR factorizations in both real and complex arithmetic (single and double);
  • Reductions to Hessenberg, bidiagonal, and tridiagonal forms using orthogonal similarity transformations in both real and complex arithmetic (single and double);
  • Eigen and singular value problem solvers in both real and complex arithmetic (single and double);
  • Orthogonal transformation routines.

clmagma-1.3.0.tar.gz   Download View License

MAGMA 2.1 release

MAGMA 2.1 for CUDA is now available. New features and updates include:

  • Variable size batched routines (gemm, gemv, syrk, syr2k).
  • Improved SVD performance for tall (m >> n) or wide (m << n) matrices.
  • Preconditioned QMR.
  • Expanded doxygen documentation.
  • For MAGMA v1 compatibility, initializes default queue for each GPU on first use, instead of in magma_init.
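
To make the idea of a variable size batched routine concrete, the following plain-Python reference sketches what a variable-size batched gemm computes: one C = alpha*A*B + beta*C update per problem, where every problem in the batch may have its own dimensions. This is a conceptual sketch only. The names gemm and gemm_vbatched and their argument order are hypothetical and do not reproduce the MAGMA API, which runs such batches on the GPU.

```python
def gemm(alpha, A, B, beta, C):
    """Return alpha*A*B + beta*C for matrices stored as lists of rows.

    Hypothetical reference helper; dimensions are taken from the inputs.
    """
    m, k, n = len(A), len(B), len(B[0])
    if any(len(row) != k for row in A):
        raise ValueError("inner dimensions of A and B do not match")
    if len(C) != m or any(len(row) != n for row in C):
        raise ValueError("C must have shape m x n")
    return [
        [alpha * sum(A[i][l] * B[l][j] for l in range(k)) + beta * C[i][j]
         for j in range(n)]
        for i in range(m)
    ]


def gemm_vbatched(alpha, As, Bs, beta, Cs):
    """Apply gemm to every (A, B, C) triple in the batch.

    Unlike a fixed-size batch, each triple may have different m, n, k.
    """
    return [gemm(alpha, A, B, beta, C) for A, B, C in zip(As, Bs, Cs)]


# A batch of two problems with different shapes: (1x2)*(2x1) and (2x2)*(2x2).
As = [[[1.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]]]
Bs = [[[3.0], [4.0]], [[5.0, 6.0], [7.0, 8.0]]]
Cs = [[[1.0]], [[0.0, 0.0], [0.0, 0.0]]]
results = gemm_vbatched(2.0, As, Bs, 1.0, Cs)
```

A fixed-size batched routine would require all problems to share one (m, n, k); the variable-size form removes that restriction, which suits workloads made of many small, unequal matrices.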

magma-2.1.0.tar.gz   Download View License


