Search found 906 matches

by mgates3
Fri Mar 20, 2020 1:16 pm
Forum: User discussion
Topic: MPI+MAGMA
Replies: 1
Views: 82

Re: MPI+MAGMA

Let me see if I understand you correctly. MAGMA doesn't have any MPI. So you are calling MAGMA for node-local computations on each node, and you are doing your own MPI communication. And in this context, MAGMA running in one MPI rank is giving the wrong result. Right? CUDA counts GPUs from 0. You ca...
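Since CUDA numbers GPUs from 0, one common pattern when each MPI rank drives its own GPU is to map the node-local rank onto a device index round-robin. The sketch below is an assumption for illustration and is not taken from the reply; `device_for_rank` is a hypothetical helper. In a real program `num_devices` would typically come from `cudaGetDeviceCount`, and the result would be passed to `cudaSetDevice` or `magma_setdevice`.

```c
#include <assert.h>

/* Hypothetical helper: map a node-local MPI rank to a CUDA device index.
   CUDA numbers devices 0 .. num_devices-1, so ranks wrap around round-robin. */
int device_for_rank(int local_rank, int num_devices)
{
    assert(local_rank >= 0);
    assert(num_devices > 0);
    return local_rank % num_devices;
}
```

With 4 GPUs per node, local ranks 0 to 3 get devices 0 to 3, and local rank 5 wraps around to device 1.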
by mgates3
Thu Mar 05, 2020 11:49 pm
Forum: User discussion
Topic: Building MAGMA on Windows
Replies: 1
Views: 84

Re: Building MAGMA on Windows

This is a known issue with CMake on Windows (see issue below). We have not yet found a feasible workaround.
https://gitlab.kitware.com/cmake/cmake/ ... ote_642545

Mark
by mgates3
Thu Mar 05, 2020 11:04 pm
Forum: User discussion
Topic: magma_dgels_gpu error when trying to compile
Replies: 1
Views: 69

Re: magma_dgels_gpu error when trying to compile

Can you give more specifics, such as:
version of MAGMA (e.g., MAGMA 2.5.2)
C/C++ compiler & version (e.g., g++ 9.2.0)
CUDA version (e.g., CUDA 10.0)
Is this an error compiling MAGMA, or an error compiling your application using MAGMA? If it's your application, please include a minimum code example th...
by mgates3
Mon Mar 02, 2020 11:08 am
Forum: User discussion
Topic: Limitations on precision
Replies: 5
Views: 168

Re: Limitations on precision

Are you using MAGMA's testers to test these, e.g., testing/testing_zheevd? Which specific routine are you using? If using MAGMA's tester, can you share the complete input & output that is concerning you? We generally check the relative backward error,
|| A - U S U^H ||_1 / ( || A ||_1 N )
MAGMA's t...
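To make the backward-error check concrete, here is a minimal sketch for the real symmetric case, where U^H reduces to U^T. It is not MAGMA's tester code: `norm1` and `backward_error` are hypothetical names, and the matrices are assumed to be small, dense, and stored column-major.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* 1-norm of an n-by-n column-major matrix: maximum absolute column sum. */
static double norm1(const double *M, size_t n)
{
    double best = 0.0;
    for (size_t j = 0; j < n; ++j) {
        double colsum = 0.0;
        for (size_t i = 0; i < n; ++i)
            colsum += fabs(M[i + j * n]);
        if (colsum > best)
            best = colsum;
    }
    return best;
}

/* Relative backward error || A - U S U^T ||_1 / ( || A ||_1 * n ).
   A, U: n-by-n column-major; s: the n eigenvalues (diagonal of S);
   work: caller-provided scratch of n*n doubles, overwritten with the residual. */
double backward_error(const double *A, const double *U, const double *s,
                      double *work, size_t n)
{
    assert(n > 0);
    double norm_a = norm1(A, n);
    assert(norm_a > 0.0);
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < n; ++i) {
            double usu = 0.0;  /* entry (i,j) of U S U^T */
            for (size_t k = 0; k < n; ++k)
                usu += U[i + k * n] * s[k] * U[j + k * n];
            work[i + j * n] = A[i + j * n] - usu;
        }
    }
    return norm1(work, n) / (norm_a * (double)n);
}
```

For an accurate decomposition the result should be a small multiple of machine epsilon; a value many orders of magnitude larger points to a wrong eigenpair or a wrong input.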
by mgates3
Wed Jan 08, 2020 2:27 pm
Forum: User discussion
Topic: Request: dgges, dtrsyl
Replies: 1
Views: 255

Re: Request: dgges, dtrsyl

Thanks for your input. We'll keep it in mind for future developments.
-mark
by mgates3
Tue Jan 07, 2020 1:43 pm
Forum: User discussion
Topic: Best library for O(100k) linear system
Replies: 2
Views: 385

Re: Best library for O(100k) linear system

SLATE is a good choice for a distributed solver with CPUs or CPUs + GPUs. Using CPUs + GPUs requires that the distributed matrix fit in the combined memory of all the GPUs. For a single node, MAGMA will also do out-of-GPU-memory algorithms for such large matrices. Just call magma_dge...
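As a rough back-of-the-envelope check of the memory requirement, the sketch below computes the storage for a dense matrix and compares it with the combined GPU memory. The helper names are hypothetical, and the estimate ignores workspace, pivots, and right-hand sides, so treat it as a lower bound.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to store a dense n-by-n matrix with elem_size-byte entries. */
uint64_t dense_matrix_bytes(uint64_t n, uint64_t elem_size)
{
    /* Guard against 64-bit overflow of n * n * elem_size. */
    assert(n == 0 || elem_size <= UINT64_MAX / n / n);
    return n * n * elem_size;
}

/* Returns 1 if the matrix alone fits in the combined memory of the GPUs, else 0.
   Assumption: workspace and other buffers are not counted. */
int fits_in_gpu_memory(uint64_t n, uint64_t elem_size,
                       uint64_t num_gpus, uint64_t bytes_per_gpu)
{
    assert(num_gpus == 0 || bytes_per_gpu <= UINT64_MAX / num_gpus);
    return dense_matrix_bytes(n, elem_size) <= num_gpus * bytes_per_gpu;
}
```

For example, a 100,000 x 100,000 double-precision matrix needs 80,000,000,000 bytes (80 GB). Taking a hypothetical 16 GiB per GPU, that does not fit on 4 GPUs but does fit on 8.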
by mgates3
Tue Jan 07, 2020 1:37 pm
Forum: User discussion
Topic: ILP64 name-mangling
Replies: 8
Views: 1787

Re: ILP64 name-mangling

You can name-mangle the functions in magma/testing/lin/.
In magma/testing/Makefile.sr, see variable liblapacktest_src for the 36 or so files that are used. The other files in that directory were copied from LAPACK but aren't used.

-mark
by mgates3
Mon Jan 06, 2020 2:43 pm
Forum: User discussion
Topic: ILP64 name-mangling
Replies: 8
Views: 1787

Re: ILP64 name-mangling

[sdcz]qpt01 are LAPACK testing functions. They would not be in the LAPACK library, per se, but in LAPACK's testing. MAGMA has its own copy of them:
>> pfind -i qpt01 lapack
lapack/TESTING/LIN/cqpt01.f
lapack/TESTING/LIN/dqpt01.f
lapack/TESTING/LIN/sqpt01.f
lapack/TESTING/LIN/zqpt01.f
>> pfind -i qpt...
by mgates3
Thu Dec 19, 2019 4:26 pm
Forum: User discussion
Topic: Multiple hybrid gpu linear solver
Replies: 1
Views: 229

Re: Multiple hybrid gpu linear solver

magma_cgesv works on multiple GPUs; it calls magma_cgetrf, which calls magma_cgetrf_m. However, as you observe, the forward and back solves (getrs) are on the CPU. It's unclear if multi-GPU pivoting (laswp) and triangular solves (trsm) in getrs would benefit from the GPU, since there would be signif...
by mgates3
Thu Nov 07, 2019 2:47 pm
Forum: User discussion
Topic: MAGAMA routines and CUDA kernels
Replies: 4
Views: 1777

Re: MAGAMA routines and CUDA kernels

Yes, magma_dmalloc is just a wrapper around cudaMalloc. It is type-safe (you don't need to use sizeof(double) as you do with cudaMalloc), but otherwise there is nothing special going on. If you call asynchronous MAGMA routines that take a magma_queue, use the stream from the magma_queue to call CUDA function...
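The idea of a typed wrapper over a byte-oriented allocator can be sketched as follows. This is not MAGMA's source: `raw_alloc` and `alloc_doubles` are hypothetical names, and host `malloc` stands in for `cudaMalloc` so the example runs without a GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a byte-oriented allocator such as cudaMalloc.
   Returns 0 on success, nonzero on failure. */
static int raw_alloc(void **ptr, size_t bytes)
{
    *ptr = malloc(bytes);
    return *ptr != NULL ? 0 : 1;
}

/* Typed wrapper: the caller passes an element count, not a byte count,
   so sizeof(double) appears only here. Returns 0 on success. */
int alloc_doubles(double **ptr, size_t n)
{
    assert(ptr != NULL);
    *ptr = NULL;
    if (n > SIZE_MAX / sizeof(double))
        return 1;  /* n * sizeof(double) would overflow */
    void *p = NULL;
    int err = raw_alloc(&p, n * sizeof(double));
    *ptr = (double *)p;
    return err;
}
```

A caller writes `alloc_doubles(&x, 4)` rather than computing `4 * sizeof(double)` by hand, which removes one common source of size mistakes.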