Search found 883 matches

by mgates3
Fri Sep 13, 2019 2:03 pm
Forum: User discussion
Topic: Is there a step by step tutorial on installing magma?
Replies: 1
Views: 76

Re: Is there a step by step tutorial on installing magma?

Are you following the steps using CMake in README-Windows? Can you be more specific about what happened and when the error occurred?
-mark
by mgates3
Thu Sep 05, 2019 12:50 am
Forum: User discussion
Topic: MAGMA on MAC OS
Replies: 1
Views: 301

Re: MAGMA on MAC OS

Yes, though since Apple has stopped supporting NVIDIA cards on MacOS Mojave (10.14), anything using CUDA is limited to MacOS 10.13 and older (as I understand it). Compiling MAGMA on MacOS is similar to compiling it on Linux. I use the gcc & gfortran compilers from HPC MacOS X: http://hpc.sourceforge.net/ Follow ...
by mgates3
Mon Sep 02, 2019 1:17 pm
Forum: User discussion
Topic: nvcc "command line is too long" on windows
Replies: 3
Views: 427

Re: nvcc "command line is too long" on windows

From a magma/build directory, try:

cmake -DCMAKE_WINDOWS_EXPORT_ALL_SYMBOLS=TRUE -DBUILD_SHARED_LIBS=TRUE ..
This assumes CMake >= 3.4.
See https://blog.kitware.com/create-dlls-on ... l-feature/

-mark
by mgates3
Wed Aug 28, 2019 10:19 am
Forum: User discussion
Topic: nvcc "command line is too long" on windows
Replies: 3
Views: 427

Re: nvcc "command line is too long" on windows

I haven't seen this, but I usually use the Makefile, not CMake. When I use CMake on Windows, I use MS Visual Studio (the free version works fine). I gather you run CMake to configure MAGMA. What build system do you use after configuring with CMake: Makefile, MS Visual Studio, ...?
-mark
by mgates3
Tue Aug 20, 2019 2:31 pm
Forum: User discussion
Topic: Pinned memory for diagonalization dsygvd (Divide and conquer)
Replies: 1
Views: 427

Re: Pinned memory for diagonalization dsygvd (Divide and conquer)

Because of the complexity of managing an array distributed across multiple GPUs, we don't currently have a version of sygvdx_m where the matrix is given on the GPUs (i.e., sygvdx_mgpu).

I do recommend trying the 2-stage version, dsygvdx_2stage_m, which is often faster.

-mark
by mgates3
Fri Aug 16, 2019 7:56 am
Forum: User discussion
Topic: Using sgemm for rectangular (non-square) matrix multiply
Replies: 7
Views: 7290

Re: Using sgemm for rectangular (non-square) matrix multiply

First, in the future, please start a new topic, rather than replying to a 7-year-old topic. Your variable names seem mixed up: there are M rows and N cols in C. You are adding beta*C multiple times. Normally I use i for rows, j for cols. Unfortunately k is used for dimension, so we need something el...
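
To illustrate the two points in the snippet above (C has M rows and N cols, and beta*C should be applied only once per entry), here is a minimal reference sketch of the GEMM update C = alpha*A*B + beta*C for rectangular matrices. It is not MAGMA code: the function name `gemm_ref` is hypothetical, the matrices are plain Python lists of rows rather than the column-major arrays that BLAS and MAGMA use, and the inner index name `p` is an arbitrary choice for this sketch.

```python
def gemm_ref(M, N, K, alpha, A, B, beta, C):
    """Reference C = alpha*A*B + beta*C, updating C in place.

    A is M x K, B is K x N, C is M x N, each stored as a list of rows.
    (Assumption for this sketch; BLAS/MAGMA use column-major storage.)
    """
    for i in range(M):          # i indexes rows of C
        for j in range(N):      # j indexes cols of C
            acc = 0.0
            for p in range(K):  # p runs over the shared dimension K
                acc += A[i][p] * B[p][j]
            # beta*C is applied exactly once, after the dot product is complete.
            C[i][j] = alpha * acc + beta * C[i][j]
    return C


# Rectangular example: A is 2 x 3, B is 3 x 1, C is 2 x 1.
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
B = [[1.0], [1.0], [1.0]]
C = [[10.0], [20.0]]
result = gemm_ref(2, 1, 3, 2.0, A, B, 0.5, C)
```

Accumulating the dot product in `acc` before touching `C[i][j]` is what keeps beta*C from being added on every pass through the inner loop.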
by mgates3
Mon Aug 12, 2019 12:26 pm
Forum: User discussion
Topic: ILP64 name-mangling
Replies: 4
Views: 493

Re: ILP64 name-mangling

On second glance, it doesn't appear that CFLAGS is propagated to CXXFLAGS by CMake, and it won't propagate to Fortran or NVCC for CUDA. Even worse, it doesn't seem that CMake picks up NVCCFLAGS from the environment, so there doesn't appear to be a way to override or append to it, without editing CMa...
by mgates3
Sun Aug 11, 2019 9:13 am
Forum: User discussion
Topic: ILP64 name-mangling
Replies: 4
Views: 493

Re: ILP64 name-mangling

That should work. Ideally, CMake would detect 64-bit BLAS, which we do in some other projects (BLAS++) but not yet in MAGMA.
-mark
by mgates3
Fri Aug 09, 2019 4:55 pm
Forum: User discussion
Topic: ILP64 name-mangling
Replies: 4
Views: 493

Re: ILP64 name-mangling

If you compiled without CMake, then editing magma_mangling.h as you describe should work. But as you noticed, CMake #defines its own mangling macro, called MAGMA_GLOBAL, in the file magma_mangling_cmake.h. The quickest hack would be to comment out magma_mangling_cmake (or #undef MAGMA_GLOBAL), an...
by mgates3
Mon Aug 05, 2019 12:55 pm
Forum: User discussion
Topic: Fault injection
Replies: 0
Views: 418

Re: Fault injection

That seems fine for testing. MAGMA doesn't do anything to detect or prevent faults from bit flips. If the bit happens to be low-order (like 0.00000001), then the difference will be negligible, but if the bit happens to be high-order (like 1.) or an exponent bit, the error is likely to be significant...
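
As a rough illustration of the low-order versus high-order distinction above, the following sketch flips a single bit in the IEEE 754 double-precision representation of a value and shows how the result depends on which bit is hit. It is independent of MAGMA; the helper name `flip_bit` and the choice of 1.0 as the test value are assumptions made for this example.

```python
import math
import struct


def flip_bit(x: float, bit: int) -> float:
    """Return x with one bit of its 64-bit IEEE 754 representation flipped.

    Bit 0 is the lowest mantissa bit, bits 52-62 are the exponent,
    and bit 63 is the sign.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit must be in the range 0..63")
    (as_int,) = struct.unpack("<Q", struct.pack("<d", x))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped


x = 1.0
low_mantissa = flip_bit(x, 0)    # changes x by 2**-52: negligible
high_mantissa = flip_bit(x, 51)  # 1.0 becomes 1.5: a 50% relative error
low_exponent = flip_bit(x, 52)   # 1.0 becomes 0.5
high_exponent = flip_bit(x, 62)  # 1.0 becomes inf
```

For x = 1.0, a flip in the lowest mantissa bit perturbs the value at the level of machine precision, while a flip in the top mantissa bit or in any exponent bit changes it by a large factor, or turns it into infinity.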