Search found 893 matches

by mgates3
Fri Jan 20, 2012 6:50 pm
Forum: User discussion
Topic: Multicore and MultiGPU use of MAGMA
Replies: 13
Views: 8957

Re: Multicore and MultiGPU use of MAGMA

Hi John, Both of these goals -- running multiple problems simultaneously on a single GPU and running MAGMA distributed across several GPUs on several nodes -- are of interest to the MAGMA team. We are moving in that direction, with MAGMA 1.1 adding support for streams in the MAGMA BLAS and support f...
by mgates3
Fri Jan 20, 2012 6:42 pm
Forum: User discussion
Topic: Building problems
Replies: 2
Views: 3152

Re: Building problems

Those variables (blockIdx, etc.) are CUDA specific, as is the kernel launch <<< >>> notation. In order to compile CUDA specific code, you need to use the CUDA nvcc compiler instead of gcc. CUDA source code files should have the extension .cu instead of .c or .cpp, to differentiate them. The testi...
by mgates3
Sat Jan 14, 2012 11:47 am
Forum: User discussion
Topic: Magma 1.1.0 building problems w/ testing
Replies: 2
Views: 2593

Re: Magma 1.1.0 building problems w/ testing

The omp_get_num_threads error comes from missing OpenMP, I believe, judging from a quick web search. Unfortunately, the Intel MKL libraries vary between different versions. Using gcc and MKL, I currently use: LIB = -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lpthread -lcublas -lm -fopenmp and LIBDIR = -L${MKLROOT}/lib...
by mgates3
Sat Jan 14, 2012 11:28 am
Forum: User discussion
Topic: Does MAGMA1.0 & Above support Geforce 9series?
Replies: 4
Views: 3426

Re: Does MAGMA1.0 & Above support Geforce 9series?

Notice the Fortran compile command is gibberish: gfortran -O3 -DADD_ -x f95-cpp-input -Dmagma_devptr_t="integer(kind=../control/sizeptr.c: In function ‘main’: ../control/sizeptr.c:6:3: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 2 has type ‘unsigned int’ 4)" ... It should lo...
by mgates3
Fri Jan 13, 2012 3:00 pm
Forum: User discussion
Topic: new to magma
Replies: 1
Views: 1628

Re: new to magma

Most of the MAGMA routines are self-documented; you can read their documentation in the files section of the doxygen documentation at http://icl.cs.utk.edu/magma/docs/ or in the source code itself. We don't yet have an updated manual. As for eigenvalues, there are syevd and heevd routines for the symmetric/Hermit...
by mgates3
Fri Jan 13, 2012 12:40 pm
Forum: User discussion
Topic: Does MAGMA1.0 & Above support Geforce 9series?
Replies: 4
Views: 3426

Re: Does MAGMA1.0 & Above support Geforce 9series?

See http://developer.nvidia.com/cuda-gpus for a list of CUDA-enabled GPUs. MAGMA supports GPUs of both CUDA compute capability 1.x and 2.x (Fermi). Set the appropriate value in the make.inc file (GPU_TARGET=0 for 1.x, GPU_TARGET=1 for 2.x). However, only GPUs with compute capability 1.3 and above support double p...
by mgates3
Fri Jan 06, 2012 4:52 pm
Forum: User discussion
Topic: where is the realization of magmablas_sgemm?
Replies: 4
Views: 3643

Re: where is the realization of magmablas_sgemm?

I think you mean you found the declaration in include/magmablas_s.h
The actual code is in several files matching magmablas/sgemm*.cu

-mark
by mgates3
Fri Jan 06, 2012 4:47 pm
Forum: User discussion
Topic: problem with magma_dgetrf_gpu
Replies: 1
Views: 941

Re: problem with magma_dgetrf_gpu

According to http://developer.nvidia.com/cuda-gpus, the GeForce GT 220 has CUDA compute capability 1.2, which doesn't support double precision. Double precision was added in compute capability 1.3. Does the single precision magma_sgetrf routine work?

-mark
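
To make the rule above concrete, here is a minimal sketch of the check. The function name and the (major, minor) representation are hypothetical and are not part of MAGMA or the CUDA toolkit; only the 1.3 threshold comes from the post.

```python
# Hypothetical helper for illustration only; not a MAGMA or CUDA API.
def supports_double_precision(major, minor):
    """Return True if the version major.minor is 1.3 or higher."""
    # Tuple comparison orders versions correctly: (1, 2) < (1, 3) < (2, 0).
    return (major, minor) >= (1, 3)


print(supports_double_precision(1, 2))  # GeForce GT 220 per the post: False
print(supports_double_precision(1, 3))  # True
print(supports_double_precision(2, 0))  # Fermi: True
```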
by mgates3
Fri Jan 06, 2012 4:39 pm
Forum: User discussion
Topic: dgemv returns wrong results
Replies: 1
Views: 1215

Re: dgemv returns wrong results

I'm not sure that I understand your question. Doing gemv( A, x ) yields A*x. Doing potrf( A ) then potrs( A, x ) yields A^{-1}*x, or in Matlab notation, A\x. These two, A*x and A\x, should not be the same. Can you clarify what your inputs to each function are, and what the expected output is?

-mark
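
The difference between A*x and A\x can be seen on a small example. The sketch below is plain Python, not MAGMA or LAPACK: the functions only borrow the names gemv, potrf and potrs, and unlike the real routines they return new lists instead of overwriting their arguments. The 2x2 matrix and the vector are made-up test data.

```python
import math


def gemv(A, x):
    """Matrix-vector product A*x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def potrf(A):
    """Cholesky factor L (lower triangular) with A = L * L^T.

    Assumes A is symmetric positive definite.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s / L[j][j]
    return L


def potrs(L, b):
    """Solve A*x = b given the Cholesky factor L of A."""
    n = len(L)
    # Forward substitution: L * y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T * x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


A = [[4.0, 2.0], [2.0, 3.0]]  # symmetric positive definite test matrix
x = [1.0, 2.0]

product = gemv(A, x)            # A*x, equal to [8.0, 8.0]
solution = potrs(potrf(A), x)   # A\x, approximately [-0.125, 0.75]
print(product, solution)
```

Multiplying A by the solution recovers x up to rounding, which is a quick way to check that a potrf/potrs pair is being called correctly.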
by mgates3
Fri Jan 06, 2012 4:24 pm
Forum: User discussion
Topic: Workspace
Replies: 1
Views: 1009

Re: Workspace

To achieve any reasonable performance, MAGMA requires using a blocked algorithm, which requires using a workspace based on the block size. In some cases, MAGMA uses a larger block size or otherwise needs more workspace than LAPACK. Incidentally, the LAPACK performance will also greatly increase if t...