magma_dgetrf_gpu. CUDA runtime error


Post by dalal » Sat Mar 22, 2014 1:52 pm

Hello everyone,
I am trying to use magma_dgetrf_gpu, but I got a lot of CUDA runtime errors:
My code:
magma_int_t *ipiv; int info; min_nm = n;
magma_malloc_cpu((void**) &ipiv, n*sizeof(magma_int_t));
magmablas_dlacpy( MagmaUpperLower, n, n, d_U, n32, d_Utmp, n32 );
magma_dgetrf_gpu( n, n, d_Utmp, n32, ipiv, &info);
printf("\n info %d \n", info);

The following errors are repeated many times:
3.0
CUDA runtime error: unspecified launch failure (4) in magma_dgetrf_gpu at dgetrf_gpu.cpp:170
CUBLAS error: memory mapping error (11) in magma_dgetrf_gpu at dgetrf_gpu.cpp:172
CUDA runtime error: unspecified launch failure (4) in magma_dgetrf_gpu at dgetrf_gpu.cpp:188
CUBLAS error: memory mapping error (11) in magma_dgetrf_gpu at dgetrf_gpu.cpp:195
CUDA runtime error: unspecified launch failure (4) in magma_dgetrf_gpu at dgetrf_gpu.cpp:199

Am I missing anything in my code? Could you please help me fix this?
Thanks in advance
dalal
 
Posts: 9
Joined: Thu Feb 20, 2014 4:30 am

Re: magma_dgetrf_gpu. CUDA runtime error

Post by mgates3 » Mon Mar 24, 2014 12:59 pm

It's impossible to say, since your code snippet is incomplete. That is, you don't include allocating the d_U and d_Utmp arrays, nor the definition of n32. Usually such CUDA errors occur when the matrix is accessed out-of-bounds, which happens if the matrix allocated doesn't match the size given to dgetrf. You should also verify that malloc succeeded by checking the return code of magma_malloc.

Can you replicate the error by running the provided MAGMA tester, magma/testing/testing_dgetrf_gpu? The complete input and output of that tester are very useful for diagnosing problems.

BTW, there are type-safe versions of malloc available, e.g.,
magma_dmalloc( double** x, size_t n )
magma_imalloc( magma_int_t** x, size_t n )
These are both easier and safer, avoiding the (void**) cast and sizeof(...).

-mark
mgates3
 
Posts: 442
Joined: Fri Jan 06, 2012 2:13 pm

Re: magma_dgetrf_gpu. CUDA runtime error

Post by dalal » Tue Mar 25, 2014 5:56 am

Thanks for the reply. Here are the matrices and their sizes; I have done exactly as in the testing_dgetrf_gpu code.
n = 1024;
double *d_U, *d_Utmp; int n32 = ((n+31)/32)*32;
magma_int_t *ipiv; int info;
TESTING_DEVALLOC( d_Utmp, double, n32 * n);
TESTING_MALLOC(ipiv, magma_int_t, n );
magma_dgetrf_gpu( n, n, d_Utmp, n32, ipiv, &info);
where TESTING_DEVALLOC and TESTING_MALLOC have the (void**) cast and sizeof(...) to allocate on device and on host respectively.

I also tried the other versions of malloc, as follows:
magma_imalloc_cpu( &ipiv, n);
magma_dmalloc( &d_Utmp, n32*n);
magma_dgetrf_gpu( n, n, d_T, n32, ipiv, &info);

But I am still getting the same errors.
dalal
 

Re: magma_dgetrf_gpu. CUDA runtime error

Post by mgates3 » Tue Mar 25, 2014 3:15 pm

That code looks okay.
Did you allocate d_U? Not used in this code, but was in the previous code snippet.
What's your make.inc file?
What version of MAGMA?
Try running the testing_dgetrf_gpu and testing_dgetrf programs. E.g.

./testing_dgetrf_gpu -N 1024

Please post the output of that, as it has some additional information, such as which GPU you are using.
If that fails, then it's some issue with MAGMA or how it is installed.

Yes, the TESTING malloc macros should work, too.
-mark
mgates3
 

Re: magma_dgetrf_gpu. CUDA runtime error

Post by dalal » Thu Mar 27, 2014 2:09 am

Thanks, Mark, for your help.
Magma version is magma-1.4.1.
Running the testing_dgetrf_gpu and testing_dgetrf programs in magma, both work.
Some results of ./testing_dgetrf_gpu -c :

MAGMA 1.4.1 , compiled for CUDA capability >= 3.0
device 0: Tesla K20c, 705.5 MHz clock, 5119.8 MB memory, capability 3.5
    M     N   CPU GFlop/s (sec)   GPU GFlop/s (sec)   |PA-LU|/(N*|A|)
=========================================================================
 1088  1088     ---   (  ---  )     27.52 (   0.03)   3.19e-18
 2112  2112     ---   (  ---  )    160.74 (   0.04)   3.03e-18
 3136  3136     ---   (  ---  )    272.43 (   0.08)   2.44e-18

results of ./testing_dgetrf -c :
MAGMA 1.4.1 , compiled for CUDA capability >= 3.0
device 0: Tesla K20c, 705.5 MHz clock, 5119.8 MB memory, capability 3.5

    M     N   CPU GFlop/s (sec)   GPU GFlop/s (sec)   |PA-LU|/(N*|A|)
=========================================================================
 1088  1088     ---   (  ---  )      7.02 (   0.12)   3.19e-18
 2112  2112     ---   (  ---  )    135.64 (   0.05)   3.03e-18
 3136  3136     ---   (  ---  )    217.32 (   0.09)   2.44e-18

In my code I call some PLASMA routines together with MAGMA routines.
I compile and link my test_dgetrf_gpu as follows:
icc -O2 -c test_dgetrf_gpu.c -o test_dgetrf_gpu.o -DADD_ -Wall -openmp -DMAGMA_WITH_MKL -DMAGMA_SETAFFINITY -DMKL_ILP64 -openmp -mkl=parallel -DMAGMA_ILP64 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I/opt/intel/composer_xe_2013/mkl/include -I/.../magma-1.4.1/include -I/.../magma-1.4.1/control -I/plasma/control -I/plasma/include -I/plasma/quark -I/plasma/include -I/plasma-installer/install/include

ifort -O2 -nofor_main -openmp -diag-disable vec -fltconsistency -fp_port test_dgetrf_gpu.o -o exe -L/magma-1.4.1/lib -lmagma -L/plasma/lib -lplasma -lcoreblasqw -lcoreblas -lplasma -L/plasma/quark -lquark -L/plasma-installer/install/lib -llapacke -mkl=parallel -lpthread -lm -lpthread -lm -L/opt/intel/composer_xe_2013/mkl/lib/intel64 -L/usr/local/cuda/lib64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -lpthread -lcublas -lcudart -lstdc++ -lm -DPLASMA_WITH_MKL

Thanks
dalal
 

Re: magma_dgetrf_gpu. CUDA runtime error

Post by mgates3 » Thu Mar 27, 2014 11:19 am

My first suspicion would be the ILP64 stuff. I don't see anything obviously wrong here (you use both -DMKL_ILP64 and link with -lmkl_intel_ilp64), but if some piece of code was compiled without -DMKL_ILP64, then magma_int_t will be 32-bit there, and 64-bit in other places.

Using ILP64, you may also run into problems between PLASMA and MAGMA. PLASMA uses a regular 32-bit int everywhere, while MAGMA with ILP64 uses 64-bit magma_int_t. This affects integers passed by pointer. For instance, the same ipiv array cannot be used for both PLASMA and MAGMA. However, if -DMKL_ILP64 is used consistently, it seems the compiler should catch these problems.

Also, was lapacke compiled with 64-bit integers? MAGMA doesn't use lapacke, but PLASMA does.

So I would recommend trying to compile MAGMA and your code without ILP64, and linking with mkl_intel_lp64. ILP64 is only needed if your matrices are larger than 46000 x 46000 or so (i.e., 2**31 elements). Otherwise the standard 32-bit int works fine.

-mark
mgates3
 

Re: magma_dgetrf_gpu. CUDA runtime error

Post by dalal » Thu Mar 27, 2014 12:38 pm

Thank you Mark so much.
After compiling MAGMA and my code without ILP64, dgetrf does work.
Thanks again for your very useful answers.
dalal
 

Re: magma_dgetrf_gpu. CUDA runtime error

Post by mgates3 » Fri Mar 28, 2014 1:06 pm

Good!

While it's easier to do LP64, it should be possible to get it working with ILP64, if that is really necessary. It just requires being very consistent so that everything uses the same 64-bit integer types.
-mark
mgates3
 

