magma_spotrf_gpu CL_INVALID_VALUE

Open discussion for MAGMA

Post by f41thful » Thu Jul 05, 2012 12:25 pm

Hi,
I want to compute the Cholesky factorization of a matrix.
To do that, I have chosen the magma_spotrf_gpu function.
For matrix dimensions up to 128 it computes the Cholesky factorization with no problem: the block size is 128, so it takes the branch that uses lapackf77_spotrf. I have checked that the resulting matrix is correct.

But for dimensions over 128, for example 1000, it takes another branch which uses a left-looking algorithm. In that algorithm, the magma_ssyrk function is used. This function returns error code -30 (CL_INVALID_VALUE).

The matrix is a 1000x1000 matrix of floats, stored in row-major order. The offset is 0. The stride (I think they call it the leading dimension) is the same as the dimension, i.e. 1000. The uplo value is MagmaLower.

The calls look like this, as shown by the gdb debugger:
magma_spotrf_gpu ( uplo=122,
n=1000,
dA=0x88a8c0,
dA_offset=0,
ldda=1000,
info=0x7fffffffde00,
queue=0x870820)

Inside magma_spotrf_gpu the function below is called, and it returns the error when the dimension is over 128:

magma_ssyrk ( uplo=122,
trans=111,
n=128,
k=0,
alpha=-1,
dA=0x88a8c0,
dA_offset=0,
lda=1000,
beta=1,
dC=0x88a8c0,
dC_offset=0,
ldc=1000,
queue=0x870820) at magmablas_s.cpp:230

I think the value k = 0 is very suspicious, because it is supposed to be a dimension... But I have not modified the source file.
The call to that function in the source file is
chk( magma_ssyrk( MagmaLower,
MagmaNoTrans,
jb,
j,
m_one,
dA(j, 0),
ldda,
one,
dA(j, j),
ldda,
queue ));

The call is inside this loop:
for( j = 0; j < n; j += nb ) {
    // apply all previous updates to diagonal block
    jb = min( nb, n-j );
    chk( magma_ssyrk( ... ) );  // the call shown above
    ...
}
nb is the block size (128).
dA(i,j) is a macro:
#define dA(i,j) dA, ( (dA_offset) + (i) + (j)*ldda )
n is the dimension (1000).
I am using OpenCL 1.1. The hardware is an NVIDIA GeForce GTX 285. I have compiled clMAGMA to use OpenCL + AMD BLAS instead of CUDA + CUBLAS.
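
For reference, the index part of the dA macro, (i) + (j)*ldda, is column-major addressing: element (i, j) sits i entries down column j. Below is a minimal sketch in C of that arithmetic. The helper names da_index and row_major_index are hypothetical and not part of clMAGMA; they only reproduce the offsets for the values quoted in this post (ldda = 1000, dA_offset = 0).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the index part of the dA(i,j) macro:
   (dA_offset) + (i) + (j)*ldda, i.e. column-major addressing. */
static size_t da_index(size_t dA_offset, size_t i, size_t j, size_t ldda)
{
    assert(i < ldda);               /* a row index must fit inside one column */
    return dA_offset + i + j * ldda;
}

/* Offset of element (i, j) when the same buffer is filled in row-major
   order with row length ld. Hypothetical helper for comparison only. */
static size_t row_major_index(size_t i, size_t j, size_t ld)
{
    assert(j < ld);                 /* a column index must fit inside one row */
    return i * ld + j;
}
```

For the second block (j = 128), dA(j, 0) resolves to offset 128 and dA(j, j) to offset 128128. A buffer filled in row-major order is read by a column-major routine as the transpose: for a symmetric input that is the same matrix, but the factor written to the "lower" triangle then shows up in the upper triangle of the row-major view.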

Thanks in advance.
f41thful
 
Posts: 8
Joined: Thu Jul 05, 2012 12:04 pm

Re: magma_spotrf_gpu CL_INVALID_VALUE

Post by Stan Tomov » Sun Jul 08, 2012 2:47 pm

Hi,
What you are trying is very interesting but I think that for now AMD does not provide the OpenCL sources for their BLAS, only binaries, so if you try the binaries on NVIDIA hardware, you would get errors. The AMD BLAS indeed did not like k=0 in syrk so we modified the source (at line 176 of spotrf_gpu.cpp, add 'if (j>0)' ). Even without this change the code was finishing correctly, as in the k=0 case the ssyrk is not supposed to do any computations.
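
To illustrate why skipping the call is harmless: syrk computes C := alpha*A*A^T + beta*C, and with k = 0 the product term is empty, so with beta = 1 the block C is left as it was. Here is a minimal reference sketch in plain C (column-major, lower triangle, no transpose). The names ref_ssyrk_lower and count_syrk_calls are hypothetical; this is not the clAmdBlas or clMAGMA implementation, only the textbook update plus a counter for the blocked loop.

```c
#include <assert.h>

/* Reference update of the lower triangle of the n x n matrix C:
   C := alpha * A * A^T + beta * C, where A is n x k, column-major.
   Hypothetical helper for illustration only. */
static void ref_ssyrk_lower(int n, int k, float alpha, const float *A, int lda,
                            float beta, float *C, int ldc)
{
    assert(n >= 0 && k >= 0);
    for (int j = 0; j < n; ++j) {
        for (int i = j; i < n; ++i) {
            float sum = 0.0f;
            for (int l = 0; l < k; ++l)      /* empty loop when k == 0 */
                sum += A[i + l * lda] * A[j + l * lda];
            C[i + j * ldc] = alpha * sum + beta * C[i + j * ldc];
        }
    }
}

/* Count the syrk calls issued by 'for (j = 0; j < n; j += nb)',
   with or without the 'if (j > 0)' guard. Each call made has k = j. */
static int count_syrk_calls(int n, int nb, int guarded)
{
    assert(n >= 0 && nb > 0);
    int calls = 0;
    for (int j = 0; j < n; j += nb) {
        if (!guarded || j > 0)
            ++calls;                         /* stands in for magma_ssyrk(..., jb, j, ...) */
    }
    return calls;
}
```

For n = 1000 and nb = 128 the loop visits 8 blocks; the guard removes only the first call, the one with k = 0.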
Stan
Stan Tomov
 
Posts: 253
Joined: Fri Aug 21, 2009 10:39 pm

Re: magma_spotrf_gpu CL_INVALID_VALUE

Post by f41thful » Mon Jul 09, 2012 4:24 am

Hi Stan,
Are you saying that on AMD hardware that function works without any change for matrix dimensions over 128?

It is supposed to be compatible. I have used clAmdBlasSgemmEx directly to multiply two matrices (dim=1000) without errors and with correct results.
I will try adding the if statement.
Javier.

Re: magma_spotrf_gpu CL_INVALID_VALUE

Post by f41thful » Mon Jul 09, 2012 10:13 am

Hi,
I have added the if statement and the ssyrk error has disappeared (because the function is no longer executed on the first iteration, j=0).
But I got another error, this time in the variable info. The error code is 999.
The documentation of the return value says: "if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed."
So I understand that the matrix is not suitable for Cholesky factorization. Nevertheless, that matrix is the product of a matrix and its transpose (matrix = x * xt), so it has to be positive definite. Moreover, I have done the Cholesky factorization of the same matrix in Octave and it gives no error.

Using gdb to watch the variable info, I get the information below:
FIRST CHANGE
Old value = -158558016
New value = 0 (initialization)

SECOND CHANGE
New value = 39
No source code available. The value is changed in function spotf2_
stack trace:
spotf2_
spotrf_
magma_spotrf_gpu
main

THIRD CHANGE
New value = 103
No source code available. The value is changed in function spotrf_
stack trace:
#0 0x000000000040fae5 in spotrf_ ()
#1 0x000000000040ddcf in magma_spotrf_gpu (uplo=122, n=1000, dA=0x88a8c0,
dA_offset=0, ldda=1000, info=0x7fffffffde00, queue=0x870820)
at spotrf_gpu.cpp:196
#2 0x0000000000403094 in main (argc=2, argv=0x7fffffffdf28) at test.c:87

FOURTH CHANGE
New value = 999
The value is changed in magma_spotrf_gpu (uplo=122, n=1000, dA=0x88a8c0, dA_offset=0, ldda=1000,
info=0x7fffffffde00, queue=0x870820) at spotrf_gpu.cpp:200

The code where it is changed is
if ( *info != 0 ) {
    assert( *info > 0 );
    *info += j;
    break;
}
The value of j is 896 (103 + 896 = 999).
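
The arithmetic behind these values can be sketched as follows. Each level reports a failure index relative to the block it was handed, and the caller adds the starting offset of that block. The step from 103 to 999 is the '*info += j' line quoted above. Reading the step from 39 to 103 as an internal offset of 64 inside the CPU spotrf_ is only an inference from the numbers (103 - 39 = 64), not something the trace shows. The helper names to_global_info and last_block_start are hypothetical.

```c
#include <assert.h>

/* Convert a failure index reported for a sub-block into an index for the
   enclosing matrix. local_info == 0 means success and is passed through.
   Hypothetical helper mirroring the '*info += j' line. */
static int to_global_info(int local_info, int block_start)
{
    assert(local_info >= 0 && block_start >= 0);
    return local_info == 0 ? 0 : local_info + block_start;
}

/* Start of the last diagonal block visited by 'for (j = 0; j < n; j += nb)'. */
static int last_block_start(int n, int nb)
{
    assert(n > 0 && nb > 0);
    return ((n - 1) / nb) * nb;
}
```

With n = 1000 and nb = 128 the last block starts at j = 896, so info = 999 refers to the leading minor of order 999, which lies inside the final 104 x 104 diagonal block.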

Javier.

Re: magma_spotrf_gpu CL_INVALID_VALUE

Post by Stan Tomov » Mon Jul 09, 2012 12:56 pm

Hello,

"Are you saying that on AMD hardware that function works without any change for matrix dimensions over 128?"

Yes, it prints an error message for k=0, but execution continues and the final result is correct.

We were also thinking of doing this experiment, so it would be very interesting to hear back if you manage to make it work. Just to make sure I understand: you take the AMD sources for OpenCL GPU BLAS, and use them to run clMAGMA on NVIDIA hardware using the NVIDIA OpenCL implementation, right? The routines in clMAGMA were ported from MAGMA (for CUDA), the results were tested on AMD hardware, and they were also tested using a CPU interface for BLAS. In particular, we have wrappers for GPU BLAS where, for example, magma_dsyrk is defined as: copy the matrices needed to the CPU, use CPU BLAS there, and finally copy the results back to the GPU. We can make these available if you think it will help you to locate the problem.

Regarding the question about Cholesky on x * xt: in exact arithmetic this is SPD, but in practice a random matrix of size 1000x1000 (in MATLAB for example) has a condition number of ~2e5 and x*xt has ~7e10, which is numerically singular in single precision, so Cholesky in single precision fails. In double precision Cholesky can still be computed.
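
A rough way to see this: the 2-norm condition number of x*xt is the square of that of x, and single precision carries only about 7 decimal digits, so a condition number around 1e10 leaves no reliable digits in the pivots. Below is a tiny sketch in C on a hand-made 2x2 SPD matrix [[1, 1], [1, 1 + 1e-9]], not the 1000x1000 matrix from this thread. To stay free of libm it uses the square-root-free variant of Cholesky (LDL^T) rather than the factorization spotrf computes; the pivots are positive exactly when the matrix is positive definite. The function names are hypothetical and the return value follows the LAPACK info convention.

```c
#include <assert.h>

/* Square-root-free Cholesky (A = L * D * L^T), lower, column-major, in place.
   On exit the pivots D are on the diagonal and unit-lower L below it.
   Returns 0 on success, or i > 0 if pivot i is not positive. */
static int ldlt_lower_float(int n, float *a, int lda)
{
    assert(n >= 0 && lda >= n);
    for (int j = 0; j < n; ++j) {
        float d = a[j + j * lda];
        for (int l = 0; l < j; ++l)
            d -= a[j + l * lda] * a[j + l * lda] * a[l + l * lda];
        if (!(d > 0.0f))
            return j + 1;                    /* not positive definite */
        a[j + j * lda] = d;
        for (int i = j + 1; i < n; ++i) {
            float s = a[i + j * lda];
            for (int l = 0; l < j; ++l)
                s -= a[i + l * lda] * a[j + l * lda] * a[l + l * lda];
            a[i + j * lda] = s / d;
        }
    }
    return 0;
}

/* Same algorithm in double precision. */
static int ldlt_lower_double(int n, double *a, int lda)
{
    assert(n >= 0 && lda >= n);
    for (int j = 0; j < n; ++j) {
        double d = a[j + j * lda];
        for (int l = 0; l < j; ++l)
            d -= a[j + l * lda] * a[j + l * lda] * a[l + l * lda];
        if (!(d > 0.0))
            return j + 1;                    /* not positive definite */
        a[j + j * lda] = d;
        for (int i = j + 1; i < n; ++i) {
            double s = a[i + j * lda];
            for (int l = 0; l < j; ++l)
                s -= a[i + l * lda] * a[j + l * lda] * a[l + l * lda];
            a[i + j * lda] = s / d;
        }
    }
    return 0;
}
```

In float the entry 1 + 1e-9 rounds to 1 when it is stored, the second pivot becomes 0, and the routine reports info = 2. In double the second pivot is about 1e-9 and the factorization completes.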

Stan


Re: magma_spotrf_gpu CL_INVALID_VALUE

Post by f41thful » Tue Jul 10, 2012 4:25 am

Yes, I have taken clAmdBlas and clMAGMA. I had to modify some things in the Makefile and the source files, and compile against the AMD OpenCL implementation. At run time I use the NVIDIA platform for OpenCL.

"we have wrappers for GPU BLAS where, for example, magma_dsyrk is defined as: copy the matrices needed to the CPU, use CPU BLAS there, and finally copy the results back to the GPU. We can make these available if you think it will help you to locate the problem."

I don't understand this very well. The copy is done from GPU to CPU, the CPU computes, and then the CPU transfers the data back to the GPU?

I had forgotten the single precision issue. You are right, double precision is OK. I will try using double precision kernels and post the results.
Thanks.
Javier.

Re: magma_spotrf_gpu CL_INVALID_VALUE

Post by f41thful » Tue Jul 10, 2012 9:02 am

The Cholesky function now works (after adding double precision floating point support and the if statement). I have compared the resulting matrix with the LAPACK and Octave ones. The LAPACK and clMAGMA results are equal. The Octave and clMAGMA results are not, and I really don't know why. The largest difference between the matrices (octave[x][x] - clmagma[x][x]) is 0.000238.
With a tolerance of 0.000001, the number of elements whose difference lies in [0.000001, 0.000238] is 8780.
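
The comparison described here can be sketched in C as below: compute the largest absolute element-wise difference and count the entries that exceed a tolerance. The names diff_stats and compare_matrices are hypothetical, and both arrays are assumed to use the same storage layout.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double max_abs_diff;   /* largest |a[i] - b[i]| */
    size_t above_tol;      /* number of entries with |a[i] - b[i]| > tol */
} diff_stats;

/* Element-wise comparison of two arrays of 'count' doubles.
   Hypothetical helper for illustration only. */
static diff_stats compare_matrices(const double *a, const double *b,
                                   size_t count, double tol)
{
    assert(tol >= 0.0);
    diff_stats s = {0.0, 0};
    for (size_t i = 0; i < count; ++i) {
        double d = a[i] - b[i];
        if (d < 0.0)
            d = -d;                    /* absolute value without libm */
        if (d > s.max_abs_diff)
            s.max_abs_diff = d;
        if (d > tol)
            ++s.above_tol;
    }
    return s;
}
```

When comparing factors from different tools it is worth restricting the count to the triangle that both of them actually write, since the other triangle may hold leftover input or zeros.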

Thanks for your help Stan.

Re: magma_spotrf_gpu CL_INVALID_VALUE

Post by f41thful » Wed Jul 11, 2012 11:22 am

To close the thread: the errors were caused by the precision used for the matrix. Using double precision, the matrices are very close (maximum difference 0.0000000001705809).
So it is possible to use clMAGMA with clAmdBlas using OpenCL on NVIDIA hardware.

Re: magma_spotrf_gpu CL_INVALID_VALUE

Post by fletchjp » Wed Jul 11, 2012 11:56 am

Does this mean that this version of clMAGMA can run on NVIDIA hardware?

That option is not supported in the setup, as I was told some time ago.

Will this support be in a future version?

I am interested as I have some other OpenCL software I run on NVIDIA and it would be nice to have a uniform environment for it to work with the GPU.

Thanks

John
fletchjp
 
Posts: 175
Joined: Mon Dec 27, 2010 7:29 pm

