Trouble in running test programs

Open discussion for MAGMA

Re: Trouble in running test programs

Postby athlonshi » Wed Mar 16, 2011 12:51 pm

Hi,
I had exactly the same problem when running the GPU test programs.
Regardless of the matrix size, the testing_*_GPU programs failed with "!!!! cublasAlloc failed for: d_A".
However, the Fortran code did work well.
I am using C2050 with CUDA 3.2 and MAGMA 1.0RC.
Thanks,
Yu
athlonshi
 
Posts: 5
Joined: Wed Mar 16, 2011 12:34 pm

Re: Trouble in running test programs

Postby Stan Tomov » Thu Mar 17, 2011 1:42 am

This is very interesting. The FORTRAN interface calls the same routine that fails here. It looks like some CUDA constants may have changed. Can you try replacing, in the file testings.h,
Code:
if( CUBLAS_STATUS_SUCCESS != cublasAlloc( size, sizeof(type), (void**)&ptr) )

by
Code:
if( 0 != cublasAlloc( size, sizeof(type), (void**)&ptr) )

This is the only difference that I see with the FORTRAN versions. You may also try just removing the exit(-1). I suspect that the memory is allocated successfully but that either the return value or the constant used to indicate success has changed.
Stan
Stan Tomov
 
Posts: 253
Joined: Fri Aug 21, 2009 10:39 pm
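
A minimal sketch of how the suspicion above could be checked: print the raw status value returned by the allocation call before deciding whether to exit. The helper names cublas_status_name and report_cublas_status are hypothetical, and the numeric values are the ones I believe the legacy cublas.h header uses, so check them against the header on your own system. The code deliberately includes no CUDA header so that it compiles on its own.

```c
#include <assert.h>
#include <stdio.h>

/* Map a raw status value to a name.
   ASSUMPTION: these values match the legacy cublas.h header;
   verify them against the header installed on your system. */
static const char *cublas_status_name(unsigned int status)
{
    switch (status) {
    case 0x0: return "CUBLAS_STATUS_SUCCESS";
    case 0x1: return "CUBLAS_STATUS_NOT_INITIALIZED";
    case 0x3: return "CUBLAS_STATUS_ALLOC_FAILED";
    case 0x7: return "CUBLAS_STATUS_INVALID_VALUE";
    case 0x8: return "CUBLAS_STATUS_ARCH_MISMATCH";
    case 0xB: return "CUBLAS_STATUS_MAPPING_ERROR";
    case 0xD: return "CUBLAS_STATUS_EXECUTION_FAILED";
    case 0xE: return "CUBLAS_STATUS_INTERNAL_ERROR";
    default:  return "unknown status";
    }
}

/* Hypothetical helper: log the raw status on stderr and
   return 1 if it equals 0 (success), 0 otherwise. */
static int report_cublas_status(const char *what, unsigned int status)
{
    assert(what != NULL);
    fprintf(stderr, "%s returned %u (%s)\n",
            what, status, cublas_status_name(status));
    return status == 0;
}
```

In testings.h the check could then read something like if( !report_cublas_status( "cublasAlloc d_A", cublasAlloc( size, sizeof(type), (void**)&ptr ) ) ) exit(-1);. Under the values assumed above, a printed 3 would point to a genuine allocation failure rather than a changed constant.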

Re: Trouble in running test programs

Postby athlonshi » Fri Mar 18, 2011 2:38 pm

I have exactly the same problem. I tried the change Stan Tomov suggested above (replacing CUBLAS_STATUS_SUCCESS with 0 in the cublasAlloc check in testings.h), but it still did not work.
The Fortran wrapper works fine.
athlonshi
 
Posts: 5
Joined: Wed Mar 16, 2011 12:34 pm

Re: Trouble in running test programs

Postby scho » Fri Mar 18, 2011 3:18 pm

Thanks again.

I changed "testings.h", as suggested.
I no longer get the previous error message "!!!! cublasAlloc failed for: d_A", but now get "magma_zgetrf_gpu returned with error code -7".
The error seems to be due to CUBLAS_STATUS_SUCCESS again, which is used inside magma_zgetrf_gpu.
I looked up "cuda/include/cublas.h", which has "#define CUBLAS_STATUS_SUCCESS 0x00000000".
Is there any other place where CUBLAS_STATUS_SUCCESS is defined?

Suwon
scho
 
Posts: 8
Joined: Wed Mar 09, 2011 12:07 am

Re: Trouble in running test programs

Postby scho » Sat Mar 19, 2011 6:28 pm

Finally, my problem was resolved.

I found that another vendor provides libcuda.so, etc. in its own directory,
and these seem to be incompatible with the current version of CUDA.
I did not think of this possibility because I had not yet used this program
apart from an initial test after installation. I changed the library search order for MAGMA.
My apologies.

Thanks to all who suggested hints and directions.
Suwon
scho
 
Posts: 8
Joined: Wed Mar 09, 2011 12:07 am

Re: Trouble in running test programs

Postby athlonshi » Mon Mar 28, 2011 1:05 pm

Hi,

Could you be more specific on how you solved the problem?
I had a similar error and tried replacing CUBLAS_STATUS_SUCCESS with 0, as suggested.
That got past the memory allocation error, but I then got a segmentation fault.

Again, the Fortran wrapper works fine.

Thanks,

Yu
athlonshi
 
Posts: 5
Joined: Wed Mar 16, 2011 12:34 pm

Re: Trouble in running test programs

Postby mateo70 » Tue Apr 05, 2011 6:44 pm

Hi Yu,

The problem was that he was linking against two different copies of libcublas.so, and the binary ended up with a mix of the two.
Try running ldd on the binary to check that you are using the correct library.

Mathieu
mateo70
 
Posts: 41
Joined: Tue Mar 08, 2011 12:38 pm
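
As a complement to the ldd check suggested above, here is a rough sketch of how a test program could report, from inside the running process, which shared libraries the dynamic loader actually picked up. It assumes Linux with glibc, where dl_iterate_phdr is available. The names lib_query, print_matching_lib and count_loaded_libs are made up for this example and are not part of MAGMA or CUDA.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for dl_iterate_phdr on glibc */
#endif
#include <assert.h>
#include <link.h>
#include <stdio.h>
#include <string.h>

/* Search state passed through dl_iterate_phdr (hypothetical type). */
struct lib_query {
    const char *needle;  /* substring to look for in the library path */
    int matches;         /* number of loaded objects that matched */
};

/* Callback: print each loaded object whose path contains the needle. */
static int print_matching_lib(struct dl_phdr_info *info, size_t size, void *data)
{
    struct lib_query *q = data;
    (void)size;  /* unused */
    if (info->dlpi_name != NULL && strstr(info->dlpi_name, q->needle) != NULL) {
        printf("loaded: %s\n", info->dlpi_name);
        q->matches++;
    }
    return 0;  /* returning 0 keeps the iteration going */
}

/* Print and count the loaded objects whose path contains `needle`. */
static int count_loaded_libs(const char *needle)
{
    struct lib_query q = { needle, 0 };
    assert(needle != NULL);
    dl_iterate_phdr(print_matching_lib, &q);
    return q.matches;
}
```

Calling count_loaded_libs("libcublas") and count_loaded_libs("libcuda") early in main of a test program prints the full paths that were resolved. A path outside the expected CUDA installation directory would suggest the kind of mix-up described in this thread. From the shell, ldd on the binary remains the simpler check.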
