magma_zgetri_gpu segfaults in OpenMP parallel for

Postby _kirpi_ » Wed Apr 04, 2012 11:56 am

Dear All,

I have installed magma-1.1.0 (compiled using ATLAS, ACML libraries, and CUDA 4.0):
Operating system: Fedora Release 13, linux kernel 2.6.34.9-69.fc13.x86_64
C/C++ Compiler: gcc-4.4.5
hardware: 8x AMD Opteron 8439SE (48 cores), 128 GB ram, NVIDIA TESLA C1060

I have a test function (below) that calculates the inverse of a complex<double> matrix A (of size 90x90). When I call this code in a parallel for loop using OpenMP, it segfaults without displaying any error messages. If I call this code in a sequential loop, it works. If I comment out the call to "magma_zgetri_gpu", it no longer segfaults in the parallel for loop (and there are no error messages either).

The parallel for loop starts 48 OpenMP threads, so there are 48 matrices of size 90x90 to invert at any given time, about 6 MB in total. So it is not huge at all. I have also tried using only 2 threads, and it still segfaults.
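As a quick check of the size estimate above, here is a minimal sketch of the arithmetic. The helper name `batch_bytes` is made up for illustration, and it assumes `complex<double>` occupies 16 bytes, as it does on common platforms.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

// Bytes needed for `count` column-major complex<double> matrices,
// each with `cols` columns and leading dimension `ld`.
// Hypothetical helper, not part of MAGMA or CUDA.
std::size_t batch_bytes(std::size_t count, std::size_t ld, std::size_t cols) {
    assert(ld > 0 && cols > 0);
    return count * ld * cols * sizeof(std::complex<double>);
}
```

With these assumptions, `batch_bytes(48, 90, 90)` is 6,220,800 bytes, which matches the "about 6 MB" figure. This counts only the matrices themselves, not the workspace buffers.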

Do you have any ideas?

Thanks!

Code:
 
...............
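// n = 90
int nb = magma_get_zgetri_nb( n );   // block size, used to size the workspace
int ldda = ((n+31)/32) * 32;         // leading dimension rounded up to a multiple of 32
int ldwork = n * nb;

// Device buffers for the matrix (inverted in place) and the workspace
cuDoubleComplex *dAinv, *dwork;
cudaMalloc((void**)&dAinv, sizeof(cuDoubleComplex)*ldda*n);
cudaMalloc((void**)&dwork, sizeof(cuDoubleComplex)*ldwork);

// Copy A to the device, then compute its LU factorization (pivots go in P)
cublasSetMatrix( n, n, sizeof(cuDoubleComplex), (cuDoubleComplex*)A, n, dAinv, ldda );
magma_zgetrf_gpu( n, n, dAinv, ldda, P, &err );
if (err) {
    cout << "got err " << err << " from magma_zgetrf" << endl;
    return err;
}

// Compute the inverse from the LU factors; removing this call avoids the segfault
magma_zgetri_gpu( n, dAinv, ldda, P, dwork, ldwork, &err );
if (err) {
    cout << "got err " << err << " from magma_zgetri" << endl;
    return err;
}
// Copy the inverse back to the host
cublasGetMatrix( n, n, sizeof(cuDoubleComplex), dAinv, ldda, Ainv, n );

cudaFree(dAinv);
cudaFree(dwork);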
...................
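The `ldda` expression in the snippet above rounds `n` up to the next multiple of 32 using integer division. A small sketch of the same arithmetic, with a hypothetical helper name `padded_ld` that is not part of MAGMA:

```cpp
#include <cassert>

// Round n up to the next multiple of `align` (32 in the snippet above).
// Relies on integer division truncating toward zero.
int padded_ld(int n, int align = 32) {
    assert(n >= 0 && align > 0);
    return ((n + align - 1) / align) * align;
}
```

For n = 90 this gives 96, so each device matrix occupies 96 x 90 elements rather than 90 x 90.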
_kirpi_
 
Posts: 2
Joined: Wed Apr 04, 2012 11:14 am

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Postby brom » Wed Apr 04, 2012 4:31 pm

A developer can correct me if I'm wrong, but I don't think MAGMA is thread safe.
brom
 
Posts: 18
Joined: Tue Jan 25, 2011 8:20 pm

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Postby jeremiahpalmer » Thu Apr 26, 2012 11:55 am

Does anyone know if a future version of Magma will be thread safe?
jeremiahpalmer
 
Posts: 58
Joined: Fri Jan 28, 2011 12:46 pm

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Postby _kirpi_ » Thu Apr 26, 2012 12:08 pm

I think MAGMA itself is thread safe, but not the linear algebra libraries that it depends on. For example, I am able to run matrix multiplication functions in parallel in different threads at the same time without a problem (this uses ATLAS). It seems that this problem is related to CUBLAS (magma_zgetri_gpu fails when called by multiple threads at the same time).

A MAGMA developer could provide more reliable information on that, though...

_kirpi_
_kirpi_
 
Posts: 2
Joined: Wed Apr 04, 2012 11:14 am

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Postby Stan Tomov » Thu Apr 26, 2012 2:25 pm

Yes, MAGMA itself should be thread safe. For now we do not create our own threads or have global variables that may compromise thread safety. The problem is probably coming from the way the math libraries or CUDA are used with OpenMP. Also, for matrices of size 90x90 the current implementation would not be very efficient (for sizes smaller than 64 we use CPU code). We are working on adding functionality like the one needed here (for small matrices), similar to the batched gemms in CUBLAS.
Stan Tomov
 
Posts: 253
Joined: Fri Aug 21, 2009 10:39 pm

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Postby keitat » Tue May 01, 2012 5:45 pm

CUBLAS's old API does not guarantee thread safety, and the documentation for CUBLAS 4.0 and higher recommends using the Version 2 API with a stream created by each individual CPU thread.

The current version of Magma relies on CUBLAS's old API, and that may break the execution order of kernels in the few calls that use multiple cudaStreams. In particular, use of cublasSetKernelStream is dangerous because it accesses global variables maintained by CUBLAS. Also, Magma has a global variable "magma_stream" for magmablas execution, which is another potential flaw.
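If shared global state is indeed the cause, one conservative workaround is to keep the OpenMP loop but let only one thread at a time enter the GPU call. The sketch below shows just that pattern: `invert_on_gpu` is a hypothetical stand-in for the inversion routine from the first post (it only records that it ran), and `invert_all_serialized` is likewise a made-up name.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the GPU inversion from the first post.
// Here it only marks the matrix as processed and reports success.
int invert_on_gpu(int matrix_id, std::vector<int>& done) {
    done[matrix_id] = 1;
    return 0;  // 0 mirrors the "no error" convention of the err argument
}

// Run all inversions from an OpenMP loop while allowing only one thread
// at a time into the GPU call. Returns the number of failed calls.
int invert_all_serialized(int count, std::vector<int>& done) {
    assert(static_cast<int>(done.size()) >= count);
    int failures = 0;
    #pragma omp parallel for
    for (int i = 0; i < count; ++i) {
        // CPU-side work for matrix i could still run in parallel here.
        #pragma omp critical(gpu_call)
        {
            // Both the GPU call and the shared counter update are guarded.
            if (invert_on_gpu(i, done) != 0) ++failures;
        }
    }
    return failures;
}
```

This gives up any overlap between GPU calls, so it is a stopgap rather than a fix, and it would not help if the failure also occurs with a single thread inside an OpenMP region.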

I am curious whether anybody has tried Magma with a thread-safe BLAS/LAPACK.
keitat
 
Posts: 8
Joined: Tue Jan 24, 2012 2:19 pm

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Postby brom » Wed May 02, 2012 2:30 pm

Thanks, Keitat, for pointing out the reasons why MAGMA is not thread safe. It's worth noting that MAGMA was probably "more thread safe" before the CUDA 4.0 changes to GPU binding.

In an unrelated project, I was required to completely overhaul my multi-threaded code to account for the multi-GPU changes in CUDA/CUBLAS 4.0. I imagine MAGMA will need to do the same.

Are the developers planning on fixing these issues?
brom
 
Posts: 18
Joined: Tue Jan 25, 2011 8:20 pm

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Postby mgates3 » Wed May 02, 2012 7:35 pm

Yes, we're working on moving to the newer CUBLAS v2 API, which provides thread safety and better support for running multiple kernels in parallel.
-mark
mgates3
 
Posts: 443
Joined: Fri Jan 06, 2012 2:13 pm

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Postby jeremiahpalmer » Wed May 23, 2012 4:09 pm

I'm finding that magma_zgetrf_gpu fails in an OpenMP region even when only 1 thread is set. Is this everyone's experience as well? (Or does it happen only with the other magma routines?)

BTW, the routine fails when calling cudaFree in magma_free_host.
jeremiahpalmer
 
Posts: 58
Joined: Fri Jan 28, 2011 12:46 pm

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Postby jeremiahpalmer » Tue Jun 12, 2012 12:09 pm

Any news on a parallel safe MAGMA?
jeremiahpalmer
 
Posts: 58
Joined: Fri Jan 28, 2011 12:46 pm
