magma_zgetri_gpu segfaults in OpenMP parallel for

Posted: Wed Apr 04, 2012 11:56 am
by _kirpi_
Dear All,

I have installed magma-1.1.0 (compiled using ATLAS, ACML libraries, and CUDA 4.0):
Operating system: Fedora Release 13, linux kernel 2.6.34.9-69.fc13.x86_64
C/C++ Compiler: gcc-4.4.5
hardware: 8x AMD Opteron 8439SE (48 cores), 128 GB ram, NVIDIA TESLA C1060

I have a test function (below) that calculates the inverse of a complex<double> matrix A of size 90x90. When I call this code in a parallel for loop using OpenMP, it segfaults without displaying any error messages. If I call it in a sequential loop, it works. Also, if I comment out the call to "magma_zgetri_gpu", it doesn't segfault in the parallel for loop (and there are no error messages either).

The parallel for loop starts 48 OpenMP threads, so there are 48 matrices of size 90x90 to invert at any given time, about 6 MB in total. So it is not huge at all. I have also tried using only 2 threads, and it still segfaults.
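
As a rough check of the 6 MB figure, here is a minimal sketch of the arithmetic. It assumes 16 bytes per complex<double> element and uses the same round-up-to-32 rule as the ldda line in the code below. The helper names padded_ld and matrix_bytes are made up for illustration and are not part of MAGMA or CUDA. The zgetri workspace is left out because its size depends on the block size MAGMA returns.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

// Round n up to the next multiple of 32, i.e. ((n+31)/32)*32.
// (Hypothetical helper, not a MAGMA function.)
std::size_t padded_ld(std::size_t n) {
    return ((n + 31) / 32) * 32;
}

// Bytes needed for an n-column complex<double> matrix stored
// column-major with leading dimension ld (ld must be >= n rows).
std::size_t matrix_bytes(std::size_t ld, std::size_t n) {
    assert(ld >= n);
    return sizeof(std::complex<double>) * ld * n;
}
```

With n = 90 and 48 threads, padded_ld(90) is 96, the unpadded host copies take 48 * matrix_bytes(90, 90) = 6,220,800 bytes, and the padded device copies take 48 * matrix_bytes(96, 90) = 6,635,520 bytes, so "about 6 MB" holds either way.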

Do you have any ideas?

Thanks!

Code:
 
...............
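// n = 90
int nb = magma_get_zgetri_nb( n );   // block size used by zgetri
int ldda = ((n+31)/32) * 32;         // leading dimension, rounded up to a multiple of 32
int ldwork = n * nb;                 // workspace size (in elements) for zgetri

cuDoubleComplex *dAinv, *dwork;
cudaMalloc((void**)&dAinv, sizeof(cuDoubleComplex)*ldda*n);
cudaMalloc((void**)&dwork, sizeof(cuDoubleComplex)*ldwork);

// copy A to the GPU and compute its LU factorization in place
cublasSetMatrix( n, n, sizeof(cuDoubleComplex), (cuDoubleComplex*)A, n, dAinv, ldda );
magma_zgetrf_gpu( n, n, dAinv, ldda, P, &err );
if (err) {
      cout << "got err " << err << " from magma_zgetrf" << endl;
      cudaFree(dAinv);   // release the device buffers before the early return
      cudaFree(dwork);
      return err;
}

// compute the inverse from the LU factors
magma_zgetri_gpu( n, dAinv, ldda, P, dwork, ldwork, &err );
if (err) {
      cout << "got err " << err << " from magma_zgetri" << endl;
      cudaFree(dAinv);   // release the device buffers before the early return
      cudaFree(dwork);
      return err;
}

// copy the inverse back to the host
cublasGetMatrix( n, n, sizeof(cuDoubleComplex), dAinv, ldda, Ainv, n );

cudaFree(dAinv);
cudaFree(dwork);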

...................

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Posted: Wed Apr 04, 2012 4:31 pm
by brom
A developer can correct me if I'm wrong, but I don't think MAGMA is thread safe.

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Posted: Thu Apr 26, 2012 11:55 am
by jeremiahpalmer
Does anyone know if a future version of Magma will be thread safe?

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Posted: Thu Apr 26, 2012 12:08 pm
by _kirpi_
I think MAGMA itself is thread safe, but not the linear algebra libraries that it depends on. For example, I am able to run matrix multiplication functions in parallel in different threads at the same time without a problem (this uses ATLAS). It seems that this problem is related to CUBLAS (magma_zgetri_gpu fails when called by multiple threads at the same time).

A MAGMA developer could provide more reliable information on that, though...

_kirpi_

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Posted: Thu Apr 26, 2012 2:25 pm
by Stan Tomov
Yes, MAGMA itself should be thread safe. For now we do not create our own threads or have global variables that may compromise thread safety. The problem is likely coming from the way the math libraries or CUDA are used with OpenMP. Also, for matrices of size 90x90 the current implementation would not be very efficient (for sizes smaller than 64 we use CPU code). We are working on adding functionality like that needed here (for small matrices), similar to the batched GEMMs in CUBLAS.

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Posted: Tue May 01, 2012 5:45 pm
by keitat
CUBLAS's old API does not guarantee thread safety, and the documentation for CUBLAS 4.0 and later recommends using the Version 2 API with a stream created by each individual CPU thread.

The current version of Magma relies on CUBLAS's old API, and that may break the execution order of kernels in the few calls that use multiple cudaStream objects. In particular, use of cublasSetKernelStream is dangerous because it accesses global variables maintained by CUBLAS. Also, Magma has a global variable "magma_stream" for magmablas execution, which is another potential flaw.
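
To illustrate the global-state hazard described above, here is a minimal sketch of one possible workaround: serializing the "set stream, then launch" pair behind a mutex. Everything in the fake_lib namespace is a hypothetical stand-in for a library that keeps its current stream in a global variable. It is not CUBLAS or MAGMA code, and guarded_launch and count_mismatches are made-up names. Whether this kind of locking is enough for the real libraries is an assumption that would need testing.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a library with a global "current stream".
namespace fake_lib {
int current_stream = 0;
void set_kernel_stream(int s) { current_stream = s; }
// The "kernel" runs on whatever stream is current at launch time.
int launch() { return current_stream; }
}  // namespace fake_lib

std::mutex gpu_mutex;  // one lock shared by all host threads

// Set the stream and launch as a single critical section, so no other
// thread can change the global stream between the two calls.
int guarded_launch(int my_stream) {
    std::lock_guard<std::mutex> lock(gpu_mutex);
    fake_lib::set_kernel_stream(my_stream);
    return fake_lib::launch();
}

// Run guarded_launch from several threads and count the calls whose
// launch ran on a stream other than the one the thread asked for.
int count_mismatches(int num_threads, int calls_per_thread) {
    std::vector<int> mismatches(num_threads, 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([t, calls_per_thread, &mismatches] {
            for (int i = 0; i < calls_per_thread; ++i) {
                if (guarded_launch(t) != t) ++mismatches[t];
            }
        });
    }
    for (auto& w : workers) w.join();
    int total = 0;
    for (int m : mismatches) total += m;
    return total;
}
```

With the lock in place, count_mismatches(4, 1000) returns 0. The cost is that calls from different threads no longer overlap, so this trades parallelism for correctness.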

I am curious if anybody tried Magma with thread-safe BLAS/LAPACK.

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Posted: Wed May 02, 2012 2:30 pm
by brom
Thanks, Keitat, for pointing out the reasons why MAGMA is not thread safe. It's worth noting that MAGMA was probably "more thread safe" before the CUDA 4.0 changes to GPU binding.

In an unrelated project, I was required to completely overhaul my multi-threaded code to account for the multi-GPU changes in CUDA/CUBLAS 4.0. I imagine MAGMA will need to do the same.

Are the developers planning on fixing these issues?

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Posted: Wed May 02, 2012 7:35 pm
by mgates3
Yes, we're working on moving to the newer CUBLAS v2 API, which provides thread safety and better support for running multiple kernels in parallel.
-mark

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Posted: Wed May 23, 2012 4:09 pm
by jeremiahpalmer
I'm finding that magma_zgetrf_gpu fails in an OpenMP region even when the number of threads is set to 1. Is this everyone's experience as well? (Or does it happen only with the other MAGMA routines?)

BTW, the routine fails in calling cudaFree in magma_free_host.

Re: magma_zgetri_gpu segfaults in OpenMP parallel for

Posted: Tue Jun 12, 2012 12:09 pm
by jeremiahpalmer
Any news on a parallel safe MAGMA?