Question about linear solver sgetrf_gpu in Magma 0.2

Postby jpeinado » Mon Dec 07, 2009 8:55 am

Hi:

I am trying to understand how the linear solver "magma_sgetrf_gpu" works.

I think (but I am not sure) that the solver uses some kind of padding technique.

Am I right?

Looking at the code in "testing_sgesv.gupu.cpp", I have some questions about:

- Allocating memory for the matrix on the GPU:


status = cublasAlloc((N+32)*(N+32) + 32*maxnb + lwork+2*maxnb*maxnb,
sizeof(float), (void**)&d_A ) ;
if (status != CUBLAS_STATUS_SUCCESS) {



- Sending the matrix to the GPU:

int dlda = (N/32)*32;
if (dlda<N) dlda+=32;

.....

cublasSetMatrix( N, N, sizeof( float ), A, N, d_A, dlda ) ;


This confuses me a little because dlda is not N. What happens here?


- Solving the system:

magma_sgetrf_gpu(&N, &N, d_A, &dlda, IPIV, h_work_M_S, INFO);


Is this done for padding?



With many thanks in advance.

jpeinado

Re: Question about linear solver sgetrf_gpu in Magma 0.2

Postby Stan Tomov » Tue Dec 08, 2009 5:08 pm

The solver first transposes the matrix in GPU memory. The CUDA kernel that does this uses a block size of 32, and we request a larger matrix so that we do not have to code the transpose operation for general matrix sizes. This will most probably change in future releases.

When the next panel has to be processed, it is first transposed (to move it back to the standard data layout that LAPACK expects) and then sent to the CPU and factored there using LAPACK. The workspace on the GPU needed for this and other operations is requested from the user, to be passed as a single pointer.
int dlda = (N/32)*32;
if (dlda<N) dlda+=32;
cublasSetMatrix( N, N, sizeof( float ), A, N, d_A, dlda ) ;

Here we just make the device lda of d_A divisible by 32 (and at least N). This is where the matrix is copied and transposed in-place. The rest of the memory is used as workspace. So, to answer your question, we do "padding" just for the transpose operation, not for BLAS, and in future releases we will remove the need for the "padding" in the transpose operation.
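
To illustrate why the padding simplifies the transpose, here is a minimal CPU-side sketch in C. It is not the MAGMA kernel, only an assumption about the general idea: once the leading dimension is a multiple of 32, an in-place transpose can walk over full 32x32 tiles and never needs special code for partial tiles at the edges. The names padded_ld, transpose_tiled and transposed_matches are hypothetical helpers made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

#define BLOCK 32  /* tile size, chosen to match the block size of 32 mentioned above */

/* Round n up to the next multiple of BLOCK (same result as the dlda code above). */
int padded_ld(int n) {
    return ((n + BLOCK - 1) / BLOCK) * BLOCK;
}

/* In-place transpose of an ld x ld column-major matrix, tile by tile.
   Because ld is a multiple of BLOCK, every tile is full: no edge cases. */
void transpose_tiled(float *a, int ld) {
    assert(ld % BLOCK == 0);
    for (int jb = 0; jb < ld; jb += BLOCK) {
        for (int ib = jb; ib < ld; ib += BLOCK) {
            for (int j = jb; j < jb + BLOCK; ++j) {
                /* on diagonal tiles, swap only the entries below the diagonal */
                int i0 = (ib == jb) ? j + 1 : ib;
                for (int i = i0; i < ib + BLOCK; ++i) {
                    float t = a[i + (size_t)j * ld];
                    a[i + (size_t)j * ld] = a[j + (size_t)i * ld];
                    a[j + (size_t)i * ld] = t;
                }
            }
        }
    }
}

/* Store an n x n matrix in padded storage, transpose it, and check that
   entry (i, j) now holds the original entry (j, i). Returns 1 on success. */
int transposed_matches(int n) {
    int ld = padded_ld(n);
    float *a = calloc((size_t)ld * ld, sizeof(float));
    if (a == NULL) return 0;
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i)
            a[i + (size_t)j * ld] = (float)(i * 1000 + j);
    transpose_tiled(a, ld);
    int ok = 1;
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i)
            if (a[i + (size_t)j * ld] != (float)(j * 1000 + i)) ok = 0;
    free(a);
    return ok;
}
```

For N = 100, padded_ld gives 128, so the buffer holds a 128 x 128 array of which only the leading 100 x 100 part carries data; the padded rows and columns are simply transposed along with the rest.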
Stan

Re: Question about linear solver sgetrf_gpu in Magma 0.2

Postby jpeinado » Wed Dec 09, 2009 4:49 am

Thank you very much for your answer Stan.


What I am trying to work out is whether this behaviour changes the way my algorithms work. I do several CUBLAS computations with the matrix A (matrix-matrix and matrix-vector products) before applying the LU factorization to it.

I think the only change needed in my algorithms is the amount of space allocated for A on the GPU. I am not sure yet whether I must change anything else.
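
As a rough check of that expectation, here is a small sketch in plain C. It assumes the padded copy of A is only ever accessed through routines that take a leading dimension, as the BLAS-style calls do. ref_sgemv is a hypothetical stand-in for the CUBLAS matrix-vector product, copy_matrix mimics only the layout change that cublasSetMatrix performs, and padded_matches_tight is a made-up test helper. It says nothing about what magma_sgetrf_gpu itself requires beyond the larger allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* y = A * x for an n x n column-major matrix with leading dimension lda
   (plain C stand-in for a BLAS-style matrix-vector product). */
void ref_sgemv(int n, const float *a, int lda, const float *x, float *y) {
    assert(n > 0 && lda >= n);
    for (int i = 0; i < n; ++i) y[i] = 0.0f;
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i)
            y[i] += a[i + (size_t)j * lda] * x[j];
}

/* Copy an n x n matrix from leading dimension lda_src to lda_dst. */
void copy_matrix(int n, const float *src, int lda_src, float *dst, int lda_dst) {
    for (int j = 0; j < n; ++j)
        memcpy(dst + (size_t)j * lda_dst, src + (size_t)j * lda_src,
               (size_t)n * sizeof(float));
}

/* Returns 1 if A*x is identical for tight (lda = n) and padded (lda = dlda) storage. */
int padded_matches_tight(int n) {
    int dlda = (n / 32) * 32;
    if (dlda < n) dlda += 32;               /* same rounding as in the test code */

    float *a  = malloc((size_t)n * n * sizeof(float));
    float *pa = malloc((size_t)dlda * n * sizeof(float));
    float *x  = malloc((size_t)n * sizeof(float));
    float *y1 = malloc((size_t)n * sizeof(float));
    float *y2 = malloc((size_t)n * sizeof(float));
    int ok = 0;
    if (a && pa && x && y1 && y2) {
        for (int j = 0; j < n; ++j) {
            x[j] = (float)(j % 5);
            for (int i = 0; i < n; ++i)
                a[i + (size_t)j * n] = (float)((i + 2 * j) % 7);
        }
        /* fill the padded buffer with junk so any read of the padding would show up */
        for (size_t k = 0; k < (size_t)dlda * n; ++k) pa[k] = -999.0f;
        copy_matrix(n, a, n, pa, dlda);

        ref_sgemv(n, a, n, x, y1);          /* tight storage,  lda = n    */
        ref_sgemv(n, pa, dlda, x, y2);      /* padded storage, lda = dlda */

        ok = 1;
        for (int i = 0; i < n; ++i)
            if (y1[i] != y2[i]) ok = 0;
    }
    free(a); free(pa); free(x); free(y1); free(y2);
    return ok;
}
```

If this carries over to CUBLAS, the calls made before the LU would only need dlda in place of N as the leading dimension of d_A, in addition to the larger allocation.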

Thanks again.

jpeinado