
Basic code for SpMV on GPU

Posted: Tue Feb 14, 2017 2:35 am
by akamo
Hello,
I am a new user of Magma. I am looking for a basic code example for running SpMV (Sparse Matrix Vector Multiplication), let's say using the CSR format, on the GPU side. The examples I found in the downloaded package were not very helpful because I just switched to Magma and the code is all new to me.
Thanks,

Re: Basic code for SpMV on GPU

Posted: Tue Feb 14, 2017 1:38 pm
by hartwig anzt
Dear Akrem Benatia,

assuming you have your n-by-n matrix in the CSR arrays *row, *col, *val on the CPU, you can pass this matrix to MAGMA-sparse via

magma_d_matrix A={Magma_CSR};
magma_dcsrset( n, n, row, col, val, &A, queue );

Then, you may want to initialize vectors on the GPU and copy the system to the device:

magma_d_matrix dA={Magma_CSR}, dx={Magma_DENSE}, dy={Magma_DENSE};
magma_dvinit( &dx, Magma_DEV, n, 1, 1.0, queue );
magma_dvinit( &dx, Magma_DEV, n, 1, 0.0, queue );
magma_dmtransfer( A, &dA, Magma_CPU, Magma_DEV, queue );

Now you can run your CSR-SpMV y=Ax:

magma_d_spmv( 1.0, A, x, 0.0, y, queue );
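
For checking the GPU result, it can help to have a plain CPU reference for the operation requested above, y = alpha*A*x + beta*y, on the same CSR arrays. The following is only a minimal sketch in standard C, not MAGMA code: the function name csr_spmv_ref is hypothetical, and it assumes 0-based indices and a row array with n+1 entries.

```c
#include <assert.h>

/* Reference CSR SpMV on the CPU: y = alpha*A*x + beta*y.
 * Assumptions of this sketch: 0-based indices, row[] has n+1 entries,
 * and the nonzeros of row i sit in positions row[i] .. row[i+1]-1. */
void csr_spmv_ref(int n, const int *row, const int *col,
                  const double *val, double alpha,
                  const double *x, double beta, double *y)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        /* accumulate the dot product of row i with x */
        for (int k = row[i]; k < row[i + 1]; ++k) {
            sum += val[k] * x[col[k]];
        }
        y[i] = alpha * sum + beta * y[i];
    }
}
```

Calling it with alpha = 1.0 and beta = 0.0 on host copies of the vectors gives values that the result copied back from the device can be compared against, up to rounding.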


In case you want to use a different format, e.g. SELLP, you have to convert the matrix first:

magma_d_matrix B={Magma_CSR}, dA={Magma_CSR}, dx={Magma_DENSE}, dy={Magma_DENSE};
magma_dvinit( &dx, Magma_DEV, n, 1, 1.0, queue );
magma_dvinit( &dx, Magma_DEV, n, 1, 0.0, queue );

// you can modify parameters in SELLP:
B.blocksize = 32; // as example: row-blocks of 32
B.alignment = 1; // as example: 1 thread per row, multiple are possible
magma_dmconvert( A, &B, Magma_CSR, Magma_SELLP, queue );
magma_dmtransfer( B, &dA, Magma_CPU, Magma_DEV, queue );

Now you can run your SELLP-SpMV y=Ax:

magma_d_spmv( 1.0, A, x, 0.0, y, queue );
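
To get a feeling for what the blocksize and alignment parameters above trade off, one can count how many entries a padded sliced-ELLPACK layout would store for a given CSR matrix. The sketch below is plain C and only illustrates the general idea. It is not MAGMA's internal implementation: the function name sellp_padded_entries is hypothetical, and it assumes that each slice of blocksize rows is padded to its longest row, rounded up to a multiple of alignment, with the last slice padded to a full block.

```c
#include <assert.h>

/* Count the entries (nonzeros plus padding) of a padded sliced-ELLPACK
 * layout built from a CSR row pointer with n+1 entries.
 * Assumptions of this sketch: every slice holds `blocksize` rows, its width
 * is the longest row in the slice rounded up to a multiple of `alignment`,
 * and the last slice is padded to a full block. */
long sellp_padded_entries(int n, const int *row, int blocksize, int alignment)
{
    assert(n >= 0 && blocksize > 0 && alignment > 0);
    long total = 0;
    for (int start = 0; start < n; start += blocksize) {
        int end = (start + blocksize < n) ? start + blocksize : n;
        int width = 0;
        /* longest row in this slice */
        for (int i = start; i < end; ++i) {
            int len = row[i + 1] - row[i];
            if (len > width) width = len;
        }
        /* round the slice width up to a multiple of alignment */
        width = (width + alignment - 1) / alignment * alignment;
        total += (long)width * blocksize;
    }
    return total;
}
```

Comparing the returned count with the number of nonzeros, row[n], shows how much padding a given choice of blocksize and alignment would add for a particular matrix.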


Hope this helps! Please let me know if you have further questions.

Hartwig

Re: Basic code for SpMV on GPU

Posted: Thu Feb 16, 2017 2:21 am
by akamo
Dear Mr. Hartwig,
Thank you very much, that was very helpful.

Akrem

Re: Basic code for SpMV on GPU

Posted: Wed Jun 17, 2020 10:30 am
by llueveYescampa
Hello,
I have the same question here.
While I am trying to compile the library I would like to be sure about some points in this previous answer.

1.- In the initialization of the vectors on the GPU, you wrote:

magma_dvinit( &dx, Magma_DEV, n, 1, 1.0, queue );
magma_dvinit( &dx, Magma_DEV, n, 1, 0.0, queue );

Did you actually mean dy instead of dx the second time?
magma_dvinit( &dx, Magma_DEV, n, 1, 1.0, queue );
magma_dvinit( &dy, Magma_DEV, n, 1, 0.0, queue );

2.- For actually running the SpMV, you wrote:

Now you can run your CSR-SpMV y=Ax:
magma_d_spmv( 1.0, A, x, 0.0, y, queue );

But the matrix on the GPU device is dA and the vectors are dx and dy. Is that correct?
Shouldn't the call be:

magma_d_spmv( 1.0, dA, dx, 0.0, dy, queue );

Thank you very much in advance.

P.S.: I am trying to compile the MAGMA library using CMake, but I am getting:
/usr/bin/ld: cannot find -lpthreads

However, pthreads is indeed installed on my system. I am trying to figure this out. Have you had this problem before?

Thanks again.


