magma_<t>gemm API

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)

Postby Volodimir » Tue Jul 03, 2018 12:22 am

Hello,
in the description of the matrix-matrix operations, the value for trans is provided as one of the following:
Code: Select all
    = MagmaNoTrans: op( A ) = A.
    = MagmaTrans: op( A ) = A**T.
    = MagmaConjTrans: op( A ) = A**H.

When I specify any option other than MagmaNoTrans, d_c is not updated by a call such as
Code: Select all
magma_cgemm(MagmaNoTrans, MagmaNoTrans, m, n, k, alpha, d_a, m, d_b, k,
         beta, d_c, m);

However, if I transpose (and conjugate) matrix d_b outside of the function and use MagmaNoTrans, I get correct values in d_c.

Also, the documentation refers to a queue argument but, as with the question I asked in a different thread, I get an error if I try to pass it.
Code: Select all
too many arguments to function ‘void magma_cgemm_v1(magma_trans_t, magma_trans_t, magma_int_t, magma_int_t, magma_int_t, magmaFloatComplex, magmaFloatComplex_const_ptr, magma_int_t, magmaFloatComplex_const_ptr, magma_int_t, magmaFloatComplex, magmaFloatComplex_ptr, magma_int_t)’
    beta, d_c, m, queue);


I would appreciate an explanation.

Re: magma_<t>gemm API

Postby mgates3 » Tue Jul 03, 2018 5:15 pm

Can you provide a minimum working example demonstrating the issue? It's hard to know what is going wrong without seeing your code.

As for the queue, include <magma_v2.h> instead of <magma.h>. Your error message shows the call resolving to magma_cgemm_v1, which takes no queue argument.

-mark

Re: magma_<t>gemm API

Postby mgates3 » Tue Jul 03, 2018 5:26 pm

Also, you can try the tester, magma/testing/testing_cgemm, to see that the various transpose options do work. Here C is 200x100, A is either 200x300 (transA=NoTrans) or 300x200 (transA=ConjTrans), and B is either 300x100 (transB=NoTrans) or 100x300 (transB=ConjTrans).

(Redundant headers in output omitted.)

Code: Select all
magma> ./testing/testing_cgemm -n 200,100,300 --check -NN
% MAGMA 2.3.0 svn compiled for CUDA capability >= 3.0, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 8000, driver 9000. OpenMP threads 4.
% device 0: GeForce GT 750M, 925.5 MHz clock, 2047.6 MiB memory, capability 3.0
% Tue Jul  3 17:21:01 2018
% Usage: ./testing/testing_cgemm [options] [-h|--help]

% If running lapack (option --lapack), MAGMA and cuBLAS error are both computed
% relative to CPU BLAS result. Else, MAGMA error is computed relative to cuBLAS result.

% transA = No transpose, transB = No transpose
%   M     N     K   MAGMA Gflop/s (ms)  cuBLAS Gflop/s (ms)   CPU Gflop/s (ms)  MAGMA error  cuBLAS error
%========================================================================================================
  200   100   300     37.83 (   1.27)      43.17 (   1.11)     ---   (  ---  )    2.10e-09        ---    ok

magma> ./testing/testing_cgemm -n 200,100,300 --check -NC
...
% transA = No transpose, transB = Conjugate transpose
%   M     N     K   MAGMA Gflop/s (ms)  cuBLAS Gflop/s (ms)   CPU Gflop/s (ms)  MAGMA error  cuBLAS error
%========================================================================================================
  200   100   300     42.14 (   1.14)      46.02 (   1.04)     ---   (  ---  )    2.41e-09        ---    ok

magma> ./testing/testing_cgemm -n 200,100,300 --check -CN
...
% transA = Conjugate transpose, transB = No transpose
%   M     N     K   MAGMA Gflop/s (ms)  cuBLAS Gflop/s (ms)   CPU Gflop/s (ms)  MAGMA error  cuBLAS error
%========================================================================================================
  200   100   300     54.05 (   0.89)      61.95 (   0.77)     ---   (  ---  )    2.38e-09        ---    ok

magma> ./testing/testing_cgemm -n 200,100,300 --check -CC
...
% transA = Conjugate transpose, transB = Conjugate transpose
%   M     N     K   MAGMA Gflop/s (ms)  cuBLAS Gflop/s (ms)   CPU Gflop/s (ms)  MAGMA error  cuBLAS error
%========================================================================================================
  200   100   300     40.71 (   1.18)      45.07 (   1.07)     ---   (  ---  )    2.12e-09        ---    ok
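
The operand shapes behind these four runs follow the usual BLAS gemm rule: C is m-by-n, op(A) is m-by-k, and op(B) is k-by-n, so a transposed operand is stored with its two dimensions swapped. Here is a minimal sketch of that rule. The `Op` enum and the `stored_dims_*` helpers are hypothetical names made up for illustration; they are not part of MAGMA.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for magma_trans_t (illustration only, not the MAGMA type).
enum class Op { NoTrans, Trans, ConjTrans };

// Stored (rows, cols) of a matrix X such that op(X) is rows-by-cols.
inline std::pair<int, int> stored_dims(Op op, int rows, int cols) {
    assert(rows >= 0 && cols >= 0);
    return op == Op::NoTrans ? std::make_pair(rows, cols)
                             : std::make_pair(cols, rows);
}

// For C (m-by-n) = op(A) * op(B): op(A) is m-by-k and op(B) is k-by-n.
inline std::pair<int, int> stored_dims_A(Op transA, int m, int k) {
    return stored_dims(transA, m, k);
}
inline std::pair<int, int> stored_dims_B(Op transB, int k, int n) {
    return stored_dims(transB, k, n);
}
```

With m=200, n=100, k=300 as in the tester runs, `stored_dims_B(Op::ConjTrans, 300, 100)` gives a 100-by-300 array, whose leading dimension must therefore be at least 100.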

Re: magma_<t>gemm API

Postby Volodimir » Fri Jul 06, 2018 1:13 am

Hello,
I have Python code that I am trying to convert to C++ (and accelerate using MAGMA). In Python, the operation in question looks like:
Code: Select all
R2 = Rx.dot(np.matrix.getH(Rx));

So with MAGMA I tried various permutations of
Code: Select all
   // allocate matrices on the device
   err = magma_cmalloc(&d_a, mk);      // device memory for a
   err = magma_cmalloc(&d_b, km);      // device memory for b
   err = magma_cmalloc(&d_c, mm);      // device memory for c

   // copy data from host to device
   magma_csetmatrix( m, k, Rx, m, d_a, m, queue);       // copy a (Rx, m-by-k) -> d_a
   magma_csetmatrix( k, m, Rx_TC, k, d_b, k, queue);    // copy b (Rx_TC, k-by-m) -> d_b
   


followed by

Code: Select all
magma_cgemm(MagmaNoTrans, MagmaConjTrans, m, m, k, alpha, d_a, m, d_b, k, beta, d_c, m, queue);


It appears to me that the only valid option is MagmaNoTrans, or I am missing something, as usual.

Re: magma_<t>gemm API

Postby mgates3 » Fri Jul 06, 2018 2:02 am

It appears you have:

Matrix A is m-by-k.
Matrix B is k-by-m.
Matrix C is m-by-m.

Then, just based on dimensions, you would need the NoTrans, NoTrans options:

C = alpha * A * B + beta * C

Mathematically, if you try transposing B:

C = alpha * A * B^T + beta * C

the dimensions for B^T are incompatible with A and C. Of course, if B had dimensions m-by-k, then B^T (or B^H) would be correct, as far as dimensions go.
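
For the Python expression R2 = Rx * Rx^H, this suggests that the untransposed m-by-k Rx could serve as both A and B, with transB = ConjTrans and ldb = m. Below is a minimal CPU sketch of the gemm definition on column-major storage, to check that this gives the same result as pre-transposing B and using NoTrans. The names `ref_cgemm`, `op_elem`, `conj_transpose` and `Op` are hypothetical helpers written for this illustration, not MAGMA functions.

```cpp
#include <cassert>
#include <complex>
#include <vector>

using cplx = std::complex<float>;

// Hypothetical stand-in for magma_trans_t (illustration only).
enum class Op { NoTrans, Trans, ConjTrans };

// Element (i, j) of op(X); X is column-major with leading dimension ldx.
inline cplx op_elem(Op op, const std::vector<cplx>& X, int ldx, int i, int j) {
    if (op == Op::NoTrans) return X[i + j * ldx];
    const cplx x = X[j + i * ldx];            // element (j, i) of the stored X
    return op == Op::Trans ? x : std::conj(x);
}

// CPU reference: C = alpha * op(A) * op(B) + beta * C,
// with C m-by-n, op(A) m-by-k, op(B) k-by-n.
void ref_cgemm(Op transA, Op transB, int m, int n, int k, cplx alpha,
               const std::vector<cplx>& A, int lda,
               const std::vector<cplx>& B, int ldb,
               cplx beta, std::vector<cplx>& C, int ldc) {
    // Each leading dimension must cover the stored row count of its matrix.
    assert(lda >= (transA == Op::NoTrans ? m : k));
    assert(ldb >= (transB == Op::NoTrans ? k : n));
    assert(ldc >= m);
    const cplx zero(0.0f, 0.0f);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            cplx sum = zero;
            for (int p = 0; p < k; ++p)
                sum += op_elem(transA, A, lda, i, p) * op_elem(transB, B, ldb, p, j);
            // BLAS convention: when beta is zero, C is overwritten, not read.
            const cplx old = (beta == zero) ? zero : beta * C[i + j * ldc];
            C[i + j * ldc] = alpha * sum + old;
        }
    }
}

// Explicit conjugate transpose of a rows-by-cols column-major matrix.
std::vector<cplx> conj_transpose(const std::vector<cplx>& X, int rows, int cols) {
    std::vector<cplx> Y(X.size());
    for (int j = 0; j < cols; ++j)
        for (int i = 0; i < rows; ++i)
            Y[j + i * cols] = std::conj(X[i + j * rows]);
    return Y;
}
```

If this reasoning carries over, the MAGMA call would presumably pass d_a for both operands, i.e. magma_cgemm(MagmaNoTrans, MagmaConjTrans, m, m, k, alpha, d_a, m, d_a, m, beta, d_c, m, queue), with no separately transposed copy needed.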

You don't show setting the matrix C. If it is unset, then beta = 0 would be required.
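
To illustrate why an unset C matters: under the reference BLAS convention, beta = 0 means C is overwritten and never read, while any nonzero beta pulls whatever is in C into the result. The sketch below models one entry of the update with two hypothetical helpers (`update_naive`, `update_blas`), written only for illustration. It assumes IEEE NaN semantics and uses NaN as a stand-in for uninitialized device memory.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

using cplx = std::complex<float>;

// One entry of the gemm update, c = alpha * ab + beta * c, where ab is the
// already-computed entry of op(A) * op(B).

// Naive version: always reads c, so garbage in c leaks through even if beta == 0.
inline cplx update_naive(cplx alpha, cplx ab, cplx beta, cplx c) {
    return alpha * ab + beta * c;
}

// BLAS-style version: beta == 0 means "overwrite", so c is never read.
inline cplx update_blas(cplx alpha, cplx ab, cplx beta, cplx c) {
    if (beta == cplx(0.0f, 0.0f)) return alpha * ab;
    return alpha * ab + beta * c;
}

// True if either component of z is NaN.
inline bool has_nan(cplx z) {
    return std::isnan(z.real()) || std::isnan(z.imag());
}
```

So if d_c comes straight from magma_cmalloc without being set, passing beta = 0 is the safe choice; otherwise copy or zero C on the device first.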

-mark