Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)


Postby agruber » Wed Mar 03, 2010 11:45 am

Is there more documentation than the header file on how to use magmablas_stranspose?

Re: magmablas_stranspose

Postby Stan Tomov » Thu Mar 04, 2010 1:45 pm

There isn't, because for now the function is only used internally. Its declaration is:
Code:
extern "C"
void magmablas_stranspose(float *odata, int ldo,
                          float *idata, int ldi,
                          int m, int n );

It takes an m x n input matrix in idata with leading dimension ldi (>= m) and transposes it, writing the n x m result to odata with leading dimension ldo (>= n). The implementation requires m and n to be divisible by 32. Coalesced memory accesses (and hence high performance) are achieved if ldo and ldi are divisible by 16 and the odata and idata addresses are divisible by 16*sizeof(float). The function gives no feedback on wrong input. For example, if m%32 != 0 and m rounded up to the next multiple of 32, i.e. (m/32)*32 + 32 in integer arithmetic, is <= ldi, the matrix will be transposed correctly, but so will the strip of padding rows from m up to that multiple (as a side effect; those rows do not need to be initialized).
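
To make the indexing and the size requirements concrete, here is a minimal CPU sketch of the semantics described above. It is not MAGMA code: ref_stranspose, stranspose_dims_ok and round_up_32 are hypothetical names made up for illustration, and column-major storage is an assumption (it is what the ldi >= m requirement suggests). The real routine may well differ in how it is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a MAGMA function:
   round x up to the next multiple of 32. */
static int round_up_32(int x)
{
    return ((x + 31) / 32) * 32;
}

/* Hypothetical check, not a MAGMA function: returns 1 if the size
   requirements stated in the post hold (m and n divisible by 32,
   ldi >= m, ldo >= n), 0 otherwise. */
static int stranspose_dims_ok(int ldo, int ldi, int m, int n)
{
    return m % 32 == 0 && n % 32 == 0 && ldi >= m && ldo >= n;
}

/* Hypothetical CPU reference, assuming column-major storage:
   idata holds an m x n matrix with leading dimension ldi,
   odata receives the n x m transpose with leading dimension ldo. */
static void ref_stranspose(float *odata, int ldo,
                           const float *idata, int ldi,
                           int m, int n)
{
    assert(ldi >= m && ldo >= n);
    for (int j = 0; j < n; ++j)          /* column of the input */
        for (int i = 0; i < m; ++i)      /* row of the input    */
            odata[j + (size_t)i * ldo] = idata[i + (size_t)j * ldi];
}
```

A caller whose m is not a multiple of 32 could, for example, allocate with ldi >= round_up_32(m) and pass the rounded-up value as m, accepting that the padding rows get transposed too.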

There is also another function that does an in-place transpose of square matrices:
Code:
extern "C" void
magmablas_sinplace_transpose( float *A, int lda, int n );

The requirements for lda and n are the same as in the stranspose function.
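
For reference, the effect of an in-place transpose on an n x n matrix can be sketched on the CPU as below. Again this is only an illustration under the column-major assumption: ref_sinplace_transpose is a hypothetical name, not part of MAGMA, and says nothing about how the GPU routine is actually written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference, assuming column-major storage:
   transposes the n x n matrix A (leading dimension lda >= n) in place
   by swapping each element below the diagonal with its mirror. */
static void ref_sinplace_transpose(float *A, int lda, int n)
{
    assert(lda >= n);
    for (int j = 0; j < n; ++j) {
        for (int i = j + 1; i < n; ++i) {
            float tmp = A[i + (size_t)j * lda];
            A[i + (size_t)j * lda] = A[j + (size_t)i * lda];
            A[j + (size_t)i * lda] = tmp;
        }
    }
}
```

Rows from n to lda-1 are padding and are left untouched by this sketch.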
