2 posts
• Page **1** of **1**

Is there more documentation than the header file on how to use magmablas_stranspose?

- agruber
**Posts:** 1 **Joined:** Wed Mar 03, 2010 11:14 am

There isn't, because for now the function is only used internally. The function declaration is:

```cpp
extern "C" void
magmablas_stranspose( float *odata, int ldo,
                      float *idata, int ldi,
                      int m, int n );
```

It takes an input m x n matrix in idata with leading dimension ldi (>= m) and transposes it, writing the output to odata with leading dimension ldo (>= n). The implementation requires m and n to be divisible by 32. Coalesced memory accesses (and hence high performance) are achieved if ldo and ldi are divisible by 16 and the odata and idata addresses are divisible by 16*sizeof(float). The function does not give feedback on wrong input. For example, if m%32 != 0 and (m/32 + 32) <= ldi, the matrix will be transposed correctly, but so will the strip of padding rows from m to m/32 + 32 (as a side effect; those rows do not need to be initialized).
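Since the routine does not report bad input, it may help to check the arguments on the host before calling it. The following is only a sketch in plain C: `round_up_32`, `stranspose_args_ok`, `stranspose_args_coalesced` and `ref_stranspose` are hypothetical helpers, not part of MAGMA, and the code assumes the usual column-major (LAPACK-style) storage implied by the leading dimensions above. The reference transpose runs on the CPU and is meant for comparing against the GPU result on small cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: smallest multiple of 32 that is >= x (x >= 0). */
static int round_up_32(int x)
{
    return (x + 31) / 32 * 32;
}

/* Hypothetical helper: 1 if the hard requirements stated above hold
   (m and n divisible by 32, ldi >= m, ldo >= n), else 0. */
static int stranspose_args_ok(int ldo, int ldi, int m, int n)
{
    return m >= 0 && n >= 0
        && m % 32 == 0 && n % 32 == 0
        && ldi >= m && ldo >= n;
}

/* Hypothetical helper: 1 if the extra conditions for coalesced access hold
   (leading dimensions divisible by 16, addresses divisible by
   16*sizeof(float)), else 0. */
static int stranspose_args_coalesced(const float *odata, int ldo,
                                     const float *idata, int ldi)
{
    const uintptr_t align = 16 * sizeof(float);
    return ldo % 16 == 0 && ldi % 16 == 0
        && (uintptr_t)odata % align == 0
        && (uintptr_t)idata % align == 0;
}

/* CPU reference, column-major: odata(j,i) = idata(i,j) for an m x n input.
   Unlike the GPU routine described above, this works for any m and n. */
static void ref_stranspose(float *odata, int ldo,
                           const float *idata, int ldi,
                           int m, int n)
{
    assert(ldi >= m && ldo >= n);
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            odata[j + (size_t)i * ldo] = idata[i + (size_t)j * ldi];
}
```

One way to use this would be to allocate with `round_up_32(m)` and `round_up_32(n)` as the padded sizes, call `stranspose_args_ok` on the padded values, and only then launch the GPU transpose.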

There is also another function that does an in-place transpose for square matrices:

```cpp
extern "C" void
magmablas_sinplace_transpose( float *A, int lda, int n );
```

The requirements for lda and n are the same as in the stranspose function.
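For checking the in-place variant on small inputs, a CPU reference can be sketched in plain C as below. `ref_sinplace_transpose` is a hypothetical name, not a MAGMA function, and the code assumes column-major storage with leading dimension lda >= n. It swaps each element below the diagonal with its mirror above the diagonal and leaves any padding rows untouched.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference, column-major: swap A(i,j) with A(j,i) for all i > j.
   Hypothetical helper for validation; works for any n, not only
   multiples of 32. */
static void ref_sinplace_transpose(float *A, int lda, int n)
{
    assert(lda >= n);
    for (int j = 0; j < n; ++j) {
        for (int i = j + 1; i < n; ++i) {
            float tmp = A[i + (size_t)j * lda];
            A[i + (size_t)j * lda] = A[j + (size_t)i * lda];
            A[j + (size_t)i * lda] = tmp;
        }
    }
}
```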

- Stan Tomov
**Posts:** 258 **Joined:** Fri Aug 21, 2009 10:39 pm