LAPACK Archives

[Lapack] copy transpose in LAPACK


What is the best method to create the transpose of a matrix (or switch 
from colmajor to rowmajor)? Is it worth doing it with BLAS routines, or is 
it better to do it with a double for loop in C (I have no Fortran here)?

There is no BLAS routine that performs an in-place transpose of a matrix.
What exactly do you mean?

If you mean, should I do:

        convert A from row major to col major
        convert B from row major to col major
        convert C from row major to col major
        call dgemm ( 'N', 'N', m, n, k, alpha, A, lda, B, ldb, beta, C, ldc )
        convert C from col major to row major
        convert B from col major to row major
        convert A from col major to row major

or

        call dgemm ( 'N', 'N', n, m, k, alpha, B, ldb, A, lda, beta, C, ldc )

(a row-major matrix read as column-major is its transpose, and
(A*B)^T = B^T * A^T, so swapping the operands and the dimensions m and n
is enough)?

There is no question: the second approach is better. This is the one used 
in the CBLAS wrapper available at:
        http://www.netlib.org/blas/blast-forum/cblas.tgz
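
As a rough illustration of the operand-swap idea for the no-transpose case,
here is a minimal sketch in C. It does not call the real BLAS: ref_dgemm_nn
is a hypothetical, naive stand-in for a column-major dgemm('N','N',...), and
rowmajor_dgemm_nn is a hypothetical wrapper name. The only point is that
row-major data can be handed to a column-major routine unchanged, as long as
A and B (and m and n) trade places.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine, NOT the real BLAS:
 * C = alpha*A*B + beta*C for column-major A (m x k), B (k x n), C (m x n). */
void ref_dgemm_nn(size_t m, size_t n, size_t k, double alpha,
                  const double *a, size_t lda,
                  const double *b, size_t ldb,
                  double beta, double *c, size_t ldc)
{
    assert(lda >= m && ldb >= k && ldc >= m);
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p)
                sum += a[p * lda + i] * b[j * ldb + p];
            c[j * ldc + i] = alpha * sum + beta * c[j * ldc + i];
        }
    }
}

/* Hypothetical wrapper: C = alpha*A*B + beta*C for ROW-major A (m x k),
 * B (k x n), C (m x n). Row-major X is column-major X^T, and
 * C^T = B^T * A^T, so the operands and the dimensions m, n are swapped.
 * No data is copied or transposed. */
void rowmajor_dgemm_nn(size_t m, size_t n, size_t k, double alpha,
                       const double *a, size_t lda,
                       const double *b, size_t ldb,
                       double beta, double *c, size_t ldc)
{
    ref_dgemm_nn(n, m, k, alpha, b, ldb, a, lda, beta, c, ldc);
}
```

With a real BLAS the same swap would apply to the dgemm call itself; the
naive triple loop above is only there to keep the sketch self-contained.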

In the end, though, the best option may simply be to store your matrices 
in column-major order in your C code.
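
If an explicit layout conversion is still needed somewhere, a plain double
loop does the job out of place. The sketch below is only an illustration:
colmajor_to_rowmajor is a hypothetical helper name, and the leading
dimensions lda and ldb follow the usual BLAS meaning (distance in elements
between consecutive columns, respectively rows).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy the m-by-n column-major matrix a (leading
 * dimension lda >= m) into b as an m-by-n row-major matrix (leading
 * dimension ldb >= n). Read as column-major, b holds the transpose of a.
 * a and b must not overlap. */
void colmajor_to_rowmajor(size_t m, size_t n,
                          const double *a, size_t lda,
                          double *b, size_t ldb)
{
    assert(lda >= m && ldb >= n);
    for (size_t j = 0; j < n; ++j)       /* column j of a           */
        for (size_t i = 0; i < m; ++i)   /* contiguous reads from a */
            b[i * ldb + j] = a[j * lda + i];
}
```

Element (i,j) of a column-major matrix lives at a[j*lda + i], which is the
only indexing rule needed to keep matrices in column-major order in C.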

Generally speaking, is it worth using BLAS instead of C where a single 
for loop would do in C (I am thinking of xSCAL, xSWAP, xCOPY and so on)?

Good question. Any reasonable compiler should be able to optimize a 
single loop pretty well. The main benefit of the BLAS 1 routines is the 
readability of the code.
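
For comparison, here is roughly what such single loops look like in C. The
names scal_loop and swap_loop are hypothetical; they mirror what xSCAL and
xSWAP do for positive strides, and are a sketch rather than a replacement
for the library routines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical loop version of xSCAL: x := alpha * x
 * over n elements spaced incx apart (incx >= 1). */
void scal_loop(size_t n, double alpha, double *x, size_t incx)
{
    assert(incx >= 1);
    for (size_t i = 0; i < n; ++i)
        x[i * incx] *= alpha;
}

/* Hypothetical loop version of xSWAP: exchange n elements of x and y,
 * spaced incx and incy apart (both >= 1). */
void swap_loop(size_t n, double *x, size_t incx, double *y, size_t incy)
{
    assert(incx >= 1 && incy >= 1);
    for (size_t i = 0; i < n; ++i) {
        double tmp = x[i * incx];
        x[i * incx] = y[i * incy];
        y[i * incy] = tmp;
    }
}
```

A call such as scal_loop(n, alpha, x, 1) says the same thing as the loop
it wraps, which is the readability argument in a nutshell.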

Would you recommend MKL over ATLAS/CLAPACK on an Intel dual Xeon
machine?

Sure, you can give it a try. I do not have any data at hand, but MKL 
should perform slightly better than ATLAS/CLAPACK. (Just a guess.)

On this machine, link with
        -lptatlas -lpthread
if you want to use both CPUs during the BLAS part.

Julien



On Mon, 27 Mar 2006, Laszlo Sragner wrote:

Hi!

Since I got kind answers from you previously, I dare to ask a few more 
questions.

What is the best method to create the transpose of a matrix (or switch 
from colmajor to rowmajor)? Is it worth doing it with BLAS routines, or is 
it better to do it with a double for loop in C (I have no Fortran here)?

Generally speaking, is it worth using BLAS instead of C where a single 
for loop would do in C (I am thinking of xSCAL, xSWAP, xCOPY and so on)?

Would you recommend MKL over ATLAS/CLAPACK on an Intel dual Xeon
machine?

Cheers,

Laszlo
