Since I got such kind answers from you previously, I dare to ask a few more
questions.

What is the best method to create the transpose of a matrix (or to switch
from column-major to row-major storage)?
Is it worth doing this with BLAS routines, or is it better to do it with a
double for loop in C (I have no Fortran here)?
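
For reference, the double for loop I have in mind looks roughly like the
sketch below. The function name transpose_colmajor and its argument order are
just my own placeholders, and I am assuming a dense matrix of doubles stored
column-major with no padding (leading dimension equal to the number of rows):

```c
#include <assert.h>
#include <stddef.h>

/* Out-of-place transpose of a rows x cols matrix stored column-major.
 * dst receives the cols x rows transpose, also column-major, which is the
 * same memory layout as the original matrix in row-major order.
 * Hypothetical helper for illustration; src and dst must not overlap. */
void transpose_colmajor(const double *src, double *dst,
                        size_t rows, size_t cols)
{
    assert(src != dst);
    for (size_t j = 0; j < cols; ++j) {
        for (size_t i = 0; i < rows; ++i) {
            /* element (i, j) of src becomes element (j, i) of dst */
            dst[j + i * cols] = src[i + j * rows];
        }
    }
}
```

This walks src contiguously but writes dst with a stride of cols, so I would
expect it to be cache-unfriendly for large matrices unless it is blocked.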

Generally speaking, is it worth using BLAS instead of plain C where a single
for loop would do (I am thinking of xSCAL, xSWAP, xCOPY, and so on)?
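
As an example of what I mean by a single loop, here is a rough plain-C
stand-in for the double-precision xSCAL operation (x := alpha * x). The name
scale_vec is my own placeholder, and unlike the real BLAS routine this sketch
only handles a positive stride:

```c
#include <assert.h>
#include <stddef.h>

/* Scale n elements of x by alpha, stepping incx elements at a time.
 * Hypothetical helper for illustration; assumes incx > 0. */
void scale_vec(size_t n, double alpha, double *x, size_t incx)
{
    assert(incx > 0);
    for (size_t i = 0; i < n; ++i) {
        x[i * incx] *= alpha;
    }
}
```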

Do you recommend MKL over ATLAS/CLAPACK on a dual Intel Xeon machine?


