## dgetrf from C

Open discussion regarding features, bugs, issues, vendors, etc.

### dgetrf from C

I'm using LAPACK from C and I'm faced with the typical problem of accessing a MxN matrix A, which in FORTRAN is stored in column-major order as dgetrf expects.

In calling dgetrf from C, because my matrix A is stored in row-major order, it computes the LU factorization of A^t (the transpose matrix). Is there another function callable from C which computes the LU factorization of A stored in row-major order? I would be very grateful for any clue in this direction. Many thanks for your consideration
PabloSCasas

Posts: 3
Joined: Wed Jul 13, 2005 11:59 am

Dear Pablo,

> Is there another function callable from C which computes the LU factorization of A stored in row-major order?

No, not in LAPACK. The LU factorization performed by DGETRF only supports column-major matrices.

Note that however in most cases DGETRF can be used in the context of row-major format matrices. For example, if A is square nonsingular stored with row-major format and you want to solve a linear system of equations associated with A, then you can use DGETRF on A, this computes the LU factor of A^T, and finally you obtain the solution by using DGETRS with the first argument TRANS set to 'T'.

Julien Langou
Julien Langou

Posts: 834
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

The solution I can think of is to create your (m x n) matrix as a single vector:
Code: Select all
`double* A = (double *)calloc(m*n, sizeof(double));`

and use a macro to choose either row- or column-major ordering:
Code: Select all
`#define ROWM(x, y, n) (x*n + y) //row-major#define COLM(x, y, m) (y*m + x) //column-major`

Then accessing element A[3][1] when A has height=5 becomes:
Code: Select all
`A[COLM(3, 1, 5)] = ...`

This should essentially translate between C++ array storage and Fortran storage.

Hope this helps.
SEspyder

Posts: 2
Joined: Sun Jul 17, 2005 3:51 pm

Thank you both Julien Langou and SEspyder for your replies.

In my calculations I have to LU factorize NxN matrices for 1200 <= N <= 5000. Up to now
I obtained the LU decomposition with my own written function. The program works but I'd want to improve performance using the ATLAS
functions (from LAPACK) without changing the main structure of my algorithms based
on row-major matrices. Does someone know how to solve this problem?
PabloSCasas

Posts: 3
Joined: Wed Jul 13, 2005 11:59 am

Hello,

item #4 where you will find what I guess is the answer you are looking for.
1. ATLAS is not in LAPACK, ATLAS is an optimized BLAS library independent from
LAPACK which LAPACK can use as any other BLAS library.

LAPACK calls BLAS routines, thus LAPACK needs a BLAS library behind it.

BLAS libraries provide the most basic linear algebra operations (matrix-vector
products, matrix-matrix products, etc., but not many more). LAPACK calls those
routines. You are supposed to use a BLAS optimized for your architecture, in
this case LAPACK should obtain excellent performance.

That's where ATLAS comes in the picture. ATLAS is an optimized BLAS library
that produces excellent performance for almost all the architecture. So you can
link your code with LAPACK and ATLAS, and you will have an efficient
combination.

This strategy has been used in LAPACK to sucessfully obtain high performance on
a lots of platform. The tuning for a given architecture is done at the BLAS
level (in particular at the BLAS-3 level) while the LAPACK code remains
unchanged. This enables us to have a portable (and efficient on a lots platform) code

Note that there is a default BLAS implementation (also called reference BLAS)
in the directory LAPACK that you can use with LAPACK. This enables the tar-ball
lapack.tgz to be self-consistent; but do not expect any good performance from it.

2. If you use LAPACK+(optimized BLAS), you should observe a dramatic speedup in
your application code. We highly encourage you to do so.

3. So now, the problem is that you have a row-major matrix, you want to stick with it,
and you want to do an LU factorization with LAPACK:
1. why do you want an LU factorization? If you indeed just want to solve one (or several) linear system(s)
associated with the matrix then do what I previously wrote:
call dgetrf on A (this does the LU factorization of A^T)
then call dgetrs with the option 'T'
2. if you really want the factor L and U of A (not the one of A^T as in a-) then you might need to
convert your matrix from row-major to column-major format.

4. There is an LU factorization in ATLAS itself. This routine is in general
faster that what you can obtain with LAPACK-3.0+ATLAS (it is recursive not
blocked). So, if you are 'only' concerned in solving a linear system of
equations, then I would simply recommend you (at the date of now at least) that
you directly use the dgetrf routine from ATLAS. Moreover it handles both
like:

Code: Select all
`int ATL_dgetrf(const enum CBLAS_ORDER Order, const int M, const int N,               double *A, const int lda, int *ipiv);`

Julien Langou
Julien Langou

Posts: 834
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

Thank you Julien for your advises. This is in fact what I have done: I've taken ATL_dgetrf in the row-major case. The little trick is that it returns a factor like: A P = L U, ie. it does column pivoting among other things.

My matrices are K x (K-1) but the decomposition I used up to now (P A = L U) can be modified to work with the one ATL_dgetrf returns.
PabloSCasas

Posts: 3
Joined: Wed Jul 13, 2005 11:59 am