Page 1 of 1

C call Fortan, column major

PostPosted: Mon Mar 26, 2007 5:29 am
by hongbo
Hello,

I thought I understood the issue of "column major" in Fortran, and "row major" in C. But after a test, I got confused. I appreciate if you can help.

Following is just a simple example illustrate the issue.

I have a C program, I want to call PDSYEV to get eigenvectors of a matrix M.

Assuming M is a 3X3 matrix.
In C program, I created a 1D array: A[9] = {1, 4, 5, 0, 2, 6, 0, 0, 3}, then I called PDSYEV_ using A as input matrix, and using upper triangle by the letter 'U'.

Because Fortran is "column major", which means in memory adjacent element belong to same column (ignore those on boundary). So if I write in 2D format, A should be the matrix like:

1 0 0
4 2 0
5 6 3

For better reference, let's call the above matrix M1.

By using upper triangle for the eigenvalue problem, all those 4, 5, 6 elements should be ignored, and I should get eigenvalue 1, 2, and 3.

However, I got eigenvalues: -3.66868, -2.50729, 12.176
which are the eigenvalues of the matrix (if only consider upper triangle)

1 4 5
0 2 6
0 0 3

Let's call this matrix M2.

===
To recap,
By definition of "column major", my 1D array should represent M1,
but the ScaLapack routine gave eigenvalue of M2.

What is wrong here?

PostPosted: Mon Mar 26, 2007 10:17 am
by Julien Langou
Hello,
There is nothing wrong with what you said except that ScaLAPACK works fine.
Below a (quick and dirty) demo. On one process (well the matrix is 3-by-3). The code has to be compiled with
Code: Select all
mpicc forum-357.c -L/home/faculty/langou/optproject/lib -lscalapack -lblacsCinit -lblacs -lreflapack -lrefblas -lg2c

executed with
Code: Select all
mpirun -np 1 ./a.out

it looks like:
Code: Select all
extern int dsyev_(char *jobz, char *uplo, int *n, double *a,
   int *lda, double *w, double *work, int *lwork,
   int *info);
extern void pdsyev_( char *jobz, char *uplo, int *n,
          double *a, int *ia, int *ja, int *desca, double *w, double *z, int *iz, int *jz, int *descz,
          double *work, int *lwork, int *info );
int main(int argc, char **argv){

   double a[9] = {1, 4, 5, 0, 2, 6, 0, 0, 3}, w[3], work[100], *z;
   int n=3, lda=3, nb=3, lwork=100,info,ione=1,desca[9],izero=0,ictxt,nprocs=1,iam;

   dsyev_("N","U",&n,a,&lda,w,work,&lwork,&info);
   printf("DSYEV:  %e %e %e\n",w[0],w[1],w[2]);

   MPI_Init( &argc, &argv);
   Cblacs_pinfo( &iam, &nprocs ) ;
   Cblacs_get( -1, 0, &ictxt );
   Cblacs_gridinit( &ictxt, "Row", 1, 1 );
   descinit_( desca,  &n, &n, &nb, &nb, &izero, &izero, &ictxt, &n, &info );
   pdsyev_( "N", "U", &n, a, &ione, &ione, desca, w, z, &ione, &ione, desca, work, &lwork, &info);
   printf("PDSYEV: %e %e %e\n",w[0],w[1],w[2]);
   Cblacs_gridexit( 0 );
   MPI_Finalize();
   return 0;

}

and its output is:
Code: Select all
 
DSYEV:  1.000000e+00 2.000000e+00 3.000000e+00
PDSYEV: 1.000000e+00 2.000000e+00 3.000000e+00

Julien

PostPosted: Mon Mar 26, 2007 10:33 am
by hongbo
Hello,

I solved the problem, it has to do with ScaLapack tool subroutine PDELSET, which distributes global matrix into local ones. Apparently, this routine changes a Lower triangular global matrix into Upper triangular local matrix.

In the example of last post, global matrix A = 1, 4, 5, 0, 2, 6, 0, 0, 3
which corresponds to Lower triangular matrix M1:
1 0 0
4 2 0
5 6 3

I then distribute this matrix into 4 processes (npcol=nprow=2), with nB=2. I found out that the local matrix is, however, Upper triangular. E.g. the local matrix for myrow=mycol=0 is
1 0 4 2

i.e., a perfect column-major 2X2 upper triangular matrix:
1 4
0 2

=============
Following are my C++ code and output.

Code: Select all
#include <iostream>
#include <fstream>

#include "blacs.h"      // for  Cblacs_ ...
#include "scalapack.h"  // for  numroc_, descinit_, pdelset_, pdsyev_

#include "mpi.h"              // for  mpi calls

using namespace std;


void get_eigenvalue_para(int nA, double **A, double *v);


main(int ac, char **av)
{
    int    mpirank, mpisize;
    double **A, *v;
    int    dimA;

    MPI_Init(&ac, &av);
    MPI_Comm_rank(MPI_COMM_WORLD, &mpirank);
    MPI_Comm_size(MPI_COMM_WORLD, &mpisize);


    dimA=3;

    A = new double* [dimA];
    A[0] = new double [dimA*dimA];
    A[1] = A[0] + dimA;
    A[2] = A[1] + dimA;

    v = new double[dimA];

   
    A[0][0] = 1.; A[0][1] = 4.; A[0][2] = 5.;
    A[1][0] = 0.; A[1][1] = 2.; A[1][2] = 6.;
    A[2][0] = 0.; A[2][1] = 0.; A[2][2] = 3.;

    if (mpirank==0) {
   cout << "Global matrix elements are:\n";
   for (int i=0; i<dimA*dimA; ++i)
       cout << '\t' << A[0][i];
   cout << endl;
    }
   
    get_eigenvalue_para(dimA, A, v);

    if (mpirank==0) {
   cout << "EIGENVALUES:\n";
   for (int i=0; i<dimA; ++i)
       cout << '\t' << v[i];
   cout << endl;
    }

    //delete2D(A);
    delete[] A[0];
    delete[] A;
    delete[] v;


    MPI_Finalize();
    return 0;
}




void get_eigenvalue_para(int nA, double **A, double *v) {

    char letterR = 'R';
    char letterN = 'N';
    char letterU = 'U';

    int  izero = 0;
    int  ione  = 1;

    // mpi variables
    int  mpirank, mpisize;

    // grid variables
    int  icntx, nprow=2, npcol=2, myrow, mycol;

   
    Cblacs_pinfo (mpirank, mpisize);

    char flnm[10];
    sprintf(flnm, "out%d", mpirank);
    ofstream fl(flnm);


    Cblacs_get   (-1, 0,   icntx);

    Cblacs_gridinit(icntx, letterR, nprow, npcol);
    Cblacs_gridinfo(icntx,          nprow, npcol, myrow, mycol);

    //  if not in grid, do nothing
    if (myrow == -1) return;


    // =========================================================
    //     real calculation begins
    // =========================================================

    int     iwork;
    double *lA, *work;

    //  local eigenvector matrix lZ is not referenced
    //    when only calculate eigenvalue.
    //  set it to zero to avoid compiler complaining
    //
    double *lZ = 0;

    int     nB = 2;

    int     mlA, nlA;
    int     desclA[9], desclZ[9];
    int     info;


    //    calc size of local matrix
    mlA = numroc_(nA, nB, myrow, izero, nprow);
    nlA = numroc_(nA, nB, mycol, izero, npcol);

    lA = new double[mlA*nlA];

    iwork = (mlA>1 ? mlA : 1);

    descinit_(desclA, nA, nA, nB, nB, izero, izero, icntx, iwork, info);
    descinit_(desclZ, nA, nA, nB, nB, izero, izero, icntx, iwork, info);


    //  set local matrix
    for (int i=1; i<=nA; ++i)
   for (int j=1; j<=nA; ++j)
       pdelset_(lA, i, j, desclA, A[i-1][j-1]);


    // write debug info into different files
    fl << "On process: " << mpirank
       << ", local matrix size = " << mlA << 'X' << nlA
       << "\nLocal matrix elements are:";

    for (int i=0; i<mlA*nlA; ++i)
   fl << '\t' << lA[i];
    fl << endl;

    //  get correct size of work array
    work = new double[2];
    iwork = -1;
    pdsyev_(letterN, letterU, nA,
       lA, ione, ione, desclA,   v,
       lZ, ione, ione, desclZ,   work, iwork, info);

    iwork = (int)work[0];
    delete[] work;


    //  real calculation
    work = new double[iwork];
    pdsyev_(letterN, letterU, nA,
       lA, ione, ione, desclA,   v,
       lZ, ione, ione, desclZ,   work, iwork, info);


    delete[] lA;
    delete[] work;
    fl.close();

    Cblacs_gridexit(icntx);

    return;
}


Screen output:
    Global matrix elements are:
    1 4 5 0 2 6 0 0 3
    EIGENVALUES:
    -3.66868 -2.50729 12.176


Content of output file "out0"
    On process: 0, local matrix size = 2X2
    Local matrix elements are: 1 0 4 2


Content of output file "out1"
    On process: 1, local matrix size = 2X1
    Local matrix elements are: 5 6


Content of output file "out2"
    On process: 2, local matrix size = 1X2
    Local matrix elements are: 0 0


Content of output file "out3"
    On process: 3, local matrix size = 1X1
    Local matrix elements are: 3


PostPosted: Mon Mar 26, 2007 10:50 am
by hongbo
Julien,

Thank you for your reply.

It is after I finished my second post that I noticed your reply. As I wrote in my second post, I suspect it is the PDELSET that changed the Lower triangle global matrix into Upper triangle local one.

Your "quick and dirty" code did not use the PDELSET call, so you got correct result.

My code was adapted from your old postings, hence the PDELSET calling sequence is unchanged from you old code too. I am now suspect that there is the most questionable place responsible for the change from lower to upper.

I am interested in this because I want to get the eigenvector, and I found that apparently the vector is not column-major.


Let me also thank you for your dedication and contribution in the forum, without your code, I should still be lost in this ScaLapack cloud.

Julien Langou wrote:Hello,
There is nothing wrong with what you said except that ScaLAPACK works fine.
Below a (quick and dirty) demo. On one process (well the matrix is 3-by-3). The code has to be compiled with
Code: Select all
mpicc forum-357.c -L/home/faculty/langou/optproject/lib -lscalapack -lblacsCinit -lblacs -lreflapack -lrefblas -lg2c

executed with
Code: Select all
mpirun -np 1 ./a.out

...
Julien

PostPosted: Mon Mar 26, 2007 11:40 pm
by hongbo
Hi Guys,

Finally I understand what really happened.

It does have to do with the PDELSET_ calling.

Code: Select all
    //  set local matrix
    for (int i=1; i<=nA; ++i)
        for (int j=1; j<=nA; ++j)
            pdelset_(lA, i, j, desclA, A[i-1][j-1]);


The way I allocated memory for matrix A is that A[0] is the 1D array that stores the matrix (M1) using column-major. And I can use A[i] to access i-th column.

Code: Select all
    dimA=3;
                                                                                                 
    A = new double* [dimA];
    A[0] = new double [dimA*dimA];
    A[1] = A[0] + dimA;
    A[2] = A[1] + dimA;


Hence the A[i-1][j-1] calling in the pdelset_() actually means i-th column and j-th row of a column-major matrix (M1). So the calling sequence of setting local matrix should be
Code: Select all
            pdelset_(lA, j, i, desclA, A[i-1][j-1]);

and this solves the problem.

Cheers.