Why is Dgemm giving me the answer transposed?

Open discussion regarding features, bugs, issues, vendors, etc.

Why is Dgemm giving me the answer transposed?

Postby ctcsback » Thu Mar 31, 2011 7:50 pm

I thought the dgemm would take care of the fortran column major problem but calling dgemm to solve for C = A*B only gives the C correctly in a transposed format. I can't get it to produce C non-transposed. I implemented a naive transpose function so I'd have the correct answer for my project, but profiling shows that transpose is taking 2x more time than dgemm itself. Which leads me to another question. Is there an efficient transpose blas routine?

Aside from that, here is my test code:

Code: Select all
//~returns random matrix
double* matrix_random(double *A, int row, int col)
{
  int i,j;
  srand(time(NULL));
  for(i=0;i<row;i++)
  {
    for(j=0;j<col;j++)
    {
      A[i*col + j] = rand() % 5;
    }
  }
  return A;
}

dgemm_test(int rowsA, int colsB, int common)
{
  int i,j,k;
  char trans = 'T';
  char norm = 'N';
  double one = 1.0;
  double zero = 0.0;
  double neg = -1.0;
  double *A, *B, *C, *D;
  A = malloc(sizeof(double)*rowsA*common);
  B = malloc(sizeof(double)*colsB*common);
  C = calloc(rowsA*colsB,sizeof(double));
  D = calloc(rowsA*colsB,sizeof(double));

  A = matrix_random(A,rowsA,common);
  B = matrix_random(B,common,colsB);

  //~ Do a matrix multiply: A*B = C
  for(i=0;i<rowsA;i++)
  {
    for(j=0;j<colsB;j++)
    {
      C[i*colsB+j]=0;
      for(k=0;k<common;k++)
      {
        C[i*colsB+j]+=A[i*common+k]*B[k*colsB+j];
      }
    }
  }
  dgemm_(&trans, &trans, &rowsA, &colsB, &common, &one, A, &common, B, &colsB, &zero, C, &rowsA);


}

int main(void)
{
  dgemm_test(4, 3, 2);
  return 0;
}


Example output:

A, resulting 4 by 2 matrix
0, 0,
0, 4,
3, 1,
3, 2,

B, resulting 2 by 3 matrix
0, 0, 0,
4, 3, 1,

C, resulting 4 by 3 matrix
0, 0, 0,
16, 12, 4,
4, 3, 1,
8, 6, 2,

D, resulting 4 by 3 matrix
0, 16, 4,
8, 0, 12,
3, 6, 0,
4, 1, 2,


If I print C and D here, they are just the same except transposed. Using the 'n' function does not help. It only specifies whether or not the input (A and B) is transposed. Because the output of A*B is transposed, I have to perform a transpose function before I can add another matrix to it. Is there a way to get the C result to not be transposed?
ctcsback
 
Posts: 10
Joined: Tue Oct 26, 2010 3:26 am

Re: Why is Dgemm giving me the answer transposed?

Postby admin » Fri Apr 01, 2011 4:33 am

Hi,
first your random routine is not working correctly.
Looks like the elements of A are the same than B.

Then , you should use cblas, it has a RowMajor / Col Major argument.
You can find it here: http://www.netlib.org/blas/blast-forum/cblas.tgz

Julie
admin
Site Admin
 
Posts: 504
Joined: Wed Dec 08, 2004 7:07 pm

Re: Why is Dgemm giving me the answer transposed?

Postby ctcsback » Sat Apr 02, 2011 6:39 pm

admin wrote:Hi,
first your random routine is not working correctly.
Looks like the elements of A are the same than B.

Then , you should use cblas, it has a RowMajor / Col Major argument.
You can find it here: http://www.netlib.org/blas/blast-forum/cblas.tgz

Julie


Thanks for the help. I finally got cblas working(ish) and got cblas_dgemm function to spit out what I wanted. However, I just wanted to ask about/point out its behavior, because following the documentation I found for the functions gave me illegal value errors and seg faults.

I ended up modifying my dgemm test function (and fixing my random function, thanks for pointing it out, even though it wasn't crucial). Code is here:

Code: Select all
dgemm_test(int rowsA, int colsB, int common)
{
  int i,j,k,w;
  double *A, *B, *C, *D;
  char trans = 'T';
  char norm = 'N';
  double one = 1.0;
  double zero = 0.0;
  double neg = -1.0;
  enum CBLAS_ORDER order;
  enum CBLAS_TRANSPOSE transa;
  enum CBLAS_TRANSPOSE transb;

  order = CblasRowMajor;
  transa = CblasNoTrans;
  transb = CblasNoTrans;

  A = (double *)malloc(sizeof(double)*rowsA*common);
  B = (double *)malloc(sizeof(double)*colsB*common);
  C = (double *)calloc(rowsA*colsB,sizeof(double));
  D = (double *)calloc(rowsA*colsB,sizeof(double));

  A = matrix_random(A,rowsA,common);
  B = matrix_random(B,common,colsB);

  //~ Do a matrix multiply: A*B = C
  for(i=0;i<rowsA;i++)
  {
    for(j=0;j<colsB;j++)
    {
      C[i*colsB+j]=0;
      for(k=0;k<common;k++)
      {
        C[i*colsB+j]+=A[i*common+k]*B[k*colsB+j];
      }
    }
  }
  //Print A, B,C here

  //~ This was my previous call
  //~ dgemm_(&trans, &trans, &rowsA, &colsB, &common, &one, A, &common, B, &colsB, &zero, D, &rowsA);

  cblas_dgemm(order,transa,transb,rowsA,colsB,common,1.0,A,common,B,colsB,0.0,D,colsB);
  //Print D here
}


This is what ended up working for me, but it wasn't close to what links like http://www.prism.gatech.edu/~ndantam3/cblas-doc/doc/html/cblas_8h.html#cbc643095eb54edbffb0c5e9080a2450 showed - my lda, ldb, ldc values are all different. Also, I was still wondering if there are any optimized versions of matrix transpose out there. Just curious if you know of any.

Thanks again.
ctcsback
 
Posts: 10
Joined: Tue Oct 26, 2010 3:26 am

Re: Why is Dgemm giving me the answer transposed?

Postby admin » Tue Apr 05, 2011 5:53 am

DGEMM routine are implemented in the BLAS libraries.
You can take a look here: http://en.wikipedia.org/wiki/Basic_Line ... ubprograms
Most of them (MKL, ATLAS, ACML, Accelerate ...) are also including the CBLAS.
Julie
admin
Site Admin
 
Posts: 504
Joined: Wed Dec 08, 2004 7:07 pm


Return to User Discussion

Who is online

Users browsing this forum: Google [Bot] and 2 guests