Hello,
You are right in saying that the CBLAS and the reference BLAS (that you
can get for example from CLAPACK or LAPACK) have different interfaces.
The main difference is the first argument of the CBLAS Level 2 and
Level 3 routines (compare cblas_dgemm with dgemm). This argument, of type
CBLAS_ORDER, is not present in the reference BLAS. The user passes
CblasRowMajor if the matrices are stored in row-major format, or
CblasColMajor if the matrices are stored in column-major format.
The reference BLAS does not have such an argument; it always assumes
column-major storage, i.e. CblasColMajor (this is inherited from Fortran).
All this to say that if you want to reproduce the same result with CBLAS
and BLAS with the same calling sequence, you must set the first argument of
the CBLAS call to CblasColMajor (in your first code).
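Or, if you absolutely want to keep CblasRowMajor for the CBLAS, then
you can trick the reference BLAS by asking it to compute C^T = B^T * A^T
(instead of C = A * B), as you just said. Your row-major arrays, read in
column-major order, are exactly A^T, B^T and C^T, so in practice this means
passing B first and A second, without any transposition.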
Julien.
On Tue, 21 Aug 2007, scott@Domain.Removed wrote:
Dear Sirs,
I have the following quandary. I have downloaded both CBLAS and CLAPACK
which uses the f2c version of BLAS. I cannot help but get the
impression that these two different versions of BLAS follow a different
convention.
For matrix multiplication, I looked at sgemm.
Using CBLAS, I could get C = A*B
However, to get an equivalent result with the CLAPACK BLAS, I had in effect
to compute C^T = A^T*B^T, i.e. C = B*A. The code is shown below the =====
line.
All the other settings are the same, as far as I can tell.
Are the versions of BLAS not the same?
best wishes and thanks in advance for any feedback,
Tony Scott
RWTH Aachen
Germany
================================
E.g. Using CBLAS with ATLAS and the following routine:
#include <stdio.h>
#include </users/tonys/BLAS/cblas.h>

int
main (void)
{
  /*int lda = 3; float A[] = { 0.11, 0.12, 0.13, 0.21, 0.22, 0.23 };*/
  int lda = 2;
  float A[] = { 0.11, 0.12, 0.13, 0.21 };
  printf ("[ %g, %g\n", A[0], A[1]);
  printf (" %g, %g ]\n", A[2], A[3]);

  /* Only the first four entries of B are used in the 2x2 case. */
  int ldb = 2;
  float B[] = { 1011, 1012, 1021, 1022, 1031, 1032 };
  printf ("[ %g, %g\n", B[0], B[1]);
  printf (" %g, %g ]\n", B[2], B[3]);

  /*int ldc = 2; float C[] = { 0.00, 0.00, 0.00, 0.00 }; */
  int ldc = 2;
  float C[] = { 0.00, 0.00, 0.00, 0.00 };

  /* Compute C = A B, with every matrix read in row-major order */
  /*cblas_sgemm (CblasRowMajor, CblasNoTrans, CblasNoTrans, 2, 2, 3, 1.0, A,
                 lda, B, ldb, 0.0, C, ldc); */
  /* The CBLAS entry point is cblas_sgemm, not sgemm. */
  cblas_sgemm (CblasRowMajor, CblasNoTrans, CblasNoTrans, 2, 2, 2, 1.0, A, lda,
               B, ldb, 0.0, C, ldc);

  printf ("[ %g, %g\n", C[0], C[1]);
  printf (" %g, %g ]\n", C[2], C[3]);
  return 0;
}
I get:
[ 0.11, 0.12
 0.13, 0.21 ]
[ 1011, 1012
 1021, 1022 ]
[ 233.73, 233.96
 345.84, 346.18 ]
But when using the f2c BLAS inside CLAPACK with:
#include <stdio.h>
#include </users/tonys/BLAS/clapack/CLAPACK/BLAS/WRAP/blaswrap.h>
#include </users/tonys/BLAS/clapack/CLAPACK/BLAS/WRAP/cblas.h>
#include </users/tonys/BLAS/clapack/CLAPACK/F2CLIBS/f2c.h>

/*  translated by f2c (version 19990503).
    You must link the resulting object file with the libraries:
    -lf2c -lm   (in that order)
*/

/* Main program */ int MAIN__(void)
{
  extern /* Subroutine */ int sgemm_(char *, char *, integer *, integer *,
      integer *, real *, real *, integer *, real *, integer *,
      real *, real *, integer *);

  static char transa[1], transb[1];
  static real alpha;
  static real beta;
  static integer m,n,k;
  static integer lda,ldb,ldc;

  /*lda = 3; static real A[] = { 0.11, 0.12, 0.13, 0.21, 0.22, 0.23 }; */
  lda = 2;
  static real A[] = { 0.11, 0.12, 0.13, 0.21};
  printf ("[ %g, %g\n", A[0], A[1]);
  printf (" %g, %g ]\n", A[2], A[3]);

  alpha = 1.0;
  beta = 0.0;
  /*m=2; n=2; k=3; */
  m=2; n=2; k=2;

  /*ldb = 2; static real B[] = { 1011.0, 1012.0, 1021.0, 1022.0, 1031.0,
    1032.0 }; */
  ldb = 2;
  static real B[] = { 1011.0, 1012.0, 1021.0, 1022.0};
  printf ("[ %g, %g\n", B[0], B[1]);
  printf (" %g, %g ]\n", B[2], B[3]);

  ldc = 2;
  static real C[] = { 0.00, 0.00, 0.00, 0.00 };

  /* 'T', 'T': sgemm_ reads A and B in column-major order and
     transposes both of them */
  *(unsigned char *)transa = 'T';
  *(unsigned char *)transb = 'T';

  /* Compute C = A B */
  /* cblas_sgemm */
  /*sgemm_(CblasNoTrans, CblasNoTrans, 2, 2, 3, */
  sgemm_(transa, transb, &m, &n, &k, &alpha, &A[0], &lda, &B[0], &ldb,
         &beta, &C[0], &ldc);

  /* C is written in column-major order but printed here in memory order */
  printf ("[ %g, %g\n", C[0], C[1]);
  printf (" %g, %g ]\n", C[2], C[3]);
  return 0;
} /* MAIN__ */

/* Main program alias */ int testblas_ () { MAIN__ (); return 0; }
which produces:
[ 0.11, 0.12
 0.13, 0.21 ]
[ 1011, 1012
 1021, 1022 ]
[ 233.73, 345.84
 233.96, 346.18 ]
The resulting C matrix is the transpose of the earlier result.
