(C)LAPACK(E) does not work with Goto-BLAS


Post by error5772 » Tue Sep 13, 2011 7:28 am

Dear all,

I have run out of ideas for getting LAPACK to work
with Goto-BLAS (version 2), whether I use the LAPACK
bundled with Goto-BLAS directly, go through LAPACKE, or
compile CLAPACK against Goto-BLAS. I have a dual
Xeon E5645 system with 24 hardware threads running
openSUSE 11.4. Goto-BLAS itself seems to
work, but no LAPACK version works with it.
The libraries compile, but any code linked against them fails with
a 'segmentation fault', and so does lapack_testing.

Does anyone know how to use LAPACK/Goto2
with C?

Thanks for answering.
error5772
 

Re: (C)LAPACK(E) does not work with Goto-BLAS

Post by admin » Tue Sep 20, 2011 4:27 pm

Posting a simple piece of code may help.
admin
Site Admin
 

Re: (C)LAPACK(E) does not work with Goto-BLAS

Post by error5772 » Tue Oct 11, 2011 12:44 pm

Sorry for answering so late!

No need, you can delete the thread: it is Goto-BLAS that does not work, not LAPACK.
I have some unknown problem that makes Goto-BLAS with LAPACK 3.3.1
work (diagonalize) only some of the time. It is probably a threading
problem that I cannot solve. Maybe somebody else has the same problem (dual Westmere, 64-bit).

Thanks for "Just calling" the F77-dlamch_

Michael

P.S.: If you still want to look, I called dsyev_ as:
dsyev_(&job,&inp,&ndim,amat.begin(),&ndim,dvec.begin(),dtmpvec.begin(),&ndtmp,&nstatus);
with char job='V', inp='U'; int ndim = 511, ndtmp, nstatus;
dvec.begin() and dtmpvec.begin() are double*, and amat.begin() is a double* to a matrix stored in column-major order as a double[].
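
For readers unsure what a matrix in column-major order as a double[] looks like in practice, here is a minimal sketch. The names cm_index and make_symmetric and the test matrix are hypothetical and only for illustration. The point is that element (i, j) of an n x n matrix sits at offset i + j*n, which is the layout the Fortran routines expect.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: offset of element (row i, column j) in a
// column-major array with leading dimension ld (ld >= number of rows).
inline std::size_t cm_index(std::size_t i, std::size_t j, std::size_t ld) {
    assert(i < ld);
    return i + j * ld;
}

// Build a symmetric n x n test matrix A(i,j) = 1/(i+j+1) in column-major
// order, the kind of buffer that would be passed as the matrix argument.
inline std::vector<double> make_symmetric(std::size_t n) {
    std::vector<double> a(n * n);
    for (std::size_t j = 0; j < n; ++j)        // columns outer: contiguous writes
        for (std::size_t i = 0; i < n; ++i)
            a[cm_index(i, j, n)] = 1.0 / static_cast<double>(i + j + 1);
    return a;
}
```

With a std::vector, a.data() would give the double* to pass, and the leading dimension equals the matrix dimension when the matrix is stored without padding.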
error5772
 

Re: (C)LAPACK(E) does not work with Goto-BLAS

Post by error5772 » Wed Oct 26, 2011 5:17 am

The problem has been solved!

LAPACK 3.3.1 on Goto-BLAS, called from C++:

The following instructions assume a hexacore (Nehalem/Westmere) system with 64-bit addresses.

1. Build Goto-BLAS by editing "Makefile.rule":
TARGET=NEHALEM, CC=gcc, FC=gfortran, BINARY=64, USE_THREAD=1, NUM_THREADS=6
Name the resulting library "libgoto6t.a", then build further variants "libgotoXt.a" the same way with NUM_THREADS=X.
Copy them to "/usr/local/lib64". If you use too many threads (e.g. 12), Goto-BLAS will crash
with a "segmentation fault"! You can use NUM_THREADS=11 and name that BLAS "libgoto11t.a".
This will be your fastest BLAS, but it sometimes crashes (just try again).

2. Build LAPACK 3.3.1 by editing "make.inc":
FORTRAN=gfortran -m64 -m128bit-long-double -fimplicit-none
OPTS=-O3 -funroll-all-loops
DRVOPTS=$(OPTS)
NOOPT=-O0
LOADER=gfortran
LOADOPTS=-L/usr/local/lib64/ -lgoto6t -lpthread
TIMER=INT_ETIME
BLASLIB=/usr/local/lib64/libgoto6t.a -lpthread

Compile it with:
make lapack_install
(make variants)
make lapacklib
make tmglib

The testers should work, but they call Goto-BLAS in very rapid succession. It is better to run them
against a Goto-BLAS library built with at most half of the available threads (e.g. libgoto6t.a):
make lapack_testing
make blas_testing
Name the resulting LAPACK library "liblapackgoto.a" and copy it to "/usr/local/lib64".

3. You can link against the 6-thread Goto-BLAS with:
"-llapackgoto -lgoto6t -lpthread -lgfortran"
or against your fastest build:
"-llapackgoto -lgoto11t -lpthread -lgfortran"
which sometimes crashes in the threading code, but is unbeatably fast!

No need to compile CLAPACK or LAPACKE - just call the Fortran functions
directly! Remember to call them twice when a work array of unknown length is needed:
- workspace query: run with the work length set to -1 and a work array of test length (e.g. 1)
- resize the work array to the length returned in its first entry
- run the function again.
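
The two-call pattern above can be sketched roughly as follows. So that the sketch runs without linking LAPACK, it uses a hypothetical stand-in, fake_dsyev, that only mimics the dsyev_ calling convention (work length -1 means "report the needed size in work[0]") and does not compute anything. The wrapper name syev_with_query is also made up for this example.

```cpp
#include <cassert>
#include <vector>

// Signature shared by dsyev_-style routines (matches the declaration used
// in this thread): jobz, uplo, n, a, lda, w, work, lwork, info.
typedef void (*syev_fn)(char*, char*, int*, double*, int*,
                        double*, double*, int*, int*);

// Hypothetical stand-in so the sketch runs without linking LAPACK. It only
// mimics the calling convention: with lwork == -1 it reports a work size in
// work[0]; otherwise it checks lwork against DSYEV's documented minimum of
// max(1, 3*n-1). It does NOT compute eigenvalues.
void fake_dsyev(char*, char*, int* n, double*, int*,
                double*, double* work, int* lwork, int* info) {
    const int needed = (3 * *n - 1 > 1) ? 3 * *n - 1 : 1;
    if (*lwork == -1) { work[0] = needed; *info = 0; return; }
    *info = (*lwork >= needed) ? 0 : -8;   // -8: the 8th argument (lwork) is bad
}

// Two-call pattern: query, resize, run. Returns info from the last call
// made and stores the work length that was used in lwork_used.
int syev_with_query(syev_fn f, char jobz, char uplo, int n,
                    double* a, double* w, int& lwork_used) {
    int lda = n, info = 0;
    int lwork = -1;                      // 1) workspace query
    std::vector<double> work(1);         //    work array of test length 1
    f(&jobz, &uplo, &n, a, &lda, w, work.data(), &lwork, &info);
    if (info != 0) return info;
    lwork = static_cast<int>(work[0]);   // 2) size reported in first entry
    work.resize(lwork);
    lwork_used = lwork;
    f(&jobz, &uplo, &n, a, &lda, w, work.data(), &lwork, &info);  // 3) run
    return info;
}
```

To try it against the real library, one would presumably declare dsyev_ in an extern "C" block, pass dsyev_ in place of fake_dsyev, and link with the LAPACK and Goto-BLAS libraries.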

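4. Use your own LAPACK header (learn from "lapacke.h") to declare the functions you need:

#include <complex>
extern "C" {
// Fortran routines: trailing underscore, every argument passed by pointer.
double dlamch_( char* );
// jobz, uplo, n, a, lda, w, work, lwork, info
void dsyev_( char*,char*,int*,double*,int*,double*,double*,int*,int*);
// jobz, uplo, n, a, lda, w, work, lwork, rwork, info
void zheev_( char*,char*,int*,std::complex<double>*,int*,double*,std::complex<double>*,int*,double*,int* );
...
}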

These three functions can now be called from main. For example:
int main(){
...
char sw = 'S';  // 'S' asks dlamch_ for the safe minimum
double tol = 2.0*dlamch_(&sw);
...
}
(Sorry, no example for dsyev_ and zheev_.)
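
As a rough cross-check of the dlamch_ example that needs no linking at all, the value can be approximated in plain C++. This is only a sketch under the assumption of IEEE-754 doubles. The function names safe_min and tolerance are made up, and the logic follows the documented meaning of the 'S' option (the smallest positive number whose reciprocal does not overflow), not the library source.

```cpp
#include <cassert>
#include <limits>

// Portable approximation of dlamch_('S') (assumption: IEEE-754 doubles).
// "Safe minimum" = smallest positive x such that 1/x does not overflow.
inline double safe_min() {
    const double tiny  = std::numeric_limits<double>::min();   // smallest normal
    const double small = 1.0 / std::numeric_limits<double>::max();
    const double eps   = std::numeric_limits<double>::epsilon() * 0.5;
    // If 1/max were >= tiny, nudge it up slightly so 1/safe_min cannot overflow.
    return (small >= tiny) ? small * (1.0 + eps) : tiny;
}

// Same tolerance as in the dlamch_ example: twice the safe minimum.
inline double tolerance() { return 2.0 * safe_min(); }
```

If the linked dlamch_ returns something noticeably different from safe_min() on such a system, that may hint at a calling-convention or linking problem.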

Even complex functions work fine now - just pass the pointer to
the first element of your complex<double> array, stored as a 1D
C array in column-major order. It is even possible to compile
everything for long array indices.
(See: Algorithm/Data -> How large matrix can LAPACK support)
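
A small sketch of what such a complex array looks like in memory, for anyone who wants to check their own data before calling a routine like zheev_. The names cplx, make_hermitian and as_interleaved and the test matrix are hypothetical. The only fact relied on is that C++11 lets an array of std::complex<double> be read as interleaved (real, imag) doubles.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> cplx;

// Build an n x n Hermitian test matrix in column-major order:
// A(i,j) = (i+j) + (i-j)*I, so A(j,i) == conj(A(i,j)) and the diagonal is real.
inline std::vector<cplx> make_hermitian(std::size_t n) {
    std::vector<cplx> a(n * n);
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < n; ++i)
            a[i + j * n] = cplx(static_cast<double>(i + j),
                                static_cast<double>(i) - static_cast<double>(j));
    return a;
}

// View the array as interleaved (real, imag) doubles, which C++11 permits
// for arrays of std::complex<double>. This view is only for inspection;
// a routine like zheev_ would be handed a.data() directly.
inline const double* as_interleaved(const std::vector<cplx>& a) {
    return reinterpret_cast<const double*>(a.data());
}
```

Here a.data() is the pointer to the first element that would be passed as the matrix argument.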

Michael
error5772
 

