by error5772 » Wed Oct 26, 2011 5:17 am
The problem is solved!
LAPACK 3.3.1 on Goto-BLAS, called from C++:
The following instructions assume a hexacore (Nehalem/Westmere) system with 64-bit addresses.
1. Install Goto-BLAS by editing "Makefile.rule":
TARGET=NEHALEM, CC=gcc, FC=gfortran, BINARY=64, USE_THREAD=1, NUM_THREADS=6
Name the lib "libgoto6t.a" and build some others, "libgotoXt.a", the same way with NUM_THREADS=X.
Copy them to "/usr/local/lib64". If you use too many threads (e.g. 12), Goto-BLAS will crash
with a "segmentation fault"! You can use NUM_THREADS=11 and name that BLAS "libgoto11t.a".
This will be your fastest BLAS, but it sometimes crashes (just try again).
2. Install LAPACK-3.3.1 by editing "make.inc":
FORTRAN=gfortran -m64 -m128bit-long-double -fimplicit-none,
OPTS=-O3 -funroll-all-loops, DRVOPTS=$(OPTS), NOOPT=-O0,
LOADER=gfortran
LOADOPTS=-L/usr/local/lib64/ -lgoto6t -lpthread
TIMER=INT_ETIME
BLASLIB=/usr/local/lib64/libgoto6t.a -lpthread
Compile it with:
make lapack_install
(make variants)
make lapacklib
make tmglib
The testers should work, but they call Goto-BLAS in very rapid succession! It's better to test with
a Goto-BLAS lib built with at most half of the maximum number of threads (e.g. libgoto6t.a):
make lapack_testing
make blas_testing
Name the resulting LAPACK library "liblapackgoto.a" and copy it to "/usr/local/lib64".
3. You can link against the 6-thread Goto-BLAS with:
"-llapackgoto -lgoto6t -lpthread -lgfortran"
or against your fastest one:
"-llapackgoto -lgoto11t -lpthread -lgfortran"
which sometimes "jumps out of the threads", but is unbeatably fast!
No need to compile CLAPACK or LAPACKE - just call the Fortran functions
directly! Remember to call them twice, with a resize in between:
- workspace query for work arrays of unknown length: run with work length (lwork) = -1 and a work array of test length (e.g. 1)
- resize the work array to the length returned in its first entry
- run the function again with the resized work array.
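The two-call pattern above can be sketched for dsyev_ roughly as follows. This is only an illustration under some assumptions: the helper name `symmetric_eigenvalues` and the function-pointer type `dsyev_fn` are made up here so that the sketch compiles even without LAPACK installed; in a real program you would pass the `dsyev_` symbol declared in your own header.

```cpp
#include <vector>

// Pointer type matching the Fortran dsyev_ symbol
// (jobz, uplo, n, a, lda, w, work, lwork, info).
using dsyev_fn = void (*)(char*, char*, int*, double*, int*,
                          double*, double*, int*, int*);

// Hypothetical helper: eigenvalues of the symmetric n x n matrix stored
// column-major in `a` are written to `w`. The contents of `a` may be
// overwritten. Returns LAPACK's info value (0 means success).
int symmetric_eigenvalues(dsyev_fn dsyev, int n,
                          std::vector<double>& a, std::vector<double>& w)
{
    char jobz = 'N';   // eigenvalues only
    char uplo = 'U';   // use the upper triangle of a
    int lda = n;
    int info = 0;
    w.resize(n);

    // Call 1: workspace query. lwork = -1 asks the routine to report the
    // recommended work length in work[0]; a test array of length 1 suffices.
    int lwork = -1;
    double query = 0.0;
    dsyev(&jobz, &uplo, &n, a.data(), &lda, w.data(), &query, &lwork, &info);
    if (info != 0) return info;

    // Resize the work array to the reported length.
    lwork = static_cast<int>(query);
    std::vector<double> work(lwork);

    // Call 2: the actual computation with the resized work array.
    dsyev(&jobz, &uplo, &n, a.data(), &lda, w.data(), work.data(), &lwork, &info);
    return info;
}
```

With a declaration of dsyev_ in scope and the libraries linked as in step 3, the call would be `symmetric_eigenvalues(dsyev_, n, a, w);`.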
4. Use your own LAPACK-header (learn from "lapacke.h") to declare the functions you need:
#include <complex>
extern "C" {
double dlamch_( char* cmach );
void dsyev_( char* jobz, char* uplo, int* n, double* a, int* lda, double* w, double* work, int* lwork, int* info );
void zheev_( char* jobz, char* uplo, int* n, std::complex<double>* a, int* lda, double* w, std::complex<double>* work, int* lwork, double* rwork, int* info );
...
}
These 3 functions can now be called from main. For example:
int main(){
...
char sw = 'S';
double tol = 2.0*dlamch_(&sw);
...
}
(sorry no example for dsyev_ and zheev_)
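To fill that gap for zheev_, a call might look like the sketch below. It is not taken from the original setup: the helper name `hermitian_eigenvalues` and the function-pointer type `zheev_fn` are assumptions introduced so the sketch compiles on its own; normally you would pass the `zheev_` symbol from the header above. Note the extra real scratch array rwork, which needs at least max(1, 3n-2) entries.

```cpp
#include <algorithm>
#include <complex>
#include <vector>

using cplx = std::complex<double>;

// Pointer type matching the zheev_ declaration
// (jobz, uplo, n, a, lda, w, work, lwork, rwork, info).
using zheev_fn = void (*)(char*, char*, int*, cplx*, int*, double*,
                          cplx*, int*, double*, int*);

// Hypothetical helper: eigenvalues of the Hermitian n x n matrix stored
// column-major in `a` are written to `w`. The contents of `a` may be
// overwritten. Returns LAPACK's info value (0 means success).
int hermitian_eigenvalues(zheev_fn zheev, int n,
                          std::vector<cplx>& a, std::vector<double>& w)
{
    char jobz = 'N';   // eigenvalues only
    char uplo = 'U';   // use the upper triangle of a
    int lda = n;
    int info = 0;
    w.resize(n);
    std::vector<double> rwork(std::max(1, 3 * n - 2));  // real scratch space

    // Workspace query: lwork = -1, answer comes back in the first work entry.
    int lwork = -1;
    cplx query(0.0, 0.0);
    zheev(&jobz, &uplo, &n, a.data(), &lda, w.data(),
          &query, &lwork, rwork.data(), &info);
    if (info != 0) return info;

    // Resize the complex work array and run the real computation.
    lwork = static_cast<int>(query.real());
    std::vector<cplx> work(lwork);
    zheev(&jobz, &uplo, &n, a.data(), &lda, w.data(),
          work.data(), &lwork, rwork.data(), &info);
    return info;
}
```

In a linked program the call would be `hermitian_eigenvalues(zheev_, n, a, w);`.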
Even complex functions work fine now - just pass a pointer to
the first element of your complex<double> array, stored as a 1D
C array in column-major order. It is even possible to compile
everything for long array indices.
(See: Algorithm/Data -> How large matrix can LAPACK support)
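As a small illustration of the column-major layout mentioned above, here is a sketch that fills a 2 x 2 Hermitian matrix into a 1D array. The names `colmajor` and `make_example_matrix` are hypothetical helpers, not part of LAPACK; the pointer you would hand to zheev_ is simply `a.data()`.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Position of element (row i, column j), both 0-based, in a column-major
// 1D array with leading dimension lda (number of rows stored per column).
inline std::size_t colmajor(std::size_t i, std::size_t j, std::size_t lda)
{
    return i + j * lda;
}

// Builds the Hermitian example matrix
//   [ 1     2-i ]
//   [ 2+i   3   ]
// as a 1D column-major array: first column, then second column.
std::vector<cplx> make_example_matrix()
{
    const std::size_t n = 2;
    std::vector<cplx> a(n * n);
    a[colmajor(0, 0, n)] = cplx(1.0, 0.0);
    a[colmajor(1, 0, n)] = cplx(2.0, 1.0);
    a[colmajor(0, 1, n)] = cplx(2.0, -1.0);
    a[colmajor(1, 1, n)] = cplx(3.0, 0.0);
    return a;
}
```

In memory the entries then appear in the order 1, 2+i, 2-i, 3, i.e. column by column, which is what the Fortran routines expect.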
Michael