quadruple-precision lapack/blas working

Post by airwin » Sun Oct 02, 2011 9:35 pm

I am planning a fit that might require quadruple precision to find the least-squares solution so I took some simple steps to implement a quadruple-precision build and test of lapack/blas. I am posting those steps and test results here for further comment.

The basic idea is to use the gcc-4.6.1 option -fdefault-real-8 to interpret the real, complex, double precision, and double complex variable types and constants as 64-bit real, 128-bit complex, 128-bit real, and 256-bit complex, respectively. However, that option leaves definite types (those with an explicit size) exactly as they are, with no doubling of the precision. The only definite type I could find in the code was COMPLEX*16. I used the following commands to convert that type to the indefinite DOUBLE COMPLEX type, which the -fdefault-real-8 option can interpret in "doubled" form:

cp -a lapack-3.3.1 lapack-3.3.1_double_complex
for FILE in $(find lapack-3.3.1_double_complex -name "*\.f*"); \
do echo $FILE; NAME=$(echo $FILE|sed "s?^.*/??"); echo $NAME; \
sed 's?COMPLEX\*16?DOUBLE COMPLEX?' <$FILE >| /tmp/$NAME; \
mv -f /tmp/$NAME $FILE; done
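
A small check on the substitution above: the sed expression has no g flag, so only the first COMPLEX*16 on each line is rewritten. That may well be harmless for the LAPACK sources, where a line rarely names the type twice, but the difference is easy to see on a made-up line. The sample line below is an assumption for illustration and is not taken from lapack-3.3.1.

```shell
#!/bin/sh
# Illustrative line (an assumption, not copied from the LAPACK sources)
# that mentions the definite type twice.
line='*  X is COMPLEX*16 array, Y is COMPLEX*16 array'

# Same expression as in the conversion loop: first match per line only.
first_only=$(printf '%s\n' "$line" | sed 's?COMPLEX\*16?DOUBLE COMPLEX?')

# With the g flag, every match on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's?COMPLEX\*16?DOUBLE COMPLEX?g')

printf '%s\n' "$first_only" "$all"
```

After converting a tree, a command such as grep -ri 'complex\*16' lapack-3.3.1_double_complex should list any occurrences that remain, including lower-case spellings that the case-sensitive pattern skips.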

I then built and tested the code as follows:

export FC="gfortran-4.6"
# N.B. -fdefault-real-8 doubles all precision interpretation of non-definite types.
# -fPIC allows shared libraries to link to static library versions of lapack/blas
# and -ffixed-line-length-132 allows the longer "DOUBLE COMPLEX" strings above
# not to overflow the allowed line length.
export FFLAGS="-O3 -fdefault-real-8 -fPIC -ffixed-line-length-132"
mkdir build_quadruple_dir
cd build_quadruple_dir
cmake ../lapack-3.3.1_double_complex >& cmake.out

make VERBOSE=1 -j4 >& make.out
ctest --verbose --timeout 36000 >& ctest_quadruple_verbose.txt
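
The type promotion described earlier can also be checked independently of LAPACK. The following is a minimal sketch, assuming gfortran is on the PATH. The file name kind_probe.f90 and the program itself are hypothetical and not part of the original steps. It compiles the same tiny program with and without -fdefault-real-8 and prints the kind of each type. In gfortran the kind of a real is its size in bytes, and the kind of a complex is the size of one component.

```shell
#!/bin/sh
# Hypothetical probe; names are illustrative and not from the original post.
probe_dir=$(mktemp -d)

cat > "$probe_dir/kind_probe.f90" <<'EOF'
program kind_probe
  implicit none
  real r
  double precision d
  complex c
  ! indefinite type: follows the default kinds
  double complex z
  ! definite type: not affected by -fdefault-real-8
  complex*16 z16
  print *, 'REAL             kind:', kind(r)
  print *, 'DOUBLE PRECISION kind:', kind(d)
  print *, 'COMPLEX          kind:', kind(c)
  print *, 'DOUBLE COMPLEX   kind:', kind(z)
  print *, 'COMPLEX*16       kind:', kind(z16)
end program kind_probe
EOF

if command -v gfortran >/dev/null 2>&1; then
  # First pass uses default kinds, second pass uses the doubling option.
  for flags in "" "-fdefault-real-8"; do
    echo "== gfortran $flags"
    # $flags is intentionally unquoted so the empty case adds no argument.
    gfortran $flags -o "$probe_dir/kind_probe" "$probe_dir/kind_probe.f90" \
      && "$probe_dir/kind_probe" \
      || echo "compile or run failed"
  done
else
  echo "gfortran not found; probe source left in $probe_dir/kind_probe.f90"
fi
```

If the option behaves as described, the second pass should report doubled kinds for every type except COMPLEX*16, which is why that type had to be rewritten as DOUBLE COMPLEX.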

I have attached ctest_quadruple_verbose.txt.gz (and also the equivalent ctest_double_verbose.txt.gz as a comparison for the case when the -fdefault-real-8 option is not used).

All tests passed for both the quadruple and double precision cases. However, lapack/blas tests can pass even though individual components of the tests fail. Here are the failing individual test messages.

From ctest_quadruple_verbose.txt:
15: DGB drivers: 6 out of 30969 tests failed to pass the threshold
16: ZGB drivers: 6 out of 30969 tests failed to pass the threshold
24: SST: 1 out of 4662 tests failed to pass the threshold
29: SXV drivers: 200 out of 5000 tests failed to pass the threshold
43: CST: 1 out of 4662 tests failed to pass the threshold
48: CXV drivers: 24 out of 5000 tests failed to pass the threshold
67: DXV drivers: 200 out of 5000 tests failed to pass the threshold
81: ZST: 1 out of 4662 tests failed to pass the threshold
86: ZXV drivers: 24 out of 5000 tests failed to pass the threshold
100% tests passed, 0 tests failed out of 98

From ctest_double_verbose.txt:
24: SST: 1 out of 4662 tests failed to pass the threshold
25: SBD: 1 out of 5510 tests failed to pass the threshold
29: SXV drivers: 37 out of 5000 tests failed to pass the threshold
43: CST drivers: 1 out of 11664 tests failed to pass the threshold
43: CST: 1 out of 4662 tests failed to pass the threshold
62: DST: 1 out of 4662 tests failed to pass the threshold
62: DST: 1 out of 4662 tests failed to pass the threshold
62: DST drivers: 1 out of 14256 tests failed to pass the threshold
67: DXV drivers: 200 out of 5000 tests failed to pass the threshold
81: ZST: 1 out of 4662 tests failed to pass the threshold
81: ZST: 1 out of 4662 tests failed to pass the threshold
81: ZST: 1 out of 4662 tests failed to pass the threshold
81: ZST: 1 out of 4662 tests failed to pass the threshold
86: ZXV drivers: 24 out of 5000 tests failed to pass the threshold
100% tests passed, 0 tests failed out of 98
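
Lists like the two above can be pulled out of a ctest --verbose log with a short filter. This is a sketch, assuming the log lines have the "N out of M tests failed to pass the threshold" shape shown above. The function name threshold_report is hypothetical, and the sample input is three lines copied from the quadruple-precision list plus the ctest summary line.

```shell
#!/bin/sh
# Print every per-component threshold failure found in a ctest --verbose
# log ($1), followed by a one-line total of failed and attempted tests.
threshold_report() {
  grep 'failed to pass the threshold' "$1" | awk '
    {
      print
      # Locate "<failed> out of <total>" wherever it sits on the line.
      for (i = 2; i < NF; i++)
        if ($i == "out" && $(i + 1) == "of") {
          failed += $(i - 1)
          total += $(i + 2)
        }
    }
    END { printf "total: %d of %d\n", failed, total }'
}

# Small sample so the sketch runs without the attached logs.
sample=$(mktemp)
cat > "$sample" <<'EOF'
15: DGB drivers: 6 out of 30969 tests failed to pass the threshold
24: SST: 1 out of 4662 tests failed to pass the threshold
29: SXV drivers: 200 out of 5000 tests failed to pass the threshold
100% tests passed, 0 tests failed out of 98
EOF

threshold_report "$sample"   # last line printed: total: 207 of 40631
```

Running threshold_report on ctest_quadruple_verbose.txt and ctest_double_verbose.txt in turn would give the two lists side by side for comparison. The ctest summary line is ignored because it does not contain the threshold message.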

Could somebody knowledgeable comment on those individual failing tests?

If those individual failing tests turn out to be reasonable/expected, then it appears that, thanks to the gfortran-4.6.1 option -fdefault-real-8, all Linux developers here will have access to a working quadruple-precision version of lapack/blas. However, there is one major caveat: all the 128-bit real and 256-bit complex tests took something like a factor of 100 (!) longer to complete on my ordinary Intel 64-bit box than the corresponding 64-bit real and 128-bit complex tests. So unless your computer has hardware support for 128-bit real and 256-bit complex arithmetic, quadruple-precision lapack/blas will be extremely slow for those types, and therefore only useful as a last resort if ill-conditioning is killing you with the default-precision lapack/blas.
Attachments
ctest_double_verbose.txt.gz
ctest verbose output for double precision
(26.37 KiB)
ctest_quadruple_verbose.txt.gz
ctest verbose output for quadruple precision
(27.15 KiB)

Re: quadruple-precision lapack/blas working

Post by admin » Thu Oct 06, 2011 9:08 am

Your quadruple version of LAPACK looks great; you should not have any concerns.
You have different numerical failures than double-precision LAPACK, but nothing worrisome.

Julien.

Re: quadruple-precision lapack/blas working

Post by int128 » Tue Nov 13, 2012 2:01 am

I am trying to build LAPACK 3.4.2 in quadruple precision using mingw32 and mingw64 by following the ideas above.

Under mingw32 with gcc 4.7.2, compilation went fine. Testing took a long time, but it finished successfully with 100% of tests passed.
I also successfully compiled quadruple-precision LAPACK for 64 bits on Linux with the same compilers and from the same sources.

However, on mingw64 (compiling for 64 bits with the same gcc 4.7.2), only 77% of tests passed.
All failed tests reported the same error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  ffffffffffffffff

I am far from being an expert in Fortran or in the design of the LAPACK tests, and I would really appreciate any help in locating the particular code that causes the segfault.
Most likely there is something wrong with mingw64; I just need to find a simple example that reproduces the problem (and then contact the mingw64 developers for a fix).

Is it possible to run the tests with the help of addr2line to get a complete backtrace? If so, how?
Given the attached log files, could you advise which particular test to run to reproduce the segfault (and where the corresponding source files are)?

I would appreciate any other help in this direction.

Here are logs of compilation and testing:
Attachments
mingw64-testing-log.zip
(22.44 KiB)
mingw64-build-log.zip
(60.35 KiB)
