bug(?) with variable USEDQD in (s/d)larre from version 3.1.1

Postby devel » Thu Mar 15, 2007 8:37 am

Hi everybody,

I was attempting to install LAPACK 3.1.1 on a cluster of dual-Opteron nodes using the Intel 9.1 compiler under CentOS 4.
The libraries built without problems, but when I tried to test them I received messages saying that ssep.out, then dsep.out, etc. had problems. Looking inside these files I saw Intel compiler run-time messages telling me that variable slarre_$USEDQD had been used before being defined.
I looked at the code of slarre.f and found that indeed, if MB=0, this variable is used in a test before being assigned a value.

Re: bug(?) with variable USEDQD in (s/d)larre from version 3

Postby buttari » Thu Mar 15, 2007 10:19 am

Hi,
I'm sorry but I can't reproduce your problem. I compiled lapack-3.1.1 with the Intel compiler 9.1 and everything went smoothly: no warnings or error messages when compiling slarre.f. I also took a look at the slarre code and everything seems correct to me. If MB=0 there is a GOTO 170 instruction, so the test on USEDQD is not performed.

Alfredo

error only when executing testing

Postby devel » Thu Mar 15, 2007 11:33 am

Sorry for not being clear enough: the error occurs only when I run "make lapack_testing", not when I run "make lib".
(Remark: the same procedure caught an error with variable LWKOPT in s/dormrz.f when I tried to compile LAPACK 3.1.0.)
Maybe it has to do with the flags I use with ifort: -C -O3 -mcmodel=large -i-dynamic

Please find the two relevant files below.

The end of the ssep.out file is (the end of dsep.out is similar):

SST routines passed the tests of the error exits (147 tests done)


SEP: NB = 1, NBMIN = 2, NX = 1

All tests for SST passed the threshold ( 3276 tests run)
forrtl: severe (193): Run-Time Check Failure. The variable 'slarre_$USEDQD' is being used without being defined
Image PC Routine Line Source
libirc.so 0000002A95D4B2EF Unknown Unknown Unknown
libirc.so 0000002A95D45C06 Unknown Unknown Unknown
libifcore.so.5 0000002A958813BC Unknown Unknown Unknown
forrtl: error (76): IOT trap signal



My make.inc file is :
####################################################################
# LAPACK make include file. #
# LAPACK, Version 3.0 #
# June 30, 1999 #
####################################################################
#
SHELL = /bin/sh
#
# The machine (platform) identifier to append to the library names
#
PLAT = _LINUX_ifort
#
# Modify the FORTRAN and OPTS definitions to refer to the
# compiler and desired compiler options for your machine. NOOPT
# refers to the compiler options desired when NO OPTIMIZATION is
# selected. Define LOADER and LOADOPTS to refer to the loader and
# desired load options for your machine.
#
FORTRAN = ifort
OPTS = -C -O3 -mcmodel=large -i-dynamic
DRVOPTS = $(OPTS)
NOOPT = -C
LOADER = ifort
LOADOPTS = $(OPTS)
LD_LIBRARY_PATH=/opt/intel/fce/9.1.036/lib

#
# Timer for the SECOND and DSECND routines
#
# Default : SECOND and DSECND will use a call to the EXTERNAL FUNCTION ETIME
# TIMER = EXT_ETIME
# For RS6K : SECOND and DSECND will use a call to the EXTERNAL FUNCTION ETIME_
# TIMER = EXT_ETIME_
# For gfortran compiler: SECOND and DSECND will use a call to the INTERNAL FUNCTION ETIME
# TIMER = INT_ETIME
# If your Fortran compiler does not provide etime (like Nag Fortran Compiler, etc...)
# SECOND and DSECND will use a call to the INTERNAL FUNCTION CPU_TIME
TIMER = INT_CPU_TIME
# If neither of this works...you can use the NONE value... In that case, SECOND and DSECND will always return 0
# TIMER = NONE
#
#
# The archiver and the flag(s) to use when building archive (library)
# If you system has no ranlib, set RANLIB = echo.
#
ARCH = ar
ARCHFLAGS= cr
RANLIB = ranlib
#
# The location of the libraries to which you will link. (The
# machine-specific, optimized BLAS library should be used whenever
# possible.)
#
BLASLIB = ../../blas$(PLAT).a
LAPACKLIB = lapack$(PLAT).a
TMGLIB = tmglib$(PLAT).a
EIGSRCLIB = eigsrc$(PLAT).a
LINSRCLIB = linsrc$(PLAT).a

Re: error only when executing testing

Postby buttari » Thu Mar 15, 2007 3:04 pm

No way, I can't reproduce the problem. I recompiled LAPACK with your make.inc file and my ssep.out file still has no run-time error messages.
Could you please do me a favour? Can you initialize USEDQD to .TRUE. at the beginning of slarre.f, recompile and rerun the tests? Does it work in this case?
Thanks

Alfredo

Postby Julien Langou » Thu Mar 15, 2007 6:11 pm

Hello guys,

I have double-checked with Christof Voemel. He agrees that there might be a problem
since the variable USEDQD is not initialized in some cases. This kind of error is hard to
spot because, independently of the value of USEDQD ( .TRUE. or .FALSE. ), the algorithm
will return a correct value. So thanks for finding it.

'devel', since you might want to edit the DLARRE code, can you instead try adding:
      USEDQD = (( IRANGE.EQ.ALLRNG ) .AND. (.NOT.FORCEB))

right after the initialization of FORCEB = .FALSE. on line 303, and report to us if your run is
now ok?

Best wishes,
Julien.

xlarre "bug" apparently cured but there may be oth

Postby devel » Tue Mar 20, 2007 1:55 pm

Hi guys,

Sorry for the delay in replying.
I did what buttari suggested (USEDQD=.TRUE. at the beginning of slarre.f) and what Julien suggested:
USEDQD = (( IRANGE.EQ.ALLRNG ) .AND. (.NOT.FORCEB))
after FORCEB = .FALSE. on line 303.
This time the tests did not crash: neither ssep.out nor dsep.out contains forrtl errors any more.

However...

I did a "grep forrtl *.out" in TESTING and got plenty of identical lines from csvd.out:
forrtl: error (63): output conversion error, unit 6, file stdout

then I tried a "grep failed *.out" and got an impressive list:
csep.out: CST drivers: 1 out of 11664 tests failed to pass the threshold
csvd.out: CBD drivers: 144 out of 4804 tests failed to pass the threshold
csvd.out: CBD drivers: 176 out of 4796 tests failed to pass the threshold
csvd.out: CBD: 1 out of 4085 tests failed to pass the threshold
csvd.out: CBD drivers: 160 out of 4800 tests failed to pass the threshold
csvd.out: CBD drivers: 176 out of 4796 tests failed to pass the threshold
csvd.out: CBD drivers: 192 out of 4792 tests failed to pass the threshold
ctest.out: CPB: 11 out of 3458 tests failed to pass the threshold
ctest.out: CPB drivers: 4 out of 4750 tests failed to pass the threshold
ctest.out: CLS drivers: 81 out of 65268 tests failed to pass the threshold
dgd.out: DXV drivers: 200 out of 5000 tests failed to pass the threshold
dsep.out: DST drivers: 1 out of 14256 tests failed to pass the threshold
dsep.out: DST drivers: 1 out of 14256 tests failed to pass the threshold
dsvd.out: DBD: 6 out of 5510 tests failed to pass the threshold
dsvd.out: DBD: 6 out of 5510 tests failed to pass the threshold
dsvd.out: DBD: 4 out of 5510 tests failed to pass the threshold
dsvd.out: DBD: 8 out of 5510 tests failed to pass the threshold
dsvd.out: DBD: 10 out of 5510 tests failed to pass the threshold
dtest.out: DLS drivers: 7 out of 65268 tests failed to pass the threshold
sgd.out: SXV drivers: 37 out of 5000 tests failed to pass the threshold
ssep.out: SST: 1 out of 3276 tests failed to pass the threshold
ssvd.out: SBD: 36 out of 5510 tests failed to pass the threshold
ssvd.out: SBD drivers: 160 out of 5320 tests failed to pass the threshold
ssvd.out: SBD: 40 out of 5510 tests failed to pass the threshold
ssvd.out: SBD drivers: 160 out of 5320 tests failed to pass the threshold
ssvd.out: SBD: 40 out of 5510 tests failed to pass the threshold
ssvd.out: SBD drivers: 176 out of 5320 tests failed to pass the threshold
ssvd.out: SBD: 40 out of 5510 tests failed to pass the threshold
ssvd.out: SBD drivers: 160 out of 5320 tests failed to pass the threshold
ssvd.out: SBD: 40 out of 5510 tests failed to pass the threshold
ssvd.out: SBD drivers: 128 out of 5320 tests failed to pass the threshold
stest.out: SLS drivers: 106 out of 65268 tests failed to pass the threshold
zgd.out: ZXV drivers: 24 out of 5000 tests failed to pass the threshold
zgg.out: ZGG: 1 out of 2177 tests failed to pass the threshold
ztest.out: ZLS drivers: 15 out of 65268 tests failed to pass the threshold
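
A list like the one above is easier to read once the counts are totalled per output file. The following is a minimal Python sketch for doing that, not part of LAPACK: the function name `summarize_failures` and the sample lines are illustrative assumptions, and it assumes the input has the "file: path: N out of M tests failed to pass the threshold" shape that grep prints when given several files.

```python
import re
from collections import defaultdict

# Matches grep output lines such as:
#   csvd.out: CBD drivers: 144 out of 4804 tests failed to pass the threshold
LINE_RE = re.compile(
    r"^(?P<file>\S+\.out):\s*(?P<path>.+?):\s*"
    r"(?P<failed>\d+) out of\s+(?P<total>\d+) tests failed to pass the threshold"
)

def summarize_failures(lines):
    """Return {file: (failed, total, rate)} summed over all matching lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that is not a failure summary line
        totals[m.group("file")][0] += int(m.group("failed"))
        totals[m.group("file")][1] += int(m.group("total"))
    # Guard against a zero test count so the rate is always defined.
    return {f: (bad, tot, bad / tot if tot else 0.0)
            for f, (bad, tot) in totals.items()}

if __name__ == "__main__":
    sample = [
        "csep.out: CST drivers: 1 out of 11664 tests failed to pass the threshold",
        "ztest.out: ZLS drivers: 15 out of 65268 tests failed to pass the threshold",
        "zgg.out: ZGG: 1 out of 2177 tests failed to pass the threshold",
    ]
    for name, (bad, tot, rate) in sorted(summarize_failures(sample).items()):
        print(f"{name}: {bad}/{tot} failed ({rate:.3%})")
```

To use it on real output, the lines could be read from the result of "grep failed *.out" saved to a file. Note that the totals count a test again each time the same path is rerun with different parameters, so they are only a rough indicator.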

I can send all these files somewhere if you want, but as an example, please find hereafter the problematic section of ztest.out:

ZLS: Least squares driver routines
Matrix types (1-3: full rank, 4-6: rank deficient):
1 and 4. Normal scaling
2 and 5. Scaled near overflow
3 and 6. Scaled near underflow
Test ratios:
(1-2: ZGELS, 3-6: ZGELSX, 7-10: ZGELSY, 11-14: ZGELSS, 15-18: ZGELSD)
1: norm( B - A * X ) / ( max(M,N) * norm(A) * norm(X) * EPS )
2: norm( (A*X-B)' *A ) / ( max(M,N,NRHS) * norm(A) * norm(B) * EPS )
if TRANS='N' and M.GE.N or TRANS='T' and M.LT.N, otherwise
check if X is in the row space of A or A' (overdetermined case)
3: norm(svd(A)-svd(R)) / ( min(M,N) * norm(svd(R)) * EPS )
4: norm( B - A * X ) / ( max(M,N) * norm(A) * norm(X) * EPS )
5: norm( (A*X-B)' *A ) / ( max(M,N,NRHS) * norm(A) * norm(B) * EPS )
6: Check if X is in the row space of A or A'
7-10: same as 3-6 11-14: same as 3-6 15-18: same as 3-6
Messages:
M= 50, N= 50, NRHS= 1, NB= 1, type 6, test(16)= 0.24849E+13
M= 50, N= 50, NRHS= 1, NB= 1, type 6, test(17)= 0.86421E+12
M= 50, N= 50, NRHS= 1, NB= 3, type 6, test(16)= 0.24849E+13
M= 50, N= 50, NRHS= 1, NB= 3, type 6, test(17)= 0.86421E+12
M= 50, N= 50, NRHS= 1, NB= 3, type 6, test(16)= 0.24849E+13
M= 50, N= 50, NRHS= 1, NB= 3, type 6, test(17)= 0.86421E+12
M= 50, N= 50, NRHS= 1, NB= 3, type 6, test(16)= 0.24849E+13
M= 50, N= 50, NRHS= 1, NB= 3, type 6, test(17)= 0.86421E+12
M= 50, N= 50, NRHS= 1, NB= 20, type 6, test(16)= 0.24849E+13
M= 50, N= 50, NRHS= 1, NB= 20, type 6, test(17)= 0.86421E+12
M= 50, N= 50, NRHS= 2, NB= 1, type 1, test(16)= 0.19271E+13
M= 50, N= 50, NRHS= 2, NB= 3, type 1, test(16)= 0.19271E+13
M= 50, N= 50, NRHS= 2, NB= 3, type 1, test(16)= 0.19271E+13
M= 50, N= 50, NRHS= 2, NB= 3, type 1, test(16)= 0.19271E+13
M= 50, N= 50, NRHS= 2, NB= 20, type 1, test(16)= 0.19271E+13
ZLS drivers: 15 out of 65268 tests failed to pass the threshold
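
To see how large the reported test ratios are, the "Messages" lines can be reduced to the worst ratio per matrix type and test number. This is a minimal Python sketch under the assumption that every message line ends in the "type T, test(K)= R" form shown above; the name `worst_ratios` is hypothetical, and the threshold the tester compares against is not shown in this excerpt, so the sketch only extracts the ratios.

```python
import re

# Matches the tail of a message line such as:
#   M= 50, N= 50, NRHS= 1, NB= 1, type 6, test(16)= 0.24849E+13
RATIO_RE = re.compile(
    r"type\s+(\d+),\s*test\(\s*(\d+)\)\s*=\s*([0-9.]+E[+-]?\d+)"
)

def worst_ratios(lines):
    """Return {(matrix_type, test_number): largest ratio seen}."""
    worst = {}
    for line in lines:
        m = RATIO_RE.search(line)
        if m is None:
            continue  # headers and summary lines carry no ratio
        key = (int(m.group(1)), int(m.group(2)))
        ratio = float(m.group(3))  # Python parses Fortran-style E notation
        worst[key] = max(ratio, worst.get(key, 0.0))
    return worst

if __name__ == "__main__":
    sample = [
        "M= 50, N= 50, NRHS= 1, NB= 1, type 6, test(16)= 0.24849E+13",
        "M= 50, N= 50, NRHS= 1, NB= 1, type 6, test(17)= 0.86421E+12",
        "M= 50, N= 50, NRHS= 2, NB= 20, type 1, test(16)= 0.19271E+13",
    ]
    for (mtype, test), ratio in sorted(worst_ratios(sample).items()):
        print(f"type {mtype}, test {test}: worst ratio {ratio:.3e}")
```

According to the legend in the excerpt, tests 15-18 belong to ZGELSD, so all of the failures listed here come from that one driver.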


PS: buttari, 'devel' is not the abbreviation of development but my family name ! ;-)

Postby Julien Langou » Tue Mar 20, 2007 2:05 pm

Great. So we'll have a patch soon, I guess, thanks a lot.

Regarding the numerical failures, let's say that your install is OK.
Some tests cannot be passed on some of the ill-conditioned systems that the tester
generates. Some numerical failures are just a little bit above the threshold, etc. When we
have time, we will definitely need to improve the testing ...

Best wishes, Julien.

