LAPACK Archives

[Lapack] Problem with LAPACK timing routines

Dear LAPACK team

First of all, thank you for keeping this must-have great piece of code in
working/operational state.

Not having found a 'bug submission' entry on the LAPACK web-site, I inferred
that the mail option was the way to go. If wrong, just ignore this email and
accept my apologies.

I fully appreciate that LAPACK is a stable & well tested package, and I am
really afraid of making a mistake in claiming that I believe I have found a
glitch in the timing ancillary routines. I am not an expert in FORTRAN
(even if I designed the GSM physical layer in 1986 using FORTRAN on a VAX
785, that's long ago...), yet I have made every effort to ensure the
root-cause of the problem is actually related to the suggested solution.

Problem description :
-------------------

(a) Environment : gfortran on mingw32 within Windows (XP 32 and Vista 64),
problem occurs with either the 32 bit or 64 bit versions and does not seem
to be sensitive to compiler options

(b) Symptom : both complex (C and Z) SVD timing programs crash immediately
with an access violation. The offending commands are:

    xeigtimc < csvdtim.in > csvdtim.out
    xeigtimz < zsvdtim.in > zsvdtim.out

Compiling with -g and running a simple GDB session gives the call stack
below (outermost caller first). The #NNN suffix designates line number NNN,
and the *.f source files are the instrumented versions in
TIMING/EIG/[EIGSRC/], except for slasdt.f, which is the non-instrumented
version in LAPACK/SRC since the distributed makefile does not put the
instrumented version in the override library:

Runtime...
 ctimee.f#618
  ctim26.f#1123
   cgesdd.f#1211
    sbdsdc.f#361
     slasd0.f#144
      slasdt.f#77     <= the write access 'NDIMR(1) = ...' triggers the failure

It appears that the slasdt()::NDIMR() array is pointing to some read-only
memory area.

(c) Investigation: Reading through the call chain, it is apparent that the
culprit array is a 'far part' of the working area passed by the top caller
ctimee()::IWORK() to its children, which is dimensioned at ctimee.f#283 as
INTEGER IWORK( MAXT ) with actual dimension MAXT = 10. Clearly, this is
wrong since the various callees require room like 8*MIN(M,N) and the test
file uses M and N values such as 50, 100, etc.
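
The size mismatch can be checked with a few lines of arithmetic. The sketch
below is illustrative only (Python rather than the FORTRAN of the timing
programs). It takes MAXT = 10 and the 8*MIN(M,N) requirement from the
description above; the helper names and the (M, N) pairs are assumptions
chosen to match the magnitudes mentioned, not the actual contents of
csvdtim.in.

```python
# Illustrative check: integer workspace needed vs. what ctimee declares.
MAXT = 10  # declared length of IWORK in ctimee.f, as reported above


def iwork_needed(m, n):
    """Integer workspace the SVD callees expect: 8*MIN(M,N)."""
    return 8 * min(m, n)


def undersized(sizes, available=MAXT):
    """Return (m, n, needed) for each size whose need exceeds `available`."""
    return [(m, n, iwork_needed(m, n)) for m, n in sizes
            if iwork_needed(m, n) > available]


# Assumed sample sizes (hypothetical, not read from the real input file).
sample_sizes = [(50, 50), (100, 50), (100, 100)]

for m, n, needed in undersized(sample_sizes):
    print(f"M={m:4d} N={n:4d}: needs {needed} integers, IWORK holds {MAXT}")
```

Under these assumptions every listed size asks for far more than 10
integers, which is consistent with the write in slasdt landing outside the
memory that belongs to IWORK.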

The integer work array is clearly under-dimensioned. Digging around and
computing hex offsets shows that everything in the generated binary code
seems correct; the only anomaly I found is that the supplied argument is
too small an array for the actual, correctly documented write usage.
Looking at the surrounding code, it seems that the second array IWORK2 has
adequate dimension and should be used instead. This assumption is
consistent with the similar routine stimee.f#583, where the 'fat' array
IWORK2 is passed rather than the 'tiny' IWORK.

(d) Tentative solution: With IWORK replaced by IWORK2 as the argument in the
call to CTIM26() at ctimee.f#616, and likewise in the twin call to ZTIM26()
at ztimee.f#616, the fixed code runs smoothly to completion, reporting
failures of the legacy LINPACK/EISPACK code whereas LAPACK succeeds etc...

Hoping that this information is both accurate and somewhat useful, I wish
you a happy day.

Best regards

Jean-Louis Dornstetter
6 route de la Grange aux Moines, 78460 Choisel
France (48°41'03"N, 2°00'57"E)
Mob : +33 6 8574 3488
Tel : +33 1 3052 8669 / 8679


