ScaLAPACK Archives

[Scalapack] Bugs in integer workspace queries for P{S,D}LAQR1

On a related (and much more easily fixed) note, I would humbly, but
strongly, suggest that p{s,d,c,z}lahqr be modified so that integer
workspace queries return a recommended integer workspace size of zero
rather than leaving the return value unmodified.

This idiosyncrasy consumed a very nontrivial amount of both my time and
compute resources to debug at scale, as my local tests had the workspace
size coincidentally set to zero upon initialization.

Jack
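
A minimal sketch of how a caller might guard against a workspace query
that leaves the integer slot untouched. MOCKQR is a hypothetical
stand-in written for this illustration, not a ScaLAPACK routine, and
the size 100 is made up. The only point is that the caller pre-sets
IQUERY(1) before the query instead of relying on whatever the memory
happened to contain.

```fortran
      program wsquery
      implicit none
      double precision dquery(1)
      integer iquery(1), liwork
! Pre-set the integer query slot so that a routine which never
! writes IWORK(1) cannot leave garbage in it.
      iquery(1) = 0
      call mockqr( -1, dquery, -1, iquery )
! Never allocate fewer than one entry.
      liwork = max( 1, iquery(1) )
      print *, 'LWORK  =', int( dquery(1) )
      print *, 'LIWORK =', liwork
      end program wsquery

      subroutine mockqr( lwork, work, liwork, iwork )
! Hypothetical stand-in, NOT a ScaLAPACK routine: on a workspace
! query it reports a floating-point size but never touches
! IWORK(1), which models the behavior described above.
      implicit none
      integer lwork, liwork
      double precision work(*)
      integer iwork(*)
      if( lwork.eq.-1 .or. liwork.eq.-1 ) then
         work(1) = 100.0d0
         return
      end if
      end subroutine mockqr
```

With the pre-set in place the computed LIWORK is well defined even
though the mock routine never reports an integer size. Without it, the
value would depend on uninitialized memory.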

On 03/24/2014 03:34 PM, Jack Poulson wrote:
From what I understand, the calls to PXERBLA in P{S,D}HSEQR are faulty
and missing arguments. For example, on line 357 of SRC/pdhseqr.f,

    CALL PXERBLA( 'PDHSEQR', -info )

This call will segfault if reached, since PXERBLA expects three
arguments: the missing first argument should be ICTXT.

Please see the following commit to my fork:

https://github.com/poulson/scalapack/commit/3dfdf1bea2fa666791f4c3ed692847b4048547c5

After making this change, I am still seeing strange behavior.

Jack

On 03/24/2014 02:46 PM, Jack Poulson wrote:
Thanks Julien.

I actually just ran into another bug in PDHSEQR when I moved on to also
requesting the Schur vectors (multiplied to the right by a passed-in
unitary matrix). I get the following output from valgrind:

==17070== Conditional jump or move depends on uninitialised value(s)
==17072== Conditional jump or move depends on uninitialised value(s)
==17072==    at 0x133783C: pdtrord_ (pdtrord.f:484)
==17070==    at 0x133783C: pdtrord_ (pdtrord.f:484)
==17070==    by 0x1290827: pdlaqr3_ (pdlaqr3.f:373)
==17070==    by 0x127D17D: pdlaqr0_ (pdlaqr0.f:408)
==17070==    by 0x127A377: pdhseqr_ (pdhseqr.f:336)
==17072==    by 0x1290827: pdlaqr3_ (pdlaqr3.f:373)
==17072==    by 0x127D17D: pdlaqr0_ (pdlaqr0.f:408)
==17072==    by 0x127A377: pdhseqr_ (pdhseqr.f:336)

This points to another workspace query, this time in the eigenvalue
reordering performed within the AED (line 484 of pdtrord). It appears
that the passed-in IWORK array (renamed to SELECT) is not properly
initialized, so the invariant subspace size is computed as larger than
it should be.

This bug is causing my driver to crash.

Jack
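
A small sketch of why an uninitialized selection array inflates the
count. It assumes, for illustration, that nonzero entries mark selected
eigenvalues and that the subspace size is the number of nonzero
entries. ISEL, N, and the chosen indices are hypothetical names and
values, not taken from pdtrord.f.

```fortran
      program selcnt
      implicit none
      integer n
      parameter ( n = 8 )
      integer isel(n), k, m
! Clear every entry first so that only explicitly marked
! eigenvalues are counted. Skipping this loop would let stale
! nonzero values be counted as selected.
      do k = 1, n
         isel(k) = 0
      end do
! Mark two eigenvalues as selected (arbitrary choice).
      isel(2) = 1
      isel(5) = 1
! Count the selected entries.
      m = 0
      do k = 1, n
         if( isel(k).ne.0 ) m = m + 1
      end do
      print *, 'selected =', m
      end program selcnt
```

Here the count is 2 because the array was cleared first. Any entry left
holding a stale nonzero value would be added to that count.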

On 03/24/2014 02:02 PM, Langou, Julien wrote:

Hi Jack, thanks for the bug report (and the fix!), see commit below. Best 
wishes, Cheers, Julien.

Begin forwarded message:

From: "Langou, Julien" <julien.langou@Domain.Removed>
Subject: [ScaLAPACK SVN] r198 - in /scalapack/trunk/SRC: pdlaqr1.f pslaqr1.f
Date: March 24, 2014 at 2:00:05 PM EDT

Author: langou
Date: Mon Mar 24 14:00:04 2014
New Revision: 198

Log:

Bug fix from Jack Poulson, email sent on 03-24-2014.
Thanks Jack!

=============================================================================
Dear ScaLAPACK developers,

I have been experimenting with ScaLAPACK's AED implementations and my drivers
were crashing for small problem sizes. I subsequently found and fixed the
problem: P{S,D}LAQR1 is not setting IWORK(1)=3 when the integer workspace is
queried, but it is accessing the first three entries of IWORK during normal
operation.

The reason that this causes crashes only for tiny matrices is that, for large
matrices, P{S,D}HSEQR also queries workspaces from P{S,D}LAQR0, and that
integer workspace query will almost certainly be larger than 3.

You can find my two-line changesets here:
  https://github.com/poulson/scalapack/commits/master

Please let me know if it is better to submit this to the forum instead.

Jack
=============================================================================



Modified:
   scalapack/trunk/SRC/pdlaqr1.f
   scalapack/trunk/SRC/pslaqr1.f

Modified: scalapack/trunk/SRC/pdlaqr1.f
==============================================================================
--- scalapack/trunk/SRC/pdlaqr1.f (original)
+++ scalapack/trunk/SRC/pdlaqr1.f Mon Mar 24 14:00:04 2014
@@ -340,6 +340,7 @@
     $             +6*LDS*LDS )
      IF( LWORK.EQ.-1 .OR. ILWORK.EQ.-1 ) THEN
         WORK( 1 ) = DBLE( LWKOPT )
+         IWORK( 1 ) = 3
         RETURN
      ELSEIF( LWORK.LT.LWKOPT ) THEN
         INFO = -15

Modified: scalapack/trunk/SRC/pslaqr1.f
==============================================================================
--- scalapack/trunk/SRC/pslaqr1.f (original)
+++ scalapack/trunk/SRC/pslaqr1.f Mon Mar 24 14:00:04 2014
@@ -340,6 +340,7 @@
     $             +6*LDS*LDS )
      IF( LWORK.EQ.-1 .OR. ILWORK.EQ.-1 ) THEN
         WORK( 1 ) = FLOAT( LWKOPT )
+         IWORK( 1 ) = 3
         RETURN
      ELSEIF( LWORK.LT.LWKOPT ) THEN
         INFO = -15



On Mar 24, 2014, at 9:16 AM, Jack Poulson <jpoulson@Domain.Removed> wrote:

Dear ScaLAPACK developers,

I have been experimenting with ScaLAPACK's AED implementations and my
drivers were crashing for small problem sizes. I subsequently found and
fixed the problem: P{S,D}LAQR1 is not setting IWORK(1)=3 when the
integer workspace is queried, but it is accessing the first three
entries of IWORK during normal operation.

The reason that this causes crashes only for tiny matrices is that, for
large matrices, P{S,D}HSEQR also queries workspaces from P{S,D}LAQR0,
and that integer workspace query will almost certainly be larger than 3.

You can find my two-line changesets here:
   https://github.com/poulson/scalapack/commits/master

Please let me know if it is better to submit this to the forum instead.

Jack
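
The masking effect described above can be sketched with plain integers.
This assumes the caller takes the maximum of the integer sizes reported
by the queries it makes, and it models the unfixed query as leaving a
zero in the slot. The value 50 for the P{S,D}LAQR0 query is made up for
illustration.

```fortran
      program wsmax
      implicit none
      integer need, q1old, q1new, q0
      parameter ( need = 3 )
! Assumed values: the unfixed query leaves the slot at its
! initial value (modeled here as 0); 50 is a made-up size
! standing in for the P?LAQR0 integer workspace query.
      q1old = 0
      q1new = 3
      q0 = 50
! Small matrix: only the P?LAQR1 query contributes.
      print *, 'small, unfixed ok? ', q1old.ge.need
! Large matrix: the larger P?LAQR0 query hides the problem.
      print *, 'large, unfixed ok? ', max( q1old, q0 ).ge.need
! Small matrix after the fix: the query reports 3.
      print *, 'small, fixed ok?   ', q1new.ge.need
      end program wsmax
```

Only the first case fails the check, which matches the observation
that the crashes showed up for small problem sizes.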


