bug in dgamx2d (triggered in parpack)

Postby davydden » Sat Feb 28, 2015 9:00 am

Dear all,

There seems to be a bug in dgamx2d.
I hit a segmentation fault when using Parpack 3.1.4 built against Scalapack 2.0.2 with clang.

The issue goes away when I build the trunk/head version of Scalapack instead.
Presumably the bug has already been fixed, but I have not seen a topic dedicated to it here.

Code:
==84000== Invalid read of size 8
==84000==    at 0x113B4906F: dgamx2d_ (in /usr/local/Cellar/scalapack/2.0.2_1/lib/libscalapack.dylib)
==84000==    by 0x113D121B3: pdlamch_ (in /usr/local/Cellar/scalapack/2.0.2_1/lib/libscalapack.dylib)
==84000==    by 0x1149100C3: pdsaup2_ (in /usr/local/Cellar/arpack/3.1.4_1/libexec/lib/libparpack.2.dylib)
==84000==    by 0x1288012CF: ???
==84000==    by 0x128801A4F: ???
==84000==    by 0x128801A4F: ???
==84000==    by 0x1287FF06F: ???
==84000==    by 0x12880128F: ???
==84000==    by 0x12880128F: ???
==84000==    by 0xC0000016B: ???
==84000==    by 0x7FFF0000016B: ???
==84000==    by 0x1287F5C0F: ???
==84000==  Address 0x0 is not stack'd, malloc'd or (recently) free'd


Code:
(lldb) r
Process 87381 stopped
* thread #1: tid = 0x20a531, 0x000000011321d06f libscalapack.dylib`dgamx2d_ + 91, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  frame #0: 0x000000011321d06f libscalapack.dylib`dgamx2d_ + 91
libscalapack.dylib`dgamx2d_ + 91:
-> 0x11321d06f:  movq   (%rax,%rdi,8), %r14
 0x11321d073:  jae    0x11321d079               ; dgamx2d_ + 101
 0x11321d075:  orb    $0x20, %r12b
 0x11321d079:  movb   (%rsi), %dl


P.S. A segmentation fault (likely the same one) also occurs with GCC 4.8, although I do not know exactly which versions of Scalapack and Parpack were used in that case.


Regards,
Denis
davydden
 
Posts: 6
Joined: Fri Apr 17, 2009 11:32 am

Re: bug in dgamx2d (triggered in parpack)

Postby davydden » Wed Jul 15, 2015 9:45 am

Here is a small example that shows the problem.
I did not write it, but that makes it even better: it fails at exactly the same spot as my more complicated code:
http://forge.scilab.org/index.php/p/arp ... sues/1480/

For that small program `lldb` gives:
Code:
Process 12397 stopped
* thread #1: tid = 0xb446d, 0x0000000115765ccc libscalapack.dylib`dgamx2d_ + 64, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000115765ccc libscalapack.dylib`dgamx2d_ + 64
libscalapack.dylib`dgamx2d_:
->  0x115765ccc <+64>: movq   (%rax,%rdi,8), %r14
    0x115765cd0 <+68>: movb   (%rdx), %bl
    0x115765cd2 <+70>: movb   %bl, %al
    0x115765cd4 <+72>: addb   $-0x41, %al
(lldb) bt
* thread #1: tid = 0xb446d, 0x0000000115765ccc libscalapack.dylib`dgamx2d_ + 64, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000115765ccc libscalapack.dylib`dgamx2d_ + 64
    frame #1: 0x00000001158f9157 libscalapack.dylib`pdlamch_ + 167
    frame #2: 0x000000011601d75b libparpack.2.dylib`pdsaupd_ + 2267
    frame #3: 0x0000000100027aac dvr_parpack`pdsaupd(n=256, nev=10, Evals=0x000000011f311fa0, Evecs=0x0000000120803200) + 1100 at dvr_parpack.cc:308
    frame #4: 0x0000000100026621 dvr_parpack`main(nr_arguments=2, arguments=0x00007fff5fbffa60) + 449 at dvr_parpack.cc:142
    frame #5: 0x00007fff92e545c9 libdyld.dylib`start + 1




I checked `pdlamch`: it is unchanged between 2.0.2 and the trunk. The same applies to `dgamx2d`, which is provided by `scalapack`.

It would be nice if the developers could take a brief look.

P.S. In Parpack, this is the call to `pdlamch`:
Code:
eps23 = pdlamch(comm, 'Epsilon-Machine')

Re: bug in dgamx2d (triggered in parpack)

Postby davydden » Wed Jul 22, 2015 2:59 am

Here is a minimal example that leads to a segmentation fault.
I hope the developers will investigate it.

Code:
! compile as :
! mpif90 -o hello -L/usr/local/opt/scalapack/lib/ -lscalapack hello.f90
program hello

implicit none

double precision pdlamch
external   pdlamch

include   'mpif.h'

integer    comm
integer ierr, nPEs

double precision eps23

call MPI_INIT ( ierr )
call MPI_Comm_size(MPI_COMM_WORLD, nPEs, ierr)

comm = MPI_COMM_WORLD

print *, 'Here comes a segmentation fault!'
! eps23 = pdlamch(MPI_COMM_WORLD, 'Epsilon-Machine')
eps23 = pdlamch(comm, 'E')
print *, 'eps23=',eps23

call MPI_FINALIZE ( ierr )
stop

end program hello

Re: bug in dgamx2d (triggered in parpack)

Postby davydden » Tue Sep 01, 2015 4:11 am

This turned out to be a (P)ARPACK issue (symbol collision): https://github.com/opencollab/arpack-ng ... -135168310
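
For anyone landing here later: as far as I understand it, Parpack ships its own `pdlamch` that takes an MPI communicator, whereas the `pdlamch` in `scalapack` expects a BLACS context as its first argument. The examples above therefore hand an MPI communicator to a routine that expects a context. Below is an untested minimal sketch, assuming only `scalapack` (with its bundled BLACS) is linked, of calling its `pdlamch` with a proper context. The program name, the file name and the 1 x nprocs grid shape are my own choices for illustration, and the library path is simply copied from the compile line of the earlier example.

```fortran
! Hypothetical sketch, not taken from the linked issue.
! compile as (path assumed, adapted from the earlier example):
! mpif90 -o ctxt ctxt.f90 -L/usr/local/opt/scalapack/lib/ -lscalapack
program ctxt_example

implicit none

double precision pdlamch
external   pdlamch
external   blacs_pinfo, blacs_get, blacs_gridinit
external   blacs_gridexit, blacs_exit

integer    iam, nprocs, ictxt
double precision eps23

! BLACS initialises MPI itself if that has not been done yet
call blacs_pinfo(iam, nprocs)

! ask for the default system context, then build a 1 x nprocs grid on it
call blacs_get(-1, 0, ictxt)
call blacs_gridinit(ictxt, 'Row-major', 1, nprocs)

! pass the BLACS context, not an MPI communicator
eps23 = pdlamch(ictxt, 'E')
if (iam == 0) print *, 'eps23=', eps23

call blacs_gridexit(ictxt)
! 0 = also shut down MPI
call blacs_exit(0)

end program ctxt_example
```

If this runs cleanly while the `hello` program above crashes against the same library, that would be consistent with the collision being the cause rather than a bug in `dgamx2d` itself.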

Re: bug in dgamx2d (triggered in parpack)

Postby admin » Tue Sep 01, 2015 4:34 am

Thank you for the update
admin
Site Admin
 
Posts: 607
Joined: Wed Dec 08, 2004 7:07 pm

