by gregoryjorris » Fri Feb 29, 2008 4:36 pm
Folks,
Let me be more specific. Here's some sample output from the PBLAS test routine. The same error occurs with Open MPI, LAM, and MPICH. It also happens with standard BLAS, CBLAS, ATLAS, and vecLib, independently of 32- or 64-bit mode. So it seems there are only two suspects left: gfortran (from the hpc.sourceforge.net site) or the PBLAS routines. I have seen the earlier notice, dated 2005, about vecLib having a problem with gfortran, and am wondering whether the same problem exists here.
I should also point out that it's not just a problem with the test routine. All of the ScaLAPACK routines that call these complex dot products (PZDOTU, PZDOTC) fail in some way.
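For anyone who wants to check the underlying dot products independently of the PBLAS tester, here is a minimal reference sketch in Python of what the serial BLAS routines ZDOTU and ZDOTC are defined to compute. It is not part of ScaLAPACK: the function names `dotu` and `dotc` are hypothetical, and the sketch assumes positive increments and 0-based indexing only. Its values can be compared against whatever a small Fortran or C driver gets back from the library.

```python
def dotu(n, x, incx, y, incy):
    """Unconjugated dot product: sum of x[i] * y[i] over n strided elements."""
    # Assumes incx > 0 and incy > 0 (hypothetical helper, not a BLAS binding).
    return sum(x[i * incx] * y[i * incy] for i in range(n))


def dotc(n, x, incx, y, incy):
    """Conjugated dot product: sum of conj(x[i]) * y[i] over n strided elements."""
    # Same stride assumptions as dotu; only the first vector is conjugated.
    return sum(x[i * incx].conjugate() * y[i * incy] for i in range(n))


if __name__ == "__main__":
    # Small made-up vectors; y is read with stride 2, like INCY = 2 in the log.
    x = [1 + 2j, 3 - 1j]
    y = [2 - 1j, 0j, 1j, 0j]
    print("dotu:", dotu(2, x, 1, y, 2))
    print("dotc:", dotc(2, x, 1, y, 2))
```

If a standalone call to the library's ZDOTU or ZDOTC disagrees with these values, or crashes, that would point at the compiler and BLAS combination rather than at the PBLAS layer.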
Regards
mpiexec -np 4 ../../TESTING//xzpblas1tst
Level 1 PBLAS testing program.
'Intel iPSC/860 hypercube, gamma model.'
Tests of the complex double precision Level 1 PBLAS
The following parameter values will be used:
Number of Tests : 4
Number of process grids : 4
P : 2 1 2 1
Q : 2 2 1 4
Stop on failure flag : F
Test for error exits flag : T
Leading dimension gap : 10
Verbosity level : 0
Alpha : ( 2.00000 , -3.00000 )
Routines to be tested : PZSWAP ... Yes
PZSCAL ... Yes
PZDSCAL ... Yes
PZCOPY ... Yes
PZAXPY ... Yes
PZDOTU ... Yes
PZDOTC ... Yes
PDZNRM2 ... Yes
PDZASUM ... Yes
PZAMAX ... Yes
Relative machine precision (eps) is taken to be 0.111022E-15
Tests started.
Error-exit tests completed.
Test number 1 started on a 2 x 2 process grid.
-----------------------------------------------------------------------------
N IX JX MX NX IMBX INBX MBX NBX RSRCX CSRCX INCX
-----------------------------------------------------------------------------
14 5 2 36 24 2 2 2 2 0 0 1
-----------------------------------------------------------------------------
N IY JY MY NY IMBY INBY MBY NBY RSRCY CSRCY INCY
-----------------------------------------------------------------------------
14 1 7 2 27 2 2 2 2 0 0 2
-----------------------------------------------------------------------------
Tested Subroutine: PZSWAP
***** Input-only parameter check: PZSWAP PASSED *****
***** Computational check: PZSWAP PASSED *****
Tested Subroutine: PZSCAL
***** Input-only parameter check: PZSCAL PASSED *****
***** Computational check: PZSCAL PASSED *****
Tested Subroutine: PZDSCAL
***** Input-only parameter check: PZDSCAL PASSED *****
***** Computational check: PZDSCAL PASSED *****
Tested Subroutine: PZCOPY
***** Input-only parameter check: PZCOPY PASSED *****
***** Computational check: PZCOPY PASSED *****
Tested Subroutine: PZAXPY
***** Input-only parameter check: PZAXPY PASSED *****
***** Computational check: PZAXPY PASSED *****
Tested Subroutine: PZDOTU
[host:91379] *** Process received signal ***
[host:91379] Signal: Segmentation fault (11)
[host:91379] Signal code: Address not mapped (1)
[host:91379] Failing at address: 0xffffff94
[ 1] [0x00000000, 0xffffff94] (FP-)
[host:91377] *** Process received signal ***
[host:91377] Signal: Bus error (10)
[host:91377] Signal code: (2)
[host:91377] Failing at address: 0x2
[host:91379] *** End of error message ***
[ 1] [0x00000000, 0x00000002] (FP-)
[marlin:91377] *** End of error message ***
mpiexec noticed that job rank 0 with PID 91377 on node host.somewhere.com exited on signal 10 (Bus error).
3 additional processes aborted (not shown)