ScaLapack and Mac OS X 10.5

Postby gregoryjorris » Fri Feb 29, 2008 12:26 pm

Does anyone know the cause of the seg-fault in the routines pcdotc, pcdotu, pzdotc and pzdotu under OS X?

Postby gregoryjorris » Fri Feb 29, 2008 4:36 pm

Folks,
Let me be more specific. Here's some sample output from the PBLAS test routine. The same error has occurred with OpenMPI, LAM, and MPICH. It has also happened with the standard BLAS, CBLAS, ATLAS, and vecLib, independent of 32- or 64-bit mode. So it seems there are only two things left: gfortran (from the hpc.sourceforge.net site) or the PBLAS routines themselves. I have seen the notice posted here in 2005 about vecLib having a problem with gfortran and am wondering whether the same problem exists here.

I should also point out that it's not just a problem with the test routine: all of the ScaLAPACK routines that call these complex dot products fail in some way.

Regards


mpiexec -np 4 ../../TESTING//xzpblas1tst
Level 1 PBLAS testing program.
'Intel iPSC/860 hypercube, gamma model.'

Tests of the complex double precision Level 1 PBLAS

The following parameter values will be used:

Number of Tests : 4
Number of process grids : 4
P : 2 1 2 1
Q : 2 2 1 4
Stop on failure flag : F
Test for error exits flag : T
Leading dimension gap : 10
Verbosity level : 0
Alpha : ( 2.00000 , -3.00000 )
Routines to be tested : PZSWAP ... Yes
PZSCAL ... Yes
PZDSCAL ... Yes
PZCOPY ... Yes
PZAXPY ... Yes
PZDOTU ... Yes
PZDOTC ... Yes
PDZNRM2 ... Yes
PDZASUM ... Yes
PZAMAX ... Yes
Relative machine precision (eps) is taken to be 0.111022E-15

Tests started.

Error-exit tests completed.

Test number 1 started on a 2 x 2 process grid.

-----------------------------------------------------------------------------
N IX JX MX NX IMBX INBX MBX NBX RSRCX CSRCX INCX
-----------------------------------------------------------------------------
14 5 2 36 24 2 2 2 2 0 0 1
-----------------------------------------------------------------------------
N IY JY MY NY IMBY INBY MBY NBY RSRCY CSRCY INCY
-----------------------------------------------------------------------------
14 1 7 2 27 2 2 2 2 0 0 2
-----------------------------------------------------------------------------

Tested Subroutine: PZSWAP
***** Input-only parameter check: PZSWAP PASSED *****
***** Computational check: PZSWAP PASSED *****

Tested Subroutine: PZSCAL
***** Input-only parameter check: PZSCAL PASSED *****
***** Computational check: PZSCAL PASSED *****

Tested Subroutine: PZDSCAL
***** Input-only parameter check: PZDSCAL PASSED *****
***** Computational check: PZDSCAL PASSED *****

Tested Subroutine: PZCOPY
***** Input-only parameter check: PZCOPY PASSED *****
***** Computational check: PZCOPY PASSED *****

Tested Subroutine: PZAXPY
***** Input-only parameter check: PZAXPY PASSED *****
***** Computational check: PZAXPY PASSED *****

Tested Subroutine: PZDOTU
[host:91379] *** Process received signal ***
[host:91379] Signal: Segmentation fault (11)
[host:91379] Signal code: Address not mapped (1)
[host:91379] Failing at address: 0xffffff94
[ 1] [0x00000000, 0xffffff94] (FP-)
[host:91377] *** Process received signal ***
[host:91377] Signal: Bus error (10)
[host:91377] Signal code: (2)
[host:91377] Failing at address: 0x2
[host:91379] *** End of error message ***
[ 1] [0x00000000, 0x00000002] (FP-)
[marlin:91377] *** End of error message ***
mpiexec noticed that job rank 0 with PID 91377 on node host.somewhere.com exited on signal 10 (Bus error).
3 additional processes aborted (not shown)
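
One way to narrow this down is to take PBLAS and MPI out of the picture entirely and call the BLAS zdotc directly from C. Below is a minimal sketch, not a definitive test: it assumes the usual trailing-underscore symbol name and the gfortran convention (complex result returned by value), so a BLAS built for the f2c/g77 return convention would be expected to crash or return garbage on the same call.

/* Minimal sketch: call zdotc directly from C, bypassing PBLAS and MPI.
 * Assumptions: the BLAS exports the symbol zdotc_ and uses the gfortran
 * convention (complex result returned by value).  Build e.g. with
 *   gcc test_zdotc.c -o test_zdotc -lblas                  (reference BLAS)
 *   gcc test_zdotc.c -o test_zdotc -framework Accelerate   (vecLib)      */
#include <complex.h>
#include <stdio.h>

/* gfortran-style prototype: the complex result comes back by value. */
double complex zdotc_(const int *n,
                      const double complex *zx, const int *incx,
                      const double complex *zy, const int *incy);

int main(void)
{
    double complex x[3] = { 1.0 + 2.0*I, 3.0 - 1.0*I, 0.5 + 0.5*I };
    double complex y[3] = { 2.0 - 1.0*I, 1.0 + 1.0*I, 4.0 + 0.0*I };
    int n = 3, inc = 1;

    /* With a BLAS built for the f2c/g77 convention (hidden result pointer
     * as an extra first argument), this call misreads its arguments and
     * may segfault instead of printing conj(x).y */
    double complex d = zdotc_(&n, x, &inc, y, &inc);
    printf("zdotc = (%g, %g)\n", creal(d), cimag(d));
    return 0;
}

If this standalone call already fails for a given BLAS/compiler combination, the problem is below PBLAS.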

Postby Julie » Fri Feb 29, 2008 7:52 pm

Gregory,
I would advise you to focus on using OpenMPI and the reference BLAS to check whether your installation is correct.
I am running ScaLAPACK on Mac OS X 10.5.2 with gcc 4.3.1, OpenMPI 1.2.5, and the reference BLAS, and everything is fine for xzpblas1tst.
I installed ScaLAPACK with the installer provided at http://www.netlib.org/scalapack.
For the OpenMPI installation, I just did a plain ./configure.
Let me know if you need more help.
Julie

Postby gregoryjorris » Mon Mar 03, 2008 5:21 pm

Julie,

I'm assuming that by "reference BLAS" you mean the BLAS from the LAPACK distribution? If so, isn't it significantly slower than the tuned BLAS distribution in vecLib? Also, are you saying that your entire gcc distribution was 4.3.1? If so, where did you pick it up (hpc.sourceforge.net only has 4.3.0)? And I'm assuming you're not using the system version but rather one built with gcc 4.3.1?

I originally tried the installer, and it produced the problems described above. Of course, I was using the Developer Tools version of gcc, which is still listed as 4.0.1 if I'm not mistaken.

Regards,
Greg

Postby Julie » Mon Mar 03, 2008 5:34 pm

Gregory,

> I'm assuming that by "reference BLAS" you mean the BLAS from the LAPACK distribution?

Yes.

> If so, isn't it significantly slower than the tuned BLAS distribution in vecLib?

Yes, of course, but it is just there to check that the tests pass. Once they are successful with the reference BLAS, you should have no problem with vecLib.

> Also, are you saying that your entire gcc distribution was 4.3.1?

Actually, I am using 4.3.0 from http://hpc.sourceforge.net/

Julie

Postby gregoryjorris » Mon Mar 03, 2008 8:50 pm

Julie,

After downloading the complete gcc 4.3.0 from hpc.sourceforge.net and using a locally compiled OpenMPI and ATLAS, the PBLAS test routines now work, as do the ScaLAPACK test routines. While I haven't tested it yet, I would assume it also works with the reference BLAS. It still does not work with vecLib, regardless of the combination of compilers and settings. I would assume this is still a problem with the vecLib versions of zdotu, zdotc, cdotu, and cdotc. I saw something posted here in 2005 discussing this problem; if memory serves, it is related to the way complex values are returned from Fortran functions and how that differs from the standard C convention.
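
For what it's worth, that mismatch boils down to two incompatible prototypes for the same routine. The sketch below is only an illustration: the names zdotc_gfortran and zdotc_f2c are made up to describe the one real symbol zdotc_, and the claim that vecLib follows the f2c/g77 convention is an assumption based on that 2005 discussion.

/* The two incompatible conventions for a Fortran function returning a
 * double-complex value (illustrative names; both describe zdotc_). */
#include <complex.h>

/* gfortran default: the complex result is returned by value. */
double complex zdotc_gfortran(const int *n,
                              const double complex *x, const int *incx,
                              const double complex *y, const int *incy);

/* f2c/g77 convention (reportedly what vecLib was built for): the result
 * is written through a hidden pointer prepended to the argument list. */
void zdotc_f2c(double complex *result, const int *n,
               const double complex *x, const int *incx,
               const double complex *y, const int *incy);

/* Code compiled for one convention but linked against a library built for
 * the other reads every argument after the mismatch from the wrong slot,
 * which is consistent with the "Address not mapped" crashes above.
 * gfortran's -ff2c option switches it to the f2c/g77 convention and is
 * one possible workaround. */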

Additionally, this did not work when gcc, g++, mpicc, and mpic++ from the standard Leopard Developer Tools were mixed with gfortran 4.3.0, since the former are version 4.0.1.

Having gone through this, I think it should probably be mentioned in the release notes.

Regards,
Greg

Re: ScaLapack and Mac OS X 10.5

Postby bschmidt » Wed Aug 26, 2009 9:27 am

Although it's been more than a year, I'd like to comment on your post. I presume you are using a PowerPC-based Mac? Perhaps http://developer.apple.com/hardwaredrivers/ve/errata.html#fortran_conventions is of interest to you?

Best regards, Burkhard.
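
As a footnote for anyone hitting the same wall: one convention-independent way to call the complex dot products is through the CBLAS interface, where the result is returned through an explicit pointer argument rather than as a Fortran function value. A minimal sketch follows, assuming vecLib/Accelerate provides cblas_zdotc_sub as standard CBLAS does.

/* Convention-independent zdotc via CBLAS: the result goes through an
 * explicit pointer, so the Fortran return-value ABI never comes into play.
 * Build on OS X with:   gcc zdotc_cblas.c -o zdotc_cblas -framework Accelerate
 * or elsewhere with:    gcc zdotc_cblas.c -o zdotc_cblas -lcblas             */
#include <complex.h>
#include <stdio.h>
#ifdef __APPLE__
#include <Accelerate/Accelerate.h>
#else
#include <cblas.h>
#endif

int main(void)
{
    double complex x[2] = { 1.0 + 1.0*I, 2.0 - 1.0*I };
    double complex y[2] = { 3.0 + 0.0*I, 0.0 + 2.0*I };
    double complex d;

    cblas_zdotc_sub(2, x, 1, y, 1, &d);   /* d = conj(x) . y */
    printf("zdotc = (%g, %g)\n", creal(d), cimag(d));
    return 0;
}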

