mysterious MPI bug ....


Post by timothee » Fri Sep 17, 2010 4:23 am

Hello all,

I am an HPC developer (C++). I installed ScaLAPACK on my cluster and everything works there, but I would like to develop on my MacBook Pro, and that is where the problems start ....

-1) I installed gfortran: ok
-2) I reinstalled Open MPI to get mpif90: ok
-3) I installed ScaLAPACK using the automatic installation tools (all tests pass): ok
-4) I first developed a LAPACK version of my code (C++), linked it against the library and ran it: ok
-5) Then I started developing my ScaLAPACK version, and I get a run-time crash when I call Cblacs_gridinit ....

I compile with:

/usr/local/bin/mpic++ main_mpi.cpp -o ex -L$PATH_LD -lscalapack -llapack $PATH_LD/blacsC.a $PATH_LD/blacs.a $PATH_LD/blacsC.a -lgfortran -lmpi_f77


The program:
// Init MPI
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &_nnumprocsMPI);
MPI_Comm_rank(MPI_COMM_WORLD, &_nmyidMPI);
// Init BLACS: row-major process ordering
_order[0] = 'R';
_order[1] = 'o';
_order[2] = 'w';
_nContxt = -1;
_nContinue = 0;
Cblacs_pinfo(&_nmyidBLACS, &_nnumprocsBLACS);
// what = 0 asks for the default system context, returned in _nVal
Cblacs_get(_nContxt, _nContinue, &_nVal);

// Create a 2 x 2 process grid; this is the call that crashes
Cblacs_gridinit(&_nVal, _order, 2, 2);

The program crashes at run time with this backtrace:

[MacBook-Pro-de-Tim-Ewart:06838] [ 0] 2 libSystem.B.dylib 0x00007fff83e5535a _sigtramp + 26
[MacBook-Pro-de-Tim-Ewart:06838] [ 1] 3 ??? 0x0000000000000080 0x0 + 128 <---- mystic line
[MacBook-Pro-de-Tim-Ewart:06838] [ 2] 4 ex 0x0000000100005773 BI_TransUserComm + 35 <---- system layer mpi ?
[MacBook-Pro-de-Tim-Ewart:06838] [ 3] 5 ex 0x0000000100005d8b Cblacs_gridmap + 251
[MacBook-Pro-de-Tim-Ewart:06838] [ 4] 6 ex 0x0000000100005c1b Cblacs_gridinit + 123
[MacBook-Pro-de-Tim-Ewart:06838] [ 5] 7 ex 0x00000001000051bd _ZN9CParBLACSC1EiPPc + 193
[MacBook-Pro-de-Tim-Ewart:06838] [ 6] 8 ex 0x000000010000145d main + 66
[MacBook-Pro-de-Tim-Ewart:06838] [ 7] 9 ex 0x00000001000012fc start + 52

At first I thought I was missing a library, so I ran otool -L ex and got:

/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib (compatibility version 1.0.0, current version 219.0.0)
/usr/local/lib/libgfortran.3.dylib (compatibility version 4.0.0, current version 4.0.0)
/usr/local/lib/libmpi_f77.0.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/local/lib/libmpi_cxx.0.dylib (compatibility version 1.0.0, current version 1.1.0)
/usr/local/lib/libmpi.0.dylib (compatibility version 1.0.0, current version 1.2.0)
/usr/local/lib/libopen-rte.0.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/local/lib/libopen-pal.0.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/libutil.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.9.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.2.0)

The second possibility is a mix of 32-bit and 64-bit libraries, but the linker refuses when I try that kind of linking, so that seems ruled out. I also thought about an integer-size mismatch (Fortran INTEGER defined as 8 bytes while a C int is 4 bytes), but I used long integers. The last possibility is a bug in the Open MPI build on the Mac (FreeBSD-based). If anybody has an idea, please let me know; I am not a big fan of developing in vi.
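
The integer-width concern above can be checked in isolation. The following is a minimal sketch, not part of the posted program: it assumes the Fortran side was built with gfortran's default 4-byte INTEGER (that is, without -fdefault-integer-8), which has not been verified for this build. The names kAssumedFortranIntegerBytes, matches_fortran_integer and width_report are hypothetical helpers introduced only for illustration.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Assumption: the Fortran libraries use the default 4-byte INTEGER.
// Change this to 8 if they were built with -fdefault-integer-8.
constexpr std::size_t kAssumedFortranIntegerBytes = 4;

// Hypothetical helper: true when the C++ type T has the same width as the
// assumed Fortran INTEGER.
template <typename T>
constexpr bool matches_fortran_integer() {
    return sizeof(T) == kAssumedFortranIntegerBytes;
}

// Hypothetical helper: one-line summary of the widths on this machine.
inline std::string width_report() {
    std::ostringstream out;
    out << "sizeof(int)=" << sizeof(int)
        << " sizeof(long)=" << sizeof(long)
        << " int_ok=" << matches_fortran_integer<int>()
        << " long_ok=" << matches_fortran_integer<long>();
    return out.str();
}
```

Printing width_report() from main shows the widths for the compiler in use. On a 64-bit Mac build, long is 8 bytes while int is 4, so under the stated assumption a long would not have the same width as a default Fortran INTEGER. Whether that is related to the crash is not established here.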
timothee
 