scalapack on ubuntu 9.10

Post by righton » Sat Apr 24, 2010 12:07 am

I was wondering if anyone has got ScaLAPACK to work under Ubuntu 9.10 (64-bit version).

I tried both the distribution version (where I got everything via apt-get) and building everything from scratch, both to no avail. For kicks I played with turning off optimization and manually setting -m64 everywhere, but that did not seem to help.

A few of the tests pass fine, but some don't. For an example of a failure, see below.

Any ideas?
Thanks.

$$ mpirun -np 4 ./xdsep-openmpi
[dakine:13794] *** Process received signal ***
[dakine:13794] Signal: Segmentation fault (11)
[dakine:13794] Signal code: Address not mapped (1)
[dakine:13794] Failing at address: 0x1a
[dakine:13794] [ 0] /lib/libc.so.6 [0x7f617720a530]
[dakine:13794] [ 1] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so [0x7f6173c3ad53]
[dakine:13794] [ 2] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so [0x7f6173c3b56d]
[dakine:13794] [ 3] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so [0x7f6173c3ba86]
[dakine:13794] [ 4] /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so [0x7f6172ddd53f]
[dakine:13794] [ 5] /usr/lib/libopen-pal.so.0(opal_progress+0x5a) [0x7f6176d2f05a]
[dakine:13794] [ 6] /usr/lib/libmpi.so.0 [0x7f6177f3c5f5]
[dakine:13794] [ 7] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so [0x7f6171d1b40a]
[dakine:13794] [ 8] /usr/lib/libmpi.so.0(ompi_comm_nextcid+0x10a) [0x7f6177f2a6ca]
[dakine:13794] [ 9] /usr/lib/libmpi.so.0 [0x7f6177f297c0]
[dakine:13794] [10] /usr/lib/libmpi.so.0(MPI_Comm_create+0xc1) [0x7f6177f54581]
[dakine:13794] [11] /usr/lib/libmpi_f77.so.0(pmpi_comm_create__+0x44) [0x7f6177cf6834]
[dakine:13794] [12] /usr/lib/libblacs-openmpi.so.1(BI_TransUserComm+0xf0) [0x7f6179862f90]
[dakine:13794] [13] /usr/lib/libblacs-openmpi.so.1(Cblacs_gridmap+0x118) [0x7f617987cd48]
[dakine:13794] [14] /usr/lib/libscalapack-openmpi.so.1(SL_Cgridreshape+0x1fc) [0x7f6179d2758c]
[dakine:13794] [15] ./xdsep-openmpi [0x41788a]
[dakine:13794] [16] ./xdsep-openmpi [0x4183aa]
[dakine:13794] [17] ./xdsep-openmpi [0x407c44]
[dakine:13794] [18] ./xdsep-openmpi [0x4143a8]
[dakine:13794] [19] ./xdsep-openmpi [0x413794]
[dakine:13794] [20] ./xdsep-openmpi [0x4201fa]
[dakine:13794] [21] /lib/libc.so.6(__libc_start_main+0xfd) [0x7f61771f5abd]
[dakine:13794] [22] ./xdsep-openmpi [0x401de9]
[dakine:13794] *** End of error message ***
SCALAPACK symmetric Eigendecomposition routines.
' '

Running tests of the parallel symmetric eigenvalue routine: PDSYEVX & PDSYEV & PDSYEVD.
The following scaled residual checks will be computed:
||AQ - QL|| / ((abstol + ||A|| * eps) * N)
||Q^T*Q - I|| / (N * eps)

An explanation of the input/output parameters follows:
RESULT : passed; or an indication of which eigen request test failed
N : The number of rows and columns of the matrix A.
P : The number of process rows.
Q : The number of process columns.
NB : The size of the square blocks the matrix A is split into.
THRESH : If a residual value is less than THRESH, RESULT is flagged as PASSED.
: the QTQ norm is allowed to exceed THRESH for those eigenvectors
: which could not be reorthogonalized for lack of workspace.
TYP : matrix type (see PDSEPtst.f).
SUB : Subtests (see PDSEPtst).f
CHK : ||AQ - QL|| / ((abstol + ||A|| * eps) * N)
QTQ : ||Q^T*Q - I||/ (N * eps)
: when the adjusted QTQ exceeds THRESH
the adjusted QTQ norm is printed
: otherwise the true QTQ norm is printed
If NT>1, CHK and QTQ are the max over all eigen request tests
TEST : EVX - testing PDSYEVX, EV - testing PDSYEV, EVD - testing PDSYEVD

N NB P Q TYP SUB WALL CPU CHK QTQ CHECK TEST
----- --- --- --- --- --- -------- -------- --------- --------- ----- ----
'TEST 1 - test tiny matrices - different process configurations'
0 1 1 2 8 N 0.00 -1.00 0.0 0.0 PASSED EVX
0 1 1 2 8 N 0.00 -1.00 0.0 0.0 PASSED EV
0 1 1 2 8 N 0.00 -1.00 0.0 0.0 PASSED EVD
0 1 1 1 8 N 0.00 -1.00 0.0 0.0 PASSED EVX
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 13794 on node dakine exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
righton
 
Posts: 4
Joined: Fri Apr 23, 2010 11:45 pm

Re: scalapack on ubuntu 9.10

Post by admin » Wed Apr 28, 2010 4:08 pm

Hey,
Could you give it one last try with the ScaLAPACK installer? (http://www.netlib.org/scalapack)
If it still does not work, I would change the MPI library. I usually use Open MPI and it works fine.
Julie
admin
Site Admin
 
Posts: 616
Joined: Wed Dec 08, 2004 7:07 pm

Re: scalapack on ubuntu 9.10

Post by righton » Mon May 10, 2010 7:02 pm

Thanks for getting back to me.

I have already tried using the ScaLAPACK installer. Today I also tried using ATLAS and LAPACK from the distribution packages (via apt-get), and used the ScaLAPACK installer for ScaLAPACK (and BLACS).

Built with:
./setup.py --blaslib="-L/usr/lib/atlas -llapack -lblas" --lapacklib="-L/usr/lib/atlas -llapack -lblas" --mpiincdir=/usr/include/mpi --mpibindir=/usr/bin --prefix=$BUILD_DIR/scalapack

Still no luck (see the result below).

Any other ideas?
Thanks!


--
cd ~/mpi_stuff/linux_x64/scalapack/build/scalapack-1.8.0/TESTING
mpirun -np 4 ./xdsep
SCALAPACK symmetric Eigendecomposition routines.
' '

Running tests of the parallel symmetric eigenvalue routine: PDSYEVX & PDSYEV & PDSYEVD.
The following scaled residual checks will be computed:
||AQ - QL|| / ((abstol + ||A|| * eps) * N)
||Q^T*Q - I|| / (N * eps)

An explanation of the input/output parameters follows:
RESULT : passed; or an indication of which eigen request test failed
N : The number of rows and columns of the matrix A.
P : The number of process rows.
Q : The number of process columns.
NB : The size of the square blocks the matrix A is split into.
THRESH : If a residual value is less than THRESH, RESULT is flagged as PASSED.
: the QTQ norm is allowed to exceed THRESH for those eigenvectors
: which could not be reorthogonalized for lack of workspace.
TYP : matrix type (see PDSEPtst.f).
SUB : Subtests (see PDSEPtst).f
CHK : ||AQ - QL|| / ((abstol + ||A|| * eps) * N)
QTQ : ||Q^T*Q - I||/ (N * eps)
: when the adjusted QTQ exceeds THRESH
the adjusted QTQ norm is printed
: otherwise the true QTQ norm is printed
If NT>1, CHK and QTQ are the max over all eigen request tests
TEST : EVX - testing PDSYEVX, EV - testing PDSYEV, EVD - testing PDSYEVD

N NB P Q TYP SUB WALL CPU CHK QTQ CHECK TEST
----- --- --- --- --- --- -------- -------- --------- --------- ----- ----
'TEST 1 - test tiny matrices - different process configurations'
[dakine:12388] *** Process received signal ***
[dakine:12388] Signal: Segmentation fault (11)
[dakine:12388] Signal code: Address not mapped (1)
[dakine:12388] Failing at address: 0x1a
[dakine:12388] [ 0] /lib/libpthread.so.0 [0x7fb2937ab190]
[dakine:12388] [ 1] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so [0x7fb290b88d53]
[dakine:12388] [ 2] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so [0x7fb290b8956d]
[dakine:12388] [ 3] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so [0x7fb290b89a86]
[dakine:12388] [ 4] /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so [0x7fb28fd2b53f]
[dakine:12388] [ 5] /usr/lib/libopen-pal.so.0(opal_progress+0x5a) [0x7fb29477305a]
[dakine:12388] [ 6] /usr/lib/libmpi.so.0 [0x7fb294c525f5]
[dakine:12388] [ 7] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so [0x7fb28ec6940a]
[dakine:12388] [ 8] /usr/lib/libmpi.so.0(ompi_comm_nextcid+0x10a) [0x7fb294c406ca]
[dakine:12388] [ 9] /usr/lib/libmpi.so.0 [0x7fb294c3f7c0]
[dakine:12388] [10] /usr/lib/libmpi.so.0(MPI_Comm_create+0xc1) [0x7fb294c6a581]
[dakine:12388] [11] /usr/lib/libmpi_f77.so.0(pmpi_comm_create__+0x44) [0x7fb294eed834]
[dakine:12388] [12] ./xdsep(BI_TransUserComm+0xed) [0x4be62d]
[dakine:12388] [13] ./xdsep(Cblacs_gridmap+0xf8) [0x4ba5c8]
[dakine:12388] [14] ./xdsep(SL_Cgridreshape+0x114) [0x420a14]
[dakine:12388] [15] ./xdsep(pdlasizesyev_+0x262) [0x419512]
[dakine:12388] [16] ./xdsep(pdsqpsubtst_+0x5e5) [0x419c15]
[dakine:12388] [17] ./xdsep(pdseptst_+0x346b) [0x40af6b]
[dakine:12388] [18] ./xdsep(pdsepreq_+0x7d6) [0x417776]
[dakine:12388] [19] ./xdsep(MAIN__+0x1774) [0x416b80]
[dakine:12388] [20] ./xdsep(main+0x2a) [0x4be75a]
[dakine:12388] [21] /lib/libc.so.6(__libc_start_main+0xfd) [0x7fb29344babd]
[dakine:12388] [22] ./xdsep [0x407a39]
[dakine:12388] *** End of error message ***
0 1 1 2 8 N 0.00 -1.00 0.0 0.0 PASSED EVX
0 1 1 2 8 N 0.00 -1.00 0.0 0.0 PASSED EV
0 1 1 2 8 N 0.00 -1.00 0.0 0.0 PASSED EVD
0 1 1 1 8 N 0.00 -1.00 0.0 0.0 PASSED EVX
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 12388 on node dakine exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Re: scalapack on ubuntu 9.10

Post by righton » Tue Jun 08, 2010 1:48 pm

Is there anyone on this forum who has gotten ScaLAPACK to work under Ubuntu (9.10 or other)?

thanks.

Re: scalapack on ubuntu 9.10

Post by admin » Tue Jun 08, 2010 2:06 pm

Hey,
Could you try
./setup.py --downblas --downlapack
This will get you the Netlib LAPACK and BLAS, which are the reference implementations.
ATLAS is an optimized BLAS but only includes a subset of LAPACK.
Open MPI usually works fine with ScaLAPACK and Ubuntu.

Re: scalapack on ubuntu 9.10

Post by admin » Tue Jun 08, 2010 3:29 pm

Hi again,
Actually, I managed to reproduce your problem.
The problem also seems to appear in the BLACS testing (RUNNING REPEATABLE SUM TEST); failures like that are usually due to a problem with the MPI implementation.
Julie
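
One way to see which libraries are involved in a crash like the ones posted above is to pull the distinct shared-object paths out of the backtrace that Open MPI prints. The following is only a rough sketch: it assumes the backtrace has been saved to a file, and the function name `list_backtrace_libs` and the file name `xdsep.log` are made up for illustration.

```shell
#!/bin/sh
# Sketch: list the distinct shared libraries named in an Open MPI backtrace.
# list_backtrace_libs is a hypothetical helper; it reads log files given as
# arguments, or standard input when no file is given.
list_backtrace_libs() {
    # Frame lines look like:
    #   [host:pid] [ 5] /usr/lib/libfoo.so.0(sym+0x5a) [0x...]
    # Cut out every path that contains ".so", then sort and de-duplicate.
    cat "$@" | grep -o '/[^ (]*\.so[^ (]*' | LC_ALL=C sort -u
}
```

It could be run as `list_backtrace_libs xdsep.log`. Frames that belong to the test executable itself (for example `./xdsep`) are not listed, since they do not name a shared object.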

Re: scalapack on ubuntu 9.10

Post by righton » Fri Jun 11, 2010 12:38 pm

Hi,

I have had a similar issue when running with the MPICH2 implementation of MPI on Ubuntu (I think this was with version 8.10).

You mention that Open MPI usually works fine with ScaLAPACK and Ubuntu. Could you let me know which versions you mean?

Does anyone else have ScaLAPACK running on Ubuntu or Debian?
Thanks.

Re: scalapack on ubuntu 9.10

Post by admin » Tue Jun 15, 2010 4:46 pm

Hi everybody,
I managed to get ScaLAPACK running fine with Open MPI on 64-bit Ubuntu.
First, you may want to check that the Bmake.inc from the BLACS library has the following set:
Code:
TRANSCOMM =  -DUseMpi2

The installer, for example, does not set the flag correctly (at least on my machine). I will need to correct that.
So you have to delete all the BLACS libraries and recompile BLACS once you have changed the value of TRANSCOMM.
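
The TRANSCOMM change described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name `set_transcomm` is made up for illustration, the default file name `Bmake.inc` assumes the script is run from the directory that holds that file, and it presumes the file already contains an uncommented `TRANSCOMM =` line. A backup copy is kept with a `.bak` suffix.

```shell
#!/bin/sh
# Sketch: force TRANSCOMM to -DUseMpi2 in a BLACS Bmake.inc.
# set_transcomm is a hypothetical helper name, not part of BLACS.
set_transcomm() {
    bmake="${1:-Bmake.inc}"
    # Rewrite the existing, uncommented TRANSCOMM assignment in place.
    # The original file is kept as "$bmake.bak".
    sed -i.bak 's/^\([[:space:]]*TRANSCOMM[[:space:]]*=\).*$/\1 -DUseMpi2/' "$bmake"
    # Print the resulting assignment so the change can be checked by eye.
    grep '^[[:space:]]*TRANSCOMM' "$bmake"
}
```

For example, `set_transcomm Bmake.inc` should print the updated assignment. The BLACS libraries still have to be deleted and rebuilt afterwards, as noted above.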
To make sure everything is fine you can do
Code:
make testing

and then go to TESTING/EXE and run
Code:
mpirun -np 4 ./xCbtest_MPI--0

It should finish with the ugly test on MPI_Abort and not get stuck on RUNNING REPEATABLE SUM TEST.
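
Since a hang in the REPEATABLE SUM TEST is the failure mode mentioned here, it may help to run the tester under a time limit so that a stuck run is reported instead of waited on. The sketch below assumes the GNU coreutils `timeout` command is available; the function name `run_with_limit` and the 600-second limit in the usage example are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sketch: run a command under a time limit and report whether it hung.
# run_with_limit is a hypothetical helper; it relies on GNU coreutils `timeout`.
run_with_limit() {
    limit="$1"
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # GNU timeout exits with status 124 when the time limit is reached.
        echo "HUNG (no exit after ${limit}s): $*"
    else
        echo "EXITED with status $status: $*"
    fi
    return "$status"
}
```

A possible invocation is `run_with_limit 600 mpirun -np 4 ./xCbtest_MPI--0`. Because the tester is expected to end with the MPI_Abort test, a nonzero exit status on its own may not indicate a problem; the case to look for is the HUNG report.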

Then clean the ScaLAPACK testing build and recompile.
That solves the problem in the xnsep testing in ScaLAPACK.
Let me know what happens on your side.
Julie

Re: scalapack on ubuntu 9.10

Post by mpo101 » Sun Jul 18, 2010 12:56 am

I am a new user of LAPACK and I want to install it on my Ubuntu 8.04 LTS. What is the procedure? Do I have to add something to my repositories, or can I just use apt-get to install it? Any help is appreciated.
mpo101
 
Posts: 1
Joined: Sun Jul 18, 2010 12:47 am

