ScaLAPACK Archives

[Scalapack] MPI errors with BLACS on Ubuntu 9.10

Dear Julie Langou

Thank you for the response, and sorry for the unusually long delay.

I confirm that ScaLAPACK works correctly when installed manually on 
Ubuntu.  This is with ScaLAPACK 2.0.1 and the standard openmpi package on 
Ubuntu 10.04.  As of this Ubuntu version, the packaged scalapack still does 
not run correctly (using OpenMPI, it now hangs halfway through the test).

Best regards
Ask Hjorth Larsen

On Mon, 30 Nov 2009, julie langou wrote:

Ask, I would tend to agree with your conclusion.
We provide a python installer to install BLACS and ScaLAPACK.
(http://www.netlib.org/scalapack/)
Maybe you could give it a try and see if it solves your problem.
Regards
Julie
On Nov 26, 2009, at 4:39 PM, Ask Hjorth Larsen wrote:

      Dear BLACS developers

      I have installed the default Ubuntu 9.10 packages with
      Blacs/Scalapack: libblacs-mpi1, libblacs-mpi-dev and so on,
      using OpenMPI.  I get an error when attempting to call
      Cblacs_gridinit with certain MPI communicators.

      I have attached a simple example program (blacs.c and a
      corresponding makefile) which exhibits this behaviour.  The same
      program runs fine on several different non-Ubuntu computers with
      BLACS/Scalapack, which leads me to believe that it could be
      related to the debian package.

      The program is run with 8 CPUs (mpirun -np 8 testblacs).  If
      Cblacs_gridinit is called on a subcommunicator for ranks 0, 1,
      2, 3, then everything works.  If instead it is called on ranks
      0, 2, 4, 6, then it gives the following error:

      [askm:3868] *** An error occurred in MPI_Group_incl
      [askm:3868] *** on communicator MPI_COMM_WORLD
      [askm:3868] *** MPI_ERR_RANK: invalid rank
      [askm:3868] *** MPI_ERRORS_ARE_FATAL (your MPI job will now
      abort)
      
      --------------------------------------------------------------------------
      mpirun has exited due to process rank 0 with PID 3868 on
      node askm exiting without calling "finalize". This may
      have caused other processes in the application to be
      terminated by signals sent by mpirun (as reported here).
      --------------------------------------------------------------------------

      (The above error is caused specifically by Cblacs_gridinit, not
      the explicit creation of the MPI group in the program)
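
      For reference, the program does essentially the following.  This is
      only an inline sketch, not the attached blacs.c itself, and the use
      of Csys2blacs_handle to hand the subcommunicator to BLACS is simply
      how this sketch does it; see the attachments for the exact code.

      /* Build a subcommunicator for ranks 0, 2, 4, 6 of MPI_COMM_WORLD
       * and initialize a 2x2 BLACS grid on it.  Compile with mpicc,
       * linking against BLACS/ScaLAPACK, and run with:
       *   mpirun -np 8 testblacs
       * Using ranks 0, 1, 2, 3 instead makes the error disappear. */
      #include <mpi.h>
      #include <stdio.h>

      /* C interface to BLACS; declared here because the packages do not
       * ship a common header for these routines. */
      extern int  Csys2blacs_handle(MPI_Comm comm);
      extern void Cblacs_gridinit(int *ictxt, char *order,
                                  int nprow, int npcol);
      extern void Cblacs_gridinfo(int ictxt, int *nprow, int *npcol,
                                  int *myrow, int *mycol);
      extern void Cblacs_gridexit(int ictxt);

      int main(int argc, char **argv)
      {
          int world_rank;
          int ranks[4] = {0, 2, 4, 6};   /* the failing case */
          MPI_Group world_group, sub_group;
          MPI_Comm subcomm;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

          /* Explicit group/communicator creation; this part succeeds. */
          MPI_Comm_group(MPI_COMM_WORLD, &world_group);
          MPI_Group_incl(world_group, 4, ranks, &sub_group);
          MPI_Comm_create(MPI_COMM_WORLD, sub_group, &subcomm);

          if (subcomm != MPI_COMM_NULL) {
              /* Hand the subcommunicator to BLACS and set up a 2x2
               * grid; this is where the packaged BLACS aborts with the
               * MPI_Group_incl error shown above. */
              int ictxt = Csys2blacs_handle(subcomm);
              int nprow, npcol, myrow, mycol;
              Cblacs_gridinit(&ictxt, "Row", 2, 2);
              Cblacs_gridinfo(ictxt, &nprow, &npcol, &myrow, &mycol);
              printf("world rank %d -> grid position (%d, %d)\n",
                     world_rank, myrow, mycol);
              Cblacs_gridexit(ictxt);
              MPI_Comm_free(&subcomm);
          }

          MPI_Group_free(&sub_group);
          MPI_Group_free(&world_group);
          MPI_Finalize();
          return 0;
      }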

      In case this is relevant, here are the output and error files
      from running the cblacs tests as per the command 'mpirun -np 4
      cblacs_test_shared-openmpi' (-np 8 gives identical output):

      http://www.student.dtu.dk/~ashj/opendir/cblacstest.out
      http://www.student.dtu.dk/~ashj/opendir/cblacstest.err

      None of the tests fail, but some of them are skipped.

      Any help to understand or fix this, or other places to direct
      this question or report it if it is a bug, would be greatly
      appreciated.

      Best regards
      Ask Hjorth Larsen

      Center for Atomic-scale Materials Design
      Technical University of Denmark

      <blacs.c><makefile.txt>
      _______________________________________________
      Scalapack mailing list
      Scalapack@Domain.Removed
      http://lists.eecs.utk.edu/mailman/listinfo/scalapack


**********************************************
Julie Langou; Research Associate in Computer Science
Innovative Computing Laboratory;
University of Tennessee from Denver, Colorado ;-)
julie@Domain.Removed; http://www.cs.utk.edu/~julie/


