MVAPICH: BLACS tester infinite loop - nonblock send issue?

MVAPICH: BLACS tester infinite loop - nonblock send issue?

Postby MarkDixon » Wed Aug 24, 2011 10:12 am

Hi,

I've been building MPI BLACS against both OpenMPI and MVAPICH2, and I've run into an interesting issue when running the BLACS tester with its default input files - but only with MVAPICH2.

Occasionally, the tester gets "stuck" until it is killed. After putting in a few debugging statements, I see that the exact point where this happens can vary. Has anyone else encountered this? I checked the BLACS Errata page on the website and it doesn't seem to cover this specific issue (although it did help me with a separate "Problems compiling dwalltime00" issue I saw - thanks).

After reading the "Non-blocking communication" section of the "Some Plebian Extensions to MPI" paper, I'm wondering if there's an issue where some asynchronous buffer is filling up, causing the MPI to start blocking. If this is the case, I can stop worrying.
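
To illustrate the sort of thing I have in mind, here's a toy sketch in plain MPI C (nothing to do with the BLACS sources; the message count and size are made up): sends small enough to go "eagerly" return immediately out of the library's internal buffers, right up until those buffers fill, at which point the very same call quietly blocks.

/* eager_fill.c - toy sketch (not BLACS code): "small" sends normally
 * complete eagerly out of the MPI library's internal buffers, but once
 * those buffers are full the same MPI_Send call blocks until the
 * receiver catches up.
 *
 * Build:  mpicc -o eager_fill eager_fill.c
 * Run:    mpirun -np 2 ./eager_fill
 * Watch rank 0's progress messages stall part-way through the 10 seconds
 * that rank 1 spends "busy" before posting any receives.
 */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

#define NMSGS 100000   /* plenty of messages to exhaust most eager buffers */
#define LEN   1024     /* 1 KB each - usually below the eager threshold    */

int main(int argc, char **argv)
{
    int rank, i;
    char buf[LEN] = {0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        for (i = 0; i < NMSGS; i++) {
            MPI_Send(buf, LEN, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            if (i % 5000 == 0) {
                printf("rank 0: %d sends completed\n", i);
                fflush(stdout);
            }
        }
    } else if (rank == 1) {
        sleep(10);     /* simulate a rank that is busy elsewhere */
        for (i = 0; i < NMSGS; i++)
            MPI_Recv(buf, LEN, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}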

Do you think I'm close?

Thanks,

Mark

Re: MVAPICH: BLACS tester infinite loop - nonblock send issue?

Postby rodney » Wed Aug 24, 2011 1:08 pm

I have also run into this problem. I'm not sure what is causing the test code to hang, but after trying several different combinations of parameters in Bmake.inc, I got it to work correctly with OpenMPI. Below is the Bmake.inc that I am using:


SHELL = /bin/sh
BTOPdir = $(HOME)/Projects/BLACS
COMMLIB = MPI

BLACSdir = $(BTOPdir)/LIB
BLACSDBGLVL = 0
BLACSFINIT = $(BLACSdir)/blacsF77init_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a
BLACSCINIT = $(BLACSdir)/blacsCinit_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a
BLACSLIB = $(BLACSdir)/blacs_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a

MPIdir = /usr/local
MPILIBdir = $(MPIdir)/lib
MPIINCdir = $(MPIdir)/include
MPILIB =

BTLIBS = $(BLACSFINIT) $(BLACSLIB) $(BLACSFINIT) $(MPILIB)

INSTdir = $(BTOPdir)/INSTALL/EXE

TESTdir = $(BTOPdir)/TESTING/EXE
FTESTexe = $(TESTdir)/xFbtest_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL)
CTESTexe = $(TESTdir)/xCbtest_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL)

INTFACE = -DAdd_

WHATMPI = -DUseCMpi

DEFS1 = -DSYSINC $(SYSINC) $(INTFACE) $(DEFBSTOP) $(DEFCOMBTOP) $(DEBUGLVL)
BLACSDEFS = $(DEFS1) $(SENDIS) $(BUFF) $(TRANSCOMM) $(WHATMPI) $(SYSERRORS)

F77 = mpif77
F77NO_OPTFLAGS = -g
F77FLAGS = $(F77NO_OPTFLAGS) -Wall
F77LOADER = $(F77)
F77LOADFLAGS = -g
CC = mpicc
CCFLAGS = -g -Wall
CCLOADER = $(CC)
CCLOADFLAGS = -g

ARCH = ar
ARCHFLAGS = r
RANLIB = ranlib

Re: MVAPICH: BLACS tester infinite loop - nonblock send issue?

Postby MarkDixon » Fri Aug 26, 2011 9:38 am

Thanks for the suggestion, rodney - I've given it a go, but no joy. I've tried both your Bmake.inc verbatim and my own Bmake.inc with just the WHATMPI change you suggested.

I'm still getting random hangs on some runs. In addition to seeing it with MVAPICH2, I've now seen it with OpenMPI too.

A bit of additional information:

* I'm doing this on 64-bit RHEL5 on an Intel Nehalem box.
* I'm building BLACS against several different combinations of compilers and MPIs.
* Oddly enough, I'm only seeing it on some builds with 64-bit GNU and 64-bit Intel (mostly when teamed up with MVAPICH2 1.4), and not at all with PGI or any 32-bit combinations.

Not sure where to go from here :(

Mark

Re: MVAPICH: BLACS tester infinite loop - nonblock send issue?

Postby MarkDixon » Fri Aug 26, 2011 11:40 am

Not really understanding BLACS, or MPI for that matter, I got my trusty debugger out.

I built BLACS (+ mpiblacs-patch03 patch) on RHEL5, 64-bit GNU 4.4.5, OpenMPI 1.4, Bmake.inc at bottom of this post.

Running xFbtest_MPI-LINUX-1 with 4 processes, it keeps getting stuck. Exactly where varies a little. I attached my debugger to the processes for one run, to see what was what.
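
(The attaching itself is nothing clever - for a hung run I just attach to the existing PIDs. If anyone wants to catch a run from the start instead, the usual self-advertising pause does the job; a rough sketch of that trick in plain MPI C, not BLACS code:)

/* attach_here.c - a common trick (nothing BLACS-specific) for attaching
 * a debugger to MPI processes: each rank prints its host and PID, then
 * spins until the debugger clears the flag.  From another terminal:
 *   gdb -p <pid>,  then  set var go = 1  and  continue.
 */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank;
    volatile int go = 0;             /* the debugger sets this to 1 */
    char host[256];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    gethostname(host, sizeof(host));
    printf("rank %d waiting on %s, pid %d\n", rank, host, (int)getpid());
    fflush(stdout);

    while (!go)                      /* park here until the debugger arrives */
        sleep(1);

    /* ... the real work (e.g. the call that hangs) would go here ... */

    MPI_Finalize();
    return 0;
}

Anyway, here's what I found: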

2 processes were stuck at: igsum2d_.c:170 -> PMPI_Allreduce (OpenMPI)
2 processes were stuck at: igsum2d_.c:163 -> PMPI_Reduce (OpenMPI)

The relevant section of igsum2d_.c is (including line numbers):

158    switch(ttop)
159    {
160    case ' ':   /* use MPI's reduction by default */
161       if (dest != -1)
162       {
163          BI_MPI_Reduce(bp->Buff, bp2->Buff, bp->N, bp->dtype, BI_MPI_SUM,
164                        dest, ctxt->scp->comm, ierr);
165          if (ctxt->scp->Iam == dest)
166             BI_ivmcopy(Mpval(m), Mpval(n), A, tlda, bp2->Buff);
167       }
168       else
169       {
170          BI_MPI_Allreduce(bp->Buff, bp2->Buff, bp->N, bp->dtype, BI_MPI_SUM,
171                           ctxt->scp->comm, ierr);
172          BI_ivmcopy(Mpval(m), Mpval(n), A, tlda, bp2->Buff);
173       }
174       if (BI_ActiveQ) BI_UpdateBuffs(NULL);
175       return;
176       break;

The processes have clearly branched due to an if/else test. This seems to be because they were called with *cdest = { 0, 1, -1, -1 } respectively. Now, it strikes me as rather odd that two different collective MPI routines are being called at the same time... or is this part of a test showing that two different communicators can work at the same time?
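
For reference, here's what a genuine mismatch would look like in plain MPI C (a toy sketch, not taken from the BLACS sources): if ranks of the same communicator disagree about which collective to call, the run simply hangs, whereas two different communicators can be in two different collectives at the same time without any problem.

/* collective_mismatch.c - toy sketch (not from BLACS): all ranks of a
 * communicator must call the same collective.  Here even ranks call
 * MPI_Reduce while odd ranks call MPI_Allreduce on the same
 * communicator, which is erroneous and in practice usually hangs -
 * exactly what a divergent if/else around a collective would produce.
 *
 * Build:  mpicc -o collective_mismatch collective_mismatch.c
 * Run:    mpirun -np 4 ./collective_mismatch   (expected to hang)
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, in = 1, out = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank % 2 == 0)
        /* even ranks: reduction to root 0 ... */
        MPI_Reduce(&in, &out, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    else
        /* ... odd ranks: a different collective on the same communicator */
        MPI_Allreduce(&in, &out, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    printf("rank %d: finished (not expected to be reached)\n", rank);
    MPI_Finalize();
    return 0;
}

So if those four processes are actually in separate row/column communicators (I'm guessing that's what ctxt->scp->comm is about), the Reduce/Allreduce split may be perfectly legal; if they share a communicator, it would explain a hang.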

I hope this is interesting for someone...



Bmake.inc

SHELL = /bin/sh
BTOPdir = /nobackup/issmcd/blacs/BLACS
COMMLIB = MPI
PLAT = LINUX
BLACSdir = $(BTOPdir)/LIB
BLACSDBGLVL = 1
BLACSFINIT = $(BLACSdir)/blacsF77init_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a
BLACSCINIT = $(BLACSdir)/blacsCinit_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a
BLACSLIB = $(BLACSdir)/blacs_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a
MPIdir = $(MPI_HOME)
MPILIBdir = $(MPIdir)/lib/
MPIINCdir = $(MPIdir)/include
MPILIB =
BTLIBS = $(BLACSFINIT) $(BLACSLIB) $(BLACSFINIT) $(MPILIB)
INSTdir = $(BTOPdir)/INSTALL/EXE
TESTdir = $(BTOPdir)/TESTING/EXE
FTESTexe = $(TESTdir)/xFbtest_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL)
CTESTexe = $(TESTdir)/xCbtest_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL)
SYSINC = -I$(MPIINCdir)
INTFACE = -DAdd_
SENDIS =
BUFF =
TRANSCOMM = -DUseMpi2
WHATMPI =
SYSERRORS =
DEBUGLVL = -DBlacsDebugLvl=$(BLACSDBGLVL)
DEFS1 = -DSYSINC $(SYSINC) $(INTFACE) $(DEFBSTOP) $(DEFCOMBTOP) $(DEBUGLVL)
BLACSDEFS = $(DEFS1) $(SENDIS) $(BUFF) $(TRANSCOMM) $(WHATMPI) $(SYSERRORS)
F77 = mpif77 -O0 -g
F77NO_OPTFLAGS = -O0
F77FLAGS = $(F77NO_OPTFLAGS)
F77LOADER = $(F77)
F77LOADFLAGS =
CC = mpicc -O0 -g
CCFLAGS = -O0 -fPIC
CCLOADER = $(CC)
CCLOADFLAGS =
ARCH = ar
ARCHFLAGS = r
RANLIB = ranlib

Re: MVAPICH: BLACS tester infinite loop - nonblock send issue?

Postby MarkDixon » Fri Aug 26, 2011 11:52 am

Whoops, forgot to say that the last line printed was "INTEGER SUM TESTS: BEGIN". I had already killed the processes, but I still had a debugger open, and it confirmed that at least one process was at line 11930 of blacstest.f.

Re: MVAPICH: BLACS tester infinite loop - nonblock send issue?

Postby rodney » Fri Aug 26, 2011 12:41 pm

Mark,

I will figure out what is going on and get back to you.

By the way, which version of OpenMPI 1.4 are you using? I got it to work with OpenMPI 1.4.3.

Rodney

Re: MVAPICH: BLACS tester infinite loop - nonblock send issue?

Postby MarkDixon » Wed Aug 31, 2011 11:22 am

Hi Rodney,

I've got a bit further with this. The specific OpenMPI version I was having trouble with was 1.4. Looking at the changelog, it seems this was a known issue that has been fixed in 1.4.1 and higher:

- Fix a shared memory "hang" problem that occurred on x86/x86_64
platforms when used with the GNU >=4.4.x compiler series.


This resolves all of my OpenMPI issues :)

This left my MVAPICH2 1.4 problems to look at. I've built BLACS against a number of compilers and checked to see if the testers hung, with the following results:

64-bit Intel 12.0.2 - yes
64-bit Intel 11.1.059 - no
64-bit GNU 4.1.2, 4.2.3, 4.4.5 - yes
64-bit PGI 10.0, 11.3 - no
32-bit all - no

Getting the debugger out again, I saw that the tester programs were blocking in SMP MPI routines. Working on the assumption that an asynchronous buffer was filling up, I upped the buffer holding "eager" messages from 128 KB to 1 MB (by setting the environment variable SMPI_LENGTH_QUEUE=1024; the value is in KB). This got the testers to complete with all compilers.
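
One way I know of to check whether code is leaning on that kind of internal buffering is to force the sends to be synchronous - in plain MPI C that's just swapping MPI_Send for MPI_Ssend. A toy sketch of the idea (not a suggestion to patch BLACS):

/* unsafe_exchange.c - toy sketch (plain MPI C, nothing to do with BLACS
 * internals): both ranks send first and receive second.  This completes
 * only if the MPI library buffers at least one of the sends (the eager
 * protocol); change MPI_Send to MPI_Ssend and it deadlocks every time,
 * which is a handy way to find code that silently depends on buffering.
 *
 * Build:  mpicc -o unsafe_exchange unsafe_exchange.c
 * Run:    mpirun -np 2 ./unsafe_exchange
 */
#include <mpi.h>
#include <stdio.h>

#define LEN 1024

int main(int argc, char **argv)
{
    int rank, peer;
    char sendbuf[LEN] = {0}, recvbuf[LEN];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    peer = 1 - rank;                 /* assumes exactly 2 ranks */

    /* Both ranks send before receiving: "works" only while the messages
     * fit in the library's eager buffers. */
    MPI_Send(sendbuf, LEN, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
    MPI_Recv(recvbuf, LEN, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);

    printf("rank %d: exchange completed\n", rank);
    MPI_Finalize();
    return 0;
}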

On the assumption that I've either avoided exhausting MVAPICH2's internal buffering for non-blocking sends, or merely hidden a deeper MVAPICH2 problem (rather than an issue with BLACS itself), I think this resolves my MVAPICH2 issues.

My sincere thanks for helping me look at this - you've been a great help.

Best wishes,

Mark

