scalapack outofcore

Post by elyip » Wed Aug 09, 2006 9:10 pm

I have an application in which I need to divide my processors into groups, with each group solving a different matrix equation. The groups don't need to communicate with each other.

In the example I attach (compiled with -DNOTWORK and run on 4 processors), I divide my processors into 2 groups: group 1 has only processor 0, with a 1x1 grid, and group 2 has processors 1 to 3, with a 1x3 grid. Processor 0 generates a 324 x 324 matrix and processors 1 to 3 each generate a 324 x 108 matrix, and I try to use zlawrite to write these into their corresponding out-of-core matrices. But processors 1 to 3 do not return from zlawrite.
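The column split described above can be checked with a small standalone program. This is only a sketch of the arithmetic: it reuses the formulas from the listing in this post (ceiling division for ncol, then start_col and end_col per rank), but the program name and the loop over ranks are illustrative and not part of the attached example. It needs neither MPI nor ScaLAPACK.

```fortran
program column_ranges
  implicit none
  ! Values taken from the post: a 324-column matrix split over the
  ! three processes (ranks 1 to 3) of the second group, whose color is 1.
  integer, parameter :: me = 324, npcole = 3, color = 1
  integer :: rank, ncol, start_col, end_col

  ! Ceiling division, so that a remainder column is not dropped.
  ncol = (me + npcole - 1) / npcole

  do rank = color, color + npcole - 1
     ! Same formulas as in the listing: first and last global column
     ! owned by this rank.
     start_col = (rank - color)*ncol + 1
     end_col   = min(start_col + ncol - 1, me)
     print '(A,I0,A,I0,A,I0)', 'rank ', rank, ': columns ', &
          start_col, ' to ', end_col
  end do
end program column_ranges
```

With these values each of the three ranks owns 108 columns, which matches the 324 x 108 local matrices mentioned above.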

When I compile the attached code without the -DNOTWORK flag, all the processors work on one matrix problem and the code works correctly. I suppose this is the common usage. But I do need to have more than one problem factored/solved simultaneously.

I have tried this example both on a Dell 670 (hyperthreaded, SUSE 9.3, MPICH 1.2.7) and an SGI O2000. The results are the same.

I appreciate your help.
Example :

implicit none

integer, parameter :: M = 324, N = 162, mb = 64, nb = mb
integer, parameter :: rsrc = 0, csrc = 0, dlen_ = 9
integer, parameter :: flen_= 11
integer :: ictxt, rank, tproc
integer :: nprow, npcol, myrow, mycol

integer, external :: numroc
integer :: i, info
integer, dimension(0:3):: umap
integer :: DESCA( FLEN_ )
integer :: mxlocr_Me, mxlocc_Me, mxlld_Me, desc_Me(dlen_)
complex*16, allocatable ::Ae(:,:)
real*8, allocatable :: Be(:,:)
character (len=1) :: filetype
character (len=80) :: filename
integer :: myrowe,mycole,Me,mine,npcole
integer :: iodev,mmb,nnb,asize,color,ncol,start_col,end_col

call blacs_pinfo(rank, tproc)
call blacs_get(-1,0,ictxt)
call blacs_get(-1,0,mine)
nprow = int(sqrt(real(tproc+0.0001)))
npcol = tproc/nprow
#ifdef NOTWORK
! Two independent groups: rank 0 alone, and ranks 1 to tproc-1.
if (rank == 0) then
nprow = 1
npcol = 1
color = 0
else
nprow = 1
npcol = tproc - 1
color = 1
end if
npcole = npcol
call blacs_gridmap(ictxt,umap(color),1,nprow, npcol)
call blacs_gridmap(mine,rank,1,1,1)
#else
! Common usage: all processes share a single grid.
color = 0
call blacs_gridinit(ictxt,'r',nprow, npcol)
call blacs_gridinit(mine ,'r',1, tproc)
npcole = tproc
#endif

mmb = mb*nprow
nnb = nb*npcol
asize = -1
iodev = 98
filename = "LU.DAT" // CHAR( 0 )

call blacs_gridinfo(ictxt,nprow,npcol,myrow,mycol)
print *, 'rank',rank, 'myrow', myrow, 'mycol',mycol

if (info .NE. 0 ) then
print *, 'descinit error, info =',info
end if

Me = M

mxlocr_Me = numroc(Me, nb,myrowe,rsrc,1)
mxlocc_Me = numroc(Me/npcol,nb,mycole,rsrc,1)
mxlld_Me = max(1,mxlocr_Me)
print*, 'rank',rank, 'mxlocr_Me', mxlocr_Me, 'mxlocc_Me',mxlocc_Me
ncol = Me/npcole
if (ncol*npcole < M) ncol = ncol + 1
start_col = (rank-color)*ncol + 1
end_col = min(start_col+ncol - 1, Me)
print*, 'rank', rank, 'start_col', start_col, 'end_col', end_col
call descinit(desc_Me,Me,ncol,nb,nb,rsrc,rsrc,mine,mxlld_Me,info)
if (info .NE. 0 ) then
print *, 'descinit error, info =',info
end if

call pfdescinit( desca, m, m, mb, nb, rsrc, csrc, &
ictxt, iodev, filetype, mmb, nnb, &
asize, filename, info )
if (info .NE. 0 ) then
print *, 'pfdescinit error, info =',info
end if

call random_number(Be)
do i = start_col,end_col
Ae(i,i) = Ae(i,i) + 100.00
end do
call zlawrite(iodev,Me,end_col-start_col+1,1,1,Ae(1,start_col),1,1,desc_Me,info)
if (info .NE. 0 ) then
print *, 'zlawrite error, info =',info
end if
print*, ' zlawrite returned from rank ', rank
call blacs_gridexit(ictxt)
call blacs_gridexit(mine)
call blacs_exit(0)

out of core scalapack

Post by efdazedo » Thu Aug 17, 2006 1:46 pm

You are correct that the out-of-core ScaLAPACK was originally designed to solve very large problems that cannot fit entirely in memory, and all processors were expected to participate in the computation.

I have not fully tested the case where subgroups of processors work on separate out-of-core problems. Perhaps your problems are small enough to fit entirely in memory, so that the in-core ScaLAPACK solvers could be used?

One thing to consider is that each subgroup of processors needs its own out-of-core file; otherwise there will be a name clash, which may give incorrect results.

You might try giving different file names to the subgroups of processors. Currently it seems that all subgroups use the same name, and this may not be what you want:

filename = "LU.DAT" // CHAR( 0 )

Just a suggestion.
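One way to get a distinct file per subgroup is to build the name from the group index. The following is a minimal sketch, assuming the group index is available as color (as in the posted example); the LU_<color>.DAT naming pattern and the program name are made up for illustration and are not something the out-of-core package requires.

```fortran
program group_filename
  implicit none
  integer, parameter :: ngroups = 2   ! two groups, as in the posted example
  integer :: color
  character(len=80) :: filename

  do color = 0, ngroups - 1
     ! Embed the group index in the name (hypothetical pattern), then
     ! append the terminating null character as the posted code does.
     write(filename, '(A,I0,A)') 'LU_', color, '.DAT'
     filename = trim(filename) // char(0)
     ! Print the name without the trailing null character.
     print '(A)', filename(1:len_trim(filename) - 1)
  end do
end program group_filename
```

In a real run each process would build only the name for its own group and pass it to pfdescinit in place of the fixed "LU.DAT".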

Post by elyip » Thu Aug 17, 2006 9:27 pm

Thank you very much for your reply. I do give different names to the different matrix problems in my application program.

By the way, I traced the problem to pzgemr2d, which is called by pzgemr2do, which is called by fpzgemr2d, which is called by zlawrite. The last argument of pzgemr2d is a context containing both matrices A and B. This argument is missing from pzgemr2do, which uses a global context when it calls pzgemr2d. However, the following statement appears in the documentation of pzgemr2d:

"Be aware that all processors included in this
context must call the redistribution routine."
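A possible way around this, sketched below, is to let the caller supply the context instead of having the wrapper assume a global one. The subroutine name zgemr2d_in_group and its argument list are hypothetical (the real interface of pzgemr2do is not shown in this thread); only the call to pzgemr2d follows the ScaLAPACK redistribution interface. The group_ctxt argument is assumed to be a BLACS context that covers exactly the processes holding A and B, so that only the processes of one group have to make the call.

```fortran
! Hypothetical wrapper: plays the role of pzgemr2do, but takes the
! context as an argument instead of using a global one.
subroutine zgemr2d_in_group(m, n, a, desca, b, descb, group_ctxt)
  implicit none
  integer, intent(in)       :: m, n         ! size of the block to copy
  integer, intent(in)       :: desca(*)     ! descriptor of source A
  integer, intent(in)       :: descb(*)     ! descriptor of destination B
  integer, intent(in)       :: group_ctxt   ! context spanning A's and B's grids
  complex*16, intent(in)    :: a(*)
  complex*16, intent(inout) :: b(*)
  external :: pzgemr2d

  ! Copy the m x n block starting at (1,1) of A into B starting at (1,1).
  ! Every process in group_ctxt must make this call, and no other
  ! process is expected to.
  call pzgemr2d(m, n, a, 1, 1, desca, b, 1, 1, descb, group_ctxt)
end subroutine zgemr2d_in_group
```

Building this requires linking against ScaLAPACK and BLACS, and this sketch has not been tested with the out-of-core package.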

Julien provided some example code for solving more than one in-core problem simultaneously, in response to the thread "Can scalapack be used in explicit MPI?" I turned the main program testzdriver.f, which came with outofcore.tgz, into a subroutine that solves one problem, and modified Julien's code to call this subroutine.