I'm writing a program to solve AX=B for a block septadiagonal matrix. To do this, I'm treating the matrix as sparsely banded and am trying to use the ScaLAPACK routine PDGBSV to solve for X. Unfortunately, I noticed a weird memory issue where the routine seemed to be overwriting arrays outside of those passed to it (though likely adjacent in memory). Nevertheless, it appeared to give the correct answer for X. To investigate, I tried to write a reduced version of the code that recreates the problem:
Code:
program SLP_test
use mpi
implicit none
integer :: Ntot, Nbw, INFO
integer, external :: NUMROC
real*8, dimension(:,:), allocatable :: A
real*8, dimension(:), allocatable :: X
integer :: ictxt
integer :: myrow, mycol
integer :: nprow, npcol
integer :: myid, numprocs
integer, dimension(7) :: desc_A, desc_X
integer, dimension(:), allocatable :: IPIV
real*8, dimension(:), allocatable :: WORK
integer :: LWORK
real*8, dimension(:,:), allocatable :: IdKk
integer :: i, j, q, k, p
integer, parameter :: Ni = 14, Ll = 1, Mm = 1
integer :: Kk, Nb, Npb
character, parameter :: tab = char(9)
! BLACS/MPI initialization: get this process's id and the total process count
CALL BLACS_PINFO(myid, numprocs)
!****************************************************************************
! Initialize Size Variables
!****************************************************************************
! k stores all (l,m) combos and runs from 1 to Kk
Kk = (Ll+1)*(2*Mm+1) ! l runs from 0 to Ll, m runs from 0 to 2*Mm
! each block is 3*Kk by 3*Kk (one for each of h, Phi, Psi)
Nb = 3*Kk
! There are Ni+3 blocks in the full matrix, as i runs from -1 to Ni+1
Ntot = (Ni+3)*Nb
! Bandwidth: 1 diagonal block plus 3 off-diagonal blocks
! in each direction for each row, less the diagonal element
! N.B.: bandwidth the same for upper and lower
Nbw = 4*Nb - 1
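! Concrete sizes for the parameters above (just the arithmetic spelled out):
! Kk = 2*3 = 6, Nb = 3*6 = 18, Ntot = 17*18 = 306, Nbw = 4*18 - 1 = 71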
! Number of columns stored per process (block size for the 1-D block-column
! distribution), rounded up if Ntot is not divisible by numprocs
Npb = Ntot/numprocs
if (mod(Ntot,numprocs)/=0) Npb = Npb+1
!****************************************************************************
! Allocate Arrays
!****************************************************************************
allocate(A(-3*Nbw:Nbw,Npb), STAT = INFO)
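! (Rows -3*Nbw..Nbw give 4*Nbw+1 rows in total, matching desc_A(6) below;
! as I understand the banded solver, the rows beyond the 2*Nbw+1-wide band
! are fill-in space PDGBSV needs during the factorization.)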
A = 0.d0
allocate(X(Npb), STAT = INFO)
X = 0.d0
allocate(IPIV(Npb), STAT = INFO)
IPIV = 0
! LWORK from the ScaLAPACK documentation for PDGBSV,
! with bwl = bwu = Nbw, NB = Npb, and NRHS = 1
LWORK = (Npb+Nbw)*(Nbw+Nbw) + 6*(Nbw+Nbw)*(Nbw+2*Nbw) + Npb + 2*Nbw + 4*Nbw
allocate(WORK(LWORK), STAT = INFO)
WORK = 0.d0
allocate(IdKk(Kk,Kk), STAT = INFO)
IdKk = 0.d0
!****************************************************************************
! Setup for ScaLAPACK routine
!****************************************************************************
! Make block-column distribution for A
nprow = 1
npcol = numprocs
CALL SL_INIT(ictxt, nprow, npcol)
! Get this processor's processor grid row and column
CALL BLACS_GRIDINFO(ictxt,nprow,npcol,myrow,mycol)
! If processor isn't used in this grid, then it's finished
if (myrow==-1) then
   CALL BLACS_EXIT(0)
   stop
end if
! Setup array descriptor for A (type 501: 1-D block-column distributed banded matrix)
desc_A(1) = 501                             ! descriptor type
desc_A(2) = ictxt                           ! BLACS context
desc_A(3) = Ntot                            ! global number of columns
desc_A(4) = Npb                             ! column block size
desc_A(5) = 0                               ! process holding the first block
desc_A(6) = 4*Nbw+1                         ! leading dimension of the local array
desc_A(7) = 0                               ! reserved
! Setup array descriptor for X (type 502: 1-D block-row distributed right-hand side)
desc_X(1) = 502                             ! descriptor type
desc_X(2) = ictxt                           ! BLACS context
desc_X(3) = Ntot                            ! global number of rows
desc_X(4) = Npb                             ! row block size
desc_X(5) = 0                               ! process holding the first block
desc_X(6) = NUMROC(Ntot,Npb,mycol,0,npcol)  ! leading dimension (local length of X)
desc_X(7) = 0                               ! reserved
!****************************************************************************
! Initialize Arrays
!****************************************************************************
! Identity Matrix
do k=1,Kk
   IdKk(k,k) = 1.d0
end do
! Source
X(1:desc_X(6)) = 1.d0
! Matrix
A(0,1:desc_X(6)) = 2.d0
CALL PDGBSV(Ntot,Nbw,Nbw,1,A,1,desc_A,IPIV,X,1,desc_X,WORK,LWORK,INFO)
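! Check the PDGBSV error flag (a nonzero INFO would point to an argument/setup problem)
if (INFO /= 0) write(0,*) 'PDGBSV returned INFO = ', INFO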
do p = 0,numprocs-1
   if (myid==p) then
      do i = 1, desc_X(6)
         write(UNIT=0,FMT='(es13.6)') X(i)
      end do
   end if
   CALL BLACS_BARRIER(ictxt,'A')
end do
CALL BLACS_EXIT(0)
end program SLP_test
When I run this program, instead of the memory-overwrite issue, I get some MPI errors followed by a seg fault:
*** An error occurred in MPI_Reduce
[...] *** on communicator MPI_COMM_WORLD
[...] *** MPI_ERR_COMM: invalid communicator
[...] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 9524 on node ... exited on signal 11 (Segmentation fault).
If I declare IPIV() as an array of real*8 instead of integer, the code runs properly. I only found that by accident, and I don't know why it would work like that. Maybe the size of IPIV() is too small, and declaring it as real*8 gives it the additional memory it needs? I also don't know whether the seg fault here is related to the memory-overwrite issue in my larger code, but I assume it is.
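For what it's worth, one way to test the "IPIV is too small" theory without changing its type would be to over-allocate the integer array, e.g. by a few bandwidths (the amount of padding here is just a guess on my part, not something taken from the PDGBSV documentation):
Code:
allocate(IPIV(Npb + 4*Nbw), STAT = INFO)   ! padded size is a guess, not from the docs
IPIV = 0
If that also makes the seg fault go away, it would at least suggest the problem is the size of IPIV rather than its type.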
Any help is greatly appreciated. Thanks in advance!