Hi Ashton,
I am glad the reply is helpful.
Yes, the original matrix A has to be square (NxN). The
local are in packed format of size DESCA( 6 ) x DESCA( 4 ).
You can not have varying DESCA(4) for the different processors.
If for example N = 10 and you distribute among 3 processors (see the
example below), NB = DESCA(4) should be 4. The "first" 2 processors
will have 4 columns of A each, and the last one will have 2.
I compile and run the example without the problems that you have.
I link with scalapack, blas, atlas, mpi, and several blacs libraries.
The easiest way to figure out how to compile is to go to
your main scalapack directory and do
make exe
This will create your testing routines. They should work and
if they do, link your program as done in creating the testing
routines. I hope this will work.
Below is one more example.
Stan
P.S. I compile and run with something like
ifort -o testing PDGBTRF_test2.f libscalapack.a blacsF77init_MPI-LINUX-0.a blacs_MPI-LINUX-0.a blacsF77init_MPI-LINUX-0.a libf77blas.a libmpich.a
mpirun -np 3 testing
- Code: Select all
PROGRAM PDGBTRF_EXAMPLE
*
* This is an example of using PDGBTRF and PDGBTRS.
* A matrix of size 10x10 is distributed on a 1x3 process
* grid, factored, and solved in parallel.
*
*
INTEGER ICTXT, INFO, MYCOL, MYROW, NPCOL, NPROW
INTEGER BWL, BWU, N, NB, LAF, LWORK
PARAMETER (BWL = 1, BWU = 1, NB = 4, N = 10)
PARAMETER (LAF=(NB+BWU)*(BWL+BWU)+6*(BWL+BWU)*(BWL+2*BWU))
PARAMETER (LWORK = LAF)
INTEGER DESCA(7), DESCB(7), IPIV(NB), FILL_IN
DOUBLE PRECISION A(1+2*BWL+2*BWU,NB),B(NB),AF(LAF),WORK(LWORK)
*
* INITIALIZE THE PROCESS GRID
*
NPROW = 1
NPCOL = 3
CALL SL_INIT( ICTXT, NPROW, NPCOL )
CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL )
*
* DISTRIBUTE THE MATRIX ON THE PROCESS GRID
* Initialize the array descriptors for the matrices A and B
*
DESCA( 1 ) = 501 ! descriptor type
DESCA( 2 ) = ICTXT ! BLACS process grid handle
DESCA( 3 ) = N ! number of rows in A
DESCA( 4 ) = NB ! Blocking factor of the distribution
DESCA( 5 ) = 0 ! size of block rows
DESCA( 6 ) = 1+2*BWL+2*BWU ! leading dimension of A
DESCA( 7 ) = 0 ! process row for 1st row of A
DESCB( 1 ) = 502 ! descriptor type
DESCB( 2 ) = ICTXT ! BLACS process grid handle
DESCB( 3 ) = N ! number of rows in B
DESCB( 4 ) = NB ! Blocking factor of the distribution
DESCB( 5 ) = 0 ! size of block rows
DESCB( 6 ) = NB ! leading dimension of B
DESCB( 7 ) = 0 ! process row for 1st row of B
*
* Generate matrices A and B and distribute them to the process grid
*
* P1 P2 P3 X B
*
* / 1 1 . . \ / 1 \ / 2 \
* | 1 1 1 . . | | 1 | | 3 |
* | 1 1 2 . . | | 1 | | 4 |
* | 1 2 . 2 . | | 1 | | 5 |
* | 2 . 2 2 . | | 1 | = | 6 |
* | . 2 2 2 . | | 1 | | 6 |
* | . 2 2 3 . | | 1 | | 7 |
* | . 2 3 . 3 | | 1 | | 8 |
* | . 3 . 3 3 | | 1 | | 9 |
* \ . . 3 3 / \ 1 / \ 6 /
*
*
FILL_IN = BWL+BWU
IF (MYCOL == 0) THEN
A(2+FILL_IN,1) = 1.d0
A(3+FILL_IN,1) = 1.d0
A(1+FILL_IN,2) = 1.d0
A(2+FILL_IN,2) = 1.d0
A(3+FILL_IN,2) = 1.d0
A(1+FILL_IN,3) = 1.d0
A(2+FILL_IN,3) = 1.d0
A(3+FILL_IN,3) = 1.d0
A(1+FILL_IN,4) = 2.d0
A(2+FILL_IN,4) = 2.d0
A(3+FILL_IN,4) = 2.d0
B(1) = 2.d0
B(2) = 3.d0
B(3) = 4.d0
B(4) = 5.d0
ELSE IF (MYCOL == 1) THEN
A(1+FILL_IN,1) = 2.d0
A(2+FILL_IN,1) = 2.d0
A(3+FILL_IN,1) = 2.d0
A(1+FILL_IN,2) = 2.d0
A(2+FILL_IN,2) = 2.d0
A(3+FILL_IN,2) = 2.d0
A(1+FILL_IN,3) = 2.d0
A(2+FILL_IN,3) = 2.d0
A(3+FILL_IN,3) = 2.d0
A(1+FILL_IN,4) = 3.d0
A(2+FILL_IN,4) = 3.d0
A(3+FILL_IN,4) = 3.d0
B(1) = 6.d0
B(2) = 6.d0
B(3) = 7.d0
B(4) = 8.d0
ELSE IF (MYCOL == 2) THEN
A(1+FILL_IN,1) = 3.d0
A(2+FILL_IN,1) = 3.d0
A(3+FILL_IN,1) = 3.d0
A(1+FILL_IN,2) = 3.d0
A(2+FILL_IN,2) = 3.d0
B(1) = 9.d0
B(2) = 6.d0
END IF
*
* Perform LU factorization
*
CALL PDGBTRF( N, BWL, BWU, A(1,1), 1, DESCA, IPIV,
$ AF, LAF, WORK, LWORK, INFO )
IF (INFO/=0) THEN
write(*,*) 'Info flag from PDGBTRF = ',INFO, ', Col = ',MYCOL
GOTO 100
END IF
*
* Solve using the LU factorization from PDGBTRF
*
CALL PDGBTRS('N', N, BWL, BWU, 1, A(1,1), 1, DESCA, IPIV,
$ B(1), 1, DESCB, AF, LAF, WORK, LWORK, INFO)
IF (INFO/=0) THEN
write(*,*) 'Info flag from PDGBTRS = ',INFO, ', Col = ',MYCOL
GOTO 100
END IF
write(*,200) 'X(processor ',MYCOL,') = ',B(1:DESCB(4))
200 format((a12,I3,a4,4(f10.2,2x)))
*
* Release the process grid and exit BLACS
*
100 CALL BLACS_GRIDEXIT( ICTXT )
CALL BLACS_EXIT( 0 )
STOP
END