SCALAPACK pdpbtrf error

Post by jamabr » Thu Jun 09, 2016 8:54 am

Dear all,

for a current problem I tried to use pdpbtrf to solve a system with a positive definite band matrix. For testing purposes, I started with the Cholesky factorization. I tried to integrate the routine into my environment, but it failed with the error "On entry to PDPBTRF, D&C alg.: NB too small parameter number 604 had an illegal value". I checked everything I could think of, and it all seemed fine.
So I tried the simple example from http://www.ibm.com/support/knowledgecen ... lpbtrf.htm , Example 1. For the given input matrix, my implementation derived exactly the same parameters for the function call and the descriptors. Nevertheless, pdpbtrf fails again with "On entry to PDPBTRF, D&C alg.: NB too small parameter number 604 had an illegal value". Has anybody used the routine successfully? Or is there an error in the example that misinterprets some of the parameters the same way I did?

To summarize, I call PDPBTRF in this case on all processes with

0: L _ 9 _ 7 _ 0x14ea2f0 _ 1 _ 0x14ea2a0 _ 0x14e6a10 _ 119 _ 0x14e95c0 _ 0 _ 123
1: L _ 9 _ 7 _ 0x19e92d0 _ 1 _ 0x19e50c0 _ 0x19e4be0 _ 119 _ 0x19e85f0 _ 0 _ 123
2: L _ 9 _ 7 _ 0x24c32d0 _ 1 _ 0x24bf0c0 _ 0x24bebe0 _ 119 _ 0x24c25f0 _ 0 _ 123

and desc is

0: 501 _ 0 _ 9 _ 3 _ 0 _ 8 _ 0
1: 501 _ 0 _ 9 _ 3 _ 0 _ 8 _ 0
2: 501 _ 0 _ 9 _ 3 _ 0 _ 8 _ 0
This results in
{ 0, 0}: On entry to PDPBTRF, D&C alg.: NB too small parameter number 604 had an illegal value
{ 0, 1}: On entry to PDPBTRF, D&C alg.: NB too small parameter number 604 had an illegal value
{ 0, 2}: On entry to PDPBTRF, D&C alg.: NB too small parameter number 604 had an illegal value
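
A note on reading the message: ScaLAPACK reports an illegal entry j of an array argument i as the parameter number i*100+j, so 604 would point at entry 4 of argument 6. The sketch below is a minimal illustration in Python, not part of the original post. It maps that number onto the argument order used in the call above and onto the fields of a type-501 descriptor. The helper name decode_param_number and the two name lists are assumptions introduced here for illustration only.

```python
# Argument order of PDPBTRF, matching the eleven values printed in the call above.
PDPBTRF_ARGS = ["UPLO", "N", "BW", "A", "JA", "DESCA",
                "AF", "LAF", "WORK", "LWORK", "INFO"]

# Entries of a type-501 (1D block-column) descriptor, in order.
DESC_501_FIELDS = ["DTYPE", "CTXT", "N", "NB", "CSRC", "LLD", "reserved"]


def decode_param_number(code):
    """Split a parameter number i*100+j into (argument name, entry name).

    Hypothetical helper: numbers below 100 refer to a scalar argument,
    larger numbers to entry j of array argument i.
    """
    if code < 100:
        return PDPBTRF_ARGS[code - 1], None
    arg_index, entry_index = divmod(code, 100)
    arg = PDPBTRF_ARGS[arg_index - 1]
    # Only the descriptor has named entries in this sketch.
    entry = DESC_501_FIELDS[entry_index - 1] if arg == "DESCA" else str(entry_index)
    return arg, entry


# Descriptor values as printed in the post.
desc = dict(zip(DESC_501_FIELDS, [501, 0, 9, 3, 0, 8, 0]))

print(decode_param_number(604))  # ('DESCA', 'NB')
print(desc["NB"])                # 3
```

Read this way, the complaint is about the block size NB = 3 stored in the descriptor, not about the matrix data or the workspace sizes.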

Does anybody have an idea? I am not new to ScaLAPACK and have been using it for several years, but this is the first time I am using band matrices...
Thanks a lot in advance,
Jan Martin
jamabr
 
Posts: 7
Joined: Thu May 27, 2010 6:53 am
Location: Bonn, Germany

Re: SCALAPACK pdpbtrf error

Post by jamabr » Thu Jun 09, 2016 11:28 am

Being able to read clearly gives you an advantage. After looking into the code and seeing line 477, I thought about what the message could mean. I then found the restrictions section, which states:

* Blocksize cannot be too small:
* If the matrix spans more than one processor, the following
* restriction on NB, the size of each block on each processor,
* must hold:
* NB >= 2*BW
* The bulk of parallel computation is done on the matrix of size
* O(NB) on each processor. If this is too small, divide and conquer
* is a poor choice of algorithm.
*

Taking this into account solved the problem: in my call NB = 3 and BW = 7, so NB >= 2*BW was violated. Sorry for that.
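
The quoted restriction can be checked before calling the routine. The following is a small sketch in Python, not taken from the ScaLAPACK source. The function name nb_is_large_enough is hypothetical, and the sketch assumes JA = 1 and reads "the matrix spans more than one processor" as NB < N for the 1D block-column distribution.

```python
def nb_is_large_enough(n, bw, nb):
    """Return True if block size nb satisfies the quoted restriction.

    Assumption (not from the source): with JA = 1, the matrix spans
    more than one process exactly when nb < n.
    """
    spans_multiple_processes = nb < n
    # The restriction NB >= 2*BW only applies when the matrix is split up.
    return (not spans_multiple_processes) or nb >= 2 * bw


# Values from the failing call: N = 9, BW = 7, NB = 3.
print(nb_is_large_enough(9, 7, 3))    # False
# A block size of at least 2*BW = 14 passes the check.
print(nb_is_large_enough(9, 7, 14))   # True
```

With N = 9 and BW = 7, any NB that satisfies NB >= 2*BW is at least 14 and therefore larger than N. Under the reading assumed here, this small test matrix can only pass the check when all of its columns sit on a single process.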

