ScaLAPACK Archives

[Scalapack] Getting CPU performance for pdgemm on a cluster

Dear ScaLAPACK group,

I would like to compare the performance of parallel matrix multiplication 
(pdgemm) in ScaLAPACK's PBLAS library with other approaches on our local 
Opteron-Infiniband cluster. Is there an example program that would give 
me reliable timings for this? 

For a first try, I've checked the PDBLAS3TIM program in the TIMINGS 
subdirectory of PBLAS, but got stuck with the syntax of the input file 
(PDBLAS3TIM.dat). Is there a more detailed description of the parameters 
than the one given in the source file pdblas3tim.f?
I tried simply changing the values of M, N, K (the matrix sizes), but that 
immediately leads to incompatible-parameter errors.

Is the pdblas3tim code suitable for a solid performance check for matrix 
multiplication? I'd like to compare the speed of a single matrix 
multiplication (matrix sizes ~15000*15000) on up to 128 processes (32 
nodes) with the implementation in the Global Arrays package, and with 
my own code.
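
To make clear what I mean by "speed": below is a rough sketch in Python of the 
figures I would like to compare. It assumes the conventional 2*M*N*K flop 
count for a dense multiply and a near-square process grid. The helper names 
and the 10-second run time are placeholders of my own, not part of ScaLAPACK 
and not measured values.

```python
import math


def gemm_flops(m, n, k):
    """Flop count for C = A*B with A (m x k) and B (k x n), using the usual 2*m*n*k convention."""
    return 2.0 * m * n * k


def gflops_rate(m, n, k, seconds):
    """Achieved GFLOP/s for one multiply that took `seconds` of wall-clock time."""
    return gemm_flops(m, n, k) / seconds / 1e9


def near_square_grid(nprocs):
    """Return (nprow, npcol) with nprow * npcol == nprocs, nprow <= npcol, as square as possible."""
    nprow = math.isqrt(nprocs)
    while nprocs % nprow != 0:
        nprow -= 1
    return nprow, nprocs // nprow


def local_matrix_mbytes(m, n, nprow, npcol):
    """Approximate MiB per process for one double-precision m x n matrix.

    Ignores the uneven remainder of the block-cyclic distribution.
    """
    return math.ceil(m / nprow) * math.ceil(n / npcol) * 8 / 2**20


if __name__ == "__main__":
    size = 15000
    nprow, npcol = near_square_grid(128)
    placeholder_seconds = 10.0  # hypothetical wall-clock time, not a measurement
    print("process grid:", nprow, "x", npcol)
    print("GFLOP/s:", gflops_rate(size, size, size, placeholder_seconds))
    print("MiB per process per matrix:", local_matrix_mbytes(size, size, nprow, npcol))
```

Dividing the total rate by the number of processes gives a per-process figure 
that can be set against the peak of a single core.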

Any help would be appreciated :-)

With best regards

Michael Bader

Dr. Michael Bader                         Email: bader@Domain.Removed
Scientific Computing in Computer Science  Tel.:  089/289-18634
Technische Universität München            WWW:

