[Scalapack] Getting CPU performance for pdgemm on a cluster
From: Michael Bader
Date: Thu, 8 May 2008 19:33:32 +0200
Dear ScaLAPACK group,
I would like to compare the performance of parallel matrix multiplication
(pdgemm) in ScaLAPACK's PBLAS library with other approaches on our local
Opteron-Infiniband cluster. Is there any example program that could give
me decent results on that issue?
For a first try, I've checked the PDBLAS3TIM program in the TIMINGS
subdirectory of PBLAS, but got stuck on the syntax of the input file
(PDBLAS3TIM.dat). Is there a more detailed description of the parameters
than the one given in the source file pdblas3tim.f?
I tried to simply change the values of M,N,K (the matrix sizes), but that
leads immediately to incompatible parameter errors.
Is the pdblas3tim code suitable for a solid performance check of matrix
multiplication? I'd like to compare the speed of a single matrix
multiplication (matrix sizes ~15000*15000) on up to 128 processes (32
nodes) with the implementation in the Global Arrays package, and with
my own code.
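To make the three implementations comparable, it may help to convert each measured wall-clock time into a rate using the same operation count. The following is only a rough sketch, not part of ScaLAPACK or Global Arrays: the helper names are hypothetical, it assumes the standard 2*M*N*K operation count for a dense multiply, double-precision (8-byte) entries, an even distribution of the matrices over all processes, and a process grid chosen to be as close to square as possible. The elapsed time in the example is a placeholder, not a measurement.

```python
import math


def gemm_flops(m, n, k):
    """Operation count for C = A*B with A (m x k) and B (k x n):
    one multiply and one add per inner-product term."""
    return 2 * m * n * k


def gflops_rate(m, n, k, seconds):
    """Convert a measured wall-clock time into GFLOP/s."""
    return gemm_flops(m, n, k) / seconds / 1e9


def near_square_grid(nprocs):
    """Return (p, q) with p * q == nprocs and p <= q, p as large as possible.
    Assumption: a near-square grid is a reasonable starting point."""
    p = math.isqrt(nprocs)
    while nprocs % p != 0:
        p -= 1
    return p, nprocs // p


def bytes_per_process(m, n, k, nprocs, word_size=8):
    """Average storage per process for A, B and C, assuming an even
    distribution and 8-byte (double precision) entries."""
    total = (m * k + k * n + m * n) * word_size
    return total / nprocs


if __name__ == "__main__":
    size = 15000          # matrix size mentioned in the post
    nprocs = 128          # process count mentioned in the post
    elapsed = 10.0        # placeholder value, NOT a measured time

    print("process grid:", near_square_grid(nprocs))
    print("flops:", gemm_flops(size, size, size))
    print("MB per process:", bytes_per_process(size, size, size, nprocs) / 1e6)
    print("GFLOP/s at placeholder time:", gflops_rate(size, size, size, elapsed))
```

Using the same operation count for all three codes means that differences in the reported rate reflect only differences in the measured time.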
Any help would be appreciated :-)
With best regards
Dr. Michael Bader Email: bader@Domain.Removed
Scientific Computing in Computer Science Tel.: 089/289-18634
Technische Universität München WWW: http://www5.in.tum.de/~bader