Hi Christof,
in my experience, a fast way to implement an
all-to-all broadcast (all-gather) within a process
row or column is an ordinary ring algorithm built
on the standard BLACS send/recv routines, as in:
result = my_part_of_data
buff   = my_part_of_data
DO I = 1, NPCOL-1
   Send( buff, East )    ! pass the current block to the east neighbour
   Recv( buff, West )    ! receive the next block from the west neighbour
   result = concat( result, buff )
END DO
If you run such an algorithm first in the row
direction of the grid and then in the column
direction (with NPROW-1 iterations and North/South
neighbours in the second pass), it will take
O(P_r+P_c) steps on a P_r x P_c grid. It should be
(almost) the fastest way of doing the all-gather
operation, since it keeps all processes and all
links in each ring busy all the time.
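To make the step count concrete, here is a small sketch that only simulates the ring exchange in plain Python; it does not call BLACS. The function names ring_allgather and grid_allgather are made up for this illustration, and each process is assumed to start with a single labelled item. It runs the rings along the process rows first and then along the process columns, and counts the communication steps.

```python
def ring_allgather(parts):
    """Simulate one ring all-gather among len(parts) processes.

    parts[i] is the list of items initially owned by process i.
    Returns (results, steps), where results[i] is what process i
    holds at the end.
    """
    p = len(parts)
    result = [list(x) for x in parts]
    buff = [list(x) for x in parts]
    for _ in range(p - 1):
        # All processes send buff east and receive from the west at once,
        # so process i ends up with the buffer of process i-1.
        buff = [buff[(i - 1) % p] for i in range(p)]
        for i in range(p):
            result[i].extend(buff[i])
    return result, p - 1


def grid_allgather(pr, pc):
    """Row rings first, then column rings, on a pr x pc process grid."""
    # Process (r, c) initially owns the single item (r, c).
    data = [[[(r, c)] for c in range(pc)] for r in range(pr)]

    # Phase 1: the rings in all process rows run concurrently.
    row_steps = 0
    for r in range(pr):
        data[r], row_steps = ring_allgather(data[r])

    # Phase 2: the rings in all process columns run concurrently.
    col_steps = 0
    for c in range(pc):
        column = [data[r][c] for r in range(pr)]
        gathered, col_steps = ring_allgather(column)
        for r in range(pr):
            data[r][c] = gathered[r]

    return data, row_steps + col_steps


if __name__ == "__main__":
    data, steps = grid_allgather(3, 4)
    print("steps:", steps)
    print("items held by process (0, 0):", len(data[0][0]))
```

In this model a P_r x P_c grid needs (P_c-1) + (P_r-1) steps. Note that the messages in the second phase are larger, since each buffer then carries a whole row's worth of data.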
Hope this helps!
Robert
Christof Voemel wrote:
I have O(n) DOUBLE PRECISION data distributed across p processors, and now I
want a big 'conversation' so that everyone knows everything.
What is the fastest way in ScaLAPACK (BLACS) to do that? Does anyone have
experience with this?
Thank you.
Christof
_______________________________________________
Lapack mailing list
Lapack@Domain.Removed
http://lists.cs.utk.edu/listinfo/lapack
--
Robert Granat
Department of Computing Science
Umeå University
SE-90187 Sweden
Phone: +46(0)907866129
Cellular phone: +46(0)733429554
Email: granat@Domain.Removed
WWW: http://www.cs.umu.se/~granat/index.html
Proverb of today: "Don't worry, it is only
a matrix!" (Author unknown)
--