Hello there,
I have a question concerning the block-cyclic distribution scheme used by ScaLAPACK.
I've built a C++ wrapper around ScaLAPACK and store distributed dense matrices in a 1-D C array on each node. (The right values are already distributed to the right machines.)
That means I've already managed to distribute the values to the right processors, as in the right-hand image.
But I don't really understand in which order the values are then stored in each processor's memory.
As an example:
Right now they are stored block by block, in this order, in the local (1-D) array on proc 00:
[ (1,1)|(1,2)|(2,1)|(2,2)|(5,1)|(5,2)|(1,5)|(2,5)|(5,5) ]
Is this the right order in which the decomposed matrix should be stored in memory?
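
For comparison, here is a minimal C++ sketch of the alternative ordering I would get if the local part were treated as one small column-major matrix, which is how I understand the ScaLAPACK documentation to describe local arrays. It assumes a 5x5 global matrix, 2x2 blocks and a 2x2 process grid with the first block on proc 00; these sizes are only inferred from the indices listed above. The function names `owned_indices` and `local_column_major` are made up for illustration and are not part of ScaLAPACK.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Global 0-based indices owned by process coordinate p along one dimension,
// for n global indices, block size nb and nprocs processes in that dimension.
// Assumption: the first block is stored on process coordinate 0.
std::vector<int> owned_indices(int n, int nb, int p, int nprocs) {
    std::vector<int> idx;
    for (int g = 0; g < n; ++g) {
        if ((g / nb) % nprocs == p) idx.push_back(g);
    }
    return idx;
}

// Lists the entries held by process (prow, pcol) in column-major order of its
// local matrix, written as 1-based global (row,col) pairs separated by '|'.
std::string local_column_major(int m, int n, int mb, int nb,
                               int prow, int pcol, int nprow, int npcol) {
    const std::vector<int> rows = owned_indices(m, mb, prow, nprow);
    const std::vector<int> cols = owned_indices(n, nb, pcol, npcol);
    std::ostringstream out;
    for (std::size_t j = 0; j < cols.size(); ++j) {      // local columns, outer loop
        for (std::size_t i = 0; i < rows.size(); ++i) {  // local rows, inner loop
            if (i + j > 0) out << "|";
            out << "(" << rows[i] + 1 << "," << cols[j] + 1 << ")";
        }
    }
    return out.str();
}
```

Under those assumptions, `local_column_major(5, 5, 2, 2, 0, 0, 2, 2)` gives `(1,1)|(2,1)|(5,1)|(1,2)|(2,2)|(5,2)|(1,5)|(2,5)|(5,5)`, which differs from the block-by-block order above.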