Hi.
I'm running a ScaLAPACK code to compute the eigenvalues and eigenvectors of symmetric matrices of order 2000, using the sequence of subroutines pdsytrd, pdstebz, pdstein, and pdormtr. I have the program set up correctly, but the performance only improves up to about 8 processors; using 9 or more gives the same or even longer runtimes. I am using the grid geometries recommended in the user's guide (1 x np for np < 9, and square grids for np >= 9) and have tried blocking factors of 20, 64 (the value recommended in the user's guide), and 100, but in every case the improvement tops out at 8 processors. Are there any changes I can make, or tricks I can use, to improve the scalability?
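To be concrete about the grid choice, here is a minimal sketch in Python (not my actual Fortran code) of the rule I am following. The names `choose_grid` and `blocks_per_dim` are hypothetical helpers made up for this post. For process counts that are not perfect squares I am assuming the largest square grid that fits, which leaves the extra processes idle; the user's guide may intend something different there.

```python
import math


def choose_grid(nprocs):
    """Return (nprow, npcol) for the rule: 1 x np for np < 9, square otherwise.

    Assumption: when nprocs is not a perfect square, use the largest
    q x q grid with q*q <= nprocs and leave the remaining processes unused.
    """
    if nprocs < 1:
        raise ValueError("nprocs must be positive")
    if nprocs < 9:
        return (1, nprocs)
    q = math.isqrt(nprocs)  # integer floor of the square root
    return (q, q)


def blocks_per_dim(n, nb):
    """Number of nb-wide blocks needed to cover a dimension of length n."""
    return -(-n // nb)  # ceiling division


if __name__ == "__main__":
    n = 2000
    for nprocs in (4, 8, 9, 16):
        nprow, npcol = choose_grid(nprocs)
        for nb in (20, 64, 100):
            print(nprocs, (nprow, npcol), nb, blocks_per_dim(n, nb))
```

The second helper only counts how many blocks each blocking factor produces along one dimension of the matrix, which gives a rough idea of how much work there is to spread over the process rows and columns.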
Thanks!
Brian Lane
blane@phys.ufl.edu