Hi there - I've been using LAPACK for a number of years and have some code that I'd like to parallelise over multiple machines. I'm running an optimisation, each iteration of which requires a number (currently 24, but ideally ~100) of complex-valued SVDs, each taking on the order of 10 minutes. Yes, I know, there probably should be a simpler way of doing this, but life is too short...
I'm currently splitting the optimisation over 8 threads at the application level, but ideally I'd like to farm the work out to a grid of machines, and I'm wondering whether converting the SVDs to ScaLAPACK would be an answer. I'm missing something really basic, though: if I replace my LAPACK calls with ScaLAPACK calls, parts of the problem can be sent by MPI/PVM/whatever to another machine, but what actually runs on that other machine? Is it a copy of my own program? I already do a bunch of things within main(), not least setting up the higher-level optimisation. Does the program end up with an extra entry point once linked? Or do I need to write a separate program that solves the SVD only, and shell out to that with a filename or suchlike?
BTW, the 1997 ScaLAPACK docs say somewhere that complex SVD isn't supported yet. Is this still the case? I'm generally using MKL on Windows (which does seem to include pcgesvd() etc.) and ATLAS on Linux.
Clues appreciated,
--Richard

