Hello everybody, after a long time away.
I have recently gone back to trying to run some code that uses
ScaLAPACK subroutines, and I have hit the same problem as before: I have never understood
what is wrong with the way I launch my code.
The problem is this: if I launch my code on N processors from
the manager node (the master node in terms of the NFS system), all N processes run at 100%
and everything is OK. But if I start the same N processes from a different node,
then N-1 processes run at 100% and one of them runs at 12% or 30%, depending on whether I use
the Ethernet or the InfiniBand network.
For the process that is running slowly, the ps x command reports a STAT field of Dl,
which is an "uninterruptible sleep", while the other N-1 processes all report Rl.
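To make the two STAT values concrete, here is a minimal Python sketch of how I read them. The lookup tables follow the process state codes listed in the ps man page (procps); the function name decode_stat is only something made up for this illustration, not part of any tool.

```python
# Minimal sketch: decode a ps STAT string such as "Dl" or "Rl".
# The tables follow the "PROCESS STATE CODES" section of the procps ps man page.
# decode_stat is a hypothetical helper name used only for illustration.

STATES = {
    "D": "uninterruptible sleep (usually I/O)",
    "I": "idle kernel thread",
    "R": "running or runnable",
    "S": "interruptible sleep",
    "T": "stopped by job control signal",
    "t": "stopped by debugger during tracing",
    "X": "dead",
    "Z": "defunct (zombie)",
}

FLAGS = {
    "<": "high-priority",
    "N": "low-priority",
    "L": "has pages locked into memory",
    "s": "session leader",
    "l": "multi-threaded",
    "+": "in the foreground process group",
}


def decode_stat(stat):
    """Split a STAT string into (state description, list of flag descriptions)."""
    if not stat:
        raise ValueError("empty STAT string")
    state = STATES.get(stat[0], "unknown state")
    flags = [FLAGS.get(ch, "unknown flag") for ch in stat[1:]]
    return state, flags


if __name__ == "__main__":
    for value in ("Dl", "Rl"):
        print(value, "->", decode_stat(value))
```

So the slow process is a multi-threaded process that is blocked in the kernel (usually on I/O), while the others are multi-threaded and running.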
I would appreciate it very much if somebody could give me some advice or ideas about this.