I am trying to benchmark MKL and PLASMA. With MKL, I have no doubt that performance increases as I increase the number of cores. What about PLASMA?
I've managed to set up an SGEMM benchmark suite. Following the examples and test cases provided with PLASMA, I can set the number of cores. My question is: does the application actually run on the number of cores I provide, or does it create that many threads within the same core? I am running on an interactive node with access to up to 6 CPUs x 4 cores/CPU.
My other question is: how do you measure the flops? I am currently using the one provided in the flops.h file (FLOPS_SGEMM(M,N,K)). Would it give me the right number of FLOPS?
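For reference, this is roughly how I turn an operation count into a rate. It is only a minimal sketch, assuming the conventional 2*M*N*K count for GEMM (M*N*K multiplies plus M*N*K additions), which I believe is what FLOPS_SGEMM is meant to return but have not verified. The helper names gemm_flop_count, wall_seconds and gemm_gflops are my own and are not part of PLASMA or MKL.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime on strict compilers */
#include <assert.h>
#include <time.h>

/* Assumed operation count for C = alpha*A*B + beta*C with A (M x K) and
   B (K x N): M*N*K multiplies plus M*N*K additions. Hypothetical helper,
   not taken from flops.h. */
double gemm_flop_count(double m, double n, double k)
{
    return 2.0 * m * n * k;
}

/* Wall-clock time in seconds from a monotonic clock (POSIX). */
double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Achieved rate in GFLOP/s for one GEMM that took `seconds` of wall time. */
double gemm_gflops(double m, double n, double k, double seconds)
{
    assert(seconds > 0.0);
    return gemm_flop_count(m, n, k) / seconds / 1e9;
}
```

I call wall_seconds() immediately before and after the SGEMM call and pass the difference to gemm_gflops(). I use wall-clock time rather than CPU time because CPU time is summed over all threads and would hide any speedup from extra cores.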
Thanks a lot!