single-threaded / multi-threaded mkl performance difference

Postby dni » Mon Sep 12, 2011 3:27 am

I built MAGMA twice, once against single-threaded MKL and once against multi-threaded MKL. Interestingly, the test program linked with multi-threaded MKL runs almost twice as fast as the single-threaded version. I only tested sgeqrf. I wonder why MAGMA's performance depends so much on MKL. Is it true that most of the calculation is done on the GPU?

Re: single-threaded / multi-threaded mkl performance differe

Postby Stan Tomov » Wed Sep 14, 2011 10:36 pm

Yes, most of the computation is done on the GPU, but the critical path of many of the algorithms runs on the CPU. Therefore, the CPU code has to be as fast as possible. Currently, the code is tuned for best performance when you use all the cores of one socket on the host. If you would like to use only one core, you must re-tune the algorithms to get better performance (e.g., reduce the blocking sizes in the file control/get_nb.cpp).
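
To illustrate why a slower CPU favors a smaller blocking size, here is a rough sketch of a cost model. It is not MAGMA code: the function names hybrid_qr_time and choose_nb, the flop estimates, the GPU efficiency curve, and all rates are assumptions made up for illustration. The model supposes that at each step the CPU factors a panel of width nb while the GPU updates the trailing matrix, so the slower of the two sets the step time.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative cost model only; NOT MAGMA code, and all rates are hypothetical.
// Estimates the time in seconds of a blocked hybrid QR of an n-by-n matrix.
// At each step the CPU factors an nb-wide panel while the GPU updates the
// trailing matrix; the two overlap, so the slower one determines the step time.
double hybrid_qr_time(int n, int nb, double cpu_gflops, double gpu_peak_gflops) {
    assert(n > 0 && nb > 0 && cpu_gflops > 0.0 && gpu_peak_gflops > 0.0);
    // Assumption: GPU efficiency grows with the block size and saturates at peak.
    const double gpu_gflops = gpu_peak_gflops * nb / (nb + 32.0);
    double total = 0.0;
    for (int j = 0; j < n; j += nb) {
        const int jb = std::min(nb, n - j);          // width of this panel
        const double rows = n - j;                   // rows in the panel
        const double cols = n - j - jb;              // trailing columns
        const double panel_flops = 2.0 * rows * jb * jb;     // rough estimate
        const double update_flops = 4.0 * rows * cols * jb;  // rough estimate
        const double t_cpu = panel_flops / (cpu_gflops * 1e9);
        const double t_gpu = update_flops / (gpu_gflops * 1e9);
        total += std::max(t_cpu, t_gpu);
    }
    return total;
}

// Returns the candidate block size with the lowest modeled time
// (the first one wins on a tie). The candidate list must not be empty.
int choose_nb(int n, const std::vector<int>& candidates,
              double cpu_gflops, double gpu_peak_gflops) {
    assert(!candidates.empty());
    int best_nb = candidates.front();
    double best_time = hybrid_qr_time(n, best_nb, cpu_gflops, gpu_peak_gflops);
    for (int nb : candidates) {
        const double t = hybrid_qr_time(n, nb, cpu_gflops, gpu_peak_gflops);
        if (t < best_time) {
            best_time = t;
            best_nb = nb;
        }
    }
    return best_nb;
}
```

For example, calling choose_nb(4096, {32, 64, 128, 256}, cpu_rate, 500.0) once with a low cpu_rate (say 5.0, standing in for one core) and once with a high one (say 80.0, standing in for a full socket) should pick a smaller block size for the slow CPU under this model. The real tuning has to be done by measurement on the target machine; the model only shows the direction of the trade-off.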
