Analyzing PLASMA

Postby luiceur » Fri Apr 19, 2013 9:56 am


I would like to understand the behavior shown here. I ran time_dgemm and time_dgemm_tile on my machine (Xeon X5650, 4 CPUs x 6 cores, 48 GB). See the results below.

Why is that happening? Any ideas would be much appreciated.

[Attachment: dgemm.jpg]

Re: Analyzing PLASMA

Postby admin » Mon Apr 22, 2013 11:22 am

As you probably know, dgemm_tile is faster than dgemm because it skips the layout translation.
So your question is really about the drop-off when going beyond 12 cores.
The first thing that comes to mind is a NUMA effect.
Try running the test under numactl --interleave=all, which spreads memory allocations round-robin across all NUMA nodes.
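
To make the "layout translation" point concrete, here is a rough sketch of the kind of copy the non-tile timer has to pay for: moving a column-major (LAPACK-style) matrix into contiguous square tiles. This is only an illustration, not PLASMA's actual routine. The function name colmajor_to_tiles, the tile ordering (tile columns first, each tile column-major), and the requirement that the tile size divides the matrix dimensions are all assumptions made to keep the example short.

```python
def colmajor_to_tiles(a, m, n, nb):
    """Copy an m x n column-major matrix (flat list) into nb x nb tiles.

    Hypothetical helper for illustration only. Tiles are stored one after
    another, walking down each tile column first, and each tile is itself
    column-major. Assumes nb divides both m and n.
    """
    assert m % nb == 0 and n % nb == 0, "sketch assumes nb divides m and n"
    tiles = []
    for tj in range(n // nb):              # tile column index
        for ti in range(m // nb):          # tile row index
            for j in range(nb):            # column inside the tile
                col = tj * nb + j          # global column index
                start = col * m + ti * nb  # offset of the first row of this tile in that column
                tiles.extend(a[start:start + nb])
    return tiles


if __name__ == "__main__":
    # 4 x 4 matrix stored column-major, element (i, j) == j * 4 + i
    a = list(range(16))
    print(colmajor_to_tiles(a, 4, 4, 2))
```

Every element is read and written once, so the copy costs O(m*n) memory traffic on top of the multiplication itself. The tile interface takes data that is already in this form, which is why it avoids that cost.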
