Analyzing PLASMA

Post by luiceur » Fri Apr 19, 2013 9:56 am

Hi,

I would like to understand the behavior shown below. I ran time_dgemm and time_dgemm_tile on my machine (Xeon X5650, 4 CPUs x 6 cores, 48 GB). See the results below.

Why is that happening? Any ideas would be much appreciated.

Best
Attachment: dgemm.jpg (41.94 KiB)
luiceur
 

Re: Analyzing PLASMA

Post by admin » Mon Apr 22, 2013 11:22 am

As you probably know, dgemm_tile is faster than dgemm because it skips the layout translation.
So I take it your question is about the drop-off when exceeding 12 cores.
The first thing that comes to mind is a NUMA effect.
Try running the test under numactl --interleave=all
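
To check whether NUMA is even in play before rerunning, something like the following rough Python sketch may help. It assumes Linux, where NUMA nodes show up as nodeN directories under /sys/devices/system/node, and it only builds the numactl command line rather than running it. The helper names numa_nodes and interleaved are made up for this illustration, and "./time_dgemm" is a placeholder for however you actually launch the timer (arguments omitted).

```python
import glob
import os
import re


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return the sorted NUMA node ids found under sysfs (empty list if none)."""
    ids = []
    for path in glob.glob(os.path.join(sysfs_root, "node*")):
        match = re.fullmatch(r"node(\d+)", os.path.basename(path))
        if match and os.path.isdir(path):
            ids.append(int(match.group(1)))
    return sorted(ids)


def interleaved(command):
    """Prefix a command (list of strings) with numactl --interleave=all."""
    return ["numactl", "--interleave=all"] + list(command)


if __name__ == "__main__":
    # More than one node listed here means memory placement can matter.
    print("NUMA nodes visible:", numa_nodes())
    # Placeholder launch line; substitute your real binary and arguments.
    print("Run as:", " ".join(interleaved(["./time_dgemm"])))
```

If more than one node is listed, comparing a plain run against the interleaved run at the same core count should show whether memory placement explains the drop-off.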
Jakub
admin

