expected performance and fortran interface

Open forum for general discussions relating to PLASMA.

Re: expected performance and fortran interface

Postby faircdl » Fri Jun 18, 2010 8:12 am

Here are some run times:

5000x5000 matrix solve via sgesv on a Core i7 (times in seconds):
ATLAS, 1 thread - 5.5 s
ATLAS, auto threads - 2.12 s
PLASMA+ATLAS with optimal block size, 1 thread - 15.27 s
PLASMA+ATLAS with optimal block size, 4 threads - 3.93 s

10000x10000:
ATLAS, auto threads - 12.92 s
PLASMA+ATLAS with optimal block size, 4 threads - 30.78 s
MKL, auto threads - 8.62 s
PLASMA+MKL (MKL_NUM_THREADS=1), 4 threads - 12.07 s

Perhaps I'm just not going to see big gains from PLASMA unless I run on a many-core machine. That is always the situation in the various graphs I've seen: machines with 16-64 cores rather than 4-8.
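
For what it's worth, here is a rough sketch (Python, not part of the original measurements) that turns the 5000x5000 and 10000x10000 timings above into speedups and approximate GFLOP/s. It assumes the usual ~(2/3)n^3 flop count for an LU-based solve such as sgesv and ignores lower-order terms. The function names are made up for illustration.

```python
# Timings in seconds, copied from the run times listed in this post.
timings_5000 = {
    "ATLAS, 1 thread": 5.5,
    "ATLAS, auto threads": 2.12,
    "PLASMA+ATLAS, 1 thread": 15.27,
    "PLASMA+ATLAS, 4 threads": 3.93,
}
timings_10000 = {
    "ATLAS, auto threads": 12.92,
    "PLASMA+ATLAS, 4 threads": 30.78,
    "MKL, auto threads": 8.62,
    "PLASMA+MKL, 4 threads": 12.07,
}


def speedup(baseline_seconds, seconds):
    """How many times faster `seconds` is than `baseline_seconds`."""
    return baseline_seconds / seconds


def lu_solve_gflops(n, seconds):
    """Approximate GFLOP/s for an n x n LU-based solve.

    Assumption: ~(2/3) * n**3 floating-point operations, with the
    lower-order terms (triangular solves) ignored.
    """
    return (2.0 / 3.0) * n**3 / seconds / 1e9


# Scaling of each library from its own 1-thread run (5000x5000 case).
atlas_scaling = speedup(timings_5000["ATLAS, 1 thread"],
                        timings_5000["ATLAS, auto threads"])
plasma_scaling = speedup(timings_5000["PLASMA+ATLAS, 1 thread"],
                         timings_5000["PLASMA+ATLAS, 4 threads"])

print(f"ATLAS  1 -> auto threads: {atlas_scaling:.2f}x")
print(f"PLASMA 1 -> 4 threads:    {plasma_scaling:.2f}x")

for n, table in ((5000, timings_5000), (10000, timings_10000)):
    for label, seconds in table.items():
        print(f"n={n:5d}  {label:26s} {lu_solve_gflops(n, seconds):6.1f} GFLOP/s")
```

On these numbers PLASMA+ATLAS gains roughly 3.9x going from 1 to 4 threads, while ATLAS gains about 2.6x from 1 thread to its automatic setting, so the gap seems to come from the slow single-thread starting point rather than from poor scaling.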
faircdl
 
Posts: 7
Joined: Tue Jun 15, 2010 1:33 pm

Re: expected performance and fortran interface

Postby mateo70 » Sat Jun 19, 2010 3:16 pm

Hi,

Sorry for the late reply. Do you still get the segfault when you use the Fortran interface?
I ask because I assume your system is 64-bit, in which case pointers are normally 64 bits wide and you should use INTEGER*8 instead of INTEGER*4 in your call to PLASMA.
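
To illustrate why the integer width matters, here is a minimal sketch in Python using ctypes. It is not PLASMA code, and the address is a made-up value for illustration only. It shows what happens to a 64-bit pointer value when it is stored in a 32-bit integer (the INTEGER*4 case) versus a 64-bit one (the INTEGER*8 case).

```python
import ctypes

# Width of a C pointer on the machine running this script
# (8 bytes on a typical 64-bit system, 4 on a 32-bit one).
pointer_bytes = ctypes.sizeof(ctypes.c_void_p)

# Hypothetical 64-bit address, invented for this example.
address = 0x00007F3A12345678

# A 64-bit integer (like INTEGER*8) holds the full value.
as_int64 = ctypes.c_int64(address).value

# A 32-bit integer (like INTEGER*4) keeps only the low 32 bits;
# ctypes does no overflow checking, so the upper bits are silently lost.
as_int32 = ctypes.c_int32(address).value

print(f"pointer size here: {pointer_bytes} bytes")
print(f"original address:  {address:#018x}")
print(f"via 64-bit int:    {as_int64:#018x}")
print(f"via 32-bit int:    {as_int32:#018x}")
```

A pointer truncated this way refers to a different address, and dereferencing it on the C side is a typical cause of a segfault.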

Mathieu
mateo70
 
Posts: 92
Joined: Fri May 07, 2010 3:48 pm

Re: expected performance and fortran interface

Postby faircdl » Sat Jun 19, 2010 7:01 pm

Yes, I am still getting a segfault with the Fortran interface. I used INTEGER*8 as the type. I put some print statements in the PLASMA routine, and the segfault occurs in PLASMA somewhere beyond the C wrapper function that Fortran calls. But I may still be doing something wrong.
faircdl
 
Posts: 7
Joined: Tue Jun 15, 2010 1:33 pm

Re: expected performance and fortran interface

Postby mateo70 » Sun Jun 20, 2010 3:02 am

OK, I will try to reproduce the problem to see what it is.
mateo70
 
Posts: 92
Joined: Fri May 07, 2010 3:48 pm

Re: expected performance and fortran interface

Postby slemons » Thu Jul 22, 2010 3:05 pm

I would be very interested to see your tests when you finish them (or even before). I am hoping to compare LAPACK calling a threaded BLAS against PLASMA. This Fortran interface problem is something I would almost certainly have run into if I had tried this on my own, so thanks for that too!

How many cores do you expect to need before you see a big improvement?
slemons
 
Posts: 1
Joined: Tue Jul 07, 2009 11:43 am
