Effectively tuning PLASMA

Postby dobson156 » Fri Jul 27, 2012 12:33 pm

Hi all

The PLASMA user guide for 2.4.5 (section 6.3, Tuning Howto) directs you to LAWN #217 (Comparative Study of One-Sided Factorizations with Multiple Software Packages on Multi-Core Hardware) for information on tuning PLASMA.

Section 3.1 of the paper (Tuning.PLASMA) details the tuning process. My understanding is as follows:
  • Iterate through values of NB (40...500) and IB (factors of NB)
  • Run the sequential core_blas routines used in the factorisations (with N=NB) with the selected IB and NB values
  • Select the best performing combinations (from across the NB range)
  • Run full parallel factorisations using the IB and NB values from that selection
  • Select the best performing combination as "best performing" for the architecture at that problem size.
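
To make the steps above concrete, here is a minimal Python sketch of the two-stage search as I read it. It is not code from the paper or from PLASMA: `candidate_pairs`, `prune`, `tune`, the `nb_step` parameter and the "keep the top `keep` pairs" rule are all my own assumptions, and the two scoring callables are placeholders for whatever actually times the sequential kernel and the full parallel factorisation.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Pair = Tuple[int, int]               # (NB, IB)
Score = Callable[[int, int], float]  # (NB, IB) -> Gflop/s, supplied by the caller


def candidate_pairs(nb_min: int = 40, nb_max: int = 500,
                    nb_step: int = 1) -> Iterator[Pair]:
    """Yield every (NB, IB) with NB in [nb_min, nb_max] and IB a factor of NB.

    nb_step is an assumption; the thread only gives the 40...500 range.
    """
    for nb in range(nb_min, nb_max + 1, nb_step):
        for ib in range(1, nb + 1):
            if nb % ib == 0:
                yield nb, ib


def prune(pairs: Iterable[Pair], kernel_score: Score, keep: int) -> List[Pair]:
    """Rank pairs by sequential kernel speed (N = NB) and keep the fastest.

    A plain top-`keep` cut is a simplification chosen for this sketch.
    """
    return sorted(pairs, key=lambda p: kernel_score(*p), reverse=True)[:keep]


def tune(kernel_score: Score, full_score: Score, keep: int = 10,
         **range_kwargs: int) -> Pair:
    """Run the full factorisation only on the shortlist and return the best pair."""
    shortlist = prune(candidate_pairs(**range_kwargs), kernel_score, keep)
    return max(shortlist, key=lambda p: full_score(*p))
```

The point of the split is cost: the kernel runs are cheap, so they can cover the whole NB/IB grid, while the expensive parallel factorisation is only run on the shortlist. `tune` would be called once per problem size and machine.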

So my first question: is my understanding of the pruned tuning process correct?

My second question: the graphs in that paper show the core_blas routine used by each factorisation (DPOTRF: dgemm-seq, DGEQRF: dssrfb-seq, DGETRF: dsssm-seq), but these seem to have changed since the paper was written (dssrfb isn't even included with my version of PLASMA). Which routines should I tune against instead?

Many Thanks
dobson156
 
Posts: 11
Joined: Thu Jun 21, 2012 12:50 pm

Re: Effectively tuning PLASMA

Postby admin » Fri Jul 27, 2012 4:10 pm

I think that you understand it correctly.
Yes, the naming changed. The project was young and has evolved, sorry about that.
For QR, LQ or anything that uses Householder reflectors, you need to tune xTSMQR.
Jakub
admin
Site Admin
 
Posts: 79
Joined: Wed May 13, 2009 1:27 pm

Re: Effectively tuning PLASMA

Postby dobson156 » Mon Jul 30, 2012 10:05 am

Thanks for the reply.

The provided timing programs appear to act on the top-level `PLASMA_` functions. For the step that involves running pairs of IB and NB against the `core_blas` routines (so N=NB), should I write my own timing programs that call core_blas directly, or call the closest top-level routine with --nb and --n set to the same value?

I guess my question boils down to: will PLASMA_dgemm, and similar routines, with N=NB make only one CORE_dgemm call (and therefore, in essence, have the same performance)?
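
For reference, this is roughly the shape of the per-kernel timing I have in mind, written as a Python sketch so it is self-contained. Everything here is hypothetical: `best_time`, `gemm_gflops` and `standin_gemm` are my own names, and `standin_gemm` is a naive pure-Python multiply standing in for the real kernel, not a call into PLASMA. The only fixed fact is the flop count of an NB x NB x NB matrix multiply, 2*NB^3.

```python
import time
from typing import Callable


def best_time(run: Callable[[], None], repeats: int = 5) -> float:
    """Return the fastest wall-clock time of `run` over several repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        best = min(best, time.perf_counter() - start)
    return best


def gemm_gflops(nb: int, seconds: float) -> float:
    """Gflop/s for one NB x NB x NB multiply, which costs 2 * NB**3 flops."""
    return 2.0 * nb ** 3 / seconds / 1e9


def standin_gemm(nb: int) -> Callable[[], None]:
    """Placeholder kernel (assumption): naive C += A * B on a single NB x NB tile."""
    a = [[1.0] * nb for _ in range(nb)]
    b = [[1.0] * nb for _ in range(nb)]
    c = [[0.0] * nb for _ in range(nb)]

    def run() -> None:
        for i in range(nb):
            row_c = c[i]
            for k in range(nb):
                aik, row_b = a[i][k], b[k]
                for j in range(nb):
                    row_c[j] += aik * row_b[j]

    return run


if __name__ == "__main__":
    nb = 64
    seconds = best_time(standin_gemm(nb))
    print(f"NB={nb}: {gemm_gflops(nb, seconds):.4f} Gflop/s")
```

Taking the minimum over repeats is a choice made here to reduce noise from other processes. A real harness would swap `standin_gemm` for the actual tile kernel and use that kernel's own flop count.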

Thanks again
dobson156
 
Posts: 11
Joined: Thu Jun 21, 2012 12:50 pm

Re: Effectively tuning PLASMA

Postby jgpallero » Sun Dec 22, 2013 6:37 am

dobson156 wrote:Thanks for the reply.

The provided timing programs appear to act on the top-level `PLASMA_` functions. For the step that involves running pairs of IB and NB against the `core_blas` routines (so N=NB), should I write my own timing programs that call core_blas directly, or call the closest top-level routine with --nb and --n set to the same value?

I guess my question boils down to: will PLASMA_dgemm, and similar routines, with N=NB make only one CORE_dgemm call (and therefore, in essence, have the same performance)?

Thanks again


Hello dobson156,

I know this topic is a bit old, but did you ever write any code for PLASMA tuning? I'm trying to write a program based on LAWN #217, but I'm confused...
jgpallero
 
Posts: 29
Joined: Sat Jul 28, 2012 12:12 pm

