## Operation count for DSGESV and DSPOSV

Post here if you have a question about LAPACK performance

### Operation count for DSGESV and DSPOSV

Hello,

I'm doing some benchmarking using mixed precision DSGESV and DSPOSV and I need to know the number of operations performed for these routines. I know for high dimensions the term O(N^3) from internal SGESV is dominant, but I wold like to know also the exact number of operations (adds and products), including O(N^2) and O(N), which should be dependent on the final number of iterations (and on NRHS, of course). I've inspected the lawn41, but both DSGESV and DSPOSV were introduced in Lapack after the document. Exist any strict operation counting for the functions?

Thanks

PS: Ob course, I'm considering a case where DSGESV and DSPOSV finalize without errors: iter<=30, not overflow nor sgetrf failure.
jgpallero

Posts: 24
Joined: Thu Jul 29, 2010 2:29 pm
Location: Madrid, Spain

### Re: Operation count for DSGESV and DSPOSV

Hi Jose,

There are some details on the LAWN 175: http://www.netlib.org/lapack/lawnspdf/lawn175.pdf
But nothing answering your question per se.

Roughly without the O(n) terms, using #RHS for the number of right hand sides (NRHS in the code):
• ( 2/3 N^3 - 1/2 N^2 ) in Single Precision arithmetic for the initial factorization (SGETRF),
• ( 2 N^2 ) * ( #RHS ) in Single Precision arithmetic for the initial solve (SGETRS),
• ( 2 N^2 ) * ( #RHS ) in Double Precision arithmetic for computing the residual,
for each iteration of iterative refinement
• ( 2 N^2 ) * ( #RHS ) in Single Precision arithmetic for the initial solve (SGETRS),
• ( 2 N^2 ) * ( #RHS ) in Double Precision arithmetic for computing the residual,
So a formula would be: ( 2/3 N^3 - 1/2 N^2 ) + ( #ITER+1 ) * ( 2 N^2 ) * ( #RHS ) in Single Precision arithmetic and ( #ITER+1 ) * ( 2 N^2 ) * ( #RHS ) in Double Precision arithmetic.

Getting the O(n) Flops is not very relevant but not very hard, just look at the code and check LAWN #41 appendix D for some values of the O(N). I do not think this is relevant.

Note that the N^2 FLOPS are much much slower than the N^3 if NRHS is small. So the FLOPS count is not really a good indicator of the time to solution.

Cheers,
Julien
Julien Langou

Posts: 828
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

### Re: Operation count for DSGESV and DSPOSV

Thank you for your answer,

inspecting the http://www.netlib.no/netlib/lapack/double/dsgesv.f I can compute

Prior to the iterative refinement:

FLOPS for SGESV (i. e. SGETRF+SGETRS)
FLOPS for DGEMM (with dimensions N,NRHS,N)

Inside the iterative refinement:

FLOPS for SGETRS
FLOPS for DAXPY
FLOPS for DGEMM (with dimensions N,NRHS,N)

This is esentially the same computation as you have done, but I don't understand in your numbers the (ITER+1) factor. Why '+1'? I think it should be only ITER

Thanks
jgpallero

Posts: 24
Joined: Thu Jul 29, 2010 2:29 pm
Location: Madrid, Spain

### Re: Operation count for DSGESV and DSPOSV

I think we agree. For the way you are doing it, that would ITER as you write. The way I did it is that I broke down the first SGESV in SGETRF+SGETRS, counting SGETRF by itself and factoring the SGETRS and the first GEMM outside the iterative refinement loop as a +1. So ITER+1 for me, ITER for you. I think we agree.
Julien Langou

Posts: 828
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

### Re: Operation count for DSGESV and DSPOSV

Thanks!
jgpallero

Posts: 24
Joined: Thu Jul 29, 2010 2:29 pm
Location: Madrid, Spain

Return to Performance

### Who is online

Users browsing this forum: No registered users and 1 guest