Hi
If you have NRHS right-hand sides, spttrs "blocks" the call to sptts2: it
passes the right-hand sides to sptts2 in blocks of size NB, where NB is a
block size tuned for the machine. NB is a function of N and of your machine
characteristics. Ideally you would want NB right-hand sides and the matrix
(so D and E) to fit in cache, I guess. The optimal value for NB is returned
by ILAENV. Hopefully this has been tuned for your machine. I think this is
the intent of spttrs. (There are also some checks on the input parameters in
spttrs, which shows that spttrs is a driver for users, while sptts2 has no
checks, so it is an internal routine.)
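To make the blocking concrete, here is a rough Python sketch of the two-level
structure described above. It is not the LAPACK source: the function names are
made up for illustration, the right-hand sides are stored as a list of columns,
and nb stands in for the value that ILAENV would return. It assumes D and E
hold the L*D*L**T factors of the tridiagonal matrix.

```python
def solve_all_columns(d, e, b_cols):
    """sptts2-like kernel (hypothetical name): solve (L*D*L^T) x = b in place
    for every column in b_cols, one right-hand side at a time."""
    n = len(d)
    for b in b_cols:                      # J-loop over the right-hand sides
        for i in range(1, n):             # forward substitution with L
            b[i] -= b[i - 1] * e[i - 1]
        b[n - 1] /= d[n - 1]              # scale by D, then back-substitute with L^T
        for i in range(n - 2, -1, -1):
            b[i] = b[i] / d[i] - b[i + 1] * e[i]


def solve_blocked(d, e, b_cols, nb):
    """spttrs-like wrapper (hypothetical name): hand the kernel at most nb
    right-hand sides per call."""
    nb = max(1, nb)
    for j in range(0, len(b_cols), nb):
        # The slice copies only the outer list, so the columns are still
        # updated in place.
        solve_all_columns(d, e, b_cols[j:j + nb])
```

Both functions perform exactly the same arithmetic on each column; the wrapper
only changes how many columns the kernel sees per call.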
Honestly, I am confused by these routines as well. Since sptts2 does a simple
J-loop over the right-hand sides from 1 to NRHS (which is at most NB when it
is called by spttrs), I do not see how the blocking can be useful.
I think the intent could be to have sptts2 written differently (swapping the
I and J loops, for example).
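As an illustration of what swapping the I and J loops could look like, here is
a small Python sketch. This is only my guess at the idea, not how sptts2 is
written: the function name is hypothetical and the right-hand sides are stored
as a list of columns. The outer loop walks down the matrix, so each D(i) and
E(i) is read once and applied to every right-hand side in the block.

```python
def solve_loops_swapped(d, e, b_cols):
    """Solve (L*D*L^T) x = b in place for all columns, with the loop over
    matrix rows (I) outside and the loop over right-hand sides (J) inside."""
    n = len(d)
    # Forward substitution with L: row i of every column before row i + 1.
    for i in range(1, n):
        ei = e[i - 1]
        for b in b_cols:
            b[i] -= b[i - 1] * ei
    # Last row: only the diagonal scaling applies.
    dn = d[n - 1]
    for b in b_cols:
        b[n - 1] /= dn
    # Back substitution with D and L^T, from the bottom row upwards.
    for i in range(n - 2, -1, -1):
        di, ei = d[i], e[i]
        for b in b_cols:
            b[i] = b[i] / di - b[i + 1] * ei
```

With this loop order the size of the block of right-hand sides would matter,
since the whole block is touched at every row. Whether it would actually be
faster in Fortran's column-major storage is something I have not measured.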
So in any case:
- it is pretty clear that the intent is for users to call spttrs (as opposed
to sptts2);
- it might be the case that sptts2 is faster than spttrs, but the difference
should be marginal.
Cheers,
Julien.
On Aug 20, 2012, at 11:05 PM, julie langou <julie.langou@Domain.Removed> wrote:
From: Sa-Lin Cheng Bernstein <salin@Domain.Removed>
Subject: Question about spttrs and sptts2.
Date: August 20, 2012 11:44:05 AM PDT
Hello Julie,
I hope this email finds you well!
I have some questions about two LAPACK routines: spttrs and sptts2. I would
greatly appreciate it if you could help me with the answers.
The "Purpose" sections for these two routines in the docs are identical, and
their "Arguments" are almost the same. (Please see
http://www.netlib.org/lapack/lapack_routine/spttrs.f and
http://www.netlib.org/lapack/lapack_routine/sptts2.f.) It looks like
spttrs sets up the problem, and sptts2 actually does the solve. In
particular, spttrs sorts out the number of right-hand sides and calls sptts2
for each of them.
My questions are: can a user call sptts2 directly to solve the same problem
that spttrs solves? If so, is it faster than spttrs because spttrs has to call
sptts2? And if the answer is 'yes, sptts2 is faster', then does it mean that
spttrs can be replaced by sptts2 for better efficiency?
Thank you in advance!
Best Regards,
Sa-Lin