The fastest alternative that we have explored is based on Aasen's method,
which factors A into P*L*T*L^T*P^T, where P is a permutation, L is lower
triangular, and T is tridiagonal. There is a paper (which won a prize at
IPDPS'13) on this, see
These more recent versions modify Aasen's original factorization by
proceeding in steps, first reducing A to a narrower band and then solving
the narrower band problem.
So the output factorization differs from Bunch-Kaufman or rook, but this
seems necessary to attain high performance. How would such a change
affect the Matlab interface and its users?
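
For anyone who wants to see the shape of the output concretely, here is a small sketch that produces a factorization of the same form, P*A*P^T = L*T*L^T with T tridiagonal and L unit lower triangular. Note that this is the simpler Parlett-Reid elimination, not Aasen's algorithm and not the blocked variant from the paper; it only illustrates what the factors look like, and it does roughly twice the arithmetic of Aasen's method. The names parlett_reid and reconstruct are made up for this illustration and are not LAPACK routines.

```python
def parlett_reid(a):
    """Illustrative Parlett-Reid reduction of a symmetric matrix.

    Returns (perm, L, T) such that A[perm[i]][perm[j]] == (L T L^T)[i][j]
    up to rounding, with L unit lower triangular and T tridiagonal.
    """
    n = len(a)
    t = [[float(x) for x in row] for row in a]   # working copy, becomes T
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    perm = list(range(n))
    for k in range(n - 2):
        q = k + 1
        # Pivot: largest entry of column k on or below the subdiagonal.
        p = max(range(q, n), key=lambda i: abs(t[i][k]))
        if p != q:
            t[p], t[q] = t[q], t[p]              # symmetric swap: rows ...
            for row in t:
                row[p], row[q] = row[q], row[p]  # ... and columns
            for j in range(1, q):                # multipliers stored so far
                l[p][j], l[q][j] = l[q][j], l[p][j]
            perm[p], perm[q] = perm[q], perm[p]
        pivot = t[q][k]
        if pivot == 0.0:
            continue                             # column already reduced
        for i in range(q + 1, n):
            m = t[i][k] / pivot
            l[i][q] = m                          # multiplier goes in column k+1
            for j in range(n):
                t[i][j] -= m * t[q][j]           # row operation
            for j in range(n):
                t[j][i] -= m * t[j][q]           # matching column operation
    return perm, l, t


def reconstruct(l, t):
    """Form L * T * L^T with plain loops, to check the factorization."""
    n = len(l)
    lt = [[sum(l[i][k] * t[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(lt[i][k] * l[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


a = [[4, 1, 2], [1, 3, 0], [2, 0, 5]]
perm, l, t = parlett_reid(a)
print(perm, l, t)
```

The point for an interface is that the caller gets back a permutation, a unit lower triangular L, and a tridiagonal (or, in the blocked variants, banded) T, rather than the block diagonal D with 1x1 and 2x2 blocks that Bunch-Kaufman and rook pivoting return.
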
On 11/12/14, 11:38 AM, Bobby Cheng wrote:
Hi LAPACK team,
It is great to see that LDL^T with rook pivoting made it into LAPACK 3.5.
I observed that xSYTRI_ROOK and xSYTRS_ROOK use only level 2 BLAS,
similar to xSYTRI and xSYTRS. However, I have not found rook-pivoting
counterparts of the level 3 BLAS routines xSYTRI2 and xSYTRS2. This is an issue for
MATLAB, as customers, not surprisingly, have found the performance of
xSYTRI not acceptable for large matrices. This will be the same for
xSYTRI_ROOK. I am wondering what your plan is for applying what is
done in xSYTRI2 to rook pivoting.