## Pivoted Cholesky solver

Post here if you have a question about LAPACK or ScaLAPACK algorithm or data format

### Pivoted Cholesky solver

I am solving linear systems of the form:

(A^T A) x = b

using the Cholesky decomposition. However sometimes when I apply the Cholesky decomposition (DPOTRF) to the matrix C = A^T A, it fails due to rounding errors in forming the matrix C, or because A could be rank deficient.

I see there is a pivoted Cholesky routine (DPSTRF) which looks promising to avoid these problems, however I don't see a corresponding routine to actually solve the equation once I have the pivoted Cholesky decomposition. Is there a routine similar to DPOTRS which uses the pivoted Cholesky decomposition to solve the system C x = b ?

I don't think I can simply use DPOTRS, because if P^T C P = L L^T, the lower right block of L could be singular so won't DPOTRS fail?
pa33

Posts: 11
Joined: Fri Apr 27, 2007 5:17 pm

### Re: Pivoted Cholesky solver

Is there a routine similar to DPOTRS which uses the pivoted Cholesky decomposition to solve the system C x = b ?

No this is missing. There is no "DPSTRS" or "DPSSV" in LAPACK (yet). Good point.

I don't think I can simply use DPOTRS,
No you cannot because of the permutations. So you "simply" need to permute in input and permute in output. Then you can use DPOTRS in the middle.
I think it could be worthwhile to write a demo code. (and naming it DPSTRS would be even better .)

because if P^T C P = L L^T, the lower right block of L could be singular so won't DPOTRS fail?

Oh OK. So in addition, if the lower right block of L is singular then this would be a problem indeed. Sure. I think, in that case, you simply need to "cut" and solve for the nonsingular piece. You need to think and see what makes sense for you. Moreover what singular means is for you to define. Because you mean "numerically singular", so you need to define when you want to "cut". If you want to "cut" as late as when Pivoted Cholesky stops. This could be a strategy. But you could decide to stop earlier. (You cannot do it after.)

My recommendation would be to first try to factorize your matrices using DPSTRF and see how that works for you. Then you can see what to do with the factorization.

Feel free to follow up with us. Great to see that you are interested in this functionality.
Cheers,
Julien.
Julien Langou

Posts: 835
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

### Re: Pivoted Cholesky solver

Oh OK. So in addition, if the lower right block of L is singular then this would be a problem indeed. Sure. I think, in that case, you simply need to "cut" and solve for the nonsingular piece. You need to think and see what makes sense for you. Moreover what singular means is for you to define. Because you mean "numerically singular", so you need to define when you want to "cut". If you want to "cut" as late as when Pivoted Cholesky stops. This could be a strategy. But you could decide to stop earlier. (You cannot do it after.)

Well, as I understand things, any matrix of the form A^T A (with A non-singular) should be strictly positive definite (in exact arithmetic) and so Cholesky should work fine. But for ill-conditioned matrices A, there can be rounding errors leading to negative diagonal entries during the Cholesky process (leading to a square root of a small negative number). Basically I'm just looking for a way to get a stable solution vector even when A is ill conditioned.

I could of course apply a rank revealing QR decomposition to A^T A and get a nice solution (which would just ignore the singular part of the R matrix). I'm looking for something similar in the Cholesky case. So as you say the algorithm should "cut" the solution when the diagonal entries of the Cholesky factor drop below some tolerance.

I'm surprised there is no solution for this since matrices of the form A^T A arise quite often in practice. I do see there is a pivoted L*D*L^T decomposition with solver (DSYSV) - do you think that would work for my problem?
pa33

Posts: 11
Joined: Fri Apr 27, 2007 5:17 pm

### Re: Pivoted Cholesky solver

Many (A^T A) problems come from least squares problems with A as the observation matrix. Turning it into a linear equation problem squares the condition number so, especially for ill-conditioned problems, it is preferable to solve the least squares problem directly.

LAPACK gives a choice of four drivers for least squares problems. So, IF your problem is really a least squares problem I would strongly recommend using one of those. Three of the drivers allow for the case of a rank deficient matrix A.

Sven Hammarling.
sven

Posts: 146
Joined: Wed Dec 22, 2004 4:28 am

### Re: Pivoted Cholesky solver

My system does come from a least squares problem, however the matrix A is prohibitively large to use a direct decomposition (millions of rows and thousands of columns). Therefore I need to use the normal equations approach with A^T A.
pa33

Posts: 11
Joined: Fri Apr 27, 2007 5:17 pm

### Re: Pivoted Cholesky solver

Is A sparse? Or dense? I would assume A is sparse. Correct?
You have your own way to form A^TA and you take into account the sparsity of A?

If A is sparse: (1) For sparse QR solvers, see QR MUMPS ( http://buttari.perso.enseeiht.fr/qr_mumps/ ) and/or Suitesparse ( http://faculty.cse.tamu.edu/davis/suitesparse.html ). (2) You could also try an iterative method like LSQR.

Cheers,
Julien
Julien Langou

Posts: 835
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

### Re: Pivoted Cholesky solver

Julien Langou wrote:Is A sparse? Or dense? I would assume A is sparse. Correct?
You have your own way to form A^TA and you take into account the sparsity of A?

If A is sparse: (1) For sparse QR solvers, see QR MUMPS ( http://buttari.perso.enseeiht.fr/qr_mumps/ ) and/or Suitesparse ( http://faculty.cse.tamu.edu/davis/suitesparse.html ). (2) You could also try an iterative method like LSQR.

Cheers,
Julien

No actually A is dense, just very large. Its too large to store in memory at one time so I construct a few thousand rows at a time and fold them into the normal equations matrix A^T A.

I will think some more on how to solve the pivoted Cholesky system, which should be similar to the pivoted QR method. In fact I could simply do a pivoted QR decomposition on A^T A and probably end up with a reasonable solution vector.
pa33

Posts: 11
Joined: Fri Apr 27, 2007 5:17 pm

### Re: Pivoted Cholesky solver

Hi Patrick,

I see. Thanks for the explanation.

Question: how do you form the normal equations? You say: "construct a few thousand rows at a time and fold them into the normal equations matrix A^T A." How do you fold? Do you use SYRK or GEMM?

Here is another way to go. You could use the TS (Triangle on top of Square QR) kernel in LAPACK to do an out-of-core tall-and-skinny QR factorization.

The name of the subroutine is DTPQRT (introduced in LAPACK 3.4).

Note: The advantage of QR factorization is that QR factorization is more stable than normal equations since it does not square the condition number. You seem to struggle with positive definiteness of the normal equations, as you know, this indicates that the normal equations are numerically ill-conditioned.

So here what you could do. It requires to get your hands dirty but since you are already forming the normal equations by hand and constructing the matrix along the way.

Well, I think you have done the hardest

So: Split your matrix A in rows so that a block of rows fits in memory. So A is M-by-N. And you partition with A1 of size N-by-N, the first N rows of A, then A2 of size M2-by-N, the next M2 rows of A, etc. (Same as what you do for forming the normal equations.)

Make sure that N-by-N (for R) and Mi-by-N (for Ai) fit in main memory. (And there are some other smaller array.) (This is same as what you do for forming the normal equations.)

Catch: we want M1 >= N. This will not work if M1 < N. For good use of space, it is better to take M1 = N. So I suggest M1 = N. Then you can have M2, M3, M4 anything you want.

Then
Code: Select all
`// DO: Bring A1 from disk to main memory or construct A1, A1 is first N rows of ADGEQRF( N, N, A1, LDA1 ... )// DO: Keep A1. It contains the matrix "R". (Not the final one, but we will update in there.)// DO: Bring A2 from disk to main memory or construct A2, A2 is next M2 rows of ADTPQRF( M2, N, 0, NB, A1, LDA1, A2, LDA2, … )// DO: You can now discard A2.// DO: Bring A3 from disk to main memory or construct A3, A3 is next M3 rows of ADTPQRF( M3, N, 0, NB, A1, LDA1, A3, LDA3, … )// DO: You can now discard A3.// Etc.`

Note: NB is the algorithmic block size of the algorithm. We decided to have it part of the interface so you (the user) need to decide what you want NB to be. (Maybe take NB=100 to start and you can tune later.) In LAPACK, in general, NB is set by the ILAENV( ) function. So this kind of interface slightly differs from what is standard in LAPACK. We also decided to export the "T matrix" to the user, see the T, LDT arguments in the interface. This is useful for performing efficiently the updates. (This avoids to have to recompute T.) So the interface is a little different. You will not really need T but you need to give it. It will be a workspace that DTPQRT needs.

At the end of this algorithm, A1 should contain the R factor of the QR factorization. This is called "tall-skinny flat-tree QR factorization".

The number of FLOPS will be ( 2 * M * N^2 - 2/3 N^3 ) so about 2 * M * N^2 if M >> N. The cost for normal equations is ( M * N^2 + 1/3 N^3 ) so about M * N^2 if M >> N.
So this method will be double the number of FLOPS if if M >> N. (Note that if M=N, this is the same number of FLOPS.)

To deal with your right-hand side b, (because at the end of the day, you do not want a QR factorization but you want to solve a least squares problem,) there are various ways. I think the easiest is to append b to A and so you would always work on an M-by-(N+1) matrix all the time. I hope this makes sense. Pseudo-code below.

Code: Select all
`// DO: construct A1 and b1 so that [A1 b1] is N-by(N+1) matrix. Call [A1 b1] "A1b1".DGEQRF( N, N+1, A1b1, LDA1 ... )// DO: Keep A1b1. It contains the matrix "Rb". (Not the final one, but we will update in there.)// DO: construct A2 and b2 so that [A2 b2] is M2-by(N+1) matrix. Call [A2 b2] "A2b2".DTPQRF( M2, N+1, 0, NB, A1b1, LDA1, A2b2, LDA2, … )// DO: You can now discard A2b2.// DO: construct A3 and b3 so that [A3 b3] is M3-by(N+1) matrix. Call [A3 b3] "A3b3".DTPQRF( M3, N+1, 0, NB, A1b1, LDA1, A3b3, LDA3, … )// DO: You can now discard A3b3.// Etc.`

// SOLVE A1 x = b1 (where A1 is upper triangular)
DTRSV( 'U', 'N', 'N', N, A1b1(1,1), LDA1, A1b1(1,N+1), 1)
[/code]

The solution x is in A1b1(1:N,N+1).

I think this is worth a shot.

If I understand what you are doing, this is minimal change and you have already done the hardest.

Let us know if that turns out to work. (Or not.)

Cheers,
Julien.
Julien Langou

Posts: 835
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

### Re: Pivoted Cholesky solver

Thanks for the tips.

I use SYRK to assemble the normal equations matrix, it works fine.

I have heard of tall-skinny QR, I think its about a factor 2 slower than the normal equations method, but I could give it a try. I really thought pivoted Cholesky would help to get a solution vector, but maybe I'm wrong. Dennis and Schnabel discuss a "perturbed" Cholesky algorithm, which adds a small diagonal matrix to your matrix which is just enough to prevent Cholesky from failing. I might investigate that a bit more, since it would be faster (though not as accurate) as TSQR.

I've also read that iterative refinement can help improve normal equations solutions for ill-conditioned matrices, though I haven't tried that yet either.
pa33

Posts: 11
Joined: Fri Apr 27, 2007 5:17 pm

### Re: Pivoted Cholesky solver

Hi Patrick,

Dennis and Schnabel discuss a "perturbed" Cholesky algorithm, which adds a small diagonal matrix to your matrix which is just enough to prevent Cholesky from failing.
Yes, if you want Cholesky to complete, you can simply add ( eps * || A ||^2 ) to the diagonal before starting Cholesky and that should do the trick. (Maybe multiply by SQRT(N) to be on the safe side. So SQRT(N) * eps * || A ||^2). This is a "small" perturbation of A so you will solve a slightly perturbed problem so you will conserve a small backward error. (The smaller the perturbation the better.) Yes, this is a quick fix. And this is minimal change and you rely on a fast Cholesky. So, if the solution you get is satisfying, then I guess there is no need to look at anything else. I did not know this trick was in "Dennis and Schnabel". I knew the trick from general folklore in numerical linear algebra. I did not know a reference for it. Good to know!

I have heard of tall-skinny QR, I think its about a factor 2 slower than the normal equations method
Yes, this is correct. Twice as many FLOPS. So about a factor of 2. Though all the FLOPS in normal equations are in SYRK, and SYRK is really efficient and parallel and etc.; while the FLOPS in Tall-Skinny QR are a little less efficient. So I would expect a little more than a factor of 2 and some difficulty to scale on multicore. (But it is worth a try, I'd say.)

I've also read that iterative refinement can help improve normal equations solutions for ill-conditioned matrices
This is a good point. FYI, my reference for iterative refinement and least squares is the book of Åke Björk, "Numerical Methods for Least Squares Problems", 1996. But essentially, yes, you do iterative refinement and it does help improve normal equations solutions for ill-conditioned matrices as you said .

Cheers,
Julien.
Julien Langou

Posts: 835
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

### Re: Pivoted Cholesky solver

Julien Langou wrote:
I've also read that iterative refinement can help improve normal equations solutions for ill-conditioned matrices
This is a good point. FYI, my reference for iterative refinement and least squares is the book of Åke Björk, "Numerical Methods for Least Squares Problems", 1996. But essentially, yes, you do iterative refinement and it does help improve normal equations solutions for ill-conditioned matrices as you said .

Thanks for your help Julien. Just curious, is the iterative refinement method implemented in LAPACK?
pa33

Posts: 11
Joined: Fri Apr 27, 2007 5:17 pm

### Re: Pivoted Cholesky solver

is the iterative refinement method implemented in LAPACK?

Hi Patrick,

No: we do not have iterative refinement for linear least squares problem in LAPACK. I know that Jim Demmel, Yozo Hida, Sherry Li, and Jason Riedy worked on this but their work did not make in LAPACK (yet). (See: J. Demmel, Y. Hida, X. S. Li, and E. J. Riedy, “Extra-precise Iterative Refinement for Overdetermined Least Squares Problems”, ACM Trans. Mathematical Software, Vol. 35, No. 4, Aticle 28, February 2009.) I do not think this would be much help anyway. Their work (1) was based out of QR factorization, (2) was more geared toward stability than speed, (3) relied on extra-precise BLAS, (4) need to have the whole matrix A stored in memory. So it would have been a few ways out of what you want.

You have a nice problem. It is really not clear to me that you want to do iterative refinement at all. The main problem with iterative refinement is that, for each iteration of iterative refinement, you will need to reconstruct the matrix A. I am not sure how much this costs but this seems a lot. So, on the one hand you have QR factorization, it will likely require on pass on the data matrix A. On the other hand, you can use normal equations (with a small perturbation), it will likely require iterative refinement and so a few passes on the data matrix A. It is not clear which method will be better.

Cheers,
Julien.
Julien Langou

Posts: 835
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA