dgelsy question

Hello,

I am just learning to use LAPACK for linear regression problems.

I am trying to solve the following problem--

Find x so that the L2 norm ||Ax-b|| is minimized. The problem data are:

A is a matrix with 3 columns and 4 rows:
A = [1 1 17; 1 2 18; 1 3 19; 1 4 20]

b is a vector with 1 column and 4 rows:
b = [18.75715 18.83939 18.91538 18.98599]

Since this is a problem where some columns in A are correlated, I am using the DGELSY subroutine to compute x, with a value of RCOND = 9e-14.

This routine DGELSY returns the matrix rank as 2, but the x array returned is [0.1401 -1.0827 1.1589]. I was wondering if this is correct. I thought that if this routine returns a rank of 2, then only 2 of the 3 columns in A would have non-zero values; but here all three columns have non-zero x-values. Also, the standard errors that I get from this reported solution are [4.68e+13 1.07e-3 5.6e-5]. I am a bit suspicious of the huge standard error value for the first variable.
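For readers who want to reproduce this setup outside of Fortran: NumPy's `numpy.linalg.lstsq` wraps the SVD-based LAPACK driver DGELSD, which, like DGELSY, returns the minimum-norm least squares solution and a numerical rank. A minimal sketch of the problem above (an illustrative translation, not the original DGELSY call):

```python
import numpy as np

# Problem data from the post; note col3 = 16*col1 + col2, so rank(A) = 2.
A = np.array([[1.0, 1.0, 17.0],
              [1.0, 2.0, 18.0],
              [1.0, 3.0, 19.0],
              [1.0, 4.0, 20.0]])
b = np.array([18.75715, 18.83939, 18.91538, 18.98599])

# lstsq calls the SVD-based LAPACK driver DGELSD; like DGELSY it
# returns the minimum-norm least squares solution.
x, _, rank, sv = np.linalg.lstsq(A, b, rcond=9e-14)
print("rank =", rank)   # the numerical rank at this RCOND
print("x    =", x)      # all three entries are non-zero, as with DGELSY
```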

Solving this same problem in Matlab (using lscov) gives quite different results. I get,

x = [0 -1.09149 1.16774], and standard errors as
[0 2.13125e-3 3.14922e-4].

Is there something that I am doing incorrectly? Any feedback would be greatly appreciated.

Thanks.
dilberito1

Posts: 3
Joined: Fri Aug 31, 2007 3:36 am

Re: dgelsy question

Routine DGELSY returns the least squares solution of minimum norm, which will not in general have any zero elements. See Section 2.3.2 of the LAPACK Users' Guide:

http://www.netlib.org/lapack/lug/node53.html

Information on computing the so-called basic solution is given in Section 2.4.2.3:

http://www.netlib.org/lapack/lug/node42.html

I am afraid that there is not a driver routine to compute the basic solution, so you will need to put together some computational routines as described in Section 2.4.2.3.
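As a concrete illustration of assembling a basic solution from computational pieces (a sketch, not the LAPACK routines themselves): with SciPy, a pivoted QR of A can be truncated at the numerical rank and solved over the selected columns only, using the RCOND from the original post as the rank tolerance.

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

A = np.array([[1.0, 1.0, 17.0],
              [1.0, 2.0, 18.0],
              [1.0, 3.0, 19.0],
              [1.0, 4.0, 20.0]])
b = np.array([18.75715, 18.83939, 18.91538, 18.98599])

# QR with column pivoting: A[:, perm] = Q @ R.
Q, R, perm = qr(A, pivoting=True)

# Numerical rank from the diagonal of R (RCOND = 9e-14 as in the post).
tol = 9e-14 * abs(R[0, 0])
r = int(np.sum(np.abs(np.diag(R)) > tol))

# Basic solution: least squares over the r pivoted columns only;
# the remaining n - r variables stay exactly zero.
y = solve_triangular(R[:r, :r], (Q.T @ b)[:r])
x = np.zeros(A.shape[1])
x[perm[:r]] = y
print("rank =", r)
print("x    =", x)   # one exact zero, like MATLAB's backslash/lscov answer
```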

Best wishes,

Sven Hammarling.
sven

Posts: 146
Joined: Wed Dec 22, 2004 4:28 am

Re: dgelsy question

Thank you for your explanation. I read the links you pointed out, but I still have a couple of questions.

In the example that I have, the solution returned by DGELSY is not the minimum norm solution (it has norm = 3.41) whereas the minimum norm solution, as reported by Matlab, has norm = 3.38. Is this kind of behavior common, or is there something I am overlooking?

Also, assuming that I use the solution returned by DGELSY, does this solution have "degrees of freedom" equal to 1 (since there are 4 samples and 3 variables), or equal to 2 (since A is reported by DGELSY to have rank 2 out of 3, so we add back one degree of freedom, even though the actual solution has 3 non-zeros)? I am confused as to what the correct degrees of freedom should be here.

Thanks once again!
dilberito1

Posts: 3
Joined: Fri Aug 31, 2007 3:36 am

If I understand correctly (which I well may not), the MATLAB routine lscov solves a weighted least squares problem. You can use routine DGGGLM to solve weighted least squares problems, but it does assume that A has full column rank, so it may not be appropriate for your data. Although, as far as I can see from looking at lscov.m, it does not compute the rank of A, so I assume that it too assumes A is of full rank.

Anyway, I think that is why you get different solutions.

Best wishes,

Sven.
sven

Posts: 146
Joined: Wed Dec 22, 2004 4:28 am

Hello,

I am putting together a short note to answer your problem, because it is not that quick to explain. But before the note, here are some facts.

In the overdetermined nonsingular case (or numerically nonsingular, if you prefer),
* the solution returned by the LAPACK routines DGELSY, DGELSS, and DGELSD,
* the solution returned by A\b or lscov(A,b) (Matlab),
* the solution returned by pinv(A)*b
are all the same.

In the overdetermined singular case (or numerically rank deficient), which is your case, the solution returned by the LAPACK routines DGELSY, DGELSS, and DGELSD is the same as the solution returned by Matlab's pinv(A)*b, but it is not the same as the solution returned by Matlab's A\b or lscov(A,b). This is normal. In the rank-deficient case, there is an infinite number of solutions, so Matlab and LAPACK can give you different answers while both being correct.

The strategy/philosophy of Matlab is to represent the solution with as few variables as possible. This is why you have an exact 0 in your x: only two variables are needed to represent the solution.

The strategy/philosophy of LAPACK (or x = pinv(A)*b in Matlab) is to minimize the two-norm of the solution x. Then x is fully populated but has minimum norm.

In any case the residual r = b-Ax is the same (even though the x's are different) and is
guaranteed to have minimal norm. So both LAPACK and Matlab solve the Least Squares problem.

Note that Matlab's approach is slightly faster than LAPACK's.

OK, so that's the first point. The second point is that I think you are not computing the estimated standard error of the solution (stdx) in a stable manner for the LAPACK solution, while the Matlab function lscov computes stdx stably.

My suspicion is that you are doing something like
Code: Select all
   mse = norm(b-A*x)^2/(m-n);
   S = inv(A'*A)*mse;
   stdx = sqrt(diag(S));

This previous code works fine for well-conditioned matrices.

If A is ill-conditioned (a fortiori, singular), then inv(A'*A) is going to blow up in your face, and I think this is what you are observing.
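The blow-up is easy to see on the data from this thread (a quick sketch; since col3 = 16*col1 + col2 exactly, A'*A is singular):

```python
import numpy as np

A = np.array([[1.0, 1.0, 17.0],
              [1.0, 2.0, 18.0],
              [1.0, 3.0, 19.0],
              [1.0, 4.0, 20.0]])

# col3 = 16*col1 + col2 exactly, so A'A is (numerically) singular and
# anything built on inv(A'*A) is meaningless for this matrix.
AtA = A.T @ A
print("cond(A'A) =", np.linalg.cond(AtA))   # astronomically large
```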

Try the following code; you should feel better about what LAPACK is doing after that.

Cheers,
Julien

Code: Select all
   m = 4;
   n = 3;
   A = [1 1 17; 1 2 18; 1 3 19; 1 4 20];
   b = [18.75715 18.83939 18.91538 18.98599]';
   x = [ +0.140107 -1.082733 +1.158984 ]';   % This is the LAPACK solution
   rankA = 2;
   mse = norm(b-A*x)^2/(m-rankA);
   stdx = zeros(n,1);
   [u,s,v] = svd(A);
   s = s(1:rankA,1:rankA);
   v = v(:,1:rankA);
   for i=1:n,
      stdx(i) = norm( s \ (v(i,:)') ) * sqrt(mse);
   end
   fprintf('rankA = %d\n',rankA);
   fprintf('x    = [ %+f %+f %+f ]''\n',x(1),x(2),x(3));
   fprintf('mse  = %f\n',mse);
   fprintf('stdx = [ %+f %+f %+f ]''\n',stdx(1),stdx(2),stdx(3));
   fprintf('\n');
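For readers working outside MATLAB, here is a sketch of the same stable stdx computation in Python/NumPy (an illustrative translation, not part of LAPACK):

```python
import numpy as np

m, n = 4, 3
A = np.array([[1.0, 1.0, 17.0],
              [1.0, 2.0, 18.0],
              [1.0, 3.0, 19.0],
              [1.0, 4.0, 20.0]])
b = np.array([18.75715, 18.83939, 18.91538, 18.98599])
x = np.array([+0.140107, -1.082733, +1.158984])   # the LAPACK solution
rankA = 2

mse = np.linalg.norm(b - A @ x)**2 / (m - rankA)

# Truncated SVD: keep only the rankA dominant singular values/vectors,
# so the (numerically zero) trailing singular value is never inverted.
u, s, vt = np.linalg.svd(A)
s = s[:rankA]
v = vt[:rankA, :].T                    # n x rankA

# stdx(i) = || diag(s)^{-1} * V(i,:)' || * sqrt(mse)
stdx = np.array([np.linalg.norm(v[i, :] / s) for i in range(n)]) * np.sqrt(mse)
print("mse  =", mse)
print("stdx =", stdx)                  # finite, moderate values
```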
Julien Langou

Posts: 830
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

Julien pointed out that my comment about lscov was rather out of date. I was looking at MATLAB 6.5, but lscov has been significantly updated since then (see Julien's own post). Apologies if I misled.

Sven.
sven

Posts: 146
Joined: Wed Dec 22, 2004 4:28 am

Thank you Sven and Julien for your very informative replies.

I now understand why the LAPACK solution differs from the lscov solution returned from Matlab.

I have one more question regarding computing the estimated standard error of the solution (stdx). I understand that the code posted by Julien computes stdx correctly; however, it involves the SVD of the matrix A, which is going to be very time consuming in general (which is why I am using the QR-based method, DGELSY, instead of an SVD-based method). So my question is: is there a better (more efficient) way to compute the correct stdx values, other than using the SVD?

My current implementation (in C++) does the following to compute the stdx values. After a call to DGELSY, I form the triangular matrix R from the upper triangular part of A (the A matrix passed to DGELSY with the coefficient values, which DGELSY overwrites). I then compute the inverses of R and R' (the transpose of R) by calling the routine DTRTRI twice, and multiply R^(-1) by R'^(-1) by calling DTRMM. Finally, the diagonal values of the resulting product give the stdx values I use. All this basically uses the fact that,
Code: Select all
`stdx = sqrt(mse) * sqrt(diag(inv(A'*A))) = sqrt(mse) * sqrt(diag(inv(R'*R)))`

where R is the triangular matrix in the QR decomposition of A.
This does result in a blow-up of the stdx values, but I am not sure how to resolve it. Is there something that I am doing wrong?

Again, thank you for your help!
dilberito1

Posts: 3
Joined: Fri Aug 31, 2007 3:36 am

Hello,

Sure, here is how to do it using a QR factorization with column pivoting.

Julien.

Code: Select all
   m = 4;
   n = 3;
   A = [1 1 17; 1 2 18; 1 3 19; 1 4 20];
   b = [18.75715 18.83939 18.91538 18.98599]';
   [Q,R,perm] = qr(A,0);
   i = 1;
   while (abs(R(i,i)) > abs(R(1,1))*m*eps)
      i = i+1;
      if (i==n+1), break; end;
   end;
   rankA = i-1;
   R = R(1:rankA,1:n);
   Q = Q(:,1:rankA);
   if (rankA < n),
      [Z,R] = qr( R(1:rankA,1:n)', 0 );
      R = R';
      Z = Z';
   else
      Z = eye(n);
   end
   z = Q'*b;
   x(perm,1) = Z' * (R \ z);
   mse = (norm(b)^2 - norm(z)^2) / (m-rankA);
   stdx = zeros(n,1);
   for i=1:n,
      stdx(perm(i)) = norm( R\Z(:,i) ) * sqrt( mse );
   end
   fprintf('rankA = %d\n',rankA);
   fprintf('x    = [ %+f %+f %+f ]''\n',x(1),x(2),x(3));
   fprintf('mse  = %f\n',mse);
   fprintf('stdx = [ %+f %+f %+f ]''\n',stdx(1),stdx(2),stdx(3));
   fprintf('\n');
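An equivalent sketch in Python/SciPy, for readers calling LAPACK from other languages (a translation of the MATLAB above; I use the RCOND = 9e-14 from the original post as the rank tolerance instead of m*eps):

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

m, n = 4, 3
A = np.array([[1.0, 1.0, 17.0],
              [1.0, 2.0, 18.0],
              [1.0, 3.0, 19.0],
              [1.0, 4.0, 20.0]])
b = np.array([18.75715, 18.83939, 18.91538, 18.98599])

# QR with column pivoting: A[:, perm] = Q @ R.
Q, R, perm = qr(A, mode='economic', pivoting=True)

# Numerical rank from the diagonal of R (RCOND = 9e-14 from the post).
rankA = int(np.sum(np.abs(np.diag(R)) > 9e-14 * abs(R[0, 0])))
Q, R = Q[:, :rankA], R[:rankA, :]

# A second QR on R' completes the orthogonal decomposition, turning the
# rankA-by-n trapezoid into a triangle T times orthonormal rows Z.
lo = rankA < n
if lo:
    Zq, Rt = qr(R.T, mode='economic')
    T, Z = Rt.T, Zq.T        # T: rankA x rankA lower triangle, Z: rankA x n
else:
    T, Z = R, np.eye(n)

z = Q.T @ b
x = np.zeros(n)
x[perm] = Z.T @ solve_triangular(T, z, lower=lo)   # minimum-norm solution

mse = (np.linalg.norm(b)**2 - np.linalg.norm(z)**2) / (m - rankA)
stdx = np.zeros(n)
for i in range(n):
    stdx[perm[i]] = np.linalg.norm(solve_triangular(T, Z[:, i], lower=lo)) * np.sqrt(mse)

print("rankA =", rankA)
print("x     =", x)
print("mse   =", mse)
print("stdx  =", stdx)
```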
Julien Langou

Posts: 830
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA 