pseudo inverse

Postby Gordon » Mon Mar 20, 2006 5:01 am

Dear LAPACK users,

I am trying to compute the pseudo-inverse of a general MxN matrix using CLAPACK; however, for a 3x3 test matrix the result differs both from the matrix's inverse and from what MATLAB's pinv() function returns. I know about the problems of explicitly forming the pseudo-inverse; however, I need it.

I attach the code for my function and hope that someone can spot the error, since I cannot...

bool pseudoInverse(float **matrix, int m, int n) {

    int info;
    int LDWORK = -1;
    float *WORK = new float[1];
    int NRHS = 0;
    int LDA = m;
    float *B = new float[m];
    for(int i = 0; i < m; i++)
        B[i] = 0.0;
    int LDB = m;

    float *SING = new float[(int)(min(m,n))];
    float RCOND = -1.0f;
    int IRANK = -1;

    // 1. compute optimal size of work array
    // (note: sgelss_ expects one contiguous column-major array, so
    // &matrix[0][0] is only valid if all rows were allocated as one block)
    int res = sgelss_( &m, &n, &NRHS, &matrix[0][0], &LDA, B, &LDB, SING, &RCOND, &IRANK, WORK, &LDWORK, &info);

    if(info)
        return false;

    // 2. compute pseudo inverse
    LDWORK = (int)WORK[0];
    delete [] WORK;
    WORK = new float[LDWORK];
    res = sgelss_( &m, &n, &NRHS, &matrix[0][0], &LDA, B, &LDB, SING, &RCOND, &IRANK, WORK, &LDWORK, &info);

    if(info)
        return false;

    delete [] B;
    delete [] WORK;
    delete [] SING;
    return true;
}

I appreciate any comment.

Kind regards,

Gordon Wetzstein
Faculty of Media
Bauhaus-University Weimar

Postby Julien Langou » Mon Mar 20, 2006 12:27 pm

Hello,

The 'matrix' array returned by your code is not related in any way to the pseudo-inverse.

If you want to compute the pseudo-inverse of an m-by-n matrix A using _gelss,
first initialize the block of right-hand sides B to the m-by-m identity matrix, then
call _gelss mostly as your code does (just set NRHS=m);
finally, the pseudo-inverse of 'matrix' is the n-by-m matrix made of the first n rows of B.

The content of 'matrix' has been changed, but it is in no way the pseudo-inverse of A.

_gelss with input A an m-by-n matrix and B an m-by-nrhs block of vectors solves nrhs least squares problems to obtain Y,
an n-by-nrhs block of vectors, such that:

for i=1,...,nrhs
min _{y_i\in R^{n}} || A y_i - b_i ||_2,

or in other words:

min _{Y\in R^{n\times nrhs}} || A Y - B ||_F,

where || . ||_F is the Frobenius norm.

If you set B to the identity matrix of order m in _gelss, then you are solving

min _{Y\in R^{n\times m}} || AY - I_m ||_F.

The unique minimum-norm solution of this problem (Y, an n-by-m matrix) is the pseudo-inverse of A.
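
For illustration, here is a minimal sketch of this recipe in the same CLAPACK style as the code above (the helper name pinv_gelss and the use of std::vector are illustrative choices, not part of the posted code; B is allocated with max(m,n) rows, which also covers the m < n case discussed further down in this thread):

#include <algorithm>
#include <vector>

extern "C" {
int sgelss_(int *m, int *n, int *nrhs, float *a, int *lda, float *b,
            int *ldb, float *s, float *rcond, int *rank,
            float *work, int *lwork, int *info);
}

// Computes pinv(A) for an m-by-n column-major matrix A; the n-by-m result
// is returned column-major in 'pinv'. A is taken by value because sgelss_
// destroys its input.
bool pinv_gelss(std::vector<float> A, int m, int n, std::vector<float> &pinv)
{
    int nrhs = m, lda = m, ldb = std::max(m, n);
    int rank, info, lwork = -1;
    float rcond = -1.0f, wsize;
    std::vector<float> s(std::min(m, n));

    // B is ldb-by-m: the m-by-m identity on top, zero padding below.
    std::vector<float> B((size_t)ldb * m, 0.0f);
    for (int i = 0; i < m; i++)
        B[i + (size_t)i * ldb] = 1.0f;

    // workspace query, then the actual solve
    sgelss_(&m, &n, &nrhs, A.data(), &lda, B.data(), &ldb, s.data(),
            &rcond, &rank, &wsize, &lwork, &info);
    if (info) return false;
    lwork = (int)wsize;
    std::vector<float> work(lwork);
    sgelss_(&m, &n, &nrhs, A.data(), &lda, B.data(), &ldb, s.data(),
            &rcond, &rank, work.data(), &lwork, &info);
    if (info) return false;

    // the pseudo-inverse is the first n rows of B
    pinv.assign((size_t)n * m, 0.0f);
    for (int j = 0; j < m; j++)
        for (int i = 0; i < n; i++)
            pinv[i + (size_t)j * n] = B[i + (size_t)j * ldb];
    return true;
}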

Julien.

Postby Gordon » Mon Mar 20, 2006 4:32 pm

Hello Julien,

Thanks for your quick answer; that makes perfect sense to me. However, it would only work if m >= n, right? This represents a linear least squares solution to an overdetermined system, but what happens if the system of linear equations is underdetermined?

I am no mathematician, so my knowledge here is quite limited. However, I need to compute the pseudo-inverse of a general MxN matrix, no matter whether M>N or vice versa. Can I use gelss_ in this case as well?

Kind regards,
Gordon

Postby Julien Langou » Mon Mar 20, 2006 5:30 pm

Yes, this is a good question; it is not that simple to see all of this clearly.

So the short answer is: yes, it will work :)

For any M and N (you can have M>N, M<N, or M=N), if you apply _gelss to A of size M-by-N with B set to the identity matrix of order M (stored in an array with max(M,N) rows, zero-padded when N > M), the first N rows of the output B form the N-by-M pseudo-inverse of A.

One way to see this is just to say that _gelss with input A of size M-by-N and B of size M-by-NRHS first performs the SVD of A:
(*) A = U . S . V^T
where U is of size M-by-min(M,N),
S is of size min(M,N)-by-min(M,N),
and V^T is of size min(M,N)-by-N.
It then performs
(**) X <- V . inv(S) . U^T . B
where X is of size N-by-NRHS.

This is true whatever the values of M and N are.
If M >= N, then each x_i for i=1,...,NRHS is the minimum-norm least squares solution of:
min _{x_i \in R^N} || b_i - A x_i ||_2.
If M < N, then each x_i for i=1,...,NRHS is the minimum-norm solution of:
min _{x_i \in R^N such that A x_i = b_i} || x_i ||_2.

The pseudo-inverse of A of size M-by-N with SVD as in (*) can be defined as
(***) pinv(A) = V . inv(S) . U^T.
This too is independent of the relative sizes of M and N.

Using (**) and (***), you see that pinv(A) can be obtained by calling _gelss with B the identity matrix of order M, independently of which of M and N is the bigger.

Postby Gordon » Tue Mar 21, 2006 10:51 am

Thanks Julien,

I get the same results as pinv() in MATLAB by setting B to the identity matrix of order max(M,N), no matter whether M>=N or M<N.

Another more theoretical question:

In several experiments I found that computing x as pinv(A)*b agrees with solving the linear system (A\b) only for overdetermined systems; otherwise A\b returns different results than pinv(A)*b. This makes sense, because an underdetermined system has infinitely many solutions. In this case, does it actually make sense to compute a solution using the pseudo-inverse? (I hope yes, otherwise it would destroy my plans :])

Cheers Gordon

Postby Julien Langou » Tue Mar 21, 2006 5:14 pm

In several experiments I found that computing x as pinv(A)*b agrees with
solving the linear system (A\b) only for overdetermined systems; otherwise
A\b returns different results than pinv(A)*b. This makes sense, because an
underdetermined system has infinitely many solutions. In this case, does it
actually make sense to compute a solution using the pseudo-inverse? (I hope
yes, otherwise it would destroy my plans :])


If you are using Matlab or Scilab (for example), then yes, you are correct.

Matlab's A\b and pinv(A)*b are different in the underdetermined case (A M-by-N with M<N). It is important to know and understand the difference between pinv(A)*b and A\b. (OK, all is relative, there might be other things more important to know in life :).

In the underdetermined case (A is M-by-N with M < N), Matlab is not using the LAPACK routines dgels (nor dgelss, dgelsd, or dgelsy). Matlab has its own approach and a different philosophy than LAPACK for the underdetermined problem. As you said, in the underdetermined problem there are infinitely many x such that Ax=b, so now the question is: which one do you pick?

LAPACK and pinv(A)*b pick the x such that:

(+) min ||x||_2 s.t. Ax=b,

which is in general the standard choice. For example, (+) implies that
x \in Im(A^T),
or in other words
x \perp Null(A),
i.e. x has just what it needs to make b, with no unnecessary contribution from the null space.

Now, what does Matlab's \ do? (You can get a hint by reading 'help mldivide', but here is a more detailed explanation.)

First, a QR factorization with column pivoting of the initial matrix A is performed:

A*E = Q*R

then the problem is truncated to the first M columns of E. There is a change of variable from x (of size N) to y (of size M) with the relation:

(*) x = E(:,1:M) * y

and the underdetermined linear problem Ax=b becomes the square problem:

(**) (A*E(:,1:M)) * y = b

which is solved using the QR factorization:

(***) y = R(:,1:M) \ ( Q^T * b )

and so, combining (*) and (***):

x = E(:,1:M) * ( R(:,1:M) \ ( Q^T * b ) )

In Matlab or Scilab, just try:
m=10; n=25; A = randn(m,n); b=rand(m,1); [Q,R,E]=qr(A); [A\b E(:,1:m)*(R(:,1:m)\(Q'*b))]
(this is called reverse engineering :)

To solve the minimum 2-norm problem (+), LAPACK either performs an SVD
(see _gelss and _gelsd) or adds a set of orthogonal transformations from
the right after the QR factorization step (see _gels and _gelsy); these
are more costly approaches than the one implemented in Matlab or Scilab.

So the bottom line is that Matlab's and Scilab's \ is neither LAPACK's _gels nor pinv(A)*b. They do solve the underdetermined problem Ax=b, and their approach is faster than LAPACK's, but they do not provide the minimum 2-norm solution.

It is a question of taste. And yes, you had better be aware of it.

Julien

Postby Gordon » Mon Apr 03, 2006 3:43 am

Thanks for the explanation.

If you don't mind, I have another question:

Let's say I work with real-world measurements that build the matrix of my equation system. Solving the system (whether by computing the pseudo-inverse or not) yields a numerical solution containing real values. Since I cannot physically produce negative quantities, I would like to know whether it is possible to compute an optimal solution containing only non-negative values. I could simply set all negative values to zero, but I do not believe that is the optimal solution.

To make the question more general: is it possible to solve a constrained linear equation system such that the solution contains no negative values? Can this approach also be implemented using (C)LAPACK to compute a non-negative pseudo-inverse?

Regards Gordon

Postby Julien Langou » Mon Apr 03, 2006 4:12 pm

Hello, the last time I needed to solve this kind of problem I used
http://hesperia.gsfc.nasa.gov/~schmahl/nnls
It worked fine for me; that's all I can say.
Julien

Postby thombos » Tue Nov 27, 2007 2:20 am

I have a related question. This approach works well for small matrices, but when it comes to large matrices that are *very* rectangular, the amount of memory required for the intermediate matrix B can be extremely prohibitive. Is there a LAPACK routine that can do this without requiring a MAX(M,N)-by-MAX(M,N) matrix?

Postby thombos » Thu Nov 29, 2007 7:18 pm

I have now decided to just compute the reduced SVD using cgesdd (or cgesvd) and explicitly form the pseudo-inverse V*(S^-1)*U^H, which in my case is 100 times faster than cgelss_. It's stunning how big the difference in timing is. Is this normal? The matrices to pseudo-invert are 15000x640.

Thomas

Edit 1: I just noticed the comparison is between two machines, and one of them (the one that ran cgelss) has only the reference LAPACK implementation. I'll update with accurate numbers...

Postby Julien Langou » Thu Nov 29, 2007 9:39 pm

So let's recap:

You start from an m-by-n matrix A.
I assume A is full rank (rank(A) = n); if this is not the case, please correct me.
You want to compute pinv(A).

Now you have two methods:

Method 1: you allocate an m-by-m matrix B (ouch in memory), set B to the
m-by-m identity matrix, and then call _GELSS to solve A.X = B (ouch in
time). The solution X is in the first n rows of B, and you say
pinv(A) = B(1:n,1:m).
(In Matlab: pinv_A = A \ eye(m); )

Method 2: you compute the reduced SVD of A using _GESVD or _GESDD
and then say pinv(A) = V*(inv(S))*U^H.
(In Matlab: [U,S,V] = svd(A,0); pinv_A = ( V / S ) * U'; )

Method 2 seems much more reasonable than method 1.

In terms of memory,
method 2 is ~ mn + n^2,
method 1 is ~ m^2 + mn.
When m >> n that makes a big difference.

In terms of FLOPS,
method 1 is ~ flops( SVD ) + flops( apply U^H to B ) + flops( apply V )
~ O(mn^2) + O(m^2 n) + O(mn^2)
~ O(m^2 n) + O(mn^2),
method 2 is ~ flops( SVD ) + flops( compute (V/S)*U' )
~ O(mn^2).
When m >> n that makes a big difference.
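
For concreteness, here is a sketch of method 2 in the same CLAPACK style as the code earlier in this thread (pinv_svd is a made-up name; the tolerance for dropping small singular values mimics pinv() and is an illustrative choice):

#include <algorithm>
#include <vector>

extern "C" {
int sgesvd_(char *jobu, char *jobvt, int *m, int *n, float *a, int *lda,
            float *s, float *u, int *ldu, float *vt, int *ldvt,
            float *work, int *lwork, int *info);
int sgemm_(char *transa, char *transb, int *m, int *n, int *k,
           float *alpha, float *a, int *lda, float *b, int *ldb,
           float *beta, float *c, int *ldc);
}

// Method 2: pinv(A) = V * inv(S) * U^T from the reduced SVD.
// A is m-by-n column-major (taken by value since sgesvd_ destroys it);
// the n-by-m result is returned column-major in 'pinv'.
bool pinv_svd(std::vector<float> A, int m, int n, std::vector<float> &pinv)
{
    int k = std::min(m, n), lda = m, ldu = m, ldvt = k, info, lwork = -1;
    char jobu = 'S', jobvt = 'S'; // reduced ("economy size") SVD
    float wsize;
    std::vector<float> s(k), U((size_t)m * k), VT((size_t)k * n);

    sgesvd_(&jobu, &jobvt, &m, &n, A.data(), &lda, s.data(), U.data(), &ldu,
            VT.data(), &ldvt, &wsize, &lwork, &info); // workspace query
    if (info) return false;
    lwork = (int)wsize;
    std::vector<float> work(lwork);
    sgesvd_(&jobu, &jobvt, &m, &n, A.data(), &lda, s.data(), U.data(), &ldu,
            VT.data(), &ldvt, work.data(), &lwork, &info);
    if (info) return false;

    // scale row i of V^T by 1/s(i), dropping singular values below a
    // pinv()-style tolerance; the scaled VT then holds inv(S) * V^T
    float tol = std::max(m, n) * s[0] * 1.2e-7f; // ~ max(m,n) * s(1) * eps
    for (int i = 0; i < k; i++) {
        float inv_si = (s[i] > tol) ? 1.0f / s[i] : 0.0f;
        for (int j = 0; j < n; j++)
            VT[i + (size_t)j * ldvt] *= inv_si;
    }

    // pinv = (inv(S)*V^T)^T * U^T, an n-by-m product
    char tA = 'T', tB = 'T';
    float one = 1.0f, zero = 0.0f;
    pinv.assign((size_t)n * m, 0.0f);
    sgemm_(&tA, &tB, &n, &m, &k, &one, VT.data(), &ldvt, U.data(), &ldu,
           &zero, pinv.data(), &n);
    return true;
}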

If A has full column rank (and is not too ill-conditioned), a standard method would be:
- allocate an n-by-n buffer for R
- call DGEQRF on A: the output is in A
- copy the upper n-by-n triangular part of A into R
(put whatever you want in the lower part)
- call DORGQR on A: the output (in A) is Q.
At this point you have Q and R such that A = QR and Q^H Q = I.
- call DTRSM on Q (in A) and R to compute pinv_A <- inv(R)*Q^H.

In Matlab this gives:
[Q,R] = qr(A,0); pinv_A = R\Q';
The algorithm is still O(mn^2) flops but the constant should be slightly
better.

For m >> n, the cost of this method is ~ 5mn^2:
DGEQRF ~ 2mn^2, DORGQR ~ 2mn^2, and DTRSM ~ mn^2. With the SVD
approach the cost should be ~ 6mn^2 (this is for m >> n, and I am
neglecting lots of n^3 terms).
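
And here is a sketch of this QR recipe in the same style (pinv_qr is a made-up name; it assumes m >= n and rank(A) = n, and does a separate workspace query for SORGQR since its optimal size can differ from SGEQRF's):

#include <vector>

extern "C" {
int sgeqrf_(int *m, int *n, float *a, int *lda, float *tau,
            float *work, int *lwork, int *info);
int sorgqr_(int *m, int *n, int *k, float *a, int *lda, float *tau,
            float *work, int *lwork, int *info);
int strsm_(char *side, char *uplo, char *transa, char *diag,
           int *m, int *n, float *alpha, float *a, int *lda,
           float *b, int *ldb);
}

// QR variant: pinv(A) = inv(R) * Q^T, valid when m >= n and rank(A) = n.
// A is m-by-n column-major (taken by value since it is overwritten);
// the n-by-m result is returned column-major in 'pinv'.
bool pinv_qr(std::vector<float> A, int m, int n, std::vector<float> &pinv)
{
    int info, lwork = -1, lda = m;
    float wsize;
    std::vector<float> tau(n), R((size_t)n * n, 0.0f);

    sgeqrf_(&m, &n, A.data(), &lda, tau.data(), &wsize, &lwork, &info);
    if (info) return false;
    lwork = (int)wsize;
    std::vector<float> work(lwork);
    sgeqrf_(&m, &n, A.data(), &lda, tau.data(), work.data(), &lwork, &info);
    if (info) return false;

    // copy the upper n-by-n triangle (R) out of A before A becomes Q
    for (int j = 0; j < n; j++)
        for (int i = 0; i <= j; i++)
            R[i + (size_t)j * n] = A[i + (size_t)j * lda];

    // generate the m-by-n Q in place (separate workspace query)
    lwork = -1;
    sorgqr_(&m, &n, &n, A.data(), &lda, tau.data(), &wsize, &lwork, &info);
    if (info) return false;
    lwork = (int)wsize;
    if ((int)work.size() < lwork) work.resize(lwork);
    sorgqr_(&m, &n, &n, A.data(), &lda, tau.data(), work.data(), &lwork, &info);
    if (info) return false;

    // pinv <- Q^T (explicit n-by-m transpose) ...
    pinv.assign((size_t)n * m, 0.0f);
    for (int j = 0; j < m; j++)
        for (int i = 0; i < n; i++)
            pinv[i + (size_t)j * n] = A[j + (size_t)i * lda];

    // ... then pinv <- inv(R) * Q^T via a triangular solve
    char side = 'L', uplo = 'U', trans = 'N', diag = 'N';
    float one = 1.0f;
    strsm_(&side, &uplo, &trans, &diag, &n, &m, &one,
           R.data(), &n, pinv.data(), &n);
    return true;
}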

Need to check all this at some point ....

Julien.

Postby thombos » Thu Nov 29, 2007 9:47 pm

Thanks for the nice summary. I didn't bother looking at the flops, but this makes the speed difference pretty obvious.

Thomas

Postby uliana » Wed May 12, 2010 7:07 pm

Hi Gordon,

I'm new to LAPACK, and I need to do the same task (compute the pseudo-inverse).
Which includes do you have in your program?


Thank you,
Ula

Postby rvishwa83 » Thu Jan 13, 2011 10:59 pm

Hello,

I have slightly modified the above code, but for some reason it does not give me the pseudo-inverse. Could you please check and let me know where I am going wrong?

Regards,
Vish
#include <iostream>
#include "clapack.h"
#include "f2c.h"

using namespace std;

extern "C"{
int sgelss_(integer *m, integer *n, integer *nrhs, real *a,
   integer *lda, real *b, integer *ldb, real *s, real *rcond, integer *
   rank, real *work, integer *lwork, integer *info);
}

int main()
{
   integer n, M, N;

    // Read the values of the matrix
   cout << "Enter n - no of linear equations\n";
   cin >> n;
   cout << "\n\nOn each line type a row of the matrix A\n";

   M=n;
   N=n;

   // NOTE: NRHS = 0 tells sgelss_ to solve zero right-hand sides, so B is
   // never used; following the recipe above, NRHS must equal M.
   integer info, LDWORK = -1, NRHS = 0, LDA = M, LDB = M, IRANK = -1;
   float *WORK;
   WORK = new float[1];
   float *SINGV;
   SINGV = new float[min(M,N)];
   float RCOND = -1.0f;

   real** matrix=new real*[LDA];
   for (int row=0; row<LDA; row++)
      matrix[row]=new real[n];   

   for(int i = 0; i < n; i++)
   {
     cout << "Row " << i << " ";
     for(int j = 0; j < n; j++)
        cin >> matrix[i][j];
   }   

   /* Local arrays */
   // NOTE: this loop fills 'a' row by row, but LAPACK expects column-major
   // storage, so 'a' actually holds the transpose of the matrix entered.
   float *a = new float[M * N];
   int tmp = 0;
   for(int i = 0; i<M; i++)
      for(int j = 0; j<N; j++)
      {
         a[tmp] = matrix[i][j];
         tmp++;
      }

   /* Create an identity matrix of size M x M */
   float *b = new float[M*M];
   tmp = 0;
   for(int i = 0; i<M; i++)
      for(int j = 0; j<M; j++)
      {
         if(i == j)
            b[tmp] = 1.0;
         else
            b[tmp] = 0.0;
         tmp++;
      }
 
   /* Query and allocate the optimal workspace */
   int res = sgelss_( &M, &N, &NRHS, &a[0], &LDA, &b[0], &LDB, SINGV, &RCOND, &IRANK, WORK, &LDWORK, &info);
   if(info)
      return 0;

   /* compute pseudo inverse */
   LDWORK = (int)WORK[0];
   delete [] WORK;
   WORK = new float[LDWORK];

   res = sgelss_( &M, &N, &NRHS, &a[0], &LDA, &b[0], &LDB, SINGV, &RCOND, &IRANK, WORK, &LDWORK, &info);

   if(info!=0)
      cout << "sgelss returned error " << info << "\n";
   else
   {
      // Write the answer
      // NOTE: sgelss_ returns the solution column-major with leading
      // dimension LDB, so element (i,j) of the answer is b[i + j*LDB];
      // b[(i*n)+j] prints its transpose.
      cout << "The answer is\n";
      for(int i=0;i<n;i++)
      {
         cout<<"\nRow"<<i<<" ";
         for(int j=0;j<n;j++)
         {
            cout<<b[(i*n)+j]<<"\t";
         }
      }
   }

   delete [] b;
   delete [] WORK;
   delete [] SINGV;

   system("PAUSE");
   return info;
}


Pseudo inverse of complex matrix

Postby nawras » Thu Sep 22, 2011 10:47 am

Dear LAPACK users,

I am wondering if anyone could help me. I need to find the pseudo-inverse of a rectangular complex matrix (n x m), with n >= m. I would appreciate it very much if anyone could point me to Fortran 77 code capable of doing that.

Thanks very much for your time and help.

Nawras

