DGEEVX and left eigenvectors

Open discussion regarding features, bugs, issues, vendors, etc.

DGEEVX and left eigenvectors

Postby dsigg » Tue Dec 19, 2006 3:28 pm

Dear forum users,

I've come across an inexplicable problem with the DGEEVX eigenvalue routine. My interest is in tracer kinetics, where I perform an eigenvalue decomposition for nonsymmetric rate matrices (A) ranging in size from 3x3 to 200x200. I require left (vL) and right (vR) eigenvectors, so that, after normalization, the decomposition matrices can be found (E = vR*vL).

DGEEVX performs well with right eigenvectors, but I have found left eigenvectors to be in error if complex eigenvalues are involved. If all eigenvalues are real, there is no problem.

Here is a simple example:

A =
[-1,1,0]
[0,-1,1]
[1,0,-1]

DGEEVX output is (rounded to three significant figures):

Eigenvalues: (-1.5 + i0.866, -1.5 - i0.866, 0)

The following uses these substitutions: a = 0.289; b = 0.5; c = 0.577.

The (correct) right eigenvectors are (in COLUMN-MAJOR form):

vR =
[-a + bi, -a - bi, -c]
[-a - bi, -a + bi, -c]
[ c , c , -c]

The inverse of vR gives the correct left eigenvectors (in ROW-MAJOR form):

inv(vR) =
[-a - bi, -a + bi, c]
[-a + bi, -a - bi, c]
[-c , -c , -c]

However, this is the (incorrect) output from DGEEVX (in ROW-MAJOR form):

vL =
[-a - bi, c, -a + bi]
[-a + bi, c, -a - bi]
[-c , -c, -c ]

I call DGEEVX from a C program, but I use the correct FORTRAN calling sequence for matrices, and in any case that doesn't explain the results, as I don't think this is a case of a simple transposition (rather, two columns are exchanged). Also, the program works beautifully if the eigenvalues are real.

I can hardly believe what I am seeing, as I would think that DGEEVX is among the more widely used routines, but I cannot find an explanation despite close scrutiny of my code and tinkering with options for DGEEVX.

I would greatly appreciate it if someone could replicate the computation and tell me either that I am wrong or that there is a bug in obtaining complex left eigenvectors.
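
In case it helps, here is a quick Octave/MATLAB sketch (independent of my C code) that reproduces the reference values above by computing the eigendecomposition and taking inv(vR) directly:

Code: Select all
% Sanity check (Octave/MATLAB), independent of the C/DGEEVX program:
% build A, compute the right eigenvectors, and take inv(VR) as the
% reference left eigenvectors (one per row).
A = [-1, 1, 0;
      0,-1, 1;
      1, 0,-1];
[VR, LAMBDA] = eig(A);     % columns of VR are right eigenvectors
VLref = inv(VR);           % rows of VLref are a valid set of left eigenvectors
diag(LAMBDA)               % expect -1.5 +/- 0.866i and 0 (in some order)
VR
VLref
% both residuals should be of the order of machine precision
norm(A*VR - VR*LAMBDA)
norm(VLref*A - LAMBDA*VLref)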

Thanks,

- Dan
dsigg
 
Posts: 2
Joined: Tue Dec 19, 2006 2:19 pm
Location: Spokane, WA

Postby Julien Langou » Tue Dec 19, 2006 8:23 pm

Hello,

Everything's fine with your output from LAPACK, it's just a matter of
interpretation. Explanation in the three points below.

First, and as a general statement, it's not a good idea to give the matrices
in various formats: some in row major, some in column major. In particular when
those matrices are complex ... the row-major version is the transpose of
the matrix, but what makes more sense in general is to speak about the
conjugate transpose. All in all, this messes things up. This is a personal point
of view; you seem fairly fluent with all this, so that's fine with me,
but I'd personally rather have seen VL in COLMAJOR format.

Second, the definition of the left eigenvectors is that x is a left
eigenvector of A if and only if x is nonzero and there exists a lambda such that
x^H * A = (lambda) * x^H, or if you prefer
A^T * x = x * conj(lambda). This is for your case, where
A is real and some eigenvalues are complex. So you have to be careful. When
the eigenvectors (or equivalently the eigenvalues) are complex conjugate and A
is diagonalizable, this means for example that the relation between the left
eigenvector basis and the right eigenvector basis is
VL = inv(VR)^H.

(You do not have an ^H in your formula, which makes me wonder whether we have the
same definition for the left eigenvectors of a matrix. This kind of explains
why you were happy with the ROWMAJOR version of VL when the
eigenvalues were real and not happy when they were complex.)

Third and final point: keep in mind that each vector of VR (or VL) is defined
only up to multiplication by a scalar of modulus 1 (since they are normalized by
LAPACK). When an eigenvector is real this simply means multiplication by
+1 or -1, so checking them is easy and can be done at a glance. When the
eigenvectors are complex, it's a bit less trivial to check.

So, putting these points together for your case, take your vL in ROWMAJOR
form and perform the following steps:

1- transpose vL (simply transpose, do not conjugate) to get the COLMAJOR
format; you get:

Code: Select all
vL = [
-sqrt(3)/6 - i*1/2 , -sqrt(3)/6 + i*1/2 , -sqrt(3)/3 ;
+sqrt(3)/3         , +sqrt(3)/3         , -sqrt(3)/3 ;
-sqrt(3)/6 + i*1/2 , -sqrt(3)/6 - i*1/2 , -sqrt(3)/3
];


2- check that vL is OK: || VL^H * A - D * VL^H || / || A || should be of the order of machine precision

3- then you can compute inv(vR)^H, you get:

Code: Select all
inv(vR)^H = [
-sqrt(3)/6 + i*1/2 , -sqrt(3)/6 - i*1/2 , -sqrt(3)/3 ;
-sqrt(3)/6 - i*1/2 , -sqrt(3)/6 + i*1/2 , -sqrt(3)/3 ;
+sqrt(3)/3         , +sqrt(3)/3         , -sqrt(3)/3
];


you can check that inv(vR)^H is a correct basis for the left eigenvectors
since || inv(VR) * A - D * inv(VR) || / || A || is of the order of machine precision.

inv(vR)^H and vL are obviously not the same but
- if you scale the first column of inv(vR)^H by exp(2*i*pi/3) you get the first column of vL,
- if you scale the second column of inv(vR)^H by exp(-2*i*pi/3) you get the second column of vL,
- if you scale the third column of inv(vR)^H by 1 you get the third column of vL.
So everything's fine.

The following code might make more sense than what I wrote:
Code: Select all
  clear
%
   A = [-1,1,0;0,-1,1;1,0,-1] ;
%
   D = diag([-3/2 + i*sqrt(3)/2, -3/2 - i*sqrt(3)/2, 0 ]);
   a = sqrt(3)/6; b =1/2; c=sqrt(3)/3;
   VL_ROWMAJOR = [ -a-b*i , c , -a+b*i ; -a+b*i , c , -a-b*i ; -c , -c , -c ] ;
   VL = (VL_ROWMAJOR).';
%
%  check that VL is a basis for the left eigenvectors associated with D
%
   norm(VL'*A-D*VL')/norm(A)
%
%  check that inv(VR)^H is a basis for the left eigenvectors associated with D
%
   VR = [ -a+b*i , -a-b*i , -c ; -a-b*i , -a+b*i , -c ; c , c , -c ];
   VL2 = inv(VR)';
   norm(VL2'*A-D*VL2')/norm(A)
%
%  check that VL and VL2 are the same up to a scaling with a scalar of modulus
%  1
   norm(VL - VL2*diag([ exp(2*pi*i/3) exp(-2*pi*i/3) 1] ) )



Julien
Julien Langou
 
Posts: 734
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

Postby Julien Langou » Tue Dec 19, 2006 8:48 pm

By the way, I did not know whether the matrix you gave as an example is typical
of the matrices you need to deal with. But for information, your matrix is normal. (The
definition is A^T A = A A^T.) This means in particular that A is diagonalizable
in an orthonormal basis, and therefore that the left eigenvectors and the right
eigenvectors are the same (yes, up to scaling by a scalar of modulus 1 ...). This simply
comes from: inv(VR) = VR' (VR unitary) and VL = inv(VR)' (the relation between the left
and right eigenvector bases), which gives VL = VR.

So for this particular matrix you might have wanted to just compute VR and therefore
save the computation of VL. Then compute E=VR^2.

Well, there are a couple of snags in general: first, this works only for normal
matrices; second, I have implicitly assumed that the eigenvalues are clearly
distinct. If there is a multiple eigenvalue then DGEEV or DGEEVX will not
return an orthonormal basis for the associated subspace (a priori), it will
just return a basis. So this means that you need to reorthogonalize all this
by yourself.
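
For instance, a minimal Octave/MATLAB sketch of that reorthogonalization, assuming you have already collected in an index vector 'cluster' (a name I am making up here) the columns of VR that belong to the multiple eigenvalue:

Code: Select all
% Sketch only: replace the eigenvector basis of one invariant subspace
% (columns listed in 'cluster') by an orthonormal basis of its span.
% Assumes 'VR' and the index vector 'cluster' already exist.
[Q, R] = qr(VR(:, cluster), 0);   % economy-size QR of the cluster columns
VR(:, cluster) = Q;               % same invariant subspace, now orthonormal columns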

So if your matrices are all normal, well, maybe you want to let us know. This
is an interesting problem.

Julien.
Julien Langou
 
Posts: 734
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

Postby dsigg » Wed Dec 20, 2006 9:06 pm

Thanks very much, Julien, for your prompt response. I need to study it a bit more when I have some free time, but I'd like to respond with my first impressions.

It seems I don't really know what a left eigenvector is. You surprised me with the equation: x^H * A = (lambda) * x^H. Does this mean that in order to apply the output VL to A one must take the Hermitian adjoint? I am missing something here.

Maybe it would help to lay out my understanding of the basic problem. I need to expand exp(At). A is a rate matrix describing unimolecular reactions; it has positive off-diagonal elements (rate constants) and negative diagonal elements that are less than or equal to the negative sum of the respective column's rate constants (equality in a closed system, strict inequality in an open system). I don't believe the matrices must be normal in general, though the particular example I gave you (unidirectional circulation among three compartments) happens to be.

To proceed, I diagonalize A: inv(M)*A*M = LAMBDA, which allows me to expand A in terms of the decomposition matrices E = vR*vL, where the vR are the columns of M and the vL are the rows of inv(M). This explains my use of the COLMAJOR and ROWMAJOR conventions. The crucial question for me is: is inv(M) equal to VL or VL^T, the output of DGEEVX? It does not appear so for complex eigenvalues in my hands. Stated more directly, are the rows vL of inv(M) the left eigenvectors? I've always called them that, which is why I was surprised to find that VL^T != inv(VR), where VL and VR are the outputs from DGEEVX in ROWMAJOR format.

By equating M with the right eigenvectors and computing inv(M), using complex arithmetic if necessary, I've had success in solving problems. However, I'm looking for a faster and more accurate method of obtaining inv(M).
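
To make this concrete, here is an Octave/MATLAB sketch of the calculation I have in mind (a simplified illustration, not my actual C code), using the 3x3 example from my first post:

Code: Select all
% Sketch of my approach: expand exp(A*t) with the decomposition matrices
% E_k = vR_k * vL_k, where vR_k is column k of M and vL_k is row k of inv(M).
A = [-1, 1, 0; 0, -1, 1; 1, 0, -1];   % example rate matrix from my first post
[M, LAMBDA] = eig(A);
Minv = inv(M);
t = 0.7;                               % arbitrary time point
P = zeros(size(A));
for k = 1:size(A, 1)
    Ek = M(:, k) * Minv(k, :);         % k-th decomposition matrix
    P  = P + exp(LAMBDA(k, k) * t) * Ek;
end
norm(real(P) - expm(A * t))            % should be of the order of machine eps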

I hope this clarifies things from my end. I will look at your posts more closely and probably find out where I went wrong. Thanks again.

- Dan
dsigg
 
Posts: 2
Joined: Tue Dec 19, 2006 2:19 pm
Location: Spokane, WA

Postby Julien Langou » Fri Dec 22, 2006 5:15 pm

dsigg wrote: It seems I don't really know what a left eigenvector is. You surprised me with
the equation: x^H * A = (lambda) * x^H.


This is the standard definition. You can find it in any textbook.

dsigg wrote: Does this mean that in order to apply the output VL to A one must take the
Hermitian adjoint?


Yes, if you want to apply the output VL from the left to A, you need to use the
Hermitian adjoint (in order for it to make sense ...). Or you can apply
the output VL from the right to A^H. (Since your matrix A is real, this means
A^T.) But then it's not lambda but conj(lambda). Either you view it as
x^H * A = (lambda) * x^H,
or as
A^H * x = x * conj(lambda),
where x is a column of VL.

dsigg wrote: The crucial question for me is: is inv(M) = VL or VL^T.


Let's rename your M matrix VR to make things clearer. The relation is:
(1) inv(VR) = VL^H.
But the '=' sign is not 'exactly exact' .... It's slightly overstated. Say rather
that inv(VR) is a valid left eigenvector basis. The problem is that there is no
uniqueness of an eigenvector basis when a matrix is diagonalizable.

Just a few lines to convince you about Eq. (1).

From the right eigenvector decomposition we have:
(2) A * VR = VR * D,
where A is a real diagonalizable n-by-n matrix, VR is a basis of
right eigenvectors (they might be complex), and D is a diagonal complex n-by-n
matrix. VR and D are outputs from DGEEV (or DGEEVX). Fine.
Multiplying on the left and on the right by inv(VR), you get:
(3) inv(VR) * A = D * inv(VR).
Then you recall the definition of the left eigenvectors:
(4) VL^H * A = D * VL^H.
So this implies that you can take inv(VR) as your VL^H, which is Eq. (1).

I think all this is pretty clear. Now there is an important subtlety. You
need to keep in mind that Eq. (2) and Eq. (4) do not define VR and VL uniquely.

Keep in mind the following reasons:

(1) there is the order of the eigenvalues in the matrix D. Well in LAPACK we
ensure that the order is the same for VL and for VR. (There is just one D.) So
this is not an issue.

(2) in any case you can always multiply a column of VR, or a column of VL, by a
constant. Since we impose our vectors to be of norm 1, the constant can only be
+/- 1 in the case where your eigenvector is real, or any complex number of the form
e^(i\theta) (a complex number of modulus 1) in the case where your eigenvector is complex.
Consequently, LAPACK does not guarantee that inv(VL) and VR^H are the same, and in
general those two matrices are not the same.

(3) Now you also have the case of multiple eigenvalues. If an eigenvalue is
multiple then you have an invariant subspace, and any basis of this invariant
subspace (an infinite number of possibilities) is a valid solution and satisfies Eq.
(2) (or Eq. (4)). So if your
eigenvalues are distinct, this case cannot happen. (And keep in mind that,
working in finite precision arithmetic, distinct means that you would like the
relative gap between your eigenvalues to be fairly large.)

What's the conclusion of all this? It looks like for your application you really
want inv(VR), and VL alone is not enough. Even if your eigenvalues are distinct,
LAPACK will not guarantee that
VL^H * VR = I.
But (if the eigenvalues are distinct) VL^H * VR is a diagonal matrix with nonzero
diagonal elements, so you can compute, for each i = 1 to n:
alpha = VL(:,i)^H * VR(:,i).
If A is normal, alpha is a scalar of modulus 1; in general its modulus lies between
0 and 1, and it only gets close to 0 when an eigenvalue is (nearly) multiple or very
ill conditioned, which is a little harder to handle, so I'll skip that case.
Then you just need to divide column i of VL by conj(alpha), or column i of VR
by alpha. (Either one or the other, but not both!) And that should do it.
After this scaling, VL^H gives you inv(VR).
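
In Octave/MATLAB notation, that rescaling could look like the following sketch (assuming distinct eigenvalues and that the eigenvectors are stored column by column in VL and VR):

Code: Select all
% Sketch, assuming distinct eigenvalues and column-by-column VL, VR:
% rescale VL so that VL^H becomes inv(VR).
n = size(VR, 2);
for i = 1:n
    alpha = VL(:, i)' * VR(:, i);      % i-th diagonal entry of VL^H * VR
    VL(:, i) = VL(:, i) / conj(alpha); % now VL(:, i)' * VR(:, i) = 1
end
norm(VL' * VR - eye(n))                % should be tiny; VL' is now inv(VR)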

As a side note: why are you using DGEEVX and not DGEEV?
Julien Langou
 
Posts: 734
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

Postby Julien Langou » Mon Jan 08, 2007 2:02 pm

Hello, here is more information on the design of the DGEEV/DGEEVX interface from Jim Demmel. Julien.

[From Jim Demmel]

The question was this: In the nonsymmetric eigenvalue problem,
why isn't the left eigenvector matrix the "inverse" of the right
eigenvector matrix, at least when both are well-conditioned?

For notation, the right eigenvector matrix U would satisfy A*U = U*Lambda
in the absence of (roundoff) error, where Lambda is the diagonal matrix of
eigenvalues. The (exact) left eigenvector matrix V similarly
satisfies V^H * A = Lambda * V^H, where V^H means
conjugate transpose. So in the absence of error,
we get V^H * U = D, where D is an arbitrary
nonsingular diagonal matrix, and so might as well be the
identity; thus V^H is the inverse of U.

Here is why we don't try to do this. All that the algorithm
guarantees is small "right residual", i.e.
norm(A*U - U*Lambda) = O(macheps) * norm(A)
and similarly a small left residual with V^H, which is computed
independently.
If you add a third constraint, that V^H*U = I (or even
any diagonal) then the problem is "overconstrained"
from a roundoff point of view. We know how to
get the right residual and left residual small, but not
norm(V^H*U - I) tiny too. Accurate inversion all by
itself is a challenge as the condition number grows.

But suppose you only cared that diag(V^H*U - I) is
small, which could be accomplished just by scaling
each eigenvector pair independently of the others.
If you decided to do this by scaling the columns of U,
then just to compute U you'd have to compute V too
just to get the scaling factor, even if you didn't want V.
There is an analogous situation if you only scale V.
If you decided to not bother unless the user asked
for both U and V, then your code would return different
answers depending on what you asked it to compute.
This would make some users happy and others not.
So we think this scaling is better left to the users.

This reminds me of the occasional request to compute
"continuous eigenvectors" from a sequence of slightly
different input matrices. This is again an overconstrained
problem that the code does not have enough
information to solve, and one that would raise the expense for the
users who don't care.

Jim
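
To illustrate Jim's point numerically, here is a small Octave/MATLAB sketch; it is only an illustration (V is built here from inv(U) and then renormalized, which is not how DGEEV computes it), but it shows the two residuals that are guaranteed to be small and that V^H * U comes out diagonal rather than equal to the identity:

Code: Select all
% Illustration: the two residuals that are guaranteed to be small, and the
% fact that V'*U is (nearly) diagonal but not the identity once each column
% of V is normalized to unit length (as LAPACK does).
A = randn(6);                               % some real nonsymmetric test matrix
[U, Lambda] = eig(A);                       % right eigenvectors, unit-norm columns
V = inv(U)';                                % one valid left eigenvector basis
V = V * diag(1 ./ sqrt(sum(abs(V).^2, 1))); % rescale its columns to unit 2-norm
norm(A*U - U*Lambda) / norm(A)              % small right residual
norm(V'*A - Lambda*V') / norm(A)            % small left residual
diag(V'*U).'                                % diagonal entries, modulus <= 1
norm(V'*U - diag(diag(V'*U)))               % off-diagonal part is tiny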
Julien Langou
 
Posts: 734
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA

