## magma_dsygvdx return wrong eigen vectors for large matrix

### magma_dsygvdx return wrong eigen vectors for large matrix

Hello,

I was playing with MAGMA_DSYGVDX to solve both eigenvalue and eigenvectors. It was great with extreme fast speed. however, i notice that when the size of matrix becomes bigger, there is no error returned but the eigenvector is wrong compared to the CPU MKL code.

here is some datas, for AX=(lamda)BX, where A is N by N

N = 9629 eigenvalue correct, eigenvector is correct too;

N = 12604 eigenvalue correct, but eigenvector is wrong;

any advice?

Thanks,

boreas

I was playing with MAGMA_DSYGVDX to solve both eigenvalue and eigenvectors. It was great with extreme fast speed. however, i notice that when the size of matrix becomes bigger, there is no error returned but the eigenvector is wrong compared to the CPU MKL code.

here is some datas, for AX=(lamda)BX, where A is N by N

N = 9629 eigenvalue correct, eigenvector is correct too;

N = 12604 eigenvalue correct, but eigenvector is wrong;

any advice?

Thanks,

boreas

### Re: magma_dsygvdx return wrong eigen vectors for large matri

When you say "compared to the CPU MKL code", how are you comparing them? Are you simply checking that:

|| X_magma - X_mkl ||

is small? That won't work, because eigenvectors are unique only up to sign, so you can multiply an eigenvector by -1 and it is still valid. A better check is to verify that the backward error is small, e.g.,

|| A*X - B*X*Lambda || / (||A|| * ||X|| * n)

or similar, depending on itype, as is done in the MAGMA testers (cf. magma/testing/testing_zhegvdx.cpp).

You may also be interested in the magma_dsygvdx_2stage, which should be faster.

-mark

|| X_magma - X_mkl ||

is small? That won't work, because eigenvectors are unique only up to sign, so you can multiply an eigenvector by -1 and it is still valid. A better check is to verify that the backward error is small, e.g.,

|| A*X - B*X*Lambda || / (||A|| * ||X|| * n)

or similar, depending on itype, as is done in the MAGMA testers (cf. magma/testing/testing_zhegvdx.cpp).

You may also be interested in the magma_dsygvdx_2stage, which should be faster.

-mark

### Re: magma_dsygvdx return wrong eigen vectors for large matri

Thanks for reply. I check the relative norm for each eigen vector ||AX-(lamda) BX||/||AX||. this is pretty small when i solve smaller size of model, typically less than E-10, while it is 1.175830E+02 for the last eigenvector on a matrix with size 12604. what is funny is that eigenvalue is correct.

### Re: magma_dsygvdx return wrong eigen vectors for large matri

For each eigenvector, it should be

|| A x - lambda B x || / (||A||*||x||*n),

i.e., not ||A*x|| in denominator, but that's a minor thing that won't clear up the issue. Are you computing all eigenvectors? Then you can use magma_dsygvd (no x), which might work for you, but also doesn't resolve the issue with dsygvdx. Does the error show with the MAGMA tester? The input & output from the MAGMA tester would be helpful, as it contains useful information about your environment. I could not replicate the issue. E.g.,

As I understand what you're saying, only the last eigenvector is wrong? All the others are correct? That's interesting.

What is your system? I'm assuming MAGMA 2.3.0. What are your input arguments to dsygvdx (itype, range, jobz, uplo, n, lda, ldb, vl, vu, il, iu, lwork, liwork)? What are output arguments on return (mout, info)? What version of MKL (or other BLAS/LAPACK)? What compiler? What CUDA version? What processor & GPU? What is your make.inc file and any relevant environment variables (e.g., GPU_TARGET)?

-mark

|| A x - lambda B x || / (||A||*||x||*n),

i.e., not ||A*x|| in denominator, but that's a minor thing that won't clear up the issue. Are you computing all eigenvectors? Then you can use magma_dsygvd (no x), which might work for you, but also doesn't resolve the issue with dsygvdx. Does the error show with the MAGMA tester? The input & output from the MAGMA tester would be helpful, as it contains useful information about your environment. I could not replicate the issue. E.g.,

Code: Select all

```
bunsen magma/testing> ./testing_dsygvdx -n 123 -n 1000 -n 9629 -n 12604 -JV --niter 4 --check
% MAGMA 2.2.0 svn compiled for CUDA capability >= 3.0, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7050, driver 9010. OpenMP threads 16. MKL 11.3.0, MKL threads 16.
% device 0: Tesla K40c, 745.0 MHz clock, 11441.2 MiB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11441.2 MiB memory, capability 3.5
% Sat Apr 14 09:13:24 2018
% Usage: ./testing_dsygvdx [options] [-h|--help]
% itype = 1, jobz = Vectors needed, uplo = Lower, ngpu = 1
% N M GPU Time (sec) |AZ-BZD| |D - D_magma|
%======================================================
123 123 0.0107 6.97e-18 1.20e-17 ok
123 123 0.0041 6.66e-18 1.32e-17 ok
123 123 0.0041 6.09e-18 1.02e-17 ok
123 123 0.0041 6.03e-18 1.50e-17 ok
1000 1000 0.1529 1.65e-18 4.92e-18 ok
1000 1000 0.1516 1.58e-18 4.93e-18 ok
1000 1000 0.1507 1.60e-18 3.69e-18 ok
1000 1000 0.1302 1.56e-18 7.94e-18 ok
9629 9629 22.7737 3.35e-19 1.08e-18 ok
9629 9629 22.2445 3.40e-19 2.59e-18 ok
9629 9629 22.0150 3.36e-19 1.25e-18 ok
9629 9629 22.8482 3.39e-19 1.07e-18 ok
12604 12604 45.1833 2.84e-19 2.34e-18 ok
12604 12604 45.1970 2.92e-19 9.96e-19 ok
12604 12604 45.2176 3.01e-19 2.42e-18 ok
12604 12604 45.1455 2.82e-19 3.26e-18 ok
```

What is your system? I'm assuming MAGMA 2.3.0. What are your input arguments to dsygvdx (itype, range, jobz, uplo, n, lda, ldb, vl, vu, il, iu, lwork, liwork)? What are output arguments on return (mout, info)? What version of MKL (or other BLAS/LAPACK)? What compiler? What CUDA version? What processor & GPU? What is your make.inc file and any relevant environment variables (e.g., GPU_TARGET)?

-mark

Last edited by mgates3 on Sun Apr 15, 2018 9:11 am, edited 1 time in total.

**Reason:***Add "n" in denominator.*### Re: magma_dsygvdx return wrong eigen vectors for large matri

I might have figured out that it is due to ill-conditioning. somehow, my matrix becomes ill-conditioned when its size becomes larger.

but I can reproduce in a small 3X3 matrix.here it is

A = 7.20165E+05 3.03112E+03 -3.65387E+03

3.03112E+03 2.58619E+04 -1.53047E+03

-3.65387E+03 -1.53047E+03 1.82411E+04

B = 2.18796E+00 9.38368E-03 -1.12527E-02

9.38368E-03 4.32683E-02 -2.64063E-03

-1.12527E-02 -2.64063-04 3.02629E-02

the first eigenvector corresponding to lowest eigenvalue from MKL is

3.29149E+05 2.60432E+02 4.05145E-01

from MAGMA is

6.76019E-01 3.12768E-03 -3.82633E-03

both get eigenvalue 7.0572D+05

let me what your findings are. I really appreciate. Thank you,

boreas

but I can reproduce in a small 3X3 matrix.here it is

A = 7.20165E+05 3.03112E+03 -3.65387E+03

3.03112E+03 2.58619E+04 -1.53047E+03

-3.65387E+03 -1.53047E+03 1.82411E+04

B = 2.18796E+00 9.38368E-03 -1.12527E-02

9.38368E-03 4.32683E-02 -2.64063E-03

-1.12527E-02 -2.64063-04 3.02629E-02

the first eigenvector corresponding to lowest eigenvalue from MKL is

3.29149E+05 2.60432E+02 4.05145E-01

from MAGMA is

6.76019E-01 3.12768E-03 -3.82633E-03

both get eigenvalue 7.0572D+05

let me what your findings are. I really appreciate. Thank you,

boreas

Last edited by boreas on Sun Apr 15, 2018 10:08 pm, edited 2 times in total.

### Re: magma_dsygvdx return wrong eigen vectors for large matri

mgates3 wrote:

As I understand what you're saying, only the last eigenvector is wrong? All the others are correct? That's interesting.

No, all the eigenvectors are wrong. eigenvalue is correct.

What is your system? I'm assuming MAGMA 2.3.0. What are your input arguments to dsygvdx (itype, range, jobz, uplo, n, lda, ldb, vl, vu, il, iu, lwork, liwork)? What are output arguments on return (mout, info)? What version of MKL (or other BLAS/LAPACK)? What compiler? What CUDA version? What processor & GPU? What is your make.inc file and any relevant environment variables (e.g., GPU_TARGET)?

yes, I'm using MAGMA 2.3.0 with a TianXp card under CentOS 7.

-mark

### Re: magma_dsygvdx return wrong eigen vectors for large matri

Your B is not symmetric. dsygvdx will ignore the strictly upper or lower portion, depending on whether uplo is lower or upper, and will assume values based on symmetry.

When I run it in Matlab, assuming uplo = upper, I get an eigenvector that matches your MAGMA run. The eigenvalues are different, though. What's the complete decomposition (all eigenvalues & all eigenvectors) that you get from MKL and MAGMA?

Symmetric matrices are inherently well-conditioned with respect to eigenvalues, since the eigenvectors are orthogonal. Cf. http://heath.cs.illinois.edu/scicomp/notes/chap04.pdf slides 18-20.

For the generalized problem, I believe they are B-orthogonal, i.e.,

X^T B X = Identity

Your B is also well-conditioned.

When properly normalized as above, the error should be near machine epsilon (2.22e-16 for double). Typically we check that error < tol*eps with tol = 50 or 100.

-mark

When I run it in Matlab, assuming uplo = upper, I get an eigenvector that matches your MAGMA run. The eigenvalues are different, though. What's the complete decomposition (all eigenvalues & all eigenvectors) that you get from MKL and MAGMA?

Code: Select all

```
>> example
echo on;
A = [
7.20165E+05 3.03112E+03 -3.65387E+03
3.03112E+03 2.58619E+04 -1.53047E+03
-3.65387E+03 -1.53047E+03 1.82411E+04
];
A - A'
ans =
0 0 0
0 0 0
0 0 0
B = [
2.18796E+00 9.38368E-03 -1.12527E-02
4.46981E-03 4.32683E-02 -2.64063E-03
-5.32922E-03 -1.59523E-04 3.02629E-02
];
%B = tril(B) + tril(B,-1)' % uplo = Lower
B = triu(B) + triu(B,1)' % uplo = Upper
B =
2.1880 0.0094 -0.0113
0.0094 0.0433 -0.0026
-0.0113 -0.0026 0.0303
B - B'
ans =
0 0 0
0 0 0
0 0 0
% the first eigenvector corresponding to lowest eigenvalue from MKL is
x = [ 3.29149E+05 2.60432E+02 4.05145E-01 ]';
x
x =
1.0e+05 *
3.2915
0.0026
0.0000
% from MAGMA is
x2 = [ 6.76019E-01 3.12768E-03 -3.82633E-03 ]';
x2
x2 =
0.6760
0.0031
-0.0038
[X, D] = eig( A, B )
X =
0.6760 -0.0256 0.0246
0.0031 4.6815 1.1559
-0.0038 -0.9774 5.6853
D =
1.0e+05 *
3.2915 0 0
0 5.9780 0
0 0 6.0363
norm( A*X - B*X*D, 'fro' ) / (norm( A, 'fro' )*norm( X, 'fro' )*n)
ans =
1.4649e-17
% check B-orthogonality
X' * B * X
ans =
1.0000 -0.0000 0.0000
-0.0000 1.0000 -0.0000
0.0000 -0.0000 1.0000
norm( X' * B * X - eye(3,3) )
ans =
3.6208e-16
cond(B)
ans =
73.6537
```

For the generalized problem, I believe they are B-orthogonal, i.e.,

X^T B X = Identity

Your B is also well-conditioned.

When properly normalized as above, the error should be near machine epsilon (2.22e-16 for double). Typically we check that error < tol*eps with tol = 50 or 100.

-mark

Last edited by mgates3 on Sun Apr 15, 2018 9:26 am, edited 2 times in total.

**Reason:***eigenvector matches, not eigenvalue.*### Re: magma_dsygvdx return wrong eigen vectors for large matri

I'm also confused by the x you get from MKL. It should be unit length in the B-norm, i.e., x^T B x = 1, but isn't.

-mark

Code: Select all

```
% x from MKL
>> x' * B * x
ans =
2.3704e+11
% x2 from MAGMA
>> x2' * B * x2
ans =
1.0000
```

### Re: magma_dsygvdx return wrong eigen vectors for large matri

Hello Mark,

thanks to your help. the issue is straightened out.

btw, I have few comments

1) I had upper bound that's why I choose dsygvdx but somehow it does not trim the eigenvalues out of the bound. there seems no difference between dsygvdx and dsygvd? but that's minor, I can simply trim myself

nfound =0;

for (int i=0;i<n;i++)

{ if (W(i) <= vu) {nfound ++;}

else {break;}

}

2) the performance of dsygvdx and dsygvdx_2stage does not have much difference.

./testing_dsygvdx_2stage -n 12604 -JV --check --niter 3

% MAGMA 2.3.0 compiled for CUDA capability >= 3.0, 32-bit magma_int_t, 64-bit pointer.

% CUDA runtime 9010, driver 9010. OpenMP threads 16. MKL 11.0.3, MKL threads 16.

% device 0: TITAN Xp, 1582.0 MHz clock, 12189.6 MiB memory, capability 6.1

% Mon Apr 16 11:54:52 2018

% Usage: ./testing_dsygvdx_2stage [options] [-h|--help]

% itype = 1, jobz = Vectors needed, uplo = Lower, ngpu = 1

% N Nfound GPU Time (sec) |AZ-BZD| |D - D_magma|

%======================================================

12604 12604 67.1416 3.34e-19 4.10e-19 ok

12604 12604 66.6414 2.70e-19 1.32e-19 ok

12604 12604 62.0791 2.87e-19 6.61e-20 ok

./testing_dsygvdx -n 12604 -JV --check --niter 3

% MAGMA 2.3.0 compiled for CUDA capability >= 3.0, 32-bit magma_int_t, 64-bit pointer.

% CUDA runtime 9010, driver 9010. OpenMP threads 16. MKL 11.0.3, MKL threads 16.

% device 0: TITAN Xp, 1582.0 MHz clock, 12189.6 MiB memory, capability 6.1

% Mon Apr 16 12:10:55 2018

% Usage: ./testing_dsygvdx [options] [-h|--help]

% itype = 1, jobz = Vectors needed, uplo = Lower, ngpu = 1

% N M GPU Time (sec) |AZ-BZD| |D - D_magma|

%======================================================

12604 12604 48.9192 3.10e-19 2.23e-18 ok

12604 12604 48.4318 3.05e-19 9.63e-19 ok

12604 12604 48.4750 3.02e-19 2.41e-18 ok

3) my TianXp seems behaving even worse than your K40, which is surprising ;-)

thanks to your help. the issue is straightened out.

btw, I have few comments

1) I had upper bound that's why I choose dsygvdx but somehow it does not trim the eigenvalues out of the bound. there seems no difference between dsygvdx and dsygvd? but that's minor, I can simply trim myself

nfound =0;

for (int i=0;i<n;i++)

{ if (W(i) <= vu) {nfound ++;}

else {break;}

}

2) the performance of dsygvdx and dsygvdx_2stage does not have much difference.

./testing_dsygvdx_2stage -n 12604 -JV --check --niter 3

% MAGMA 2.3.0 compiled for CUDA capability >= 3.0, 32-bit magma_int_t, 64-bit pointer.

% CUDA runtime 9010, driver 9010. OpenMP threads 16. MKL 11.0.3, MKL threads 16.

% device 0: TITAN Xp, 1582.0 MHz clock, 12189.6 MiB memory, capability 6.1

% Mon Apr 16 11:54:52 2018

% Usage: ./testing_dsygvdx_2stage [options] [-h|--help]

% itype = 1, jobz = Vectors needed, uplo = Lower, ngpu = 1

% N Nfound GPU Time (sec) |AZ-BZD| |D - D_magma|

%======================================================

12604 12604 67.1416 3.34e-19 4.10e-19 ok

12604 12604 66.6414 2.70e-19 1.32e-19 ok

12604 12604 62.0791 2.87e-19 6.61e-20 ok

./testing_dsygvdx -n 12604 -JV --check --niter 3

% MAGMA 2.3.0 compiled for CUDA capability >= 3.0, 32-bit magma_int_t, 64-bit pointer.

% CUDA runtime 9010, driver 9010. OpenMP threads 16. MKL 11.0.3, MKL threads 16.

% device 0: TITAN Xp, 1582.0 MHz clock, 12189.6 MiB memory, capability 6.1

% Mon Apr 16 12:10:55 2018

% Usage: ./testing_dsygvdx [options] [-h|--help]

% itype = 1, jobz = Vectors needed, uplo = Lower, ngpu = 1

% N M GPU Time (sec) |AZ-BZD| |D - D_magma|

%======================================================

12604 12604 48.9192 3.10e-19 2.23e-18 ok

12604 12604 48.4318 3.05e-19 9.63e-19 ok

12604 12604 48.4750 3.02e-19 2.41e-18 ok

3) my TianXp seems behaving even worse than your K40, which is surprising ;-)

### Re: magma_dsygvdx return wrong eigen vectors for large matri

On small matrices (n <= 128), it just calls LAPACK sygvd and computes all eigenvalues. But it should check the range and return only the desired ones; I'll note that as a bug.

For larger matrices it should return only the desired eigenvalues & eigenvectors. Note "M" column below:

For larger matrices it should return only the desired eigenvalues & eigenvectors. Note "M" column below:

Code: Select all

```
mint magma/testing> ./testing_dsygvdx -n 20:200:20 -JV --vrange 0,1
% MAGMA 2.2.0 svn compiled for CUDA capability >= 3.0, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 8000, driver 9010. OpenMP threads 4.
% device 0: GeForce GT 750M, 925.5 MHz clock, 2047.6 MiB memory, capability 3.0
% Mon Apr 16 16:27:53 2018
% Usage: ./testing_dsygvdx [options] [-h|--help]
% itype = 1, jobz = Vectors needed, uplo = Lower, ngpu = 1
% N M GPU Time (sec) |AZ-BZD| |D - D_magma|
%======================================================
20 20 0.0002 ---
40 40 0.0004 ---
60 60 0.0007 ---
80 80 0.0015 ---
100 100 0.0022 ---
120 120 0.0028 ---
140 71 0.0083 ---
160 79 0.0095 ---
180 91 0.0123 ---
200 95 0.0148 ---
mint magma/testing> ./testing_dsygvdx -n 20:200:20 -JV --irange 1,50
% MAGMA 2.2.0 svn compiled for CUDA capability >= 3.0, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 8000, driver 9010. OpenMP threads 4.
% device 0: GeForce GT 750M, 925.5 MHz clock, 2047.6 MiB memory, capability 3.0
% Mon Apr 16 16:28:06 2018
% Usage: ./testing_dsygvdx [options] [-h|--help]
% itype = 1, jobz = Vectors needed, uplo = Lower, ngpu = 1
% N M GPU Time (sec) |AZ-BZD| |D - D_magma|
%======================================================
20 20 0.0002 ---
40 40 0.0004 ---
60 60 0.0007 ---
80 80 0.0015 ---
100 100 0.0022 ---
120 120 0.0029 ---
140 50 0.0085 ---
160 50 0.0099 ---
180 50 0.0131 ---
200 50 0.0142 ---
```