I ran the zgeqrf test and got the strange results below, the first I have seen in a z (double-complex) case.
I looked in zgeqrf.cpp and can see no define for a MAGMA BLAS routine, so there is nothing I can change to switch to CUBLAS.
The problem is repeatable: I ran a number of other tests in between, including sgeqrf, cgeqrf and dgeqrf, and they all ran O.K.
Here is the output. The residual is fine at N = 1024 but jumps to about 0.5 to 0.6 from N = 2048 upward.
fletcher@fletcher-desktop:~/magma_1.0.0-rc2/testing$ ./testing_zgeqrf
device 0: GeForce GTX 460, 1400.0 MHz clock, 2047.2 MB memory
Usage:
testing_zgeqrf -M 1024 -N 1024
M N CPU GFlop/s GPU GFlop/s ||R||_F / ||A||_F
==========================================================
1024 1024 25.10 45.22 2.868703e-15
2048 2048 31.17 63.19 5.441290e-01
3072 3072 32.28 67.20 5.947517e-01
4032 4032 32.83 68.49 6.070624e-01
5184 5184 32.60 69.45 6.263964e-01
6016 6016 32.27 70.04 6.323491e-01
7040 7040 31.64 70.40 6.328434e-01
8064 8064 31.17 70.77 6.319259e-01
9088 9088 31.02 71.15 6.393267e-01
9984 9984 31.14 71.33 6.397711e-01
The cgeqrf results give the highest GPU GFlop/s I have seen on my card, but also rather poor residuals: about 1.e-6, compared with 1.e-9 for sgetrf. sgeqrf is also about 1.e-6. Is that to be expected for this algorithm?
fletcher@fletcher-desktop:~/magma_1.0.0-rc2/testing$ ./testing_cgeqrf
device 0: GeForce GTX 460, 1400.0 MHz clock, 2047.2 MB memory
Usage:
testing_cgeqrf -M 1024 -N 1024
M N CPU GFlop/s GPU GFlop/s ||R||_F / ||A||_F
==========================================================
1024 1024 36.16 142.92 1.447583e-06
2048 2048 58.61 191.37 1.843371e-06
3072 3072 62.51 341.94 2.260692e-06
4032 4032 63.71 370.86 2.584254e-06
5184 5184 63.14 422.85 3.051479e-06
6016 6016 62.09 427.17 3.216158e-06
7040 7040 61.86 438.37 3.365463e-06
8064 8064 61.17 440.61 3.442715e-06
9088 9088 61.81 445.47 3.522330e-06
9984 9984 61.52 450.82 3.602567e-06
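As a rough sanity check on these numbers, here is a minimal sketch in plain Python that expresses a few of the residuals above as multiples of the machine epsilon for their precision. The helper names (`to_float32`, `machine_epsilon`, `in_units_of_eps`) are my own and are not part of MAGMA, and I am assuming the c/s routines work in IEEE single precision and the z/d routines in IEEE double precision. It says nothing about what error bound the algorithm should actually meet; it only puts the values on a common scale.

```python
import struct


def to_float32(x):
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def machine_epsilon(round_fn):
    """Smallest power of two eps such that 1 + eps != 1 under round_fn."""
    eps = 1.0
    while round_fn(1.0 + round_fn(eps / 2.0)) != 1.0:
        eps = round_fn(eps / 2.0)
    return eps


def in_units_of_eps(residual, eps):
    """Express a relative residual as a multiple of machine epsilon."""
    return residual / eps


eps32 = machine_epsilon(to_float32)  # single precision (s and c routines)
eps64 = machine_epsilon(float)       # double precision (d and z routines)

# Residuals copied from the test output in this post.
cases = [
    ("cgeqrf N=9984", 3.602567e-06, eps32),
    ("zgeqrf N=1024", 2.868703e-15, eps64),
    ("zgeqrf N=2048", 5.441290e-01, eps64),
]

for name, residual, eps in cases:
    print(f"{name}: {in_units_of_eps(residual, eps):.3g} x eps")
```

On this scale the cgeqrf residuals and the zgeqrf N = 1024 residual are both a few tens of epsilon for their precision, while the zgeqrf residuals from N = 2048 upward are many orders of magnitude larger, which is why the z results look wrong to me rather than just imprecise.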
Thanks for all your help.
John