I am trying to use MAGMA to compute the inverse of a matrix (not to solve a linear system), but I ran into problems when my test matrix had N=3052.
I am running on a GeForce GTX590, and the default sizes in testing_zgetri_gpu.cpp don't fit in my GPU's memory, so I modified the sizes array to run from N=3000 to N=3100 in steps of 10. The results of the tests are:
device 0: GeForce GTX 590, 1215.0 MHz clock, 1535.7 MB memory, capability 2.0
device 1: GeForce GTX 590, 1215.0 MHz clock, 1535.6 MB memory, capability 2.0
N CPU GFlop/s GPU GFlop/s ||R||_F / ||A||_F
========================================================
3000 48.24 64.50 2.261154e-14
3010 46.93 62.68 1.157349e-14
3020 42.52 63.02 1.005624e-14
3030 46.84 63.34 6.146287e-14
3040 48.14 63.73 2.019102e-14
3050 47.88 63.39 2.696969e-14
3060 48.41 64.51 1.315522e-14
3070 48.05 65.07 2.235858e-14
3080 47.61 62.90 3.577268e-14
3090 47.62 63.16 1.497066e-14
3100 47.95 63.68 1.115748e-13
Everything OK here!
However, if I run the test separately for each N (one size per run; I pasted the results together for easier reading), I get:
N CPU GFlop/s GPU GFlop/s ||R||_F / ||A||_F
========================================================
3000 41.54 64.58 2.261154e-14
3010 37.99 62.62 3.026024e-14
3020 47.02 63.07 1.873797e-14
3030 47.34 99.49 5.027260e-02 <--
3040 44.50 100.47 4.631249e-02 <--
3050 46.11 101.64 1.768814e-01 <--
3060 47.76 102.28 1.211222e-01 <--
3070 46.76 103.46 6.297906e-02 <--
3080 44.51 62.93 1.146775e-14
3090 43.01 63.17 1.491277e-14
3100 46.33 63.74 1.978944e-14
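The error column could also be cross-checked independently of the tester. Below is a minimal sketch, assuming the computed inverse X has been copied back to the host and both A and X are stored column-major. The names `frobenius_norm` and `inverse_residual` are hypothetical helpers, not part of MAGMA, and the normalization by ||A||_F * ||X||_F is just one reasonable choice; it is not necessarily what testing_zgetri_gpu.cpp prints.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Frobenius norm of a matrix stored as a flat array.
// (Hypothetical helper, not a MAGMA routine.)
double frobenius_norm(const std::vector<cplx>& M) {
    double sum = 0.0;
    for (const cplx& z : M) {
        sum += std::norm(z);  // std::norm returns |z|^2
    }
    return std::sqrt(sum);
}

// Relative residual ||A*X - I||_F / (||A||_F * ||X||_F) for n-by-n
// column-major matrices A and X, where X is the candidate inverse.
// (Hypothetical helper; the scaling is an assumption of this sketch.)
double inverse_residual(const std::vector<cplx>& A,
                        const std::vector<cplx>& X,
                        std::size_t n) {
    std::vector<cplx> R(n * n, cplx(0.0, 0.0));
    // R = A * X, naive O(n^3) product with the inner loop running down a column.
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t k = 0; k < n; ++k) {
            const cplx xkj = X[k + j * n];
            for (std::size_t i = 0; i < n; ++i) {
                R[i + j * n] += A[i + k * n] * xkj;
            }
        }
    }
    // R = R - I
    for (std::size_t i = 0; i < n; ++i) {
        R[i + i * n] -= 1.0;
    }
    return frobenius_norm(R) / (frobenius_norm(A) * frobenius_norm(X));
}
```

A correct inverse should give a value near machine precision, while a result like the rows marked with `<--` would be expected to stand out clearly. The naive triple loop is slow for N around 3000, so in practice a BLAS zgemm call would replace it.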
Does anyone have a hint on what's happening?
Thanks,
Alberto
