something about zgetrf_gpu

Open discussion for MAGMA

Post by justpk » Mon Jul 16, 2012 10:34 pm

Hey,
When I run testing_zgetrf_gpu with m*n smaller than about 9000*9000, it runs correctly. For larger sizes, it only computes correctly when m or n is a multiple of 960. My card is a Tesla C2050, though, and I think it should be able to solve about 13000*13000.
For example, 10800*10800 is wrong, but 11520*11520 is right. A size like 7881*7881, which is not a multiple of 960, is also right. I don't know the reason. Can you help me?
Thank you very much.
lv
Code:
./testing_zgetrf_gpu -M 9500 -N 9500
device 0: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory, capability 2.0
device 1: Quadro 4000, 950.0 MHz clock, 2047.2 MB memory, capability 2.0
  testing_zgetrf -M 9500 -N 9500



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
Argument 103 of zgetrf had an illegal value.
magma_zgetrf_gpu returned with error code -103
 9500  9500  117.20         3600382.87         2.500074e-01
[zhanghw@localhost testing]$ ./testing_zgetrf_gpu -M 10800 -N 10800
device 0: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory, capability 2.0
device 1: Quadro 4000, 950.0 MHz clock, 2047.2 MB memory, capability 2.0
  testing_zgetrf -M 10800 -N 10800



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
Argument 103 of zgetrf had an illegal value.
magma_zgetrf_gpu returned with error code -103
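
For reference, the storage needed for the matrix alone at the sizes mentioned in this post can be estimated with a few lines of Python. This is only a rough sketch: it assumes 16 bytes per double-complex element, assumes the "MB" figure in the log means 2^20 bytes, and does not count any extra workspace the routine may allocate on the device. The name matrix_mib is made up for this illustration and is not part of MAGMA.

```python
# Rough estimate of the memory needed to hold one M x N double-complex matrix.
# Assumption: 16 bytes per element (two 8-byte doubles); workspace is NOT counted.

BYTES_PER_ELEMENT = 16
MIB = 1024 * 1024

# Tesla C2050 memory as reported in the log above (assumed to be in MiB).
DEVICE_MIB = 2687.4


def matrix_mib(m, n):
    """Return the size in MiB of an m x n double-complex matrix."""
    return m * n * BYTES_PER_ELEMENT / MIB


if __name__ == "__main__":
    # Sizes taken from the post above.
    for size in (7881, 9500, 10800, 11520, 13000):
        need = matrix_mib(size, size)
        share = 100 * need / DEVICE_MIB
        print(f"{size:6d} x {size:<6d} {need:8.1f} MiB  ({share:5.1f}% of device)")
```

The estimate only bounds what could fit; how much additional device memory the factorization needs on top of the matrix depends on the MAGMA version and is not covered here.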
justpk
 
Posts: 5
Joined: Thu Jun 21, 2012 10:15 pm

Re: something about zgetrf_gpu

Post by Stan Tomov » Tue Jul 17, 2012 1:52 pm

Hello,
Most probably there is a problem with your CPU libraries (BLAS and LAPACK). To test this, you can run, for example,
Code:
./testing_zgetrf -M 20 -N 20

which will execute entirely on the CPU. If the error is not on the order of 1e-16 or smaller, the result is wrong. For example, on one of our systems I get
Code:
[tomov@cumin testing]$ ./testing_zgetrf -M 20 -N 20
device 0: GeForce GTX 280, 1296.0 MHz clock, 1023.8 MB memory, capability 1.3
device 1: Quadro NVS 290, 918.0 MHz clock, 255.3 MB memory, capability 1.1
  testing_zgetrf -M 20 -N 20

  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
   20    20    0.45           0.45         9.236885e-18

[tomov@cumin testing]$ ./testing_zgetrf_gpu -M 256 -N 256
device 0: GeForce GTX 280, 1296.0 MHz clock, 1023.8 MB memory, capability 1.3
device 1: Quadro NVS 290, 918.0 MHz clock, 255.3 MB memory, capability 1.1
  testing_zgetrf -M 256 -N 256

  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
  256   256    1.07           7.29         8.601237e-18
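
To make the last column of these tables concrete, here is a minimal pure-Python sketch of an LU factorization with partial pivoting followed by the residual ||PA-LU||/(||A||*N). It is not MAGMA's implementation: the function names are invented for this illustration, and the Frobenius norm is an assumption about which norm the tester uses.

```python
import random


def lu_partial_pivot(a):
    """Factor a square matrix (list of rows) as P*A = L*U.

    Returns (perm, lu): perm[i] is the original index of row i of P*A, and lu
    holds U on and above the diagonal and the multipliers of L below it.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: pick the row with the largest magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0:
            raise ZeroDivisionError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return perm, lu


def frobenius(m):
    """Frobenius norm of a matrix stored as a list of rows."""
    return sum(abs(x) ** 2 for row in m for x in row) ** 0.5


def lu_residual(a):
    """Return ||P*A - L*U|| / (||A|| * N) for a square matrix a."""
    n = len(a)
    perm, lu = lu_partial_pivot(a)
    diff = []
    for i in range(n):
        row = []
        for j in range(n):
            # (L*U)[i][j], with L unit lower triangular and U upper triangular.
            s = 0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]
                s += l_ik * lu[k][j]
            row.append(a[perm[i]][j] - s)
        diff.append(row)
    return frobenius(diff) / (frobenius(a) * n)


if __name__ == "__main__":
    random.seed(0)
    n = 20
    a = [[complex(random.random(), random.random()) for _ in range(n)]
         for _ in range(n)]
    print(f"residual for a random {n} x {n} complex matrix: {lu_residual(a):.3e}")
```

A healthy double-precision factorization should give a value near machine precision or below, whereas something like the 2.5e-01 in the first post indicates that the factors do not reproduce the matrix at all.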


When you said it was working, did you mean you were actually getting residuals on the order of 1e-18 in some cases? If so, the problem is somewhere else.
Stan
Stan Tomov
 
Posts: 247
Joined: Fri Aug 21, 2009 10:39 pm

