Problem with zgetri_gpu

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)
albertotrj
Posts: 4
Joined: Sun May 23, 2010 12:00 am
Location: Instituto de Física da Universidade de São Paulo

Post by albertotrj » Fri Feb 24, 2012 6:26 pm

Hi,

I am trying to use MAGMA to calculate the inverse of a matrix (not to solve a linear system), but I had problems when my test matrix had N=3052.

As I am running on a GeForce GTX590, the sizes in testing_zgetri_gpu.cpp don't fit in my GPU's memory, so I modified the sizes array to run from N=3000 to N=3100 in steps of 10. The test results are:

device 0: GeForce GTX 590, 1215.0 MHz clock, 1535.7 MB memory, capability 2.0
device 1: GeForce GTX 590, 1215.0 MHz clock, 1535.6 MB memory, capability 2.0

  N    CPU GFlop/s    GPU GFlop/s    ||R||_F / ||A||_F
========================================================
 3000     48.24          64.50        2.261154e-14
 3010     46.93          62.68        1.157349e-14
 3020     42.52          63.02        1.005624e-14
 3030     46.84          63.34        6.146287e-14
 3040     48.14          63.73        2.019102e-14
 3050     47.88          63.39        2.696969e-14
 3060     48.41          64.51        1.315522e-14
 3070     48.05          65.07        2.235858e-14
 3080     47.61          62.90        3.577268e-14
 3090     47.62          63.16        1.497066e-14
 3100     47.95          63.68        1.115748e-13

Everything is OK here!
However, if I run the test for each N separately, I get the following (I pasted the results together for easier reading):

  N    CPU GFlop/s    GPU GFlop/s    ||R||_F / ||A||_F
========================================================
 3000     41.54          64.58        2.261154e-14
 3010     37.99          62.62        3.026024e-14
 3020     47.02          63.07        1.873797e-14
 3030     47.34          99.49        5.027260e-02  <--
 3040     44.50         100.47        4.631249e-02  <--
 3050     46.11         101.64        1.768814e-01  <--
 3060     47.76         102.28        1.211222e-01  <--
 3070     46.76         103.46        6.297906e-02  <--
 3080     44.51          62.93        1.146775e-14
 3090     43.01          63.17        1.491277e-14
 3100     46.33          63.74        1.978944e-14

Does anyone have a hint about what's happening?

Thanks,
Alberto
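
For anyone who wants to cross-check an inverse independently of the tester, here is a minimal sketch. The thread does not show how testing_zgetri_gpu defines R, so this assumes R = A*inv(A) - I, which may differ from what the tester computes. The function names (matmul, frobenius_norm, inverse_residual) are made up for illustration, and the pure-Python loops are only practical for small matrices.

```python
import math


def matmul(a, b):
    """Plain triple-loop product of two matrices stored as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def frobenius_norm(m):
    """Frobenius norm; abs() makes this work for real and complex entries."""
    return math.sqrt(sum(abs(x) ** 2 for row in m for x in row))


def inverse_residual(a, a_inv):
    """Return ||A*A_inv - I||_F / ||A||_F (assumed definition of R, see above)."""
    n = len(a)
    prod = matmul(a, a_inv)
    r = [[prod[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return frobenius_norm(r) / frobenius_norm(a)


# Small example: A has determinant 10, so its exact inverse is easy to write down.
a = [[4.0, 7.0], [2.0, 6.0]]
good = [[0.6, -0.7], [-0.2, 0.4]]   # correct inverse
bad = [[0.6, -0.7], [-0.2, 0.5]]    # one entry deliberately wrong

print(inverse_residual(a, good))  # close to machine precision
print(inverse_residual(a, bad))   # several orders of magnitude larger
```

A correct inverse should give a value near machine precision, like the 1e-14 rows in the tables above, while a wrong one gives something many orders of magnitude larger, like the rows marked with arrows.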

Re: Problem with zgetri_gpu

Post by albertotrj » Mon Feb 27, 2012 11:38 am

Hi people,

I have run more tests and the problem happens for 3025 <= N <= 3072.
By the way, magma_zgetrf_gpu runs fine for the same N.

Alberto.

Re: Problem with zgetri_gpu

Post by albertotrj » Wed Feb 29, 2012 2:20 pm

Hi again,

I have run more tests, on a Tesla M2050 this time, and it seems there is a bug (I guess) in the *getri_gpu routines:

./testing_sgetri_gpu -N 3050
device 0: Tesla M2050, 1147.0 MHz clock, 2687.4 MB memory, capability 2.0
device 1: Tesla M2050, 1147.0 MHz clock, 2687.4 MB memory, capability 2.0


  N    CPU GFlop/s    GPU GFlop/s    ||R||_F / ||A||_F
========================================================
 3050     55.43         180.61        1.537267e-05

./testing_dgetri_gpu -N 3050
device 0: Tesla M2050, 1147.0 MHz clock, 2687.4 MB memory, capability 2.0
device 1: Tesla M2050, 1147.0 MHz clock, 2687.4 MB memory, capability 2.0


  N    CPU GFlop/s    GPU GFlop/s    ||R||_F / ||A||_F
========================================================
can not bind to texture 
[the same "can not bind to texture" message was printed 93 times in total]
 3050     28.21         100.51        2.070095e-01

./testing_cgetri_gpu -N 3050
device 0: Tesla M2050, 1147.0 MHz clock, 2687.4 MB memory, capability 2.0
device 1: Tesla M2050, 1147.0 MHz clock, 2687.4 MB memory, capability 2.0


  N    CPU GFlop/s    GPU GFlop/s    ||R||_F / ||A||_F
========================================================
 3050     68.57         409.81        1.756635e-01

./testing_zgetri_gpu -N 3050
device 0: Tesla M2050, 1147.0 MHz clock, 2687.4 MB memory, capability 2.0
device 1: Tesla M2050, 1147.0 MHz clock, 2687.4 MB memory, capability 2.0


  N    CPU GFlop/s    GPU GFlop/s    ||R||_F / ||A||_F
========================================================
 3050     36.30         113.79        1.768814e-01

Here is my make.inc:

GPU_TARGET = 1

CC        = icc
NVCC      = nvcc
FORT      = ifort

ARCH      = ar
ARCHFLAGS = cr
RANLIB    = ranlib

OPTS      = -O3 -DADD_ -xHost
FOPTS     = -O3 -DADD_ -cpp -xHost
NVOPTS    = --compiler-options -fno-strict-aliasing -DUNIX -O3 -DADD_
LDOPTS    = -fPIC -nofor_main -Xlinker -zmuldefs

LIB       = -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lcublas -lcudart -lm

CUDADIR   = /usr/local/cuda

LIBDIR    = -L$(MKLROOT)/lib/intel64 \
            -L$(CUDADIR)/lib64
INC       = -I$(CUDADIR)/include

#LIBMAGMA     = $(MAGMA_DIR)/lib/magma.a
#LIBMAGMABLAS = $(MAGMA_DIR)/lib/magmablas.a

I am using the Intel Compiler, version 12.0.0.


Alberto.

mgates3
Posts: 915
Joined: Fri Jan 06, 2012 2:13 pm

Re: Problem with zgetri_gpu

Post by mgates3 » Thu Mar 01, 2012 3:17 pm

We will look into it.
-mark
