dgeev is causing PC to shut down

Open discussion for MAGMA

Post by CarlosQ » Fri Feb 17, 2012 7:13 pm

Hello guys,

After some struggling I finally managed to build MAGMA 1.1 linked with GotoBLAS2 for my Tesla C2070 (Fermi) GPU. While checking that the installation was fully working, I ran some of the test programs provided with the release, and something weird seems to be happening. The first executable I ran was testing_dgeev. It completes the first matrix size (1024) and then the PC shuts down (I've run it twice with the same result). After that, I ran other test programs. For example, testing_getrf reaches a matrix size of 4096 and then also shuts the computer down. Other tests such as testing_dsymv run without problems.

I'm using a 64-bit Linux machine with 4GB of RAM and a Tesla C2070 with 6GB of RAM.

This is the configuration I'm using:

GPU_TARGET = 1
CC = gcc
NVCC = nvcc
FORT = gfortran
ARCH = ar
ARCHFLAGS = cr
RANLIB = ranlib
OPTS = -O3 -DADD_
FOPTS = -O3 -DADD_ -x f95-cpp-input
NVOPTS = --compiler-options -fno-strict-aliasing -DUNIX -O3 -DADD_
LDOPTS = -fPIC -Xlinker -zmuldefs
LIB = -lgoto2 -lpthread -lcublas -lcudart -llapack -lm -lgfortran
CUDADIR = /usr/local/cuda
LIBDIR = -L/home/myname/GotoBLAS2 -L/usr/local/cuda/lib64 -L/usr/lib64 -L/home/bpoladian/LAPACK/lapack-3.4.0
INC = -I$(CUDADIR)/include

I don't understand how these test drivers could cause the PC to shut down suddenly, but it looks like they do.
Any advice would be very valuable to me.

CarlosQ
CarlosQ
 
Posts: 4
Joined: Fri Feb 17, 2012 4:48 pm

Re: dgeev is causing PC to shut down

Post by fletchjp » Mon Feb 27, 2012 6:35 am

You may well be running out of memory somewhere.
fletchjp
 
Posts: 175
Joined: Mon Dec 27, 2010 7:29 pm

Re: dgeev is causing PC to shut down

Post by CarlosQ » Mon Feb 27, 2012 8:16 pm

Hello, thanks for your reply.

At some point I assumed it was a memory problem (because I didn't see any memory checking in the driver code) but I was not aware that the PC could actually shut down for this reason.
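As a rough sanity check of the memory hypothesis, here is a minimal Python sketch that estimates the storage for dense double-precision matrices at the sizes mentioned in this thread. The helper names and the `copies=4` allowance (input matrix, eigenvector matrix, workspace) are assumptions made for illustration, not values taken from the MAGMA drivers, and the capacities are treated as GiB.

```python
def matrix_bytes(n, dtype_bytes=8):
    """Bytes needed to store one dense n-by-n matrix (8 bytes per double)."""
    return n * n * dtype_bytes


def fits(n, capacity_gib, copies=4):
    """True if `copies` dense n-by-n double matrices fit in capacity_gib GiB.

    `copies` is an assumed allowance for the input, outputs and workspace;
    it is not derived from the MAGMA test drivers.
    """
    return copies * matrix_bytes(n) <= capacity_gib * 2**30


# Sizes taken from the thread: 1024, 4096, and the upper end of "2k-30k".
for n in (1024, 4096, 30000):
    mib = matrix_bytes(n) / 2**20
    print(f"N={n}: {mib:.1f} MiB per matrix, "
          f"fits 4 GiB host: {fits(n, 4)}, fits 6 GiB GPU: {fits(n, 6)}")
```

Under these assumptions, a 1024 or 4096 matrix needs at most a few hundred MiB in total, so matrix storage alone would probably not exhaust either the host or the GPU memory at those sizes. A 30000 matrix would not fit.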

I'm very interested in the performance of the eigenproblem driver (dgeev) for different matrix sizes using 1 GPU. It looks like I'm not going to be able to run matrices larger than 1024 on my system, so I was wondering whether there is any site where I could find performance results for larger matrices (2k-30k).
I also noticed that the driver performs the reduction to upper Hessenberg form on the GPU (using magma_dgehrd) and also the generation of the orthogonal matrix Q (using magma_dorghr), but the actual QR iteration is performed on the CPU by calling lapackf77_dhseqr. Is there any special reason for that? Are there plans to create a "magma_dhseqr" in future releases that runs on the GPU?
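To illustrate the first stage of that pipeline, here is a minimal pure-Python sketch of an unblocked Householder reduction to upper Hessenberg form. It only shows the idea behind the dgehrd step. It is not MAGMA's or LAPACK's blocked implementation, the function name `hessenberg` is made up for this example, and the QR iteration stage (dhseqr) is not shown.

```python
import math


def hessenberg(a):
    """Return an upper Hessenberg matrix orthogonally similar to `a`.

    `a` is a square matrix given as a list of lists; it is not modified.
    """
    n = len(a)
    h = [row[:] for row in a]
    for k in range(n - 2):
        # Householder vector that zeros the entries below the subdiagonal
        # in column k.
        x = [h[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # column is already in Hessenberg form
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        norm_v = math.sqrt(sum(t * t for t in v))
        v = [t / norm_v for t in v]
        m = len(v)
        # Left update: rows k+1..n-1 become (I - 2 v v^T) times those rows.
        for j in range(n):
            dot = sum(v[i] * h[k + 1 + i][j] for i in range(m))
            for i in range(m):
                h[k + 1 + i][j] -= 2.0 * v[i] * dot
        # Right update: columns k+1..n-1 are multiplied by (I - 2 v v^T).
        for i in range(n):
            dot = sum(h[i][k + 1 + j] * v[j] for j in range(m))
            for j in range(m):
                h[i][k + 1 + j] -= 2.0 * dot * v[j]
    return h


a = [[4.0, 1.0, 2.0, 3.0],
     [1.0, 3.0, 0.0, 1.0],
     [2.0, 0.0, 2.0, 1.0],
     [3.0, 1.0, 1.0, 1.0]]
h = hessenberg(a)
for row in h:
    print(["%8.4f" % t for t in row])
```

Because the left and right updates use the same reflector, the result is similar to the input and has the same eigenvalues. The later QR iteration exploits that structure.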

I appreciate your time

Carlos

Re: dgeev is causing PC to shut down

Post by fletchjp » Wed Feb 29, 2012 8:09 am

I am not part of the MAGMA team, so you will have to wait for a response from them.

John

Re: dgeev is causing PC to shut down

Post by CarlosQ » Wed Mar 07, 2012 2:15 pm

John, thanks for your answer. I replaced the magma_dsyevd call with an lapackf77_dsyevd call in the testing_dsyevd routine, and now I'm able to run matrices of up to 4096 (this is what I've done so far). However, even with this change, the PC sometimes just shuts down. Do you think it could be a hardware problem? Is anyone else seeing the same behavior when running this driver routine?

Any advice is very valuable

Thank you

Re: dgeev is causing PC to shut down (SOLVED)

Post by CarlosQ » Mon Mar 12, 2012 3:54 pm

It looks like it was an overheating problem. I had to add four additional fans to the system, and it is running now.

Thank you
