Testing executables freeze

Open discussion for MAGMA

Post by NLewkow » Wed Jun 29, 2011 2:34 pm

Greetings!

I just successfully built and installed magma_1.0.0-rc5. I have been trying to run some of the testing routines included in the magma_1.0.0-rc5/testing directory and have been unsuccessful thus far. Each routine prints its table header and then hangs without printing any results. Below are the outputs of three routines that I have tried. These are just examples; so far I have not been able to run any of the test executables successfully.

testing_dgemm
Code: Select all
-bash-4.2$ ./testing_dgemm
device 0: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory
device 1: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory

Usage:
  testing_dgemm [-NN|NT|TN|TT] [-N 1024]


Testing transA = N  transB = N
    M    N    K     MAGMA GFLop/s    CUBLAS GFlop/s       error
==================================================================




testing_dgesv_gpu
Code: Select all

-bash-4.2$ ./testing_dgesv_gpu
device 0: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory
device 1: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory

Usage:
  testing_dgesv_gpu -nrhs 100 -N 1024



  N     NRHS       GPU GFlop/s      || b-Ax || / ||A||
========================================================



testing_zgetrf_gpu
Code: Select all
-bash-4.2$ ./testing_zgetrf_gpu
device 0: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory
device 1: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory

Usage:
  testing_zgetrf_gpu -M 1024 -N 1024



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================




I have waited upwards of 10 minutes for each of these. Looking at the source code, it seems that a row should be printed to the table after each loop iteration, i.e., after each problem size. This leads me to believe that something is very wrong.

Thanks in advance for the help,
Nick
NLewkow
 
Posts: 7
Joined: Fri Jun 24, 2011 11:13 am

Re: Testing executables freeze

Post by NLewkow » Fri Jul 01, 2011 9:35 am

ping
NLewkow
 
Posts: 7
Joined: Fri Jun 24, 2011 11:13 am

Re: Testing executables freeze

Post by Stan Tomov » Mon Jul 04, 2011 3:22 pm

Hi Nick,
This is the first report for something freezing like this.
Obviously the GPU works as the call querying the card succeeds. I am wondering if the CPU calls to LAPACK work (which is a prerequisite for the hybrid algorithms in MAGMA). You can test this for example with
Code: Select all
./testing_dgetrf -N 30 -M 30

This computes the LU factorization of a 30x30 matrix. A matrix this small is considered too small to use the GPU, so the entire computation is done on the CPU using LAPACK (no MAGMA code runs, only the MAGMA wrapper that calls LAPACK). Does a test like this run to completion?
Thanks,
Stan
Stan Tomov
 
Posts: 253
Joined: Fri Aug 21, 2009 10:39 pm

Re: Testing executables freeze

Post by NLewkow » Tue Jul 05, 2011 9:33 am

Hi Stan,
Thanks for the reply. I ran the test that you specified and, just like before, it did not complete. The output looks as follows:

Code: Select all
-bash-4.2$ ./testing_dgetrf -N 30 -M 30
device 0: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory
device 1: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory
  testing_dgetrf -M 30 -N 30



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================


So it looks like you are correct that the CPU calls to LAPACK are not working. I don't really understand how MAGMA was built successfully without LAPACK... Here is the make.inc that was used to build the software:

Code: Select all

#//////////////////////////////////////////////////////////////////////////////
#   -- MAGMA (version 1.0) --
#      Univ. of Tennessee, Knoxville
#      Univ. of California, Berkeley
#      Univ. of Colorado, Denver
#      November 2010
#//////////////////////////////////////////////////////////////////////////////

#
# GPU_TARGET specifies for which GPU you want to compile MAGMA
#      0: Tesla family
#      1: Fermi Family
#
GPU_TARGET = 0

CC        = icc
NVCC      = nvcc
FORT      = ifort

ARCH      = ar
ARCHFLAGS = cr
RANLIB    = ranlib

OPTS      = -O3 -DADD_
#FOPTS     = -O3 -DADD_
# MGR 20110629
FOPTS     = -O3 -DADD_ -cpp
NVOPTS    = --compiler-options -fno-strict-aliasing -DUNIX -O3 -DADD_
LDOPTS    = -fPIC -nofor_main -Xlinker -zmuldefs

#LIB       = -lmkl_em64t -lguide -lpthread -lcublas -lm
# MGR 20110629
LIB       = -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lcublas -lm

CUDADIR   = /usr/local/cuda

#LIBDIR    = -L$(HOME)/intel/mkl/10.0.1.014/lib/em64t -L$(CUDADIR)/lib64
# MGR 20110629
LIBDIR    = -L/usr/local/intel/lib/intel64 -L$(CUDADIR)/lib64
INC       = -I$(CUDADIR)/include

LIBMAGMA     = ../lib/libmagma.a
LIBMAGMABLAS = ../lib/libmagmablas.a

Any suggestions to fix this would be much appreciated.
Thanks in advance for your help.
Nick
NLewkow
 
Posts: 7
Joined: Fri Jun 24, 2011 11:13 am

Re: Testing executables freeze

Post by Stan Tomov » Tue Jul 05, 2011 12:23 pm

Nick,
I would suggest putting some printf statements in testing_dgetrf.cpp to see exactly where the code is freezing. Also, I see your cards are Fermi, so in your make.inc you have to set
Code: Select all
GPU_TARGET = 1

Do a make clean and make after that.
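
The make.inc change and rebuild above could be scripted roughly as follows. This is only a sketch: set_gpu_target is a hypothetical helper name, not part of MAGMA, and it assumes the GPU_TARGET assignment sits on a line of its own, as in the make.inc posted earlier in the thread.

```shell
# set_gpu_target FILE VALUE
# Hypothetical helper (not part of MAGMA): rewrites the GPU_TARGET
# assignment in a make.inc-style file, leaving all other lines untouched.
set_gpu_target() {
    file="$1"
    target="$2"
    # Write to a temporary file first so a failed sed cannot truncate FILE.
    sed "s/^GPU_TARGET[[:space:]]*=.*/GPU_TARGET = $target/" "$file" > "$file.tmp" \
        && mv "$file.tmp" "$file"
}
```

From the MAGMA source directory it might be used as `set_gpu_target make.inc 1 && make clean && make`, which forces every object to be rebuilt for the new target.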
Stan
Stan Tomov
 
Posts: 253
Joined: Fri Aug 21, 2009 10:39 pm

Re: Testing executables freeze

Post by NLewkow » Tue Jul 05, 2011 4:15 pm

Hi Stan,

I recompiled MAGMA with the FERMI option in the make.inc as you specified.

Using print statements in the code
Code: Select all
        printf("\tBefore Initialize Matrix\n");

        /* Initialize the matrix */
        lapackf77_dlarnv( &ione, ISEED, &n2, h_A );
        lapackf77_dlacpy( MagmaUpperLowerStr, &M, &N, h_A, &lda, h_R, &lda );

        printf("\tAfter Initialize Matrix\n");

results in the following output:
Code: Select all
 
-bash-4.2$ ./testing_dgetrf -N 30 -M 30
device 0: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory
device 1: Tesla C2050 / C2070, 1147.0 MHz clock, 2687.4 MB memory
  testing_dgetrf -M 30 -N 30

   Before memory allocation
   After memory allocation


  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
   Before Initialize Matrix


It seems that, just as you suspected earlier, this is a problem with LAPACK: the "After Initialize Matrix" line is never printed.

Any suggestions for the next step to fix this?

Thanks,
Nick
NLewkow
 
Posts: 7
Joined: Fri Jun 24, 2011 11:13 am

Re: Testing executables freeze

Post by Stan Tomov » Tue Jul 05, 2011 4:39 pm

Hi Nick,
Can you please add an fprintf(stderr, "...") after the lapackf77_dlarnv call to see which of the two calls hangs? If it is the copy, can you replace TESTING_HOSTALLOC with TESTING_MALLOC (this moves the workspace from pinned to non-pinned memory)?
If it still doesn't work, can you try setting MKL_NUM_THREADS to one and re-running, e.g., with
setenv MKL_NUM_THREADS 1
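
Note that setenv is csh/tcsh syntax, while the prompts earlier in the thread show bash, where the equivalent is export. The sketch below sets the variable in bash and also wraps the test in a time limit so that a hang returns control to the shell. run_bounded is a hypothetical helper name, and it assumes the timeout utility from GNU coreutils is installed (it exits with status 124 when the limit is reached).

```shell
# Limit MKL to a single thread for this shell session (bash/sh syntax).
export MKL_NUM_THREADS=1

# run_bounded SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND under GNU coreutils `timeout` and
# reports on stderr when the time limit was hit (exit status 124).
run_bounded() {
    limit="$1"
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}
```

For example, `run_bounded 60 ./testing_dgetrf -N 30 -M 30` would give the 30x30 test one minute (an arbitrary limit chosen for illustration) before reporting a hang.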
Thanks,
Stan
Stan Tomov
 
Posts: 253
Joined: Fri Aug 21, 2009 10:39 pm

Re: Testing executables freeze

Post by NLewkow » Tue Jul 05, 2011 6:10 pm

Hi Stan,
Placing another print statement indicates that the first LAPACK call (lapackf77_dlarnv) is the one that hangs. I set the environment variable MKL_NUM_THREADS=1 and reran the test, with the same result.

Any other ideas?
Also, I noticed today that there is a home folder on yona with your last name. Are you located at ORNL by chance? I am a summer intern in the astrophysics group.

Thanks,
Nick
NLewkow
 
Posts: 7
Joined: Fri Jun 24, 2011 11:13 am

Re: Testing executables freeze

Post by NLewkow » Thu Jul 07, 2011 9:17 am

I figured out that my problem stemmed from a linking error during the MAGMA build. Apparently some dummy libraries were linked instead of the actual MKL ones, which led to a successful build but run-time errors.
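
The thread does not say how the dummy libraries were spotted. One possible check, sketched here under assumptions, is to look at which shared libraries the test executable actually resolves at load time. filter_math_libs is a hypothetical helper that narrows ldd output to BLAS/LAPACK/MKL-related entries. It only helps when MKL is linked dynamically, since statically linked libraries do not appear in ldd output.

```shell
# filter_math_libs
# Hypothetical helper: reads `ldd` output on stdin, keeps only entries that
# look like BLAS/LAPACK/MKL/OpenMP runtime libraries, and flags any that the
# dynamic loader could not resolve.
filter_math_libs() {
    grep -i -E 'mkl|lapack|blas|iomp' | while IFS= read -r line; do
        case "$line" in
            *"not found"*) echo "UNRESOLVED: $line" ;;
            *)             echo "resolved:   $line" ;;
        esac
    done
}
```

It might be used as `ldd ./testing_dgetrf | filter_math_libs`. The resolved paths can then be compared against the directory given in LIBDIR in make.inc to see whether the expected MKL installation is the one being picked up.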

Thanks for your help and suggestions,
Nick
NLewkow
 
Posts: 7
Joined: Fri Jun 24, 2011 11:13 am

