Segmentation fault error testing

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)
roalmar2
Posts: 20
Joined: Thu Jul 03, 2014 6:06 am

Segmentation fault error testing

Post by roalmar2 » Mon Nov 17, 2014 11:03 am

Hello,

I installed MAGMA 1.5.0_beta3. My make.inc is:

Code:

# GPU_TARGET contains one or more of Tesla, Fermi, or Kepler,
# to specify for which GPUs you want to compile MAGMA:
#     Tesla  - NVIDIA compute capability 1.x cards
#     Fermi  - NVIDIA compute capability 2.x cards
#     Kepler - NVIDIA compute capability 3.x cards
# The default is all, "Tesla Fermi Kepler".
# See http://developer.nvidia.com/cuda-gpus
#
#GPU_TARGET ?= Tesla Fermi Kepler
GPU_TARGET = Tesla


CC        = gcc
NVCC      = nvcc
FORT      = gfortran

ARCH      = ar
ARCHFLAGS = cr
RANLIB    = ranlib

# Defining MAGMA_ILP64 or MKL_ILP64 changes magma_int_t to int64_t in include/magma_types.h
# Compiling with -std=c++98 -pedantic finds non-standard things like variable length arrays

OPTS      = -fPIC -O3 -DADD_ -Wall -fno-strict-aliasing -fopenmp -DMAGMA_WITH_MKL -DMAGMA_SETAFFINITY
F77OPTS   = -fPIC -O3 -DADD_ -Wall
FOPTS     = -fPIC -O3 -DADD_ -Wall -x f95-cpp-input
NVOPTS    =       -O3 -DADD_ -Xcompiler "-fno-strict-aliasing -fPIC"
LDOPTS    = -fPIC -fopenmp


# IMPORTANT: this link line is for 64-bit int !!!!
# For regular 64-bit builds using 64-bit pointers and 32-bit int,
# use the lp64 library, not the ilp64 library. See make.inc.mkl-gcc or make.inc.mkl-icc.
# see MKL Link Advisor at http://software.intel.com/sites/products/mkl/
# gcc with MKL 10.3, GNU threads, 64-bit int
# note -DMAGMA_ILP64 or -DMKL_ILP64, and -fdefault-integer-8 in OPTS above
LIB       = -lmkl_gf_ilp64 -lmkl_gnu_thread -lmkl_core -lpthread -lcublas -lcudart -lstdc++ -lm -lgfortran

# define library directories preferably in your environment, or here.
# for MKL run, e.g.: source /opt/intel/composerxe/mkl/bin/mklvars.sh intel64
#MKLROOT ?= /opt/intel/composerxe/mkl
MKLROOT   = /nfs/LIBS/LIBS/mkl/l_mkl_11.1.0.080/composer_xe_2013_sp1.0.080/mkl
#CUDADIR ?= /usr/local/cuda
CUDADIR   = /nfs/LIBS/LIBS/CUDA/6.0
-include make.check-mkl
-include make.check-cuda

LIBDIR    = -L$(MKLROOT)/lib/intel64 \
            -L$(CUDADIR)/lib64

INC       = -I$(CUDADIR)/include -I$(MKLROOT)/include

When I execute ./testing_dgemm -N 1088, it shows:

Code:

MAGMA 1.5.0 beta3 compiled for CUDA capability >= 1.0
CUDA runtime 6000, driver 6050. OpenMP threads 24. MKL 11.1.0, MKL threads 12. 
device 0: Tesla K20m, 705.5 MHz clock, 4799.6 MB memory, capability 3.5
Usage: ./testing_dgemm [options] [-h|--help]

If running lapack (option --lapack), MAGMA and CUBLAS error are both computed
relative to CPU BLAS result. Else, MAGMA error is computed relative to CUBLAS result.

transA = No transpose, transB = No transpose
    M     N     K   MAGMA Gflop/s (ms)  CUBLAS Gflop/s (ms)   CPU Gflop/s (ms)  MAGMA error  CUBLAS error
=========================================================================================================


and when I check with the nvidia-smi command, the GPU does not seem to be doing any work.

When I execute another command, like ./testing_dgegqr_gpu -N 120, it shows:

Code:

MAGMA 1.5.0 beta3 compiled for CUDA capability >= 1.0
CUDA runtime 6000, driver 6050. OpenMP threads 24. MKL 11.1.0, MKL threads 12. 
device 0: Tesla K20m, 705.5 MHz clock, 4799.6 MB memory, capability 3.5
Usage: ./testing_dgegqr_gpu [options] [-h|--help]

  M     N     CPU GFlop/s (ms)    GPU GFlop/s (ms)      ||I-Q'Q||_F / M     ||I-Q'Q||_I / M    ||A-Q R||_I
                                                        MAGMA  /  LAPACK    MAGMA  /  LAPACK
==========================================================================================================
Segmentation fault
Why does this happen? Any ideas?
Thank you very much

Stan Tomov
Posts: 279
Joined: Fri Aug 21, 2009 10:39 pm

Re: Segmentation fault error testing

Post by Stan Tomov » Mon Nov 17, 2014 12:05 pm

Hello,
In general, we recommend that you upgrade to the latest version of MAGMA, currently 1.6.

Regarding the dgemm question, I see you compiled with GPU_TARGET = Tesla although you have a Kepler GPU. Please replace Tesla with Kepler and recompile the library. NVIDIA no longer supports the 'compute_10' architecture. Recent CUDA releases warn you about this; e.g., with CUDA 6.5 I get warnings like the following if I try to use 'GPU_TARGET = sm13' on a Kepler GPU:

Code:

nvcc warning : The 'compute_11', 'compute_12', 'compute_13', 'sm_11', 'sm_12', and 'sm_13' architectures are deprecated, and may be removed in a future release.

Regarding the 'dgegqr' question, the beta release had a bug for small sizes that has been fixed in the final release.
Stan
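
The mapping listed in the make.inc comments (compute capability 1.x is Tesla, 2.x is Fermi, 3.x is Kepler) can be applied to the "capability" value that the testers print in their device line. A minimal sketch, assuming a hypothetical helper named gpu_target_for_capability that is not part of MAGMA:

```shell
#!/bin/sh
# Hypothetical helper (not part of MAGMA): map a CUDA compute capability
# such as "3.5" to the GPU_TARGET name from the make.inc comments:
#   1.x = Tesla, 2.x = Fermi, 3.x = Kepler.
gpu_target_for_capability() {
    major=${1%%.*}          # keep only the part before the first dot
    case "$major" in
        1) echo "Tesla" ;;
        2) echo "Fermi" ;;
        3) echo "Kepler" ;;
        *) echo "unknown" ;;
    esac
}

# The testers report "capability 3.5" for the Tesla K20m:
gpu_target_for_capability 3.5   # prints "Kepler"
```

The result is the value to put after "GPU_TARGET =" in make.inc before recompiling the library.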

roalmar2
Posts: 20
Joined: Thu Jul 03, 2014 6:06 am

Re: Segmentation fault error testing

Post by roalmar2 » Tue Nov 18, 2014 7:42 am

Hello, I reinstalled with MAGMA 1.6 and GPU_TARGET = Kepler, but it still does not work.

Some examples:

./testing_zgetrf -M 32 -N 32
MAGMA 1.6.0 compiled for CUDA capability >= 3.0
CUDA runtime 6050, driver 6050. OpenMP threads 24. MKL 11.1.0, MKL threads 12.
device 0: Tesla K20m, 705.5 MHz clock, 4799.6 MB memory, capability 3.5
Usage: ./testing_zgetrf [options] [-h|--help]

error: unrecognized option -M

-----------------------------------------------------------------------------------------------------------------------------------------------

./testing_zgetrf -N 32
MAGMA 1.6.0 compiled for CUDA capability >= 3.0
CUDA runtime 6050, driver 6050. OpenMP threads 24. MKL 11.1.0, MKL threads 12.
device 0: Tesla K20m, 705.5 MHz clock, 4799.6 MB memory, capability 3.5
Usage: ./testing_zgetrf [options] [-h|--help]

ngpu 1
M N CPU GFlop/s (sec) GPU GFlop/s (sec) |PA-LU|/(N*|A|)
=========================================================================

Intel MKL ERROR: Parameter 4 was incorrect on entry to ZGETRF.
magma_zgetrf returned error -4: invalid argument.
32 32 --- ( --- ) 0.74 ( 0.00) ---

-----------------------------------------------------------------------------------------------------------------------------------------------

./testing_dgetrf -N 32
MAGMA 1.6.0 compiled for CUDA capability >= 3.0
CUDA runtime 6050, driver 6050. OpenMP threads 24. MKL 11.1.0, MKL threads 12.
device 0: Tesla K20m, 705.5 MHz clock, 4799.6 MB memory, capability 3.5
Usage: ./testing_dgetrf [options] [-h|--help]

ngpu 1
M N CPU GFlop/s (sec) GPU GFlop/s (sec) |PA-LU|/(N*|A|)
=========================================================================
Segmentation fault

-----------------------------------------------------------------------------------------------------------------------------------------------

./testing_dgetrf
MAGMA 1.6.0 compiled for CUDA capability >= 3.0
CUDA runtime 6050, driver 6050. OpenMP threads 24. MKL 11.1.0, MKL threads 12.
device 0: Tesla K20m, 705.5 MHz clock, 4799.6 MB memory, capability 3.5
Usage: ./testing_dgetrf [options] [-h|--help]

ngpu 1
M N CPU GFlop/s (sec) GPU GFlop/s (sec) |PA-LU|/(N*|A|)
=========================================================================
Segmentation fault

-----------------------------------------------------------------------------------------------------------------------------------------------

./testing_dgetrf -N 1088
MAGMA 1.6.0 compiled for CUDA capability >= 3.0
CUDA runtime 6050, driver 6050. OpenMP threads 24. MKL 11.1.0, MKL threads 12.
device 0: Tesla K20m, 705.5 MHz clock, 4799.6 MB memory, capability 3.5
Usage: ./testing_dgetrf [options] [-h|--help]

ngpu 1
M N CPU GFlop/s (sec) GPU GFlop/s (sec) |PA-LU|/(N*|A|)
=========================================================================
Segmentation fault

-----------------------------------------------------------------------------------------------------------------------------------------------

./testing_dgegqr_gpu
MAGMA 1.6.0 compiled for CUDA capability >= 3.0
CUDA runtime 6050, driver 6050. OpenMP threads 24. MKL 11.1.0, MKL threads 12.
device 0: Tesla K20m, 705.5 MHz clock, 4799.6 MB memory, capability 3.5
Usage: ./testing_dgegqr_gpu [options] [-h|--help]

version 1
M N CPU GFlop/s (ms) GPU GFlop/s (ms) ||I-Q'Q||_F / M ||I-Q'Q||_I / M ||A-Q R||_I
MAGMA / LAPACK MAGMA / LAPACK
==========================================================================================================
1088 1088 skipping because dgegqr requires N <= 128
2112 2112 skipping because dgegqr requires N <= 128
3136 3136 skipping because dgegqr requires N <= 128
4160 4160 skipping because dgegqr requires N <= 128
5184 5184 skipping because dgegqr requires N <= 128
6208 6208 skipping because dgegqr requires N <= 128
7232 7232 skipping because dgegqr requires N <= 128
8256 8256 skipping because dgegqr requires N <= 128
9280 9280 skipping because dgegqr requires N <= 128
10304 10304 skipping because dgegqr requires N <= 128
And my make.inc:

Code:


GPU_TARGET = Kepler


CC        = gcc
CXX       = g++
NVCC      = nvcc
FORT      = gfortran

ARCH      = ar
ARCHFLAGS = cr
RANLIB    = ranlib

# Defining MAGMA_ILP64 or MKL_ILP64 changes magma_int_t to int64_t in include/magma_types.h
# Compiling with -std=c++98 -pedantic finds non-standard things like variable length arrays

CFLAGS    = -fPIC -O3 -DADD_ -Wall -fno-strict-aliasing -fopenmp -DMAGMA_WITH_MKL -DMAGMA_SETAFFINITY
CFLAGS   += -pedantic -Wno-long-long
FFLAGS    = -fPIC -O3 -DADD_ -Wall
F90FLAGS  = -fPIC -O3 -DADD_ -Wall -x f95-cpp-input
NVCCFLAGS =       -O3 -DADD_ -Xcompiler "-fno-strict-aliasing -fPIC"
LDFLAGS   = -fPIC -fopenmp


# IMPORTANT: this link line is for 64-bit int !!!!
# For regular 64-bit builds using 64-bit pointers and 32-bit int,
# use the lp64 library, not the ilp64 library. See make.inc.mkl-gcc or make.inc.mkl-icc.
# see MKL Link Advisor at http://software.intel.com/sites/products/mkl/
# gcc with MKL 10.3, GNU threads, 64-bit int
# note -DMAGMA_ILP64 or -DMKL_ILP64, and -fdefault-integer-8 in FFLAGS above
LIB       = -lmkl_gf_ilp64 -lmkl_gnu_thread -lmkl_core -lpthread -lcublas -lcudart -lstdc++ -lm -lgfortran

# define library directories preferably in your environment, or here.
# for MKL run, e.g.: source /opt/intel/composerxe/mkl/bin/mklvars.sh intel64
#MKLROOT ?= /opt/intel/composerxe/mkl
MKLROOT   = /nfs/LIBS/LIBS/mkl/l_mkl_11.1.0.080/composer_xe_2013_sp1.0.080/mkl
#CUDADIR ?= /usr/local/cuda
CUDADIR   = /nfs/LIBS/LIBS/CUDA/6.5
-include make.check-mkl
-include make.check-cuda

LIBDIR    = -L$(MKLROOT)/lib/intel64 \
            -L$(CUDADIR)/lib64

INC       = -I$(CUDADIR)/include -I$(MKLROOT)/include

I want to use MKL, ILP64 (for larger matrix sizes), and shared libraries (make.inc.mkl-shared?).

What should my make.inc look like? Is this one correct?

Thank you very much

Stan Tomov
Posts: 279
Joined: Fri Aug 21, 2009 10:39 pm

Re: Segmentation fault error testing

Post by Stan Tomov » Tue Nov 18, 2014 1:05 pm

Hi,
An example of how to compile for 64-bit ints is in make.inc.mkl-ilp64, and an example for shared libraries is in make.inc.mkl-shared. I see you tried to combine them but missed some flags, such as '-DMKL_ILP64'. I think this is what you have to add to your make.inc (after the corresponding variables):

Code:

CFLAGS   += -DMKL_ILP64 -std=c99 
FFLAGS   += -fdefault-integer-8
F90FLAGS  += -fdefault-integer-8 -x f95-cpp-input 
NVCCFLAGS += -DMKL_ILP64
Stan

mgates3
Posts: 915
Joined: Fri Jan 06, 2012 2:13 pm

Re: Segmentation fault error testing

Post by mgates3 » Tue Nov 18, 2014 2:09 pm

Thank you for including all the input and output of testers, and your make.inc file. It makes diagnosing problems much easier.

For the first error, as it says, there is no option -M. Try using -h or --help to see the available options. You probably want this:

./testing_zgetrf -N 32

For a rectangular matrix (e.g., 64x32), use this:

./testing_zgetrf -N 64,32

As for the other errors, it appears that you used make.inc.mkl-ilp64, but then edited it and dropped -DMKL_ILP64. Thus you are compiling with LP64 (32-bit int, 64-bit long & pointers), but linking with an ILP64 library (64-bit int, long, and pointers). Either add the -DMKL_ILP64 flag (see make.inc.mkl-ilp64), or link with lp64 libraries (see make.inc.mkl-gcc).

-mark
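
The mismatch described above (an ILP64 MKL interface library on the link line without -DMKL_ILP64 or -DMAGMA_ILP64 in the compile flags) can be spotted by scanning make.inc. A rough sketch, assuming a hypothetical helper named check_ilp64_consistency that is not part of MAGMA and that only inspects non-comment lines:

```shell
#!/bin/sh
# Hypothetical helper (not part of MAGMA): report whether a make.inc links
# an ILP64 MKL interface library (e.g. -lmkl_gf_ilp64) without defining
# MKL_ILP64 or MAGMA_ILP64, or defines one of them without such a library.
check_ilp64_consistency() {
    makeinc=$1
    # Drop comment lines so hints such as "# note -DMKL_ILP64" do not count.
    active=$(grep -v '^[[:space:]]*#' "$makeinc" || true)
    links_ilp64=no
    defines_ilp64=no
    if printf '%s\n' "$active" | grep -q -e '-lmkl_[a-z]*_ilp64'; then
        links_ilp64=yes
    fi
    if printf '%s\n' "$active" | grep -q -e '-DMKL_ILP64' -e '-DMAGMA_ILP64'; then
        defines_ilp64=yes
    fi
    if [ "$links_ilp64" = "$defines_ilp64" ]; then
        echo "consistent"
    else
        echo "mismatch"
    fi
}

# Example: ILP64 library on the link line, but no ILP64 define in CFLAGS.
tmp=$(mktemp)
printf '%s\n' 'CFLAGS = -O3 -DADD_' \
              'LIB = -lmkl_gf_ilp64 -lmkl_gnu_thread -lmkl_core' > "$tmp"
check_ilp64_consistency "$tmp"    # prints "mismatch"
rm -f "$tmp"
```

This only compares the link line with the two defines; it does not check whether -fdefault-integer-8 is present in the Fortran flags.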

roalmar2
Posts: 20
Joined: Thu Jul 03, 2014 6:06 am

Re: Segmentation fault error testing

Post by roalmar2 » Thu Nov 20, 2014 7:45 am

I solved it !!!!

make.inc file:

Code:

#GPU_TARGET ?= Tesla Fermi Kepler
GPU_TARGET = Kepler

CC        = gcc
CXX       = g++
NVCC      = nvcc
FORT      = gfortran

ARCH      = ar
ARCHFLAGS = cr
RANLIB    = ranlib

# Defining MAGMA_ILP64 or MKL_ILP64 changes magma_int_t to int64_t in include/magma_types.h
# Compiling with -std=c++98 -pedantic finds non-standard things like variable length arrays
CFLAGS    = -O3 -DADD_ -Wall -fno-strict-aliasing -fopenmp -DMAGMA_WITH_MKL -DMAGMA_SETAFFINITY -DMKL_ILP64
CFLAGS   += -pedantic -Wno-long-long
#CFLAGS   += -Werror  # uncomment to ensure all warnings are dealt with
CXXFLAGS := $(CFLAGS) -std=c++98
CFLAGS   += -std=c99
FFLAGS    = -O3 -DADD_ -Wall -fdefault-integer-8
F90FLAGS  = -O3 -DADD_ -Wall -fdefault-integer-8 -x f95-cpp-input
NVCCFLAGS = -O3 -DADD_ -Xcompiler -fno-strict-aliasing -DMKL_ILP64
LDFLAGS   = -fopenmp

# IMPORTANT: this link line is for 64-bit int !!!!
# For regular 64-bit builds using 64-bit pointers and 32-bit int,
# use the lp64 library, not the ilp64 library. See make.inc.mkl-gcc or make.inc.mkl-icc.
# see MKL Link Advisor at http://software.intel.com/sites/products/mkl/
# gcc with MKL 10.3, GNU threads, 64-bit int
# note -DMAGMA_ILP64 or -DMKL_ILP64, and -fdefault-integer-8 in FFLAGS above
LIB       = -lmkl_gf_ilp64 -lmkl_gnu_thread -lmkl_core -lpthread -lcublas -lcudart -lstdc++ -lm -lgfortran

# define library directories preferably in your environment, or here.
# for MKL run, e.g.: source /opt/intel/composerxe/mkl/bin/mklvars.sh intel64

#MKLROOT ?= /opt/intel/composerxe/mkl
MKLROOT   = /nfs/LIBS/LIBS/mkl/l_mkl_11.1.0.080/composer_xe_2013_sp1.0.080/mkl

#CUDADIR ?= /usr/local/cuda
CUDADIR   = /nfs/LIBS/LIBS/CUDA/6.5

-include make.check-mkl
-include make.check-cuda

LIBDIR    = -L$(MKLROOT)/lib/intel64 \
            -L$(CUDADIR)/lib64

INC       = -I$(CUDADIR)/include -I$(MKLROOT)/include

And I added this line to "Makefile.internal":

Code:

NVCCFLAGS += -DHAVE_CUBLAS $(NV_SM) $(NV_COMP) --cudart=shared
Thanks for everything!! ^,^
