Problem with testing_zgesv

Postby jeremiahpalmer » Thu May 03, 2012 12:00 pm

I ran testing_zgesv and got this output:

Code:
device 0: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 1: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 2: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 3: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 4: Tesla C2075, 1147.0 MHz clock, 5375.2 MB memory, capability 2.0
device 5: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 6: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 7: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0

Usage:
  ./testing_zgesv -N <matrix size> -R <right hand sides>
  -N can be repeated up to 10 times


  N     NRHS       GPU GFlop/s      || b-Ax || / ||A||*||B||
========================================================
 1024   100              54.76        6.517249e+152
 2048   100             154454.94        7.701791e-01
 3072   100             499112.47        7.651407e-01
 4032   100             1203744.20        7.519598e-01
 5184   100             2568466.55        7.598556e-01
 6016   100             3958030.74        7.451370e-01
 7040   100             6598876.04        7.576308e-01
 8064   100             9479215.99        7.457324e-01
 9088   100             8045028.83        7.503526e-01
10112   100             10919122.07        7.582880e-01


I also ran testing_zgesv_gpu and got this output:

Code:
device 0: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 1: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 2: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 3: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 4: Tesla C2075, 1147.0 MHz clock, 5375.2 MB memory, capability 2.0
device 5: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 6: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 7: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0

Usage:
  ./testing_zgesv_gpu -N <matrix size> -R <right hand sides>
  -N can be repeated up to 10 times


  N     NRHS       GPU GFlop/s      || b-Ax || / ||A||*||B||
========================================================
 1024   100              18.34        1.234136e-03
 2048   100             3282167.55        6.160297e-04
 3072   100             10606140.03        4.142864e-04
 4032   100             23473011.91        3.176803e-04
 5184   100             56139340.28        2.475162e-04
 6016   100             76192091.79        2.133183e-04
 7040   100             107781642.03        1.826547e-04
 8064   100             85312943.95        1.596054e-04
 9088   100             258446551.31        1.417983e-04
10112   100             315441304.14        1.275780e-04


Does anyone know if the zgesv fix is easy? Or do I need to wait for a future release of MAGMA?

Thanks,
Jeremiah
jeremiahpalmer
 
Posts: 58
Joined: Fri Jan 28, 2011 12:46 pm

Re: Problem with testing_zgesv

Postby mgates3 » Thu May 03, 2012 12:38 pm

I could not replicate your problem, so more investigation is needed to diagnose it. Clearly the magma_zgesv routine is failing and returning early, hence the erroneous GFlop/s rates. It is odd that no error is reported, though.
What do you get for testing_zgetrf?
What about other precisions:
testing_sgetrf and testing_sgesv
testing_dgetrf and testing_dgesv
testing_cgetrf and testing_cgesv

-mark
mgates3
 
Posts: 401
Joined: Fri Jan 06, 2012 2:13 pm

Re: Problem with testing_zgesv

Postby jeremiahpalmer » Thu May 03, 2012 2:54 pm

Thanks for investigating my issue, Mark. Here is what I get for testing_zgetrf:

Code:
Usage:
  testing_zgetrf -M 1024 -N 1024



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
 1024  1024   56.82         157.43         2.347769e-01
 2048  2048   96.70         130870.32         2.500543e-01
 3072  3072   99.54         460118.97         2.499506e-01
 4032  4032  103.73         929677.03         2.500080e-01
 5184  5184  109.06         2122727.62         2.499899e-01
 6016  6016  110.28         1872852.04         2.499754e-01
 7040  7040  127.48         3219331.86         2.499592e-01
 8064  8064  128.62         4539932.41         2.499908e-01
 9088  9088  129.71         6540852.40         2.500076e-01
10112 10112  130.72         9606870.12         2.500188e-01


Here is output for testing_sgetrf:

Code:
  testing_sgetrf -M 1024 -N 1024



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
 1024  1024   32.99          15.80         2.238783e-09
 2048  2048  151.85          54.93         1.919010e-09
 3072  3072  168.53          98.64         1.787685e-09
 4032  4032  180.08         206.09         2.056815e-09
 5184  5184  193.18         225.33         1.941383e-09
 6016  6016  196.77         284.97         1.888445e-09
 7040  7040  224.78         314.74         1.847246e-09
 8064  8064  225.39         339.91         1.927249e-09
 9088  9088  233.48         360.27         2.127291e-09
10112 10112  237.15         376.96         2.308559e-09



Here is output for testing_sgesv:
Code:
  testing_sgetrf -M 1024 -N 1024



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
 1024  1024   32.99          15.80         2.238783e-09
 2048  2048  151.85          54.93         1.919010e-09
 3072  3072  168.53          98.64         1.787685e-09
 4032  4032  180.08         206.09         2.056815e-09
 5184  5184  193.18         225.33         1.941383e-09
 6016  6016  196.77         284.97         1.888445e-09
 7040  7040  224.78         314.74         1.847246e-09
 8064  8064  225.39         339.91         1.927249e-09
 9088  9088  233.48         360.27         2.127291e-09
10112 10112  237.15         376.96         2.308559e-09



Here is output for testing_dgetrf:

Code:
  testing_dgetrf -M 1024 -N 1024



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
 1024  1024    3.11          13.26         4.160666e-18
 2048  2048   46.38          41.90         3.583859e-18
 3072  3072   84.10         111.83         4.002220e-18
 4032  4032   91.73         141.23         3.816895e-18
 5184  5184  100.09         166.64         3.611515e-18
 6016  6016  100.93         181.44         3.489463e-18
 7040  7040  115.65         196.51         3.404586e-18
 8064  8064  116.62         212.97         2.715763e-18
 9088  9088  120.79         222.71         2.619804e-18
10112 10112  122.61         229.63         2.550349e-18



Here is output for testing_dgesv:

Code:
  testing_dgesv -N <matrix size> -R <right hand sides>
  -N can be repeated up to 10 times


  N     NRHS       GPU GFlop/s      || b-Ax || / ||A||*||B||
========================================================
 1024   100              10.70        1.651702e-15
 2048   100              34.92        1.905586e-15
 3072   100              75.23        3.186311e-15
 4032   100              94.44        2.400437e-14
 5184   100             115.25        9.311146e-15
 6016   100             127.01        8.242728e-15
 7040   100             141.52        1.675615e-14
 8064   100             156.07        5.840908e-15
 9088   100             166.20        1.789939e-15
10112   100             174.06        2.756148e-15



Here is output for testing_cgetrf:

Code:
  testing_cgetrf -M 1024 -N 1024



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
 1024  1024   74.37         115.36         4.994611e-09
 2048  2048  189.58         261.72         5.879262e-09
 3072  3072  202.16         366.46         5.797781e-09
 4032  4032  205.56         429.19         5.732481e-09
 5184  5184  213.11         486.30         5.662354e-09
 6016  6016  217.43         517.20         6.340097e-09
 7040  7040  251.04         547.30         7.246737e-09
 8064  8064  252.00         568.04         7.940759e-09
 9088  9088  256.50         590.04         8.660629e-09
10112 10112  259.01         606.37         9.272721e-09



Here is output for testing_cgesv:

Code:
  testing_cgesv -N <matrix size> -R <right hand sides>
  -N can be repeated up to 10 times


  N     NRHS       GPU GFlop/s      || b-Ax || / ||A||*||B||
========================================================
 1024   100              60.95        3.271699e-07
 2048   100             203.56        1.370101e-06
 3072   100             281.00        1.144669e-06
 4032   100             342.44        2.301578e-06
 5184   100             389.63        5.714119e-06
 6016   100             419.74        6.532921e-07
 7040   100             451.95        1.021400e-06
 8064   100             480.62        3.112221e-06
 9088   100             504.06        1.041605e-06
10112   100             523.01        1.944749e-06



So, obviously, the real problem is in zgetrf. (By the way, this output is from magma_1.1.0.)
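
The comparison across precisions can be made explicit with a small check. This is only a sketch: the tolerance of one machine epsilon is an assumed heuristic, not MAGMA's own pass criterion, and the function name is hypothetical. The residuals are the N=1024 rows of the testing_*getrf outputs above.

```python
# Residual check per precision (a sketch; the one-epsilon tolerance is an
# assumed heuristic, not MAGMA's own pass criterion).

EPS = {
    "s": 2.0**-23,  # single real
    "c": 2.0**-23,  # single complex
    "d": 2.0**-52,  # double real
    "z": 2.0**-52,  # double complex
}


def getrf_residual_ok(precision, residual):
    """True if ||PA-LU||/(||A||*N) is at or below machine epsilon."""
    return residual <= EPS[precision]


if __name__ == "__main__":
    # N=1024 rows from the testing_*getrf outputs above.
    first_rows = {"s": 2.238783e-09, "d": 4.160666e-18,
                  "c": 4.994611e-09, "z": 2.347769e-01}
    for precision, residual in first_rows.items():
        status = "ok" if getrf_residual_ok(precision, residual) else "FAILED"
        print(f"testing_{precision}getrf  {residual:.6e}  {status}")
```

Under this assumed tolerance, only the double-complex residual fails, which matches the conclusion that zgetrf is where things go wrong.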

Thank you for your help!
-Jeremiah

Re: Problem with testing_zgesv

Postby mgates3 » Fri May 04, 2012 5:58 pm

Can you try other sizes, such as
testing_zgetrf -M 1000 -N 1000
testing_zgetrf -M 2000 -N 2000
testing_zgetrf -M 4000 -N 4000
Do other double complex routines work, such as zpotrf, zgeqrf, etc.?
-mark

Re: Problem with testing_zgesv

Postby jeremiahpalmer » Fri May 04, 2012 6:51 pm

Here is what I get for testing_zgetrf -M 1000 -N 1000
Code:
  testing_zgetrf -M 1000 -N 1000



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
 1000  1000   49.97         126.31         2.345270e-01



Here is what I get for testing_zgetrf -M 2000 -N 2000

Code:
  testing_zgetrf -M 2000 -N 2000



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
 2000  2000   81.14         438.42         2.420384e-01


Here is what I get for testing_zgetrf -M 4000 -N 4000

Code:
  testing_zgetrf -M 4000 -N 4000



  M     N   CPU GFlop/s    GPU GFlop/s   ||PA-LU||/(||A||*N)
============================================================
 4000  4000   93.07         1010.68         2.420734e-01

Here is what I get for testing_zpotrf:

Code:
  testing_zpotrf -UPLO U -N 1024:40000



  N    CPU GFlop/s    GPU GFlop/s    ||R_magma - R_lapack||_F / ||R_lapack||_F
========================================================
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 1024     48.58          86.75        2.760964e-03
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 2048    108.96         197.23        2.071686e-03
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 3072    115.78         293.54        1.725181e-03
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 ** On entry to ZGEMM  parameter number 13 had an illegal value
 4032    121.36         604.60        3.726479e-01
Argument 6 of magma_zpotrf had an illegal value.
 5184    125.88         807968.36        6.149519e+01
Argument 6 of magma_zpotrf had an illegal value.
 6048    127.19         1586436.50        6.648873e+01
Argument 6 of magma_zpotrf had an illegal value.
 7200    128.72         2945677.70        7.262484e+01
Argument 6 of magma_zpotrf had an illegal value.
 8064    128.78         4019409.11        7.690918e+01
Argument 6 of magma_zpotrf had an illegal value.
 8928    129.27         5582925.11        8.096837e+01
Argument 6 of magma_zpotrf had an illegal value.
10240    129.12         8523633.07        8.677301e+01
Argument 6 of magma_zpotrf had an illegal value.
20000    136.71         61663969.36        1.216120e+02
Argument 6 of magma_zpotrf had an illegal value.

(I killed it before it completed.)

Any ideas?

Thanks,
Jeremiah

Re: Problem with testing_zgesv

Postby mgates3 » Sat May 05, 2012 1:32 pm

Nothing is apparent yet. Since dgetrf works, it doesn't seem to be a precision issue. Since smaller sizes work, it doesn't seem to be a memory issue. It looks like everything double complex is failing for you. Try testing_*gemm with various sizes.

What is your hardware (CPU, memory, GPU), OS, compiler, MAGMA version, CUDA version, and BLAS and LAPACK versions? What is your make.inc? What is your $LD_LIBRARY_PATH? Are any other environment variables relevant (MKL, etc.)? Any other information about how you compiled it, problems you ran into, etc.?

If you can, try different versions of CUDA, BLAS, etc., to see if some different combination works. If you have a different computer, try on different hardware.

I noticed you have multiple GPU cards installed. Try using a different card. In the testing program, right after TESTING_CUDA_INIT put:
    // Select GPU 1 instead of the default GPU 0; stop if the call fails.
    if ( cudaSetDevice( 1 ) != cudaSuccess ) {
        printf( "cudaSetDevice failed\n" );
        exit(1);
    }
The "1" selects card 1. Without this call, the runtime defaults to card 0.

-mark

Re: Problem with testing_zgesv

Postby jeremiahpalmer » Tue May 08, 2012 3:38 pm

Some details:

In testing_zgemm, I commented out the call to magma_zgemm and forced it to run only one matrix size (M=N=K=1024). Calling cudaGetLastError before cublasZgemm reports no error; calling it after cublasZgemm reports "unspecified launch failure".

Hardware Details:
  • 2 x Intel X5675
  • 98 GB memory, 8 x Tesla C2075
Software Details:
  • MAGMA 1.1.0
  • CUDA release 4.1, V0.2.1221
  • Code:
    LD_LIBRARY_PATH=/opt/intel/Compiler/11.1/064/lib/intel64:/opt/intel/Compiler/11.1/064/ipp/em64t/sharedlib:/opt/intel/Compiler/11.1/064/mkl/lib/em64t:/opt/intel/Compiler/11.1/064/tbb/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/lib:/opt/intel/Compiler/11.1/064/lib/intel64:/opt/intel/Compiler/11.1/064/ipp/em64t/sharedlib:/opt/intel/Compiler/11.1/064/mkl/lib/em64t:/opt/intel/Compiler/11.1/064/tbb/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/lib:/opt/hpmpi/lib/linux_amd64:/opt/intel/mkl/9.0/lib/em64t:/usr/local/cuda/lib64
  • Kernel Version: 2.6.32-220.4.2.el6.x86_64
  • OS: CentOS release 6.2 (Final)
  • Intel Compiler Version 11.1
  • BLAS and LAPACK: MKL 10.0.011

Also:
I tried running this on a different GPU card on the same machine. It failed the same way on each GPU.
I tried running this on a different machine with 4 GTX 480s, and it failed the same way.

Thanks,
Jeremiah

Re: Problem with testing_zgesv

Postby jeremiahpalmer » Thu May 17, 2012 12:26 pm

Any ideas?

I downloaded MAGMA 1.2, ran testing_zgemm as is, and got this output:

Code:
CUBLAS error: memory mapping error (11) in magma_zgetmatrix at zset_get.cpp:113
CUBLAS error: memory mapping error (11) in magma_zsetmatrix at zset_get.cpp:98
CUBLAS error: memory mapping error (11) in magma_zgetmatrix at zset_get.cpp:113
CUBLAS error: memory mapping error (11) in magma_zsetmatrix at zset_get.cpp:98
CUBLAS error: memory mapping error (11) in magma_zsetmatrix at zset_get.cpp:98
CUBLAS error: memory mapping error (11) in magma_zsetmatrix at zset_get.cpp:98
testing_zgemm: zgemm_fermi.cu:349: void magmablas_zgemm(char, char, int, int, int, cuDoubleComplex, const cuDoubleComplex*, int, const cuDoubleComplex*, int, cuDoubleComplex, cuDoubleComplex*, int): Assertion `cudaBindTexture(&offsetA, tex_ref_A, d_A, sizeA*sizeof(cuDoubleComplex)) == cudaSuccess' failed.
device 0: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 1: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 2: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 3: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 4: Tesla C2075, 1147.0 MHz clock, 5375.2 MB memory, capability 2.0
device 5: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 6: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0
device 7: Tesla C2075, 1147.0 MHz clock, 5375.4 MB memory, capability 2.0

Usage:
  testing_zgemm [-NN|NT|TN|TT] [-N 1024]


Testing transA = N  transB = N
    M    N    K     MAGMA GFLop/s    CUBLAS GFlop/s       error
==================================================================
 1024  1024  1024       648.39           91382.28         1.412704e+00



Thanks for the help!
-Jeremiah

Re: Problem with testing_zgesv

Postby mgates3 » Mon May 21, 2012 11:30 am

Memory mapping errors usually happen when you pass a device pointer where a host (CPU) pointer is expected, or vice versa, or whenever the pointer is otherwise invalid. They are the GPU equivalent of a generic segfault.

What is your make.inc file?
I noticed in an earlier post that you used Intel icc. Perhaps try with gcc.

-mark

Re: Problem with testing_zgesv

Postby jeremiahpalmer » Mon May 21, 2012 1:03 pm

Your suggestion worked! I replaced icc with gcc and recompiled. testing_zgesv, testing_zgemm, etc. now all behave themselves.

Thanks!
-Jeremiah