Magma 1.40 benchmarks - GeForce GTX TITAN; Linux


Postby Boxed Cylon » Mon Sep 23, 2013 7:34 am

MAGMA 1.4.0 compiled fairly easily compared to previous releases, thanks. One tweak I needed in my make.inc file was to add the -lstdc++ flag; without it I got undefined symbols when compiling.

I write mainly to post my benchmarks for SGEMM and to ask for comment. I compiled for the Kepler architecture and used the GotoBLAS2 library, but as you can see, the MAGMA SGEMM is roughly 2-3 times slower than the CUBLAS one. This seems puzzling, so I ask for comment.

The CPU is an i7-3820, the Linux version is SUSE 12.3, and the GCC version is 4.6.2, compiled from source. I also have the Titan double-precision boost turned off; turning it on slows single precision a bit.

> ./testing_sgemm --lapack --nthread 4
MAGMA 1.4.0 , capability 3.0
device 0: GeForce GTX TITAN, 875.5 MHz clock, 6143.8 MB memory, capability 3.5
device 1: GeForce GT 440, 1620.0 MHz clock, 1023.2 MB memory, capability 2.1
Warning: MAGMA compiled for higher capability; some routines will not run correctly!
Usage: ./testing_sgemm [options] [-h|--help]

If running lapack (option --lapack), MAGMA and CUBLAS error are both computed
relative to CPU BLAS result. Else, MAGMA error is computed relative to CUBLAS result.

transA = N, transB = N
    M     N     K   MAGMA Gflop/s (ms)  CUBLAS Gflop/s (ms)   CPU Gflop/s (ms)  MAGMA error  CUBLAS error
=========================================================================================================
 1088  1088  1088    956.09 (   2.69)    2472.27 (   1.04)     99.37 (  25.92)    2.25e-06     2.16e-06
 2112  2112  2112   1192.94 (  15.79)    2957.79 (   6.37)    110.90 ( 169.90)    3.18e-06     3.18e-06
 3136  3136  3136   1209.24 (  51.01)    3159.62 (  19.52)    112.28 ( 549.38)    4.23e-06     4.23e-06
 4160  4160  4160   1202.47 ( 119.74)    3394.37 (  42.42)    112.57 (1279.07)    4.82e-06     4.82e-06
 5184  5184  5184   1376.62 ( 202.40)    3821.43 (  72.91)    112.91 (2467.68)    6.25e-06     6.33e-06
 6208  6208  6208   1453.41 ( 329.23)    3830.23 ( 124.93)    112.99 (4234.98)    6.01e-06     6.01e-06
 7232  7232  7232   1378.39 ( 548.82)    3764.40 ( 200.96)    113.28 (6677.92)    6.15e-06     6.15e-06
 8256  8256  8256   1456.06 ( 772.97)    3622.18 ( 310.72)    109.49 (10279.64)    6.58e-06     6.58e-06
 9280  9280  9280   1450.92 (1101.62)    3572.99 ( 447.34)    111.17 (14377.87)    7.93e-06     7.93e-06
10304 10304 10304   1453.33 (1505.51)    3448.65 ( 634.45)    110.01 (19888.63)    8.46e-06     8.46e-06
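
As a sanity check on the table above, the Gflop/s columns can be recomputed from the matrix size and the time in milliseconds. This is a minimal sketch that assumes the conventional 2*M*N*K operation count for GEMM; the helper name `gemm_gflops` is hypothetical and not part of MAGMA. Small differences from the printed values are expected because the times are rounded to 0.01 ms.

```python
# Sketch: recompute GEMM throughput from problem size and measured time.
# Assumption: flop count for C = alpha*A*B + beta*C is taken as 2*M*N*K.

def gemm_gflops(m, n, k, ms):
    """Return Gflop/s for an m x n x k GEMM that took `ms` milliseconds."""
    flops = 2.0 * m * n * k
    return flops / (ms * 1e-3) / 1e9

# (N, MAGMA ms, CUBLAS ms) copied from three rows of the SGEMM table.
rows = [
    (1088, 2.69, 1.04),
    (5184, 202.40, 72.91),
    (10304, 1505.51, 634.45),
]

for n, magma_ms, cublas_ms in rows:
    magma = gemm_gflops(n, n, n, magma_ms)
    cublas = gemm_gflops(n, n, n, cublas_ms)
    # The ratio shows how many times faster CUBLAS is for this size.
    print(f"N={n:5d}  MAGMA {magma:8.1f}  CUBLAS {cublas:8.1f}  "
          f"ratio {cublas / magma:.2f}")
```

For these three rows the CUBLAS/MAGMA ratio stays between 2 and 3, consistent with the "2-3 times slower" description.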


Here are the testing results for SGESV:

> ./testing_sgesv --lapack --nthread 4
MAGMA 1.4.0 , capability 3.0
device 0: GeForce GTX TITAN, 875.5 MHz clock, 6143.8 MB memory, capability 3.5
device 1: GeForce GT 440, 1620.0 MHz clock, 1023.2 MB memory, capability 2.1
Warning: MAGMA compiled for higher capability; some routines will not run correctly!
Usage: ./testing_sgesv [options] [-h|--help]

ngpu 1
    N  NRHS   CPU Gflop/s (sec)   GPU GFlop/s (sec)   ||B - AX|| / N*||A||*||X||
================================================================================
 1088     1     18.31 (   0.05)     62.04 (   0.01)   2.10e-10
 2112     1     84.43 (   0.07)    165.29 (   0.04)   2.06e-10
 3136     1     94.40 (   0.22)    278.90 (   0.07)   1.48e-10
 4160     1     97.99 (   0.49)    287.75 (   0.17)   1.82e-10
 5184     1     94.79 (   0.98)    378.49 (   0.25)   1.65e-10
 6208     1     99.00 (   1.61)    469.18 (   0.34)   1.71e-10
 7232     1    101.77 (   2.48)    558.55 (   0.45)   1.08e-10
 8256     1    102.67 (   3.65)    648.34 (   0.58)   9.63e-11
 9280     1    103.86 (   5.13)    743.87 (   0.72)   1.33e-10
10304     1    105.24 (   6.93)    841.11 (   0.87)   1.02e-10
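
The SGESV rates above can be cross-checked the same way. This sketch assumes only the leading-order operation count for an LU solve, (2/3)*N^3 for the factorization plus 2*N^2*NRHS for the triangular solves; the tester may use a more exact count, and the helper name `gesv_gflops` is hypothetical. Because the times are printed to only 0.01 s, agreement within a percent or so is the most that can be expected.

```python
# Sketch: recompute LU-solve throughput from problem size and measured time.
# Assumption: flops ~ (2/3)*N^3 (factorization) + 2*N^2*NRHS (solves),
# i.e. leading-order terms only.

def gesv_gflops(n, nrhs, sec):
    """Return approximate Gflop/s for an n x n solve with nrhs right-hand sides."""
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2 * nrhs
    return flops / sec / 1e9

# (N, CPU sec, GPU sec) copied from the last row of the SGESV table.
n, cpu_sec, gpu_sec = 10304, 6.93, 0.87

cpu = gesv_gflops(n, 1, cpu_sec)
gpu = gesv_gflops(n, 1, gpu_sec)
print(f"N={n}: CPU ~{cpu:.1f} Gflop/s, GPU ~{gpu:.1f} Gflop/s, "
      f"speedup ~{gpu / cpu:.1f}x")
```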


Thanks!
Boxed Cylon
 

Re: Magma 1.40 benchmarks - GeForce GTX TITAN; Linux

Postby mgates3 » Mon Sep 23, 2013 1:20 pm

Yes, this is expected. The CUBLAS gemm has been heavily optimized for Kepler. The MAGMABLAS gemm was written for Fermi and has not been updated for Kepler. All of the MAGMA routines now use the CUBLAS gemm, which is basically the same speed as the MAGMA gemm on Fermi, but much faster on Kepler. (Note the magma_sgemm wrapper calls cublasSgemm, not magmablas_sgemm.)
-mark
mgates3
 

Re: Magma 1.40 benchmarks - GeForce GTX TITAN; Linux

Postby fletchjp » Tue Sep 24, 2013 10:15 am

I suggest that people using GotoBLAS move to OpenBLAS instead; it is the successor to GotoBLAS and is still being developed. This avoids some problems with the last release of GotoBLAS.

http://www.openblas.net/

I am a contented user.

John
fletchjp
 

