dsytrd slower in magma 1.1.0 than 1.0.0_rc5


Post by jeremiahpalmer » Thu Jan 12, 2012 6:34 pm

Hello!
I am installing MAGMA 1.1.0 for use in my group. I ran testing_dsytrd from magma_1.0.0_rc5 and compared it with testing_dsytrd from magma_1.1.0 (same machine, same GPUs). These are the results for rc5:

Code:
device 0: GeForce GTX 480, 1401.0 MHz clock, 1535.7 MB memory
device 1: GeForce GTX 480, 1401.0 MHz clock, 1535.7 MB memory
device 2: GeForce GTX 480, 1401.0 MHz clock, 1535.7 MB memory
device 3: GeForce GTX 480, 1401.0 MHz clock, 1535.7 MB memory

Usage:
  testing_dsytrd -L|U -N 1024



  N    CPU GFlop/s    GPU GFlop/s   |A-QHQ'|/N|A|  |I-QQ'|/N
=============================================================
 1024    21.83         14.22
 2048    32.64         14.63
 3072    31.50         22.26
 4032    22.50         28.16
 5184    18.74         34.08
 6016    17.33         37.37
 7040    16.41         40.52
 8064    15.74         42.76
 9088    15.42         45.20
10112    15.20         46.90


Here is what I get for the latest version, 1.1.0:
Code:
device 0: GeForce GTX 480, 1401.0 MHz clock, 1535.7 MB memory, capability 2.0
device 1: GeForce GTX 480, 1401.0 MHz clock, 1535.7 MB memory, capability 2.0
device 2: GeForce GTX 480, 1401.0 MHz clock, 1535.7 MB memory, capability 2.0
device 3: GeForce GTX 480, 1401.0 MHz clock, 1535.7 MB memory, capability 2.0

Usage:
  testing_dsytrd -L|U -N 1024



  N    CPU GFlop/s    GPU GFlop/s   |A-QHQ'|/N|A|  |I-QQ'|/N
=============================================================
 1024    23.10         15.72
 2048    33.27         21.13
 3072    32.39         24.67
 4032    22.80         25.70
 5184    18.91         27.75
 6016    17.53         29.06
 7040    16.55         29.82
 8064    15.87         29.71
 9088    15.56         30.79
10112    15.30         31.30
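
To put a number on the gap between the two runs, here is a minimal C sketch that computes the ratio of the 1.1.0 GPU rate to the rc5 GPU rate at each size. The arrays are copied from the two tables above; the function names are made up for illustration and are not part of MAGMA.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SIZES 10

/* Matrix sizes and GPU GFlop/s figures copied from the two testing_dsytrd runs. */
static const int sizes[NUM_SIZES] =
    {1024, 2048, 3072, 4032, 5184, 6016, 7040, 8064, 9088, 10112};
static const double gpu_rc5[NUM_SIZES] =
    {14.22, 14.63, 22.26, 28.16, 34.08, 37.37, 40.52, 42.76, 45.20, 46.90};
static const double gpu_1_1_0[NUM_SIZES] =
    {15.72, 21.13, 24.67, 25.70, 27.75, 29.06, 29.82, 29.71, 30.79, 31.30};

/* Ratio of 1.1.0 to rc5 GPU throughput for table row i.
   A value below 1.0 means 1.1.0 is slower at that size. */
double throughput_ratio(size_t i)
{
    assert(i < NUM_SIZES);
    return gpu_1_1_0[i] / gpu_rc5[i];
}

/* Smallest tested N at which 1.1.0 falls behind rc5, or -1 if it never does. */
int first_slower_size(void)
{
    for (size_t i = 0; i < NUM_SIZES; ++i) {
        if (throughput_ratio(i) < 1.0)
            return sizes[i];
    }
    return -1;
}
```

With these figures, first_slower_size() returns 4032, and the ratio at N = 10112 is about 0.67, so 1.1.0 is ahead for the three smallest sizes and behind from there on.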


The only change I made to the MAGMA source code is in my rc5 copy, where I added this:
Code:
#define cublasDsymv magmablas_dsymv

at line 364 in dlatrd.cpp, following this response I received in a forum discussion last year:
viewtopic.php?f=2&t=175
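
For readers unfamiliar with the trick, the following C sketch shows how such a #define redirects later call sites. The two functions are stand-in stubs with a simplified one-argument signature (the real routines take many more arguments), and panel_step is a hypothetical call site, not MAGMA code.

```c
#include <assert.h>

/* Records which stub ran last: 1 = cuBLAS stub, 2 = MAGMA BLAS stub. */
static int last_impl = 0;

/* Stand-in stubs only; these are NOT the real cuBLAS or MAGMA signatures. */
void cublasDsymv(int n)     { assert(n >= 0); last_impl = 1; }
void magmablas_dsymv(int n) { assert(n >= 0); last_impl = 2; }

/* From this line on, the preprocessor rewrites every cublasDsymv token. */
#define cublasDsymv magmablas_dsymv

/* Hypothetical call site: the source still says cublasDsymv,
   but the compiler sees a call to magmablas_dsymv. */
int panel_step(int n)
{
    cublasDsymv(n);
    return last_impl;
}
```

The placement matters: the macro has to come after the cuBLAS declarations have been included and before the call sites it should affect, and the swap only compiles when both routines accept the same argument list.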

Why does the newer dsytrd run slower than rc5 for N of 4032 and above? Have I neglected to set a parameter somewhere?

Thanks,
Jeremiah
jeremiahpalmer
 
Posts: 58
Joined: Fri Jan 28, 2011 12:46 pm

Re: dsytrd slower in magma 1.1.0 than 1.0.0_rc5

Post by jeremiahpalmer » Thu Jan 19, 2012 11:04 am

Does anyone know the answer to my question?

Re: dsytrd slower in magma 1.1.0 than 1.0.0_rc5

Post by jeremiahpalmer » Tue Feb 28, 2012 5:11 pm

I have done a little digging, and it looks like testing_dsytrd calls dsytrd, which calls dlatrd, which calls cublasDsymv. dsytrd2_gpu, on the other hand, calls magmablas_dsymv (if the compute capability is 2.0 or above). Since MAGMA's dsymv is much faster than cuBLAS's dsymv, I suspect dsytrd2_gpu is the routine I should be testing.

Will there be a "testing_dsytrd_gpu" routine in the next version of magma? Or do you have one already made that I can have? :)

Thanks,
Jeremiah

Re: dsytrd slower in magma 1.1.0 than 1.0.0_rc5

Post by mgates3 » Tue Feb 28, 2012 5:26 pm

Have you tried the same modification that you added to MAGMA 1.0 rc5?
-mark
mgates3
 
Posts: 407
Joined: Fri Jan 06, 2012 2:13 pm

Re: dsytrd slower in magma 1.1.0 than 1.0.0_rc5

Post by jeremiahpalmer » Tue Feb 28, 2012 5:51 pm

No, I haven't. The dlatrd2 code calls a new (to me) version of magma_dsymv which requires a workspace. I'm a little hesitant to modify magma *that* much.

dsymv is the bottleneck of dsytrd, so I guess it would be more informative to test the new version of magma_dsymv first, although I don't see a testing_dsymv that calls the new version either...
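
For context on what is being swapped, here is a plain CPU sketch of the operation dsymv performs, y := alpha*A*x + beta*y for a symmetric A, reading only the lower triangle of a column-major array. It is a reference loop for illustration, not the cuBLAS or MAGMA kernel, and the name ref_dsymv_lower is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Reference y := alpha*A*x + beta*y for a symmetric n-by-n matrix A.
   A is column-major with leading dimension lda; only entries on or
   below the diagonal are read, so the upper triangle may hold anything. */
void ref_dsymv_lower(size_t n, double alpha, const double *A, size_t lda,
                     const double *x, double beta, double *y)
{
    assert(lda >= n);

    /* Scale y first; beta == 0 overwrites y instead of multiplying it. */
    for (size_t i = 0; i < n; ++i)
        y[i] = (beta == 0.0) ? 0.0 : beta * y[i];

    for (size_t j = 0; j < n; ++j) {
        /* Diagonal entry A(j,j). */
        y[j] += alpha * A[j + j * lda] * x[j];

        /* Each stored entry A(i,j), i > j, is used twice:
           once as A(i,j) and once as A(j,i) by symmetry. */
        for (size_t i = j + 1; i < n; ++i) {
            double a = A[i + j * lda];
            y[i] += alpha * a * x[j];
            y[j] += alpha * a * x[i];
        }
    }
}
```

Every matrix entry is loaded once and used for only a couple of floating-point operations, so the routine tends to be limited by memory bandwidth. That would be consistent with a faster dsymv kernel having a large effect on the overall dsytrd rate.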

Re: dsytrd slower in magma 1.1.0 than 1.0.0_rc5

Post by jeremiahpalmer » Tue Feb 28, 2012 11:21 pm

I replaced the calls to cublasDsymv in dlatrd with magmablas_dsymv, and now dsytrd performs as it did in rc5. I look forward to trying out dsytrd2_gpu when it becomes usable!

Thanks,
Jeremiah

