MAGMA with GotoBLAS

Post by admin » Wed Aug 05, 2009 10:39 pm

I have a 9-node Intel quad-core cluster. The master node has a BFG GTX 260 NVIDIA GPU card in a PCI Express x16 slot, with the latest CUDA 2.3 SDK, toolkit, and drivers. The GPU has 216 processing elements, and I managed to get your new MAGMA library working with GotoBLAS v1.26 (academic) and gfortran using the attached Make.inc. One has to add -llapack, remove -lguide, and run
yum install lapack
on Fedora Core 11 Linux, because that is what my cluster runs. I have also hard-coded the path to the GotoBLAS library, so you will have to change -L/bummer/GotoBLAS to something like -L$(HOME)/GotoBLAS. On FC11 the change to Make.inc is all that is required, and everything compiles beautifully with make all in the ../testing directory of the magma-0.1 install!
I tried testing_spotrf -N 1024 3072 and it gave good results, even during a run in which all host CPUs were busy with another program (a long-running number theory job).
Best wishes,
Allan MeneZes MMATH

Re: MAGMA with GotoBLAS

Post by admin » Wed Aug 05, 2009 10:50 pm

Allan,
Thanks for your post and the input! The make.inc that you sent is now included in the MAGMA distribution under the name make.inc.goto as an example of linking MAGMA with GotoBLAS. The other two make.inc examples that we have are make.inc.mkl and make.inc.atlas, for linking MAGMA with MKL BLAS and ATLAS BLAS respectively.
Regards,
Stan Tomov

Re: MAGMA with GotoBLAS

Post by Allan Menezes » Thu Aug 06, 2009 6:20 pm

The make.inc.goto has a hard-coded line, -L/bummer/GotoBLAS, pointing to the GotoBLAS library libgoto.a. Please change it to -L$(HOME)/GotoBLAS if your GotoBLAS library is in your $HOME directory; otherwise go to the directory in your GotoBLAS installation that contains libgoto.a, run pwd, and use that path: -L/path to libgoto.a
MAGMA also compiles well with CUDA 2.3 and gfortran using the above make.inc.goto on Fedora Core 11 x86_64. You will have to yum install lapack on Fedora or Red Hat distros!
After you do that, as root go to /usr/lib64 and do the following: #ln -s liblapack.so.x.x.x ... liblapack.a
Then it should find your lapack library!
The above was tested on a GTX 260 under Fedora Core 11 x86_64 with an Intel Q6600 quad core, the latest gfortran, and CUDA 2.3.
When I get the time I shall do complete testing of the various testing_* executables in the ../magma-0.1/testing directory and post the results for the above setup.
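For anyone who would rather script the path change than edit the file by hand, here is a minimal Python sketch. Only the two -L flags come from the posts above; the function name, the variable name LIB, and the order of the other flags in the sample line are assumptions made up for illustration and may not match the real make.inc.goto.

```python
# Hypothetical sketch: repoint the hard-coded GotoBLAS path in make.inc.goto.
OLD_FLAG = "-L/bummer/GotoBLAS"   # hard-coded path mentioned in the post
NEW_FLAG = "-L$(HOME)/GotoBLAS"   # suggested replacement; adjust to your install


def repoint_gotoblas(make_inc_text: str, new_flag: str = NEW_FLAG) -> str:
    """Return the make.inc text with the hard-coded GotoBLAS -L flag replaced."""
    return make_inc_text.replace(OLD_FLAG, new_flag)


# The variable name LIB and the flag order are assumptions, not the real file.
sample = "LIB = -L/bummer/GotoBLAS -lgoto -llapack\n"
patched = repoint_gotoblas(sample)
print(patched, end="")
```

To apply it to a real file, read make.inc.goto into a string, pass it through repoint_gotoblas, and write the result back.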
Allan

Re: MAGMA with GotoBLAS

Post by Allan Menezes » Sat Aug 08, 2009 12:42 am

Here are the results for the following setup with a BFG GTX 260 NVIDIA GPU:
CPU: Intel Quad Core Q6600, stably overclocked to 3.12 GHz
Memory: 8 GB DDR2 1066 MHz
Motherboard: Asus P5Q-VM
CUDA 2.3, CUDA driver 190.16 (stable)
Fedora Core 11 x86_64
gfortran, GotoBLAS v1.26 64-bit

****************************************************************************************

Usage:
testing_dgetrf -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||PA-LU|| / (||A||*N)
=========================================================
 1024         23.77         25.62   3.408593e-18
 2048         30.88         37.63   3.214269e-18
 3072         34.16         44.53   2.936542e-18
 4032         35.58         48.86   3.327469e-18
 5184         37.85         55.31   3.319174e-18
 6016         38.78         53.84   2.811525e-18
 7040         39.69         59.08   2.790057e-18
 8064         40.31         59.16   2.751329e-18
 9088         40.82         61.23   2.742905e-18
10112         41.32         61.90   2.718463e-18


Usage:
testing_dgetrf_gpu -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||PA-LU|| / (||A||*N)
=========================================================
 1024         24.18         28.76   3.408593e-18
 2048         31.12         40.73   3.214269e-18
 3072         34.16         47.43   2.936542e-18
 4032         36.22         51.48   3.327469e-18
 5184         38.03         57.91   3.319174e-18
 6016         38.79         59.69   2.811525e-18
 7040         39.57         61.26   2.790057e-18
 8064         40.15         61.05   2.751329e-18
 9088         40.67         63.01   2.742905e-18
10112         41.14         63.53   2.718463e-18
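To get a feel for what these rates mean in wall-clock terms, here is a rough Python sketch that converts a GFlop/s figure back into seconds. It assumes the usual leading-order LU operation count of about (2/3)·N³ flops; the testing programs may count lower-order terms as well, so treat the times as estimates only. The function name is made up for illustration.

```python
def lu_seconds(n: int, gflops: float) -> float:
    """Approximate LU factorization time from a GFlop/s rate.

    Assumes ~(2/3) * n**3 floating-point operations (leading-order term only).
    """
    flops = 2.0 * n**3 / 3.0
    return flops / (gflops * 1e9)


# N = 10112 row of the testing_dgetrf_gpu table above
cpu_t = lu_seconds(10112, 41.14)
gpu_t = lu_seconds(10112, 63.53)
print(f"CPU ~{cpu_t:.1f} s, GPU ~{gpu_t:.1f} s")
```

The same idea carries over to the other routines with their own leading-order counts (roughly (4/3)·N³ for QR and (1/3)·N³ for Cholesky).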

********************************************************************************

Usage:
testing_dgeqrf -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||R||_F / ||A||_F
=====================================================
 1024         10.72         22.58   2.124636e-15
 2048         13.70         41.55   2.690812e-15
 3072         16.50         37.81   3.305553e-15
 4032         19.15         46.62   3.649429e-15
 5184         20.04         52.09   3.868397e-15
 6016         20.14         54.26   4.224504e-15
 7040         20.61         56.01   4.673449e-15
 8064         20.63         56.94   5.023208e-15
 9088         20.86         58.05   5.214826e-15
10112         20.86         58.64   5.505322e-15


Usage:
testing_dgeqrf_gpu -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||R||_F / ||A||_F
=====================================================
 1024          8.36         22.99   2.123458e-15
 2048         13.70         41.87   2.695606e-15
 3072         16.33         37.89   3.305553e-15
 4032         19.09         46.72   3.648723e-15
 5184         19.91         52.32   3.868533e-15
 6016         20.15         54.50   4.224504e-15
 7040         20.49         56.28   4.673449e-15
 8064         20.54         57.20   5.023208e-15
 9088         20.76         58.28   5.214826e-15
10112         20.73         58.87   5.505322e-15


***********************************************************************************

Usage:
testing_dpotrf -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||R||_F / ||A||_F
=====================================================
 1024          8.80         20.57   5.264627e-17
 2048         33.78         33.23   6.141919e-17
 3072         35.52         40.44   6.582184e-17
 4032         39.59         44.91   6.715797e-17
 5184         40.84         50.99   6.330631e-17
 6144         41.42         53.22   6.495241e-17
 6912         42.09         54.47   6.484325e-17
 8192         42.79         56.28   6.852817e-17
 8960         42.25         57.11   6.923006e-17
 9984         43.37         58.11   7.124376e-17


Usage:
testing_dpotrf_gpu -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||R||_F / ||A||_F
=====================================================
 1024         10.77         23.66   5.264627e-17
 2048         33.71         35.92   6.141919e-17
 3072         37.01         42.98   6.582184e-17
 4032         39.48         47.27   6.715797e-17
 5184         41.08         53.21   6.330631e-17
 6144         41.27         55.23   6.495241e-17
 6912         42.03         56.42   6.484325e-17
 8192         42.72         58.05   6.852817e-17
 8960         42.80         58.31   6.923006e-17
 9984         43.26         59.63   7.124376e-17

*******************************************************************************************

Usage:
testing_sgetrf -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||PA-LU|| / (||A||*N)
=========================================================
 1024         40.66         48.55   1.976306e-09
 2048         57.04        108.70   1.850563e-09
 3072         62.21        156.67   1.703210e-09
 4032         68.10        166.09   1.892800e-09
 5184         70.93        210.96   1.880247e-09
 6016         73.26        224.20   1.655149e-09
 7040         74.78        236.94   1.620820e-09
 8064         76.36        204.78   1.733507e-09
 9088         77.45        254.48   1.937144e-09
10112         78.46        261.02   2.085332e-09


Usage:
testing_sgetrf_gpu -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||PA-LU|| / (||A||*N)
=========================================================
 1024         40.93         54.12   1.976306e-09
 2048         56.96        122.54   1.850563e-09
 3072         63.67        175.98   1.703210e-09
 4032         67.20        182.30   1.892800e-09
 5184         71.24        230.75   1.880247e-09
 6016         73.24        243.31   1.655149e-09
 7040         74.72        254.99   1.620820e-09
 8064         74.58        215.99   1.733507e-09
 9088         77.42        270.48   1.937144e-09
10112         78.40        275.95   2.085332e-09

***************************************************************************************

Usage:
testing_sgeqrf -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||R||_F / ||A||_F
=====================================================
 1024         14.37         40.30   1.026423e-06
 2048         19.89         87.27   1.408154e-06
 3072         24.35         79.17   1.695626e-06
 4032         29.81        104.92   1.881358e-06
 5184         32.53        136.18   1.998221e-06
 6016         33.70         83.37   2.307082e-06
 7040         34.52         97.77   2.549049e-06
 8064         34.52        111.81   2.747592e-06
 9088         34.82        126.09   7.548179e-02
10112         35.39        139.46   2.915735e-06


Usage:
testing_sgeqrf_gpu -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||R||_F / ||A||_F
=====================================================
 1024         11.21         42.24   1.027562e-06
 2048         19.53         88.53   1.410831e-06
 3072         24.44         79.42   1.696237e-06
 4032         29.66        105.61   1.881252e-06
 5184         32.47        136.40   1.998285e-06
 6016         33.56         83.56   2.307082e-06
 7040         39.90         97.91   2.549049e-06
 8064         34.24        112.05   2.747592e-06
 9088         34.89        126.35   7.548179e-02
10112         35.21        139.93   2.915735e-06
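When reading residual columns like these, it can help to scan for rows that sit far away from their neighbours. Below is a small Python sketch that does this for the testing_sgeqrf residuals posted above. The helper name and the threshold of 100 times the median are arbitrary assumptions chosen for illustration, not anything MAGMA itself uses.

```python
from statistics import median

# (N, residual) pairs copied from the testing_sgeqrf table above
rows = [
    (1024, 1.026423e-06), (2048, 1.408154e-06), (3072, 1.695626e-06),
    (4032, 1.881358e-06), (5184, 1.998221e-06), (6016, 2.307082e-06),
    (7040, 2.549049e-06), (8064, 2.747592e-06), (9088, 7.548179e-02),
    (10112, 2.915735e-06),
]


def flag_outliers(rows, factor=100.0):
    """Return the sizes whose residual exceeds `factor` times the median residual."""
    med = median(r for _, r in rows)
    return [n for n, r in rows if r > factor * med]


suspect = flag_outliers(rows)
print(suspect)
```

A flagged size is only a hint that the run for that N may deserve a second look; the thread itself does not say what caused the value.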

**************************************************************************************

Usage:
testing_spotrf -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||R||_F / ||A||_F
=====================================================
 1024         11.27         43.98   3.145090e-08
 2048         62.01         94.15   3.374819e-08
 3072         69.39        129.24   3.437894e-08
 4032         73.52        153.75   3.890706e-08
 5184         75.33        182.81   3.974947e-08
 6048         77.89        194.71   3.991257e-08
 7200         80.83        208.44   4.320007e-08
 8064         81.54        185.18   4.465149e-08
 8928         82.92        223.65   4.594171e-08
10080         84.24        231.09   4.837715e-08


Usage:
testing_spotrf_gpu -N 1024

    N   CPU GFlop/s   GPU GFlop/s   ||R||_F / ||A||_F
=====================================================
 1024          6.86         51.80   3.145090e-08
 2048         61.63        106.52   3.374819e-08
 3072         67.74        143.90   3.437894e-08
 4032         70.74        168.35   3.890706e-08
 5184         76.93        198.49   3.974947e-08
 6048         79.05        209.80   3.991257e-08
 7200         80.25        222.62   4.320007e-08
 8064         82.11        194.87   4.465149e-08
 8928         82.63        236.85   4.594171e-08
10080         84.11        243.53   4.837715e-08

**************************************************************************************
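As a quick way to summarize the tables, the GPU-over-CPU ratio at the largest problem size can be computed with a few lines of Python. The numbers below are the last rows of the six non-_gpu tables above; the helper name and the choice of the largest N as the comparison point are just assumptions made for this sketch.

```python
# (routine, N, CPU GFlop/s, GPU GFlop/s): largest-N rows from the tables above
largest = [
    ("dgetrf", 10112, 41.32, 61.90),
    ("dgeqrf", 10112, 20.86, 58.64),
    ("dpotrf", 9984, 43.37, 58.11),
    ("sgetrf", 10112, 78.46, 261.02),
    ("sgeqrf", 10112, 35.39, 139.46),
    ("spotrf", 10080, 84.24, 231.09),
]


def speedup(cpu_gflops: float, gpu_gflops: float) -> float:
    """Ratio of the GPU rate to the CPU rate for the same problem."""
    return gpu_gflops / cpu_gflops


speedups = {name: speedup(cpu, gpu) for name, _, cpu, gpu in largest}
for name, s in speedups.items():
    print(f"{name}: {s:.2f}x")
```

On these figures the single-precision routines show a noticeably larger gain over the CPU than their double-precision counterparts.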
Best wishes and I hope somebody finds this information useful!
Allan MeneZes

Re: MAGMA with GotoBLAS

Post by brutus » Mon Sep 14, 2009 1:52 pm

Allan Menezes wrote:
You will have to yum install lapack on Fedora or Red Hat distros!
After you do that, as root go to /usr/lib64 and do the following: #ln -s liblapack.so.x.x.x ... liblapack.a
Then it should find your lapack library!


I suppose it's a typo, but just in case -- linking a ".so" to a ".a" is a bad idea. A simpler solution is to install the "lapack-devel" package, in addition to "lapack".

Re: MAGMA with GotoBLAS

Post by Allan Menezes » Mon Sep 14, 2009 11:44 pm

Thanks, Brutus, for clarifying this. It was my ignorance!
Allan MeneZes