MAGMA GEMM Sources for Fermi Released

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)

Postby admin » Wed Aug 04, 2010 12:54 pm

The MAGMA BLAS SGEMM and DGEMM sources for Fermi GPUs are now released.
These improved GEMMs, developed by Rajib Nath and Stan Tomov, will be
part of the upcoming MAGMA 0.3 library release and will be included in
CUBLAS 3.2 as well.

The basic algorithm is described in:
Nath, R., Tomov, S., Dongarra, J. "An Improved MAGMA GEMM for Fermi GPUs,"
University of Tennessee Computer Science Technical Report, UT-CS-10-655
(also LAPACK working note 227), July 29, 2010.
http://icl.cs.utk.edu/projectsfiles/mag ... i_gemm.pdf

On a C2050 GPU the new DGEMM reaches up to 300 GFlop/s (58% of peak) and
the SGEMM up to 645 GFlop/s (63% of peak). On a GTX480, DGEMM reaches up to
166 GFlop/s and SGEMM up to 844 GFlop/s.
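
For readers who want to reproduce the percent-of-peak figures, here is a minimal sketch in C. It uses the standard GEMM operation count of 2*m*n*k floating-point operations. The function names are hypothetical, and the C2050 peak values used in the usage note (about 515 GFlop/s double precision, about 1030 GFlop/s single precision) are an assumption taken from commonly quoted hardware specifications, not from this post.

```c
#include <stdio.h>

/* GEMM does one multiply and one add per inner-product term,
   so an m x n x k product costs 2*m*n*k floating-point operations.
   Returns the achieved rate in GFlop/s for a measured run time. */
double gemm_gflops(long m, long n, long k, double seconds)
{
    return 2.0 * (double)m * (double)n * (double)k / seconds / 1e9;
}

/* Achieved rate as a percentage of the theoretical peak.
   The peak value must be supplied by the caller (assumption:
   it comes from the hardware vendor's published figures). */
double percent_of_peak(double gflops, double peak_gflops)
{
    return 100.0 * gflops / peak_gflops;
}
```

With those assumed peaks, percent_of_peak(300.0, 515.2) comes out at roughly 58 and percent_of_peak(645.0, 1030.4) at roughly 63, consistent with the numbers quoted above.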
Attachments
magmablas_gemm_fermi.tar.gz
(9.95 KiB) Downloaded 660 times
admin
Site Admin
 
Posts: 20
Joined: Tue Aug 04, 2009 12:23 pm

Re: MAGMA GEMM Sources for Fermi Released

Postby mbibby » Thu Aug 05, 2010 10:04 am

When will we see the cgemm and zgemm equivalents?

Malcolm
mbibby
 
Posts: 10
Joined: Fri Aug 07, 2009 9:07 am

Re: MAGMA GEMM Sources for Fermi Released

Postby Stan Tomov » Thu Aug 05, 2010 12:32 pm

I am not sure if we would personally write the equivalents. NVIDIA is preparing CUBLAS 3.2
that will have improved c/z gemms using ideas from the s/d gemms.
Stan
Stan Tomov
 
Posts: 262
Joined: Fri Aug 21, 2009 10:39 pm

Re: MAGMA GEMM Sources for Fermi Released

Postby Boxed Cylon » Fri Aug 13, 2010 2:26 am

I preface this post with the declaration that I know just about nothing about details of these routines...

I was looking through the fermi_sgemm.cu routine to get some sense of how the code was engineered. I noticed the __mul24 function and wondered what it did. A Google search turned up the Fermi Tuning Guide, which says:

32-Bit Integer Multiplication
On devices of compute capability 1.x, 32-bit integer multiplication is implemented using multiple instructions as it is not natively supported. 24-bit integer multiplication is natively supported via the __[u]mul24 intrinsic.

On devices of compute capability 2.0, however, 32-bit integer multiplication is natively supported, but 24-bit integer multiplication is not. __[u]mul24 is therefore implemented using multiple instructions and should not be used (Section 5.4.1).


Should the fermi_sgemm.cu routine be using __mul24? (Or perhaps there are reasons 24-bit integers are employed?)
Boxed Cylon
 
Posts: 34
Joined: Sat Nov 21, 2009 6:03 pm

Re: MAGMA GEMM Sources for Fermi Released

Postby Stan Tomov » Tue Sep 07, 2010 1:15 pm

There is no reason to use __mul24. We will remove it. Thanks for pointing this out.
Stan Tomov
 
Posts: 262
Joined: Fri Aug 21, 2009 10:39 pm

Re: MAGMA GEMM Sources for Fermi Released

Postby Allan Menezes » Sun Sep 12, 2010 11:51 pm

Dear Stan,
Since this is just pointer arithmetic and is used in only a few places, it does not change the performance much, as my experiment below shows.
Just for fun, I added a single line, #define __mul24(a,b) ((a)*(b)), at the top of fermi_dgemm.cu and fermi_sgemm.cu. There was no significant difference in GFlop/s, and the error was still 0.00 on a GTX-480.
Device memory on currently available Fermi devices is still < 4 GB; that will change with the Tesla C2070 and CUDA 3.2, which move to 64-bit addresses.
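
To illustrate why the macro swap leaves results unchanged for small index values, here is a minimal host-side sketch in plain C. It does not call the CUDA intrinsic. Instead, umul24_emulated is a hypothetical helper that mimics the documented behaviour of the unsigned variant __umul24 (multiply the low 24 bits of each operand, keep the low 32 bits of the product). The signed __mul24 used in the sources is assumed to behave the same way for small non-negative indices.

```c
#include <stdint.h>

/* Host-side emulation (an assumption for illustration, not the device
   intrinsic): multiply the low 24 bits of each operand and keep the
   low 32 bits of the product. */
uint32_t umul24_emulated(uint32_t x, uint32_t y)
{
    uint64_t p = (uint64_t)(x & 0xFFFFFFu) * (uint64_t)(y & 0xFFFFFFu);
    return (uint32_t)p;
}

/* The replacement from the post, written as a function rather than
   a macro: an ordinary 32-bit multiply that wraps modulo 2^32. */
uint32_t mul32(uint32_t x, uint32_t y)
{
    return x * y;
}

/* Returns 1 if both forms give the same result for this operand pair. */
int same_result(uint32_t x, uint32_t y)
{
    return umul24_emulated(x, y) == mul32(x, y);
}
```

As long as both operands fit in 24 bits (below 16,777,216), same_result returns 1. The two forms diverge only when an operand has bits set above bit 23, for example same_result(1u << 24, 2) returns 0.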
Thank you,
Allan
Allan Menezes
 
Posts: 14
Joined: Wed Aug 05, 2009 10:01 pm

Re: MAGMA GEMM Sources for Fermi Released

Postby rramachand21 » Tue Nov 30, 2010 5:05 pm

Hello,

I am new to CUDA and this API. Could I please get generic source code for matrix-vector multiplication (sgemv and dgemv)?

Thanks,
Ranjith
rramachand21
 
Posts: 2
Joined: Tue Nov 30, 2010 5:02 pm

Re: MAGMA GEMM Sources for Fermi Released

Postby anikam » Fri Mar 16, 2018 5:13 pm

Hello,
Why does MAGMA BLAS only work when m, n, k are multiples of 96?
Can it work if m, n, k are not multiples of 96?

Thanks and Regards
Abhishek Nikam
anikam
 
Posts: 12
Joined: Wed Mar 07, 2018 5:27 pm

Re: MAGMA GEMM Sources for Fermi Released

Postby mgates3 » Fri Mar 16, 2018 5:17 pm

It should work for any m, n, k, not just multiples of 96. If you are having a problem with other sizes, please post specifics, e.g., the output of magma/testing/testing_dgemm.

(There may be problems for very large matrices, due to exceeding GPU texture memory. As I recall, in these cases we just call cublas.)

-mark
mgates3
 
Posts: 822
Joined: Fri Jan 06, 2012 2:13 pm

Re: MAGMA GEMM Sources for Fermi Released

Postby anikam » Fri Mar 16, 2018 5:36 pm

Hello,
Thanks for the reply. For my work I need to use only the open-source MAGMA BLAS GEMM.
Also, it calls cublasgemm if the dimensions are not multiples of 96.
It does not give any particular warning about large sizes (large sizes with dimensions that are multiples of 96 must work).
Also, would the MAGMA BLAS GEMM work for dimensions that are not multiples of 96 but are fairly small?
Are there any specific changes that need to be made for that?
Also, it does not work with the latest CUDA versions. Is there any way I can make it run with the latest CUDA versions?
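
The thread does not say which changes, if any, are needed. One generic workaround for a kernel restricted to block-multiple sizes is to zero-pad the operands up to the next multiple of the block size, since the extra zero rows and columns contribute nothing to the product. The sketch below is plain host-side C under that assumption. It is not taken from the MAGMA sources, and round_up_to and pad_colmajor are hypothetical helper names.

```c
#include <stdlib.h>
#include <string.h>

/* Smallest multiple of block that is >= n (block must be > 0). */
size_t round_up_to(size_t n, size_t block)
{
    return (n + block - 1) / block * block;
}

/* Copy a column-major m x n matrix with leading dimension lda into a
   newly allocated, zero-filled mp x np buffer (leading dimension mp).
   Requires mp >= m and np >= n. Returns NULL on allocation failure.
   The caller owns the result and must free() it. */
float *pad_colmajor(const float *a, size_t m, size_t n, size_t lda,
                    size_t mp, size_t np)
{
    float *p = calloc(mp * np, sizeof *p);
    if (p == NULL)
        return NULL;
    for (size_t j = 0; j < n; ++j)
        memcpy(p + j * mp, a + j * lda, m * sizeof *p);
    return p;
}
```

For example, with a block size of 96, a dimension of 100 would be padded to round_up_to(100, 96), which is 192. After the multiply, only the leading m x n part of the padded result is copied back.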


Thanks and Regards
Abhishek Nikam
anikam
 
Posts: 12
Joined: Wed Mar 07, 2018 5:27 pm
