Fermi and Tesla support in one library?

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)
Post Reply
Posts: 18
Joined: Tue Jan 25, 2011 8:20 pm

Fermi and Tesla support in one library?

Post by brom » Wed Mar 02, 2011 10:12 am

Is it possible to compile MAGMA for both Tesla and Fermi support in one library? Right now you must specify the generation in the Makefile.

I have a pre-compiled application where I don't know what generation the end user will have. They might have a Fermi, Tesla, or even a G80. I know this is possible with NVCC but I don't see how to do it in the MAGMA Makefile.
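For comparison, this is roughly what nvcc itself allows: several `-gencode` clauses embed device code for multiple architectures in one fat binary (a sketch; the file names are made up, and the sm_XX values are the usual G80/Tesla/Fermi mapping):

```shell
# Embed SASS for G80 (sm_10), GT200 Tesla (sm_13) and Fermi (sm_20) in one
# object, plus PTX for compute_20 so the driver can JIT for newer cards.
nvcc -c magma_kernels.cu -o magma_kernels.o \
     -gencode arch=compute_10,code=sm_10 \
     -gencode arch=compute_13,code=sm_13 \
     -gencode arch=compute_20,code=sm_20 \
     -gencode arch=compute_20,code=compute_20
```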


Posts: 41
Joined: Tue Mar 08, 2011 12:38 pm

Re: Fermi and Tesla support in one library?

Post by mateo70 » Tue Mar 08, 2011 2:50 pm


It's part of our plan. As a first step we will provide the ability to compile two different libraries with different symbols, and then we will look at how to merge them into a single library. But that will probably be next month, not before.
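The "two libraries with different symbols" step could be sketched like this (purely hypothetical: the `GPU_TARGET` variable, library names, and symbol prefixes are my guesses, not the MAGMA team's actual plan):

```shell
# Build the same sources twice, once per architecture, then prefix the
# symbols so both copies can coexist and be linked into one application.
make clean && make GPU_TARGET=Tesla && cp lib/libmagma.a libmagma_tesla.a
make clean && make GPU_TARGET=Fermi && cp lib/libmagma.a libmagma_fermi.a
objcopy --prefix-symbols=tesla_ libmagma_tesla.a
objcopy --prefix-symbols=fermi_ libmagma_fermi.a
# A thin dispatch layer would then pick tesla_* or fermi_* entry points
# at runtime based on the detected device.
```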


Posts: 2
Joined: Thu Jul 07, 2011 9:01 am

Re: Fermi and Tesla support in one library?

Post by jacquesdutoit » Thu Jul 07, 2011 9:31 am


I work for the Numerical Algorithms Group (NAG). We've been working with a customer combining MAGMA with some of our own code and ran into this issue as well. The customer (and we!) would like a single MAGMA library which could be copied to multiple machines, with possibly different NVIDIA cards; at runtime MAGMA should detect the GPU architecture and select the correct code path to run.

As I understand it, the biggest complication (from the software engineering perspective) is the fact that MAGMA implements its own GPU BLAS functions for several BLAS algorithms, instead of calling into CUBLAS. This is no doubt for performance reasons.

1.) Do you know whether the MAGMA BLAS functions have made it into CUBLAS 4.0?

We've gone ahead and refactored/reworked several parts of the MAGMA library (basically enough to have a working Cholesky decomposition) so that it can detect the architecture at runtime and call the correct code path. The easiest way to achieve this seems to be to turn MAGMA BLAS into a "separate library" (or at least conceptually treat it that way), which has implementations for the BLAS functions you wish to override. Each MAGMA BLAS function should query the device and launch the correct code path, as CUBLAS does. Throughout the code one can then make CUBLAS function calls, and in a global config header one can #define the overridden CUBLAS functions to point at the corresponding MAGMA BLAS functions. This is very similar to what MAGMA does at the moment.

2.) Do you have any feel for how much of CUBLAS MAGMA might override in the future? I imagine the set would shrink as NVIDIA incorporates the BLAS improvements that you have made.

We are quite happy to contribute the changes we've made back to the MAGMA project. However, seeing as multi-architecture support is on your plan anyway, the obvious question is

3.) How far has this work progressed?

If it is almost complete, then there is probably no need. If the work is not very advanced, it might make sense to coordinate efforts and perhaps discuss a design for how best to implement the multi-architecture support. Obviously if we're going to contribute large changes like this, the MAGMA team would have to be happy with them. In a perfect world there would be no need for a MAGMA BLAS library at all, so one might hope that in the future it would shrink until MAGMA relied only on CUBLAS. This is the rationale for modelling the design/behaviour of a MAGMA BLAS library on CUBLAS.

I would be very interested in any comments/questions/suggestions you may have.

Jacques du Toit

Post Reply