Complex types of the MAGMA routines
I read the documentation and noticed that future developments did not explicitly include the complex equivalents (single or double). Is this an unintended omission? If complex equivalents are intended, when will they be ready?
MMB
Re: Complex types of the MAGMA routines
Adding complex versions is a high priority for us. We already have them implemented at the "high" level, like the other versions
(we generate the different precisions almost automatically), but we do not yet have the complex CUDA BLAS routines that are
needed, e.g. complex versions of syrk, trmm, and trsm. We have requested them from NVIDIA and are also considering
a MAGMA implementation.
Stan
Re: Complex types of the MAGMA routines
It appears from other posts that November 14th is an important date for a further release. Will the complex types be included in that release?
Thanks
M M Bibby
Re: Complex types of the MAGMA routines
The complex versions of the 3 one-sided factorizations will be included. We still lack some of the complex BLAS, so if NVIDIA does not provide them by then we are going to provide wrappers for what we need. For example, to do a cherk on the GPU we will just copy the data needed for the operation to the CPU, perform the operation there, and move the result back, as in the magmablas_cherk wrapper listed below.
The code will still perform well because of the fast complex GPU gemm; the testing_cpotrf output below shows the performance of the CPU interface of Cholesky in single precision complex arithmetic.
Obviously, this will be significantly faster when we have all of the BLAS needed.
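Code:
#include <stdlib.h>
#include <cublas.h>   /* legacy CUBLAS API: cublasGetMatrix, cublasSetMatrix */

/* Host (Fortran) BLAS cherk, called through the usual trailing-underscore symbol. */
extern "C" void cherk_(char *uplo, char *trans, int *n, int *k, float *alpha,
                       float2 *a, int *lda, float *beta, float2 *c, int *ldc);

/* Temporary cherk: A and C live on the GPU, the update itself runs on the CPU. */
extern "C" void
magmablas_cherk(char uplo, char trans, int n, int k, float alpha,
                float2 *A, int lda, float beta, float2 *C, int ldc){
    int ka, ldamin;
    /* A is n-by-k when trans is 'N', k-by-n otherwise. */
    if (trans == 'N' || trans == 'n')
        ka = k, ldamin = n;
    else
        ka = n, ldamin = k;

    /* Host work copies of A and C. */
    float2 *a = (float2*)malloc(ka*ldamin * sizeof(float2));
    float2 *c = (float2*)malloc(n*n * sizeof(float2));

    /* Device to host. */
    cublasGetMatrix(ldamin, ka, sizeof(float2), A, lda, a, ldamin);
    cublasGetMatrix(n, n, sizeof(float2), C, ldc, c, n);
    /* Rank-k update on the CPU. */
    cherk_(&uplo, &trans, &n, &k, &alpha, a, &ldamin, &beta, c, &n);
    /* Host to device. */
    cublasSetMatrix(n, n, sizeof(float2), c, n, C, ldc);

    free(a);
    free(c);
}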
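For anyone who wants to check a wrapper like this on small matrices, here is a minimal sketch of what cherk computes, written as a plain C99 reference routine on column-major host arrays. The name ref_cherk and its argument checks are assumptions made for illustration only; it is not part of MAGMA, CUBLAS, or the wrapper above, and it is far too slow for anything but tiny test cases.

```c
#include <assert.h>
#include <complex.h>

/* Reference (unoptimized) CHERK on column-major host arrays.
 *   trans == 'N':  C := alpha*A*A^H + beta*C,  A is n-by-k
 *   otherwise:     C := alpha*A^H*A + beta*C,  A is k-by-n
 * alpha and beta are real. Only the triangle of C selected by uplo
 * ('U' or 'L') is read and written; diagonal entries come out real. */
void ref_cherk(char uplo, char trans, int n, int k, float alpha,
               const float complex *A, int lda,
               float beta, float complex *C, int ldc)
{
    int upper   = (uplo == 'U' || uplo == 'u');
    int notrans = (trans == 'N' || trans == 'n');
    int arows   = notrans ? n : k;   /* number of rows stored in A */

    assert(n >= 0 && k >= 0);
    assert(lda >= (arows > 1 ? arows : 1));
    assert(ldc >= (n > 1 ? n : 1));

    for (int j = 0; j < n; ++j) {
        int ibeg = upper ? 0 : j;
        int iend = upper ? j : n - 1;
        for (int i = ibeg; i <= iend; ++i) {
            float complex sum = 0.0f;
            for (int l = 0; l < k; ++l) {
                /* (A*A^H)(i,j) = sum_l A(i,l)*conj(A(j,l))
                 * (A^H*A)(i,j) = sum_l conj(A(l,i))*A(l,j) */
                sum += notrans ? A[i + l * lda] * conjf(A[j + l * lda])
                               : conjf(A[l + i * lda]) * A[l + j * lda];
            }
            /* As in BLAS, C is not read when beta is zero. */
            float complex cij = (beta == 0.0f) ? 0.0f : beta * C[i + j * ldc];
            cij += alpha * sum;
            /* The result is Hermitian, so force the diagonal to be real. */
            C[i + j * ldc] = (i == j) ? crealf(cij) : cij;
        }
    }
}
```

On a small case the output can be compared entry by entry with what the wrapper leaves in C after it is copied back to the host; only the triangle selected by uplo is expected to match.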
Code:
./testing_cpotrf
device 0: GeForce GTX 280, 1296.0 MHz clock, 1023.8 MB memory

Usage:
  testing_cpotrf -N 1024

   N    CPU GFlop/s    GPU GFlop/s    ||R||_F / ||A||_F
========================================================
1024       53.71          49.52         1.598006e-08
2048       46.90          97.17         1.709468e-08
3072       50.33         122.87         1.484603e-08
4032       57.12         112.68         2.677048e-08
5184       58.99         126.12         2.073988e-08
6048       59.44         134.18         2.214732e-08
7200       67.61         148.46         2.458159e-08
8064       65.66         155.81         2.687845e-08
8928       60.30         182.20         3.045712e-08
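To read the table as speedups, the GPU figure can be divided by the CPU figure for each row. The sketch below does only that arithmetic on the numbers quoted above; the struct and function names are made up for the example and nothing beyond the table itself is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the testing_cpotrf output quoted above. */
typedef struct {
    int    n;           /* matrix size N */
    double cpu_gflops;  /* "CPU GFlop/s" column */
    double gpu_gflops;  /* "GPU GFlop/s" column */
} bench_row;

static const bench_row cpotrf_rows[] = {
    {1024, 53.71,  49.52},
    {2048, 46.90,  97.17},
    {3072, 50.33, 122.87},
    {4032, 57.12, 112.68},
    {5184, 58.99, 126.12},
    {6048, 59.44, 134.18},
    {7200, 67.61, 148.46},
    {8064, 65.66, 155.81},
    {8928, 60.30, 182.20},
};
static const size_t cpotrf_row_count =
    sizeof cpotrf_rows / sizeof cpotrf_rows[0];

/* GPU rate divided by CPU rate; a value above 1 means the GPU path is faster. */
double speedup(const bench_row *r)
{
    assert(r->cpu_gflops > 0.0);
    return r->gpu_gflops / r->cpu_gflops;
}

/* Smallest N in the table (rows are in increasing N) at which the GPU
 * path beats the CPU, or -1 if it never does. */
int first_gpu_win(const bench_row *rows, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        if (speedup(&rows[i]) > 1.0)
            return rows[i].n;
    return -1;
}
```

With these numbers the GPU path is slightly behind the CPU at N = 1024, ahead from N = 2048 on, and roughly three times faster at N = 8928.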
Regards,
Stan
Re: Complex types of the MAGMA routines
Stan, thanks for the update. Is there any reason you can share for why NVIDIA is so slow to release the complex versions of the BLAS? Is it technical or commercial?
Malcolm
Re: Complex types of the MAGMA routines
Malcolm,
I don't see any technical reasons. As far as I know they are working on it and should have it soon. My guess is that the reason is a combination of the manpower needed and priorities. There are many routines to write, others to optimize, and all of them to maintain on different platforms. The ones still to be developed are also not easy; otherwise a third party would probably have provided them (unless everyone is waiting on NVIDIA to do it).
Stan
Re: Complex types of the MAGMA routines
Just a note on the complex routines. EM Photonics have recently released their CULA Tools, which provide functionality similar to MAGMA. As far as I can tell, they provide complex versions of the routines (although the free basic version is limited to six routines, in single precision only). Since they are marketing their product, I assume that they have the manpower side of things sorted.
I understand fully that as an academic one often wishes that one had at least an extra two sets of arms. Thus I think it is important for us to share information and resources as much as possible to ensure the success of projects such as MAGMA.
Last edited by evanlezar on Wed Nov 04, 2009 5:00 pm, edited 1 time in total.
Re: Complex types of the MAGMA routines
Link spamming? What about posting some links to MAGMA in the CULA Tools forums? From what I know, the performance of CULA is worse than that of MAGMA. I think it would be interesting to open a new discussion thread for reporting benchmarks of MAGMA compared to other libraries.
Re: Complex types of the MAGMA routines
It was not my intention to link spam. I have no affiliation with EM Photonics, and was just pointing it out to those readers that were not aware of it.
In my opinion, although it is still in its infancy, MAGMA offers a much better solution, especially for academic developers such as myself, and I will contribute as much as I can.
Thanks
Re: Complex types of the MAGMA routines
Hello Stan.
1. Any update on when the complex versions of your codes will be available? And which ones will they be?
2. I read somewhere that you would be releasing BLAS codes as well. Is this correct and, if so, when?
Thanks
Malcolm