Search found 19 matches

by haidar
Fri Dec 08, 2017 11:47 am
Forum: User discussion
Topic: Sequential SVD computation for Big Data using MAGMA
Replies: 2
Views: 2423

Re: Sequential SVD computation for Big Data using MAGMA

Dear B-C, The paper you refer to describes an experimental code for multicore CPUs; it does not use the GPU. Your matrix is square; you might look at this paper: https://link.springer.com/chapter/10.1007/978-3-319-58667-0_9 The paper includes a formula to calculate the expected time for a calculation, so you can fir...
by haidar
Tue Aug 15, 2017 11:04 am
Forum: User discussion
Topic: Is autotuning a offline process?
Replies: 1
Views: 1568

Re: Is autotuning a offline process?

Hi,
The current autotuning process is performed offline, since it is done once per GPU type.
We generate all acceptable kernel configurations, run them, analyze their performance, and choose the best one for the target architecture.
Azzam
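
A minimal sketch of the offline search described above: enumerate the configurations, keep the acceptable ones, time each, and pick the fastest. The parameter names (dim_x, dim_y, tile), the acceptance rule, and fake_benchmark are hypothetical stand-ins chosen for illustration; they are not the MAGMA tuning framework.

```python
import itertools

def acceptable(cfg, max_threads=1024):
    """Hypothetical constraint: the block fits the thread limit and tiles evenly."""
    dim_x, dim_y, tile = cfg
    return dim_x * dim_y <= max_threads and tile % dim_x == 0

def autotune(benchmark, dims_x, dims_y, tiles):
    """Run every acceptable configuration once and return the fastest one."""
    candidates = [c for c in itertools.product(dims_x, dims_y, tiles)
                  if acceptable(c)]
    timings = {cfg: benchmark(cfg) for cfg in candidates}
    best = min(timings, key=timings.get)
    return best, timings

def fake_benchmark(cfg):
    """Stand-in for compiling and timing a kernel; returns a made-up cost."""
    dim_x, dim_y, tile = cfg
    return abs(dim_x - 32) + abs(dim_y - 8) + abs(tile - 64) / 8.0

best, timings = autotune(fake_benchmark, [16, 32, 64], [4, 8, 16], [32, 64, 128])
print("best configuration:", best)
```

Because the search runs once per GPU type, the chosen configuration can be stored and reused at run time without repeating the timing.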
by haidar
Tue Aug 01, 2017 9:13 pm
Forum: User discussion
Topic: Batched GEMV with float4
Replies: 3
Views: 1132

Re: Batched GEMV with float4

I think both should provide similar performance, since a gemm with 4 columns will look like 4 gemv's.
This is considered a memory-bound operation, and its performance will behave like dgemv performance.
Azzam
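
A rough way to see what memory-bound means for GEMV: an m-by-n matrix-vector product performs about 2*m*n flops while reading m*n matrix elements, so the flop rate is capped by how fast the matrix can be streamed from memory. The sketch below computes that bound; the 500 GB/s bandwidth is a hypothetical figure, not a measurement of any particular GPU.

```python
def gemv_peak_gflops(bandwidth_gb_s, bytes_per_element):
    """Upper bound on GEMV Gflop/s when reading the matrix dominates traffic.

    Each matrix element is read once and used for one multiply and one add,
    so at most 2 flops are done per element streamed from memory.
    """
    return 2.0 * bandwidth_gb_s / bytes_per_element

bandwidth = 500.0  # GB/s, hypothetical value for illustration
single = gemv_peak_gflops(bandwidth, 4)  # float: bound is bandwidth / 2
double = gemv_peak_gflops(bandwidth, 8)  # double: bound is bandwidth / 4
print(single, double)
```

The vector traffic is ignored here because it is small compared with the matrix for large m and n.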
by haidar
Thu Jul 20, 2017 11:10 am
Forum: User discussion
Topic: Multiple queues and sgemv_batched
Replies: 2
Views: 958

Re: Multiple queues and sgemv_batched

Hi, if you create different queues and launch a different sgemv_batched on each, you are telling the GPU that whenever it has a slot available for work it can launch work from queue 2, 3, 4, etc. Now two questions: 1- if you dispatch them over 9 queues that can run in parallel, why didn't you make ...
by haidar
Thu Jul 20, 2017 10:49 am
Forum: User discussion
Topic: Trouble Compiling Magma on an AMD Cray
Replies: 2
Views: 987

Re: Trouble Compiling Magma on an AMD Cray

Ray,
Thank you very much for sharing the solution, that's interesting to know.
Azzam
by haidar
Thu Jul 20, 2017 10:48 am
Forum: User discussion
Topic: Toeplitz Matrix Batch
Replies: 2
Views: 988

Re: Toeplitz Matrix Batch

Dear user, As of today we do not have any routine specifically suited to Toeplitz matrices; we would be happy to help if you would like to contribute such a routine to the Magma library. However, I did not quite understand the batch: do you have one system of equations Ax=b, meaning one ma...
by haidar
Mon Jul 03, 2017 10:52 pm
Forum: User discussion
Topic: Batched GEMV with float4
Replies: 3
Views: 1132

Re: Batched GEMV with float4

Can you please elaborate in more detail on what you want to do? Do you mean CUDA's float4 vector type? I think it might be easy to cast the type to float and use the single-precision sgemv. In terms of performance, our GEMV routine reaches the theoretical peak, which is bandwidth/2 for single...
by haidar
Fri Apr 08, 2016 10:13 am
Forum: User discussion
Topic: Using magma_ssyevd_gpu in MAGMA 2.0.1
Replies: 4
Views: 1659

Re: Using magma_ssyevd_gpu in MAGMA 2.0.1

You can also use ssyevdx_2stage, a newer algorithm that is faster for large matrices.
by haidar
Fri Mar 25, 2016 2:19 pm
Forum: User discussion
Topic: papers to dgeqrf_gpu/dgeqrf algorithms, explanations?
Replies: 6
Views: 3645

Re: papers to dgeqrf_gpu/dgeqrf algorithms, explanations?

Hi Nahla, all the current Magma routines are hybrid, meaning they use both the CPU and the GPU. Overall, Cholesky, LU and QR follow the LAPACK style of factorization: a panel factorization followed by an update of the trailing matrix. A general overview is given in: https://www.google.com/u...
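
The panel-then-update pattern mentioned in this reply can be sketched with a blocked Cholesky factorization. This is a plain-Python, CPU-only illustration of the loop structure, assuming a symmetric positive definite input; it is not the MAGMA hybrid implementation, and the function name and block size nb are chosen here for illustration.

```python
import math

def blocked_cholesky(a, nb):
    """Return lower-triangular L with L * L^T = a, handling nb columns per step."""
    n = len(a)
    L = [[float(v) for v in row] for row in a]  # work on a copy
    for k in range(0, n, nb):
        end = min(k + nb, n)
        # Panel factorization: unblocked Cholesky on columns k..end-1.
        for j in range(k, end):
            L[j][j] = math.sqrt(L[j][j])
            for i in range(j + 1, n):
                L[i][j] /= L[j][j]
            # Apply column j to the remaining columns of this panel only.
            for c in range(j + 1, end):
                for i in range(c, n):
                    L[i][c] -= L[i][j] * L[c][j]
        # Trailing matrix update: A22 -= L21 * L21^T (lower part only).
        for c in range(end, n):
            for i in range(c, n):
                L[i][c] -= sum(L[i][p] * L[c][p] for p in range(k, end))
    # Clear the strictly upper triangle, which the algorithm never touches.
    for i in range(n):
        for j in range(i + 1, n):
            L[i][j] = 0.0
    return L

A = [[4, 2, 2], [2, 5, 3], [2, 3, 6]]
print(blocked_cholesky(A, 2))
```

The trailing update is where most of the flops go, and it has the shape of a matrix-matrix product, which is why this organization suits accelerators.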
by haidar
Fri Mar 25, 2016 2:00 pm
Forum: User discussion
Topic: Variable-size batched GEMM
Replies: 2
Views: 1979

Re: Variable-size batched GEMM

Hi, The Magma variable-size batched routines are not released yet; some of them are expected in our next release (around July 2016). However, if you describe your needs and requirements, I might be able to send you a patch (or tarball) that includes the routines you need. Thanks Az...