Hello everyone.
I am using the MAGMA library, specifically magma_dgels_gpu, to obtain least-squares solutions of the equation 'Ax=b'.
While profiling my algorithm, I found a non-negligible latency when launching the MAGMA kernel magma_dgels_gpu, as shown in the attached file.
I have two questions about this and would be grateful for any advice.
1. Is there any way to reduce the time needed to launch the MAGMA kernel?
2. Can a MAGMA kernel be called from within device kernels by means of 'dynamic parallelism'?
Thank you for your help and time.
Best regards.
About time required for launching MAGMA kernels in host
Attachment: Inkednvvp capture_LI.jpg (262.04 KiB), capture of the NVVP profiling result
Re: About time required for launching MAGMA kernels in host
It's unclear what is occurring in your example. We need a sample run using one of the MAGMA testers, or at least sample code, so that we can attempt to reproduce the issue. What is the problem size? What is your system: OS, CUDA version, CPU, and GPU?
Also, it would help to expand the "[+] Compute" section to show which kernels are actually being launched. Is there more to the computation outside the window shown? It's unclear from your profile when the MAGMA dgels routine is actually working.
There is no particular "magma_dgels_gpu" kernel. It is a hybrid code that launches many cuBLAS kernels (cuBLAS dgemm, etc.). That is, magma_dgels_gpu itself runs on the CPU, while it launches kernels on the GPU.
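Since the routine runs on the host, its cost can be measured on the host as well. The following is only a minimal sketch of one way to get reproducible numbers: it times the first call separately from the average of later calls, on the assumption that a first call may include one-time initialization. The names `CallTiming` and `time_host_calls` are hypothetical helpers, not part of MAGMA or CUDA.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper types (not part of MAGMA or CUDA).
struct CallTiming {
    double first_ms;   // duration of the first call, in milliseconds
    double steady_ms;  // mean duration of the following `reps` calls
};

// Runs `call` once, then `reps` more times, timing both phases on the host.
// `call` must block until its work is complete (for GPU work, that means
// synchronizing before returning), otherwise only launch time is measured.
CallTiming time_host_calls(const std::function<void()>& call, int reps) {
    assert(reps >= 0);
    using Clock = std::chrono::steady_clock;
    auto to_ms = [](Clock::duration d) {
        return std::chrono::duration<double, std::milli>(d).count();
    };

    auto t0 = Clock::now();
    call();                                   // first call
    auto t1 = Clock::now();
    for (int i = 0; i < reps; ++i) call();    // repeated calls
    auto t2 = Clock::now();

    double steady = reps > 0 ? to_ms(t2 - t1) / reps : 0.0;
    return CallTiming{to_ms(t1 - t0), steady};
}
```

In the setting of this thread, `call` would presumably be a lambda that invokes magma_dgels_gpu on already-allocated device buffers and then synchronizes the device; that wrapper is not shown here. As with LAPACK's dgels, the input matrices are overwritten, so they would need to be restored between calls.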
-mark