Hi. In some of the MAGMA presentations, and in one paper, autotuned OpenCL GEMM routines were mentioned. Are they distributed/available anywhere?
I have actually developed an autotuned BLAS of my own (to be open-sourced soon) and wanted to compare the results. I am also interested in porting clMAGMA to use my BLAS instead of AMD's BLAS.
