Problem with testing_zgesv

Re: Problem with testing_zgesv

Postby evanlezar » Thu Sep 13, 2012 8:31 am

I thought I would ping this thread to ask whether there is anything special we should be doing with the Intel compiler.

I have been seeing similar messages, but specifically for VERY LARGE problems (see the 64bit Integer thread), and I am also using the Intel compiler. Unfortunately, simply switching to GCC is not an option for me (we also use the Intel compiler under Windows, for example).

Re: Problem with testing_zgesv

Postby Stan Tomov » Thu Sep 13, 2012 10:25 am

Upon further investigation, we found that the problem is with Intel's compiler. This CUDA release note summarizes the issue:
There is a known bug in ICC with respect to passing 16-byte aligned types by value to GCC-built code such as the CUDA Toolkit libraries (e.g., CUBLAS). At this time, passing a double2 or cuDoubleComplex or any other 16-byte aligned type by value to GCC-built code from ICC-built code will pass incorrect data. Intel has been informed of this bug. As a workaround, a GCC-built wrapper function that accepts the data by reference from the ICC-built code can be linked with the ICC-built code; the GCC-built wrapper can then, in turn, pass the data by value to the CUDA Toolkit libraries.


Until we implement and release the suggested workaround, we recommend using gcc to compile MAGMA.
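
The workaround described in the release note could look roughly like the sketch below. It is not MAGMA's or NVIDIA's code: `zcomplex` is a hypothetical stand-in for `cuDoubleComplex`, `lib_zscal` is a hypothetical stand-in for a GCC-built library routine that takes its scalar by value, and `zscal_by_ref` is the hypothetical wrapper. In a real build the wrapper would be compiled with GCC and the calling code with ICC; everything sits in one file here only so the idea can be compiled and checked.

```c
#include <assert.h>

/* Hypothetical stand-in for cuDoubleComplex: two doubles, 16-byte aligned. */
typedef struct {
    _Alignas(16) double x;  /* real part */
    double y;               /* imaginary part */
} zcomplex;

/* Hypothetical stand-in for a GCC-built library routine that takes the
 * scalar alpha BY VALUE. It scales n complex numbers in place. */
static void lib_zscal(int n, zcomplex alpha, zcomplex *x)
{
    for (int i = 0; i < n; ++i) {
        double re = alpha.x * x[i].x - alpha.y * x[i].y;
        double im = alpha.x * x[i].y + alpha.y * x[i].x;
        x[i].x = re;
        x[i].y = im;
    }
}

/* Hypothetical wrapper, meant to be compiled with GCC.
 * It receives alpha BY REFERENCE from the ICC-built caller and only then
 * passes it by value, so the by-value call stays GCC-to-GCC. */
void zscal_by_ref(int n, const zcomplex *alpha, zcomplex *x)
{
    assert(alpha != 0 && x != 0);
    lib_zscal(n, *alpha, x);
}
```

ICC-built code would then call `zscal_by_ref(n, &alpha, x)` and never pass the 16-byte aligned value across the compiler boundary itself.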
Stan

Re: Problem with testing_zgesv

Postby evanlezar » Fri Sep 14, 2012 3:38 am

Thanks Stan,

I also recall seeing that message in the release notes a while back, but since it is not mentioned in the CUDA 4.1 or 4.2 release notes (I have not checked 4.0), I assumed this was no longer an issue. Could it be that it is no longer mentioned because the new CUBLAS interface passes the scaling values (alpha in the case of ZGEMM, for example) by reference? The legacy API still passes them by value.
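
To make the distinction concrete, here is a small sketch of the two parameter shapes. The names are assumptions for illustration only: `zcomplex` is a hypothetical stand-in for `cuDoubleComplex`, and the two function-pointer typedefs only imitate the legacy (scalar by value) and new (scalar by pointer) styles; they are not the real CUBLAS prototypes. The helper functions report what actually crosses the call boundary in each case.

```c
#include <stddef.h>

/* Hypothetical stand-in for cuDoubleComplex: two doubles, 16-byte aligned. */
typedef struct {
    _Alignas(16) double x;  /* real part */
    double y;               /* imaginary part */
} zcomplex;

/* Legacy-style shape (assumed for illustration): alpha passed by value,
 * so a 16-byte aligned aggregate crosses the call boundary. */
typedef void (*scal_by_value_fn)(int n, zcomplex alpha, zcomplex *x);

/* New-style shape (assumed for illustration): alpha passed by pointer,
 * so only an ordinary pointer crosses the call boundary. */
typedef void (*scal_by_pointer_fn)(int n, const zcomplex *alpha, zcomplex *x);

/* Size of the scalar argument when it is passed by value. */
size_t by_value_arg_size(void)      { return sizeof(zcomplex); }

/* Alignment the by-value argument requires. */
size_t by_value_arg_alignment(void) { return _Alignof(zcomplex); }

/* Size of the scalar argument when it is passed by pointer. */
size_t by_pointer_arg_size(void)    { return sizeof(const zcomplex *); }
```

If the ICC issue is limited to 16-byte aligned types passed by value, as the release note says, only the first shape would be exposed to it.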

One thing makes me doubt that the launch failures I am experiencing (see viewtopic.php?f=2&t=536) are simply due to compiler incompatibility: they occur on Windows as well. Am I right in assuming that the Windows CUDA binaries are not built using GCC?

I will do a test or two here and see if this is indeed a problem.

Regards
Evan
