Bug: getrf_batched kernel produces NaNs on singular square inputs of size <=32

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)
pearu
Posts: 1
Joined: Tue Oct 29, 2019 4:53 am

Post by pearu » Tue Oct 29, 2019 5:23 am

Hi,

I was not able to create a bug report in the MAGMA Mercurial issue tracker, so I'll report it here.

The subject line summarizes the issue. Here is a reproducer based on PyTorch:

>>> import torch
>>> m, n = 3, 3
>>> torch.ones(1, m, n, device='cuda').lu()
(tensor([[[1., 1., 1.],
         [1., 0., 0.],
         [1., nan, nan]]], device='cuda:0'), tensor([[1, 2, 3]], device='cuda:0', dtype=torch.int32))

Notice that the nan entries appear only when m == n and m <= 32. In all other cases, getrf_batched works correctly.

The source of this issue is likely in the kernel functions implemented in magmablas/zgetrf_batched_smallsq_shfl.cu and magmablas/zgetrf_batched_smallsq_noshfl.cu.
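
For illustration, here is a minimal pure-Python sketch of one plausible mechanism. This is an assumption on my part, not something confirmed from the MAGMA kernel source: an LU factorization with partial pivoting that divides by the pivot without first checking for zero. On the all-ones 3x3 input the second pivot is exactly zero, so the column scaling computes 0/0. The names lu_partial_pivot and ieee_div are hypothetical helpers written for this post. The guarded variant follows the usual LAPACK getrf convention of skipping the scaling and reporting the 1-based index of the first zero pivot in info.

```python
import math


def ieee_div(x, y):
    """Divide like IEEE-754 hardware: 0/0 gives NaN instead of raising."""
    if y == 0.0:
        if x == 0.0 or math.isnan(x):
            return math.nan
        return math.copysign(math.inf, x) * math.copysign(1.0, y)
    return x / y


def lu_partial_pivot(a, guard_zero_pivot):
    """In-place style LU with partial pivoting on a copy of square matrix a.

    Returns (lu, pivots, info). pivots are 1-based row indices and info is
    the 1-based index of the first zero pivot (0 if none).
    guard_zero_pivot=False models the ASSUMED faulty behaviour: the column
    is divided by the pivot even when the pivot is zero.
    """
    n = len(a)
    a = [[float(v) for v in row] for row in a]
    pivots = []
    info = 0
    for k in range(n):
        # Pick the row with the largest magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        pivots.append(p + 1)
        a[k], a[p] = a[p], a[k]
        pivot = a[k][k]
        if pivot == 0.0 and info == 0:
            info = k + 1
        # Scale the column below the pivot, unless guarded against zero.
        if not (guard_zero_pivot and pivot == 0.0):
            for i in range(k + 1, n):
                a[i][k] = ieee_div(a[i][k], pivot)
        # Update the trailing submatrix.
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return a, pivots, info


ones = [[1.0] * 3 for _ in range(3)]
print(lu_partial_pivot(ones, guard_zero_pivot=False))
print(lu_partial_pivot(ones, guard_zero_pivot=True))
```

The unguarded call gives nan in the last two entries of the third row, the same pattern as in the PyTorch output above, while the guarded call returns zeros there with info == 2. This only shows that an unchecked division is consistent with the symptom; the actual cause in the small-square kernels may be different.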

Best regards,
Pearu

mgates3
Posts: 906
Joined: Fri Jan 06, 2012 2:13 pm

Re: Bug: getrf_batched kernel produces NaNs on singular square inputs of size <=32

Post by mgates3 » Tue Oct 29, 2019 8:11 pm

You need to have a Bitbucket account to post bug reports. I posted this there for tracking:
https://bitbucket.org/icl/magma/issues/ ... es-nans-on

-mark
