LAPACK Archives

[Lapack] NaN propagation in BLAS

How about this for a careful (nonreckless) matmul (or other BLAS):
    Spend O(n^2) to check the inputs for NaNs (and perhaps infinities) 
... cheap
    If (a NaN appears)   ... very unlikely
         use a version of matmul that we know will propagate them correctly
             ... could be slower matmul, or we could just fill in 
selected rows and
             ... columns of product with NaNs depending on where NaNs 
appeared in the input
    else
         use your favorite performance tuned (reckless) version, 
including Strassen etc.
    end if

How to do this of course depends on how we decide to treat the cases 
alpha=0, beta=0.

A less portable version might use some kind of exceptions flags at the 
end (of course
the usual ones will not be set by quiet NaNs) to see if a NaN appeared, 
and recomputation
is necessary (so slow with low probability). This would of course not 
fix the problem identified
in Julien's email, where the NaN may not be touched at all.


Jim


On 7/10/14, 8:34 AM, Julien Langou wrote:
Hi all,

1) Below is an email that we just received from some R & redhat folks 
about NaN propagation in the BLAS. Please read the email discussion 
between them and I.
I summarized their request in my reply to them. (So you can just read 
my answer.)

2) Please see as well my question to them. Starting with: "A related 
question. How do you feel about the other of IF statements if DGEMM?? 
(I do not have an answer yet from them.)

3) Please see as well: 
http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=5&t=4499

4) Please see as well email from Jim (Demmel) to Penny Andrerson 
(MathWorks) "[LAPACK-DEV] NaNs and consistent exception handling" from 
DEC-14-2012.

So basically this is the recurring of issue of NaN propagation.

1) I think we will apply the patch of these guys. Reason: I am OK with 
keeping the non-propagation of NaNs when ALPHA and BETA have special 
values (for now), but this non-propagation here is really weird. It is 
specific to the JLI variant of DGEMM, it applies when an entry of B is 
ZERO, pfff . . . Oh my. I just have a headache to think what the 
output would look like. There would be NaN at some places in C, no NaN 
at other places. A mess. I am OK with keeping the non-propagation of 
NaNs when ALPHA and BETA have special values (for now), but not based 
on values of the entries of the matrices A and B.

(Does this make sense?)

2) We can rethink the NaN propagation in DGEMM (and BLAS in general) 
for values of ALPHA and BETA. (???) So bottom line: do we want the 
reference DGEMM to just be three loops and propagate everything? I 
think this is more a question for users, super-users, etc. Jim might 
have some answers with his related projects.

3) Note: Strassen version of DGEMM will not propagate NaN correctly. 
Too many NaNs will be created. (Just a random comment, I was just 
thinking about this.)

Cheers,
Julien.



Begin forwarded message:

*From: *Julien Langou <julien.langou@Domain.Removed 
<mailto:julien.langou@Domain.Removed>>
*Subject: **Re: [Lapack] Issue with BLAS causing regression test 
failure in R*
*Date: *July 10, 2014 at 9:12:10 AM MDT
*To: *Tom Callaway <tcallawa@Domain.Removed <mailto:tcallawa@Domain.Removed>>
*Cc: *Martyn Plummer <plummerm@Domain.Removed 
<mailto:plummerm@Domain.Removed>>, 
"peljasz@Domain.Removed <mailto:peljasz@Domain.Removed>" 
<peljasz@Domain.Removed <mailto:peljasz@Domain.Removed>>, julie langou 
<julie@Domain.Removed <mailto:julie@Domain.Removed>>


Hi Tom, Hi Martyn, Hi Lejeczek,

Thanks for pointing this, just to let you know that we (LAPACK folks) 
are discussing this.

So just taking DGEMM as an example, if you use notation:
C(I,J) = C(I,J) + alpha * A(I,L) * B(L,J),
the reference BLAS DGEMM is using the JLI version of matrix matrix 
multiply and is checking for each J (from 1 to N) and each L (from 1 
to K), whether B(L,J) is zero (or not) to save (or not) the 2M 
following operations.

The snippets of code is as follows
              DO 90 J = 1,N
                  DO 80 L = 1,K
                      IF (B(L,J).NE.ZERO) THEN
                          TEMP = ALPHA*B(L,J)
                          DO 70 I = 1,M
                              C(I,J) = C(I,J) + TEMP*A(I,L)
   70                     CONTINUE
                      END IF
   80             CONTINUE
   90         CONTINUE

You suggest to remove the IF statement so that NaNs are correctly 
propagated.

So as I said, we (LAPACK folks) are discussing this.

A related question. How do you feel about the other of IF statements 
if DGEMM?

For example: (always on DGEMM)
(1) if BETA = 0, we set C to ZERO, then we follow with computation, 
etc. So if BETA = 0 and NaN in C, the NaN in C are removed. Are you 
OK with this?
(2) if ALPHA = 0 and BETA = 1, we return directly. So if ALPHA = 0 
and BETA = 1, and NaNs in A or in B, these NaNs are not propagated to 
output C. Are you OK with this?
What I am trying to say is that there are plenty of other places 
where NaN are not necessarily propagated within DGEMM.
(Pretty much each time there is an IF statement in the DGEMM 
subroutine is a case of this.)

So my question is how do you feel about the other places where NaN in 
input (A, B, C, or alpha, or beta) do not propagate?

Cheers,
Julien.


On Jul 7, 2014, at 9:58 AM, Tom Callaway <tcallawa@Domain.Removed 
<mailto:tcallawa@Domain.Removed>> wrote:

Hello LAPACK people,

Martyn & Lejeczek (on CC) reported an issue to Fedora relating to R
using our system copy of BLAS (from LAPACK).

As noted in the R administration and Installation Manual, "R relies on
ISO/IEC 60559 compliance of an external BLAS. This can be broken if for
example the code assumes that terms with a zero factor are always zero
and do not need to be computed - whereas x*0 can be NaN. This is checked
in the test suite."

In the stock BLAS, DGBMV, DGEMM, and DGEMV fail this. R has been
patching their bundled BLAS to resolve this issue since 2010, but Fedora
now uses the system BLAS.

Attached is a patch (from upstream R) to fix this issue in the LAPACK
BLAS. Please consider applying it.

Thanks,

~tom

==
?.???`?.??`?.??.???`?.?><(((?> OSAS @ Red Hat
University Outreach || Fedora Special Projects || Fedora Legal
<lapack-3.5.0-R-blas-fixes.patch><tcallawa.vcf>_______________________________________________
Lapack mailing list
Lapack@Domain.Removed <mailto:Lapack@Domain.Removed>
http://lists.eecs.utk.edu/mailman/listinfo/lapack



-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://lists.eecs.utk.edu/mailman/private/lapack/attachments/20140710/e095b2c2/attachment.html>

<Prev in Thread] Current Thread [Next in Thread>


For additional information you may use the LAPACK/ScaLAPACK Forum.
Or one of the mailing lists, or