non-consistent results for dgemm between MAGMA and CPU


Post by shingoxlf » Sun Jun 15, 2014 3:49 pm

I have a piece of code that calls dgemm. However, the result from MAGMA (CUDA) is slightly different from the CPU version, and the errors can accumulate in my program, which then returns the wrong result at the end.

Here is the code:

Code:
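    int i_temp;
    double *d_a;
    double *d_b;
    double *d_c;
    /* Host buffer that receives the GPU result. */
    double *c_temp = (double *)malloc(sizeof(double) * n * n);

    /* Allocate the device matrices and copy a, b, c to the GPU. */
    magma_dmalloc(&d_a, n * n);
    magma_dmalloc(&d_b, n * n);
    magma_dmalloc(&d_c, n * n);
    magma_dsetmatrix(n, n, a, n, d_a, n);
    magma_dsetmatrix(n, n, b, n, d_b, n);
    magma_dsetmatrix(n, n, c, n, d_c, n);

    /* GPU: d_c = scale1 * d_a * d_b + scale2 * d_c */
    magma_dgemm(MagmaNoTrans, MagmaNoTrans, n, n, n, scale1, d_a, n, d_b, n, scale2, d_c, n);
    magma_dgetmatrix(n, n, d_c, n, c_temp, n);

    magma_free(d_a);
    magma_free(d_b);
    magma_free(d_c);

    /* CPU: c = scale1 * a * b + scale2 * c (overwrites c in place) */
    dgemm_("N", "N", &n, &n, &n, &scale1, a, &n, b, &n, &scale2, c, &n);

    /* Print every entry where the CPU and GPU results are not bitwise equal. */
    for (i_temp = 0; i_temp < n * n; i_temp++) {
        if (c[i_temp] != c_temp[i_temp]) {
            printf("%.16lf %.16lf %.16lf\n", c[i_temp], c_temp[i_temp], c[i_temp] - c_temp[i_temp]);
        }
    }
    free(c_temp);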


And here is the difference. The first column is the CPU result, the second is the GPU (MAGMA) result, and the third column is the difference:

559.7084046638115069 559.7084046638113932 0.0000000000001137
560.9680396978561703 560.9680396978560566 0.0000000000001137
224.6888767425457729 224.6888767425458298 -0.0000000000000568
-629.0056419715349421 -629.0056419715350557 0.0000000000001137
562.8440549398717394 562.8440549398718531 -0.0000000000001137
338.2343050347719213 338.2343050347718645 0.0000000000000568
572.4736225567216934 572.4736225567215797 0.0000000000001137
569.0900265654566965 569.0900265654565828 0.0000000000001137
246.1574177192350703 246.1574177192350987 -0.0000000000000284
278.4800513095936481 278.4800513095935912 0.0000000000000568
244.9750100271907911 244.9750100271907627 0.0000000000000284
554.3704885694941140 554.3704885694942277 -0.0000000000001137
553.4623967852701298 553.4623967852702435 -0.0000000000001137
550.0223361361106527 550.0223361361105390 0.0000000000001137
551.1277505071294627 551.1277505071295764 -0.0000000000001137
238.4941567717439739 238.4941567717439455 0.0000000000000284
270.0293564005178837 270.0293564005179405 -0.0000000000000568
577.7618347343277492 577.7618347343278629 -0.0000000000001137
231.6990575521797382 231.6990575521797666 -0.0000000000000284
628.0800863104219616 628.0800863104220753 -0.0000000000001137
428.2931579550929655 428.2931579550930223 -0.0000000000000568
266.3359836020954390 266.3359836020954958 -0.0000000000000568
246.0548404016163602 246.0548404016163317 0.0000000000000284
625.4341410331356883 625.4341410331358020 -0.0000000000001137
266.2263578922264742 266.2263578922264173 0.0000000000000568
245.9585218483765061 245.9585218483765345 -0.0000000000000284
611.2471473987446871 611.2471473987448007 -0.0000000000001137
397.3603926269892668 397.3603926269893236 -0.0000000000000568
286.1767765169782933 286.1767765169782365 0.0000000000000568
635.9287922588151787 635.9287922588152924 -0.0000000000001137
374.0391255111871374 374.0391255111870805 0.0000000000000568
362.1371525200933661 362.1371525200933092 0.0000000000000568
360.4833364544822416 360.4833364544821848 0.0000000000000568
248.2673620775768484 248.2673620775768200 0.0000000000000284
282.5821979986632186 282.5821979986631618 0.0000000000000568
375.3497808933937563 375.3497808933938131 -0.0000000000000568
-714.6692526551440778 -714.6692526551439641 -0.0000000000001137
-719.3730379983975354 -719.3730379983976491 0.0000000000001137
-703.9812988431797294 -703.9812988431798431 0.0000000000001137
-291.2201000761381806 -291.2201000761381238 -0.0000000000000568


Is this difference normal in MAGMA? Is there any way to prevent it?

Re: non-consistent results for dgemm between MAGMA and CPU

Post by mgates3 » Mon Jun 16, 2014 12:06 pm

These errors appear normal. They are simply rounding errors, on the order of machine precision (about 1e-16 relative to the size of the values; for entries of magnitude a few hundred, that is an absolute difference of about 1e-13, as in your output). Any computer implementation, CPU or GPU, will necessarily incur similar errors, since the numbers are represented in finite-precision floating point. Algorithms should be designed to be insensitive to such rounding errors, since they are impossible to avoid (in floating point) on either the CPU or the GPU.
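
As an illustration only (this is not MAGMA or cuBLAS code), here is a minimal C sketch of why two correct implementations can disagree in the last bits. Floating-point addition is not associative, so adding the same numbers in a different order can give a slightly different sum. The sketch also shows a relative-tolerance comparison that could stand in for the exact != test. The function names and the choice of tolerance are assumptions made for this example.

```c
#include <stddef.h>

/* Absolute value, written out so the example does not need libm. */
static double abs_double(double v) {
    return v < 0.0 ? -v : v;
}

/* Sum x[0..n-1] from left to right. */
double sum_forward(const double *x, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        s += x[i];
    }
    return s;
}

/* Sum the same values from right to left. */
double sum_backward(const double *x, size_t n) {
    double s = 0.0;
    for (size_t i = n; i > 0; i--) {
        s += x[i - 1];
    }
    return s;
}

/* Return 1 if a and b agree to within rel_tol, measured relative to
 * the larger of the two magnitudes; return 0 otherwise. */
int nearly_equal(double a, double b, double rel_tol) {
    double abs_a = abs_double(a);
    double abs_b = abs_double(b);
    double scale = abs_a > abs_b ? abs_a : abs_b;
    return abs_double(a - b) <= rel_tol * scale;
}
```

For x = {0.1, 0.2, 0.3}, sum_forward and sum_backward are not bitwise equal in IEEE double precision, yet nearly_equal reports them as equal for a tolerance such as 1e-12. A suitable tolerance depends on the problem size and conditioning, so 1e-12 is only a placeholder.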

One other suggestion: allocate matrices on the GPU using an lda that is a multiple of 32. This should provide better performance in general by aligning memory reads so they are coalesced on the GPU. See the testers in magma/testing/ for examples.
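
A minimal sketch of the padding arithmetic, in plain C with no MAGMA calls. The helper names round_up and padded_elements are hypothetical and used only for this example; the testers in magma/testing/ are the authoritative reference for how MAGMA itself does this.

```c
#include <stddef.h>

/* Round n up to the next multiple of `multiple` (multiple must be > 0). */
size_t round_up(size_t n, size_t multiple) {
    return ((n + multiple - 1) / multiple) * multiple;
}

/* For an m-by-n column-major matrix, store the padded leading dimension
 * (a multiple of 32) in *ldda and return the number of elements to
 * allocate on the device. */
size_t padded_elements(size_t m, size_t n, size_t *ldda) {
    *ldda = round_up(m, 32);
    return (*ldda) * n;
}
```

With the code from the first post, the padded value would replace n as the device leading dimension: allocate ldda*n doubles in magma_dmalloc, pass ldda as the last argument of magma_dsetmatrix, as the device leading dimensions in magma_dgemm, and as the device leading dimension in magma_dgetmatrix. The host arrays keep leading dimension n.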

Note that the magma_dgemm function is simply a wrapper around cublasDgemm, which we use to provide better platform independence between CUDA, OpenCL, and Xeon Phi.

-mark

