Error from magmablas_dtrsm

Postby katayama » Wed Jan 19, 2011 12:08 pm

Dear experts,
I am trying to use MAGMA 1.0.0-rc2.
My code contains the following line:

cublasDtrsm('R', 'L','T','N', g, g, 1.0, dev_2, g, dev_1, g);

where g ranges from 4 to 6144. With
#define cublasDtrsm magmablas_dtrsm
I get NaNs in dev_1 (the matrix that dtrsm overwrites) on return. I get the right answer with the
#define commented out.

I wonder whether magmablas_dtrsm is working correctly.

The reason I am testing the MAGMA version of this routine is that on the C2050
cublasDtrsm runs very slowly compared to the GTX580: it takes three times
as long. Other routines such as dgemm, dsyrk, and dtrmm perform as expected.

Thanks!

Nobu
katayama
 
Posts: 12
Joined: Sat Jan 16, 2010 8:33 am

Re: Error from magmablas_dtrsm

Postby fletchjp » Wed Jan 19, 2011 4:44 pm

Nobu

I have been seeing similar problems with NaN values in some tests but not in others; see some of the other recent threads with my name on them. I am waiting to see whether RC3 will solve some of these problems.

John
fletchjp
 
Posts: 175
Joined: Mon Dec 27, 2010 7:29 pm

Re: Error from magmablas_dtrsm

Postby pdgetrf » Thu Jan 20, 2011 11:26 pm

Hi all

We are trying to locate the bug.

First off, could you change NB to a different value (a multiple of 16) and see whether you still get NaN?

Thanks.
pdgetrf
 
Posts: 9
Joined: Wed Jan 19, 2011 8:32 pm

Re: Error from magmablas_dtrsm

Postby pdgetrf » Fri Jan 21, 2011 11:17 am

Hello Katayama,

Could you please give some more details about your environment? For example, the versions of CUDA, nvcc, and the driver, and which GPU you are using. Please also give the exact sizes of g that cause NaN or a wrong result, so that we can reproduce the issue on our side and pinpoint the bug.
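
To find the exact sizes that fail, a small host-side check along these lines could be run on the matrix copied back from the GPU for each g. This is only a sketch: count_nans is a hypothetical helper, not part of MAGMA or CUBLAS, and it assumes column-major storage with leading dimension lda.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper (not a MAGMA or CUBLAS routine): counts the NaN
// entries of a rows x cols column-major matrix with leading dimension lda.
std::size_t count_nans(int rows, int cols, const double *A, int lda)
{
    assert(lda >= rows);
    std::size_t count = 0;
    for (int j = 0; j < cols; j++) {
        for (int i = 0; i < rows; i++) {
            if (std::isnan(A[i + j * lda])) {
                count++;
            }
        }
    }
    return count;
}
```

Calling count_nans(g, g, B, g) after cublasGetVector and printing g whenever the count is nonzero would give the list of failing sizes.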

thanks!
pdgetrf
 
Posts: 9
Joined: Wed Jan 19, 2011 8:32 pm

Re: Error from magmablas_dtrsm

Postby fletchjp » Fri Jan 21, 2011 5:37 pm

I have had similar problems with strange NaN values, which are 'cured' by using CUBLAS instead of the MAGMA BLAS routines. See the discussion in another thread.

I had hoped these would go away when I used GotoBLAS compiled with CORE2 (see another recent thread) on my Nehalem CPU but unfortunately this is not the case. The errors are still there.

John
fletchjp
 
Posts: 175
Joined: Mon Dec 27, 2010 7:29 pm

Re: Error from magmablas_dtrsm

Postby katayama » Sat Jan 22, 2011 4:12 am

Hi,

Sorry for the late reply.

Here is what I have. I've attached a test program, makefile and result. (Hope attachment works.)

[katayama@lb01 magma]$ ~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQueryDrv
CUDA Device Query (Driver API) statically linked version
There is 1 device supporting CUDA

Device 0: "Tesla C2050"
CUDA Driver Version: 3.20
CUDA Capability Major/Minor version number: 2.0
Total amount of global memory: 2817720320 bytes
Multiprocessors x Cores/MP = Cores: 14 (MP) x 32 (Cores/MP) = 448 (Cores)
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Clock rate: 1.15 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Device has ECC support enabled: Yes
Device is using TCC driver mode: No

PASSED
katayama
 
Posts: 12
Joined: Sat Jan 16, 2010 8:33 am

Re: Error from magmablas_dtrsm

Postby katayama » Sat Jan 22, 2011 4:17 am

Hmm, I see no attachment.
Here are the code, the output, and the Makefile. I use CUDA 3.2.
Thank you!

[katayama@lb01 magma]$ more out.nan
[ [ 0.391137 , 0.37845 , 0.676826 , 0.828647 ];
[ 0.572013 , 1.29715 , 1.72296 , 3.94056 ];
[ 0.382288 , 0.33823 , 1.39019 , 2.54674 ];
[ 0.719967 , 1.69088 , 1.85096 , 1.95143 ] ]
[ [ 0.729483 , 0 , 0 , 0 ];
[ 0.551107 , 1.98647 , 0 , 0 ];
[ 0.946236 , 0.61067 , 2.91446 , 0 ];
[ 0.493657 , 0.600847 , 0.152826 , 3.33746 ] ]
[ [ nan , nan , nan , nan ];
[ nan , nan , nan , nan ];
[ nan , nan , nan , nan ];
[ nan , nan , nan , nan ] ]



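#include <cuda.h>       // cuInit
#include <cublas.h>     // legacy CUBLAS API: cublasInit, cublasAlloc, cublasDtrsm, ...
#include <magmablas.h>
#include <iostream>

// Redirect the CUBLAS call to the MAGMA BLAS implementation.
// Comment this line out to run the CUBLAS version instead.
#define cublasDtrsm magmablas_dtrsm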

// Print an N x M column-major matrix A with leading dimension LDA.
void printmat(int N, int M, const double *A, int LDA)
{
    double mtmp;
    std::cout << "[ ";
    for (int i = 0; i < N; i++) {
        if (i > 0) std::cout << " ";
        std::cout << "[ ";
        for (int j = 0; j < M; j++) {
            mtmp = A[i + j * LDA];
            std::cout << mtmp << " ";
            if (j < M - 1) {
                std::cout << ", ";
            }
        }
        if (i < N - 1) {
            std::cout << "];" << std::endl;
        }
        else {
            std::cout << "] ";
        }
    }
    std::cout << "]" << std::endl;
}


int main(void) {

    double B[16];   // right-hand side, column-major; overwritten with the solution
    double C[16];   // lower-triangular matrix, column-major

    int g(4);
    size_t NxN(16);

    B[0] = 0.391137;
    B[4] = 0.37845;
    B[8] = 0.676826;
    B[12] = 0.828647;
    B[1] = 0.572013;
    B[5] = 1.29715;
    B[9] = 1.72296;
    B[13] = 3.94056;
    B[2] = 0.382288;
    B[6] = 0.33823;
    B[10] = 1.39019;
    B[14] = 2.54674;
    B[3] = 0.719967;
    B[7] = 1.69088;
    B[11] = 1.85096;
    B[15] = 1.95143;

    C[0] = 0.729483;
    C[4] = 0;
    C[8] = 0;
    C[12] = 0;
    C[1] = 0.551107;
    C[5] = 1.98647;
    C[9] = 0;
    C[13] = 0;
    C[2] = 0.946236;
    C[6] = 0.61067;
    C[10] = 2.91446;
    C[14] = 0;
    C[3] = 0.493657;
    C[7] = 0.600847;
    C[11] = 0.152826;
    C[15] = 3.33746;

    printmat(g,g,B,g);
    printmat(g,g,C,g);

    cuInit(0);
    cublasInit();

    double *dev_1;   // device copy of B
    double *dev_2;   // device copy of C
    cublasAlloc(NxN,sizeof(double),(void**)&dev_1);
    cublasAlloc(NxN,sizeof(double),(void**)&dev_2);

    cublasSetVector(NxN,sizeof(double),B,1,dev_1,1);
    cublasSetVector(NxN,sizeof(double),C,1,dev_2,1);

    // Solve X * C^T = B for X (side = right, lower triangular, transposed,
    // non-unit diagonal). The result overwrites dev_1.
    cublasDtrsm('R', 'L','T','N', g, g, 1.0, dev_2, g, dev_1, g);

    cublasGetVector(NxN,sizeof(double),dev_1,1,B,1);

    printmat(g,g,B,g);

    cublasFree(dev_1);
    cublasFree(dev_2);
    cublasShutdown();
    return 0;
}

[katayama@lb01 magma]$ more Makefile
CXX=g++

MAGMA_TOP=/home/katayama/work/magma/magma_1.0.0-rc2
CUDA_TOP=/usr/local/cuda

INC=-I$(MAGMA_TOP)/include -I$(CUDA_TOP)/include

LIB=-L$(MAGMA_TOP)/lib -L$(CUDA_TOP)/lib64

CXXFLAGS = $(INC) -O3


all:nan

nan.o : nan.cc

nan : nan.o
	$(CXX) -o $@ $< $(LIB) -lcuda -lmagma -lmagmablas -lcublas -lm

clean :
	rm -f nan
katayama
 
Posts: 12
Joined: Sat Jan 16, 2010 8:33 am

cublas result

Postby katayama » Mon Jan 24, 2011 2:02 am

When I comment out the #define line, I get

[katayama@lb01 magma]$ ./nan.cublas
[ [ 0.391137 , 0.37845 , 0.676826 , 0.828647 ];
[ 0.572013 , 1.29715 , 1.72296 , 3.94056 ];
[ 0.382288 , 0.33823 , 1.39019 , 2.54674 ];
[ 0.719967 , 1.69088 , 1.85096 , 1.95143 ] ]
[ [ 0.729483 , 0 , 0 , 0 ];
[ 0.551107 , 1.98647 , 0 , 0 ];
[ 0.946236 , 0.61067 , 2.91446 , 0 ];
[ 0.493657 , 0.600847 , 0.152826 , 3.33746 ] ]
[ [ 0.536184 , 0.0417602 , 0.0493978 , 0.159198 ];
[ 0.784135 , 0.43545 , 0.245352 , 0.975092 ];
[ 0.524053 , 0.0248786 , 0.301641 , 0.667271 ];
[ 0.986955 , 0.577387 , 0.193681 , 0.325904 ] ]
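
For reference, the dtrsm call in the test program asks for the X that satisfies X * C^T = B, with C lower triangular and a non-unit diagonal, and stores X over B. A minimal host-side sketch of that one case is below, assuming column-major storage as in the test program. The names ref_dtrsm_rltn and max_abs_diff are hypothetical helpers, not MAGMA or CUBLAS routines.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical CPU reference for the 'R','L','T','N' case of dtrsm:
// solves X * A^T = alpha * B and overwrites B with X.
// A is n x n lower triangular (non-unit diagonal), B is m x n,
// both column-major with leading dimensions lda and ldb.
void ref_dtrsm_rltn(int m, int n, double alpha,
                    const double *A, int lda, double *B, int ldb)
{
    assert(lda >= n && ldb >= m);
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            // Columns k < j of B already hold the solution X.
            double s = alpha * B[i + j * ldb];
            for (int k = 0; k < j; k++) {
                s -= B[i + k * ldb] * A[j + k * lda];
            }
            B[i + j * ldb] = s / A[j + j * lda];
        }
    }
}

// Largest absolute element-wise difference between two column-major matrices.
double max_abs_diff(int rows, int cols,
                    const double *X, int ldx, const double *Y, int ldy)
{
    double worst = 0.0;
    for (int j = 0; j < cols; j++) {
        for (int i = 0; i < rows; i++) {
            double d = std::fabs(X[i + j * ldx] - Y[i + j * ldy]);
            if (d > worst) worst = d;
        }
    }
    return worst;
}
```

Running ref_dtrsm_rltn(g, g, 1.0, C, g, B, g) on the host arrays from the test program should reproduce the CUBLAS result above to the printed precision, which gives a baseline that does not depend on the GPU when comparing against magmablas_dtrsm.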

Thanks,
Nobu
katayama
 
Posts: 12
Joined: Sat Jan 16, 2010 8:33 am

Re: Error from magmablas_dtrsm

Postby pdgetrf » Wed Jan 26, 2011 12:15 pm

Thanks, we'll get back to you ASAP.
pdgetrf
 
Posts: 9
Joined: Wed Jan 19, 2011 8:32 pm
