Stan Tomov wrote:Yes, I can confirm it works. You can look in testing_dpotrf.cpp. The examples there compares the
results from LAPACK and MAGMA. If you want to do the test by computing A - L L^T, using the notations
in testing_dpotrf.cpp, you can do something like
- Code: Select all
double alpha = 1., beta = 0.;
double *C = (double*)malloc(N*N * sizeof(double));
for( j = 0; j < N; j++ )
for( i = 0; i < j; i++ )
// here I assume h_R holds the lower triangular matrix from the
// Cholesky decomposition, as returned by magma_dpotrf
h_R[i+j*N] = 0.;
// make C = the original matrix
for( j = 0; j < N*N; j++ )
C[j] = h_A[j];
dsyrk_("L", "N", &N, &N, &alpha, h_R, &N, &beta, C, &N);
daxpy_(&n2, &mone, C, &one, h_A, &one);
printf("%5d %6.2f %6.2f %e\n",
size[i], cpu_perf, gpu_perf,
dlange_("f", &N, &N, h_A, &N, work) );
funny thing, I run Your code (without making C = original matrix , h_R[j+i*N] = 0. because this is lower matrix so upper should be 0 and dsyrk_ with "U" argument) .
- Code: Select all
for(int i=N-10;i<N;i++){
for(int j=N-10;j<N;j++)
printf("%lf ",C[i*N+j]);
printf("\n");
}
because the begin of matrix is good, but the end of matrix isn't the same as the input matrix before decomposition.
If You have time please give me the whole code with printf statements where I can see that decomposition is done correctly
also I tried my own code for matrix multiplication and it shows that the decomposition isn't correctly
- Code: Select all
double *temp,*h_a_u,*h_a_l;
double sum;
temp = new double[n2];
h_a_u = new double[n2];
h_a_l = new double[n2];
for(int i=0;i<N;i++)
for(int j=0;j<N;j++){
if(j>=i) h_a_u[i*N+j]=h_R[j*N+i];
if(j<=i)
h_a_l[i*N+j]=h_R[i*N+j];
}
for(int j=N-20;j<N;j++)
for(int k=N-20;k<N;k++){
sum=0;
for(int i=0;i<N;i++)
sum+=h_a_l[j*N+i]*h_a_u[i*N+k];
temp[j*N+k]=sum;
}