I am trying to implement Gram-Schmidt orthonormalization in several environments, and I am having trouble with the MAGMA and CUBLAS version.
I first implemented it in Octave so that I have every matrix (the final matrix of orthonormalized vectors and the intermediate ones) to compare against. I also check that the final vectors are orthonormal.
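To make the two checks concrete, here is a minimal CPU sketch of how I understand them. The function names `orthonormality_error` and `max_abs_diff` are hypothetical helpers written for this post, not part of MAGMA or CUBLAS, and the sketch assumes the vectors are the columns of a column-major matrix.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Returns max |(Q^T * Q - I)(i, j)| for a column-major n x m matrix Q
   whose m columns are the vectors under test. A value close to zero
   (relative to rounding error) means the columns are orthonormal. */
static double orthonormality_error(const double *Q, size_t n, size_t m)
{
    assert(Q != NULL);
    double worst = 0.0;
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = i; j < m; ++j) {
            double dot = 0.0;
            for (size_t k = 0; k < n; ++k)
                dot += Q[k + i * n] * Q[k + j * n];
            double err = fabs(dot - (i == j ? 1.0 : 0.0));
            if (err > worst)
                worst = err;
        }
    }
    return worst;
}

/* Returns max |X[i] - Y[i]| over len entries, for comparing a result
   against a reference matrix that uses the same storage layout. */
static double max_abs_diff(const double *X, const double *Y, size_t len)
{
    assert(X != NULL && Y != NULL);
    double worst = 0.0;
    for (size_t i = 0; i < len; ++i) {
        double d = fabs(X[i] - Y[i]);
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

Both values would then be compared against a small tolerance chosen for double precision, rather than tested for exact zero.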
As far as I know, both MAGMA and CUBLAS store matrices in column-major order.
h_x = matrix on host
d_x = matrix on device (GPU)
dim_y = rows of the initial matrix (number of vectors that generate the subspace)
dim_x = columns of the initial matrix (dimension of the vectors)
alpha = 1.0
beta = 0.0
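For clarity, this is the storage convention I am assuming with the names above: in column-major order, element (row, col) of a matrix with leading dimension ld sits at offset row + col * ld, so the dim_y x dim_x initial matrix has ld = dim_y. The helper name `idx_colmajor` is only illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major offset of element (row, col) in a matrix whose leading
   dimension (number of allocated rows per column) is ld. */
static size_t idx_colmajor(size_t row, size_t col, size_t ld)
{
    assert(row < ld); /* a row index beyond ld would land in another column */
    return row + col * ld;
}
```

With this convention, component j of vector i in the initial matrix would be read as h_x[idx_colmajor(i, j, dim_y)].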
These are the calls I make.
First, the initial matrix times its transpose:
stat = cublasDsyrk_v2( handle,
char* upper = MagmaUpperStr;
magma_dpotrf_gpu( upper, dim_y, d_matrix_syrk, dim_y, &info );
For the last step, I chose to solve X * op(A) = alpha * B, i.e. X = alpha * B * inv( op( A ) ).
I use no op, and B has to be transposed to match A's dimensions, so I first transpose B (the initial vector matrix):
magmablas_dtranspose2( d_matrix_t, dim_x, d_matrix, dim_y, dim_y, dim_x );
Finally, I call dtrsm:
magma_dtrsm( MagmaRight, MagmaUpper, MagmaNoTrans, MagmaNonUnit,
dim_x, dim_y, alpha, d_matrix_syrk, dim_y, d_matrix_t, dim_x );
The result returned in d_matrix_t should be a dim_x x dim_y matrix, stored by columns, in which each column is an orthonormalized vector.
This matrix does not match the Octave one (I subtract them and compare the difference against a tolerance). Moreover, the vectors are not orthonormal.
The matrices passed to this function (d_matrix_syrk: U, d_matrix_t: B') are correct (compared against the Octave ones).
I have checked the matrices and dimensions again and again, and I think the parameters are correct. I have also tested cublasDsyrk and got the same results.
I am using real double precision, CUDA 4.1, and an NVIDIA GeForce GTX 285.
I forgot to say that this issue occurs only when the dimensions differ: with dim_x = dim_y = 1000 the final vectors are orthonormal, but with, for example, dim_x = 1000 and dim_y = 2000, the error appears.
Thanks in advance.