R12 Sgemm - NAN

Posted: Fri Feb 17, 2012 12:52 pm
by bballmitch2
Hello, I'm using CULA R12 (CUDA 4.0) in my current project. I'm using Linux, with an Nvidia GTX 570 graphics card. I make the following call in my code:

int m = cudaLowerBp_rows; //100
int n = 1;
int k = cudaLowerBp_cols; //1280
int lda = m;
int ldb = k;
int ldc = m;
culaStat = culaDeviceSgemm('N', 'N', m, n, k, 1.0, cudaLowerBp, lda, cudaPartialDataTemp, ldb, 0.0, cudaHairCoeffs, ldc);

I've verified that all of the inputs hold valid, correct data, and culaStat does not report any error. However, every value in the output of that call (cudaHairCoeffs) is NaN. I have no idea what is causing this. I also can't imagine what could produce NaN from a matrix multiply operation; it's not like I'm in danger of dividing by zero.
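For what it's worth, a matrix multiply can produce NaN without any division: a NaN already present in an input propagates through every sum it touches, and an Inf multiplied by zero also yields NaN. One way to rule this out is to copy each input buffer back to the host and scan it. The following is only a minimal sketch under that assumption; count_nonfinite is a hypothetical helper name, not part of CULA or CUDA.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: count entries of a host-side float buffer
   that are NaN or +/-Inf. */
static size_t count_nonfinite(const float *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    size_t bad = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!isfinite(buf[i]))
            ++bad;
    }
    return bad;
}
```

Assuming cudaLowerBp (m*k floats) and cudaPartialDataTemp (k*n floats) have first been copied into host arrays, for example with cudaMemcpy, a nonzero count for either array would explain a NaN result.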

Is there something obvious that I'm doing wrong? Or maybe it's a bug in that function call? Any help would be much appreciated!

Re: R12 Sgemm - NAN

Posted: Fri Feb 17, 2012 1:13 pm
by bballmitch2
Also, I should mention that cudaHairCoeffs is allocated but uninitialized memory. According to the CULA Programmer's Guide, this shouldn't be a problem when beta = 0.0.
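To illustrate why the beta = 0.0 case matters, here is a minimal host-side sketch of a column-major GEMM. It is not CULA's implementation, and ref_sgemm_nn and its skip_c_if_beta_zero flag are hypothetical names. When beta is zero and the routine never reads C, garbage in C is harmless; if it instead computes beta * C literally, a NaN left in uninitialized memory survives, because 0 * NaN is NaN. Which of the two a given library does is an assumption worth testing.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical reference routine, column-major storage:
   C = alpha * A * B + beta * C, with A (m x k), B (k x n), C (m x n).
   If skip_c_if_beta_zero is nonzero, C is never read when beta == 0. */
static void ref_sgemm_nn(int m, int n, int k, float alpha,
                         const float *a, int lda,
                         const float *b, int ldb,
                         float beta, float *c, int ldc,
                         int skip_c_if_beta_zero)
{
    /* Leading dimensions must cover the row counts of A, B and C. */
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += a[i + (size_t)p * lda] * b[p + (size_t)j * ldb];
            float *cij = &c[i + (size_t)j * ldc];
            if (beta == 0.0f && skip_c_if_beta_zero)
                *cij = alpha * acc;               /* old C is ignored */
            else
                *cij = alpha * acc + beta * *cij; /* 0 * NaN stays NaN */
        }
    }
}
```

Filling the output buffer with zeros before the call (for example with cudaMemset on the device) and rerunning would show whether uninitialized memory is the cause here.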

Re: R12 Sgemm - NAN

Posted: Mon Feb 20, 2012 6:18 am
by john
Have you checked the status code that was returned?