R12 Sgemm - NAN

Postby bballmitch2 » Fri Feb 17, 2012 12:52 pm

Hello, I'm using CULA R12 (CUDA 4.0) in my current project, on Linux with an NVIDIA GTX 570 graphics card. I make the following call in my code:

Code:
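int m = cudaLowerBp_rows;  // rows of A and C (100)
int n = 1;                 // columns of B and C
int k = cudaLowerBp_cols;  // columns of A, rows of B (1280)
int lda = m;               // leading dimension of A (m x k, column-major)
int ldb = k;               // leading dimension of B (k x n)
int ldc = m;               // leading dimension of C (m x n)
// C = 1.0 * A * B + 0.0 * C, with no transposes
culaStat = culaDeviceSgemm('N', 'N', m, n, k, 1.0f, cudaLowerBp, lda, cudaPartialDataTemp, ldb, 0.0f, cudaHairCoeffs, ldc);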


I've verified that all of the inputs contain valid and correct data, and culaStat does not report an error. However, every value in the output of that call (cudaHairCoeffs) is NaN. I have no idea what is causing this. I also can't imagine how a matrix multiply could produce NaN from valid inputs; it's not like I'm in danger of dividing by zero.

Is there something obvious that I'm doing wrong? Or maybe it's a bug in that function call? Any help would be much appreciated!
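
For cross-checking, here is a minimal CPU sketch of the same operation for the 'N', 'N' case with column-major storage. The name ref_sgemm_nn is hypothetical and is not part of CULA. It only shows how m, n, k, lda, ldb and ldc index the three matrices, so the device result can be copied back to the host and compared against it.

```c
#include <stddef.h>

/* Hypothetical reference routine (not a CULA function).
   Column-major SGEMM, no transposes: C = alpha*A*B + beta*C.
   A is m x k with leading dimension lda, B is k x n (ldb), C is m x n (ldc).
   When beta == 0, C is overwritten without being read, as in reference BLAS. */
static void ref_sgemm_nn(int m, int n, int k, float alpha,
                         const float *a, int lda,
                         const float *b, int ldb,
                         float beta, float *c, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            /* Dot product of row i of A with column j of B. */
            for (int p = 0; p < k; ++p)
                acc += a[i + (size_t)p * lda] * b[p + (size_t)j * ldb];
            float *cij = &c[i + (size_t)j * ldc];
            *cij = (beta == 0.0f) ? alpha * acc
                                  : alpha * acc + beta * (*cij);
        }
    }
}
```

With m = 100, n = 1 and k = 1280, the values lda = m, ldb = k and ldc = m used in my call are the tightly packed leading dimensions that this indexing expects.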
Last edited by bballmitch2 on Fri Feb 17, 2012 1:41 pm, edited 6 times in total.
bballmitch2
 
Posts: 2
Joined: Wed Sep 28, 2011 11:27 am

Re: R12 Sgemm - NAN

Postby bballmitch2 » Fri Feb 17, 2012 1:13 pm

Also, I should mention that cudaHairCoeffs points to memory that has been allocated but not initialized. According to the CULA Programmer's Guide, this shouldn't be a problem when beta = 0.0.
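
Still, here is a small sketch of why uninitialized output memory could matter in principle. It assumes, without any confirmation for CULA, that an implementation might compute beta * C literally instead of skipping C when beta is zero. The function names are hypothetical. Under IEEE 754 arithmetic, 0 * NaN is NaN, so a NaN already sitting in the output buffer would survive the naive form.

```c
#include <math.h>

/* Hypothetical helpers: two ways to combine a new product entry `ab`
   with the old output entry `c_old`. */

/* Naive form: always reads c_old, so a NaN in c_old propagates
   even when beta is 0, because 0.0f * NaN is NaN. */
static float combine_naive(float beta, float c_old, float ab)
{
    return ab + beta * c_old;
}

/* Guarded form: never reads c_old when beta is exactly 0. */
static float combine_guarded(float beta, float c_old, float ab)
{
    return (beta == 0.0f) ? ab : ab + beta * c_old;
}
```

If that is what is happening, zeroing cudaHairCoeffs (for example with cudaMemset) before the call should make the NaNs disappear, which would be a quick way to test the idea.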
bballmitch2
 
Posts: 2
Joined: Wed Sep 28, 2011 11:27 am

Re: R12 Sgemm - NAN

Postby john » Mon Feb 20, 2012 6:18 am

Have you checked the status code that was returned?
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm
