Potential bug in culaDeviceDgesv

Postby zaspel » Wed Aug 08, 2012 5:46 am

Hi,

There might be a bug in culaDeviceDgesv when solving for a large number of right-hand sides.

I attached a minimal test code and a makefile to reproduce the problem. It allocates a 100x100 matrix and solves the linear system for 2,560,000 right-hand sides. Every GPU call made after the call to culaDeviceDgesv freezes the program. In my test program, I perform a cudaMemcpy, which then freezes. Thus you will see the string "Test1" on the screen, but never "Test2". Neither cuda-memcheck nor valgrind reports any problem. However, I only encounter the problem on a Tesla M2090 GPU, not on a Tesla C2050 or Tesla S1070.

Operating system: Ubuntu Lucid 10.04
Compiler: gcc (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3

Attached you will find the test code, a Makefile, and the output of deviceQuery. If you need further information, please ask. (Note: I added the .txt suffix to the Makefile so that the forum would accept the upload.)

It would be great if you could find a solution for this bug. We bought a bunch of licenses which are now waiting for work. :(
Attachments
Makefile.txt
(119 Bytes) Downloaded 284 times
deviceQuery.txt
(2.18 KiB) Downloaded 299 times
test.cu
(1.9 KiB) Downloaded 294 times
zaspel
 
Posts: 2
Joined: Thu Nov 19, 2009 10:51 am

Re: Potential bug in culaDeviceDgesv

Postby kyle » Wed Aug 08, 2012 2:21 pm

Hello and thanks for the bug report.

We were able to reproduce this issue on a system with a C2075 (as in your case, our C2050s were OK). I was able to narrow down the root cause to a CUDA resource overflow. This has since been fixed by splitting the problem into multiple smaller problems that don't overflow the CUDA resources.
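
For readers who need a workaround before the release, the splitting idea can be sketched on the host side. This is only an illustration under stated assumptions, not the actual CULA fix, whose internals are not shown in this thread. It assumes B is stored column-major with leading dimension ldb, so that a block of consecutive right-hand sides is contiguous in memory. The names `RhsChunk`, `split_rhs`, and `max_cols` are hypothetical, and a suitable chunk size would have to be found by experiment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One slice of the right-hand-side matrix B (column-major, leading dimension ldb).
struct RhsChunk {
    std::size_t offset;  // element offset of the chunk's first column within B
    int first_col;       // index of the chunk's first column
    int ncols;           // number of right-hand sides in this chunk
};

// Split nrhs columns into chunks of at most max_cols columns each.
// max_cols is a hypothetical tuning parameter, not a value taken from CULA.
std::vector<RhsChunk> split_rhs(int nrhs, int ldb, int max_cols) {
    assert(nrhs >= 0 && ldb > 0 && max_cols > 0);
    std::vector<RhsChunk> chunks;
    for (int col = 0; col < nrhs; ) {
        const int ncols = std::min(max_cols, nrhs - col);
        chunks.push_back({static_cast<std::size_t>(col) * ldb, col, ncols});
        col += ncols;  // never exceeds nrhs, so no integer overflow
    }
    return chunks;
}
```

Each chunk would then be solved on the sub-block starting at `b + chunk.offset` with `chunk.ncols` right-hand sides. Because a gesv-style call overwrites A with its LU factors, calling it once per chunk on the same A would not be correct. One would instead factor A once and apply a getrs-style triangular solve per chunk, assuming such routines are available in your CULA version.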

The fix has been integrated into our development build of CULA Dense and will be available in the upcoming release.
kyle
Administrator
 
Posts: 301
Joined: Fri Jun 12, 2009 7:47 pm

Re: Potential bug in culaDeviceDgesv

Postby zaspel » Thu Aug 09, 2012 12:04 am

Hello,

First of all, it is great to see a bug report handled and answered so quickly! :shock:

I'm glad to know that you could reproduce the bug and there is already a solution for it in the next release.

The obvious question now is: what is the expected release date? And is there a chance to get access to some sort of beta release of the upcoming version of the library?
zaspel
 
Posts: 2
Joined: Thu Nov 19, 2009 10:51 am

Re: Potential bug in culaDeviceDgesv

Postby john » Thu Aug 09, 2012 5:48 am

We got on this bug quickly because we wanted to ensure that it would be in the next release, which will be very soon.
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

