CULA • View topic - Multi-GPU CULA

Multi-GPU CULA

General CULA Dense (LAPACK & BLAS) support and troubleshooting. Use this forum if you are having a general problem or have encountered a bug.

Postby wgomez » Thu Jul 08, 2010 11:52 am

We've been trying to get a multi-GPU program working but have been running into inconsistent errors with CULA. The problems appear when we ask the program to create 6 or 10 threads. It consistently completes when asked to run 5 threads, but crashes on the fifth or sixth thread when asked to run 6. For some reason it also completes fine with 9 threads, but crashes with 10. The error is a culaRuntimeError with error code 17.

We are using CULA 1.2 and calling syev on the device. We've tried moving data around, switching to the host interface, and moving our calls to culaInitialize() and culaShutdown(). Our machine is using 4 Tesla C1060s. We also checked that the input data is correct.

Since we got the most stability when calling the initialization and shutdown immediately before and after the culaDeviceDsyev function call, we were wondering what the best practices for using the initialization and shutdown functions are. When and where should they be used? We found that if a thread called the shutdown function during another thread's run, the thread would return a culaNotInitialized error. Shouldn't the calls be thread specific?

Any help would be appreciated. Really frustrated at this point.
wgomez
 
Posts: 3
Joined: Thu Jul 08, 2010 11:35 am

Re: Multi-GPU CULA

Postby kyle » Thu Jul 08, 2010 1:13 pm

kyle
Administrator
 
Posts: 301
Joined: Fri Jun 12, 2009 7:47 pm

Re: Multi-GPU CULA

Postby kyle » Thu Jul 08, 2010 3:09 pm

We have narrowed down the error to a context management bug in culaShutdown(). A fix is in the works, but in the meantime you can most likely skip the culaShutdown() call without problems.
kyle
Administrator
 
Posts: 301
Joined: Fri Jun 12, 2009 7:47 pm

Re: Multi-GPU CULA

Postby john » Fri Jul 09, 2010 9:04 am

john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: Multi-GPU CULA

Postby wgomez » Fri Jul 09, 2010 12:23 pm

Thanks for the replies.

We got our code to work without errors for multiple runs. We ended up adding a mutex that made sure that only one thread was using CULA at a time, and it hasn't crashed since. Every thread that is about to call a CULA function locks the mutex, initializes CULA, calls CULA, shuts down CULA, and then frees the mutex. Not a big fan of the solution, but for now it works.
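For anyone reading later, here is a minimal sketch of the locking pattern described above. It is not our actual code: fake_initialize(), fake_dsyev() and fake_shutdown() are hypothetical stand-ins for culaInitialize(), culaDeviceDsyev() and culaShutdown(), and they only count how many threads are inside the guarded section at once, so the sketch runs without CULA or a GPU.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the real CULA calls. They do no GPU work;
// they only record how many threads are inside the "CULA" section at once.
static std::atomic<int> g_inside{0};
static std::atomic<int> g_max_inside{0};

static void fake_initialize() {            // stands in for culaInitialize()
    int now = ++g_inside;
    int prev = g_max_inside.load();
    // Raise the recorded maximum if this thread pushed the count higher.
    while (now > prev && !g_max_inside.compare_exchange_weak(prev, now)) {}
}
static void fake_dsyev() {                 // stands in for culaDeviceDsyev()
    std::this_thread::yield();
}
static void fake_shutdown() {              // stands in for culaShutdown()
    --g_inside;
}

// One process-wide mutex: only one thread may use "CULA" at a time.
static std::mutex g_cula_mutex;

void solve_one_problem() {
    std::lock_guard<std::mutex> lock(g_cula_mutex);  // lock
    fake_initialize();                               // initialize
    fake_dsyev();                                    // call
    fake_shutdown();                                 // shut down
}                                                    // mutex released here

// Runs num_threads workers and returns the largest number of threads
// that were ever inside the guarded section simultaneously.
int run_workers(int num_threads) {
    g_inside = 0;
    g_max_inside = 0;
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back(solve_one_problem);
    }
    for (auto& w : workers) {
        w.join();
    }
    return g_max_inside.load();
}
```

The obvious cost is that the solver calls are fully serialized, so the GPUs never work on CULA calls at the same time, which is why we are not fond of it.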

We tried taking out the culaShutdown call at one point, but that didn't fix our particular problem. Unfortunately, we are planning on consistently creating and shutting down threads that will use CULA, so just ignoring the shutdown call won't work.

One more question: is there a reason that a call to culaDeviceDsyev() would spawn several extra threads? It appears to spawn 4 threads the first time it is called (by whichever thread gets there first), and then 3 threads the first time each subsequent thread calls it. I don't think it's causing a problem, but I was wondering why those threads are appearing.
wgomez
 
Posts: 3
Joined: Thu Jul 08, 2010 11:35 am

Re: Multi-GPU CULA

Postby john » Mon Jul 12, 2010 8:46 am

Thank you for the reply and thank you also for the input. It has been quite valuable. We believe that we have fixed the issue sufficiently and that in the next CULA version you will not need to apply such workarounds. In future releases, each thread should call culaInitialize/culaShutdown as appropriate and you will not see CUBLAS errors. Please keep in mind that CULA uses CUBLAS internally, so cublasShutdown should not be called if you have any upcoming CULA calls.

For the DSYEV question, please keep in mind that CULA is a hybrid CPU/GPU library, and as such we use both the CPU and GPU for certain portions of the code. In the case of a multicore CPU, we will also attempt to use as many cores as necessary. The extra threads you have observed are likely for one-time bookkeeping and allocation.
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: Multi-GPU CULA

Postby wgomez » Mon Jul 12, 2010 10:15 am

We're glad we could help improve CULA.

Just to be complete, our solution to the problem still throws a cudaError sometime during each culaInitialize() call. The call itself returns culaNoError and our program functions correctly, though, so we are moving forward.

Thanks for the reply about our DSYEV question. We expected it to have to do with having a multicore CPU, but we weren't sure if they were meant to stick around until the entire program completes.
wgomez
 
Posts: 3
Joined: Thu Jul 08, 2010 11:35 am

Re: Multi-GPU CULA

Postby dan » Tue Jul 13, 2010 7:40 am

Hi wgomez,

With regard to the exception being thrown, this is perfectly fine because no exception will ever propagate beyond the API boundary. Microsoft Visual Studio lists every exception that it sees (even if it's not in your code), and this is likely what you're observing. This is expected behavior: it is the CUDA runtime that is throwing the exception, not us.

Thanks for your input,

Dan
dan
Administrator
 
Posts: 61
Joined: Thu Jul 23, 2009 2:29 pm

Re: Multi-GPU CULA

Postby john » Tue Jul 13, 2010 8:31 am

Just a small comment to add to Dan's post: we once looked into this exception because we noticed it just as you did. It seems that it's thrown and handled several times by CUDA during normal operation. I'd only worry if one went unhandled, but that won't occur from CULA.
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

