How do I set the CUDA Stream CULA uses?

Postby rdun044 » Tue Jun 05, 2012 9:09 pm

I need to set the CUDA stream that CULA functions use on the GPU.

How do I do this? I can't find the function anywhere.

In CUBLAS I do this with a couple of functions, like this:

cublasCreate_v2(&CublasHandle);
cublasSetStream_v2(CublasHandle, theStream);

Thanks,
rdun044
 
Posts: 3
Joined: Tue Oct 05, 2010 3:36 pm

Re: How do I set the CUDA Stream CULA uses?

Postby john » Wed Jun 06, 2012 6:10 am

Many CULA functions will already use 1 or more streams internally, so we can't bind to only one stream in this manner.
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: How do I set the CUDA Stream CULA uses?

Postby rdun044 » Wed Jun 06, 2012 1:54 pm

Is there a way to stop it from requiring a device sync on each call? I need to run CULA functions while I copy data in a real-time video application. I can handle CULA allocating its own streams etc., but I need it to run concurrently with the memory copies.

I've attached a couple of screenshots to show what I mean.

This is the behavior of cublas sgemm; I can run it concurrently with the memory copies.

[Image: profiler screenshot, cublas sgemm]

CULA, however, insists on waiting for the memory copy to finish before starting its kernel.

[Image: profiler screenshot, CULA sgemm]

For SGEMM I can just use cublas, but this is not the case for the more advanced features CULA offers. The only difference between the code in the two profiler examples above is that one calls the cublas sgemm and the other calls the CULA sgemm on the device.

Thanks,

Also: I can't seem to post in the private support forum for those who have paid a subscription.
rdun044
 
Posts: 3
Joined: Tue Oct 05, 2010 3:36 pm

Re: How do I set the CUDA Stream CULA uses?

Postby john » Thu Jun 07, 2012 8:48 am

We'll have a look into not syncing the whole device, which is something of a pre-CUDA-4 decision from when we basically demanded exclusive access to the device because of the threading model.

I will note that overlapping your own memory copies with a CULA routine might lead to degraded performance because most CULA routines will be performing their own transfers internally while executing.

Unfortunately, the premium support forums are not open to academic accounts, but we still readily answer questions like the ones you've submitted here.
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: How do I set the CUDA Stream CULA uses?

Postby rdun044 » Thu Jun 07, 2012 3:45 pm

Ah thanks for that, I guess I missed the section about Academic and the support package!

The frustrating thing for me is that the CULA functions I need, eigendecomposition and LU factorization, are complexity-wise only very small parts of the overall algorithm. Sadly, they are non-trivial to implement myself in a reasonable time frame.

I could probably tolerate a decent amount of degraded performance if I could run it asynchronously. With more control over when CULA executes, I could also stagger or split up my memory copies to some extent.

I am intrigued, though (and did some experiments): does CULA use the CPU for parts of the computation of eigenvalues and eigenvectors, or does it complete the entire calculation on the GPU?

One of the problems I had with MAGMA was that it split the computation between the CPU and the GPU, whereas I really need the entire calculation done on the GPU. I am already using the entire host->device bandwidth to transfer my raw image data, and this is likely the bottleneck in my system.
rdun044
 
Posts: 3
Joined: Tue Oct 05, 2010 3:36 pm

Re: How do I set the CUDA Stream CULA uses?

Postby v7kubkzs » Mon Jan 14, 2013 5:31 pm

john wrote: We'll have a look into not syncing the whole device, which is something of a pre-CUDA-4 decision from when we basically demanded exclusive access to the device because of the threading model.

In the interim, is it safe to assume that cudaDeviceSynchronize() is called at the beginning of any CULA routine (or, specific to my case, GESVD) and again before it returns, or do we need to add cudaDeviceSynchronize() calls explicitly? I'm using streams to overlap communication and computation, sandwiched around CULA calls.
v7kubkzs
 
Posts: 6
Joined: Fri Aug 17, 2012 8:32 pm

Re: How do I set the CUDA Stream CULA uses?

Postby john » Wed Jan 16, 2013 2:27 pm

CULA routines are synchronized on the output. Some are synchronized on the input, depending on the needs of the routine, but in general it is safest to assume not synchronized on input.
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

