culaDeviceSgels on K20

Post by mark_joshi » Thu Oct 31, 2013 8:50 pm

I have a complex derivatives pricing application that uses Monte Carlo simulation on the GPU. Almost everything is done on the GPU, so there is very little data transfer between the CPU and the GPU.

The main bottleneck is culaDeviceSgels, which is called about 200 times. A typical call has a matrix with 320k rows and 10 columns.

On the K20, the application spends about 4.4 seconds in this routine in total. On the Quadro FX 5800, it spends about 3.6 seconds.

Is this behaviour to be expected?

Re: culaDeviceSgels on K20

Post by john » Fri Nov 01, 2013 8:56 am

Our GELS isn't specifically optimized for extremely rectangular cases like the one you have here.
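
To illustrate why the shape matters, here is a minimal sketch of one common way to treat a very tall, thin least-squares problem: form the small n-by-n system of normal equations (A^T A) x = A^T b and solve it with a square-root-free Cholesky (LDL^T) factorization. This is not CULA's implementation and was not proposed in the thread. The function name ls_normal_equations and the LS_MAX_COLS cap are hypothetical, and the code is plain host C so that it stays self-contained. It assumes A has full column rank and is stored column-major, as in LAPACK. Note that the normal equations square the condition number of A, so accuracy can suffer for ill-conditioned regressors, especially with single-precision inputs.

```c
#include <assert.h>
#include <stddef.h>

#define LS_MAX_COLS 32 /* hypothetical cap; the case in this thread has 10 columns */

/* Solve min ||A x - b||_2 for a tall, thin A (m rows, n columns, m >= n)
 * through the normal equations (A^T A) x = A^T b.
 * A is column-major with leading dimension m. Sums are accumulated in double.
 * Returns 0 on success, -1 if A^T A is not numerically positive definite. */
int ls_normal_equations(const float *A, size_t m, int n,
                        const float *b, float *x)
{
    double G[LS_MAX_COLS][LS_MAX_COLS]; /* lower triangle of A^T A, then L and D */
    double y[LS_MAX_COLS];              /* A^T b, overwritten by the solution */

    assert(n > 0 && n <= LS_MAX_COLS);

    /* Form the lower triangle of G = A^T A and the vector y = A^T b.
     * This O(m * n^2) pass over A is where almost all of the work is. */
    for (int j = 0; j < n; ++j) {
        const float *aj = A + (size_t)j * m;
        for (int i = j; i < n; ++i) {
            const float *ai = A + (size_t)i * m;
            double s = 0.0;
            for (size_t k = 0; k < m; ++k)
                s += (double)ai[k] * (double)aj[k];
            G[i][j] = s;
        }
        double t = 0.0;
        for (size_t k = 0; k < m; ++k)
            t += (double)aj[k] * (double)b[k];
        y[j] = t;
    }

    /* In-place factorization G = L D L^T: D on the diagonal,
     * unit lower-triangular L below it. */
    for (int j = 0; j < n; ++j) {
        double d = G[j][j];
        for (int k = 0; k < j; ++k)
            d -= G[j][k] * G[j][k] * G[k][k];
        if (d <= 0.0)
            return -1; /* rank deficient or too ill-conditioned */
        G[j][j] = d;
        for (int i = j + 1; i < n; ++i) {
            double s = G[i][j];
            for (int k = 0; k < j; ++k)
                s -= G[i][k] * G[j][k] * G[k][k];
            G[i][j] = s / d;
        }
    }

    /* Forward substitution with unit-diagonal L. */
    for (int i = 0; i < n; ++i)
        for (int k = 0; k < i; ++k)
            y[i] -= G[i][k] * y[k];

    /* Diagonal scaling by D. */
    for (int i = 0; i < n; ++i)
        y[i] /= G[i][i];

    /* Back substitution with L^T. */
    for (int i = n - 1; i >= 0; --i) {
        for (int k = i + 1; k < n; ++k)
            y[i] -= G[k][i] * y[k];
        x[i] = (float)y[i];
    }
    return 0;
}
```

For the sizes mentioned above, the call would be ls_normal_equations(A, 320000, 10, b, x). Only the first loop touches the large matrix, and everything after it works on a 10-by-10 system. A GPU version of this idea would presumably parallelize just that first pass.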

Re: culaDeviceSgels on K20

Post by mark_joshi » Sun Nov 03, 2013 4:04 pm

Do you have any suggestions on how to proceed?