culaDeviceSgels on K20
culaDeviceSgels on K20
I have a complex derivatives pricing application that uses Monte Carlo on the GPU. Almost everything is done on the GPU, so there is very little data transfer between CPU and GPU.
The main bottleneck is culaDeviceSgels, which is called about 200 times. A typical call has 320k rows and 10 columns.
On the K20, the application spends about 4.4 seconds in this routine in total. On the Quadro FX 5800, it spends about 3.6 seconds.
Is this behaviour to be expected?
- mark_joshi
- Posts: 2
- Joined: Tue Jul 02, 2013 6:04 pm
Re: culaDeviceSgels on K20
Our GELS isn't specifically optimized for the extremely rectangular cases, like the one you have here.
- john
- Administrator
- Posts: 587
- Joined: Thu Jul 23, 2009 2:31 pm
Re: culaDeviceSgels on K20
Do you have any suggestions on how to proceed?
- mark_joshi
- Posts: 2
- Joined: Tue Jul 02, 2013 6:04 pm
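For readers with a similarly tall, skinny system (about 320k rows by 10 columns in this thread), one commonly used alternative to a general least-squares routine is the normal equations: form the small matrix A^T A (10 x 10 here) and the vector A^T b, then solve with a Cholesky factorization. This was not proposed by either poster and is not a description of how CULA's GELS works. The sketch below is plain Python for readability; the function name normal_equations_lstsq is hypothetical. On a GPU the expensive step would be the product A^T A, with the small solve being cheap by comparison. Note that the normal equations square the condition number of A, so in single precision the result can be noticeably less accurate than a QR-based solve when the columns are close to linearly dependent.

```python
import math


def normal_equations_lstsq(A, b):
    """Solve min ||A x - b||_2 for a tall, skinny A via the normal equations.

    A is a list of m rows, each a list of n floats (m >> n); b has length m.
    Hypothetical helper for illustration only, not part of any library.
    """
    n = len(A[0])

    # Accumulate G = A^T A (lower triangle only) and c = A^T b in one pass.
    G = [[0.0] * n for _ in range(n)]
    c = [0.0] * n
    for row, bi in zip(A, b):
        for i in range(n):
            ri = row[i]
            c[i] += ri * bi
            for j in range(i + 1):
                G[i][j] += ri * row[j]

    # Cholesky factorization G = L L^T.
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = G[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("A^T A is not numerically positive definite")
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]

    # Forward substitution: solve L y = c.
    y = [0.0] * n
    for i in range(n):
        y[i] = (c[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]

    # Back substitution: solve L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


if __name__ == "__main__":
    # Small made-up example: fit 2 + 3t - t^2 with the basis [1, t, t^2].
    ts = [i / 1000.0 for i in range(1000)]
    A = [[1.0, t, t * t] for t in ts]
    b = [2.0 + 3.0 * t - t * t for t in ts]
    print(normal_equations_lstsq(A, b))
```

Whether this is faster or accurate enough for a given pricing application would need to be measured; it is only worth trying when the columns of A are reasonably well conditioned.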