CULA Device
7 posts
• Page 1 of 1
CULA Device
Hi,
I'm getting much slower results when launching culaDeviceSgesv than culaSgesv on very similar matrices/RHSs. Why is this?
- jezz0r
- Posts: 5
- Joined: Tue Jul 02, 2013 6:39 am
Re: CULA Device
Actually, I just checked on identical systems and the problem is the same: the device version is about three times slower.
- jezz0r
- Posts: 5
- Joined: Tue Jul 02, 2013 6:39 am
Re: CULA Device
The host interface takes certain liberties with how the data sits on the card (since we own that data, not the user). Sometimes that works out to a decent little speed boost, but not usually 3x. It's hard to say more without more detail from you.
- john
- Administrator
- Posts: 587
- Joined: Thu Jul 23, 2009 2:31 pm
Re: CULA Device
Be sure to include a "warmup" run in your testing. The first call that hits the GPU pays one-time costs, such as kernels being loaded onto the card, so it should not be counted in your timings.
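
A minimal sketch of a timing loop with untimed warmup runs, in C. The names `time_solver`, `solve_fn`, and `count_calls` are hypothetical and not part of CULA. In a real benchmark the callback would wrap culaSgesv or culaDeviceSgesv and restore the matrix before each solve, since gesv-style routines overwrite their inputs. It assumes a POSIX system for `clock_gettime`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime under strict -std= modes */
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical callback type: performs one complete solve.
   In practice this would wrap culaSgesv or culaDeviceSgesv. */
typedef void (*solve_fn)(void *ctx);

/* Stand-in workload for trying the harness without a GPU: counts calls. */
static void count_calls(void *ctx)
{
    ++*(int *)ctx;
}

/* Runs `warmup` untimed calls, then `reps` timed calls, and returns the
   mean wall-clock time per timed call in seconds. */
static double time_solver(solve_fn solve, void *ctx, int warmup, int reps)
{
    struct timespec t0, t1;

    assert(solve != NULL && warmup >= 0 && reps > 0);

    /* Warmup: absorbs one-time costs such as kernel loading. */
    for (int i = 0; i < warmup; ++i)
        solve(ctx);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; ++i)
        solve(ctx);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double elapsed = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
    return elapsed / reps;
}
```

Timing both interfaces with the same `warmup` and `reps` values, on the same matrix, keeps the comparison fair. For the device interface, decide up front whether host-to-device copies belong inside the callback, because that choice alone can change the ratio.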
- john
- Administrator
- Posts: 587
- Joined: Thu Jul 23, 2009 2:31 pm
Re: CULA Device
To be clear, that is the output from a single program, and it does not even include the first run. The program solves the system repeatedly and updates certain values from each result.
- jezz0r
- Posts: 5
- Joined: Tue Jul 02, 2013 6:39 am
Re: CULA Device
It's impossible to really be helpful without a complete test program with data, but I can keep giving one-off suggestions. You should try padding your matrix so that N is a multiple of 16, 32, 64, etc. (try a few different ones to see what your GPU likes). Just remember to put ones on the diagonal of the padded portion, so that it forms an identity block, rather than leaving it all zeroes (an all-zero padding would make the matrix singular). You could also pad only LDA, but I find it easier to keep N=LDA.
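
A minimal sketch of that padding in C, assuming column-major storage as in LAPACK-style interfaces. The helpers `round_up`, `pad_matrix_identity`, and `pad_rhs_zero` are hypothetical names, not CULA functions. The padded matrix has the block form [[A, 0], [0, I]] and the right-hand side is extended with zero rows, so the first N rows of the solution should match the unpadded system.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Round n up to the next multiple of block. */
static int round_up(int n, int block)
{
    return ((n + block - 1) / block) * block;
}

/* Copy the n x n column-major matrix `a` (leading dimension lda) into a
   newly allocated np x np matrix with leading dimension np, where
   np = round_up(n, block). Padding is zero except for ones on the padded
   diagonal. Returns NULL on allocation failure; the caller frees. */
static float *pad_matrix_identity(const float *a, int n, int lda,
                                  int block, int *np_out)
{
    assert(a != NULL && np_out != NULL && n > 0 && lda >= n && block > 0);

    int np = round_up(n, block);
    /* calloc zero-fills, which is 0.0f on IEEE-754 platforms. */
    float *p = (float *)calloc((size_t)np * (size_t)np, sizeof *p);
    if (p == NULL)
        return NULL;

    /* Copy each original column into the top-left n x n block. */
    for (int j = 0; j < n; ++j)
        memcpy(p + (size_t)j * np, a + (size_t)j * lda,
               (size_t)n * sizeof *p);

    /* Identity in the bottom-right padded block keeps it non-singular. */
    for (int j = n; j < np; ++j)
        p[(size_t)j * np + j] = 1.0f;

    *np_out = np;
    return p;
}

/* Copy the n x nrhs column-major right-hand side `b` (leading dimension
   ldb) into a newly allocated np x nrhs block with zero rows appended. */
static float *pad_rhs_zero(const float *b, int n, int nrhs, int ldb, int np)
{
    assert(b != NULL && n > 0 && nrhs > 0 && ldb >= n && np >= n);

    float *p = (float *)calloc((size_t)np * (size_t)nrhs, sizeof *p);
    if (p == NULL)
        return NULL;

    for (int j = 0; j < nrhs; ++j)
        memcpy(p + (size_t)j * np, b + (size_t)j * ldb,
               (size_t)n * sizeof *p);
    return p;
}
```

The padded buffers would then be passed to the solver with N, LDA, and LDB all equal to `np`. Trying `block` values of 16, 32, and 64 in turn and timing each is one way to see which the GPU prefers.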
- john
- Administrator
- Posts: 587
- Joined: Thu Jul 23, 2009 2:31 pm