CULA is not using my GPU!?

General discussion for CULA. Use this forum for questions, examples, feedback, and feature requests.

CULA is not using my GPU!?

Postby omidkar » Tue Apr 23, 2013 4:30 pm

Hi,

I am new to CULA. I have code in which a sparse linear system of equations is solved
using the preconditioned conjugate gradient methods provided in CULA Sparse.
I have tried every solver/preconditioner combination, but all of them appear to be much slower than the same code on the CPU. I then suspected that CULA was not actually using my GPU. Using the GPU-Z tool, I monitored the GPU while the CULA code was running; it turned out that the Tesla K20 was not really being used at all. Some details of my setup are as follows:

I have two GPUs in my system, a Tesla K20 and a Quadro 600, but the one I'd like to use for my code is the Tesla K20. As you know, it is the latest available GPU from Nvidia and is supposed to be substantially faster.
Also, I am using PGI Visual Fortran, which uses CUDA Fortran.
Finally, when running the code, after CULA initialization I get the following warning:
"your GPU <Tesla K20c> does not have full double-precision perf. As such, you may not be able to use CULA Sparse to its full potential. Recommended cards include the Tesla 20 series and Quadro 6000".

Could somebody please help me with this issue?

Many thanks,

Omid
omidkar
 
Posts: 2
Joined: Wed Jan 23, 2013 5:35 pm

Re: CULA is not using my GPU!?

Postby john » Wed Apr 24, 2013 5:53 am

You can ignore that warning - it is spurious and will be removed in the next version. NVIDIA doesn't provide a reliable way to detect full double precision, and the K20C has been misdetected by our warning code.

Your GPU will definitely be used for this computation. Performance will depend on which GPU was selected (there is a chance that your Quadro has been selected) and the matrix itself. You'll want to make sure the K20 is selected, by either using cudaSetDevice or the CUDA_VISIBLE_DEVICES environment variable. But even if the GPU is selected, it's possible that under certain circumstances the GPU could still be slower than the CPU - for instance if your matrix is very small, or if you converge in just a few iterations. It's all very problem dependent from here.
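If it helps, here is a minimal sketch of that device-selection step, using only the plain CUDA runtime API (nothing CULA-specific). selectDeviceByName is a hypothetical helper, and the "K20" string match is just an illustration; you would call something like this before CULA initialization so the library inherits the selection:

#include <stdio.h>
#include <string.h>
#include <cuda_runtime.h>

/* Hypothetical helper: select the first CUDA device whose name
   contains the given substring, e.g. "K20". Returns the device
   index on success, or -1 if no device matched. */
int selectDeviceByName(const char* substring)
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess)
        return -1;

    for (int i = 0; i < count; ++i)
    {
        struct cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess)
            continue;
        if (strstr(prop.name, substring) != NULL)
        {
            cudaSetDevice(i);  /* make this device current for the process */
            printf("Selected device %d: %s\n", i, prop.name);
            return i;
        }
    }
    return -1;  /* no match; the default device remains selected */
}

int main(void)
{
    selectDeviceByName("K20");
    /* ... CULA initialization and solve would follow here ... */
    return 0;
}

Alternatively, set CUDA_VISIBLE_DEVICES to whichever index the K20 reports (for example, set CUDA_VISIBLE_DEVICES=0 on Windows) so that the Quadro is hidden from the process entirely.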
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: CULA is not using my GPU!?

Postby chongteng » Sat Sep 27, 2014 8:53 am

john wrote:But even if the GPU is selected, it's possible that under certain circumstances the GPU could still be slower than the CPU - for instance if your matrix is very small, or if you converge in just a few iterations. It's all very problem dependent from here.


John,

I have downloaded the CULA Sparse demo program from the CULA site (http://www.culatools.com/downloads/sparse/) and tested it with different matrix sizes. What I found is that for matrix sizes ranging from 100x100 to 1000000x1000000 (the largest I tested), I got much better performance from the CPU solver than from the GPU solver. This was a real surprise to me.

Matrix size: 1000000x1000000 (NNZ=8999974)
CPU Solver Result: Solver Time: 0.1s, Total Time: 0.72s
GPU Solver Result: Solver Time: 0.57s, Total Time: 1.62s

Matrix size: 1000x1000 (NNZ=1100)
CPU Solver Result: Solver Time: 0.0032s, Total Time: 0.0033s
GPU Solver Result: Solver Time: 0.46s, Total Time: 0.79s

Here is the screen output from one test with matrix size 23556x23556 (NNZ=484512):
data format: coo
platform: host
preconditioner: jacobi
solver: cg
iterations: 20
overhead time: 0.0039s
precond. time 0.00058s
solver time: 0.023s
total time: 0.028s

data format: coo
platform: cuda
preconditioner: jacobi
solver: cg
iterations: 20
overhead time: 0.5319s
precond. time 0.55s
solver time: 0.823s
total time: 1.35s

The machine I'm testing on is a MacBook Pro with an i7-4850HQ 2.30GHz CPU and a GeForce 750 GPU. I have also tested on a Tesla GPU server box and got similar results.

I have tried all the combinations of solver and preconditioner. So far I haven't found a single case where the GPU solver is faster than the CPU solver. I don't know if this is the expected result.
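In case the one-time CUDA context creation (the ~0.5s "overhead time" above) is dominating my measurements, the sketch below shows how I would time a second, warm solve with CUDA events. solve() is just a placeholder standing in for whatever the demo actually calls, not a real CULA function:

#include <stdio.h>
#include <cuda_runtime.h>

/* Placeholder for the demo's actual solver call (hypothetical);
   replace with the real CULA Sparse invocation. */
static void solve(void)
{
    /* In the real test this would run CG + Jacobi on the GPU. */
    cudaDeviceSynchronize();
}

int main(void)
{
    cudaFree(0);   /* force CUDA context creation up front */
    solve();       /* warm-up solve; not timed */

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    solve();       /* the solve that actually gets measured */
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("warm solver time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}

Even so, I suspect that for the small cases (e.g. 1000x1000 with NNZ=1100) there simply isn't enough work to keep the GPU busy.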
chongteng
 
Posts: 1
Joined: Mon May 05, 2014 7:34 am

