CULA benchmark

Post by dandan » Wed Jun 22, 2011 5:08 am

Hi,

I am running the benchmark code that ships with CULA R10 (CUDA 3.2) on a machine with two quad-core CPUs and two Fermi-based Tesla C2050 cards. The benchmark output is:

Initializing CULA...
Initializing MKL...

Benchmarking the following functions:
-------------------------------------
             SGEQRF
             SGETRF
             SGELS
             SGGLSE
             SGESV
             SGESVD
-------------------------------------


     -- SGEQRF Benchmark  --

Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
4096       0.34       0.84    2.4611
5120       0.58       1.41    2.4349
6144       0.98       2.36    2.4133
7168       1.31       3.67    2.7986
8192       1.89       5.43    2.8805

     -- SGETRF Benchmark  --

Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
4096       0.25       0.43    1.7560
5120       0.36       0.75    2.0738
6144       0.53       1.19    2.2245
7168       0.78       1.83    2.3379
8192       1.07       2.88    2.6834

     -- SGELS Benchmark  --

Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
4096       0.54       0.92    1.6941
5120       0.88       1.58    1.7827
6144       1.30       2.58    1.9793
7168       1.89       3.97    2.1078
8192       2.63       5.79    2.2003

     -- SGGLSE Benchmark  --

Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
4096       0.56       2.98    5.3260
5120       0.89       4.89    5.5167
6144       1.29       7.40    5.7243
7168       1.85      10.51    5.6757
8192       2.53      14.53    5.7317

     -- SGESV Benchmark  --

Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
4096       0.32       0.44    1.3760
5120       0.50       0.74    1.4752
6144       0.74       1.22    1.6577
7168       1.04       1.86    1.7908
8192       1.41       2.91    2.0671

     -- SGESVD Benchmark  --

Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
4096      28.69      70.78    2.4667
5120      46.30     120.94    2.6122
6144      70.39     181.81    2.5828
7168      98.30     279.28    2.8411
8192     195.18     371.59    1.9038


I will use CULA routines for double precision and double-complex calculations in my code. With such good hardware, I expected much better results and speedups, at least for single precision. Is there an explanation for this, or am I missing something? If these results are correct, I would expect the speedups to be even worse for double precision or complex data.

Thanks,

D.

Re: CULA benchmark

Post by kyle » Wed Jun 22, 2011 7:27 am

We have some benchmarks with double complex performance:
http://www.culatools.com/features/performance

Also note that the benchmark doesn't use both of your GPUs.
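
For context on the single-precision timings in the first post, here is a rough sketch that converts the 8192 rows for SGETRF and SGEQRF into GFLOP/s. It assumes the benchmark matrices are square (n x n) and uses the standard leading-order flop counts (2/3 n^3 for LU, 4/3 n^3 for Householder QR). Neither assumption is stated in the benchmark output, and the names `FLOPS`, `TIMINGS`, `gflops`, and `summarize` are made up for this illustration.

```python
# Leading-order flop counts for an n x n matrix. These are assumptions:
# the benchmark output does not state the matrix shape or a flop model.
FLOPS = {
    "SGETRF": lambda n: (2.0 / 3.0) * n**3,  # LU factorization
    "SGEQRF": lambda n: (4.0 / 3.0) * n**3,  # Householder QR factorization
}

# (size, CULA seconds, MKL seconds), copied from the 8192 rows posted above.
TIMINGS = {
    "SGETRF": (8192, 1.07, 2.88),
    "SGEQRF": (8192, 1.89, 5.43),
}


def gflops(routine, n, seconds):
    """Estimated throughput in GFLOP/s for one run of `routine`."""
    return FLOPS[routine](n) / seconds / 1e9


def summarize():
    """Return estimated GFLOP/s and recomputed speedup per routine."""
    rows = {}
    for routine, (n, cula_s, mkl_s) in TIMINGS.items():
        rows[routine] = {
            "cula_gflops": gflops(routine, n, cula_s),
            "mkl_gflops": gflops(routine, n, mkl_s),
            "speedup": mkl_s / cula_s,
        }
    return rows


if __name__ == "__main__":
    for routine, r in summarize().items():
        print(
            f"{routine}: CULA {r['cula_gflops']:.0f} GFLOP/s, "
            f"MKL {r['mkl_gflops']:.0f} GFLOP/s, "
            f"speedup {r['speedup']:.2f}x"
        )
```

The speedups recomputed this way differ slightly from the Speedup column, most likely because the printed times are rounded to two decimals. The CULA figures describe a single GPU, since the benchmark does not use both cards.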

