Poor cula performance
I have two servers running Ubuntu 10.04.2 LTS and CUDA 3.2. The first has a 4x Intel(R) Core(TM)2 Extreme CPU X9770 @ 3.20GHz with a GTX 285; the second has a 4x Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz with a GTX 480 and a C2050. When I run the benchmark (results below), I see considerably worse performance on the server with the Fermi cards. Also concerning is that the server with the Fermi cards takes roughly 36x longer to run SGESVD using MKL (4155.89 s versus 116.07 s at size 4096). Any ideas on tracking this down?
As a side note I tried to post this in the private support forum but I couldn't start a post.
Server with GTX 285
Initializing CULA...
Initializing MKL...

Benchmarking the following functions:
-------------------------------------
SGEQRF
SGETRF
SGELS
SGGLSE
SGESV
SGESVD
-------------------------------------

-- SGEQRF Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096       0.65       1.21    1.8576
  5120       1.17       2.23    1.9055
  6144       1.86       3.82    2.0579
  7168       2.87       6.05    2.1079
  8192       4.13      11.97    2.8967

-- SGETRF Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096       0.30       0.78    2.6007
  5120       0.53       1.39    2.6198
  6144       0.65       2.42    3.6913
  7168       1.34       3.73    2.7868
  8192       1.97       6.14    3.1113

-- SGELS Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096       0.87       1.63    1.8841
  5120       1.48       3.06    2.0755
  6144       2.31       5.41    2.3451
  7168       3.42       8.95    2.6173
  8192       4.81      13.89    2.8897

-- SGGLSE Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096       0.91       4.63    5.1123
  5120       1.55       7.47    4.8292
  6144       2.41      11.41    4.7292
  7168       3.59      16.68    4.6429
  8192       5.05      24.53    4.8588

-- SGESV Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096       0.44       0.81    1.8369
  5120       0.74       1.41    1.8999
  6144       0.95       2.31    2.4439
  7168       1.73       3.58    2.0754
  8192       2.47       5.88    2.3818

-- SGESVD Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096      22.49     116.07    5.1620
  5120      37.54     206.11    5.4901
  6144      51.61     338.14    6.5514
  7168      85.09     557.83    6.5558
  8192     120.41     776.76    6.4508
Server with GTX 480
Initializing CULA...
Initializing MKL...

Benchmarking the following functions:
-------------------------------------
SGEQRF
SGETRF
SGELS
SGGLSE
SGESV
SGESVD
-------------------------------------

-- SGEQRF Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096       6.18       3.62    0.5868
  5120       7.76       5.44    0.7017
  6144       9.26       8.03    0.8669
  7168       5.79      10.80    1.8631
  8192      12.54      21.74    1.7331

-- SGETRF Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096       1.47       1.57    1.0689
  5120       2.90       3.08    1.0616
  6144       3.39       5.18    1.5268
  7168       4.14       7.62    1.8390
  8192       5.22      11.07    2.1181

-- SGELS Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096       8.03       3.47    0.4319
  5120      14.38       5.99    0.4166
  6144      17.32       9.45    0.5457
  7168      19.75      12.76    0.6465
  8192      22.79      18.95    0.8317

-- SGGLSE Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096      11.62      11.47    0.9865
  5120      14.66      16.75    1.1427
  6144      17.59      24.75    1.4071
  7168      20.76      33.80    1.6283
  8192      24.22      43.62    1.8011

-- SGESV Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096       2.31       2.19    0.9465
  5120       3.20       2.83    0.8836
  6144       3.77       5.25    1.3941
  7168       2.22       7.85    3.5339
  8192       5.41      12.27    2.2703

-- SGESVD Benchmark --
  Size   CULA (s)    MKL (s)   Speedup
------ ---------- ---------- ---------
  4096     644.57    4155.89    6.4475
- cbv3b
- Posts: 3
- Joined: Wed Nov 10, 2010 2:05 pm
Re: Poor cula performance
Private support is for non-academic purchases only.
With regard to your speed, are you sure your benchmarks are running on the correct device? Are your servers under heavy load when you are benchmarking? Those numbers seem very low for both the CPU and GPU.
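One way to put numbers on the "both the CPU and GPU" observation is to divide the size-4096 timings from the two logs above, routine by routine. The sketch below is only illustrative: the dictionary and function names are made up for this example, and the values are copied by hand from the posted output.

```python
# Size-4096 timings in seconds, copied from the two benchmark logs above.
# Each entry maps routine -> (CULA seconds, MKL seconds).
# All names here are hypothetical and exist only for this example.
gtx285_server = {
    "SGEQRF": (0.65, 1.21),
    "SGETRF": (0.30, 0.78),
    "SGELS": (0.87, 1.63),
    "SGGLSE": (0.91, 4.63),
    "SGESV": (0.44, 0.81),
    "SGESVD": (22.49, 116.07),
}
gtx480_server = {
    "SGEQRF": (6.18, 3.62),
    "SGETRF": (1.47, 1.57),
    "SGELS": (8.03, 3.47),
    "SGGLSE": (11.62, 11.47),
    "SGESV": (2.31, 2.19),
    "SGESVD": (644.57, 4155.89),
}


def slowdown_ratios(fast_host, slow_host):
    """Return routine -> (CULA ratio, MKL ratio), each slow_host time / fast_host time."""
    ratios = {}
    for routine, (cula_fast, mkl_fast) in fast_host.items():
        cula_slow, mkl_slow = slow_host[routine]
        ratios[routine] = (cula_slow / cula_fast, mkl_slow / mkl_fast)
    return ratios


ratios = slowdown_ratios(gtx285_server, gtx480_server)
for routine, (gpu_ratio, cpu_ratio) in ratios.items():
    print(f"{routine:7s} CULA {gpu_ratio:5.1f}x slower   MKL {cpu_ratio:5.1f}x slower")
```

On these numbers the MKL column is between 2x and 3x slower on the GTX 480 server for most routines and about 36x slower for SGESVD, so the slowdown does not appear to be limited to the GPU path.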
- kyle
- Administrator
- Posts: 301
- Joined: Fri Jun 12, 2009 7:47 pm
Re: Poor cula performance
Yes, the benchmarks are running on the correct device, and top showed that only a few root threads were running. What concerned me was that the Fermi devices were so much slower than the Tesla-generation device (the GTX 285)...
- cbv3b
- Posts: 3
- Joined: Wed Nov 10, 2010 2:05 pm
Re: Poor cula performance
For single precision, there should be about a 20% performance increase between generations. For double precision, that increase is upwards of 100%.
I'd suggest running other CUDA SDK examples to further benchmark your performance.
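As a rough sanity check against that rule of thumb, the posted SGEQRF timings can be compared with what a 20% gain would predict. This sketch assumes that "20% performance increase" means 1.2x the throughput, so the expected time is the GTX 285 time divided by 1.2. That reading, and every name in the code, is an assumption made for illustration.

```python
# CULA timings in seconds for SGEQRF, copied from the two benchmark logs above.
sizes = [4096, 5120, 6144, 7168, 8192]
cula_gtx285 = [0.65, 1.17, 1.86, 2.87, 4.13]
cula_gtx480 = [6.18, 7.76, 9.26, 5.79, 12.54]

# Assumption: a 20% single-precision gain means throughput scales by 1.2.
EXPECTED_GAIN = 0.20


def expected_time(previous_gen_seconds, gain=EXPECTED_GAIN):
    """Estimate the newer card's run time if its throughput were (1 + gain) times higher."""
    return previous_gen_seconds / (1.0 + gain)


# Ratio of measured time to the estimate; values above 1 mean slower than expected.
shortfall = []
for size, old, new in zip(sizes, cula_gtx285, cula_gtx480):
    target = expected_time(old)
    shortfall.append(new / target)
    print(f"{size}: expected about {target:.2f} s, measured {new:.2f} s "
          f"({new / target:.1f}x slower than expected)")
```

Under this assumption every measured SGEQRF time on the GTX 480 server is more than twice the estimate, which is consistent with a configuration or system problem rather than a generational difference.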
- kyle
- Administrator
- Posts: 301
- Joined: Fri Jun 12, 2009 7:47 pm
Re: Poor cula performance
Looks like I'm having problems with some of the SDK routines too. I should have checked this first. Thanks.

- cbv3b
- Posts: 3
- Joined: Wed Nov 10, 2010 2:05 pm