Running Cula examples


Postby yummysteak » Sun Jul 31, 2011 5:23 am

I am having trouble running the examples that come with CULA R12.
My system is running 64-bit Windows 7.
When I ran benchmark.exe from the examples folder, it complained that it couldn't find one of the CULA DLLs, so I copied the DLLs from the bin folder to the system32 folder. It now says that the application was unable to start correctly (0xc000007b). I had similar problems with the basic usage and system solve solutions in Visual Studio 2010. When I build the projects I get:
Code:
c:\program files\cula\r12\examples\systemsolve\systemsolve.c(224): warning C4013: 'sqrt' undefined; assuming extern returning int
  systemSolve_vc9.vcxproj -> C:\Program Files\CULA\R12\examples\systemSolve\Debug\systemSolve.exe
  C:\Program Files\CULA\R12\bin\cudart32_40_1732_*.dll
          0 file(s) copied.
  The system cannot find the file specified.
  C:\Program Files\CULA\R12\bin\cublas32_40_1732_*.dll
          0 file(s) copied.
  The system cannot find the file specified.
          1 file(s) copied.
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========

If I run the project as is, I get an error that it can't find cudart32_40_17.dll, so I copied the DLLs into the Debug folder. I now get an error saying "CUDA error: initialization error (See Programmer's guide)" after it prints that it is allocating matrices and initializing CULA.

I'd greatly appreciate your help, and would be happy to provide any additional information you may need.

Ben
yummysteak
 
Posts: 3
Joined: Sun Jul 17, 2011 11:55 pm

Re: Running Cula examples

Postby yummysteak » Sun Jul 31, 2011 6:18 am

I installed CULA on another computer (also 64-bit Windows 7) in the meantime, and I managed to get the basic usage example to run. However, the benchmark executable gives me the same errors. When I run the system solve I get further than before, but I think I am still experiencing problems:
Code:
ed:  i=258  X[i]=(24081.000000,2995.000000)  B[i]=(24081.000000,2995.000000)Resu
lt check failed:  i=259  X[i]=(2678.000000,24676.000000)  B[i]=(2678.000000,2467
6.000000)Result check failed:  i=260  X[i]=(27753.000000,20899.000000)  B[i]=(27
753.000000,20899.000000)Result check failed:  i=261  X[i]=(11784.000000,15565.00
0000)  B[i]=(11784.000000,15565.000000)Result check failed:  i=262  X[i]=(3093.0
00000,13608.000000)  B[i]=(3093.000000,13608.000000)Result check failed:  i=263
X[i]=(6172.000000,11243.000000)  B[i]=(6172.000000,11243.000000)Result check fa
iled:  i=264  X[i]=(29929.000000,7514.000000)  B[i]=(29929.000000,7514.000000)Re
sult check failed:  i=265  X[i]=(10168.000000,5055.000000)  B[i]=(10168.000000,5
055.000000)Result check failed:  i=266  X[i]=(11191.000000,5973.000000)  B[i]=(1
1191.000000,5973.000000)Result check failed:  i=267  X[i]=(8922.000000,6748.0000
00)  B[i]=(8922.000000,6748.000000)Result check failed:  i=268  X[i]=(5651.00000
0,10986.000000)  B[i]=(5651.000000,10986.000000)Result check failed:  i=269  X[i
]=(2144.000000,16446.000000)  B[i]=(2144.000000,16446.000000)Result check failed
:  i=270  X[i]=(31577.000000,26517.000000)  B[i]=(31577.000000,26517.000000)Resu
lt check failed:  i=271  X[i]=(14629.000000,29916.000000)  B[i]=(14629.000000,29
916.000000)Result check failed:  i=272  X[i]=(5874.000000,15791.000000)  B[i]=(5
874.000000,15791.000000)Result check failed:  i=273  X[i]=(15469.000000,22912.00
0000)  B[i]=(15469.000000,22912.000000)Result check failed:  i=274  X[i]=(8146.0
00000,30693.000000)  B[i]=(8146.000000,30693.000000)Result check failed:  i=275
X[i]=(9091.000000,9815.000000)  B[i]=(9091.000000,9815.000000)Result check fail
ed:  i=276  X[i]=(26949.000000,26857.000000)  B[i]=(26949.000000,26857.000000)Re
sult check failed:  i=277  X[i]=(20640.000000,26052.000000)  B[i]=(20640.000000,
26052.000000)Result check failed:  i=278  X[i]=(236.000000,8551.000000)  B[i]=(2
36.000000,8551.000000)Result check failed:  i=279  X[i]=(9487.000000,31226.00000
0)  B[i]=(9487.000000,31226.000000)Result check failed:  i=280  X[i]=(28162.0000
00,16955.000000)  B[i]=(28162.000000,16955.000000)Result check failed:  i=281  X
[i]=(23183.000000,8394.000000)  B[i]=(23183.000000,8394.000000)Result check fail
ed:  i=282  X[i]=(30180.000000,16097.000000)  B[i]=(30180.000000,16097.000000)Re
sult check failed:  i=283  X[i]=(3065.000000,27065.000000)  B[i]=(3065.000000,27
065.000000)Result check failed:  i=284  X[i]=(2513.000000,9261.000000)  B[i]=(25
13.000000,9261.000000)Result check failed:  i=285  X[i]=(12578.000000,21078.0000
00)  B[i]=(12578.000000,21078.000000)Result check failed:  i=286  X[i]=(16878.00
0000,14140.000000)  B[i]=(16878.000000,14140.000000)Result check failed:  i=287
X[i]=(4611.000000,31947.000000)  B[i]=(4611.000000,31947.000000)Result check fa
iled:  i=288  X[i]=(2445.000000,170.000000)  B[i]=(2445.000000,170.000000)Result
check failed:  i=289  X[i]=(29975.000000,13489.000000)  B[i]=(29975.000000,1348
9.000000)Result check failed:  i=290  X[i]=(24750.000000,6149.000000)  B[i]=(247
50.000000,6149.000000)Result check failed:  i=291  X[i]=(3333.000000,13865.00000
0)  B[i]=(3333.000000,13865.000000)Result check failed:  i=292  X[i]=(22214.0000
00,17282.000000)  B[i]=(22214.000000,17282.000000)Result check failed:  i=293  X
[i]=(27007.000000,27432.000000)  B[i]=(27007.000000,27432.000000)Result check fa
iled:  i=294  X[i]=(8896.000000,16367.000000)  B[i]=(8896.000000,16367.000000)Re
sult check failed:  i=295  X[i]=(28522.000000,4882.000000)  B[i]=(28522.000000,4
882.000000)Result check failed:  i=296  X[i]=(31810.000000,17641.000000)  B[i]=(
31810.000000,17641.000000)Result check failed:  i=297  X[i]=(7231.000000,2187.00
0000)  B[i]=(7231.000000,2187.000000)Result check failed:  i=298  X[i]=(6705.000
000,6479.000000)  B[i]=(6705.000000,6479.000000)Result check failed:  i=299  X[i
]=(6321.000000,6538.000000)  B[i]=(6321.000000,6538.000000)Result check failed:
i=300  X[i]=(31351.000000,19447.000000)  B[i]=(31351.000000,19447.000000)Result
check failed:  i=301  X[i]=(24208.000000,9646.000000)  B[i]=(24208.000000,9646.
000000)Result check failed:  i=302  X[i]=(22276.000000,25759.000000)  B[i]=(2227
6.000000,25759.000000)Result check failed:  i=303  X[i]=(30189.000000,30422.0000
00)  B[i]=(30189.000000,30422.000000)Result check failed:  i=304  X[i]=(27666.00
0000,8486.000000)  B[i]=(27666.000000,8486.000000)Result check failed:  i=305  X
[i]=(3455.000000,2028.000000)  B[i]=(3455.000000,2028.000000)Result check failed
:  i=306  X[i]=(29614.000000,4860.000000)  B[i]=(29614.000000,4860.000000)Result
check failed:  i=307  X[i]=(29253.000000,11777.000000)  B[i]=(29253.000000,1177
7.000000)Result check failed:  i=308  X[i]=(31348.000000,12503.000000)  B[i]=(31
348.000000,12503.000000)Result check failed:  i=309  X[i]=(10861.000000,22431.00
0000)  B[i]=(10861.000000,22431.000000)Result check failed:  i=310  X[i]=(29082.
000000,12455.000000)  B[i]=(29082.000000,12455.000000)Result check failed:  i=31
1  X[i]=(14197.000000,22106.000000)  B[i]=(14197.000000,22106.000000)Result chec
k failed:  i=312  X[i]=(8752.000000,15821.000000)  B[i]=(8752.000000,15821.00000
0)Result check failed:  i=313  X[i]=(17296.000000,26281.000000)  B[i]=(17296.000
000,26281.000000)Result check failed:  i=314  X[i]=(26021.000000,24455.000000)
B[i]=(26021.000000,24455.000000)Result check failed:  i=315  X[i]=(15947.000000,
27124.000000)  B[i]=(15947.000000,27124.000000)Result check failed:  i=316  X[i]
=(18318.000000,9135.000000)  B[i]=(18318.000000,9135.000000)Result check failed:
  i=317  X[i]=(11376.000000,1774.000000)  B[i]=(11376.000000,1774.000000)Result
check failed:  i=318  X[i]=(29859.000000,24998.000000)  B[i]=(29859.000000,24998
.000000)Result check failed:  i=319  X[i]=(12074.000000,9253.000000)  B[i]=(1207
4.000000,9253.000000)Result check failed:  i=320  X[i]=(6922.000000,10635.000000
)  B[i]=(6922.000000,10635.000000)Result check failed:  i=321  X[i]=(1643.000000
,28888.000000)  B[i]=(1643.000000,28888.000000)Result check failed:  i=322  X[i]
=(8153.000000,13232.000000)  B[i]=(8153.000000,13232.000000)Result check failed:
  i=323  X[i]=(4747.000000,28680.000000)  B[i]=(4747.000000,28680.000000)Result
check failed:  i=324  X[i]=(19926.000000,25678.000000)  B[i]=(19926.000000,25678
.000000)Result check failed:  i=325  X[i]=(6450.000000,14801.000000)  B[i]=(6450
.000000,14801.000000)Result check failed:  i=326  X[i]=(24961.000000,14199.00000
0)  B[i]=(24961.000000,14199.000000)Result check failed:  i=327  X[i]=(20855.000
000,26363.000000)  B[i]=(20855.000000,26363.000000)Result check failed:  i=328
X[i]=(5716.000000,10573.000000)  B[i]=(5716.000000,10573.000000)Result check fai
led:  i=329  X[i]=(31561.000000,23245.000000)  B[i]=(31561.000000,23245.000000)R
esult check failed:  i=330  X[i]=(6473.000000,28274.000000)  B[i]=(6473.000000,2
8274.000000)Result check failed:  i=331  X[i]=(1550.000000,24353.000000)  B[i]=(
1550.000000,24353.000000)Result check failed:  i=332  X[i]=(1181.000000,4287.000
000)  B[i]=(1181.000000,4287.000000)Result check failed:  i=333  X[i]=(2699.0000
00,18110.000000)  B[i]=(2699.000000,18110.000000)Result check failed:  i=334  X[
i]=(18643.000000,17465.000000)  B[i]=(18643.000000,17465.000000)Result check fai
led:  i=335  X[i]=(7172.000000,2529.000000)  B[i]=(7172.000000,2529.000000)Resul
t check failed:  i=336  X[i]=(9981.000000,2112.000000)  B[i]=(9981.000000,2112.0
00000)Result check failed:  i=337  X[i]=(13476.000000,4381.000000)  B[i]=(13476.
000000,4381.000000)Result check failed:  i=338  X[i]=(8247.000000,26890.000000)
B[i]=(8247.000000,26890.000000)Result check failed:  i=339  X[i]=(16671.000000,
8805.000000)  B[i]=(16671.000000,8805.000000)Result check failed:  i=340  X[i]=(
32372.000000,30032.000000)  B[i]=(32372.000000,30032.000000)Result check failed:
  i=341  X[i]=(3989.000000,9320.000000)  B[i]=(3989.000000,9320.000000)Result ch
eck failed:  i=342  X[i]=(23165.000000,15431.000000)  B[i]=(23165.000000,15431.0
00000)Result check failed:  i=343  X[i]=(9658.000000,11293.000000)  B[i]=(9658.0
00000,11293.000000)Result check failed:  i=344  X[i]=(17206.000000,26578.000000)
  B[i]=(17206.000000,26578.000000)Result check failed:  i=345  X[i]=(16948.00000
0,2206.000000)  B[i]=(16948.000000,2206.000000)Result check failed:  i=346  X[i]
=(27171.000000,18166.000000)  B[i]=(27171.000000,18166.000000)Result check faile
d:  i=347  X[i]=(3396.000000,16697.000000)  B[i]=(3396.000000,16697.000000)Resul
t check failed:  i=348  X[i]=(31020.000000,23694.000000)  B[i]=(31020.000000,236
94.000000)Result check failed:  i=349  X[i]=(15529.000000,14788.000000)  B[i]=(1
5529.000000,14788.000000)Result check failed:  i=350  X[i]=(30109.000000,17984.0
00000)  B[i]=(30109.000000,17984.000000)Result check failed:  i=351  X[i]=(11969
.000000,28978.000000)  B[i]=(11969.000000,28978.000000)Result check failed:  i=3
52  X[i]=(21617.000000,4015.000000)  B[i]=(21617.000000,4015.000000)Result check
failed:  i=353  X[i]=(16626.000000,3684.000000)  B[i]=(16626.000000,3684.000000
)Result check failed:  i=354  X[i]=(9168.000000,17906.000000)  B[i]=(9168.000000
,17906.000000)Result check failed:  i=355  X[i]=(25928.000000,12097.000000)  B[i
]=(25928.000000,12097.000000)Result check failed:  i=356  X[i]=(28118.000000,243
90.000000)  B[i]=(28118.000000,24390.000000)Result check failed:  i=357  X[i]=(1
5199.000000,11785.000000)  B[i]=(15199.000000,11785.000000)Result check failed:
i=358  X[i]=(14486.000000,19199.000000)  B[i]=(14486.000000,19199.000000)Result
check failed:  i=359  X[i]=(12420.000000,20710.000000)  B[i]=(12420.000000,2071
0.000000)Result check failed:  i=360  X[i]=(18271.000000,15813.000000)  B[i]=(18
271.000000,15813.000000)Result check failed:  i=361  X[i]=(27415.000000,6085.000
000)  B[i]=(27415.000000,6085.000000)Result check failed:  i=362  X[i]=(318.0000
00,3580.000000)  B[i]=(318.000000,3580.000000)Result check failed:  i=363  X[i]=
(1331.000000,7267.000000)  B[i]=(1331.000000,7267.000000)Result check failed:  i
=364  X[i]=(8387.000000,13444.000000)  B[i]=(8387.000000,13444.000000)Result che
ck failed:  i=365  X[i]=(23186.000000,14507.000000)  B[i]=(23186.000000,14507.00
0000)Result check failed:  i=366  X[i]=(4360.000000,17827.000000)  B[i]=(4360.00
0000,17827.000000)Result check failed:  i=367  X[i]=(28074.000000,26431.000000)
B[i]=(28074.000000,26431.000000)Result check failed:  i=368  X[i]=(7152.000000,
30271.000000)  B[i]=(7152.000000,30271.000000)Result check failed:  i=369  X[i]=
(10268.000000,4693.000000)  B[i]=(10268.000000,4693.000000)Result check failed:
i=370  X[i]=(19885.000000,337.000000)  B[i]=(19885.000000,337.000000)Result che
ck failed:  i=371  X[i]=(31311.000000,17604.000000)  B[i]=(31311.000000,17604.00
0000)Result check failed:  i=372  X[i]=(12677.000000,406.000000)  B[i]=(12677.00
0000,406.000000)Result check failed:  i=373  X[i]=(7768.000000,29022.000000)  B[
i]=(7768.000000,29022.000000)Result check failed:  i=374  X[i]=(19413.000000,500
0.000000)  B[i]=(19413.000000,5000.000000)Result check failed:  i=375  X[i]=(542
.000000,17537.000000)  B[i]=(542.000000,17537.000000)Result check failed:  i=376
  X[i]=(30038.000000,21388.000000)  B[i]=(30038.000000,21388.000000)Result check
failed:  i=377  X[i]=(7355.000000,13289.000000)  B[i]=(7355.000000,13289.000000
)Result check failed:  i=378  X[i]=(31647.000000,3181.000000)  B[i]=(31647.00000
0,3181.000000)Result check failed:  i=379  X[i]=(13093.000000,16584.000000)  B[i
]=(13093.000000,16584.000000)Result check failed:  i=380  X[i]=(10987.000000,107
61.000000)  B[i]=(10987.000000,10761.000000)Result check failed:  i=381  X[i]=(2
0493.000000,8217.000000)  B[i]=(20493.000000,8217.000000)Result check failed:  i
=382  X[i]=(9501.000000,17482.000000)  B[i]=(9501.000000,17482.000000)Result che
ck failed:  i=383  X[i]=(29447.000000,15665.000000)  B[i]=(29447.000000,15665.00
0000)Result check failed:  i=384  X[i]=(10753.000000,22104.000000)  B[i]=(10753.
000000,22104.000000)Result check failed:  i=385  X[i]=(15084.000000,19095.000000
)  B[i]=(15084.000000,19095.000000)Result check failed:  i=386  X[i]=(13525.0000
00,30221.000000)  B[i]=(13525.000000,30221.000000)Result check failed:  i=387  X
[i]=(3964.000000,21781.000000)  B[i]=(3964.000000,21781.000000)Result check fail
ed:  i=388  X[i]=(4872.000000,8106.000000)  B[i]=(4872.000000,8106.000000)Result
check failed:  i=389  X[i]=(3656.000000,3343.000000)  B[i]=(3656.000000,3343.00
0000)Result check failed:  i=390  X[i]=(22593.000000,27080.000000)  B[i]=(22593.
000000,27080.000000)Result check failed:  i=391  X[i]=(16080.000000,14868.000000
)  B[i]=(16080.000000,14868.000000)Result check failed:  i=392  X[i]=(21411.0000
00,13713.000000)  B[i]=(21411.000000,13713.000000)Result check failed:  i=393  X
[i]=(20968.000000,3251.000000)  B[i]=(20968.000000,3251.000000)Result check fail
ed:  i=394  X[i]=(27216.000000,12079.000000)  B[i]=(27216.000000,12079.000000)Re
sult check failed:  i=395  X[i]=(28768.000000,17040.000000)  B[i]=(28768.000000,
17040.000000)Result check failed:  i=396  X[i]=(31531.000000,12933.000000)  B[i]
=(31531.000000,12933.000000)Result check failed:  i=397  X[i]=(23779.000000,2066
3.000000)  B[i]=(23779.000000,20663.000000)Result check failed:  i=398  X[i]=(12
259.000000,26653.000000)  B[i]=(12259.000000,26653.000000)Result check failed:
i=399  X[i]=(27936.000000,2095.000000)  B[i]=(27936.000000,2095.000000)Result ch
eck failed:  i=400  X[i]=(24365.000000,11874.000000)  B[i]=(24365.000000,11874.0
00000)Result check failed:  i=401  X[i]=(7720.000000,26835.000000)  B[i]=(7720.0
00000,26835.000000)Result check failed:  i=402  X[i]=(25680.000000,8976.000000)
B[i]=(25680.000000,8976.000000)Result check failed:  i=403  X[i]=(18455.000000,
5725.000000)  B[i]=(18455.000000,5725.000000)Result check failed:  i=404  X[i]=(
4071.000000,24808.000000)  B[i]=(4071.000000,24808.000000)Result check failed:
i=405  X[i]=(13559.000000,9156.000000)  B[i]=(13559.000000,9156.000000)Result ch
eck failed:  i=406  X[i]=(5602.000000,17832.000000)  B[i]=(5602.000000,17832.000
000)Result check failed:  i=407  X[i]=(7905.000000,10440.000000)  B[i]=(7905.000
000,10440.000000)Result check failed:  i=408  X[i]=(7375.000000,21562.000000)  B
[i]=(7375.000000,21562.000000)Result check failed:  i=409  X[i]=(22885.000000,21
962.000000)  B[i]=(22885.000000,21962.000000)Result check failed:  i=410  X[i]=(
21080.000000,1836.000000)  B[i]=(21080.000000,1836.000000)Result check failed:
i=411  X[i]=(10797.000000,31202.000000)  B[i]=(10797.000000,31202.000000)Result
check failed:  i=412  X[i]=(10508.000000,10080.000000)  B[i]=(10508.000000,10080
.000000)Result check failed:  i=413  X[i]=(5340.000000,12076.000000)  B[i]=(5340
.000000,12076.000000)Result check failed:  i=414  X[i]=(9058.000000,31493.000000
)  B[i]=(9058.000000,31493.000000)Result check failed:  i=415  X[i]=(7740.000000
,8546.000000)  B[i]=(7740.000000,8546.000000)Result check failed:  i=416  X[i]=(
20474.000000,24773.000000)  B[i]=(20474.000000,24773.000000)Result check failed:
  i=417  X[i]=(19097.000000,8880.000000)  B[i]=(19097.000000,8880.000000)Result
check failed:  i=418  X[i]=(23335.000000,11072.000000)  B[i]=(23335.000000,11072
.000000)Result check failed:  i=419  X[i]=(23400.000000,707.000000)  B[i]=(23400
.000000,707.000000)Result check failed:  i=420  X[i]=(22955.000000,20666.000000)
  B[i]=(22955.000000,20666.000000)Result check failed:  i=421  X[i]=(4141.000000
,23588.000000)  B[i]=(4141.000000,23588.000000)Result check failed:  i=422  X[i]
=(12481.000000,17168.000000)  B[i]=(12481.000000,17168.000000)Result check faile
d:  i=423  X[i]=(28315.000000,19396.000000)  B[i]=(28315.000000,19396.000000)Res
ult check failed:  i=424  X[i]=(16225.000000,1009.000000)  B[i]=(16225.000000,10
09.000000)Result check failed:  i=425  X[i]=(22012.000000,18136.000000)  B[i]=(2
2012.000000,18136.000000)Result check failed:  i=426  X[i]=(11455.000000,18762.0
00000)  B[i]=(11455.000000,18762.000000)Result check failed:  i=427  X[i]=(25043
.000000,742.000000)  B[i]=(25043.000000,742.000000)Result check failed:  i=428
X[i]=(21.000000,17922.000000)  B[i]=(21.000000,17922.000000)Result check failed:
  i=429  X[i]=(24512.000000,9248.000000)  B[i]=(24512.000000,9248.000000)Result
check failed:  i=430  X[i]=(26018.000000,27368.000000)  B[i]=(26018.000000,27368
.000000)Result check failed:  i=431  X[i]=(23717.000000,9714.000000)  B[i]=(2371
7.000000,9714.000000)Result check failed:  i=432  X[i]=(17650.000000,13290.00000
0)  B[i]=(17650.000000,13290.000000)Result check failed:  i=433  X[i]=(3335.0000
00,12759.000000)  B[i]=(3335.000000,12759.000000)Result check failed:  i=434  X[
i]=(3169.000000,21895.000000)  B[i]=(3169.000000,21895.000000)Result check faile
d:  i=435  X[i]=(5303.000000,22640.000000)  B[i]=(5303.000000,22640.000000)Resul
t check failed:  i=436  X[i]=(21979.000000,24199.000000)  B[i]=(21979.000000,241
99.000000)Result check failed:  i=437  X[i]=(29105.000000,24791.000000)  B[i]=(2
9105.000000,24791.000000)Result check failed:  i=438  X[i]=(18661.000000,8681.00
0000)  B[i]=(18661.000000,8681.000000)Result check failed:  i=439  X[i]=(3652.00
0000,8753.000000)  B[i]=(3652.000000,8753.000000)Result check failed:  i=440  X[
i]=(24033.000000,32029.000000)  B[i]=(24033.000000,32029.000000)Result check fai
led:  i=441  X[i]=(15987.000000,7042.000000)  B[i]=(15987.000000,7042.000000)Res
ult check failed:  i=442  X[i]=(26253.000000,20083.000000)  B[i]=(26253.000000,2
0083.000000)Result check failed:  i=443  X[i]=(11420.000000,15814.000000)  B[i]=
(11420.000000,15814.000000)Result check failed:  i=444  X[i]=(32718.000000,12244
.000000)  B[i]=(32718.000000,12244.000000)Result check failed:  i=445  X[i]=(310
63.000000,7229.000000)  B[i]=(31063.000000,7229.000000)Result check failed:  i=4
46  X[i]=(20652.000000,18864.000000)  B[i]=(20652.000000,18864.000000)Result che
ck failed:  i=447  X[i]=(4769.000000,30470.000000)  B[i]=(4769.000000,30470.0000
00)Result check failed:  i=448  X[i]=(15005.000000,21047.000000)  B[i]=(15005.00
0000,21047.000000)Result check failed:  i=449  X[i]=(1594.000000,21487.000000)
B[i]=(1594.000000,21487.000000)Result check failed:  i=450  X[i]=(24326.000000,3
276.000000)  B[i]=(24326.000000,3276.000000)Result check failed:  i=451  X[i]=(2
1323.000000,6540.000000)  B[i]=(21323.000000,6540.000000)Result check failed:  i
=452  X[i]=(7679.000000,23990.000000)  B[i]=(7679.000000,23990.000000)Result che
ck failed:  i=453  X[i]=(32588.000000,24710.000000)  B[i]=(32588.000000,24710.00
0000)Result check failed:  i=454  X[i]=(29271.000000,17945.000000)  B[i]=(29271.
000000,17945.000000)Result check failed:  i=455  X[i]=(29221.000000,28470.000000
)  B[i]=(29221.000000,28470.000000)Result check failed:  i=456  X[i]=(20183.0000
00,23589.000000)  B[i]=(20183.000000,23589.000000)Result check failed:  i=457  X
[i]=(23955.000000,4978.000000)  B[i]=(23955.000000,4978.000000)Result check fail
ed:  i=458  X[i]=(24779.000000,5006.000000)  B[i]=(24779.000000,5006.000000)Resu
lt check failed:  i=459  X[i]=(13262.000000,20135.000000)  B[i]=(13262.000000,20
135.000000)Result check failed:  i=460  X[i]=(23487.000000,27196.000000)  B[i]=(
23487.000000,27196.000000)Result check failed:  i=461  X[i]=(29033.000000,2088.0
00000)  B[i]=(29033.000000,2088.000000)Result check failed:  i=462  X[i]=(12935.
000000,19779.000000)  B[i]=(12935.000000,19779.000000)Result check failed:  i=46
3  X[i]=(15993.000000,14790.000000)  B[i]=(15993.000000,14790.000000)Result chec
k failed:  i=464  X[i]=(24962.000000,18965.000000)  B[i]=(24962.000000,18965.000
000)Result check failed:  i=465  X[i]=(11001.000000,19105.000000)  B[i]=(11001.0
00000,19105.000000)Result check failed:  i=466  X[i]=(11807.000000,24567.000000)
  B[i]=(11807.000000,24567.000000)Result check failed:  i=467  X[i]=(2669.000000
,3134.000000)  B[i]=(2669.000000,3134.000000)Result check failed:  i=468  X[i]=(
32671.000000,1457.000000)  B[i]=(32671.000000,1457.000000)Result check failed:
i=469  X[i]=(12998.000000,3545.000000)  B[i]=(12998.000000,3545.000000)Result ch
eck failed:  i=470  X[i]=(13597.000000,14218.000000)  B[i]=(13597.000000,14218.0
00000)Result check failed:  i=471  X[i]=(8838.000000,14844.000000)  B[i]=(8838.0
00000,14844.000000)Result check failed:  i=472  X[i]=(7372.000000,8563.000000)
B[i]=(7372.000000,8563.000000)Result check failed:  i=473  X[i]=(21028.000000,29
264.000000)  B[i]=(21028.000000,29264.000000)Result check failed:  i=474  X[i]=(
28801.000000,14723.000000)  B[i]=(28801.000000,14723.000000)Result check failed:
  i=475  X[i]=(13490.000000,7604.000000)  B[i]=(13490.000000,7604.000000)Result
check failed:  i=476  X[i]=(31601.000000,24227.000000)  B[i]=(31601.000000,24227
.000000)Result check failed:  i=477  X[i]=(11197.000000,23692.000000)  B[i]=(111
97.000000,23692.000000)Result check failed:  i=478  X[i]=(19771.000000,20363.000
000)  B[i]=(19771.000000,20363.000000)Result check failed:  i=479  X[i]=(29301.0
00000,22363.000000)  B[i]=(29301.000000,22363.000000)Result check failed:  i=480
  X[i]=(7721.000000,3565.000000)  B[i]=(7721.000000,3565.000000)Result check fai
led:  i=481  X[i]=(17421.000000,23445.000000)  B[i]=(17421.000000,23445.000000)R
esult check failed:  i=482  X[i]=(18610.000000,495.000000)  B[i]=(18610.000000,4
95.000000)Result check failed:  i=483  X[i]=(16741.000000,15022.000000)  B[i]=(1
6741.000000,15022.000000)Result check failed:  i=484  X[i]=(31812.000000,29151.0
00000)  B[i]=(31812.000000,29151.000000)Result check failed:  i=485  X[i]=(23015
.000000,8055.000000)  B[i]=(23015.000000,8055.000000)Result check failed:  i=486
  X[i]=(3393.000000,8738.000000)  B[i]=(3393.000000,8738.000000)Result check fai
led:  i=487  X[i]=(15279.000000,19882.000000)  B[i]=(15279.000000,19882.000000)R
esult check failed:  i=488  X[i]=(1608.000000,12654.000000)  B[i]=(1608.000000,1
2654.000000)Result check failed:  i=489  X[i]=(3822.000000,32707.000000)  B[i]=(
3822.000000,32707.000000)Result check failed:  i=490  X[i]=(24245.000000,1338.00
0000)  B[i]=(24245.000000,1338.000000)Result check failed:  i=491  X[i]=(144.000
000,22290.000000)  B[i]=(144.000000,22290.000000)Result check failed:  i=492  X[
i]=(31339.000000,23154.000000)  B[i]=(31339.000000,23154.000000)Result check fai
led:  i=493  X[i]=(24604.000000,4623.000000)  B[i]=(24604.000000,4623.000000)Res
ult check failed:  i=494  X[i]=(22225.000000,20078.000000)  B[i]=(22225.000000,2
0078.000000)Result check failed:  i=495  X[i]=(21724.000000,31981.000000)  B[i]=
(21724.000000,31981.000000)Result check failed:  i=496  X[i]=(2330.000000,29733.
000000)  B[i]=(2330.000000,29733.000000)Result check failed:  i=497  X[i]=(28223
.000000,20594.000000)  B[i]=(28223.000000,20594.000000)Result check failed:  i=4
98  X[i]=(29130.000000,18846.000000)  B[i]=(29130.000000,18846.000000)Result che
ck failed:  i=499  X[i]=(4987.000000,29445.000000)  B[i]=(4987.000000,29445.0000
00)Result check failed:  i=500  X[i]=(18805.000000,8616.000000)  B[i]=(18805.000
000,8616.000000)Result check failed:  i=501  X[i]=(5750.000000,20489.000000)  B[
i]=(5750.000000,20489.000000)Result check failed:  i=502  X[i]=(27338.000000,219
63.000000)  B[i]=(27338.000000,21963.000000)Result check failed:  i=503  X[i]=(2
8135.000000,14697.000000)  B[i]=(28135.000000,14697.000000)Result check failed:
i=504  X[i]=(32209.000000,21630.000000)  B[i]=(32209.000000,21630.000000)Result
check failed:  i=505  X[i]=(23224.000000,1908.000000)  B[i]=(23224.000000,1908.
000000)Result check failed:  i=506  X[i]=(26737.000000,24474.000000)  B[i]=(2673
7.000000,24474.000000)Result check failed:  i=507  X[i]=(31920.000000,27372.0000
00)  B[i]=(31920.000000,27372.000000)Result check failed:  i=508  X[i]=(10293.00
0000,3855.000000)  B[i]=(10293.000000,3855.000000)Result check failed:  i=509  X
[i]=(6734.000000,9561.000000)  B[i]=(6734.000000,9561.000000)Result check failed
:  i=510  X[i]=(31056.000000,27606.000000)  B[i]=(31056.000000,27606.000000)Resu
lt check failed:  i=511  X[i]=(8184.000000,7075.000000)  B[i]=(8184.000000,7075.
000000)Shutting down CULA

Press any key to continue . . .


I'm not really sure why the two computers are behaving differently. The first one is a laptop with a G210M; the new one is a desktop with a GTX 480 that I just reformatted.
yummysteak
 
Posts: 3
Joined: Sun Jul 17, 2011 11:55 pm

Re: Running Cula examples

Postby john » Tue Aug 02, 2011 6:12 am

You shouldn't copy the CULA DLLs to the system32 directory; it's best not to risk upsetting Windows. Instead, add %CULA_BIN_PATH_64% to your PATH environment variable. We don't modify PATH automatically for you because it has a length limit that we don't want to help exceed, but the entry is necessary for click-and-run functionality with benchmark.exe.

When building the benchmark, it will copy the files for you automatically and save you the headache. (More accurately, it will remove that headache and substitute for it the requirement for you to get MKL set up for linking. Either way.)

I think the warning is indicative and is probably the problem. The examples are provided for VS 2005 and 2008, but the use of 2010 is presently not guaranteed with these, so you might have to work to correct that warning first.

Also, do be sure that you're on CUDA 4.0 only, meaning that your CUDA toolkit and CUDA driver are both the latest 4.0 version from the CUDA download page.
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: Running Cula examples

Postby yummysteak » Wed Aug 03, 2011 3:16 am

Thank you very much for your reply. The benchmark executable now runs. However, I am only getting speedups of about 2x-5x, and I was expecting much larger ones. I would like to use CULA because of your built-in system solve, but I don't think these speedups will be enough. Is there some way to get better results? From what I read, the benchmark only uses one of my devices. Is there a way to have it leverage all of them, since I have two GTX 295s in this machine (4 devices)? Also, the system solve continues to say "Result check failed", although the failures are all coming from the complex example, and not from the regular example.

Ben
yummysteak
 
Posts: 3
Joined: Sun Jul 17, 2011 11:55 pm

Re: Running Cula examples

Postby john » Fri Aug 05, 2011 10:03 am

I can't comment completely without knowing the CPU you have paired with your GPU, but I do want to point out that CUBLAS does not have 100x performance compared to the Intel MKL that we benchmark against (except in cases where the CPU is grossly underpowered compared to the GPU). 6-10x is a more typical number for CUBLAS vs CPU. If you are seeing 100x, it is likely that one of the following is true:
* Your CPU is badly mismatched to your GPU, i.e. a low-end Core 2 chip against a GTX 580 GPU.
* The CUBLAS benchmark is flawed and neglects that CUBLAS is an asynchronous API. This would basically time the launch overhead but not the actual work done on the GPU.
* The reference CPU code is single-threaded, such as Netlib BLAS, and doesn't represent optimized CPU code. We benchmark against MKL because it is the strongest CPU code available.

If the system solve example is reporting that a check failed, it is likely that something is still a bit squirrely with your configuration/compilation.

To answer the last question, CULA presently uses 1 GPU only, but this may change in the future.

To get a better idea of how we expect CULA to run a solve on a typical system, please see the CGETRF chart on http://www.culatools.com/performance
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

