Why using CULA in matlab is slower than MKL

Postby areslp » Wed Mar 13, 2013 6:48 pm

I'm using Ubuntu 12.04 64-bit with an Intel(R) Core(TM) i5-2410M CPU and an NVIDIA 540M card.
The CULA benchmark gives the following result:
l@l-Aspire-4750:/usr/local/cula/examples/benchmark$ optirun ./benchmark 4096 4096
Initializing CULA...
Initializing MKL...

Benchmarking the following functions:
-------------------------------------
SGEQRF
SGETRF
SGELS
SGGLSE
SGESV
SGESVD
SSYEV
DGEQRF
DGETRF
DGELS
DGGLSE
DGESV
DGESVD
DSYEV
-------------------------------------


-- SGEQRF Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 1.06 2.23 2.0893

-- SGETRF Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 0.54 1.01 1.8627

-- SGELS Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 1.26 2.17 1.7169

-- SGGLSE Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 1.36 3.36 2.4667

-- SGESV Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 0.60 0.93 1.5616

-- SGESVD Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 80.41 111.08 1.3813

-- SSYEV Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 7.22 7.22 1.0005

-- DGEQRF Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 4.84 10.25 2.1193

-- DGETRF Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 2.45 5.17 2.1090

-- DGELS Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 5.80 7.22 1.2441

-- DGGLSE Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 5.72 15.98 2.7939

-- DGESV Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 2.51 5.22 2.0807

-- DGESVD Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 135.17 221.88 1.6415

-- DSYEV Benchmark --

Size CULA (s) MKL (s) Speedup
------ ---------- ---------- ---------
4096 19.45 22.57 1.1605

These numbers look right. But when I link CULA into MATLAB 2012a, the CULA version becomes slower than MKL.

Result using MKL:
>> A=rand(4096);
>> save A.mat A;

>> load A;
>> version -lapack
cpu_id: x86 Family 6 Model 42 Stepping 7, GenuineIntel
libmwlapack: trying spec file...
libmwlapack: loading mkl.so
libmwlapack: loaded mkl.so@0x21b63b0
libmwlapack: mkl.so is not a compatibility layer.
libmwlapack: loading mklcompat.so
libmwlapack: loaded mklcompat.so@0x21b7100
libmwlapack: initializing compatibility layer mklcompat.so

ans =

Intel(R) Math Kernel Library Version 10.3.5 Product Build 20110720 for Intel(R) 64 architecture applications
Linear Algebra PACKage Version 3.3.1


>> version -blas

ans =

Intel(R) Math Kernel Library Version 10.3.5 Product Build 20110720 for Intel(R) 64 architecture applications


>> tic;svd(A);toc;
Elapsed time is 27.527998 seconds.
>>


Result using CULA:
>> load A;
>> tic;svd(A);toc;
cpu_id: x86 Family 6 Model 42 Stepping 7, GenuineIntel
libmwlapack: trying environment...
libmwlapack: loading /usr/local/cula/lib64/libcula_lapack_link.so
libmwlapack: loaded /usr/local/cula/lib64/libcula_lapack_link.so@0x243e620
libmwlapack: /usr/local/cula/lib64/libcula_lapack_link.so is not a compatibility layer.
Elapsed time is 38.682359 seconds.
>>

cula info: dgesvd (N, N, 4096, 4096, 0x7f418fffe020, 4096, 0x7f423dee1960, (nil), 4096, (nil), 4096)
cula info: issuing to CPU (work query)
cula info: CPU library is lapackcpu.so
cula info: work query returned 143360
cula info: done
cula info: dgesvd (N, N, 4096, 4096, 0x7f418fffe020, 4096, 0x7f423dee1960, (nil), 4096, (nil), 4096)
cula info: issuing to GPU (over threshold)
cula info: done
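
The debug log above already records which SVD variant MATLAB requested and where CULA sent it. As a rough sketch, the lines can be summarized with a few regular expressions. The log format here is inferred only from the sample in this post, and `summarize` is a hypothetical helper, not part of CULA.

```python
import re

# Log lines copied from this post; the format is assumed from this sample only.
LOG = """\
cula info: dgesvd (N, N, 4096, 4096, 0x7f418fffe020, 4096, 0x7f423dee1960, (nil), 4096, (nil), 4096)
cula info: issuing to CPU (work query)
cula info: CPU library is lapackcpu.so
cula info: work query returned 143360
cula info: done
cula info: dgesvd (N, N, 4096, 4096, 0x7f418fffe020, 4096, 0x7f423dee1960, (nil), 4096, (nil), 4096)
cula info: issuing to GPU (over threshold)
cula info: done
"""

# Routine name, the two job flags (jobu, jobvt) and the matrix dimensions.
CALL = re.compile(r"^cula info: (\w+) \((\w), (\w), (\d+), (\d+),")
# Where the call was dispatched and the reason given in parentheses.
TARGET = re.compile(r"^cula info: issuing to (\w+) \((.+)\)$")


def summarize(log_text):
    """Return (routine, jobu, jobvt, m, n, target, reason) for each logged call."""
    calls, current = [], None
    for line in log_text.splitlines():
        call = CALL.match(line)
        if call:
            current = (call.group(1), call.group(2), call.group(3),
                       int(call.group(4)), int(call.group(5)))
            continue
        target = TARGET.match(line)
        if target and current is not None:
            calls.append(current + (target.group(1), target.group(2)))
            current = None
    return calls


for entry in summarize(LOG):
    print(entry)
```

For this log, both entries show `dgesvd` called with job flags `N, N` on a 4096 x 4096 matrix: first a workspace query handled on the CPU, then the actual factorization issued to the GPU.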


The script to start matlab:
#!/bin/sh
export LAPACK_VERSION=/usr/local/cula/lib64/libcula_lapack_link.so
export BLAS_VERSION=/usr/local/cula/lib64/libcula_lapack_link.so
export CULA_ILP64=1
export LAPACK_VERBOSITY=1
export CULA_DEBUG_LOG=/home/l/cula.log
optirun ./matlab

Am I doing something wrong?
areslp
 
Posts: 2
Joined: Sat Jan 07, 2012 6:07 am

Re: Why using CULA in matlab is slower than MKL

Postby john » Thu Mar 14, 2013 7:02 am

Your two tests are exercising different code paths in SVD. One is N,N (no vectors) and the other is A,A (all vectors).

Keep in mind that your GPU is rather weak, so it might not produce great numbers.
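
To make the mismatch concrete, here is a small sketch that puts the timings quoted in the first post side by side. The numbers are copied from that post. The `speedup` helper is only illustrative, and it uses the same MKL-over-CULA ratio that the benchmark's Speedup column appears to report.

```python
# Timings in seconds, copied from the first post.
benchmark_dgesvd = {"cula": 135.17, "mkl": 221.88}   # CULA benchmark program, DGESVD
matlab_svd = {"cula": 38.682359, "mkl": 27.527998}   # MATLAB: tic; svd(A); toc


def speedup(times):
    """MKL time divided by CULA time; a value above 1 means CULA is faster."""
    return times["mkl"] / times["cula"]


print(f"benchmark DGESVD: {speedup(benchmark_dgesvd):.4f}")
print(f"MATLAB svd(A):    {speedup(matlab_svd):.4f}")
```

The benchmark ratio comes out above 1 and the MATLAB ratio below 1. The MKL times themselves also differ by roughly a factor of eight (221.88 s versus 27.5 s) on a 4096 x 4096 matrix, which is consistent with the two runs doing different amounts of work: the debug log shows MATLAB's `svd(A)` calling `dgesvd` with `N, N`, so the two sets of numbers are not directly comparable.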
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: Why using CULA in matlab is slower than MKL

Postby areslp » Sat Mar 16, 2013 3:13 am

Thanks. I changed 'AA' to 'NN' in the benchmark, and its performance is now consistent with what I see in MATLAB.
areslp
 
Posts: 2
Joined: Sat Jan 07, 2012 6:07 am

