
CULA Dense is a GPU-accelerated implementation of dense linear algebra routines
Providing a wide set of LAPACK and BLAS capabilities
CULA Dense provides accelerated implementations of the most popular and essential routines for dense linear algebra in a prepackaged library. If you are already using LAPACK or BLAS in your existing codes, you can even use the library to get acceleration with absolutely no changes to your source code.
| LAPACK | LAPACK (cont.) | BLAS |
|---|---|---|
| LU factorization | Cholesky factorization | Matrix-matrix multiply |
| QR decomposition | Orthogonal factorization | Matrix-vector multiply |
| Least squares | System solve | Rank updates |
| Eigenvalue routines | Matrix inversion | Conjugate |
| Singular value decomposition | Auxiliary routines | Transpose |
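To make the "LU factorization" and "System solve" entries concrete, here is a minimal reference sketch in plain Python of what such a routine computes: an LU factorization with partial pivoting, followed by forward and back substitution. It is an illustration only, not CULA's implementation or API; the names `lu_factor` and `lu_solve` are hypothetical helpers chosen for this example.

```python
def lu_factor(a):
    """Factor a square matrix as P*A = L*U using partial pivoting.

    Returns (lu, piv): lu holds L below the diagonal (unit diagonal implied)
    and U on and above it; piv[k] is the row swapped with row k at step k.
    """
    n = len(a)
    lu = [[float(v) for v in row] for row in a]  # work on a copy
    piv = list(range(n))
    for k in range(n):
        # Pick the row with the largest magnitude in column k as the pivot.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        piv[k] = p
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
        # Eliminate entries below the pivot, storing the multipliers in place.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, piv


def lu_solve(lu, piv, b):
    """Solve A*x = b given the factorization returned by lu_factor."""
    n = len(lu)
    x = [float(v) for v in b]
    # Apply the recorded row swaps to the right-hand side.
    for k in range(n):
        if piv[k] != k:
            x[k], x[piv[k]] = x[piv[k]], x[k]
    # Forward substitution with the unit lower-triangular factor L.
    for i in range(n):
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    # Back substitution with the upper-triangular factor U.
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


# Example system whose exact solution is x = [1, 2, 3].
A = [[2, 1, 1], [4, -6, 0], [-2, 7, 2]]
b = [7, -8, 18]
lu, piv = lu_factor(A)
x = lu_solve(lu, piv, b)
```

The triple loop in `lu_factor` performs on the order of n³ arithmetic operations, which is why this class of routine is a natural target for GPU acceleration on large matrices.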
And offering your software superior performance
CULA Dense can run up to an order of magnitude faster than optimized CPU-based linear algebra solvers. Using CULA allows your software to simply run faster.
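A speedup figure like this is easiest to judge on your own workload. The sketch below shows one way to measure it in Python, assuming you have two callables to compare (a CPU baseline and an accelerated version). The helper names `best_time` and `speedup` are hypothetical and are not part of CULA.

```python
import time


def best_time(fn, *args, repeats=5):
    """Return the smallest wall-clock time in seconds over several runs of fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def speedup(baseline_seconds, accelerated_seconds):
    """Return baseline / accelerated; a value above 1 means the accelerated run was faster."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated time must be positive")
    return baseline_seconds / accelerated_seconds


# Usage with a stand-in workload; replace `sum` with the solver calls being compared.
t_baseline = best_time(sum, list(range(100000)))
```

For a fair comparison, time the same problem size on both paths and include any data transfer to and from the GPU in the accelerated measurement.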

While working with you, not against you.

Programmers can easily call CULA Dense from their C/C++, FORTRAN, MATLAB, or Python codes. CULA works with all GPUs supported by NVIDIA's CUDA and is built for all standard CUDA platforms so that you can be assured that your solution runs wherever you need it to.
More Information
» Review the programmer's guide
» See more performance charts
» See the full function list
» Learn about different interfaces
» Read the FAQ
» Downloads
» Visit the forums
