15 Dec 2010
R10 BLAS Support
New in the R10 release is support for BLAS Level 3 routines, along with the Level 2 matrix-vector multiply (GEMV). We added these routines so that you can use CULA as a stand-alone linear algebra package, without needing several other packages to assemble a capable development system. Additionally, in many cases we have tuned these functions to run faster than their CUBLAS equivalents.
| Matrix Type | Operation | S | C | D | Z |
| General | Matrix-matrix multiply | SGEMM | CGEMM | DGEMM | ZGEMM |
| | Matrix-vector multiply | SGEMV | CGEMV | DGEMV | ZGEMV |
| Triangular | Triangular matrix-matrix multiply | STRMM | CTRMM | DTRMM | ZTRMM |
| | Triangular matrix solve | STRSM | CTRSM | DTRSM | ZTRSM |
| Symmetric | Symmetric matrix-matrix multiply | SSYMM | CSYMM | DSYMM | ZSYMM |
| | Symmetric rank 2k update | SSYR2K | CSYR2K | DSYR2K | ZSYR2K |
| | Symmetric rank k update | SSYRK | CSYRK | DSYRK | ZSYRK |
| Hermitian | Hermitian matrix-matrix multiply | | CHEMM | | ZHEMM |
| | Hermitian rank 2k update | | CHER2K | | ZHER2K |
| | Hermitian rank k update | | CHERK | | ZHERK |
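For readers less familiar with BLAS naming, the following is a minimal plain-C sketch of what the SGEMM entry in the table computes: C = alpha*A*B + beta*C on column-major, single-precision matrices. It is a CPU reference for illustration only, not CULA's implementation or its API; the function name `ref_sgemm` is hypothetical, and the sketch covers only the no-transpose case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (not part of CULA or CUBLAS).
 * Computes C = alpha*A*B + beta*C with no transposes.
 * Storage is column-major, as in BLAS:
 *   A is m x k with leading dimension lda,
 *   B is k x n with leading dimension ldb,
 *   C is m x n with leading dimension ldc. */
void ref_sgemm(int m, int n, int k, float alpha,
               const float *a, int lda,
               const float *b, int ldb,
               float beta, float *c, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    assert(lda >= m && ldb >= k && ldc >= m);

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            /* Dot product of row i of A with column j of B. */
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += a[i + (size_t)p * lda] * b[p + (size_t)j * ldb];

            float *cij = &c[i + (size_t)j * ldc];
            /* When beta is zero, C is overwritten without being read. */
            *cij = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * (*cij);
        }
    }
}
```

The other letters in the table follow the same convention: the prefix selects the data type (S and D for single- and double-precision real, C and Z for single- and double-precision complex), and the remainder names the operation.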
Like our LAPACK routines, CULA's BLAS routines are available through the Standard, Device, and Fortran interfaces. They also support the Bridge interface, which switches automatically between CPU and GPU execution.
