CULA R10 Versus MAGMA 1.0 (Part 1)

by Kyle

We are often asked about the differences between CULA and the University of Tennessee's GPU linear algebra package, MAGMA. The simple answer we normally give is that CULA is a commercial product developed for deployment, while MAGMA is an academic project focused on research. Because CULA is a commercial product, we strive to produce a cutting-edge library that is well supported, feature-rich, easy to use, and regularly updated.

Performance-wise, both CULA and MAGMA provide substantial speedups compared to the CPU. Since both libraries provide a wide range of routines, it's better to analyze them routine by routine rather than generalize about either library as a whole.

The following graph shows that the performance of the popular LU factorization routine, DGETRF, is fairly consistent between the two libraries, with CULA pulling ahead at the larger sizes. We have seen a similar pattern for other routines such as Cholesky and QR factorization. These tests were performed using an Intel Core i7 and an NVIDIA C2070; MAGMA was compiled using MKL with full threading enabled.
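For readers who want to run a comparison like this themselves, here is a minimal sketch of how a measured DGETRF time is commonly turned into a throughput figure. It assumes the conventional leading-order operation count of 2n³/3 for LU factorization of a square n-by-n matrix; the text above does not state the units of the graph, so treat GFLOP/s as an assumption. The function name `lu_gflops` and the sample timing are hypothetical placeholders, not measurements from either library.

```python
def lu_gflops(n: int, seconds: float) -> float:
    """Estimate GFLOP/s for an LU factorization of a square n x n matrix.

    Uses the conventional leading-order count of (2/3) * n^3 floating-point
    operations; lower-order terms are ignored, so the result is approximate.
    """
    if n <= 0:
        raise ValueError("matrix size must be positive")
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    flops = (2.0 / 3.0) * n ** 3
    return flops / seconds / 1e9


if __name__ == "__main__":
    # Placeholder values for illustration only, not a real benchmark result.
    n, elapsed = 3000, 1.0
    print(f"n={n}: {lu_gflops(n, elapsed):.1f} GFLOP/s")
```

Using the same operation count for both libraries keeps the comparison fair: the ratio of the two throughput figures is then simply the inverse ratio of the wall-clock times at each matrix size.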

While the performance of GETRF is at parity for both libraries, for other routines the performance of CULA is leaps and bounds ahead of the competition. The following chart shows that CULA's SGESVD is far ahead of MAGMA's when computing both the U and V unitary matrices. This gap exists because CULA contains a parallelized implementation of the step that generates the unitary matrices, whereas other implementations have left it as a CPU operation. This step consumes significant processing time and must be implemented on the GPU in order to see a speedup! The same holds true for the symmetric eigenvalue solver: CULA contains accelerated implementations of both the tridiagonalization step and the eigenvector stage.

While we strive for high performance in CULA, we'd like to reiterate that the CULA design philosophy is first and foremost focused on an error-free and accurate solver under any circumstances. As we have discussed before here, here, and here, we are constantly testing our entire code base to make sure that every code change preserves stable and accurate results. Using these extensive tests, we are able to find and fix bugs and inaccuracies before we release the final product.
