LAPACK Functionality

by Kyle

We are often asked how much of LAPACK is implemented in CULA.  The short answer is that we have implemented the majority of the core functionality and that we are constantly adding more routines.

With regard to CULA, the functions in LAPACK can be broken into four major categories:

1) Core functions
2) Building block and auxiliary functions
3) Alternative (or legacy) algorithms
4) Rarely used functions

The core functions of LAPACK include major routines such as xGESV (general system solve), xSYEV (symmetric eigenvalues), and xGELS (least squares solve). These (as well as a handful of others) make up the core of widely used functions included in LAPACK. Of the functions we consider ‘core’, we have implemented over 80% in CULA. The remaining 20% consists of functionality relating to the Schur decomposition and packed symmetric routines. Fortunately, these routines are high on our priority list and will be implemented in an upcoming CULA release.

Building block functions are routines that are typically used only internally by other LAPACK functions.  These include routines like xGETF2 (unblocked panel factorization) and xLARFG (elementary reflector generation).  These routines have little-to-no use outside of LAPACK and are a lower priority for public release.  However, if one of these routines is important to you, feel free to contact us and we'll consider it for inclusion in a future release. Additionally, we do expose some of the more general-purpose auxiliary functions. In R11 we are releasing some useful routines such as transpose and conjugate (see this blog post for more information).

Alternative and legacy functions are routines that compute the same result by a different method. One example is computing an SVD via a divide-and-conquer method (xGESDD). While we don't provide this implementation, we do have the general implementation (xGESVD). For these routines, we try to point out in our documentation when an alternative implementation might still be viable for you.

Finally, there are a number of rarely used functions in LAPACK that were designed for a very specific purpose.  We classify about 20% of LAPACK as falling into this category. These functions are planned for eventual inclusion in CULA, but are low on the priority list. However, if we find users with a specific need for one of these functions, we'll certainly move it to the top! As a reminder, if you are looking for a function that is not in CULA, stop by our forums and let us know.


User Spotlight: David Hastie, Ph.D.

by Liana

This is the first post in a new blog series called User Spotlight. Our goal is to facilitate knowledge sharing within our user community.  As often as possible, we will be writing about the cutting-edge research our users are involved in, and how CULA has helped them.

In the spotlight today: Dr. David Hastie of Imperial College London.

Dr. Hastie is a research assistant in the Department of Epidemiology and Public Health, in the School of Public Health at Imperial College London. He is a member of the Biostatistics group and is currently focused on a project aimed at understanding how various factors combine to influence the risk of lung cancer.  His work also involves the development of Evolutionary Stochastic Search, a variable selection algorithm based on an Evolutionary Monte Carlo approach, for single and multiple response linear models. Working with collaborators, he is looking to extend the algorithm to logistic regression and regression with interaction terms.  He is involved in the development of the C++ software for this algorithm, and is also leveraging GPU programming techniques to improve the algorithm's performance.

How CULA has helped

“We use CULA within an algorithm we have developed to do variable selection. We are applying this algorithm to genetic data to see which genes are associated with different outcomes. This involves very large matrices. We are mainly using the QR decomposition functionality in CULA and have found it to be hugely helpful. In particular we have overcome bottlenecks that previously we had not been able to surmount,” said Hastie.

Once his research paper has been published, we will certainly add it to our library. Meanwhile, if you would like to learn more about his work, feel free to visit his web site.

If you'd like to share your research work, including how CULA has contributed to it, please contact us!  We appreciate the feedback!


Auxiliary Functions in R11

by Dan

One of the most fundamental concepts in linear algebra is the transpose function. As implemented on a computer, a transpose changes the shape of a matrix and repositions its values in memory.

For programming modern mathematics, a transpose function is especially useful because it provides a conversion between row-major and column-major data orderings, which can be required when mixing code from languages that use different orderings, such as C and Fortran. Despite its usefulness, until very recently LAPACK hadn’t provided routines for this and other basic operations. One reason for this is that these functions are so simple to write that they never seemed to need a canonical definition. For example, consider the implementation of a transpose in the loop below:

There is not much in this loop that can go wrong. When programming for the GPU, however, this function becomes significantly more complicated. Compare the loop above to the transpose example from the CUDA SDK:
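The SDK's coalesced transpose is built around a shared-memory tile, along the following lines (a sketch only; the TILE_DIM/BLOCK_ROWS values and the exact bounds handling vary between SDK versions):

```cuda
#include <cuda_runtime.h>

#define TILE_DIM   16
#define BLOCK_ROWS 16

/* Tiled, shared-memory transpose in the style of the CUDA SDK sample.
   Each block reads a TILE_DIM x TILE_DIM tile of idata with coalesced
   loads, then writes it back transposed so the stores are coalesced
   as well. The +1 padding on the tile's inner dimension avoids
   shared-memory bank conflicts on the transposed reads. */
__global__ void transposeCoalesced(float* odata, const float* idata,
                                   int width, int height)
{
    __shared__ float tile[TILE_DIM][TILE_DIM + 1];

    int x = blockIdx.x * TILE_DIM + threadIdx.x;
    int y = blockIdx.y * TILE_DIM + threadIdx.y;

    for (int i = 0; i < TILE_DIM; i += BLOCK_ROWS)
        if (x < width && (y + i) < height)
            tile[threadIdx.y + i][threadIdx.x] = idata[(y + i) * width + x];

    __syncthreads();

    /* Swap block offsets so each block writes the transposed tile. */
    x = blockIdx.y * TILE_DIM + threadIdx.x;
    y = blockIdx.x * TILE_DIM + threadIdx.y;

    for (int i = 0; i < TILE_DIM; i += BLOCK_ROWS)
        if (x < height && (y + i) < width)
            odata[(y + i) * height + x] = tile[threadIdx.x][threadIdx.y + i];
}
```

All of the extra machinery exists purely for performance: staging through shared memory turns the strided global-memory accesses of a naive GPU transpose into coalesced reads and writes.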

Obviously, this function is not trivial. The code here implements an out-of-place transpose; an in-place version would be even more complex. This isn’t the kind of function you can quickly write in-line when you need it. Instead, you should use a time-tested, high-performance library routine to do the job for you.

In the next release of CULA, we’re going to include a transpose and other auxiliary functions directly so that you do not have to write any of them yourself. Available functions will include copy, transpose, conjugate, nancheck, and others, and you can be confident in knowing that these functions have been fully validated by our extensive test suite.

With the inclusion of these new functions, we are continuing to make CULA easier and easier to use by rounding out our support for auxiliary functions, just as we did in R10 with the inclusion of Level 3 BLAS functions. What functions would make your life easier when using CULA? We’re always open to new ideas so feel free to drop by our forums and discuss any suggestions you have.