Heeswijk’s main research interest lies at the intersection of high-performance computing and machine learning: in particular, how techniques and hardware from high-performance computing can be applied to meet the challenges that arise in machine learning. His current work involves training multiple neural networks, each on its own GPU. The models used in this work are a type of feedforward neural network called the Extreme Learning Machine (ELM).
How CULA has helped
“Using the CULA library, the training and model structure selection of the models can be accelerated. The training can be expressed in terms of CULA operations (with a trick to avoid needing the matrix inverse, which is not part of the CULA basic package). Specifically, the culaGesv and culaGels functions were used, and wrappers around these functions were written so that they can be called from MATLAB in the training and model structure selection of the ELM. The parallelization over multiple GPUs is achieved by combining mex-wrapped CULA with the MATLAB Parallel Computing Toolbox and binding each of the MATLAB workers to its own GPU. This way, all CULA operations within a worker operate on that worker’s GPU.”
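To give a feel for the linear algebra behind the quote, here is a minimal CPU-side sketch of ELM training in Python with NumPy. The hidden-layer weights are random and fixed; only the output weights are fitted, by solving a least-squares problem rather than forming a pseudo-inverse explicitly, which is the same trick the quote alludes to (in the paper’s setup that solve is dispatched to culaGels on the GPU). The function names and the tanh activation here are illustrative choices, not taken from the paper:

```python
import numpy as np

def train_elm(X, T, n_hidden, seed=0):
    """Train an Extreme Learning Machine.

    The input weights W and biases b are drawn at random and never
    trained; only the output weights beta are fitted, via least squares.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer activations
    # Solve H @ beta ~= T in the least-squares sense instead of
    # computing a matrix (pseudo-)inverse explicitly
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

In the multi-GPU setting described above, each MATLAB worker would run this training loop independently, with the least-squares solve replaced by a mex-wrapped GPU call.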
The paper illustrates the effect of both types of parallelization on the total running time of the algorithm. You will find the abstract and link for the paper in our Research Papers section.
In the coming weeks we are going to be making an announcement related to sparse solvers. I wanted to make those of you who don't visit our forums regularly aware that we have put out an official request for feedback on sparse solvers. Specifically, we're looking to answer the following questions:
- Are your solvers home-grown or do they use a toolkit like MKL, PARDISO, UMFPACK, or ITSOL?
- Do you primarily use direct or iterative solvers?
- What speedup would you need to see before you would consider moving to a GPU accelerated solver for your sparse problems?
- What would you define as a typical problem size? What would you define as a large problem size?
You can voice your opinion here. We appreciate your feedback because it helps us to deliver solutions that are most relevant to you.
Today, we are putting a spotlight on Dr. Andrzej Karwowski and his colleagues Dr. Tomasz Topa and Dr. Artur Noga of the GPU computing group at the Silesian University of Technology, Poland.
Dr. Karwowski is with the Department of Electronics, where he currently holds the position of Professor and leads the Radioelectronics group. Dr. Topa and Dr. Noga are faculty members who have been working closely with Professor Karwowski. Most of the group's work is in the fields of computational electromagnetics (CEM), electromagnetic compatibility, antennas, and wireless communication. Recently, the focus has been on creating low-cost, GPU-based hardware platforms for CEM. Dr. Karwowski and his colleagues are examining the possibilities of accelerating the full-wave method of moments (MoM) by employing CUDA-capable GPUs.
How CULA has helped
“We used CULA in the context of numerical modeling and analysis of electromagnetic radiation and scattering with the so-called method of moments (MoM). Roughly speaking, the method consists of constructing and then solving a matrix equation that describes the system in question. The key problem here is that even relatively simple structures generate complex-valued dense system matrices whose size can be on the order of thousands. With the LU factorization routine available in CULA, we were able to offload to the GPU the most intensive computations required for the solution of the matrix equation, thus attaining a noticeable speedup of MoM simulations,” said Dr. Karwowski.
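The solve step Dr. Karwowski describes can be sketched as follows: a dense, complex-valued system Z I = V, solved by LU factorization. This CPU-side NumPy sketch uses a random, artificially well-conditioned matrix as a stand-in for a real MoM impedance matrix (which a real code would assemble from the structure's geometry); the factor-and-solve call is the O(n³) step the group offloaded to the GPU:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500  # real MoM systems can reach thousands of unknowns

# Placeholder dense complex "impedance" matrix Z and excitation
# vector V; a real MoM code fills these from the problem geometry
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Z += n * np.eye(n)  # keep this toy system well-conditioned
V = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Solve Z I = V via LU factorization; np.linalg.solve calls
# LAPACK's gesv, the same factor-and-solve operation that CULA
# exposes as a GPU-accelerated routine
I = np.linalg.solve(Z, V)
```

Since the factorization cost grows cubically with the number of unknowns while the matrix assembly is comparatively cheap, accelerating this single call is where the bulk of the MoM speedup comes from.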
The group has recently published two papers that we suggest reading if you are also looking to accelerate your MoM simulations on CUDA. You will find the abstracts and links for both papers under our Research Papers section.