Calling cula from a shared library in python

Posted: Tue Apr 13, 2010 7:32 am
by tomt21
I'm writing a shared library in C++ which I call from python using ctypes to accelerate some calculations that would be slow in python.

The shared library in turn calls CULA, using Ssyev.

I originally started out working on a pure C++ program, and had no problems getting CULA to work.

However, building a shared library linked against CULA and calling that library from Python seems to cause some problems.

If I simply load the .so from Python using ctypes.CDLL, then calling a function in my shared library that calls culaInitialize causes a segmentation fault. However, if I first load libcublas.so and libcula.so using ctypes.CDLL, this works.

This shouldn't need to be done since ctypes should automatically load all of the dependencies, and it works perfectly with the other libraries I have linked to.
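The preloading workaround described above might look roughly like the sketch below. It is only an illustration: the helper name preload and the path ./libmylib.so are hypothetical, the library names are the ones mentioned in this post, and the default ctypes load mode is assumed. Whether passing ctypes.RTLD_GLOBAL instead would change anything here is not something this thread confirms.

```python
import ctypes


def preload(names, mode=ctypes.DEFAULT_MODE):
    """Try to load each shared library in order.

    Returns (loaded, failed): loaded maps name -> CDLL handle,
    failed maps name -> the loader's error message.
    """
    loaded, failed = {}, {}
    for name in names:
        try:
            loaded[name] = ctypes.CDLL(name, mode=mode)
        except OSError as exc:
            # dlopen could not find or load the library
            failed[name] = str(exc)
    return loaded, failed


# Usage sketch (library names from the post; ./libmylib.so is hypothetical):
#   loaded, failed = preload(["libcublas.so", "libcula.so"])
#   if failed:
#       print("could not preload:", failed)
#   else:
#       mylib = ctypes.CDLL("./libmylib.so")
```

Keeping the returned handles alive also keeps the libraries referenced for as long as the Python objects exist.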

I've also noticed that even once culaInitialize() works, the program will sometimes crash with device error 4 and an NVRM Xid 13 error, seemingly at random, even when performing the same computation.

Is there something about the way libcula works that could be causing this? Loading libcula into ctypes and calling culaInitialize directly works fine, but calling it indirectly through my library fails.

I am using Debian testing, gcc 4.4, Linux 2.6.32-amd64 on a Quadro 3700, and have tried both the 190.53 and 195.36 drivers with CUDA 2.3.

Here is the output of ldd on my shared library:

linux-vdso.so.1 => (0x00007fff1271e000)
libgomp.so.1 => /usr/lib/libgomp.so.1 (0x00007f8f1fdd8000)
libgsl.so.0 => /usr/lib/libgsl.so.0 (0x00007f8f1f9b8000)
libgslcblas.so.0 => /usr/lib/libgslcblas.so.0 (0x00007f8f1f77f000)
libyaml-cpp.so.0.2 => /usr/local/lib/libyaml-cpp.so.0.2 (0x00007f8f1f523000)
libmicrohttpd.so.5 => /usr/lib/libmicrohttpd.so.5 (0x00007f8f1f318000)
libcula.so => /usr/local/cula/lib64/libcula.so (0x00007f8f1d8fb000)
libcublas.so.2 => /usr/local/cula/lib64/libcublas.so.2 (0x00007f8f1c36e000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f8f1c05d000)
libm.so.6 => /lib/libm.so.6 (0x00007f8f1bddb000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f8f1bbc5000)
libc.so.6 => /lib/libc.so.6 (0x00007f8f1b870000)
librt.so.1 => /lib/librt.so.1 (0x00007f8f1b668000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00007f8f1b44c000)
libcudart.so.2 => /usr/local/cula/lib64/libcudart.so.2 (0x00007f8f1b20b000)
/lib64/ld-linux-x86-64.so.2 (0x00007f8f20212000)
libdl.so.2 => /lib/libdl.so.2 (0x00007f8f1b007000)

It is entirely possible that there is something incorrect about what I am trying to do, since I am no expert on ld linking on Linux or on ctypes!

Oh, and to clarify: ctypes only supports C calling conventions, but I need to use the C++ STL in my code, so the functions being called from Python, including the one calling culaInitialize, are declared as extern "C" in my C++.
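On the Python side, ctypes looks symbols up by their plain, unmangled name, which is why the extern "C" declaration matters. A minimal sketch of binding such a function with an explicit signature is below. The helper bind is hypothetical, and the C math library's cos is used purely as a stand-in so the example runs on a typical Linux system; the real function names in the library discussed here are not shown in this thread.

```python
import ctypes
import ctypes.util


def bind(lib, name, restype, argtypes):
    """Look up an unmangled C symbol and declare its signature."""
    func = getattr(lib, name)  # raises AttributeError if the symbol is missing
    func.restype = restype
    func.argtypes = argtypes
    return func


# Stand-in library: the C math library, located via the system's search rules.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Without restype, ctypes would assume the function returns a C int.
c_cos = bind(libm, "cos", ctypes.c_double, [ctypes.c_double])
print(c_cos(0.0))
```

A C++ function compiled without extern "C" would be exported under a mangled name, so the getattr lookup by its source-level name would fail.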

Re:Calling cula from a shared library in python

Posted: Tue Apr 13, 2010 8:00 am
by tomt21
It seems I have partially solved my own problem. While making a test case to reproduce this, I discovered that not linking to gomp solves the problem: culaInitialize no longer segfaults when called from my library. I have no idea why this would be the case, since I'm not even using OpenMP in my code (yet).

I am still getting the runtime errors, but presumably those are unrelated?

Re:Calling cula from a shared library in python

Posted: Wed Apr 14, 2010 2:57 pm
by dan
gomp is actually a dependency of CULA on 64-bit Linux systems. If not linking to it yourself (at least not in the way you had been) stops the crashing, then I think that is a good approach.

Since you've gotten the segfault problem solved and you're discussing the numerical issues in another thread, I'd call this issue closed unless it crops up again.

Dan