CULA and MPI

Post by gusgw » Wed Jul 25, 2012 8:43 pm

G'day,

I'm trying to use the zheev function of CULA R12 from within MPI processes. However, when I call the function the program aborts with SIGABRT:

forrtl: error (76): Abort trap signal
Image              PC                Routine   Line      Source
libc.so.6          00000036AAA30265  Unknown   Unknown   Unknown
libc.so.6          00000036AAA31D10  Unknown   Unknown   Unknown
libpthread.so.0    00000036AB60639E  Unknown   Unknown   Unknown
libpthread.so.0    00000036AB60655D  Unknown   Unknown   Unknown
libc.so.6          00000036AAAD3C2D  Unknown   Unknown   Unknown

Running under idb gives a little extra info:

Program received signal SIGABRT
raise () in /lib64/libc-2.5.so

and a backtrace:

#0 0x00000036aaa30265 in raise () in /lib64/libc-2.5.so
#1 0x00000036aaa31d10 in abort () in /lib64/libc-2.5.so
#2 0x00000036ab60639e in __deallocate_stack () in /lib64/libpthread-2.5.so
#3 0x0000000045e409c0 in ?? ()
#4 0x0000428e00000000 in ?? ()
#5 0x000000004543f9e0 in ?? ()
#6 0x000000004543f9e0 in ?? ()
#7 0xffffffffffffffe0 in ?? ()

This error doesn't occur in versions of the code without MPI. However, running the MPI code on its own (i.e. without mpirun) still fails in the same way. Is there some conflict between the OpenMPI and CULA libraries?

CULA R12
CUDA 4.00
OpenMPI 1.4.5
Intel 2011.9
Red Hat Linux

Any ideas?

Cheers,
Nick
