
bug in Ssyev?

Posted: Wed Apr 14, 2010 7:29 am
by tomt21
I am experiencing a problem running Ssyev which seems to only occur for certain input matrices.

I have put together a small Python script to test this, along with the matrices used (one that causes the error and one that works), attached below.

The matrices are zero except for the entries listed in the txt files: each line gives an i,j pair, and A_{i,j} and A_{j,i} are set to 1, although the Python code only sets the upper triangle.

Calling LAPACK on the same matrix runs without any problems, and setting the job type to 'V' for CULA also works, although giving different results.

I would be very grateful if you could confirm that this is indeed a bug or if I am doing something incorrectly!

I have tested this on a Tesla system and a Quadro 3700, and both give the same results.

Example:
Code:
$> python2.5 test.py goodmatrix.txt
culaInitialize:
No Errors
Ssyev:
No Errors
0
$> python2.5 test.py badmatrix.txt
culaInitialize:
No Errors
Ssyev:
Runtime error, see culaGetErrorInfo for error code
4


Tesla system:
Code:
>$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  190.18  Wed Jul 22 15:36:09 PDT 2009
GCC version:  gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)


Quadro:
Code:
>$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  195.36.15  Fri Mar 12 00:29:13 PST 2010
GCC version:  gcc version 4.3.4 (Debian 4.3.4-8)


I'd be happy to provide any more information if needed!
(This is a tar.gz file)
[file name=ssyevTest.gz size=41184]http://www.culatools.com/images/fbfiles/files/ssyevTest.gz[/file]

Re:bug in Ssyev?

Posted: Wed Apr 14, 2010 8:14 am
by kyle
Hi tomt21,

I'm checking this out now. I can't speak for your Python code being correct, but I'm replicating your matrix and trying some testing as we speak.

Re:bug in Ssyev?

Posted: Wed Apr 14, 2010 8:26 am
by kyle
I'm having some issues reading some lines from your script. Is this the start of the "bad matrix"?

0 -- 85
14 -- 86
87 -- 92
28 -- 99
0 -- 127
3 -- 147
45 -- 182
91 -- 184
.
.
.

Re:bug in Ssyev?

Posted: Wed Apr 14, 2010 8:49 am
by tomt21
Yes, that's correct. Sorry for the slightly strange formatting.

They are zero-based row/column indices of the non-zero entries, all of which are 1, in a 5000 by 5000 real symmetric matrix.

Here is the relevant Python if it helps. It extracts i and j from the lines "i -- j" in the text files and then sets A[i+j*N] to 1 (with i and j swapped where needed so that only the upper triangle is written).

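Code:
import re
import sys

# A (a flat, column-major array of N*N floats) and N are defined earlier in the script.
# Zero the whole matrix first.
for i in xrange(N*N):
        A[i]=0.0

f=open(sys.argv[1],'r')

# Each line of the input file has the form "i -- j".
er=re.compile(r'(\S+) -- (\S+)')
for l in f:
        ij=er.findall(l)
        i=int(ij[0][0])
        j=int(ij[0][1])
        # Store the 1 in the upper triangle only (row <= column).
        if i<j:
                A[i+j*N]=1.0
        else:
                A[j+i*N]=1.0
f.close()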
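
For anyone who wants to try the parsing step without CULA installed, the loop above can be sketched as a self-contained Python 3 function. The names `load_upper_triangle`, `PAIR` and `lines` are hypothetical and do not come from the attached test.py; the storage convention (flat, column-major, upper triangle only) follows the description in this post. Skipping lines that do not match is an assumption; the original loop would raise an IndexError on such a line.

```python
import re

# Matches lines of the form "i -- j" (zero-based row/column indices).
PAIR = re.compile(r"(\d+) -- (\d+)")

def load_upper_triangle(lines, n):
    """Return a flat column-major list of n*n floats with 1.0 at each listed pair.

    Only the upper triangle (row <= column) is written.
    Lines that do not contain "i -- j" are skipped (an assumption).
    """
    a = [0.0] * (n * n)
    for line in lines:
        m = PAIR.search(line)
        if m is None:
            continue
        i, j = int(m.group(1)), int(m.group(2))
        if not (0 <= i < n and 0 <= j < n):
            raise ValueError("index out of range: %r" % line)
        row, col = min(i, j), max(i, j)  # force row <= col
        a[row + col * n] = 1.0           # column-major element (row, col)
    return a
```

For example, `load_upper_triangle(["0 -- 2", "2 -- 1"], 3)` sets elements (0,2) and (1,2) and leaves the lower triangle at zero.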


Re:bug in Ssyev?

Posted: Wed Apr 14, 2010 9:22 am
by kyle
Alright, I was able to reproduce your matrix in MATLAB using:

X = zeros(5000,5000);
X(0+1,1+85 ) = 1;
X(14+1,1+86 ) = 1;
...
X(803+1,1+4974 ) = 1;
X(2388+1,1+4991 ) = 1;

% Fill lower triangle with ones
X = X + tril( ones(5000,5000), -1 );

I then exported this matrix and ran it through our testing system with the "No Vectors" and "Upper" options for SSYEV.

It did not error, and my answer matched results generated with Intel MKL. This was done on a 64-bit Windows 7 machine with a Tesla C1060. I'll get back to you in a bit with results from our 64-bit CentOS machine.
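
A comparison of this kind can be sketched in Python as follows. This is not the testing system mentioned above; the function name `eigenvalues_match` and the default tolerance are hypothetical, illustrative choices. Both lists are sorted first so that the order in which a solver returns its eigenvalues does not matter, and the allowed error scales with the largest eigenvalue magnitude because single-precision results carry only about seven significant digits.

```python
def eigenvalues_match(ref, test, rel_tol=1e-4):
    """Compare two eigenvalue lists after sorting.

    The error allowed per eigenvalue is rel_tol times the largest
    magnitude in ref. rel_tol=1e-4 is an illustrative value for
    single-precision output, not a documented threshold.
    """
    if len(ref) != len(test):
        return False
    # Fall back to a scale of 1.0 for an empty or all-zero reference.
    scale = max((abs(x) for x in ref), default=0.0) or 1.0
    return all(abs(r - t) <= rel_tol * scale
               for r, t in zip(sorted(ref), sorted(test)))
```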

Re:bug in Ssyev?

Posted: Wed Apr 14, 2010 9:37 am
by tomt21
Thanks for testing it out. I'm not filling the lower triangle with 1s, but I just added that and tried again and am still getting the same error.

If you could run the Python script on a Linux machine as well, that would be immensely helpful (all it needs is Python 2.5 and changing the path at the top to where libcula is located; then run it, passing one of the text files as the first argument).

Re:bug in Ssyev?

Posted: Wed Apr 14, 2010 1:09 pm
by kyle
I managed to get your script running on a machine and I am indeed getting the same error. It's narrowed down to something in the eigenvalue routine not converging. I'm not sure why this is only happening with the Python wrapper, though. Hopefully we'll have an answer shortly.

Re:bug in Ssyev?

Posted: Wed Apr 14, 2010 2:52 pm
by kyle
I've managed to track down the bug to an oddity in floating point arithmetic that was preventing convergence of the eigenvalues. This has since been fixed and we'll try to get a service release out shortly.

Re:bug in Ssyev?

Posted: Wed Apr 14, 2010 3:07 pm
by kyle
Also, I've attached the eigenvalues I'm now calculating. Do they look within reason? [file name=eig.txt size=90544]http://www.culatools.com/images/fbfiles/files/eig.txt[/file]
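
One way to sanity-check such a list without a second solver is sketched below in Python, under stated assumptions. For a real symmetric matrix, the eigenvalues sum to the trace and their squares sum to the squared Frobenius norm. If the matrix has a zero diagonal and exactly `num_pairs` distinct off-diagonal pairs set to 1 in both A_{i,j} and A_{j,i} (which assumes the input file has no i == j lines and no duplicate pairs), those two totals should be close to 0 and 2 * num_pairs. The function name and tolerance are hypothetical.

```python
import math

def plausible_spectrum(eigs, num_pairs, rel_tol=1e-3):
    """Check sum(eigs) ~ 0 (the trace) and sum(eigs**2) ~ 2 * num_pairs.

    rel_tol is an illustrative tolerance for single-precision results.
    """
    total = math.fsum(eigs)
    total_abs = math.fsum(abs(x) for x in eigs)
    total_sq = math.fsum(x * x for x in eigs)
    expected_sq = 2.0 * num_pairs
    trace_ok = abs(total) <= rel_tol * max(1.0, total_abs)
    norm_ok = abs(total_sq - expected_sq) <= rel_tol * max(1.0, expected_sq)
    return trace_ok and norm_ok
```

As a small check of the idea, the 3 by 3 matrix with pairs (0,1) and (1,2) set has eigenvalues -sqrt(2), 0 and sqrt(2), which sum to 0 and whose squares sum to 4 = 2 * 2.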

Re:bug in Ssyev?

Posted: Wed Apr 14, 2010 3:17 pm
by tomt21
Yes, that looks correct. Glad to hear it wasn't just me. Thanks very much for responding so quickly!

Re:bug in Ssyev?

Posted: Mon Apr 19, 2010 1:01 pm
by kyle
This bug has been fixed in 1.3a, which is available today. Thanks for the feedback!