## Numerical error in SVD?

12 posts
• Page

**1**of**1**### Numerical error in SVD?

The U and V returned by the SVD diverge from the U and V of Matlab, with increasing N. With N around 1100, there is an error in the second digit (MAX ERROR = 0.02***), for U and V... Is this a known issue? Are you guys able to reproduce it? I tired it on a random matrix.

Senior Developer, AccelerEyes

PhD Student, Dept. of ECE,

UC Santa Barbara

- jrk1982
**Posts:**26**Joined:**Thu Aug 13, 2009 8:48 pm

### Re:Numerical error in SVD?

We have made further corrections to the unitary correctness of the U and V matrices for Beta 3 (out soon) but it's worth noting that U and V should not be expected to match Matlab's answers precisely. SVD guarantees unique singular values but only requires that U and V be unitary; U and V are not unique however. As long as U*S*V' = A and U*U'=I and V*V'=I then we consider it a valid SVD.

- john
- Administrator
**Posts:**587**Joined:**Thu Jul 23, 2009 2:31 pm

### Re:Numerical error in SVD?

I agree that U and V are not unique. I merely observed that they deviate more from Matlab's result with increasing N, which indicates possible numeric instability. Thanks

for your immediate response!

for your immediate response!

Senior Developer, AccelerEyes

PhD Student, Dept. of ECE,

UC Santa Barbara

- jrk1982
**Posts:**26**Joined:**Thu Aug 13, 2009 8:48 pm

### Re:Numerical error in SVD?

I have also noted this for culaDeviceSgesvd. In fact, for different input matrices I've noted pretty wild instability to the point where u*u' and v*v' are not even close to identity. I've been using Eigen for SVD before this and wanted to get my hands on a GPU implementation, but the u and v' returned are always very different from what eigen returns. I need to project 144x144 matrices back to orthonormal and a good way to do that is to take SVD and use u*v' for it. Unfortunately, CULA's SVD always makes my projection differ from eigen's projection by a frobenius norm of 1 to 10, which is not ok. The ML algorithms I'm using train just fine when Eigen SVD is used but fail miserably with CULA. Is there an expected time-frame for when this will be fixed?

- amrasarfeiniel
**Posts:**4**Joined:**Mon Jan 25, 2010 6:11 am

### Re:Numerical error in SVD?

Is it possible to provide more information about your input matrices? All of our in-house unitary tests for SVD have been showing accurate results since Beta3, which was a few months ago.

Our accuracy for SVD's unitary checks are verified by:

|| UU' - I || / O(1) * ulp * N

and

|| VT'VT - I || / O(1) * ulp * N

Also, have you tried LAPACK's SVD for your calculations? All of our GPU accelerated code is also verified against Netlib's reference implementation and has not shown the major deviation you have described.

I hope we can help! Like I mentioned, we have considered our SVD implementation to be numerically stable since the Beta3 changes.

Our accuracy for SVD's unitary checks are verified by:

|| UU' - I || / O(1) * ulp * N

and

|| VT'VT - I || / O(1) * ulp * N

Also, have you tried LAPACK's SVD for your calculations? All of our GPU accelerated code is also verified against Netlib's reference implementation and has not shown the major deviation you have described.

I hope we can help! Like I mentioned, we have considered our SVD implementation to be numerically stable since the Beta3 changes.

- kyle
- Administrator
**Posts:**301**Joined:**Fri Jun 12, 2009 7:47 pm

### Re:Numerical error in SVD?

yeah here's one of the matrices that seems to break things (attached). I have 687 more from a single run of our neural network training algorithm that break things as well. If you would like them, I have a tar of them and I can send them to you. The tar is about 50MB.

Here is the code we are calling to get the SVD:

static bool culaInitialized = false;

void sdn::CudaMatrix::svd(CudaMatrix& u, CudaMatrix& vt, CudaMatrix& s) const {

if(!culaInitialized) {

SDN_ASSERT(culaInitialize() == culaNoError);

culaInitialized = true;

}

CudaMatrix copy(*this); // need copy so we don't destroy matrix

copy.updateMatrixGPU();

int min_mn = std::min(copy.rows(), copy.cols());

u = Zero(copy.rows(), min_mn);

vt = Zero(min_mn, copy.cols());

CudaVector sigma = CudaVector::Zero(min_mn);

SDN_ASSERT(culaDeviceSgesvd('S', 'S', copy.rows(), copy.cols(),

copy.matrixGPU_, copy.pm_,

sigma.matrixGPU_, u.matrixGPU_, u.pm_,

vt.matrixGPU_, vt.pm_) == culaNoError);

s = sigma.asDiagonal();

}

here is the test code for ensuring cula works with eigen as a benchmark:

for(int count = 0; count < 100; count++) {

CudaMatrix W(DIM, DIM);

char filename[128];

sprintf(filename, "/home/jeffrey/libsdn-trunk/tests/lib/W%d.mat", count);

FILE * f = fopen(filename, "rb");

for(int j = 0; j < DIM; j++) {

for(int k = 0; k < DIM; k++) {

float val;

fscanf(f, "%f", &val);

W(j,k) = val;

}

}

CudaMatrix u, vt, s;

W.svd(u, vt, s);

Eigen::MatrixXf w(W.rows(), W.cols());

cudaToEigen(W, w);

Eigen::SVD<Eigen::MatrixXf> w_svd(w);

CudaMatrix ue(DIM, DIM), vte(DIM, DIM), se(DIM, DIM);

eigenToCuda(w_svd.matrixU(), ue);

eigenToCuda(w_svd.matrixV(), vte);

vte = vte.transpose();

eigenToCuda(w_svd.singularValues().asDiagonal(), se);

for(int i = 0; i < W.cols(); i++) {

//for(int j = 0; j < W.cols(); j++) {

float u_val = (u.col(i) - ue.col(i)).squaredNorm();

/*if(u_val > .0001) {

u.col(i) = -u.col(i);

vt.row(i) = -vt.row(i);

}*/

}

float uDiff = (u-ue).squaredNorm();

float vtDiff = (vt-vte).squaredNorm();

float sDiff = (s-se).squaredNorm();

float identDiffU = (u*u.transpose()-

CudaVector::Ones(DIM).asDiagonal()).squaredNorm();

float identDiffVt = (vt*vt.transpose()-

CudaVector::Ones(DIM).asDiagonal()).squaredNorm();

float svdDiff = (W-u*s*vt).squaredNorm();

float esvdDiff = (W-ue*se*vte).squaredNorm();

float projDiff = (u*vt-ue*vte).squaredNorm();

SDN_DEBUG("udiff:" << uDiff << ", vtdiff:" << vtDiff << ", sdiff:" << sDiff

<< ", IUdiff:" << identDiffU << ", IVTdiff:" << identDiffVt

<< ", svddiff:" << svdDiff << ", esvddiff:" << esvdDiff

<< ", projdiff:" << projDiff);

}

obviously, you can't run that directly since you don't have our CudaMatrix class, but it should give a general idea of what we're doing. And finally, here's a few lines of output of the test (the first line corresponds to the attached matrix):

-D- udiff:263.838, vtdiff:264.827, sdiff:3.36724e-11, IUdiff:29.1175, IVTdiff:30.584, svddiff:35.1618, esvddiff:3.67794e-10, projdiff:243.746

-D- udiff:250.746, vtdiff:254.494, sdiff:2.49938e-10, IUdiff:34.9725, IVTdiff:43.764, svddiff:29.341, esvddiff:1.89376e-09, projdiff:231.396

-D- udiff:261.07, vtdiff:258.52, sdiff:1.32705e-10, IUdiff:49.1446, IVTdiff:43.4663, svddiff:16.9425, esvddiff:1.21796e-09, projdiff:251.536

-D- udiff:275.527, vtdiff:268.927, sdiff:1.43109e-10, IUdiff:87.1182, IVTdiff:49.7741, svddiff:25.9259, esvddiff:1.491e-09, projdiff:262.372

-D- udiff:260.407, vtdiff:265.878, sdiff:1.65283e-10, IUdiff:28.0323, IVTdiff:28.6273, svddiff:109.633, esvddiff:1.34261e-09, projdiff:232.627

-D- udiff:257.141, vtdiff:259.705, sdiff:1.00432e-10, IUdiff:32.1512, IVTdiff:30.7813, svddiff:107.76, esvddiff:7.47435e-10, projdiff:241.585

-D- udiff:260.023, vtdiff:259.648, sdiff:1.12667e-10, IUdiff:25.2452, IVTdiff:25.1066, svddiff:92.9703, esvddiff:9.99705e-10, projdiff:241.176

-D- udiff:264.495, vtdiff:275.31, sdiff:1.46459e-10, IUdiff:29.6779, IVTdiff:31.6966, svddiff:111.435, esvddiff:1.11735e-09, projdiff:265.427

-D- udiff:259.784, vtdiff:264.603, sdiff:1.60536e-10, IUdiff:29.1742, IVTdiff:25.7128, svddiff:111.356, esvddiff:1.18979e-09, projdiff:235.136

-D- udiff:252.679, vtdiff:261.655, sdiff:1.06585e-10, IUdiff:34.0553, IVTdiff:33.384, svddiff:105.997, esvddiff:8.11008e-10, projdiff:236.002

-D- udiff:263.308, vtdiff:254.987, sdiff:1.23534e-10, IUdiff:24.6389, IVTdiff:29.5374, svddiff:90.6421, esvddiff:8.50347e-10, projdiff:230.571

-D- udiff:266.719, vtdiff:265.023, sdiff:1.77858e-10, IUdiff:29.2374, IVTdiff:26.04, svddiff:104.638, esvddiff:1.30383e-09, projdiff:246.13

-D- udiff:262.745, vtdiff:254.752, sdiff:1.84412e-10, IUdiff:25.28, IVTdiff:29.5805, svddiff:113.491, esvddiff:1.44183e-09, projdiff:238.687

-D- udiff:258.666, vtdiff:259.779, sdiff:1.19513e-10, IUdiff:34.7407, IVTdiff:33.3303, svddiff:111.696, esvddiff:9.14567e-10, projdiff:248.946

-D- udiff:257.898, vtdiff:255.723, sdiff:5.88738e-11, IUdiff:28.5761, IVTdiff:28.9096, svddiff:94.3089, esvddiff:5.66239e-10, projdiff:240.8

-D- udiff:265.912, vtdiff:258.555, sdiff:1.25012e-10, IUdiff:30.4378, IVTdiff:31.4343, svddiff:105.057, esvddiff:1.10069e-09, projdiff:246.321

-D- udiff:264.463, vtdiff:259.32, sdiff:2.15877e-10, IUdiff:26.7603, IVTdiff:31.1173, svddiff:115.24, esvddiff:1.68899e-09, projdiff:239.546

as you can see, even u*s*vt is not equal to the input matrix, and u and vt aren't orthogonal, which is clearly broken. If our calls to perform the SVD are wrong, please let me know. It should be noted that these tests have never failed over about 1000 iterations on random matrices.

Here is the code we are calling to get the SVD:

static bool culaInitialized = false;

void sdn::CudaMatrix::svd(CudaMatrix& u, CudaMatrix& vt, CudaMatrix& s) const {

if(!culaInitialized) {

SDN_ASSERT(culaInitialize() == culaNoError);

culaInitialized = true;

}

CudaMatrix copy(*this); // need copy so we don't destroy matrix

copy.updateMatrixGPU();

int min_mn = std::min(copy.rows(), copy.cols());

u = Zero(copy.rows(), min_mn);

vt = Zero(min_mn, copy.cols());

CudaVector sigma = CudaVector::Zero(min_mn);

SDN_ASSERT(culaDeviceSgesvd('S', 'S', copy.rows(), copy.cols(),

copy.matrixGPU_, copy.pm_,

sigma.matrixGPU_, u.matrixGPU_, u.pm_,

vt.matrixGPU_, vt.pm_) == culaNoError);

s = sigma.asDiagonal();

}

here is the test code for ensuring cula works with eigen as a benchmark:

for(int count = 0; count < 100; count++) {

CudaMatrix W(DIM, DIM);

char filename[128];

sprintf(filename, "/home/jeffrey/libsdn-trunk/tests/lib/W%d.mat", count);

FILE * f = fopen(filename, "rb");

for(int j = 0; j < DIM; j++) {

for(int k = 0; k < DIM; k++) {

float val;

fscanf(f, "%f", &val);

W(j,k) = val;

}

}

CudaMatrix u, vt, s;

W.svd(u, vt, s);

Eigen::MatrixXf w(W.rows(), W.cols());

cudaToEigen(W, w);

Eigen::SVD<Eigen::MatrixXf> w_svd(w);

CudaMatrix ue(DIM, DIM), vte(DIM, DIM), se(DIM, DIM);

eigenToCuda(w_svd.matrixU(), ue);

eigenToCuda(w_svd.matrixV(), vte);

vte = vte.transpose();

eigenToCuda(w_svd.singularValues().asDiagonal(), se);

for(int i = 0; i < W.cols(); i++) {

//for(int j = 0; j < W.cols(); j++) {

float u_val = (u.col(i) - ue.col(i)).squaredNorm();

/*if(u_val > .0001) {

u.col(i) = -u.col(i);

vt.row(i) = -vt.row(i);

}*/

}

float uDiff = (u-ue).squaredNorm();

float vtDiff = (vt-vte).squaredNorm();

float sDiff = (s-se).squaredNorm();

float identDiffU = (u*u.transpose()-

CudaVector::Ones(DIM).asDiagonal()).squaredNorm();

float identDiffVt = (vt*vt.transpose()-

CudaVector::Ones(DIM).asDiagonal()).squaredNorm();

float svdDiff = (W-u*s*vt).squaredNorm();

float esvdDiff = (W-ue*se*vte).squaredNorm();

float projDiff = (u*vt-ue*vte).squaredNorm();

SDN_DEBUG("udiff:" << uDiff << ", vtdiff:" << vtDiff << ", sdiff:" << sDiff

<< ", IUdiff:" << identDiffU << ", IVTdiff:" << identDiffVt

<< ", svddiff:" << svdDiff << ", esvddiff:" << esvdDiff

<< ", projdiff:" << projDiff);

}

obviously, you can't run that directly since you don't have our CudaMatrix class, but it should give a general idea of what we're doing. And finally, here's a few lines of output of the test (the first line corresponds to the attached matrix):

-D- udiff:263.838, vtdiff:264.827, sdiff:3.36724e-11, IUdiff:29.1175, IVTdiff:30.584, svddiff:35.1618, esvddiff:3.67794e-10, projdiff:243.746

-D- udiff:250.746, vtdiff:254.494, sdiff:2.49938e-10, IUdiff:34.9725, IVTdiff:43.764, svddiff:29.341, esvddiff:1.89376e-09, projdiff:231.396

-D- udiff:261.07, vtdiff:258.52, sdiff:1.32705e-10, IUdiff:49.1446, IVTdiff:43.4663, svddiff:16.9425, esvddiff:1.21796e-09, projdiff:251.536

-D- udiff:275.527, vtdiff:268.927, sdiff:1.43109e-10, IUdiff:87.1182, IVTdiff:49.7741, svddiff:25.9259, esvddiff:1.491e-09, projdiff:262.372

-D- udiff:260.407, vtdiff:265.878, sdiff:1.65283e-10, IUdiff:28.0323, IVTdiff:28.6273, svddiff:109.633, esvddiff:1.34261e-09, projdiff:232.627

-D- udiff:257.141, vtdiff:259.705, sdiff:1.00432e-10, IUdiff:32.1512, IVTdiff:30.7813, svddiff:107.76, esvddiff:7.47435e-10, projdiff:241.585

-D- udiff:260.023, vtdiff:259.648, sdiff:1.12667e-10, IUdiff:25.2452, IVTdiff:25.1066, svddiff:92.9703, esvddiff:9.99705e-10, projdiff:241.176

-D- udiff:264.495, vtdiff:275.31, sdiff:1.46459e-10, IUdiff:29.6779, IVTdiff:31.6966, svddiff:111.435, esvddiff:1.11735e-09, projdiff:265.427

-D- udiff:259.784, vtdiff:264.603, sdiff:1.60536e-10, IUdiff:29.1742, IVTdiff:25.7128, svddiff:111.356, esvddiff:1.18979e-09, projdiff:235.136

-D- udiff:252.679, vtdiff:261.655, sdiff:1.06585e-10, IUdiff:34.0553, IVTdiff:33.384, svddiff:105.997, esvddiff:8.11008e-10, projdiff:236.002

-D- udiff:263.308, vtdiff:254.987, sdiff:1.23534e-10, IUdiff:24.6389, IVTdiff:29.5374, svddiff:90.6421, esvddiff:8.50347e-10, projdiff:230.571

-D- udiff:266.719, vtdiff:265.023, sdiff:1.77858e-10, IUdiff:29.2374, IVTdiff:26.04, svddiff:104.638, esvddiff:1.30383e-09, projdiff:246.13

-D- udiff:262.745, vtdiff:254.752, sdiff:1.84412e-10, IUdiff:25.28, IVTdiff:29.5805, svddiff:113.491, esvddiff:1.44183e-09, projdiff:238.687

-D- udiff:258.666, vtdiff:259.779, sdiff:1.19513e-10, IUdiff:34.7407, IVTdiff:33.3303, svddiff:111.696, esvddiff:9.14567e-10, projdiff:248.946

-D- udiff:257.898, vtdiff:255.723, sdiff:5.88738e-11, IUdiff:28.5761, IVTdiff:28.9096, svddiff:94.3089, esvddiff:5.66239e-10, projdiff:240.8

-D- udiff:265.912, vtdiff:258.555, sdiff:1.25012e-10, IUdiff:30.4378, IVTdiff:31.4343, svddiff:105.057, esvddiff:1.10069e-09, projdiff:246.321

-D- udiff:264.463, vtdiff:259.32, sdiff:2.15877e-10, IUdiff:26.7603, IVTdiff:31.1173, svddiff:115.24, esvddiff:1.68899e-09, projdiff:239.546

as you can see, even u*s*vt is not equal to the input matrix, and u and vt aren't orthogonal, which is clearly broken. If our calls to perform the SVD are wrong, please let me know. It should be noted that these tests have never failed over about 1000 iterations on random matrices.

- amrasarfeiniel
**Posts:**4**Joined:**Mon Jan 25, 2010 6:11 am

### Re:Numerical error in SVD?

sorry about that. Originally was a .mat file. Have changed to .txt so it should upload now. [file name=W0.txt size=220767]http://www.culatools.com/images/fbfiles/files/W0.txt[/file]

- amrasarfeiniel
**Posts:**4**Joined:**Mon Jan 25, 2010 6:11 am

### Re:Numerical error in SVD?

Thanks for the data - we'll try running it and see what kind of results we get here. There might be some numerical drift because the data has been saved in a text file (ie minor rounding has occured) but we'll see if we get a quality answer with the data as-is.

It's worth noting that you should not expect the (u-ue) test and the (v-ve) test to pass, as these are not unique matrices. You should however be able to expect that the (s-se) test and the identDiffU/identDiffV tests should pass, as well as reconstruction.

It's worth noting that you should not expect the (u-ue) test and the (v-ve) test to pass, as these are not unique matrices. You should however be able to expect that the (s-se) test and the identDiffU/identDiffV tests should pass, as well as reconstruction.

- john
- Administrator
**Posts:**587**Joined:**Thu Jul 23, 2009 2:31 pm

### Re:Numerical error in SVD?

Hi, I wanted to update this thread.

Your data did indeed break a certain path in our SVD algorithm, and we have implemented a correction. This correction provides a very good answer for your data.

For everyone else, I would like to note that we have increased the accuracy of both our U and VT arguments for all SVD cases. We are seeing answers that are one digit more accurate for these two matrices. Interestingly, we traced this drift to the single-precision fused multiply-add (FMAD) on pre-Fermi cards. The precision of this instruction turns out to too low for SVD. Disabling the instruction corrected the error.

The updated SVD will be present in CULA 1.2. We expect this to be available around the middle of this month.

Here is the output of my test on your data, with the S,S case of SGESVD. All norms are 1-norms:

144x144 (S,S)

norm(~S-S)/norm(S): 5.2e-007 (~S is our vector, S is MKL)

norm(U*U'-I)/norm(I): 1.7e-005

norm(VT'*VT-I)/norm(I): 1.7e-005

norm(U*S*VT-A)/norm(A): 2.1e-006

Your data did indeed break a certain path in our SVD algorithm, and we have implemented a correction. This correction provides a very good answer for your data.

For everyone else, I would like to note that we have increased the accuracy of both our U and VT arguments for all SVD cases. We are seeing answers that are one digit more accurate for these two matrices. Interestingly, we traced this drift to the single-precision fused multiply-add (FMAD) on pre-Fermi cards. The precision of this instruction turns out to too low for SVD. Disabling the instruction corrected the error.

The updated SVD will be present in CULA 1.2. We expect this to be available around the middle of this month.

Here is the output of my test on your data, with the S,S case of SGESVD. All norms are 1-norms:

144x144 (S,S)

norm(~S-S)/norm(S): 5.2e-007 (~S is our vector, S is MKL)

norm(U*U'-I)/norm(I): 1.7e-005

norm(VT'*VT-I)/norm(I): 1.7e-005

norm(U*S*VT-A)/norm(A): 2.1e-006

- john
- Administrator
**Posts:**587**Joined:**Thu Jul 23, 2009 2:31 pm

### Re:Numerical error in SVD?

hotness, those numbers look much better. What a nasty bug; no wonder it didn't get caught the first time around. We've noticed some numerical instability/precision issues in some of the float operations on pre-Fermi cards as well, leading to very strange output when doing things like checking derivatives numerically. Thanks for the response, and I look forward to cula 1.2!

- amrasarfeiniel
**Posts:**4**Joined:**Mon Jan 25, 2010 6:11 am

### Re:Numerical error in SVD?

Thanks for the feedback! Just to update further, CULA 1.2 is still on schedule and should be out to users early next week.

We'll have this fix, several more fixes, speed improvements, and a whole bunch of new routines.

We'll have this fix, several more fixes, speed improvements, and a whole bunch of new routines.

- john
- Administrator
**Posts:**587**Joined:**Thu Jul 23, 2009 2:31 pm

### Re:Numerical error in SVD?

Can you give me a result about how much time will be taken about CGESVD of about 16*16 matrix?

My programm costs about 9ms which I doubt is so slow,so I come here to ask your help.

Waiting for your reply.

My programm costs about 9ms which I doubt is so slow,so I come here to ask your help.

Waiting for your reply.

- xmlguca
**Posts:**1**Joined:**Wed Jan 13, 2010 3:20 am

12 posts
• Page

**1**of**1**### Who is online

Users browsing this forum: No registered users and 1 guest