[Lapackpp-devel] Interesting crash in QR solve
From: Michael P. <mr...@cm...> - 2006-07-13 20:00:22
Dear Christian and others,

I am thankful for this library, which saves me from calling LAPACK functions directly, but I'm having trouble with memory corruption in two situations: printing out matrices, and solving linear systems. (I don't remember seeing the problem when using LU, only QR and SVD.)

First, the assertion assert(cI.end() >= 0); (mtmpl.h, line 279) sometimes fails.

The main issue is a strange crash, which I was able to track down in valgrind (since the stack in GDB is messed up):

==16581== Invalid write of size 4
==16581==    at 0x40635D7: dgels_ (in /usr/local/lib/liblapackpp.so.1.13.0)
==16581==    by 0x4031EA1: LaQRLinearSolveIP(LaGenMatDouble&, LaGenMatDouble&, LaGenMatDouble const&) (linslv.cc:220)
==16581==    by 0x4032141: LaQRLinearSolve(LaGenMatDouble const&, LaGenMatDouble&, LaGenMatDouble const&) (linslv.cc:160)
==16581==    by 0x4032BD7: LaLinearSolve(LaGenMatDouble const&, LaGenMatDouble&, LaGenMatDouble const&) (linslv.cc:56)
==16581==    by 0x805385D: Patch::solve() (in /home/mrprice/student_projects/IAV/segmenter/segmenter)
==16581==    by 0x8054388: SurfaceFit::solve() (in /home/mrprice/student_projects/IAV/segmenter/segmenter)
==16581==    by 0x804AE39: Segmenter::surfacefit() (segmenter.cpp:935)
==16581==    by 0x804A480: main (main.cpp:133)
==16581==  Address 0x6DC5C10 is 0 bytes after a block of size 16 alloc'd
==16581==    at 0x40057E9: operator new[](unsigned) (vg_replace_malloc.c:195)
==16581==    by 0x4052287: VectorDouble::VectorDouble(unsigned) (vd.h:46)
==16581==    by 0x403701F: LaGenMatDouble::LaGenMatDouble(int, int) (gmd.cc:62)
==16581==    by 0x4042954: LaGenMatDouble::resize(int, int) (mtmpl.h:189)
==16581==    by 0x4042C2D: LaGenMatDouble::resize(LaGenMatDouble const&) (gmtmpl.cc:113)
==16581==    by 0x4044A1B: LaGenMatDouble::copy(LaGenMatDouble const&) (mtmpl.h:210)
==16581==    by 0x4031E0D: LaQRLinearSolveIP(LaGenMatDouble&, LaGenMatDouble&, LaGenMatDouble const&) (gmd.h:665)
==16581==    by 0x4032141: LaQRLinearSolve(LaGenMatDouble const&, LaGenMatDouble&, LaGenMatDouble const&) (linslv.cc:160)
==16581==    by 0x4032BD7: LaLinearSolve(LaGenMatDouble const&, LaGenMatDouble&, LaGenMatDouble const&) (linslv.cc:56)
==16581==    by 0x805385D: Patch::solve() (in /home/mrprice/student_projects/IAV/segmenter/segmenter)
==16581==    by 0x8054388: SurfaceFit::solve() (in /home/mrprice/student_projects/IAV/segmenter/segmenter)
==16581==    by 0x804AE39: Segmenter::surfacefit() (segmenter.cpp:935)
==16581==
==16581== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- y
starting debugger
==16581== starting debugger with cmd: /usr/bin/gdb -nw /proc/16618/fd/1014 16618
GNU gdb Red Hat Linux (6.3.0.0-1.122rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".

Attaching to program: /proc/16618/fd/1014, process 16618
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0x37fff000
`shared object read from target memory' has disappeared; keeping its symbols.
0x040635d7 in ?? ()
(gdb) bt
#0  0x040635d7 in ?? ()
#1  0xbef53da4 in ?? ()
#2  0xbef53d6c in ?? ()
#3  0x00000001 in ?? ()
#4  0x00929300 in ?? ()
#5  0x00000000 in ?? ()

Here's the code in question; it's just a variable-order polynomial least-squares fit function. I have checked (using valgrind) and found no memory leaks or other issues up to this point.

void Patch::solve()
{
    int i = 0;
    int j = 0;
    surfcoord x = 0, y = 0, z = 0;

    // Number of polynomial coefficients for the chosen order
    int eq_size = PATCH_EQ_LENGTH;
    switch (MAX_ORDER) {
        case 1: eq_size = 3;  break;
        case 2: eq_size = 6;  break;
        case 3: eq_size = 10; break;
        case 4: eq_size = 15; break;
    }

    // Every patch makes a 15x15 square in the matrix AtA
    LaGenMatDouble A(num_points, eq_size);
    LaVectorDouble b(num_points);
    LaVectorDouble eq(eq_size);

    // One row per sample point: monomials in x and y, with z as the right-hand side
    for (j = 0; j < num_points; j++) {
        x = points[j * 3];
        y = points[j * 3 + 1];
        z = points[j * 3 + 2];

        A(j, 0) = 1;
        A(j, 1) = x;
        A(j, 2) = y;
        if (MAX_ORDER >= 2) {
            A(j, 3) = x * x;
            A(j, 4) = x * y;
            A(j, 5) = y * y;
        }
        if (MAX_ORDER >= 3) {
            A(j, 6) = A(j, 3) * x;
            A(j, 7) = A(j, 3) * y;
            A(j, 8) = A(j, 5) * x;
            A(j, 9) = A(j, 5) * y;
        }
        if (MAX_ORDER >= 4) {
            A(j, 10) = A(j, 6) * x;
            A(j, 11) = A(j, 6) * y;
            A(j, 12) = A(j, 3) * A(j, 5);
            A(j, 13) = A(j, 9) * x;
            A(j, 14) = A(j, 9) * y;
        }
        b(j) = z;
    }

    LaLinearSolve(A, eq, b);

    // Copy the fitted coefficients out (the solution vector declared above is eq)
    for (j = 0; j < eq_size; j++) {
        equation[j] = eq(j);
    }
}

I don't know the workings of Lapack++ too well, but it looks like the workspace of the Fortran function may not have been allocated correctly in LaQRLinearSolveIP(). Have you seen this problem before, or do you know how to fix it?

Thanks,
Michael
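
A note on the sizes involved in the code above. The eq_size switch matches the closed form (n+1)(n+2)/2 for a full polynomial of order n in x and y. The valgrind report shows a write just past a 16-byte block, which is two doubles on this platform. The LAPACK documentation for dgels requires the right-hand-side array to have at least max(M, N) rows, because the solution overwrites it. The sketch below only illustrates that arithmetic. The helper names term_count, rhs_rows_needed and check_rhs are hypothetical and not part of Lapack++, and whether this size rule actually explains the crash is an assumption.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not a Lapack++ function): number of coefficients in a
// full bivariate polynomial of the given order, i.e. 1, x, y, x^2, xy, y^2, ...
constexpr int term_count(int order) {
    return (order + 1) * (order + 2) / 2;
}

// Hypothetical helper: rows the right-hand-side array must have for an
// M x N dgels call, following the documented requirement LDB >= max(1, M, N).
constexpr int rhs_rows_needed(int m, int n) {
    return std::max(1, std::max(m, n));
}

// Hypothetical guard: aborts (in debug builds) if the right-hand side is too
// short to receive the solution that dgels writes back into it.
inline void check_rhs(int rhs_rows, int m, int n) {
    assert(rhs_rows >= rhs_rows_needed(m, n));
}

// The closed form reproduces the values in the switch statement.
static_assert(term_count(1) == 3,  "order 1");
static_assert(term_count(2) == 6,  "order 2");
static_assert(term_count(3) == 10, "order 3");
static_assert(term_count(4) == 15, "order 4");
```

For example, a patch with 2 points fitted at order 1 gives M = 2 and N = 3, so rhs_rows_needed(2, 3) is 3 while a copy of b would have only 2 rows. A call such as check_rhs(num_points, num_points, eq_size) before the solve would flag that case.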