[Lapackpp-devel] Interesting crash in QR solve
From: Michael P. <mr...@cm...> - 2006-07-13 20:00:22
Dear Christian and others,
I am thankful for this library, which lets me avoid calling LAPACK
functions directly, but I'm running into memory corruption in two
situations: printing matrices and solving linear systems. (I
don't remember seeing the problem when using LU, only QR and SVD.)
First, the assertion
assert(cI.end() >= 0); (mtmpl.h, line 279)
sometimes fails. The main issue, though, is a strange crash, which I
tracked down with valgrind (since the stack in GDB is messed up):
==16581== Invalid write of size 4
==16581==    at 0x40635D7: dgels_ (in /usr/local/lib/liblapackpp.so.1.13.0)
==16581==    by 0x4031EA1: LaQRLinearSolveIP(LaGenMatDouble&, LaGenMatDouble&, LaGenMatDouble const&) (linslv.cc:220)
==16581==    by 0x4032141: LaQRLinearSolve(LaGenMatDouble const&, LaGenMatDouble&, LaGenMatDouble const&) (linslv.cc:160)
==16581==    by 0x4032BD7: LaLinearSolve(LaGenMatDouble const&, LaGenMatDouble&, LaGenMatDouble const&) (linslv.cc:56)
==16581==    by 0x805385D: Patch::solve() (in /home/mrprice/student_projects/IAV/segmenter/segmenter)
==16581==    by 0x8054388: SurfaceFit::solve() (in /home/mrprice/student_projects/IAV/segmenter/segmenter)
==16581==    by 0x804AE39: Segmenter::surfacefit() (segmenter.cpp:935)
==16581==    by 0x804A480: main (main.cpp:133)
==16581==  Address 0x6DC5C10 is 0 bytes after a block of size 16 alloc'd
==16581==    at 0x40057E9: operator new[](unsigned) (vg_replace_malloc.c:195)
==16581==    by 0x4052287: VectorDouble::VectorDouble(unsigned) (vd.h:46)
==16581==    by 0x403701F: LaGenMatDouble::LaGenMatDouble(int, int) (gmd.cc:62)
==16581==    by 0x4042954: LaGenMatDouble::resize(int, int) (mtmpl.h:189)
==16581==    by 0x4042C2D: LaGenMatDouble::resize(LaGenMatDouble const&) (gmtmpl.cc:113)
==16581==    by 0x4044A1B: LaGenMatDouble::copy(LaGenMatDouble const&) (mtmpl.h:210)
==16581==    by 0x4031E0D: LaQRLinearSolveIP(LaGenMatDouble&, LaGenMatDouble&, LaGenMatDouble const&) (gmd.h:665)
==16581==    by 0x4032141: LaQRLinearSolve(LaGenMatDouble const&, LaGenMatDouble&, LaGenMatDouble const&) (linslv.cc:160)
==16581==    by 0x4032BD7: LaLinearSolve(LaGenMatDouble const&, LaGenMatDouble&, LaGenMatDouble const&) (linslv.cc:56)
==16581==    by 0x805385D: Patch::solve() (in /home/mrprice/student_projects/IAV/segmenter/segmenter)
==16581==    by 0x8054388: SurfaceFit::solve() (in /home/mrprice/student_projects/IAV/segmenter/segmenter)
==16581==    by 0x804AE39: Segmenter::surfacefit() (segmenter.cpp:935)
==16581==
==16581== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- y
starting debugger
==16581== starting debugger with cmd: /usr/bin/gdb -nw /proc/16618/fd/1014 16618
GNU gdb Red Hat Linux (6.3.0.0-1.122rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
Attaching to program: /proc/16618/fd/1014, process 16618
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0x37fff000
`shared object read from target memory' has disappeared; keeping its symbols.
0x040635d7 in ?? ()
(gdb) bt
#0 0x040635d7 in ?? ()
#1 0xbef53da4 in ?? ()
#2 0xbef53d6c in ?? ()
#3 0x00000001 in ?? ()
#4 0x00929300 in ?? ()
#5 0x00000000 in ?? ()
Here's the code in question; it's just a variable-order polynomial
least-squares fit function. I have checked (using valgrind) and found no
memory leaks or other issues up to this point.
void Patch::solve()
{
    int i = 0;
    int j = 0;
    surfcoord x = 0, y = 0, z = 0;

    // Number of polynomial coefficients for the chosen order
    int eq_size = PATCH_EQ_LENGTH;
    switch (MAX_ORDER)
    {
        case 1: eq_size = 3; break;
        case 2: eq_size = 6; break;
        case 3: eq_size = 10; break;
        case 4: eq_size = 15; break;
    }

    // Every patch makes a 15x15 square in the matrix AtA
    LaGenMatDouble A(num_points, eq_size);
    LaVectorDouble b(num_points);
    LaVectorDouble eq(eq_size);

    // One row of A per point: monomials in x and y up to MAX_ORDER
    for (j = 0; j < num_points; j++)
    {
        x = points[j * 3];
        y = points[j * 3 + 1];
        z = points[j * 3 + 2];
        A(j, 0) = 1;
        A(j, 1) = x;
        A(j, 2) = y;
        if (MAX_ORDER >= 2)
        {
            A(j, 3) = x * x;
            A(j, 4) = x * y;
            A(j, 5) = y * y;
        }
        if (MAX_ORDER >= 3)
        {
            A(j, 6) = A(j, 3) * x;
            A(j, 7) = A(j, 3) * y;
            A(j, 8) = A(j, 5) * x;
            A(j, 9) = A(j, 5) * y;
        }
        if (MAX_ORDER >= 4)
        {
            A(j, 10) = A(j, 6) * x;
            A(j, 11) = A(j, 6) * y;
            A(j, 12) = A(j, 3) * A(j, 5);
            A(j, 13) = A(j, 9) * x;
            A(j, 14) = A(j, 9) * y;
        }
        b(j) = z;
    }

    // Solve A * eq = b in the least-squares sense
    LaLinearSolve(A, eq, b);

    // Copy the solution out (eq is the only solution vector declared above)
    for (j = 0; j < eq_size; j++)
    {
        equation[j] = eq(j);
    }
}
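For reference, the coefficient counts in the switch above (3, 6, 10, 15) match the number of monomials x^p y^q with p + q <= n, which is (n+1)(n+2)/2. Below is a minimal standalone sketch of that count and of one row of A, written without Lapack++ so it can be compiled on its own. The names terms_for_order and monomial_row are hypothetical and do not appear in the original code.

```cpp
#include <cassert>
#include <vector>

// Number of monomials x^p * y^q with p + q <= order: (order+1)(order+2)/2.
int terms_for_order(int order)
{
    assert(order >= 0);
    return (order + 1) * (order + 2) / 2;
}

// One row of the design matrix for the point (x, y), in the same column
// order as Patch::solve():
// 1, x, y, x^2, xy, y^2, x^3, x^2 y, x y^2, y^3, x^4, x^3 y, x^2 y^2, x y^3, y^4
std::vector<double> monomial_row(double x, double y, int order)
{
    assert(order >= 1 && order <= 4);
    std::vector<double> row(terms_for_order(order));
    row[0] = 1.0;
    row[1] = x;
    row[2] = y;
    if (order >= 2)
    {
        row[3] = x * x;
        row[4] = x * y;
        row[5] = y * y;
    }
    if (order >= 3)
    {
        row[6] = row[3] * x;
        row[7] = row[3] * y;
        row[8] = row[5] * x;
        row[9] = row[5] * y;
    }
    if (order >= 4)
    {
        row[10] = row[6] * x;
        row[11] = row[6] * y;
        row[12] = row[3] * row[5];
        row[13] = row[9] * x;
        row[14] = row[9] * y;
    }
    return row;
}
```

Deriving the row length from the order in one place keeps the matrix width and the highest column index written in step, which the hand-written switch and if-blocks have to do separately.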
I don't know the workings of Lapack++ too well, but it looks like the
workspace of the Fortran function may not have been allocated correctly
in LaQRLinearSolveIP(). Have you seen this problem before, or do you
know how to fix it?
Thanks
Michael
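
A note on the trace: the invalid write lands just past a 16-byte block, which is room for two doubles, allocated when LaQRLinearSolveIP() copies a matrix. LAPACK documents that dgels needs LDB >= max(1, M, N) rows in B, because the N-row solution overwrites the right-hand side, and a workspace of at least max(1, MN + max(MN, NRHS)) with MN = min(M, N). One possibility, not confirmed by anything in this post, is that num_points was smaller than eq_size for the patch that crashed. Below is a minimal sketch of a size check under that assumption. The helper names are hypothetical and none of them is a Lapack++ function.

```cpp
#include <algorithm>

// Rows dgels needs in B for an M x N system (LDB >= max(1, M, N)).
int dgels_min_ldb(int m, int n)
{
    return std::max(1, std::max(m, n));
}

// Minimum workspace length documented for dgels.
int dgels_min_lwork(int m, int n, int nrhs)
{
    int mn = std::min(m, n);
    return std::max(1, mn + std::max(mn, nrhs));
}

// True when a right-hand side holding rows_in_b rows is large enough
// to receive the solution of an M x N system.
bool rhs_large_enough(int m, int n, int rows_in_b)
{
    return rows_in_b >= dgels_min_ldb(m, n);
}
```

In Patch::solve() this would amount to checking num_points against eq_size before calling LaLinearSolve(), and skipping or reporting patches with too few points.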