Thread: [Bayes++] Crash on destruction of SIR_scheme
From: yon <yo...@ci...> - 2007-09-05 22:01:12
|
Hello,

I'm a bit stuck with a bug afflicting my program that uses Bayes++.

I am consistently crashing when the destructor for SIR_scheme is called by a class I inherited from it. The trace below implicates the destructor for an unsigned long vector, but there is no member of this type in SIR_scheme (nor in the other classes related to it, as far as I can tell).

Can anyone offer me advice as to how to locate the cause, or advise as to whether this is my code or a problem in one of the libraries?

Thank you,
Yon

#0 0x315c2539 in __gnu_cxx::new_allocator<unsigned long>::destroy (this=0xae6ce8c4, __p=0x82) at /usr/include/c++/4.0.0/ext/new_allocator.h:107
#1 0x315c2895 in std::vector<unsigned long, std::allocator<unsigned long> >::~vector (this=0x30852844) at /usr/include/c++/4.0.0/bits/stl_vector.h:273
#2 0x315c2d72 in Bayesian_filter::SIR_scheme::~SIR_scheme (this=0x30852820, __vtt_parm=0x316057a4) at /Developer/YV-dev/dcppf-0.91/../Bayes++/BayesFilter/SIRFlt.hpp:179
#3 0x315c4599 in SIR_scheme_ext::~SIR_scheme_ext (this=0x30852820) at /Developer/YV-dev/dcppf-0.91/SIR_scheme_ext.h:21
.
.

__________________________________
Yon Visell

Centre for Intelligent Machines, McGill University
http://cim.mcgill.ca/~yon

CIRMMT, Schulich School of Music, McGill University
http://www.music.mcgill.ca/cirmmt

University of Applied Sciences & Arts Zürich
CLOSED: Closing the loop of sound evaluation and design
http://closed.ircam.fr

Zero-Th Association, Pula, Croatia
http://www.zero-th.org

Tel +1 514 967 1648
Fax +1 415 520 0193
|
From: yon <yo...@ci...> - 2007-09-05 22:49:26
|
Hello,

Just to follow up on the problem I asked about:

-- The program does not crash, rather it hangs.
-- When I interrupt the hang by pausing in the debugger, the stack trace looks similar to the one in my previous message, but slightly different:

#0 std::_Destroy<unsigned long*, std::allocator<unsigned long> > (__first=0x856ef238, __last=0x82, __alloc=@0xbfffe4ef) at /usr/include/c++/4.0.0/bits/stl_construct.h:173
#1 0x315c2895 in std::vector<unsigned long, std::allocator<unsigned long> >::~vector (this=0x30852844) at /usr/include/c++/4.0.0/bits/stl_vector.h:273
#2 0x315c2d72 in Bayesian_filter::SIR_scheme::~SIR_scheme (this=0x30852820, __vtt_parm=0x316057a4) at /Developer/YV-dev/dcppf-0.91/../Bayes++/BayesFilter/SIRFlt.hpp:179
#3 0x315c4599 in SIR_scheme_ext::~SIR_scheme_ext (this=0x30852820) at /Developer/YV-dev/dcppf-0.91/SIR_scheme_ext.h:21

It is always stuck in std::_Destroy, which is called in response to destroying the unsigned long vector noted above, in this section of ~vector:

    ~vector()
    {
        std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish,
                      this->get_allocator());
    }

I don't know where the unsigned long vector being destroyed here resides.

The arguments to std::_Destroy in this call look like the problem to me. In the relevant section of std::_Destroy,

    template<typename _ForwardIterator, typename _Allocator>
    void
    _Destroy(_ForwardIterator __first, _ForwardIterator __last,
             _Allocator __alloc)
    {
        for (; __first != __last; ++__first)
            __alloc.destroy(&*__first);
    }

the address __last always has the decimal value 130, while __first is typically some large number, orders of magnitude higher. 130 is the number of samples (particles) in the filter, and presumably is the length of the vector here.

It looks to me like this->_M_impl._M_finish (which becomes __last) is not set correctly in the std::vector. If I understand correctly, this vector has somehow acquired an end pointer whose address is erroneously set equal to its length.

Since I am not making pointer manipulations with the ublas vectors, I am not sure how I could have caused this in my own code, and in any case, I can't find the vector that is responsible in the Bayes++ code or anywhere else.

Thanks for any advice,
Yon
|
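A rough sketch of why the loop quoted in the message above could spin instead of stopping, under assumptions the thread does not confirm: a 32-bit address space and sizeof(unsigned long) == 4. The loop only exits when __first is exactly equal to __last, and each increment moves __first by one element. The helper names below are hypothetical (they are not part of libstdc++ or Bayes++); the two addresses used in the usage note are the ones from frame #0 of the second trace.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of `for (; first != last; ++first)` on 32-bit addresses.
// Each ++first adds `step` bytes and the address wraps modulo 2^32.
// Assumption: `step` is a power of two (e.g. 4 for a 4-byte unsigned long).
// Under that assumption `first` can only ever become equal to `last` if the
// wrapped byte distance between them is a multiple of `step`.
bool loop_can_terminate(std::uint32_t first, std::uint32_t last,
                        std::uint32_t step)
{
    assert(step != 0 && (step & (step - 1)) == 0);  // power of two only
    const std::uint32_t distance = last - first;    // wraps modulo 2^32
    return distance % step == 0;
}

// Number of increments before first == last, for ranges that can terminate.
std::uint32_t iterations_until_equal(std::uint32_t first, std::uint32_t last,
                                     std::uint32_t step)
{
    assert(loop_can_terminate(first, last, step));
    const std::uint32_t distance = last - first;    // wraps modulo 2^32
    return distance / step;
}
```

With the values from the trace, loop_can_terminate(0x856ef238, 0x82, 4) is false: 0x82 is not 4-byte aligned relative to __first, so in this model the pointer never lands on __last even after wrapping around. A healthy range such as 0x1000 to 0x1000 + 130 * 4 terminates after 130 steps. Since destroying an unsigned long does no work, such a loop would probably not touch memory, which would be consistent with a hang rather than a crash, though that is an inference and not something the thread establishes.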
From: yon <yo...@ci...> - 2007-09-06 04:02:18
|
Hello,

Just to follow up, I have isolated the problem to the resamples member of SIR_scheme.

The allocator for the underlying vector of unsigned ints obtains an invalid finish ptr at some point, as noted, which causes the program to hang when the vector is destroyed.

I don't think this member is touched in my code in the case under consideration, but I am attempting to isolate where this occurs.

If someone knows what operations can modify the finish ptr of the vector allocator for such a vector (some sort of invalid resize operation?), that would be helpful to know.

Best,
Yon
|
From: Michael S. <ma...@mi...> - 2007-09-06 09:22:04
|
Hello,

On Thursday 06 September 2007 06:02, yon wrote:
> Hello,
>
> Just to follow up, I have isolated the problem to the resamples
> member of SIR_scheme.
>
> The allocator for the underlying vector of unsigned ints obtains an
> invalid finish ptr at some point, as noted, which causes the program
> to hang when deconstructed.
>
> I don't think this member is touched in my code in the case under
> consideration, but am attempting to isolate where this occurs.
>
> If someone knows what operations can modify the finish ptr of the
> vector allocator for such a vector (some sort of invalid resize
> operation?), that would be helpful to know.

I think the most likely cause is an array overflow or pointer problem corrupting the finish ptr. This could be in Bayes++ but is much more likely to be in your code :-)

Not sure what system you are developing with, but the debugger can help you out here. Certainly the Visual C++ debugger has the ability to set a breakpoint which is triggered when a memory location is modified. You can use this to find any corruption of the finish ptr.

Alternatively, under Linux the valgrind tool is very helpful for finding incorrect memory accesses.

Good luck,
Michael

--
___________________________________
Michael Stevens Systems Engineering

34128 Kassel, Germany
Phone/Fax: +49 561 5218038

Navigation Systems, Estimation and
Bayesian Filtering
http://bayesclasses.sf.net
___________________________________
|
From: yon <yo...@ci...> - 2007-09-15 20:08:55
|
Hello All,

I took the time to remove elements of my program until nothing else remained. A different search strategy would have been smarter.

The program below hangs when the SIR_scheme instance is destroyed. This is a fresh project, cleaned, etc. It hangs when the std::vector allocator is destroying each element of the array, because the array bounds are invalid after SIR_scheme is created.

I didn't have such problems earlier. I'm assuming one of the following has gotten corrupted on my computer: bayes++, STL, boost, or gcc.

Any guesses? My next idea is to rebuild libraries. I'll start with bayes++.

cheers,
Yon

    #include "../Bayes++/BayesFilter/bayesFlt.hpp"
    #include "../Bayes++/BayesFilter/SIRFlt.hpp"

    namespace ublas = boost::numeric::ublas;
    using namespace Bayesian_filter;
    using namespace boost::numeric::ublas;

    struct SIR_random_my : public Bayesian_filter::SIR_random
    {
    public:
        void normal(Bayesian_filter::FM::DenseVec& v) {};
        void uniform_01(Bayesian_filter::FM::DenseVec& v) {};
    };

    int main(void)
    {
        SIR_random_my RandomHelper;
        { Bayesian_filter::SIR_scheme mySirScheme2(9, 13, RandomHelper); }
    }
|
From: yon <yo...@ci...> - 2007-09-08 22:57:34
|
Hello Michael,

Copying this message to the list in case it may be useful to others.

Following up on my previous email, the resamples vector has a valid memory allocation inside the SIR_scheme constructor when it is created, as I am able to verify in the debugger. As soon as I step from that constructor back out to the constructor of the class I derived from SIR_scheme, the _M_finish ptr of resamples' allocator is invalid in the way I described before. Program-wise, nothing occurs in between (not even the constructors of other members are called yet), except that the context in which I am looking at this variable changes.

As noted earlier, when I destroy my class instance, the program is stuck in an infinite loop due to the invalid _M_finish ptr, as I verify in the debugger.

I can't see any way for this to be occurring. I'm going to try to change some project-related settings and rearrange the order of members of the class to see if there is any change in behavior.

Thanks for any advice,
Yon

On Sep 7, 2007, at 12:03 PM, yon wrote:
> Thank you for the feedback and insight.
>
> I can report that the state of the SIR_scheme resamples allocator
> seems to be invalid after construction. My class's constructor
> invokes the SIR_scheme constructor with an s_size argument of 130.
> This argument is passed to the resamples constructor as the s_size
> parameter. Then when control returns to my class's constructor
> after SIR_scheme is constructed, the resamples's allocator's
> _M_finish pointer is equal to the s_size argument of 130 (=0x82)
> instead of _M_start + 130. I don't see any way that I could have
> corrupted this pointer in between when SIR_scheme is constructed
> and when control returns to the class constructor that called it
> (which is inherited from SIR_scheme).
>
> As I noted, when resamples is then destroyed, it is stuck in an
> infinite loop iterating _M_start until an _M_finish value that
> precedes it.
>
> I felt too that the error was more likely to be in my own code, however
> I am at a loss to explain this behavior, or to understand why I did
> not encounter it before. The latter observation would certainly
> point to an error somewhere in my code.
>
> For the reason I mention here, watching that location in memory is
> not too useful, because none of my code is called before the
> object's state appears to be invalidated.
>
> Thanks for any advice.
>
> Regards,
> Yon
|
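The invariant described in the quoted message (the end of the storage should sit exactly s_size elements past the start) can also be checked from inside the program, which gives a second opinion independent of what the debugger displays. The sketch below is only an illustration: bounds_consistent and make_resamples are hypothetical names, the local vector merely stands in for the resamples member, and nothing here uses or assumes any Bayes++ API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: true when the vector reports the expected number of
// elements both through size() and through the distance between its
// iterators. For a healthy std::vector both values always agree; a vector
// whose end pointer had been overwritten would be expected to report some
// other (typically enormous) size.
template <typename T>
bool bounds_consistent(const std::vector<T>& v, std::size_t expected_size)
{
    return v.size() == expected_size
        && static_cast<std::size_t>(v.end() - v.begin()) == expected_size;
}

// Stand-in for constructing the member with s_size elements, as in the post.
std::vector<unsigned long> make_resamples(std::size_t s_size)
{
    return std::vector<unsigned long>(s_size);
}
```

For example, bounds_consistent(make_resamples(130), 130) holds for a healthy vector. Calling such a check at the end of the base-class constructor and again at the start of the derived constructor (assuming the member is accessible there) would show whether the two contexts really see different bounds, or whether only the debugger's view differs.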
From: Michael S. <ma...@mi...> - 2007-09-10 12:49:28
|
On Sunday 09 September 2007 00:57, yon wrote:
> Hello Michael,
>
> Copying this message to the list in case it may be useful to others.
>
> Following up on my previous email, the resamples vector has a valid
> memory allocation inside the SIR_scheme constructor when it is
> created, as I am able to verify in the debugger. As soon as I step
> from that constructor back out to the constructor of the class I
> derived from SIR_scheme, the _M_finish ptr of resamples' allocator is
> invalid in the way I described before. Program-wise, nothing occurs
> between (not even the constructors of other members are called yet),
> except that the context in which I am looking at this variable changes.

Two possibilities.
a) Something in between is changing it. I guess you would have to single-step at the machine code level.
b) The debugger is not showing the corruption until too late.

> As noted earlier, when I destroy my class instance, the program is
> stuck in an infinite loop due to the invalid _M_finish ptr, as I
> verify in the debugger.
>
> I can't see any way for this to be occuring. I'm going to try to
> change some project related settings and rearrange the order of
> members of the class to see if there is any change in behavior.

This is starting to look like a compiler bug. Which compiler are you using? Do you have a small test case? If so, maybe you can send it so we can see whether the problem is reproducible.

All the best,
Michael
|
From: yon <yo...@ci...> - 2007-09-11 21:21:55
|
Hello,

Thank you for the advice. I received some other suggestions in the meantime and haven't found an answer.

I have had little luck with the debugger, including hardware watchpoints on those memory locations. Indeed, it's not clear that the debugger is indicating this problem when it occurs. However, it is clear that the resamples array is in a corrupted state as soon as SIR_scheme is created. I have resisted, but may be forced, to inspect matters at the assembler level.

The following code reproduces the corrupt state for me:

    SIR_random_impl RandomHelper;
    bf::SIR_scheme mySirScheme(6, 10, RandomHelper);

The RandomHelper is not called in the SIR_scheme constructor, so it can't possibly be to blame. Almost any operation on resamples after that breaks it (because the array bounds are completely wrong).

I verified that I can instantiate and manipulate other std::vector<size_t> objects on my computer.

In the SIR_scheme constructor, the only significant operation after resamples is created is the initialization of the wir member (an FMVec type).

The compiler is gcc 4.0.1. My platform is OS X. The dev environment is Xcode, using the gdb debugger and the Xcode debugger.

Thanks for any thoughts.

Best,
Yon
|