From: Joe S. <st...@sa...> - 2003-03-20 16:07:43

(a)
Has anyone had experience debugging SystemC programs with valgrind?
Were there any unexpected problems?

(b)
Can anyone suggest what I should do next with the following problem?

I have a SystemC program which maintains a number of queues, implemented
as C++ list<packet *>. gdb reveals that after the simulation has been
running for a considerable time, I create and initialise a packet and
put it on the back of one queue, but a packet with the same pointer is
already at the front of a quite different queue and has its contents
overwritten. (All queues and packets are in storage around 0x8,000,000.)

This looks like a classic case of premature deletion, so I try to find
it with valgrind. But valgrind produces an "invalid read" error a
little bit earlier in the simulation, and gdb shows all the packets in
the queue about to be clobbered have now been moved to invalid addresses
around 0x43,000,000. (Valgrind guesses that this address is "on thread
1's stack", but I'm not sure that's reliable.)

Running without valgrind shows, according to gdb at the relevant moment,
no sign of this relocation.

So I reckon that if the packets have turned up in the new place,
something (presumably with access rights) must have put them there. So
I try again with the offending address made invalid by a
VALGRIND_MAKE_NOACCESS call, hoping to catch whatever it is that wrote
them in the first place. But the program fails in the same place as
before (though this time the error is because I've declared the address
invalid, not because valgrind thought it was invalid anyway). The
packets are all there at the new place, with their proper contents; but
whatever it is that wrote them there snuck in under my prohibition.

Sorry to go on at length, but I'm a bit stuck. I'd be grateful for any
suggestions . . .

joe stoy

From: Nicholas N. <nj...@ca...> - 2003-03-20 18:32:13

On Thu, 20 Mar 2003, Joe Stoy wrote:

> (a)
> Has anyone had experience debugging SystemC programs with valgrind?
> Were there any unexpected problems?

I have no experience, but there shouldn't be any problems, since Valgrind
works in principle with programs written in any language (although it can
do a bit more with C and C++ programs).

> (b)
> Can anyone suggest what I should do next with the following problem?

I don't have much to say about this, except maybe GDB watchpoints could
be useful? Don't know.

N

From: Julian S. <js...@ac...> - 2003-03-20 23:18:46
Don't I know your name in connection with denotational semantics,
or something?
> (a)
> Has anyone had experience debugging SystemC programs with valgrind?
> Were there any unexpected problems?
Umm ... what's systemC ?
> (b)
> Can anyone suggest what I should do next with the following problem?
>
> I have a SystemC program which maintains a number of queues, implemented
> as C++ list<packet *>. gdb reveals that after the simulation has been
> running for a considerable time, I create and initialise a packet and
> put it on the back of one queue, but a packet with the same pointer is
> already at the front of a quite different queue and has its contents
> overwritten. (All queues and packets are in storage around 0x8,000,000.)
>
> This looks like a classic case of premature deletion, so I try to find
> it with valgrind. But valgrind produces an "invalid read" error a
> little bit earlier in the simulation, and gdb shows all the packets in
> the queue about to be clobbered have now been moved to invalid addresses
> around 0x43,000,000. (Valgrind guesses that this address is "on thread
> 1's stack", but I'm not sure that's reliable.)
>
> Running without valgrind shows, according to gdb at the relevant moment,
> no sign of this relocation.
>
> So I reckon that if the packets have turned up in the new place,
> something (presumably with access rights) must have put them there. So
> I try again with the offending address made invalid by a
> VALGRIND_MAKE_NOACCESS call, hoping to catch whatever it is that wrote
> them in the first place. But the program fails in the same place as
> before (though this time the error is because I've declared the address
> invalid, not because valgrind thought it was invalid anyway). The
> packets are all there at the new place, with their proper contents; but
> whatever it is that wrote them there snuck in under my prohibition.
>
> Sorry to go on at length, but I'm a bit stuck. I'd be grateful for any
> suggestions . . .
3 comments.
1. Increase valgrind's --freelist-vol parameter to as large a number
as you can. The manual describes what it is and why this might
be useful. (do valgrind --skin=memcheck --help, assuming you
are using v-1.9.4).
2. If you are doing this VALGRIND_MAKE_NOACCESS with valgrind version
1.1.0 or later, we recently found a problem: the file to include is
"memcheck.h", not "valgrind.h". The really bad thing is that
including "valgrind.h" still seems to work, but strange things
happen at run time. I've put in a fix to both the head and
2_0_BRANCH to cause a compilation error if you include valgrind.h
directly, and that will be in 1.9.5.
3. As Nick rightly points out, make friends with GDB's hardware
watchpoints if you haven't already. In summary,
watch *(int*) 0xAddressToWatch
They've saved my ass on various occasions since I learnt of them.
J
From: Joe S. <st...@sa...> - 2003-03-21 15:35:20

Julian Seward wrote:

> Don't I know your name in connection with denotational semantics, or
> something?

Yes, that's me. Seems a while ago now.

> Umm ... what's systemC ?

It's a system for simulation of hardware and other designs, built on top
of C++. From http://www.systemc.org:

  "SystemC is the standard design and verification language built in
  C++ that spans from concept to implementation in hardware and
  software. Designers design and verify using SystemC and standard
  ANSI C++. EDA vendors create tools that are automatically
  interoperable."

I had written:

>> Can anyone suggest what I should do next with the following
>> problem?

I have now sorted this out, thank you. It was indeed a classic
premature-deletion bug.

I think Valgrind comes out with an "A-double-minus" grade. It gets a
basic A because it did indeed detect the invalid use of memory which had
already been freed. But it doesn't quite get a straight A because:

1. It got confused about the error message (presumably because of the
multi-threading). It ought to have told me simply that it was memory
in the free store which I'd already deallocated; instead it went on
about how it was from another thread's stack area.

2. My setting the address invalid, by VALGRIND_MAKE_NOACCESS, seemed
not to affect my claiming it, using it and releasing it; it was only
when I used it after that that the error message told me it was
because I had invalidated it. (Maybe that's part of the bug Julian
mentions in his message -- I had indeed included "valgrind.h", not
"memcheck.h".)

3. The documentation emphasises that the system is designed to be "as
non-intrusive as possible". I did not therefore realise that it
nevertheless alters all the memory addresses used. This should
perhaps be emphasised -- after all, if you're investigating memory
usage errors, addresses tend to be relevant.

Julian and Nick gave me helpful advice about gdb (some of which I was
already doing). But the tricky part was hunting down where I had
prematurely deleted the item. Two ways came to mind:

1. If the program is already printing lots of diagnostic output, run
with Valgrind's --trace-malloc=yes. Then you can search for the
delete, and see which of your own messages surround it.

2. If you know the type of the errant object, you can write code in
its destructor to break for the case you're looking for.

But these are both a little bit clunky. Perhaps a Valgrind option to do
it more cleanly would be nice. I'm not sure.

Anyway, thanks again for your help.

joe stoy
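
For anyone finding this thread later, the destructor approach (the second
of the two ways above) might look roughly like the sketch below. This is
not the code from the actual program: the packet layout, g_watched,
packet_delete_trap and run_demo are all hypothetical names made up for
illustration, and it assumes the suspect address is already known (for
example from an earlier gdb session on a deterministic run).

```cpp
#include <cassert>
#include <cstdio>
#include <list>

struct packet;

// Hypothetical debugging hooks; not part of SystemC, Valgrind or gdb.
static const packet* g_watched = nullptr;  // address under suspicion
static int g_trap_hits = 0;                // how often the trap fired

// Put a gdb breakpoint on this function ("break packet_delete_trap");
// the backtrace at that point shows who is deleting the packet.
void packet_delete_trap(const packet* p) {
    ++g_trap_hits;
    std::fprintf(stderr, "deleting watched packet at %p\n",
                 static_cast<const void*>(p));
    g_watched = nullptr;  // one-shot: ignore later reuse of the address
}

struct packet {
    int payload = 0;  // stand-in for the real packet contents
    ~packet() {
        // Only the packet we are hunting triggers the trap.
        if (this == g_watched) packet_delete_trap(this);
    }
};

// Small driver: queue a packet, then delete it while it is still queued.
int run_demo() {
    g_trap_hits = 0;
    std::list<packet*> queue_a;
    packet* p = new packet;
    queue_a.push_back(p);
    g_watched = p;     // in real use, set this from the debugger
    delete p;          // the "premature" delete; queue_a still holds p
    queue_a.clear();   // drop the stale pointer without dereferencing it
    return g_trap_hits;
}
```

Making the trap one-shot matters because the allocator is free to hand
the same address out again, which is exactly what made the original bug
show up as two queues sharing a pointer.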