|
From: Nicholas N. <nj...@ca...> - 2004-07-27 20:16:53
|
Hi,
I just found a bug in the skiplist implementation; you're going to love
this.
A skiplist node looks like this:
struct _SkipNode {
UInt magic;
UShort level; /* level is the max level (level == 0 means 1 next pointer) */
SkipNode *next[0];
};
Part of VG_(SkipNode_Alloc)() looks like this:
n->level = h;
n->magic = SKIPLIST_MAGIC;
while (h-- >= 0)
n->next[h] = NULL;
Seems reasonable, right? Er, actually, that loop iterates from h-1 to -1.
Bad use of post-decrementing. Whoops.
So why are things still working? Because next[-1] is actually 'level'. So
the 'level' field of every SkipNode is zero. Hmm, there goes our nice
logarithmic lookups; the skiplist has effectively degenerated into a
linked list... or something like that; I didn't investigate to work out
exactly what is happening.
So how did I find this? Because on x86-64, the 8-byte pointers mean that
next[-1] clobbers both 'level' and 'magic' (a canary word) and so it
complains about 'magic' being incorrect.
The obvious fix is to change the loop to this:
while (h >= 0) {
n->next[h] = NULL;
h--;
}
except then there's some other problem and a seg fault occurs. I'm
guessing that there's a bug in the skiplist code that has never been
exposed because the levels are always zero. I can't be bothered looking
for it now, I'm hoping someone else will be inspired. Ripping the
skiplist code out into a stand-alone program and running Memcheck on it
might be the easiest thing to do.
[I've been thinking that we should do that for a number of parts of the
systems, the parts that can be easily separated -- eg. the skiplist, the
hashtable, the allocator, etc -- somehow as part of standard testing.
Not that it would have found the above bug...]
N
|
|
From: Jeremy F. <je...@go...> - 2004-07-27 21:04:37
|
On Tue, 2004-07-27 at 21:16 +0100, Nicholas Nethercote wrote:
> Hi,
>
> I just found a bug in the skiplist implementation; you're going to love
> this.
>
> A skiplist node looks like this:
>
> struct _SkipNode {
> UInt magic;
> UShort level; /* level is the max level (level == 0 means 1 next pointer) */
> SkipNode *next[0];
> };
>
> Part of VG_(SkipNode_Alloc)() looks like this:
>
> n->level = h;
> n->magic = SKIPLIST_MAGIC;
> while (h-- >= 0)
> n->next[h] = NULL;
>
> Seems reasonable, right? Er, actually, that loop iterates from h-1 to -1.
> Bad use of post-decrementing. Whoops.
>
> So why are things still working? Because next[-1] is actually 'level'. So
> the 'level' field of every SkipNode is zero. Hmm, there goes our nice
> logarithmic lookups; the skiplist has effectively degenerated into a
> linked list... or something like that; I didn't investigate to work out
> exactly what is happening.
Oh, nasty. I'm pretty sure I'd checked that some actual skipping was
going on, but maybe not.
Anyway, I'll take a look.
J
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-28 11:07:27
|
On Tue, 27 Jul 2004, Jeremy Fitzhardinge wrote: > Oh, nasty. I'm pretty sure I'd checked that some actual skipping was > going on, but maybe not. > > Anyway, I'll take a look. Your commit (despite its less-than-inspiring log message :) seems to have worked, thanks. I now have the following invocations working correctly: valgrind --version valgrind --help valgrind --help-debug valgrind --tool=none --help valgrind --tool=none --help-debug Next step is to get init_baseBlock() working... N |