From: Christoph B. <bar...@or...> - 2006-10-24 19:40:10
On Tuesday, 24 October 2006 at 20:28, Julian Seward wrote:
> > > If that's true one simple fix would be the usual hack of moving
> > > an ExeContext in a chain one step forwards for each successful
> > > search. Various other tables already do that.
> >
> > You could also store the full hash value along with the list entry. This
> > adds another simple equality test (at the cost of some memory) before
> > you start to iterate up the stack trace doing the expensive full
> > comparison and should also be simple to add in.
>
> Er .. no. Each list contains elements with the same hash value -
> the only point of the initial hash value is to decide which list
> to use. I did spot that the hash computation isn't 64-bit clean,
> though, which can't help.
>
> hash = 0;
> for (i = 0; i < n_ips; i++) {
> hash ^= ips[i];
> hash = (hash << 29) | (hash >> 3);
> }
> hash = hash % N_EC_LISTS;
You are right. For the first 30,009 allocations in my testcase the buckets
are:
Bucket: 3770 Count: 1
Bucket: 4023 Count: 1
Bucket: 4131 Count: 1
Bucket: 4143 Count: 1
Bucket: 4497 Count: 1
Bucket: 6757 Count: 2400
Bucket: 13338 Count: 1
Bucket: 14909 Count: 1
Bucket: 19312 Count: 1
Bucket: 21097 Count: 25799
Bucket: 23305 Count: 1
Bucket: 23854 Count: 1800
Bucket: 25883 Count: 1
Changing the line to
hash = (hash << 61) | (hash >> 3);
leads to a better distribution of the values and a better runtime for the
testcase. However, I do not know whether a rotation by 3 really makes a good
hash function for 64-bit words.
Additionally, why is no dynamic hash map used, i.e. one that expands the
number of buckets whenever the density gets too high?
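Such a growable chained table might look like the sketch below. The growth
threshold (average chain length of 2), the power-of-two doubling policy, and
all the names are illustrative assumptions, not anything Valgrind does:

```c
#include <stdint.h>
#include <stdlib.h>

/* One chained entry; a real ExeContext would carry the stack trace. */
typedef struct Entry { uint64_t key; struct Entry* next; } Entry;

typedef struct {
   Entry**  buckets;
   unsigned n_buckets;   /* always a power of two */
   unsigned n_entries;
} Table;

static void table_init(Table* t, unsigned n)
{
   t->buckets   = calloc(n, sizeof(Entry*));
   t->n_buckets = n;
   t->n_entries = 0;
}

/* Double the bucket count and rehash every entry into the new array. */
static void table_grow(Table* t)
{
   unsigned n_new = t->n_buckets * 2;
   Entry**  b_new = calloc(n_new, sizeof(Entry*));
   for (unsigned i = 0; i < t->n_buckets; i++) {
      Entry* e = t->buckets[i];
      while (e) {
         Entry*   next = e->next;
         unsigned j    = (unsigned)(e->key & (n_new - 1));
         e->next  = b_new[j];
         b_new[j] = e;
         e = next;
      }
   }
   free(t->buckets);
   t->buckets   = b_new;
   t->n_buckets = n_new;
}

static void table_insert(Table* t, uint64_t key)
{
   /* grow when the average chain length would exceed 2 */
   if (t->n_entries >= 2 * t->n_buckets)
      table_grow(t);
   unsigned j = (unsigned)(key & (t->n_buckets - 1));
   Entry*   e = malloc(sizeof(Entry));
   e->key  = key;
   e->next = t->buckets[j];
   t->buckets[j] = e;
   t->n_entries++;
}
```

The cost is an occasional full rehash pass, but amortized over the doubling
each insertion stays O(1), and chain lengths stay bounded regardless of how
many ExeContexts a run creates.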
Christoph