> Alternatively, just mark the table as requiring a rehash.
> Actually, just create all HTs as GC-INVARIANT and turn them into
> non-invariant as needed.
I've not followed this thread carefully, but it does sound related
to something that I recall being discussed before.
I gather you're trying to save rehashing by knowing that there's
nothing in the table that needs to be rehashed.
Why not arrange for each table to contain internally a list of
the keys that have to be rehashed?
This seems like a more precise generalization (at the cost of space)
of the scheme you suggest (which stores one bit).
It saves a lot when a large table has a few keys that need to be
rehashed. The cost in space could be limited by a policy that if
there are too many elements that need to be rehashed you just iterate
over the whole table.
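
A rough sketch of the key-list idea, written in Python purely for illustration (it is not how any existing Lisp hash table is implemented). Everything in it is an assumption made up for the example: the `Obj` class stands in for a heap object whose hash is its address, and `Table`, `max_tracked` and `rehash_after_gc` are hypothetical names. The table remembers which keys are address-hashed, re-buckets only those after a gc, and falls back to a full scan once the list would grow past `max_tracked`.

```python
class Obj:
    """Hypothetical heap object; its hash is its address, which a GC may change."""
    def __init__(self, addr):
        self.addr = addr

def key_hash(key):
    # Address-based for Obj keys, value-based (GC-invariant) for everything else.
    return key.addr if isinstance(key, Obj) else hash(key)

def same_key(a, b):
    # Identity for Obj keys, ordinary equality otherwise.
    return a is b or (not isinstance(a, Obj) and not isinstance(b, Obj) and a == b)

class Table:
    def __init__(self, nbuckets=64, max_tracked=8):
        self.buckets = [[] for _ in range(nbuckets)]
        self.max_tracked = max_tracked
        self.tracked = []       # [key, bucket index] for each address-hashed key
        self.overflow = False   # True once there are too many keys to track

    def _index(self, key):
        return key_hash(key) % len(self.buckets)

    def put(self, key, value):
        # Assumes the table is consistent, i.e. rehashed since the last GC.
        i = self._index(key)
        for entry in self.buckets[i]:
            if same_key(entry[0], key):
                entry[1] = value
                return
        self.buckets[i].append([key, value])
        if isinstance(key, Obj) and not self.overflow:
            if len(self.tracked) < self.max_tracked:
                self.tracked.append([key, i])
            else:
                # Policy: stop tracking and scan the whole table from now on.
                self.tracked, self.overflow = [], True

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if same_key(k, key):
                return v
        return default

    def rehash_after_gc(self):
        """Re-bucket entries after a GC; return how many entries were examined."""
        if self.overflow:
            entries = [e for b in self.buckets for e in b]
            self.buckets = [[] for _ in self.buckets]
            for e in entries:
                self.buckets[self._index(e[0])].append(e)
            return len(entries)
        for slot in self.tracked:
            key, old = slot
            new = self._index(key)
            if new != old:
                bucket = self.buckets[old]
                pos = next(p for p, e in enumerate(bucket) if e[0] is key)
                self.buckets[new].append(bucket.pop(pos))
                slot[1] = new
        return len(self.tracked)

# Usage: 1000 GC-invariant keys plus 3 address-hashed ones.
t = Table()
for n in range(1000):
    t.put(n, n * n)
objs = [Obj(1000 + 16 * n) for n in range(3)]
for n, o in enumerate(objs):
    t.put(o, n)
for o in objs:
    o.addr += 24                  # simulated GC moves the objects
examined = t.rehash_after_gc()    # only the tracked keys are examined
```

The single "needs rehash" bit corresponds to `max_tracked=0` in this sketch: the first address-hashed key switches the table to full scans.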
I hope that the code does a lazy form of rehashing, so that if the
table is not accessed between two gc's there's no rehash cost.
Depending on how the tables work, you might arrange that you don't
need to worry about rehashing when you write, only when you read.
This would be especially worthwhile for situations where there's
a large initialization phase in which data is entered into the table
but nothing is retrieved.
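
To make the lazy variant concrete, here is a small Python sketch (again only an illustration under made-up names, not a description of the actual code). `Heap` is a hypothetical stand-in for the collector that just counts gc's, and `LazyTable` compares that count with its own stamp on every read. Writes are queued without hashing, so any number of gc's during a load phase cost the table nothing; the price is the extra space for the queue.

```python
class Obj:
    """Hypothetical heap object; its hash is its address, which a GC may change."""
    def __init__(self, addr):
        self.addr = addr

def key_hash(key):
    # Address-based for Obj keys, value-based (GC-invariant) for everything else.
    return key.addr if isinstance(key, Obj) else hash(key)

def same_key(a, b):
    # Identity for Obj keys, ordinary equality otherwise.
    return a is b or (not isinstance(a, Obj) and not isinstance(b, Obj) and a == b)

class Heap:
    """Stand-in for the collector: moves objects and counts completed GCs."""
    def __init__(self):
        self.gc_count = 0

    def collect(self, objs, shift=4096):
        for o in objs:
            o.addr += shift
        self.gc_count += 1

class LazyTable:
    def __init__(self, heap):
        self.heap = heap
        self.index = {}         # hash code -> list of [key, value] entries
        self.pending = []       # writes that have not been hashed yet
        self.stamp = heap.gc_count
        self.rehashes = 0       # how many real rehashes were performed

    def put(self, key, value):
        # No hashing on write, so a GC between writes needs no rehash.
        self.pending.append([key, value])

    def _insert(self, entry):
        chain = self.index.setdefault(key_hash(entry[0]), [])
        for old in chain:
            if same_key(old[0], entry[0]):
                old[1] = entry[1]
                return
        chain.append(entry)

    def _sync(self):
        # Called on read only: rehash if a GC happened since the last read.
        if self.stamp != self.heap.gc_count:
            if self.index:
                old = [e for chain in self.index.values() for e in chain]
                self.index = {}
                for e in old:
                    self._insert(e)
                self.rehashes += 1
            self.stamp = self.heap.gc_count
        for e in self.pending:
            self._insert(e)
        self.pending = []

    def get(self, key, default=None):
        self._sync()
        for k, v in self.index.get(key_hash(key), []):
            if same_key(k, key):
                return v
        return default

# Usage: a load phase with two GCs and no reads, then a lookup.
heap = Heap()
objs = [Obj(100 + 8 * n) for n in range(50)]
table = LazyTable(heap)
for n, o in enumerate(objs):
    table.put(o, n)
heap.collect(objs)
heap.collect(objs)
first = table.get(objs[7])      # entries are hashed here for the first time
```

After the first lookup, two further gc's followed by a single read trigger one rehash rather than two.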
I'm just about to start experimenting with a large amount of data
where all of this gc performance becomes highly relevant.
The (time) output for the phase that reads the data into a list
of 7.5 million entries:
Real time: 518.6605 sec.
Run time: 389.74 sec.
Space: 6894053224 Bytes
GC: 339, GC time: 90.07 sec.
On input I printed a dot for every 1000 entries.
I noticed what I interpreted as patterns of GC delays.
Is there some way to record when gc's start and end?
Ideally there would be a gc hook that is called after gc but gives you
access to some data stored from before it, such as run time and real time.
It would also be nice to be able to do something (that uses a limited
amount of space) before the gc, like print a message.
I already had to move to the biggest machine I have available.
Current image ~600MB
Any advice is welcome.