Amitav Mohanty wrote:
I am interested in contributing to SBCL, but I am no longer a student, so I
can't go through GSoC. However, since I am beginning to work on SBCL, I
will need some guidance. Yesterday we talked about it on #sbcl too. We
talked about a GSoC student getting preference, and I agree with that.
I am interested in the task 'stronger hash functions and specialised hash
tables'. Robert would like to join in and both of us would like to work on
it. As we are both engaged by our jobs, we would not be bound to a
timeline; but we are interested in contributing.
Robert has already done some work related to hashes.
I'm CCing Aenik Shah, as he's a student interested in the project.
Discussions on #sbcl seemed to converge on two points:
1. Implementing a strong block hash function and hacking it for incremental hashing would be useful. We could use the function for strings and bit vectors, but also for object graphs: feeding data to a block hash function (rather than mixing fixnum-sized hash values) should result in a higher-quality distribution. I expect assembly code implementations of the inner loop will eventually be interesting, but getting something working in CL would be a good first step, and doesn't even have to be tightly integrated with SBCL.
2. We want to improve the hash function for immediate data (characters, fixnums, pointer addresses, etc.), but speed doesn't seem to be too much of an issue: currently, only sxhash of symbols has an impact, and that's handled as a cache lookup. For everything else, the common use case (gethash/(setf gethash)) goes through a full call to the hash function, which then dispatches on the object's type before computing a hash value. So it's not too bad if hashing immediate data (after type dispatch) becomes two or three times slower, as long as the distribution of hash values is significantly improved in return.
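To make the two points a bit more concrete, here is a rough sketch in portable CL of what "feeding data to an incremental hash" could look like. It is only an illustration, not a proposal for the actual SBCL code: the mixing steps are borrowed from the 32-bit MurmurHash3 round and finalizer, each "block" is simplified to a single 32-bit word (so no buffering of partial blocks is needed), and every name in it (hash-state, feed-word, feed-string, feed-object, finish, fmix32) is made up for this example. The type tags fed by feed-object are arbitrary too.

```lisp
;; Sketch only: Murmur3-style mixing, one 32-bit word per "block".
;; All names here are hypothetical; nothing below is SBCL API.

(declaim (inline u32* rotl32))
(defun u32* (a b)
  "Multiply modulo 2^32."
  (ldb (byte 32 0) (* a b)))

(defun rotl32 (x n)
  "Rotate the 32-bit value X left by N bits (0 < N < 32)."
  (ldb (byte 32 0) (logior (ash x n) (ash x (- n 32)))))

(defstruct hash-state
  (h 0 :type (unsigned-byte 32))    ; running state; initial value is the seed
  (count 0 :type unsigned-byte))    ; number of words fed so far

(defun feed-word (state word)
  "Mix the low 32 bits of WORD into STATE and return STATE."
  (let* ((k (u32* (rotl32 (u32* (ldb (byte 32 0) word) #xcc9e2d51) 15)
                  #x1b873593))
         (h (rotl32 (logxor (hash-state-h state) k) 13)))
    (setf (hash-state-h state) (ldb (byte 32 0) (+ (* h 5) #xe6546b64)))
    (incf (hash-state-count state))
    state))

(defun fmix32 (h)
  "Finalizer: a bijective mixer on 32-bit values."
  (setf h (logxor h (ash h -16)))
  (setf h (u32* h #x85ebca6b))
  (setf h (logxor h (ash h -13)))
  (setf h (u32* h #xc2b2ae35))
  (logxor h (ash h -16)))

(defun finish (state)
  "Return a 32-bit hash value; STATE is left untouched and can be fed further."
  (fmix32 (logxor (hash-state-h state)
                  (ldb (byte 32 0) (* 4 (hash-state-count state))))))

(defun feed-string (state string)
  "Feed each character code of STRING as one word."
  (loop for ch across string
        do (feed-word state (char-code ch)))
  state)

(defun feed-object (state object)
  "Feed a non-circular object graph: a tag word per node, then its contents."
  (etypecase object
    (null (feed-word state 0))
    (integer
     (feed-word state 1)
     ;; Feed the integer 32 bits at a time, including one sign word.
     (loop for pos from 0 to (integer-length object) by 32
           do (feed-word state (ldb (byte 32 pos) object))))
    (character
     (feed-word state 2)
     (feed-word state (char-code object)))
    (string
     (feed-word state 3)
     (feed-string state object))
    (cons
     (feed-word state 4)
     (feed-object state (car object))
     (feed-object state (cdr object))))
  state)
```

With that, hashing an object graph is a matter of threading one state through the traversal, e.g. (finish (feed-object (make-hash-state :h 42) '(1 #\a "bc"))), and a string fed in several pieces hashes the same as the string fed at once. For point 2, something like fmix32 applied directly to a fixnum or a character code after type dispatch is the kind of mixer I have in mind for immediate data; whether this particular one is good enough would have to be measured.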
I'm not sure how you could best collaborate with a GSoC student. I suppose it would be reasonable to coordinate on implementing hash functions in CL. After that, it will clearly depend on the direction the student wishes to take.