Once we start the project, the GSoC student can be bound to deliverables
that are possible within the time frame. However the full scope of the
project might take more. I can work on those parallely.
If I can be kept in the loop for discussions and I can do more detailed
analysis and I hope we can find ways we can collaborate with each other.
On Tue, Apr 23, 2013 at 5:22 AM, Paul Khuong <pvk@...> wrote:
> Amitav Mohanty wrote:
>> I am interested in contributing to SBCL; but I am no more a student so I
>> can't go through GSoC. However, since I am beginning to work on SBCL, I
>> will need some guidance. Yesterday we talked about it at #sbcl too. We
>> talked about a GSoC student getting preference and I agree to that.
>> I am interested in the task 'stronger hash functions and specialised hash
>> tables'. Robert would like to join in and both of us would like to work on
>> it. As we are both engaged by our jobs, we would not be bound to a
>> timeline; but we are interested in contributing.
>> Robert already has done some work related to hashes
> I'm CCing Aenik Shah, as he's a student interested in the project.
> Discussions on #sbcl seemed to converge on two points:
> 1. Implementing a strong block hash function and hacking it for
> incremental hashing would be useful. We could use the function for strings
> and bit vectors, but also for object graphs: feeding data to a block hash
> function (rather than mixing fixnum-sized hash values) should result in a
> higher-quality distribution. I expect assembly code implementations of the
> inner loop will eventually be interesting, but getting something working in
> CL would be a good first step, and doesn't even have to be tightly
> integrated with SBCL.
> 2. We want to improve the hash function for immediate data (characters,
> fixnums, pointer addresses, etc.), but speed doesn't seem to be too much of
> an issue: curently, only sxhash of symbols has an impact, and that's
> handled as a cache lookup. For everything else, the common use case
> (gethash/(setf gethash)) goes through a full call to the hash function,
> which then dispatches on the object's type before computing a hash value.
> So it's not too bad if hashing immediate data (after type dispatch) is up
> to twice or thrice as slow, as long as the distribution of hash values is
> significantly improved in return.
> I'm not sure how you could best collaborate with a GSoC student. I suppose
> it would be reasonable to try and coordinate to implement hash functions in
> CL. After that, it will clearly depend on the direction the student will
> wish to work in.
> Paul Khuong