predictable hash function
Brought to you by:
artyom-beilis
It's not really a bug, but I think the docs should mention somewhere that hash_map and the hash_map-based modules (e.g. caching) are not safe for arbitrary user input or user-provided data: the hash function is predictable, which makes them vulnerable to hash-DoS attacks.
PoC:
>>> def weinberg_hasher(inp):
...     h = 0
...     for c in inp:
...         h = (h << 4) + ord(c)
...         high = h & 0xF0000000
...         if high != 0:
...             h = h ^ (high >> 24) ^ high
...     return h & 0xFFFFFFFF
...
>>> res = [weinberg_hasher("abc" * len_) for len_ in range(10000)]
>>> len(res) # values
10000
>>> len(set(res)) # distinct values
5
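To make the impact more concrete, here is a rough sketch that compares the longest bucket chain in a toy chained hash table for the predictable hash and for a keyed hash. The `keyed_hasher` and `longest_chain` helpers, the bucket count of 1024, and the use of keyed BLAKE2b are my own assumptions for illustration; this is not how hash_map works internally and not a proposed patch. The key count is smaller than in the PoC only to keep the run time short.

```python
import hashlib
import os
from collections import Counter


def weinberg_hasher(inp):
    """Same hash as in the PoC above."""
    h = 0
    for c in inp:
        h = (h << 4) + ord(c)
        high = h & 0xF0000000
        if high != 0:
            h = h ^ (high >> 24) ^ high
    return h & 0xFFFFFFFF


def keyed_hasher(inp, key):
    """Hypothetical alternative: BLAKE2b keyed with a secret (illustration only)."""
    digest = hashlib.blake2b(inp.encode("utf-8"), digest_size=4, key=key).digest()
    return int.from_bytes(digest, "big")


def longest_chain(keys, hash_fn, buckets=1024):
    """Number of keys in the fullest bucket of a toy chained hash table."""
    counts = Counter(hash_fn(k) % buckets for k in keys)
    return max(counts.values())


# Attacker-style input: same shape as in the PoC, fewer keys.
keys = ["abc" * n for n in range(1000)]

# Assumption: the secret is generated once per process and never exposed.
secret = os.urandom(16)

print("predictable hash, longest chain:", longest_chain(keys, weinberg_hasher))
print("keyed hash, longest chain:      ",
      longest_chain(keys, lambda k: keyed_hasher(k, secret)))
```

The longer the worst chain, the closer each lookup or insert gets to a linear scan. With a keyed hash an attacker who does not know the secret cannot precompute colliding inputs offline, which is the usual mitigation for this class of attack.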
Anonymous
Interesting point to think about