predictable hash function
Brought to you by:
artyom-beilis
It's not really a bug, but I think the docs should mention somewhere that hash_map and the modules based on it (e.g. caching) are not safe for arbitrary user-provided data: the hash function is predictable, which makes them vulnerable to hash-DoS attacks.
PoC:
>>> def weinberg_hasher(inp):
...     h = 0
...     for c in inp:
...         h = (h << 4) + ord(c)
...         high = h & 0xF0000000
...         if high != 0:
...             h = h ^ (high >> 24) ^ high
...     return h & 0xFFFFFFFF
...
>>> res = [weinberg_hasher("abc" * len_) for len_ in range(10000)]
>>> len(res)  # values
10000
>>> len(set(res))  # distinct values
5
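To show why so few distinct hash values matter, here is a rough sketch of how those keys would spread over the buckets of a chained hash table. It reuses the hasher from the PoC. The bucket count (NUM_BUCKETS) and the "hash modulo bucket count" placement are assumptions for illustration only, not taken from the actual hash_map implementation.

```python
from collections import Counter


def weinberg_hasher(inp):
    """Hasher from the PoC (a PJW/Weinberger-style string hash)."""
    h = 0
    for c in inp:
        h = (h << 4) + ord(c)
        high = h & 0xF0000000
        if high != 0:
            # Fold the top nibble back into the low bits, then clear it.
            h = h ^ (high >> 24) ^ high
    return h & 0xFFFFFFFF


# Hypothetical table size; the real hash_map may size its table differently.
NUM_BUCKETS = 1024

# Same attacker-chosen keys as in the PoC.
keys = ["abc" * n for n in range(10000)]

# Count how many keys land in each bucket (bucket index = hash % NUM_BUCKETS).
chain_lengths = Counter(weinberg_hasher(k) % NUM_BUCKETS for k in keys)

used_buckets = len(chain_lengths)
longest_chain = max(chain_lengths.values())
print(used_buckets, "buckets used, longest chain:", longest_chain)
```

Since the 10000 keys produce only 5 distinct hash values, they can occupy at most 5 buckets whatever the table size, so at least one chain holds 2000 or more entries. Every lookup or insert that hits such a chain degrades from roughly constant time to a linear scan.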
Anonymous
Interesting point to think about