Hi, the reason I'm emailing you is that your program appears to have an internal implementation of an XML parser. As you may or may not know a Hash DoS vulnerability has been found to affect a large number of programming languages (C, C++, Java, PHP, Python. Ruby, etc.) and a large number of programs that implement data arrays using hash functions to generate key values from user supplied input (especially in the XML space). If you've already found and fixed this vulnerability in your code please let me know (I wasn't able to find a fix, but that doesn't mean one hasn't been committed already).
So what is the problem? Put simply an attacker can provide a specially formatted XML file with a few hundred or thousand entries that will cause a pathological worst case behavior for array insertions thatwill consume several seconds or possibly even minutes of CPU time resulting in a denial of service (DoS).
However, strictly speaking, not all XML implementations are vulnerable to exploitation. For example an implementation that only reads an administrator supplied XML configuration file are highly unlikely to be attacked (since the attacker would need administrative access and could likely cause the program to do bad things in other ways as well). Also depending on the language the applications is written in and how the implementation is done it may be addressed by a language update (e.g. Python).
So what I am trying to find out is:
1) Can users pass data to your XML parser (or is it used on trusted data or internally generated data only)?
2) Does your implementation of hashing the key values use randomness, or is it deterministic (e.g. does it simply hash the key with MD5 or something else that will always produce the same hash value for the same input)? If it uses randomness can you let me know how (e.g. a function name or something that I can check quickly).
I would appreciate answers either way (if you are vulnerable or not vulnerable or it is unknown), that way I can cross you off the list if not vulnerable. If you need help researching this I'm more than willing to provide assistance.
Please note that while the following URLs have some information on this class of issue they do not list every single vulnerable language and application (the original focus was on web services, it was only later found to affect XML and other parsers that handle user serializable data. Also if you would like any assistance in correcting this issue please feel free to poke me back.
It should be noted that fixing this issue is relatively simple, just add a small random seed so that the hashed key values are not predictable. Please note that the level of randomness needed is not huge, for example using srand(time(NULL)) on a typical Linux system gives you 2^32 possible values (about 4 billion) which makes pre-computation and testing of collision values nearly impossible (the amount of testing to find a correct value would in itself constitute a significant flood of data). This is in fact the exact solution used by libxml2 to correct this vulnerability.
http://www.ocert.org/advisories/ocert-2011-003.html - #2011-003
multiple implementations denial-of-service via hash algorithm collision
https://rhn.redhat.com/errata/RHSA-2012-0324.html - libxml2 security
- - Effective DoS attacks against Web Application Plattforms ? #hashDoS
http://en.wikipedia.org/wiki/Hash_table - Hash table explanation