#2487 HashDoS fix 3498482

Security
closed-fixed
5
2012-09-14
2012-06-15
Tomas Hoger
No

This is a follow-up on:
https://sourceforge.net/tracker/?func=detail&atid=712784&aid=3498482&group_id=128809

created as new bug as SF.net tracker does not seem to allow new comments on closed bugs.

The fix that was applied to address that bug is here:
http://sblim.cvs.sourceforge.net/viewvc/sblim/jsr48-client/src/org/sblim/cimclient/internal/cimxml/sax/NodeFactory.java?view=log#rev1.7

My questions are:

- Does it address the issue? The fix adds a randomized suffix to keys used to access NodeFactory.NODENAME_HASH HashMap. The suffix is constant during the life time of the NODENAME_HASH, all keys get the same suffix. However, String.hashCode() has a property that if key1.hashCode() equals key2.hashCode(), hash code of key1+suffix equals to key2+suffix for any suffix. Given that, this randomized suffix approach should not help avoiding collisions.

- Is it a real issue? NODENAME_HASH only seems to be written to in initNodeNameHash(). Additionally, initNodeNameHash() only seems to be called from a single place, where fixed (hard-coded in the source) String[] array is passed to it to populate NODENAME_HASH. Is there some other use I missed where inputs from (untrusted) XML gets stored to NODENAME_HASH?

Discussion

  • Dave Blaschke

    Dave Blaschke - 2012-06-15

    - I doubt it perfectly addresses the issue, but it supposedly addresses the issue enough to satisfy the submitter's intent for opening the original bug. The perfect solution is to fix the underlying Java code, but Oracle can't fix Hashmaps because the Java spec defines how hashing must take place. According to the developer at Red Hat who raised the original bug against the Java CIM Client:

    "One solution would be to append a random string to each key, the same random string would work (effectively you're adding the random seed in the data rather than in the hash array). Remember, it doesn't need to be super strong, just
    enough to defeat precomputed attacks. Something like Random() should be plenty."

    The accepted solution was based on this logic, using Random(current time). While it won't prevent collisions for a single instance of the client, it will prevent incoming XML keywords from hashing to the exact same value on subsequent instances. I was under the impression that this is what was trying to be avoided - a set of XML keywords always hashing to the same value, making it possible for an attacker to construct XML that hits on a single entry over and over again, leading to DoS.

    - NODENAME_HASH is populated with the String[] of CIM keywords and is used to lookup (untrusted) XML keywords to see if they are valid CIM keywords or not. If the hash DoS attack only involves adding (writing) untrusted XML to a Hashmap, then that is not the case here - the untrusted XML is only used to read from the Hashmap.

    I did not get a lot of feedback from Red Hat on this issue as the submitter was way too busy, so something might have got lost in the details. They originally said the problem was in CIMXMLParserImpl.java because it uses arrays and the arrays use hashing internally, but I could not find what they were talking about in CIMXMLParserImpl.java. The only possible issue I found was in NodeFactory.java, and the conversation concerning that is in the original bug.

     
  • Dave Blaschke

    Dave Blaschke - 2012-06-15
    • assigned_to: nobody --> blaschke-oss
     
  • Tomas Hoger

    Tomas Hoger - 2012-06-15

    After discussions with Kurt, we seem to agree that random suffix does not actually help. Using random prefix would not help either given the Java hashCode() implementation, even though that seems to be rather equivalent to using random seed.

    This issue is only relevant when you're adding sufficiently large amount of untrusted inputs to the hash. Sufficiently large means that it's the amount that hash is normally expected to handle gracefully (e.g. thousands of entries), but hash function collisions make them all end up in a single bucket. After that, any further attempt to either put or get finds right bucket quickly, but then it needs to do linear search doing full compare of all colliding keys.

    If NODENAME_HASH is only populated with small / fixed amount of trusted entries, it's not affected by this problem. Doing look-ups using untrusted strings is not an issue, as that does not lead to unexpected growth of the specific hash bucket.

     
  • Dave Blaschke

    Dave Blaschke - 2012-06-19

    Thank you for the explanation. Unless you indicate otherwise, I will use this bug to back out the fix for bug 3498482.

    Is there any chance you might know where Kurt thought the original vulnerability was in CIMXMLParserImpl.java "which uses arrays, the arrays use hashing internally?" I would really like to know where the vulnerability is to see if there's anything I can do to lessen/eliminate it.

     
  • Dave Blaschke

    Dave Blaschke - 2012-06-19
    • status: open --> open-accepted
     
  • Tomas Hoger

    Tomas Hoger - 2012-07-26

    I agree that removing NodeFactory.NODENAME_HASH randomization is ok given the explanation above.

    I had a quick look at other Hastable and HashMap uses to see if any other may be relevant. Here's my list, review from someone more familiar with this code is definitely appreciated.

    Hashtable:

    - HttpHeader.iFields - HTTP headers, number probably not limited. HTTP headers were one of the attack vectors for e.g. Tomcat. Unlimited number of headers stored in the Hashtable can have memory usage consequences, so might be worth limiting regardless of the hashdos issues. I think options here are:
    * limit number of headers processed per HTTP request. After the limit is reached, either abandon request or ignore subsequent headers.
    * if code ever uses certain fixed set of headers, header parser may skip all unknown headers and only store those few that will get used. This may not be an option if headers can be used by applications using the library.

    - CIMMessage.iElements - not used anywhere?

    HashMap:

    - ServiceTable.iMap - at max 1024 keys (0x1000 + 0 .. 0x1000 + 1023), ok

    - WBEMListenerImpl.iPortMap - listener ports, this should be trusted input I believe, and hence ok

    - NodeFactory.cParserMap - fixed hard-coded set of strings used as keys, ok

    - NodeFactory.NODENAME_HASH - previously discussed, fixed hard-coded set of strings used as keys, ok

    - NodePool.iPoolMap - only seems to use NODENAME_HASH items as keys, ok

    - CIMObjectFactory.cTypeStrMap - fixed hard-coded set of strings used as keys, ok

    - WBEMServiceAdvertisementSLP.iAttributeMap - seems to process untrusted input, possible issue

    - AdvertisementCatalog.iCatalogByDirectory, AdvertisementCatalog.iCatalogById, AdvertisementCatalog.addAdvertisements.innerMap - unsure, but it seems it might use untrusted keys

    Note that it seems this may get fixed in Java after all, some fixes are expected in upcoming 1.7.0 updates from Oracle.

    HTH

     
  • Dave Blaschke

    Dave Blaschke - 2012-07-31

    Thank you so much for looking through the code. I have examined the code you highlighted in the previous comment, and here are my observations:

    Hashtable:

    - HttpHeader.iFields - Stores unlimited unique HTTP header fields (name-value pairs) so a multitude of unique fields can lead to memory issues; limiting number of fields is preferred solution since HttpHeader is public

    - CIMMessage.iElements - Not used dating back to oldest available code (1.2.1.1), so ok

    HashMap:

    - ServiceTable.iMap - Stores counters for IPv6 SLP services that controls joining/leaving groups (maximum 1024 keys), agree ok

    - WBEMListenerImpl.iPortMap - Stores listener ports created via Java API and NOT via untrusted HTTP/XML/other data, agree ok

    - NodeFactory.cParserMap - Stores fixed set of CIM keywords to map to Java node objects, agree ok

    - NodeFactory.NODENAME_HASH - Stores fixed set of CIM keywords for parsing XML, agree ok

    - NodePool.iPoolMap - Stores Java node objects so they can be reused (objects limited to CIM keywords), agree ok

    - CIMObjectFactory.cTypeStrMap - Stores fixed set of CIM data types to map to Java CIMDataType objects, agree ok

    - WBEMServiceAdvertisementSLP.iAttributeMap - Stores unlimited unique SLP service attributes (name-value pairs) so a multitude of unique attributes can lead to memory issues; limiting number of attributes is preferred solution since set of attributes not known

    - AdvertisementCatalog.iCatalogByDirectory, AdvertisementCatalog.iCatalogById, AdvertisementCatalog.addAdvertisements.innerMap - Stores SLP service advertisements in catalog created via Java API and not untrusted HTTP/XML/other data, should be ok

    So, with this info in mind, my suggestion is to use this bug to back out the HashDoS fix and get back the 2% performance degradation, then open two new bugs to limit number of parsed HTTP header fields and number of SLP service attributes - acceptable?

     
  • Dave Blaschke

    Dave Blaschke - 2012-08-01
    • status: open-accepted --> open-duplicate
     
  • Dave Blaschke

    Dave Blaschke - 2012-08-01

    Patch sent for community review. During a 2 week period any
    exploiter may comment on the patch, request changes or turn it
    down completely (with good reason). For the time being the patch is part of the "Experimental" branch in CVS.

     
  • Dave Blaschke

    Dave Blaschke - 2012-08-01
    • status: open-duplicate --> open-fixed
     
  • Tomas Hoger

    Tomas Hoger - 2012-08-02

    > So, with this info in mind, my suggestion is to use this bug to back out
    > the HashDoS fix and get back the 2% performance degradation, then open two
    > new bugs to limit number of parsed HTTP header fields and number of SLP
    > service attributes - acceptable?

    Agree there's no need to do any NODENAME_HASH randomization.

    Limiting the number of headers and attributes sounds like a reasonable fix too, but I'm pretty much unfamiliar with sblim/wbem to know if there may be any better approach, or if it makes a real difference (i.e. limiting memory that can be used by headers may not help much if there the lib is only using available memory as a limit to the size of some other untrusted inputs that are processed and not stored in hash data structures).

     
  • Dave Blaschke

    Dave Blaschke - 2012-08-17

    Patch against HEAD

     
  • Dave Blaschke

    Dave Blaschke - 2012-08-17

    The community review has completed and we received no substantial criticism. Therefore the patch has been approved and merged into the "HEAD" branch. The next release will pick it up.

     
  • Dave Blaschke

    Dave Blaschke - 2012-08-17
    • status: open-fixed --> pending-fixed
     
  • Dave Blaschke

    Dave Blaschke - 2012-09-14

    The patch was picked up by release 2.2.0 and will therefore be closed.

     
  • Dave Blaschke

    Dave Blaschke - 2012-09-14
    • status: pending-fixed --> closed-fixed
     

Log in to post a comment.

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:

JavaScript is required for this form.





No, thanks