Bug in handling TIXML_UTF_LEAD_0

Brought to you by: leethomason

Forum: Developer

Creator: Yuheng Kuang

Created: 2008-01-17

Updated: 2013-05-20

Yuheng Kuang - 2008-01-17

Version 2.5.3, TIXML_USE_STL = no

In TiXmlParsingData::Stamp():
When it meets a TIXML_UTF_LEAD_0 byte followed by '\0' (NUL) in UTF-8 mode,
it reaches line 276 and does nothing, so p is never advanced.
That causes an infinite loop.
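
To illustrate why the loop hangs and what a guard could look like, here is a minimal, self-contained sketch. It is not TinyXML's code: `CountColumns` and `kUtf8Lead0` are hypothetical names, and the column logic is simplified to count every byte sequence as one column. The only point is that each pass through the loop advances p, even when the lead byte is followed by '\0'.

```cpp
#include <cassert>

// Hypothetical stand-in for TIXML_UTF_LEAD_0: 0xEF, the first byte of a
// three-byte UTF-8 sequence.
const unsigned char kUtf8Lead0 = 0xEFU;

// Simplified model of a Stamp()-style scan from p up to now.
// Every iteration advances p by at least one byte, so the loop terminates.
int CountColumns(const char* p, const char* now)
{
    assert(p != nullptr && now != nullptr);
    int col = 0;
    while (p < now) {
        const unsigned char c = static_cast<unsigned char>(*p);
        if (c == kUtf8Lead0 && (now - p) >= 3 && p[1] != '\0' && p[2] != '\0') {
            p += 3;   // complete three-byte sequence
        } else {
            ++p;      // ordinary byte, embedded NUL, or truncated sequence
        }
        ++col;
    }
    return col;
}
```

The else branch is the part that matters here: a lead byte whose continuation bytes are missing is treated as a single byte instead of being left in place.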

It happens when all of the following hold:
1. the parser is in a CDATA section,
2. TIXML_UTF_LEAD_0 is followed by a single '\0',
3. and there are some non-zero bytes right after the '\0'.

The bad pointer is introduced while parsing a CDATA section.
The loop at line 1524 meets '\0' and breaks,
but ReadText is still called before the check at line 1533.
ReadText skips its whole loop because *p == '\0',
but at line 634 it does not check whether *p == '\0' and simply does p += strlen(endTag).
Now p points to whatever bytes come after the '\0',
and TiXmlParsingData::Stamp() gets into the infinite loop right after.
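
The missing check described above can be sketched as follows. This is only an illustration under my own assumptions, not the actual ReadText code or the eventual fix: `SkipPastEndTag` is a hypothetical helper that refuses to step over the terminator when the end tag was never found.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper, not part of TinyXML: scans from p for endTag and
// returns a pointer just past it, or nullptr if the buffer ends first.
const char* SkipPastEndTag(const char* p, const char* endTag)
{
    assert(p != nullptr && endTag != nullptr);
    const std::size_t tagLen = std::strlen(endTag);
    while (*p != '\0' && std::strncmp(p, endTag, tagLen) != 0) {
        ++p;
    }
    if (*p == '\0') {
        return nullptr;   // hit the terminator: do not add strlen(endTag)
    }
    return p + tagLen;    // end tag found: continue right after it
}
```

With this shape, a caller sees a null result for an unterminated CDATA section and can report an error instead of handing Stamp() a pointer past the end of the string.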

A simple program can reproduce this problem:

<pre>
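#include <stdio.h>
#include "tinyxml.h"
int
main(int argc, char **argv)
{
        /* Unterminated CDATA section in a document declared as UTF-8. */
        char buf[] = "<?xml version=\"1.0\" encoding=\"utf-8\" ?><feed><![CDATA[Test XMLblablablalblbl";
        buf[60] = (char)239;    /* 0xEF, i.e. TIXML_UTF_LEAD_0 */
        buf[61] = '\0';         /* NUL right after the lead byte; non-zero bytes follow */

        TiXmlDocument doc;
        doc.Parse(buf);         /* loops forever in Stamp() on 2.5.3 */
        return 0;
}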
</pre>

- Lee Thomason - 2008-03-23
  
  Good catch, and already entered in "bugs" - thanks!
  lee
  