cJSON parse_string can cause a segmentation fault when parsing invalid Unicode escapes. (We hit this while parsing content fetched with libcurl: because of network problems, a Unicode escape was cut off partway through, e.g. \u0571 ==> \u0.)
In this function, the amount of memory allocated for ptr2 is determined by scanning ptr. But when the string is then parsed, if a Unicode escape is malformed, parse_hex4 returns 0 and ptr is still advanced by 4 or 6:
....
uc=parse_hex4(ptr);ptr+=4;
....
uc2=parse_hex4(ptr);ptr+=6;
The pointer ptr then jumps past the end of the input string and keeps looking for \" or 0 to terminate, while the loop keeps writing to the memory pointed to by ptr2. That buffer may not be large enough, so the writes can land in memory owned by other code.
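To make the overshoot concrete, here is a minimal sketch that only does the index arithmetic and never touches out-of-bounds memory. The helper name blind_skip_overruns and the u_index parameter are made up for illustration and are not part of cJSON; the sketch assumes the cursor sits on the 'u' of the escape when ptr+=4 is applied.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of cJSON.
 * json    : the raw input, including the opening quote
 * u_index : index of the 'u' in a \uXXXX escape
 * Returns 1 if an unconditional "ptr += 4" from the 'u' would move the
 * cursor beyond the terminating NUL of json, 0 otherwise. */
static int blind_skip_overruns(const char *json, size_t u_index)
{
    size_t nul_index = strlen(json);       /* index of the terminating NUL */
    size_t cursor_after = u_index + 4;     /* effect of the blind ptr += 4 */
    return cursor_after > nul_index;
}
```

For the truncated input "\u0" (quotes included) the 'u' is at index 2 and the NUL at index 5, so the cursor would land on index 6, one byte past the terminator. For the intact "\u0571" the same skip stays inside the string.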
A simple patch would be:
change
static unsigned parse_hex4(const char * str)
to
static unsigned parse_hex4(const char * & str)
and change
uc=parse_hex4(ptr+1)
to:
ptr++;
uc=parse_hex4(ptr);
//ptr+=4; /* get the unicode char. */
and change
uc2=parse_hex4(ptr+3);ptr+=6;
to
uc2=parse_hex4(ptr+3);
//ptr+=6;
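Note that const char * & is a C++ reference, while cJSON.c is C. A rough C equivalent of the same idea, sketched here under the assumption that the caller passes the address of its cursor, could look like the following. The name parse_hex4_advance is hypothetical and this is not the patch that was attached or accepted.

```c
#include <stddef.h>

/* Hypothetical C variant of the patched parse_hex4 (not the real cJSON code).
 * Takes the caller's cursor by address and advances it only past the hex
 * digits it actually consumed. Returns the parsed value, or 0 on a malformed
 * escape, leaving *cursor on the offending character. */
static unsigned parse_hex4_advance(const char **cursor)
{
    unsigned h = 0;
    int i;
    for (i = 0; i < 4; i++) {
        char c = **cursor;
        unsigned digit;
        if (c >= '0' && c <= '9')      digit = (unsigned)(c - '0');
        else if (c >= 'A' && c <= 'F') digit = 10u + (unsigned)(c - 'A');
        else if (c >= 'a' && c <= 'f') digit = 10u + (unsigned)(c - 'a');
        else return 0;                 /* stops at '"', NUL, or any non-hex char */
        h = (h << 4) | digit;
        (*cursor)++;                   /* advance only after a valid digit */
    }
    return h;
}
```

Because the cursor never moves past a character that failed validation, a truncated escape such as \u0 leaves it on the closing quote or the terminating NUL rather than beyond it. As in the original function, a return value of 0 does not distinguish an error from \u0000, so a caller that needs to tell them apart would need an extra status flag.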
a simpler fix:
cJSON.c file:
Last edit: kolman 2015-02-25
Thanks, kolman!
However, we found that the suggested patch is still dangerous when parsing an invalid JSON string such as "\u". We end up reading the buffer beyond the end of the invalid string, which could cause crashes.
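One way to guard against a bare "\u" is to check that four hex digits are really present before reading or skipping them. The following is a minimal sketch of such a check; has_hex4 is a hypothetical helper and is not taken from the attached patch.

```c
#include <ctype.h>

/* Hypothetical guard, not part of cJSON.
 * Returns 1 only if s points at four consecutive hex digits.
 * It stops at the first non-hex character, so it never reads past the
 * terminating NUL of a truncated string. */
static int has_hex4(const char *s)
{
    int i;
    for (i = 0; i < 4; i++) {
        if (!isxdigit((unsigned char)s[i])) {
            return 0;
        }
    }
    return 1;
}
```

Under this assumption, the 'u' branch of parse_string would call has_hex4(ptr+1) first and reject the whole string when it returns 0, instead of advancing ptr unconditionally.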
We added kolman's patch to address other issues. Attached is our patch version. Now the parse_string function looks like the following:
Accepted patch. Everything is moving to GitHub at https://github.com/davegamble/cJSON
Dave.