While parsing a crafted XML file, the function ezxml_decode() handles memory incorrectly and calls strlen() on a NULL pointer, which results in a NULL pointer dereference (SEGV).
=================================================================
==21303==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f5b350b1746 bp 0x7fff4cc8acd0 sp 0x7fff4cc8a458 T0)
#0 0x7f5b350b1745 in strlen (/lib/x86_64-linux-gnu/libc.so.6+0x8b745)
#1 0x7f5b354601a5 in __interceptor_strlen (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x701a5)
#2 0x40635b in ezxml_decode ezxml_0.8.6/ezxml.c:196
#3 0x415d2b in ezxml_parse_str ezxml_0.8.6/ezxml.c:525
#4 0x417a7a in ezxml_parse_fd ezxml_0.8.6/ezxml.c:641
#5 0x417d00 in ezxml_parse_file ezxml_0.8.6/ezxml.c:659
#6 0x401972 in main ezxml_0.8.6/test_ezxml.c:113
#7 0x7f5b3504682f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
#8 0x401c78 in _start (ezxml_0.8.6/test_ezxml_asan.exe+0x401c78)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 strlen
==21303==ABORTING
Reproduction:
Sample XML file leading to crash:
crash_009_SEGV_ezxml_decode_strlen.raw
Code snippet for reproduction:
ezxml_t result = ezxml_parse_file("crash_009_SEGV_ezxml_decode_strlen.raw");
The issue is caused by malformed input in which an entity reference is not terminated by a ';'. The proposed patch addresses this case.
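
The kind of check involved can be illustrated with a minimal sketch. This is not the ezxml code and not the proposed patch: the table demo_entities and the helper lookup_entity() are hypothetical names made up for this example, and the name/value table layout is an assumption. The sketch only shows the general idea of rejecting an entity reference that has no terminating ';' and never passing a NULL pointer to strlen().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entity table: name/value pairs, terminated by NULL.
   This layout is an assumption for illustration, not ezxml's own data. */
static const char *const demo_entities[] = {
    "lt;",  "<",
    "gt;",  ">",
    "amp;", "&",
    NULL
};

/* Hypothetical helper: s points at the '&' of an entity reference.
   Returns the replacement text, or NULL if the reference is malformed
   or unknown. On success, *ref_len receives the length of the reference
   including the leading '&' and the trailing ';'. */
static const char *lookup_entity(const char *s, size_t *ref_len)
{
    size_t i;

    assert(ref_len != NULL);

    if (s == NULL || *s != '&')
        return NULL;

    /* Reject references that are never terminated by ';'. */
    if (strchr(s, ';') == NULL)
        return NULL;

    for (i = 0; demo_entities[i] != NULL; i += 2) {
        size_t n = strlen(demo_entities[i]);

        if (strncmp(s + 1, demo_entities[i], n) == 0) {
            /* Defensive: never hand a NULL value to the caller's strlen(). */
            if (demo_entities[i + 1] == NULL)
                return NULL;
            *ref_len = n + 1; /* name (with ';') plus the leading '&' */
            return demo_entities[i + 1];
        }
    }
    return NULL; /* unknown entity */
}
```

A caller would treat a NULL return as "leave the text unchanged" instead of computing the length of the replacement, so malformed input such as "&lt" without a ';' never reaches strlen().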