Dear Matthieu Labas,
We have found a heap buffer overflow in read_line_alloc at src/sxmlc.c:1841.
The crashing input was generated automatically by our test generation tool, FOCAL.
You can find it in the attachment, crash2.html.
Here are the details needed to reproduce the overflow.
- OS & Compiler
Ubuntu Linux 16.04 x64 and GCC 5.4.0
- Build command
$ gcc -fsanitize=address -o htmlstrip ./src/examples/htmlstrip.c src/sxmlc.c
- Run command
$ ./htmlstrip crash2.html
- Outputs
```
=================================================================
==10239==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61100000a000 at pc 0x0000004251ae bp 0x7fff2bff9880 sp 0x7fff2bff9870
WRITE of size 1 at 0x61100000a000 thread T0
#0 0x4251ad in read_line_alloc src/sxmlc.c:1841
#1 0x4251ad in _parse_data_SAX src/sxmlc.c:1251
#2 0x42635e in XMLDoc_parse_file_SAX src/sxmlc.c:1622
#3 0x401817 in main src/examples/htmlstrip.c:133
#4 0x7fc392f7082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
#5 0x401ac8 in _start (/home/yhkim/sxmlc/htmlstrip+0x401ac8)
0x61100000a000 is located 0 bytes to the right of 256-byte region [0x611000009f00,0x61100000a000)
allocated by thread T0 here:
#0 0x7fc3933b2602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
#1 0x4045fa in read_line_alloc src/sxmlc.c:1825
SUMMARY: AddressSanitizer: heap-buffer-overflow src/sxmlc.c:1841 read_line_alloc
Shadow bytes around the buggy address:
0x0c227fff93b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c227fff93c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c227fff93d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c227fff93e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c227fff93f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c227fff9400:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c227fff9410: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c227fff9420: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c227fff9430: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c227fff9440: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c227fff9450: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
==10239==ABORTING
```
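The trace above reports a 1-byte write exactly at the end of a 256-byte heap region allocated in read_line_alloc. That pattern is typical of a line reader that stores a character or the terminator at an index equal to the buffer capacity, although the log alone does not confirm the root cause in sxmlc. As a minimal sketch of the defensive pattern, and not the library's actual code or its eventual fix, the reader below grows the buffer before each write so there is always room for the terminator. The name `read_line_growing`, its parameters, and the doubling growth policy are all hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, NOT sxmlc's read_line_alloc.
 * Reads one line from f into *buf (heap-allocated, grown as needed) and
 * NUL-terminates it. Returns the number of characters stored, which is 0
 * at end of file or if the initial allocation fails. */
size_t read_line_growing(FILE *f, char **buf, size_t *cap)
{
    size_t n = 0;
    int c;

    if (*buf == NULL || *cap == 0) {
        *cap = 256;                 /* assumed initial size, as in the report */
        *buf = malloc(*cap);
        if (*buf == NULL) {
            *cap = 0;
            return 0;
        }
    }

    while ((c = fgetc(f)) != EOF) {
        /* Keep room for this character AND the terminator. */
        if (n + 2 > *cap) {
            size_t new_cap = *cap * 2;
            char *p = realloc(*buf, new_cap);
            if (p == NULL)
                break;              /* the character just read is dropped */
            *buf = p;
            *cap = new_cap;
        }
        (*buf)[n++] = (char)c;
        if (c == '\n')
            break;
    }

    (*buf)[n] = '\0';               /* in bounds: n <= *cap - 1 always holds */
    return n;
}
```

The key invariant is that `n + 1 <= *cap` holds after every store, so the final terminator write can never land one byte past the allocation, which is the position AddressSanitizer flagged here.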
I can't read crash2.html. Can you try to attach it again, or copy-paste it in a comment? Thanks!
crash2.html is not human-readable because it was generated by our automated test generation tool, FOCAL, rather than written by a developer. An attacker can use such an unexpected file (human-readable or not) to trigger the crash and potentially exploit it as a security vulnerability. I therefore think this bug could cause a serious security problem and should be fixed.
Thank you.
Fixed in v4.2.10.