autogen fails with -D_FORTIFY_SOURCE3
Brought to you by:
bkorb
Fails here:
[ 259s] >stress-run_ag> MALLOC_CHECK_=3
[ 259s] >stress-run_ag> /home/abuild/rpmbuild/BUILD/autogen-5.18.16/agen5/autogen --trace=every '--trace-out=>>stress-aglog-x-30963.log' -L/home/abuild/rpmbuild/BUILD/autogen-5.18.16/autoopts/tpl -b stress -
[ 259s] *** buffer overflow detected ***: terminated
[ 259s] ./defs: line 200: 31075 Aborted MALLOC_CHECK_=3 ${AGexe} ${opts} "$@"
[ 259s] >stress-> test 134 -eq 0
[ 259s] >stress-> failure autogen failed
fails due to defLoad.c:
rem_sz = data_sz+4+sizeof(*base_ctx);
base_ctx = (scan_ctx_t *)AGALOC(rem_sz, "file buf");
memset(VOIDP(base_ctx), 0, rem_sz);
base_ctx->scx_line = 1;
rem_sz = data_sz;
/*
* Our base context will have its currency pointer set to this
* input. It is also a scanning pointer, but since this buffer
* is never deallocated, we do not have to remember the initial
* value. (It may get reallocated here in this routine, tho...)
*/
data =
base_ctx->scx_scan =
base_ctx->scx_data = (char *)(base_ctx + 1);
base_ctx->scx_next = NULL;
... for (;;) {
size_t rdct = fread(VOIDP(data), (size_t)1, rem_sz, fp);
as seen base_ctx is allocated with size rem_sz , but then fread is called for data (which is base_ctx + 1). Thus it's off by one error.
That use definitely looks a bit odd;
(char *)(base_ctx + 1)is actually not just 1 byte off, it'ssizeof (scan_ctx_t)off, leading tofreadgoing beyond bounds beforesizeLeftis 0.As an aside, it seems like the code is looking to use scan_ctx_t with a flex array at the end, so it might make sense to simply add a
char data[];at the end of that struct.Last edit: Siddhesh Poyarekar 2022-04-01
Our analysis is not correct:
Note
rem_szis assigneddata_sz, thendata = (base_ctx + 1)means thatdatahas room forrem_sz + 4. Thusfread(VOIDP(data), (size_t)1, rem_sz, fp);should be fine. Let me isolate a test-case.It's GCC bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105217
Please close this issue.
This exposes a GCC bug
P.S. the GCC bug's cause might be the temporary use of rem_sz for the allocation size. In the GIT sources, I've changed those 3 lines to 5, but I've not compiled or tested yet.
Last edit: Bruce Korb 2022-04-11
Thanks to Martin's reduced reproducer in the gcc bug[1], I can now see what's going on. The trouble is that gcc is unable to associate the reallocation in:
with
pzDatawhenp == base_ctxbecause the pointer is not explicitly updated. This happens to work with the glibc malloc implementation but strictly speaking, the comparisonif (p != base_ctx) {is invoking undefined behaviour as the C standard states that the pointer to an object (in this case, base_ctx) has an indeterminate value beyond its lifetime and hence using it (even its numeric value) may invoke undefined behaviour.The most correct fix for this should be this:
since it avoids the undefined behaviour and reliably updates pzData and base_ctx but if this code is performance-sensitive, minimally updating base_ctx and pzData unconditionally should be sufficient. The undefined behaviour would continue to exist though and there's no guarantee that future changes to compilers won't stumble on this.
All that said though, I know that this is a fairly common micro-optimization idiom to avoid updating pointers when the reallocated block hasn't moved. I'll probably try adding in some more diagnostics to gcc 13 to make at least a small subset of this kind of idiom work but I don't know how far I can take it.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105217
If the putative undefined behavior actually becomes programmatically incorrect, then the compiler/OS implementation will be grossly incorrect. Meanwhile, this is only ever used if the defs are being read from a pipe or other source wherein you cannot determine its size. It should be pretty rare that it's ever used and when it's used, it will likely squander several microseconds. IOW, I'll make your suggested change. :) thank you.
Thanks for the applied commit.
I think it contains a small typo:
as reported by:
Thank you. I can't get it to build myself. The underlying infrastructure has been "improved" to the point where I have to adapt to the changes. Fixed now.