From: Daniel P. <do...@me...> - 2023-02-20 21:38:14
I was able to whittle away the xml parts of the logic until none was left. This
turns out to be a gawk core bug. Reproducer:
#!/usr/bin/gawk -f
function f(x) {
    return x;
}
BEGIN {
    print "a[b] is " (a["b"] ? "true" : "false");
    f(a["b"]);
    print "a[b] is " (a["b"] ? "true" : "false");
    print a["b"];
}
Result on 5.1.1:
$ /tmp/arraybug.awk
a[b] is false
a[b] is false
On 5.2.1:
$ /tmp/portage/sys-apps/gawk-5.2.1/image/usr/bin/gawk -f /tmp/arraybug.awk
a[b] is false
a[b] is true
free(): double free detected in tcache 2
Aborted
The syndrome in a nutshell: if a nonexistent array element is passed as an
argument to a function, the element is sort of created, such that testing it
evaluates to true, but its state/internal pointers are invalid. I've
actually gotten scripts to outright SEGV and exhibit various other obviously
undefined behavior, like printing characters from the name of the redirect
target ("/dev/stde" etc.), just by changing the length of words in a printf
format (constant string).
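A possible stopgap, sketched here as an assumption and not verified against 5.2.1: test membership with the "in" operator, which does not create the element, and only pass the element to the function when it already exists. The wrapper name safe_f is hypothetical, and the shell wrapper is only there to make the snippet easy to run.

```shell
#!/bin/sh
# Sketch only: guard the call so that a nonexistent array element is never
# passed as a function argument. Untested on gawk 5.2.1.
result=$(awk '
function f(x) {
    return x
}
# Hypothetical wrapper: call f() only for elements that already exist.
function safe_f(arr, key) {
    if (key in arr)
        return f(arr[key])
    return ""
}
BEGIN {
    # "in" tests membership without creating the element.
    print "before: " (("b" in a) ? "present" : "absent")
    safe_f(a, "b")
    print "after: " (("b" in a) ? "present" : "absent")
    a["b"] = "x"
    print "value: " safe_f(a, "b")
}')
printf '%s\n' "$result"
```

On an awk that follows the documented semantics, the element should be reported absent both before and after the guarded call, since neither the membership test nor the skipped call references a["b"].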
Do I need to refile a bug on gawk core, or have I "done enough", as it were?
Oh, and thanks for the quick turnaround!
On Mon, 2023-02-20 at 10:13 -0500, Andrew J. Schorr wrote:
> Hi,
>
> On Mon, Feb 20, 2023 at 02:43:58AM -0600, Daniel Pouzzner via Gawkextlib-users wrote:
> > Is gawkextlib xml expected to work with gawk 5.2.1 (API 3.2), with the new
> > AWK_BOOL?
>
> I naively expect it to work. :-) If it doesn't work, then we've got a problem.
>
> > It works as expected with gawk 5.1.1, and with all earlier versions going back to
> > 4.1.3. I've been using it regularly since 2017.
>
> Glad to hear you've been finding it useful.
>
> > But with 5.2.1 I'm seeing anomalous behavior where empty xml elements (e.g.
> > <doi></doi>) are evaluating as true even though they string-equal "".
> >
> > In connection with that empty xml field, gawk 5.2.1 crashes with
> >
> > gawk: ../mkbib.awk:1142: (FILENAME=buzsaki_2003_EEG_source.xml FNR=173) fatal: internal error: file eval.c, line 1358: unexpected parameter type Node_illegal
> >
> > If I build with sanitizer, I see concat_exp() doing a double-free of an arg that
> > was earlier freed by r_interpret().
> >
> > I did a whole slew of experiments to try to understand what's happening, but
> > it's a large and tricky code base. It seems to have something to do with
> > Node_var appearing where usually Node_val is, but I was quickly in over my head.
> >
> > libgawkextlib and xml.so were both built with gawk-5.2.1 installed. I tried
> > with old release code and with the latest git sources -- same result, as above.
> >
> > If it's useful I can share buzsaki_2003_EEG_source.xml and even the script
> > that's crashing on 5.2.1.
>
> Do you have a small test case that reproduces the problem? That would be very
> helpful for debugging. If you don't have a small test case, then I guess a large
> test case may be better than nothing.
>
> Regards,
> Andy