From: Daniel P. <do...@me...> - 2023-02-20 21:38:14
I was able to whittle away the xml parts of the logic until none was left.
This turns out to be a gawk core bug.

Reproducer:

    #!/usr/bin/gawk -f

    function f(x) {
        return x;
    }

    BEGIN {
        print "a[b] is " (a["b"] ? "true" : "false");
        f(a["b"]);
        print "a[b] is " (a["b"] ? "true" : "false");
        print a["b"];
    }

Result on 5.1.1:

    $ /tmp/arraybug.awk
    a[b] is false
    a[b] is false

On 5.2.1:

    $ /tmp/portage/sys-apps/gawk-5.2.1/image/usr/bin/gawk -f /tmp/arraybug.awk
    a[b] is false
    a[b] is true
    free(): double free detected in tcache 2
    Aborted

The syndrome in a nutshell: if a nonexistent array element is passed as an
argument to a function, the element is sort of created, such that testing it
somehow evaluates to true, but its state/internal pointers are invalid.

I've actually gotten scripts to outright SEGV and exhibit various other
obviously undefined behavior, like printing characters from the name of the
redirect target ("/dev/stde" etc.), just by changing the length of words in a
printf format (a constant string).

Do I need to refile a bug on gawk core, or have I "done enough", as it were?

Oh, and thanks for the quick turnaround!

On Mon, 2023-02-20 at 10:13 -0500, Andrew J. Schorr wrote:
> Hi,
>
> On Mon, Feb 20, 2023 at 02:43:58AM -0600, Daniel Pouzzner via Gawkextlib-users wrote:
> > Is gawkextlib xml expected to work with gawk 5.2.1 (API 3.2), with the new
> > AWK_BOOL?
>
> I naively expect it to work. :-) If it doesn't work, then we've got a problem.
>
> > It works as expected with awk 5.1.1, and with all earlier versions going back to
> > 4.1.3. I've been using it regularly since 2017.
>
> Glad to hear you've been finding it useful.
>
> > But with 5.2.1 I'm seeing anomalous behavior where empty xml elements (e.g.
> > <doi></doi>) are evaluating as true even though they string-equal "".
> >
> > In connection with that empty xml field, gawk 5.2.1 crashes with
> >
> > gawk: ../mkbib.awk:1142: (FILENAME=buzsaki_2003_EEG_source.xml FNR=173) fatal: internal error: file eval.c, line 1358: unexpected parameter type Node_illegal
> >
> > If I build with sanitizer, I see concat_exp() doing a double-free of an arg that
> > was earlier freed by r_interpret().
> >
> > I did a whole slew of experiments to try to understand what's happening, but
> > it's a large and tricky code base. It seems to have something to do with
> > Node_var appearing where usually Node_val is, but I was quickly in over my head.
> >
> > libgawkextlib and xml.so were both built with gawk-5.2.1 installed. I tried
> > with old release code and with the latest git sources -- same result, as above.
> >
> > If it's useful I can share buzsaki_2003_EEG_source.xml and even the script
> > that's crashing on 5.2.1.
>
> Do you have a small test case that reproduces the problem? That would be very
> helpful for debugging. If you don't have a small test case, then I guess a large
> test case may be better than nothing.
>
> Regards,
> Andy
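
P.S. A possible follow-up to the reproducer, offered as an untested sketch
rather than a confirmed diagnosis: the `in` operator tests membership without
referencing the element, so checking it before and after the call should show
whether passing a["b"] to f() is by itself enough to create the entry.
report() is a hypothetical helper added for illustration and is not part of
the original script. What it prints on 5.1.1 versus 5.2.1 is deliberately
left unstated here.

```awk
#!/usr/bin/gawk -f

# Same pass-through function as in the reproducer.
function f(x) {
    return x
}

# Hypothetical helper (not in the original script): reports membership
# using `in`, which does not reference the element and therefore does
# not create it as a side effect of the test itself.
function report(label) {
    print label ": (\"b\" in a) is " (("b" in a) ? "true" : "false")
}

BEGIN {
    report("before call")
    f(a["b"])            # the only reference to a["b"] in this script
    report("after call")
}
```

Running it with each gawk binary via `gawk -f` and comparing the two reports
would separate "the call created the element" from "the later truth test
misread it", which the original reproducer cannot distinguish because its
first print already references a["b"].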