From: SBP <se...@ik...> - 2025-06-02 03:49:39
|
Hi Juergen,

First of all, thank you for taking the time to read the doc and give your feedback. I'll reply to each of your points in turn.

1. Thanks! It took me quite a while (measured in years) to go from thinking "I wish this could be done in GAWK..." to having the time to understand how it all works and then to actually implement it. The GAWK docs were very informative and helpful, as was being able to look at the code of other extensions (the wonders of free software!).

2. Yes, I'm all for integrating this into gawkextlib, but that would require the thumbs up from the maintainers. Also, I'm not familiar with SourceForge code hosting, the legal intricacies of ceding copyright to the FSF, the GNU coding standards, etc. (Help needed.)

3. Honestly, I didn't realize the FIELDWIDTHS variable existed until the extension was practically done, so I never explored that possibility, heh. I guess it could replace the input parser part of the extension (which sets the field widths internally), but as you say, the specialized functions to process binary data would still be needed, as would the parsing of the format.

4. Cool! There are surely many ways of solving this, but part of my motivation for writing the extension was to provide a standardized way of processing binary data files in GAWK, without involving other tools or passing through textual representations. Of course, the success of this premise rests on user acceptance.

Finally, well... it's not so much that I wanted to learn about GAWK's internals as that I was willing to invest the time and effort to achieve this functionality. I even considered developing a whole AWK clone from scratch (in a similar spirit to that prior "bawk" project, I guess), but thankfully the GAWK extension route worked in the end :) I think this is a better way, as it can potentially become part of a community and ecosystem of widely available free software.

Cheers!
Sirio |
From: <jue...@go...> - 2025-06-01 20:17:02
|
Hello Sirio,

I had a look at your doc and it surprised me in several ways.

1. You read the GAWK doc about the internal structures, the extension mechanism and the build mechanism, you implemented the new extension and you even wrote doc about it. That was surely a challenge and much work. Respect!

2. You made the new extension available in a forked project so that it can be compiled on its own. There is still the option of integrating the new extension into the gawkextlib source tree. You should consider doing this.

3. Instead of writing an extension for processing binary data, I myself would have tried to find a way to use GNU Awk's FIELDWIDTHS variable. FIELDWIDTHS determines how binary record splitting is done and then individual fields can be processed with special functions as binary data. No need to write an extension.

4. Years ago I also used GNU Awk to handle binary data (without an extension). Writing is easy, but formatted reading was harder. I found a way to pre-process binary input data in a portable way with the help of the POSIX-compliant tool od (which has many, many options). od converts binary data to a text-based "hexdump" and then AWK can work on the "hexdump".

I guess you also implemented the new extension because you wanted to learn about GAWK's internal mechanism and there was plenty of stuff to learn. This is OK if you are willing to invest time in this effort. Depending on the other users' feedback, you should reconsider the way you make your source code available (integrating it into the gawkextlib source tree).

Thanks for sending us a notice about your work. Let's see what the other users think about it.

Juergen Kahrs |
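A minimal sketch of the od-based route from point 4, applied to the 16-byte record layout from the announcement (one little-endian uint32 followed by three little-endian float32 values). It is not taken from anyone's actual scripts: the function names `dump_coords` and `f32le` are hypothetical, the input is assumed to be a whole number of 16-byte records, and Inf/NaN are deliberately not handled.

```sh
# Hypothetical helper: print the three float32 fields of each 16-byte record,
# roughly what  bawk '%u32le 3%f32le' '{ print $2,$3,$4 }' data.bin  is described as doing.
dump_coords() {
    # od turns the binary file into unsigned decimal bytes; awk reassembles them.
    od -An -v -t u1 "$1" |
    awk '
    # Rebuild an IEEE 754 single from four little-endian bytes.
    function f32le(b0, b1, b2, b3,    s, e, m) {
        s = (b3 >= 128) ? -1 : 1                      # sign bit
        e = (b3 % 128) * 2 + int(b2 / 128)            # 8-bit biased exponent
        m = (b2 % 128) * 65536 + b1 * 256 + b0        # 23-bit mantissa
        if (e == 0)
            return s * m / 2 ^ 149                    # zero and subnormals
        return s * (1 + m / 8388608) * 2 ^ (e - 127)  # normal numbers only
    }
    # Collect every byte of the dump as a number.
    { for (i = 1; i <= NF; i++) b[n++] = $i + 0 }
    END {
        # Bytes r..r+3 hold the uint32 index; the floats start at r+4.
        for (r = 0; r + 16 <= n; r += 16)
            print f32le(b[r+4],  b[r+5],  b[r+6],  b[r+7]),
                  f32le(b[r+8],  b[r+9],  b[r+10], b[r+11]),
                  f32le(b[r+12], b[r+13], b[r+14], b[r+15])
    }'
}
```

Calling `dump_coords data.bin` prints one line of three numbers per record. The sketch also shows the trade-off discussed in the thread: it needs no extension, but every value passes through a textual byte dump and the float decoding has to be written by hand.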
From: SBP <se...@ik...> - 2025-05-31 02:29:44
|
Hi,

I've developed an extension called "binrecord" to process binary data files in GAWK, and I'd like to share it with you all!

I've wanted to be able to do this for a long time, and now I finally got the time to look into it and make it happen.

The code is available at https://gitlab.com/seirios/binrecord under the GPLv3. It is technically a gawkextlib extension.

Essentially, if you want to process a binary data file "data.bin" consisting of records with, say, a uint32 index and three float coordinates, all in little-endian byte order, you can simply do (using the included `bawk' command):

$ bawk '%u32le 3%f32le' '{ print $2,$3,$4 }' data.bin

You can find all the information about the extension in the README.

Hope you like this, as I do :)

Cheers!
Sirio |
From: Andrew J. S. <as...@te...> - 2023-08-21 10:30:21
|
Hi,

This is not a bug. Please set XMLMODE=0 and see that it works. Without that, it is trying to parse the output from sort as XML data.

Regards,
Andy |
From: <jue...@go...> - 2023-08-17 19:24:08
|
Hello,

the following posting looks like a reasonable question to our list. For some unknown reason it was auto-discarded by the mail server. Therefore I am re-posting it myself with a copy going to the author.

The origin of the problem probably is that the loading of the XML extension switches the behaviour of the interpreter when processing two-way I/O. To the best of my knowledge, this is intended and documented somewhere.

Jürgen Kahrs

> Hi everyone,
>
> I'm a huge fan of the xml Extension for gawk. Great stuff!
> Recently I noticed that I cannot open another process within gawk when the xml extension is loaded (see Two-way I/O in The GNU Awk User’s Guide, <https://www.gnu.org/software/gawk/manual/html_node/Two_002dway-I_002fO.html>).
>
> Simple script to reproduce (I basically took the example from the man page):
>
> Does not work:
>
> @load "xml"
> BEGIN {
>     command = "LC_ALL=C sort"
>     n = split("abc", a, "")
>
>     for (i = n; i > 0; i--)
>         print a[i] |& command
>     close(command, "to")
>
>     while ((command |& getline line) > 0)
>         print "got", line
>     close(command)
> }
>
> Works (notice the "@load" statement is commented out):
>
> #@load "xml"
> BEGIN {
>     command = "LC_ALL=C sort"
>     n = split("abc", a, "")
>
>     for (i = n; i > 0; i--)
>         print a[i] |& command
>     close(command, "to")
>
>     while ((command |& getline line) > 0)
>         print "got", line
>     close(command)
> }
>
> Interestingly enough, if I load other extensions (e.g. @load "json") it is no problem.
> It seems to be an issue specifically with xml.
>
> Am I doing something wrong here? Can anyone confirm?
> Maybe this was already reported.
>
> Versions I used:
>
> * GNU Awk 5.2.2
> * XML: 1.1.1
>
> Greetings
> Joseph |
From: Andrew J. S. <as...@te...> - 2023-02-21 14:56:12
|
Hi,

Thanks so much for drilling down and finding this. I just reported it to bug-gawk and cc'ed you.

Best,
Andy |
From: Daniel P. <do...@me...> - 2023-02-20 21:38:14
|
I was able to whittle away the xml parts of the logic until none was left. This turns out to be a gawk core bug. Reproducer:

#!/usr/bin/gawk -f

function f(x) {
    return x;
}

BEGIN {
    print "a[b] is " (a["b"] ? "true" : "false");

    f(a["b"]);

    print "a[b] is " (a["b"] ? "true" : "false");

    print a["b"];
}

Result on 5.1.1:

$ /tmp/arraybug.awk
a[b] is false
a[b] is false

On 5.2.1:

$ /tmp/portage/sys-apps/gawk-5.2.1/image/usr/bin/gawk -f /tmp/arraybug.awk
a[b] is false
a[b] is true
free(): double free detected in tcache 2
Aborted

The syndrome in a nutshell: if a nonexistent array element is passed as an argument to a function, the element is sort-of created, such that testing it somehow evaluates to true, but its state/internal pointers are invalid. I've actually gotten scripts to outright SEGV and exhibit various other obviously undefined behavior, like printing characters from the name of the redirect target ("/dev/stde" etc.), by just changing the length of words in a printf format (constant string).

Do I need to refile a bug on gawk core, or have I "done enough", as it were?

Oh and thanks for the quick turnaround! |
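The trigger described above is passing an array element that does not exist yet. As an illustration only (nobody in the thread proposes this, and it has not been checked against the crashing 5.2.1 build), a script can test membership with `in` before touching the element, since `in` does not create it. The wrapper name `guarded_call` is hypothetical.

```sh
# Hypothetical wrapper: only pass a["b"] to f() when the element already exists.
guarded_call() {
    awk '
    function f(x) { return x }
    BEGIN {
        # "in" tests membership without creating the element,
        # so a nonexistent element is never handed to f().
        if ("b" in a)
            f(a["b"])
        print "b in a: " (("b" in a) ? "yes" : "no")
    }'
}
```

Running `guarded_call` reports that the element is still absent afterwards, which is the point of the pattern: the array is left untouched when there is nothing to pass.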
From: <jue...@go...> - 2023-02-20 15:45:25
|
Hello Daniel,

welcome to our users mailing list.

> Is gawkextlib xml expected to work with gawk 5.2.1 (API 3.2), with the new
> AWK_BOOL?

We have a build server running that always builds the latest development branch of GNU Awk and the latest development branch of all the GNU Awk extensions that SourceForge hosts for us. Automatic testing is also done, but I have not seen any error messages with the combination of the latest branches.

> It works as expected with awk 5.1.1, and with all earlier versions going back to
> 4.1.3. I've been using it regularly since 2017.
>
> But with 5.2.1 I'm seeing anomalous behavior where empty xml elements (e.g.
> <doi></doi>) are evaluating as true even though they string-equal "".

This sounds like it could become a pretty small regression test case, assuming that empty XML input elements always cause a differing behaviour.

> In connection with that empty xml field, gawk 5.2.1 crashes with
>
> gawk: ../mkbib.awk:1142: (FILENAME=buzsaki_2003_EEG_source.xml FNR=173) fatal: internal error: file eval.c, line 1358: unexpected parameter type Node_illegal
>
> If I build with sanitizer, I see concat_exp() doing a double-free of an arg that
> was earlier freed by r_interpret().
>
> I did a whole slew of experiments to try to understand what's happening, but
> it's a large and tricky code base. It seems to have something to do with
> Node_var appearing where usually Node_val is, but I was quickly in over my head.
>
> libgawkextlib and xml.so were both built with gawk-5.2.1 installed. I tried
> with old release code and with the latest git sources -- same result, as above.
>
> If it's useful I can share buzsaki_2003_EEG_source.xml and even the script
> that's crashing on 5.2.1.

As Andrew said, having a small test case would be nice. If your application works on electroencephalogram data, I might be inclined to have a look at it. In the early 1990s I used the original GNU Awk for processing tons of EEG data. Seeing how the processing of EEG data has changed over the last 30 years may be interesting.

Thanks to Daniel for the precise report and to Andrew for the quick reply.

Juergen Kahrs |
From: Andrew J. S. <as...@te...> - 2023-02-20 15:30:00
|
Hi,

On Mon, Feb 20, 2023 at 02:43:58AM -0600, Daniel Pouzzner via Gawkextlib-users wrote:
> Is gawkextlib xml expected to work with gawk 5.2.1 (API 3.2), with the new
> AWK_BOOL?

I naively expect it to work. :-) If it doesn't work, then we've got a problem.

> It works as expected with awk 5.1.1, and with all earlier versions going back to
> 4.1.3. I've been using it regularly since 2017.

Glad to hear you've been finding it useful.

> But with 5.2.1 I'm seeing anomalous behavior where empty xml elements (e.g.
> <doi></doi>) are evaluating as true even though they string-equal "".
>
> In connection with that empty xml field, gawk 5.2.1 crashes with
>
> gawk: ../mkbib.awk:1142: (FILENAME=buzsaki_2003_EEG_source.xml FNR=173) fatal: internal error: file eval.c, line 1358: unexpected parameter type Node_illegal
>
> If I build with sanitizer, I see concat_exp() doing a double-free of an arg that
> was earlier freed by r_interpret().
>
> I did a whole slew of experiments to try to understand what's happening, but
> it's a large and tricky code base. It seems to have something to do with
> Node_var appearing where usually Node_val is, but I was quickly in over my head.
>
> libgawkextlib and xml.so were both built with gawk-5.2.1 installed. I tried
> with old release code and with the latest git sources -- same result, as above.
>
> If it's useful I can share buzsaki_2003_EEG_source.xml and even the script
> that's crashing on 5.2.1.

Do you have a small test case that reproduces the problem? That would be very helpful for debugging. If you don't have a small test case, then I guess a large test case may be better than nothing.

Regards,
Andy |
From: Daniel P. <do...@me...> - 2023-02-20 09:10:56
|
Is gawkextlib xml expected to work with gawk 5.2.1 (API 3.2), with the new AWK_BOOL?

It works as expected with awk 5.1.1, and with all earlier versions going back to 4.1.3. I've been using it regularly since 2017.

But with 5.2.1 I'm seeing anomalous behavior where empty xml elements (e.g. <doi></doi>) are evaluating as true even though they string-equal "".

In connection with that empty xml field, gawk 5.2.1 crashes with

gawk: ../mkbib.awk:1142: (FILENAME=buzsaki_2003_EEG_source.xml FNR=173) fatal: internal error: file eval.c, line 1358: unexpected parameter type Node_illegal

If I build with sanitizer, I see concat_exp() doing a double-free of an arg that was earlier freed by r_interpret().

I did a whole slew of experiments to try to understand what's happening, but it's a large and tricky code base. It seems to have something to do with Node_var appearing where usually Node_val is, but I was quickly in over my head.

libgawkextlib and xml.so were both built with gawk-5.2.1 installed. I tried with old release code and with the latest git sources -- same result, as above.

If it's useful I can share buzsaki_2003_EEG_source.xml and even the script that's crashing on 5.2.1. |
From: Andrew J. S. <as...@te...> - 2021-06-28 13:20:44
|
On Mon, Jun 28, 2021 at 09:57:51AM -0300, Vinícius dos Santos Oliveira wrote:
> On Mon, Jun 28, 2021 at 09:51, Andrew J. Schorr
> <as...@te...> wrote:
> > On Mon, Jun 28, 2021 at 09:56:03AM +0200, Manuel Collado wrote:
> > > > Maybe that'd do. Can such a function use the statement next? How'd
> > > > that work, the plugin calling the function triggering next?
> >
> > That's a very interesting question. I imagine that all hell
> > would break loose.
>
> Unfortunately triggering next would be one of the obvious error
> handling approaches just like triggering nextfile.
>
> On file-level error, user triggers nextfile on BEGINFILE.
>
> On record-level error, user triggers next on JSONRECORDERROR.

I really don't know. If you can figure out how to create an API to call an awk function, then maybe "next" will just work. But I don't think it will be terribly easy to implement this idea in the first place. I think you just need to mess around. It will be cool if you can pull it off, but it's nontrivial.

Regards,
Andy |
From: Vinícius d. S. O. <vin...@gm...> - 2021-06-28 12:58:49
|
On Mon, Jun 28, 2021 at 09:51, Andrew J. Schorr <as...@te...> wrote:
> On Mon, Jun 28, 2021 at 09:56:03AM +0200, Manuel Collado wrote:
> > > Maybe that'd do. Can such a function use the statement next? How'd
> > > that work, the plugin calling the function triggering next?
>
> That's a very interesting question. I imagine that all hell
> would break loose.

Unfortunately triggering next would be one of the obvious error handling approaches just like triggering nextfile.

On file-level error, user triggers nextfile on BEGINFILE.

On record-level error, user triggers next on JSONRECORDERROR.

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/ |
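For reference, the file-level half of this comparison already exists in gawk: a BEGINFILE rule can inspect ERRNO and call nextfile to skip an input file that could not be opened. The sketch below shows only that existing mechanism; the record-level JSONRECORDERROR counterpart is the proposal under discussion and is not implemented anywhere in this thread. The wrapper name `count_readable_lines` is hypothetical.

```sh
# Hypothetical wrapper: count input lines, skipping files gawk cannot open.
count_readable_lines() {
    gawk '
    BEGINFILE {
        # ERRNO is non-empty here when the file could not be opened;
        # nextfile then skips it instead of letting gawk abort.
        if (ERRNO != "") {
            print "skipping " FILENAME ": " ERRNO > "/dev/stderr"
            nextfile
        }
    }
    { n++ }
    END { print n + 0 }' "$@"
}
```

For example, `count_readable_lines missing.txt notes.txt` prints the number of lines in notes.txt and a warning about missing.txt on standard error.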
From: Andrew J. S. <as...@te...> - 2021-06-28 12:51:46
|
On Mon, Jun 28, 2021 at 09:56:03AM +0200, Manuel Collado wrote:
> > Maybe that'd do. Can such a function use the statement next? How'd
> > that work, the plugin calling the function triggering next?

That's a very interesting question. I imagine that all hell would break loose.

> The idea was to let the plugin invoke the user function. But I was a
> bit wrong. The current API doesn't provide such facility. It would
> be necessary to extend the API in order to allow an extension to
> call a gawk user function.

That seems like a more plausible API extension to me. The code is probably similar to how indirect function calls are implemented (Op_indirect_func_call), I'd guess.

Regards,
Andy |
From: Manuel C. <mco...@gm...> - 2021-06-28 07:56:13
|
On 28/06/2021 at 9:40, Vinícius dos Santos Oliveira wrote:
> On Mon, Jun 28, 2021 at 04:27, Manuel Collado
> <mco...@gm...> wrote:
>> ...
>> I.e., instead of expecting the user to write some "JSONRECORDERROR"
>> rule, expect the user to write some "JsonRecordError()" function. If
>> there is such function the extension will invoke it. If not, the
>> extension will execute a default error handling action.
>
> Maybe that'd do. Can such a function use the statement next? How'd
> that work, the plugin calling the function triggering next?

The idea was to let the plugin invoke the user function. But I was a bit wrong. The current API doesn't provide such a facility. It would be necessary to extend the API in order to allow an extension to call a gawk user function.

> Thanks for the idea anyway. It does seem simpler (to implement).

You are welcome.

--
Manuel Collado - http://mcollado.z15.es |
From: Vinícius d. S. O. <vin...@gm...> - 2021-06-28 07:41:51
|
On Mon, Jun 28, 2021 at 04:27, Manuel Collado <mco...@gm...> wrote:
> The fact that a rule exists doesn't ensure it will be executed. Use of
> 'next' or 'getline' can prevent the execution of some rules.

That's not a problem. Wrong code can always be written.

> It seems you want to check if the user wants to handle record errors.

Yes.

> If this is what you want, then an alternative is to check the existence of
> a specific user-defined function, instead of a rule.
>
> I.e., instead of expecting the user to write some "JSONRECORDERROR"
> rule, expect the user to write some "JsonRecordError()" function. If
> there is such function the extension will invoke it. If not, the
> extension will execute a default error handling action.

Maybe that'd do. Can such a function use the statement next? How'd that work, the plugin calling the function triggering next?

Thanks for the idea anyway. It does seem simpler (to implement).

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/ |
From: Manuel C. <mco...@gm...> - 2021-06-28 07:27:51
|
On 26/06/2021 at 0:54, Vinícius dos Santos Oliveira wrote:
> On Fri, Jun 25, 2021 at 10:07, Andrew J. Schorr
> <as...@te...> wrote:
>> I'm not sure that I understand what "rule" means in an awk script.
> ...
> I mean the pattern. On plugin startup, I want to traverse all the
> program rules to check if any pattern in them matches the AST
> "somevar".
>
> I'll code the plugin's behaviour to depend on whether such a rule
> exists in the program. The end result gives an approximation of what's
> offered by BEGINFILE rules to deal with errors on single files, but
> for record-level errors.

The fact that a rule exists doesn't ensure it will be executed. Use of 'next' or 'getline' can prevent the execution of some rules.

It seems you want to check if the user wants to handle record errors. If this is what you want, then an alternative is to check the existence of a specific user-defined function, instead of a rule.

I.e., instead of expecting the user to write some "JSONRECORDERROR" rule, expect the user to write some "JsonRecordError()" function. If there is such a function, the extension will invoke it. If not, the extension will execute a default error handling action.

Checking the existence of a function is straightforward. Just look for the corresponding entry in the FUNCTAB array. No need to extend the API.

Is this what you want?

Regards.

--
Manuel Collado - http://mcollado.z15.es |
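The FUNCTAB lookup can be illustrated at the awk level. The sketch below is an assumption-laden analogue, not the extension's code: it performs the existence check and the call from awk itself (using gawk's indirect call syntax), whereas the proposal is for the extension to do both from C. The names `handler_demo` and `report` are hypothetical; `JsonRecordError` is the function name suggested above.

```sh
# Hypothetical demo: call JsonRecordError() if the user defined it, else fall back.
handler_demo() {
    gawk '
    # User-supplied handler; delete this function to see the default path.
    function JsonRecordError(msg) { print "user handler: " msg }

    function report(msg,    fn) {
        fn = "JsonRecordError"
        if (fn in FUNCTAB)      # FUNCTAB is indexed by the names of defined functions
            @fn(msg)            # gawk indirect function call
        else
            print "default handler: " msg
    }

    BEGIN { report("truncated object") }'
}
```

FUNCTAB and indirect calls are gawk-specific, so this does not run under other awk implementations.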
From: Vinícius d. S. O. <vin...@gm...> - 2021-06-25 22:55:37
|
On Fri, Jun 25, 2021 at 10:07, Andrew J. Schorr <as...@te...> wrote:
> I'm not sure that I understand what "rule" means in an awk script.
> From the POSIX spec:
>
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
>
> An awk program is composed of pairs of the form:
>
> pattern { action }
>
> So what's a "rule"? In the source code, it appears that the word "rule"
> is used to refer to the combination of a pattern and an action. So
> is the idea that this API hook would depend on detecting whether a combined
> pattern & action is present?

I mean the pattern. On plugin startup, I want to traverse all the program rules to check if any pattern in them matches the AST "somevar".

I'll code the plugin's behaviour to depend on whether such a rule exists in the program. The end result gives an approximation of what's offered by BEGINFILE rules to deal with errors on single files, but for record-level errors.

> The implementation of BEGINFILE and ENDFILE requires a lot of
> special support in the interpreter. It seems to me that implementing
> this sort of thing requires diving into the guts of how a program
> is compiled in awkgram.y, which is quite frankly an area of the
> code that I've never messed with. In main.c, the program is parsed
> into code_block:
>
> INSTRUCTION *code_block = NULL;
> ...
> /* Read in the program */
> if (parse_program(& code_block, false) != 0 || dash_v_errs > 0)
>     exit(EXIT_FAILURE);
>
> And then it runs the program by calling interpret(code_block).
>
> So any info about the running program would have to be extracted by
> parsing the INSTRUCTION data pointed to by code_block, as far
> as I can tell.

Thanks for the pointer. Now I know where to start.

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/ |
From: Andrew J. S. <as...@te...> - 2021-06-25 13:07:21
|
Hi Jürgen, On Fri, Jun 25, 2021 at 10:09:54AM +0200, Jürgen Kahrs via Gawkextlib-users wrote: > I think he wants to detect the presence (and not the value) of a > certain AWK rule at run-time. The term "rule" here seems to mean > a keyword that exists after being declared by the plugin. I'm not sure that I understand what "rule" means in an awk script. From the POSIX spec: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html An awk program is composed of pairs of the form: pattern { action } So what's a "rule"? In the source code, it appears that the word "rule" is used to refer to the combination of a pattern and an action. So is the idea that this API hook would depend on detecting whether a combined pattern & action is present? > I agree with you (Andrew) that the script example that he gave > should be accompanied by a more detailed implementation of > the plugin. Maybe it would suffice to have a way of looking up > all the keywords that are defined after loading a plugin. The implementation of BEGINFILE and ENDFILE requires a lot of special support in the interpreter. It seems to me that implementing this sort of thing requires diving into the guts of how a program is compiled in awkgram.y, which is quite frankly an area of the code that I've never messed with. In main.c, the program is parsed into code_block: INSTRUCTION *code_block = NULL; ... /* Read in the program */ if (parse_program(& code_block, false) != 0 || dash_v_errs > 0) exit(EXIT_FAILURE); And then it runs the program by calling interpret(code_block). So any info about the running program would have to be extracted by parsing the INSTRUCTION data pointed to by code_block, as far as I can tell. Regards, Andy |
From: <jue...@go...> - 2021-06-25 08:10:06
|
Hello Andrew, > Hi, > > On Thu, Jun 24, 2021 at 12:19:17PM -0300, Vinícius dos Santos Oliveira wrote: >> Would it be possible to add an API to gawk that allows the plugin author to >> check whether there's a certain rule in the AWK script? > ... > > I don't have a strong view on this, but I'm also not certain that I understand > precisely what you want. If you were to develop and share a gawk patch to > implement your desired behavior, then I think we would all understand better > and could offer an opinion on whether the patch might be accepted. You are of > course always free to build gawk with your own patches; that's the beauty of > free software. > > Regards, > Andy I think he wants to detect the presence (and not the value) of a certain AWK rule at run-time. The term "rule" here seems to mean a keyword that exists after being declared by the plugin. I agree with you (Andrew) that the script example that he gave should be accompanied by a more detailed implementation of the plugin. Maybe it would suffice to have a way of looking up all the keywords that are defined after loading a plugin. |
From: Andrew J. S. <as...@te...> - 2021-06-25 01:00:29
|
Hi, On Thu, Jun 24, 2021 at 12:19:17PM -0300, Vinícius dos Santos Oliveira wrote: > Would it be possible to add an API to gawk that allows the plugin author to > check whether there's a certain rule in the AWK script? ... I don't have a strong view on this, but I'm also not certain that I understand precisely what you want. If you were to develop and share a gawk patch to implement your desired behavior, then I think we would all understand better and could offer an opinion on whether the patch might be accepted. You are of course always free to build gawk with your own patches; that's the beauty of free software. Regards, Andy |
From: Vinícius d. S. O. <vin...@gm...> - 2021-06-24 15:20:10
|
Would it be possible to add an API to gawk that allows the plugin author to check whether there's a certain rule in the AWK script? The use-case is that I want to offer something similar to BEGINFILE. If the user doesn't write a rule to handle some specific use case, the program will abort when/if it happens. If the user does write a rule for it then the program doesn't abort and just executes the user's rule. I just need to check a very simple expression, "read var X". For instance (the first rule): JSONRECORDERROR { next } .. { ... } .. { ... } .. { ... } No other types of AWK expressions need to be checked against. I don't need to know if the user wrote the rule "JSONRECORDERROR && somethingelse". I only need to know if he wrote the rule "JSONRECORDERROR" exactly (no other variation). There's no need to "merge multiple rules" in the likes of BEGINFILE either. Just consume the AWK program as it normally would and nothing else. All I want is to check on plugin startup if the user wrote a certain rule of a very simple pattern. The abort behaviour is something I write on my own plugin and shouldn't be part of this API addition. How does that sound? Too fancy? Too weirdo? Too far-fetched? Or is it reasonable after all? -- Vinícius dos Santos Oliveira https://vinipsmaker.github.io/ |
From: Galen T. <gta...@ya...> - 2021-06-07 15:16:08
|
A little more flexible, perhaps, than using an environment variable, would be to use a gawk command line option to indicate how your extension should handle errors. If you're not yet familiar with how to do so, see the documentation for getopt() at Getopt Function (The GNU Awk User’s Guide). But I don't know if you can do this easily from an extension. Galen On Monday, June 7, 2021, 9:34:38 AM EDT, Vinícius dos Santos Oliveira <vin...@gm...> wrote: > On Mon, 7 Jun 2021 at 10:23, Andrew J. Schorr <as...@te...> wrote: >> You are certainly free to use global variables to configure the behavior of your library. Many extensions do this; for example, the XML extension uses the XMLMODE variable. But if you expect gawk to implement this policy for you, then I don't know how that would work. It would be the extension's responsibility to implement this type of behavior. How could gawk enforce it for you? To do that, we'd have to change the API, which seems unlikely. > Okay. Thanks. I was just uneasy given I don't have much actual experience with AWK, so I wanted to check whether a current or similar idiom/convention for this was already in place so as not to create new friction between extensions. I'll introduce a new var on my plugin and roll with it. > -- Vinícius dos Santos Oliveira https://vinipsmaker.github.io/ _______________________________________________ Gawkextlib-users mailing list Gaw...@li... https://lists.sourceforge.net/lists/listinfo/gawkextlib-users |
From: Vinícius d. S. O. <vin...@gm...> - 2021-06-07 13:34:32
|
On Mon, 7 Jun 2021 at 10:23, Andrew J. Schorr <as...@te...> wrote: > You are certainly free to use global variables to configure the > behavior of your library. Many extensions do this; for example, > the XML extension uses the XMLMODE variable. But if you expect gawk > to implement this policy for you, then I don't know how that would > work. It would be the extension's responsibility to implement this type > of behavior. How could gawk enforce it for you? To do that, we'd have > to change the API, which seems unlikely. > Okay. Thanks. I was just uneasy given I don't have much actual experience with AWK, so I wanted to check whether a current or similar idiom/convention for this was already in place so as not to create new friction between extensions. I'll introduce a new var on my plugin and roll with it. -- Vinícius dos Santos Oliveira https://vinipsmaker.github.io/ |
From: Andrew J. S. <as...@te...> - 2021-06-07 13:23:05
|
Hi, On Mon, Jun 07, 2021 at 10:05:46AM -0300, Vinícius dos Santos Oliveira wrote: > Just an idea. Maybe the user should fill some field in PROCINFO informing > whether he wants the input parser to silently discard broken records, to print > a warning before discarding the broken record, or to allow the user to consume > and handle the error explicitly. You are certainly free to use global variables to configure the behavior of your library. Many extensions do this; for example, the XML extension uses the XMLMODE variable. But if you expect gawk to implement this policy for you, then I don't know how that would work. It would be the extension's responsibility to implement this type of behavior. How could gawk enforce it for you? To do that, we'd have to change the API, which seems unlikely. Regards, Andy |
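To make the "global variable selects the policy" idea concrete, here is a pure-awk stand-in for the three behaviours mentioned in the quoted message (discard silently, warn then discard, or let the script handle the record). It is only a sketch under stated assumptions: the variable name `JSONERRORMODE`, its values, the `on_bad_record` helper, and the toy "starts with `{`" validity check are all invented for illustration; a real extension would implement the policy in C inside its input parser.

```awk
# Sketch only: a pure-awk emulation of an extension-side error policy.
# JSONERRORMODE and its values ("ignore", "warn", anything else) are
# hypothetical names, not part of any existing extension.

# Return 1 if the script should see the bad record, 0 if it is discarded.
function on_bad_record(line, mode) {
    if (mode == "ignore")
        return 0                                   # discard silently
    if (mode == "warn") {
        print "warning: skipping bad record: " line > "/dev/stderr"
        return 0                                   # discard after warning
    }
    return 1                                       # hand it to the script
}

# Default policy when the user did not set the variable.
BEGIN { if (JSONERRORMODE == "") JSONERRORMODE = "warn" }

# Toy validity check: a record is "broken" if it does not start with "{".
substr($0, 1, 1) != "{" {
    if (! on_bad_record($0, JSONERRORMODE))
        next
    print "script handles: " $0
    next
}

{ print "ok: " $0 }
```

The policy could then be chosen per run with something like `gawk -v JSONERRORMODE=ignore -f sketch.awk data.txt`, where both file names are placeholders.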
From: Vinícius d. S. O. <vin...@gm...> - 2021-06-07 13:13:13
|
On Mon, 7 Jun 2021 at 00:51, Manuel Collado <mco...@gm...> wrote: > The relevant sections of the gawk manual are: > > 17.4.6 Printing Messages > 17.4.7 Updating ERRNO > Currently that's what I do. I just print a warning using the warning() function. -- Vinícius dos Santos Oliveira https://vinipsmaker.github.io/ |