From: Vinícius d. S. O. <vin...@gm...> - 2020-09-08 19:31:38
On Tue, Sep 8, 2020 at 16:20, Andrew J. Schorr <as...@te...> wrote:

> So we're stuck at the same place.

Please run:

    $ git submodule update --init

> I think this project needs a README file that explains how to install
> and use it.

This is true. Meson is similar to cmake/autotools. After it runs, it
generates a build.ninja file that can be used to run ninja and compile the
project. So typically these are the steps you have to run:

    $ mkdir build
    $ cd build
    $ meson ..
    $ ninja

This will generate a jsonstream.so file in the build dir that can be loaded
in gawk through -l jsonstream (if you also set the AWKLIBPATH env var).
I'll be renaming the plug-in to tabjson later on.

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
From: Vinícius d. S. O. <vin...@gm...> - 2020-09-08 19:26:00
On Tue, Sep 8, 2020 at 15:58, Jürgen Kahrs via Gawkextlib-users <gaw...@li...> wrote:

> Following your lead, I also tried a build (easier than I thought)
> and ran into the same problem. I see the following reasons for the
> build failure:
> 1. the project requires boost 1.69 and I have 1.66 only.
> 2. package missing: libboost_headers1_66_0-devel

Could you edit the meson.build file and change 1.69 to 1.66 and test if it
works? If the project happens to compile with this old Boost version, I'll
gladly change this definition in the main repository too.

> 3. package missing: re2c

I use re2c to parse the JSON Pointer expressions. re2c is actually a quite
popular (and old) project, so it shouldn't be hard to find a package in
your distro. re2c is a code generator, so it's actually a binary I depend
on, not header/development files. The binary will read a .ypp file and spit
C++ source code out of it.

> 4. directory missing: "mkdir 3rd/trial.protocol/include"

Please run:

    $ git submodule update --init

It'll download the submodule dependencies (just one: the JSON library that
I use in the project).

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
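Vinícius mentions above that re2c parses the JSON Pointer expressions used in JPAT. For readers unfamiliar with JSON Pointer (RFC 6901), its resolution semantics can be sketched in a few lines of Python; this is only an illustrative model, not the plug-in's actual re2c/C++ implementation:

```python
import json

def resolve_pointer(doc, pointer):
    """Resolve an RFC 6901 JSON Pointer against a parsed JSON document."""
    if pointer == "":
        return doc  # the empty pointer refers to the whole document
    value = doc
    for token in pointer.split("/")[1:]:
        # Unescape per RFC 6901: "~1" -> "/" first, then "~0" -> "~".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(value, list):
            value = value[int(token)]  # array indices are 0-based
        else:
            value = value[token]
    return value

record = json.loads('{"name": "Kathy", "pay_rate": 4.00, "hours_worked": 10}')
print(resolve_pointer(record, "/name"))          # -> Kathy
print(resolve_pointer(record, "/hours_worked"))  # -> 10
```

This also shows why the pointer syntax is 0-indexed for arrays, a point the announcement lists under BUGS as clashing with AWK's 1-indexed mindset.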
From: Andrew J. S. <as...@te...> - 2020-09-08 19:20:15
My bleeding-edge Fedora 32 system does have boost 1.69, but I had failed to
install boost-devel and re2c. With those, I now get:

bash$ meson builddir
The Meson build system
Version: 0.55.1
Source dir: /home/users/schorr/gawk-tabjson
Build dir: /home/users/schorr/gawk-tabjson/builddir
Build type: native build
Project name: gawk-jsonstream
Project version: undefined
C++ compiler for the host machine: c++ (gcc 10.2.1 "c++ (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1)")
C++ linker for the host machine: c++ ld.bfd 2.34-4
Host machine cpu family: x86_64
Host machine cpu: x86_64
Found pkg-config: /usr/bin/pkg-config (1.6.3)
Run-time dependency Boost found: YES 1.69.0 (/usr)
Program re2c found: YES
meson.build:23:0: ERROR: Include dir 3rd/trial.protocol/include does not exist.

A full log can be found at /home/users/schorr/gawk-tabjson/builddir/meson-logs/meson-log.txt

So we're stuck at the same place. I think this project needs a README file
that explains how to install and use it.

Regards,
Andy

On Tue, Sep 08, 2020 at 08:57:58PM +0200, Jürgen Kahrs via Gawkextlib-users wrote:
> Following your lead, I also tried a build (easier than I thought)
> and ran into the same problem. I see the following reasons for the
> build failure:
> 1. the project requires boost 1.69 and I have 1.66 only.
> 2. package missing: libboost_headers1_66_0-devel
> 3. package missing: re2c
> 4. directory missing: "mkdir 3rd/trial.protocol/include"
>
> [...]

--
Andrew Schorr                      e-mail: as...@te...
Telemetry Investments, L.L.C.      phone: 917-305-1748
152 W 36th St, #402                fax: 212-425-5550
New York, NY 10018-8765
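The JPAT/J mechanism announced in this thread can be modeled outside gawk. The sketch below is a hypothetical Python rendering of the described semantics (fill J[i] from the JSON Pointer in JPAT[i] for each record of a newline-delimited stream), not the extension's actual code:

```python
import json

def tabulate(ndjson_text, jpat):
    """Model of the plug-in's JPAT/J semantics: for each record in a
    newline-delimited JSON stream, fill j[i] from the pointer jpat[i]."""
    rows = []
    for line in ndjson_text.strip().splitlines():
        doc = json.loads(line)
        j = {}
        for i, pointer in jpat.items():
            value = doc
            for token in pointer.split("/")[1:]:
                # RFC 6901 unescaping: "~1" -> "/", then "~0" -> "~"
                token = token.replace("~1", "/").replace("~0", "~")
                value = value[int(token)] if isinstance(value, list) else value[token]
            j[i] = value
        rows.append(j)
    return rows

stream = '''{"name": "Kathy", "pay_rate": 4.00, "hours_worked": 10}
{"name": "Mark", "pay_rate": 5.00, "hours_worked": 20}'''

jpat = {1: "/name", 2: "/pay_rate", 3: "/hours_worked"}
for j in tabulate(stream, jpat):
    if j[3] > 0:                  # same predicate as the AWK rule: J[3] > 0
        print(j[1], j[2] * j[3])  # -> "Kathy 40.0", then "Mark 100.0"
```

The AWK version differs only in number formatting (gawk's print would emit 40 rather than 40.0, subject to OFMT/CONVFMT).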
From: Vinícius d. S. O. <vin...@gm...> - 2020-09-08 19:20:12
On Tue, Sep 8, 2020 at 15:42, Andrew J. Schorr <as...@te...> wrote:

> But I see that "jsonstream" is used in some places, but the
> namespace was set to "json". Might it make sense to use
> "jsonstream" for the namespace to avoid colliding with the
> existing gawk "json" library?

I actually don't like the name jsonstream. I'll update the plug-in name to
tabjson later. The repo name already reflects the current name, but I need
to update the source code too.

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
From: Vinícius d. S. O. <vin...@gm...> - 2020-09-08 19:18:12
On Tue, Sep 8, 2020 at 14:11, Andrew J. Schorr <as...@te...> wrote:

> Hi,

Hi Andrew,

> Are you aware that there's an existing gawk json extension
> for converting between JSON and gawk arrays? It's very simple
> and straightforward:
> http://gawkextlib.sourceforge.net/json/json.html#:~:text=Summary,gawk%20associative%20arrays%20and%20JSON

Yes, I'm aware of this plug-in.

> I wonder whether you might consider using a different namespace
> to avoid colliding with the existing json extension.

I could change the namespace, but in truth I like the json namespace. If I
can think of another identifier that I like and is short, I'll update it
right away.

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
From: Vinícius d. S. O. <vin...@gm...> - 2020-09-08 19:13:30
On Tue, Sep 8, 2020 at 13:53, Jürgen Kahrs via Gawkextlib-users <gaw...@li...> wrote:

> Hello Vinícius,

Hello Jürgen, thanks for taking the time to look at the project.

> After looking at your web page, I got the impression that your
> project is not an "extension" to GAWK in the established technical
> sense of GAWK extensions.
> https://www.gnu.org/software/gawk/manual/html_node/Extension-Intro.html
> Your source code looks rather like a proposal for a change to the
> source code of GAWK.

Sorry about the unfamiliar directory structure. The project is actually an
extension. I compile it and the process generates a jsonstream.so file in
the build directory that can be loaded with gawk via the -l command line
flag if the AWKLIBPATH env var is properly set.

> Did you consider packaging your source code as an extension and
> supplying the doc with the packaged extension ?

The code is already an extension. I'll write a manpage later.

--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/
From: <jue...@go...> - 2020-09-08 18:58:14
Following your lead, I also tried a build (easier than I thought) and ran
into the same problem. I see the following reasons for the build failure:

1. the project requires boost 1.69 and I have 1.66 only.
2. package missing: libboost_headers1_66_0-devel
3. package missing: re2c
4. directory missing: "mkdir 3rd/trial.protocol/include"

Now I run out of ideas:

meson builddir
The Meson build system
Version: 0.46.0
Source dir: /home/kahrs/work/gawk/jsonstream/gawk-tabjson
Build dir: /home/kahrs/work/gawk/jsonstream/gawk-tabjson/builddir
Build type: native build
Project name: gawk-jsonstream
Native C++ compiler: c++ (gcc 7.5.0 "c++ (SUSE Linux) 7.5.0")
Build machine cpu family: x86_64
Build machine cpu: x86_64
Dependency Boost () found: YES 1.66
Program re2c found: YES (/usr/bin/re2c)
Build targets in project: 1
Found ninja-1.8.2 at /usr/bin/ninja

kahrs@linux-q7qy:~/work/gawk/jsonstream/gawk-tabjson> cd builddir/
kahrs@linux-q7qy:~/work/gawk/jsonstream/gawk-tabjson/builddir> meson compile
Error during basic setup:

Neither directory contains a build file meson.build.

kahrs@linux-q7qy:~/work/gawk/jsonstream/gawk-tabjson/builddir> ls -l
insgesamt 24
-rw-r--r-- 1 kahrs users 5062  8. Sep 20:55 build.ninja
drwxr-xr-x 2 kahrs users 4096  8. Sep 20:55 compile
-rw-r--r-- 1 kahrs users 1533  8. Sep 20:55 compile_commands.json
lrwxrwxrwx 1 kahrs users   15  8. Sep 20:55 jsonstream.so -> jsonstream.so.0
lrwxrwxrwx 1 kahrs users   19  8. Sep 20:55 jsonstream.so.0 -> jsonstream.so.0.1.0
drwxr-xr-x 2 kahrs users 4096  8. Sep 20:55 meson-logs
drwxr-xr-x 2 kahrs users 4096  8. Sep 20:55 meson-private

Perhaps some files were not put into the repo.

> I installed meson and boost on my Fedora 32 system, but I get an error
> when I run "meson builddir":
>
> sh-5.0$ meson builddir
> The Meson build system
> Version: 0.55.1
> Source dir: /home/users/schorr/gawk-tabjson
> Build dir: /home/users/schorr/gawk-tabjson/builddir
> Build type: native build
> Project name: gawk-jsonstream
> Project version: undefined
> C++ compiler for the host machine: c++ (gcc 10.2.1 "c++ (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1)")
> C++ linker for the host machine: c++ ld.bfd 2.34-4
> Host machine cpu family: x86_64
> Host machine cpu: x86_64
> Found pkg-config: /usr/bin/pkg-config (1.6.3)
> Run-time dependency Boost found: NO
>
> meson.build:3:0: ERROR: Dependency "boost" not found
>
> A full log can be found at /home/users/schorr/gawk-tabjson/builddir/meson-logs/meson-log.txt
>
> So I'm stuck.
>
> But I see that "jsonstream" is used in some places, but the
> namespace was set to "json". Might it make sense to use
> "jsonstream" for the namespace to avoid colliding with the
> existing gawk "json" library?
>
> Regards,
> Andy
>
> On Tue, Sep 08, 2020 at 01:11:30PM -0400, Andrew J. Schorr wrote:
>> Hi,
>>
>> If you take a look at the source code, it looks to me as if it is
>> packaged as an extension by calling register_input_parser and
>> adding a json::output function. You can see the code like so:
>>
>>     git clone https://gitlab.com/vinipsmaker/gawk-tabjson.git
>>
>> I confess that I have no experience with the Meson build
>> system, so I haven't tried building it.
>>
>> I agree that this looks like an interesting feature set.
>> Are you aware that there's an existing gawk json extension
>> for converting between JSON and gawk arrays? It's very simple
>> and straightforward:
>> http://gawkextlib.sourceforge.net/json/json.html#:~:text=Summary,gawk%20associative%20arrays%20and%20JSON
>>
>> I wonder whether you might consider using a different namespace
>> to avoid colliding with the existing json extension.
>>
>> Regards,
>> Andy
>>
>> On Tue, Sep 08, 2020 at 06:53:00PM +0200, Jürgen Kahrs via Gawkextlib-users wrote:
>>> Hello Vinícius,
>>> after reading your example scripts, your project reminded me of
>>> the XML extension that exists for GNU Awk. In both cases AWK
>>> supplies the syntax for the scripting and the extension supplies
>>> the access mechanism to the data files (JSON / XML).
>>> In cases like these, AWK's syntax is really much more readable
>>> than the syntax of other tools that were created especially for
>>> one kind of new data format and nothing else.
>>>
>>> After looking at your web page, I got the impression that your
>>> project is not an "extension" to GAWK in the established technical
>>> sense of GAWK extensions.
>>> https://www.gnu.org/software/gawk/manual/html_node/Extension-Intro.html
>>> Your source code looks rather like a proposal for a change to the
>>> source code of GAWK. This idea (building access to a special file
>>> format into GAWK itself) has already been discussed several times.
>>> For example, reading CSV, XML, PNG and ZIP files could have been built
>>> into GAWK's released source code. Such ideas have been rejected
>>> for good reasons. Most importantly, GAWK would grow too fat for
>>> small embedded systems, would depend on too many external libraries,
>>> and the workload of source code management, testing and documentation
>>> would all be put onto the shoulders of the GAWK maintainer.
>>> Did you know that mawk is still so popular mostly because it is not as
>>> "fat and slow as GAWK has become over the years"?
>>> (sorry Arnold, I am exaggerating a bit to make the point)
>>>
>>> Now you see the way my argument goes: keep file format access
>>> outside of the original GAWK source code and put it into an extension.
>>> Andrew Schorr has done the same for the extension that reads XML files.
>>> Did you consider packaging your source code as an extension and
>>> supplying the doc with the packaged extension?
>>>
>>> Thanks for sending a notice to us.
>>>
>>> Juergen Kahrs
>>>
>>> Hi,
>>>
>>> I'd like to announce a new JSON plug-in that I've been working on over
>>> the past days to consume JSON streams.
>>>
>>> This new project is inspired by gawk's FPAT variable. As you all know,
>>> FS allows you to specify how to find the field separator, and FPAT
>>> allows you to specify how to find the field contents. That's how I
>>> designed the plug-in. There is a JPAT variable that allows you to
>>> specify how to find the contents. It works just like FPAT, but for JSON.
>>>
>>> JPAT is an array and each index tells the plug-in how to fill the
>>> same index in the J array. The values of JPAT are JSON Pointers[1].
>>> For instance, given the following stream:
>>>
>>> {"name": "Beth", "pay_rate": 4.00, "hours_worked": 0}
>>> {"name": "Dan", "pay_rate": 3.75, "hours_worked": 0}
>>> {"name": "Kathy", "pay_rate": 4.00, "hours_worked": 10}
>>> {"name": "Mark", "pay_rate": 5.00, "hours_worked": 20}
>>> {"name": "Mary", "pay_rate": 5.50, "hours_worked": 22}
>>> {"name": "Susie", "pay_rate": 4.25, "hours_worked": 18}
>>>
>>> And the following AWK program:
>>>
>>> BEGIN {
>>>     JPAT[1] = "/name"
>>>     JPAT[2] = "/pay_rate"
>>>     JPAT[3] = "/hours_worked"
>>> }
>>>
>>> J[3] > 0 { print J[1], J[2] * J[3] }
>>>
>>> The output will be:
>>>
>>> Kathy 40
>>> Mark 100
>>> Mary 121
>>> Susie 76.5
>>>
>>> So the idea is to give AWK a tabular view of the data. AWK is great at
>>> processing tabular data, so I think this fits well with the rest of
>>> AWK's design.
>>>
>>> The tool is robust and the fields inside the JSON can come in any
>>> order; the order in which you specified them doesn't matter. Arbitrary
>>> nesting is supported. The JSON is parsed in a one-pass fashion and only
>>> the fields you requested are decoded. No DOM tree is saved at any time.
>>>
>>> You can update the JSON by calling the function json::output(), which
>>> returns a new serialized JSON string. Creating the new JSON is cheap,
>>> as its only job is to concatenate strings from the offsets saved during
>>> the parsing stage. If a value was originally boolean and you set J[n]
>>> to 1 or 0, the field will stay a boolean (but any other numeric value
>>> will coerce the type to a double). Values deleted from J are converted
>>> to the null JSON value.
>>>
>>> You can also use the unwind operation to flatten an array in the
>>> document. For instance, given the following data set:
>>>
>>> { "_id" : "jane", "joined" : "2011-03-02", "likes" : ["golf", "racquetball"] }
>>> { "_id" : "joe", "joined" : "2012-07-02", "likes" : ["tennis", "golf", "swimming"] }
>>>
>>> And the following AWK program:
>>>
>>> BEGIN {
>>>     JPAT[1] = "/likes"
>>>     JUNWIND = 1
>>> }
>>>
>>> { print NR, json::output() }
>>>
>>> The output will be:
>>>
>>> 1 { "_id" : "jane", "joined" : "2011-03-02", "likes" : "golf" }
>>> 2 { "_id" : "jane", "joined" : "2011-03-02", "likes" : "racquetball" }
>>> 3 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "tennis" }
>>> 4 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "golf" }
>>> 5 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "swimming" }
>>>
>>> And this other AWK program:
>>>
>>> BEGIN {
>>>     JPAT[1] = "/likes"
>>>     JUNWIND = 1
>>> }
>>>
>>> { likes[J[1]] += 1 }
>>>
>>> END {
>>>     PROCINFO["sorted_in"] = "@val_num_desc"
>>>     for (like in likes) {
>>>         print like, likes[like]
>>>     }
>>> }
>>>
>>> Will output:
>>>
>>> golf 2
>>> tennis 1
>>> swimming 1
>>> racquetball 1
>>>
>>> The unwind operation idea comes from the MongoDB aggregation framework,
>>> which has an $unwind operator. As each document is synthesized for an
>>> array element, the variable JUIND will hold the index of that element
>>> in the array. If the field wasn't an array and the document is just
>>> being passed through, JUIND has the value 0. If JUIND is different from
>>> 0, JULENGTH is also set to the array length. These two variables let
>>> you delimit array elements that originate in the same document.
>>>
>>> The homepage of the project is: https://gitlab.com/vinipsmaker/gawk-tabjson
>>>
>>> I'd like to know how strange this idea is for AWK. Do you believe it is
>>> too far off? Or do you think the AWK spirit of tabular data processing
>>> has been properly absorbed here? Which one (it can't be both)?
>>>
>>> BUGS
>>>
>>> - The README is pretty outdated. This very email is more accurate than
>>>   the README page. That's on me. I like to document how I'd like to use
>>>   a tool before I implement it, so implementation came after the README.
>>> - Right now, I don't like the name json::output(). If you know a better
>>>   name, please let me know.
>>> - The JSON Pointer syntax is 0-indexed as opposed to AWK's 1-indexed
>>>   mindset. I did it for compatibility with JSON Pointer expressions,
>>>   but I believe this decision was a mistake and I'll be changing it
>>>   later.
>>> - OFMT isn't respected yet.
>>> - I don't know how to report invalid JSONs from the stream. I don't
>>>   know what'd be the AWK way here.
>>> - Right now the plug-in always takes over streams. I'm planning to
>>>   check the value of JPAT to decide whether the plug-in's getline
>>>   should be activated, and to also provide a split()-like function so
>>>   you could use the same functionality on a per-record basis. This idea
>>>   requires thought and design on how to best proceed.
>>> - You have to pass the flag --characters-as-bytes (or the shorthand -b)
>>>   when invoking gawk.
>>> - Only newline-delimited JSONs are supported right now, but it's
>>>   possible to improve the plug-in to automatically recognize RS.
>>> - The plug-in can't insert new fields into the documents right now.
>>>   This operation is also possible with the chosen approach, but I
>>>   haven't spent the time here.
>>> - A handle for DOM-parsed subtrees could be created, but this feature
>>>   would also require thought and design.
>>> - Right now I don't use mmap() when available. mmap() should improve
>>>   plug-in performance (although it has already beaten jq's performance
>>>   in my tests).
>>> - There isn't MPFR support yet.
>>>
>>> And I hope I didn't get you too excited, as the time I have available
>>> to spend on this project this year is nearing its end and further work
>>> will have to be postponed.
>>>
>>> [1] https://tools.ietf.org/html/rfc6901
>>>
>>> --
>>> Vinícius dos Santos Oliveira
>>> https://vinipsmaker.github.io/
>>>
>>> _______________________________________________
>>> Gawkextlib-users mailing list
>>> Gaw...@li...
>>> https://lists.sourceforge.net/lists/listinfo/gawkextlib-users
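The unwind operation quoted above can likewise be modeled outside gawk. The sketch below is a hypothetical Python rendering of the described semantics (one synthesized document per array element, with JUIND/JULENGTH analogues), using a plain object key rather than a full JSON Pointer for brevity; the 1-based element index is an assumption, since the announcement only states that 0 means pass-through:

```python
import json

def unwind(ndjson_text, key):
    """For each record, if doc[key] is an array, emit one synthesized
    document per element; otherwise pass the document through unchanged.
    Returns (document, juind, julength) triples: juind is the assumed
    1-based element index (0 for pass-through) and julength the array
    length (None for pass-through), mirroring JUIND/JULENGTH."""
    out = []
    for line in ndjson_text.strip().splitlines():
        doc = json.loads(line)
        field = doc.get(key)
        if isinstance(field, list):
            for idx, element in enumerate(field, start=1):
                synthesized = dict(doc)
                synthesized[key] = element  # replace array with one element
                out.append((synthesized, idx, len(field)))
        else:
            out.append((doc, 0, None))
    return out

stream = '''{"_id": "jane", "joined": "2011-03-02", "likes": ["golf", "racquetball"]}
{"_id": "joe", "joined": "2012-07-02", "likes": ["tennis", "golf", "swimming"]}'''

for nr, (doc, juind, julength) in enumerate(unwind(stream, "likes"), start=1):
    print(nr, json.dumps(doc))  # five synthesized records, as in the quoted example
```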
From: Andrew J. S. <as...@te...> - 2020-09-08 18:42:03
|
I installed meson and boost on my Fedora 32 system, but I get an error when I run "meson builddir":

sh-5.0$ meson builddir
The Meson build system
Version: 0.55.1
Source dir: /home/users/schorr/gawk-tabjson
Build dir: /home/users/schorr/gawk-tabjson/builddir
Build type: native build
Project name: gawk-jsonstream
Project version: undefined
C++ compiler for the host machine: c++ (gcc 10.2.1 "c++ (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1)")
C++ linker for the host machine: c++ ld.bfd 2.34-4
Host machine cpu family: x86_64
Host machine cpu: x86_64
Found pkg-config: /usr/bin/pkg-config (1.6.3)
Run-time dependency Boost found: NO
meson.build:3:0: ERROR: Dependency "boost" not found
A full log can be found at /home/users/schorr/gawk-tabjson/builddir/meson-logs/meson-log.txt

So I'm stuck. I also see that "jsonstream" is used in some places, although the namespace was set to "json". Might it make sense to use "jsonstream" for the namespace to avoid colliding with the existing gawk "json" library?

Regards,
Andy

On Tue, Sep 08, 2020 at 01:11:30PM -0400, Andrew J. Schorr wrote: > Hi, > > If you take a look at the source code, it looks to me as if it is > packaged as an extension by calling register_input_parser and > adding a json::output function. You can see the code like so: > git clone https://gitlab.com/vinipsmaker/gawk-tabjson.git > I confess that I have no experience with the Meson build > system, so I haven't tried building it. > > I agree that this looks like an interesting feature set. > Are you aware that there's an existing gawk json extension > for converting between JSON and gawk arrays? It's very simple > and straightforward: > http://gawkextlib.sourceforge.net/json/json.html#:~:text=Summary,gawk%20associative%20arrays%20and%20JSON. > > I wonder whether you might consider using a different namespace > to avoid colliding with the existing json extension. 
> > Regards, > Andy > > On Tue, Sep 08, 2020 at 06:53:00PM +0200, Jürgen Kahrs via Gawkextlib-users wrote: > > Hello Vinícius, > > after reading your examples scripts, your project reminded me of > > the XML extension that exists for GNU Awk. In both cases AWK > > supplies the syntax for the scripting and the extension supplies > > the access mechansim to the data files (JSON / XML). > > In cases like these, AWK's syntax is really much more readable > > than the syntax of other tools that were created especially for > > one kind of new data format and nothing else. > > > > After looking at your web page, I got the impression that your > > project is not an "extension" to GAWK in the established technical > > sense of GAWK extensions. > > https://www.gnu.org/software/gawk/manual/html_node/Extension-Intro.html > > Your source code looks rather like a proposal for a change to the > > source code of GAWK. This idea (building access to a special file > > format into GAWK itself) has already been discussed several times. > > For example, reading CSV, XML, PNG and ZIP files could have been built > > into GAWK's released source code. Such ideas have been rejected > > for good reasons. Most importantly, GAWK would grow too fat for > > small embedded systems, would depend on too many external libraries > > and the work load of source code management, testing and documentation > > would be put all onto the shoulders of the GAWK maintainer. > > Did you know that mawk is still so popular mostly because it is not as > > "fat and slow as GAWK has become over the years" ? > > (sorry Arnold, I am exaggerating a bit to make the point) > > > > Now you see the way my argument goes: Keep file format access > > outside of the original GAWK source code and put it into an extension. > > Andrew Schorr has done the same for the extension that reads XML files. > > Did you consider packaging your source code as an extension and > > supplying the doc with the packaged extension ? 
> > > > Thanks for sending a notice to us. > > > > Juergen Kahrs > > > > > > Hi, > > > > I'd like to announce a new JSON plug-in that I've been working on the past > > days to consume JSON streams. > > > > This new project is inspired by the FPAT gawk's variable. As you all know, > > FS allows you to specify how to find the field separator. And FPAT allows > > you to specify how to find the field contents. That's how I designed the > > plugin. There is a JPAT variable that allows you to specify how to find the > > contents. It works just like FPAT, but for JSON. > > > > JPAT is an array and each index will tell how the plug-in should fill the > > same index in the J array. The values of JPAT are JSON Pointers[1]. For > > instance, given the following stream: > > > > {"name": "Beth", "pay_rate": 4.00, "hours_worked": 0} > > {"name": "Dan", "pay_rate": 3.75, "hours_worked": 0} > > {"name": "Kathy", "pay_rate": 4.00, "hours_worked": 10} > > {"name": "Mark", "pay_rate": 5.00, "hours_worked": 20} > > {"name": "Mary", "pay_rate": 5.50, "hours_worked": 22} > > {"name": "Susie", "pay_rate": 4.25, "hours_worked": 18} > > > > And the following AWK program: > > > > BEGIN { > > JPAT[1] = "/name" > > JPAT[2] = "/pay_rate" > > JPAT[3] = "/hours_worked" > > } > > J[3] > 0 { print J[1], J[2] * J[3] } > > > > The output will be: > > > > Kathy 40 > > Mark 100 > > Mary 121 > > Susie 76.5 > > > > So the idea is to give AWK a tabular view of the array. AWK is great to > > process tabular data, so I think this fits well with the rest of AWK's > > design. > > > > The tool is robust and the fields inside the JSON can come in any order. It > > doesn't matter for the order you specified. Arbitrary nesting is supported. > > The JSON is parsed in a one-pass fashion and only decodes the fields you > > requested. No DOM tree is saved at any time. > > > > You can update the JSON by calling the function json::output() which will > > give you a new serialized JSON string as its return. 
To create a new JSON > > is a cheap process as its only job is to concatenate strings from the > > offsets saved during the parsing stage. If a value was originally boolean > > and you set J[n] to 1 or 0, the field will stay a boolean (but any other > > numeric value will coerce the type to a double). Values deleted from J are > > converted to the null JSON value. > > > > You can also use the unwind operation to flatten an array in the document. > > For instance, given the following data set: > > > > { "_id" : "jane", "joined" : "2011-03-02", "likes" : ["golf", > > "racquetball"] } > > { "_id" : "joe", "joined" : "2012-07-02", "likes" : ["tennis", "golf", > > "swimming"] } > > > > And the following AWK program: > > > > BEGIN { > > JPAT[1] = "/likes" > > JUNWIND = 1 > > } > > { print NR, json::output() } > > > > The output will be: > > > > 1 { "_id" : "jane", "joined" : "2011-03-02", "likes" : "golf" } > > 2 { "_id" : "jane", "joined" : "2011-03-02", "likes" : "racquetball" } > > 3 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "tennis" } > > 4 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "golf" } > > 5 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "swimming" } > > > > And this other AWK program: > > > > BEGIN { > > JPAT[1] = "/likes" > > JUNWIND = 1 > > } > > { likes[J[1]] += 1 } > > END { > > PROCINFO["sorted_in"] = "@val_num_desc" > > for (like in likes) { > > print like, likes[like] > > } > > } > > > > Will output: > > > > golf 2 > > tennis 1 > > swimming 1 > > racquetball 1 > > > > The unwind operation idea comes from the MongoDB aggregation framework for > > which an $unwind operator exists. As each document is synthesized for that > > array element, the variable JUIND will hold the index from that element in > > the array. If that field wasn't an array and the document is just being > > passed through, JUIND has the value 0. If JUIND value is different than 0, > > JULENGTH is also set with the array length. 
These two variables let you > > delimit array elements that have origin in the same document. > > > > The homepage of the project is: https://gitlab.com/vinipsmaker/gawk-tabjson > > > > I'd like to know how strange this idea is for AWK. Do you believe it is too > > far off? Or maybe do you think the AWK spirit for tabular data processing > > has been properly absorbed here? Which one (it can't be both)? > > > > BUGS > > > > □ The README is pretty outdated. This very email is more accurate than > > the README page. That's on me. I like to document how I'd like to use a > > tool before I implement it, so implementation came after the README. > > □ Right now, I don't like the name json::output(). If you know a better > > name, please let me know. > > □ The JSON pointer syntax is 0-indexed as opposed to AWK's 1-indexed > > mindset. I did it for compatibility with JSON pointer expressions, but > > I believe this decision was a mistake and I'll be changing it later. > > □ OFMT isn't respected yet. > > □ I don't know how to report invalid JSONs from the stream. I don't know > > what'd be the AWK way here. > > □ Right now the plug-in always takes over streams. I'm planning to check > > for the value of JPAT to decide if the plug-in's getline should be > > activated. And to also provide a split()-like function so you could use > > the same functionality on a per-record basis. This idea requires > > thought and design on how to best proceed. > > □ You have to pass the flag --characters-as-bytes (or the shorthand -b) > > when invoking gawk. > > □ Only newline delimited JSONs are supported right now, but it's possible > > to improve the plug-in to automatically recognize RS. > > □ The plug-in can't right now insert new fields in the documents. This > > operation is also possible with the chosen approach, but I haven't > > spent the time here. > > □ A handle for DOM-parsed subtrees could be created, but this feature > > would also require thought and design. 
> > □ Right now I don't use mmap() when available. mmap() should improve > > plug-in performance (although it has already beaten jq performance on > > my tests). > > □ There isn't MPFR support yet. > > > > And I hope I didn't get you too excited as the time I have available to > > spend on this project for this year is nearing its end and further work > > will have to be postponed to later. > > > > [1] https://tools.ietf.org/html/rfc6901 > > > > -- > > Vinícius dos Santos Oliveira > > https://vinipsmaker.github.io/ > > > > > > > > _______________________________________________ > > Gawkextlib-users mailing list > > Gaw...@li... > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > > > > > > _______________________________________________ > > Gawkextlib-users mailing list > > Gaw...@li... > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > -- > Andrew Schorr e-mail: as...@te... > Telemetry Investments, L.L.C. phone: 917-305-1748 > 152 W 36th St, #402 fax: 212-425-5550 > New York, NY 10018-8765 > > > _______________________________________________ > Gawkextlib-users mailing list > Gaw...@li... > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users -- Andrew Schorr e-mail: as...@te... Telemetry Investments, L.L.C. phone: 917-305-1748 152 W 36th St, #402 fax: 212-425-5550 New York, NY 10018-8765 |
From: Andrew J. S. <as...@te...> - 2020-09-08 17:11:50
|
Hi, If you take a look at the source code, it looks to me as if it is packaged as an extension by calling register_input_parser and adding a json::output function. You can see the code like so: git clone https://gitlab.com/vinipsmaker/gawk-tabjson.git I confess that I have no experience with the Meson build system, so I haven't tried building it. I agree that this looks like an interesting feature set. Are you aware that there's an existing gawk json extension for converting between JSON and gawk arrays? It's very simple and straightforward: http://gawkextlib.sourceforge.net/json/json.html#:~:text=Summary,gawk%20associative%20arrays%20and%20JSON. I wonder whether you might consider using a different namespace to avoid colliding with the existing json extension. Regards, Andy On Tue, Sep 08, 2020 at 06:53:00PM +0200, Jürgen Kahrs via Gawkextlib-users wrote: > Hello Vinícius, > after reading your examples scripts, your project reminded me of > the XML extension that exists for GNU Awk. In both cases AWK > supplies the syntax for the scripting and the extension supplies > the access mechansim to the data files (JSON / XML). > In cases like these, AWK's syntax is really much more readable > than the syntax of other tools that were created especially for > one kind of new data format and nothing else. > > After looking at your web page, I got the impression that your > project is not an "extension" to GAWK in the established technical > sense of GAWK extensions. > https://www.gnu.org/software/gawk/manual/html_node/Extension-Intro.html > Your source code looks rather like a proposal for a change to the > source code of GAWK. This idea (building access to a special file > format into GAWK itself) has already been discussed several times. > For example, reading CSV, XML, PNG and ZIP files could have been built > into GAWK's released source code. Such ideas have been rejected > for good reasons. 
Most importantly, GAWK would grow too fat for > small embedded systems, would depend on too many external libraries > and the work load of source code management, testing and documentation > would be put all onto the shoulders of the GAWK maintainer. > Did you know that mawk is still so popular mostly because it is not as > "fat and slow as GAWK has become over the years" ? > (sorry Arnold, I am exaggerating a bit to make the point) > > Now you see the way my argument goes: Keep file format access > outside of the original GAWK source code and put it into an extension. > Andrew Schorr has done the same for the extension that reads XML files. > Did you consider packaging your source code as an extension and > supplying the doc with the packaged extension ? > > Thanks for sending a notice to us. > > Juergen Kahrs > > > Hi, > > I'd like to announce a new JSON plug-in that I've been working on the past > days to consume JSON streams. > > This new project is inspired by the FPAT gawk's variable. As you all know, > FS allows you to specify how to find the field separator. And FPAT allows > you to specify how to find the field contents. That's how I designed the > plugin. There is a JPAT variable that allows you to specify how to find the > contents. It works just like FPAT, but for JSON. > > JPAT is an array and each index will tell how the plug-in should fill the > same index in the J array. The values of JPAT are JSON Pointers[1]. 
For > instance, given the following stream: > > {"name": "Beth", "pay_rate": 4.00, "hours_worked": 0} > {"name": "Dan", "pay_rate": 3.75, "hours_worked": 0} > {"name": "Kathy", "pay_rate": 4.00, "hours_worked": 10} > {"name": "Mark", "pay_rate": 5.00, "hours_worked": 20} > {"name": "Mary", "pay_rate": 5.50, "hours_worked": 22} > {"name": "Susie", "pay_rate": 4.25, "hours_worked": 18} > > And the following AWK program: > > BEGIN { > JPAT[1] = "/name" > JPAT[2] = "/pay_rate" > JPAT[3] = "/hours_worked" > } > J[3] > 0 { print J[1], J[2] * J[3] } > > The output will be: > > Kathy 40 > Mark 100 > Mary 121 > Susie 76.5 > > So the idea is to give AWK a tabular view of the array. AWK is great to > process tabular data, so I think this fits well with the rest of AWK's > design. > > The tool is robust and the fields inside the JSON can come in any order. It > doesn't matter for the order you specified. Arbitrary nesting is supported. > The JSON is parsed in a one-pass fashion and only decodes the fields you > requested. No DOM tree is saved at any time. > > You can update the JSON by calling the function json::output() which will > give you a new serialized JSON string as its return. To create a new JSON > is a cheap process as its only job is to concatenate strings from the > offsets saved during the parsing stage. If a value was originally boolean > and you set J[n] to 1 or 0, the field will stay a boolean (but any other > numeric value will coerce the type to a double). Values deleted from J are > converted to the null JSON value. > > You can also use the unwind operation to flatten an array in the document. 
> For instance, given the following data set: > > { "_id" : "jane", "joined" : "2011-03-02", "likes" : ["golf", > "racquetball"] } > { "_id" : "joe", "joined" : "2012-07-02", "likes" : ["tennis", "golf", > "swimming"] } > > And the following AWK program: > > BEGIN { > JPAT[1] = "/likes" > JUNWIND = 1 > } > { print NR, json::output() } > > The output will be: > > 1 { "_id" : "jane", "joined" : "2011-03-02", "likes" : "golf" } > 2 { "_id" : "jane", "joined" : "2011-03-02", "likes" : "racquetball" } > 3 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "tennis" } > 4 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "golf" } > 5 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "swimming" } > > And this other AWK program: > > BEGIN { > JPAT[1] = "/likes" > JUNWIND = 1 > } > { likes[J[1]] += 1 } > END { > PROCINFO["sorted_in"] = "@val_num_desc" > for (like in likes) { > print like, likes[like] > } > } > > Will output: > > golf 2 > tennis 1 > swimming 1 > racquetball 1 > > The unwind operation idea comes from the MongoDB aggregation framework for > which an $unwind operator exists. As each document is synthesized for that > array element, the variable JUIND will hold the index from that element in > the array. If that field wasn't an array and the document is just being > passed through, JUIND has the value 0. If JUIND value is different than 0, > JULENGTH is also set with the array length. These two variables let you > delimit array elements that have origin in the same document. > > The homepage of the project is: https://gitlab.com/vinipsmaker/gawk-tabjson > > I'd like to know how strange this idea is for AWK. Do you believe it is too > far off? Or maybe do you think the AWK spirit for tabular data processing > has been properly absorbed here? Which one (it can't be both)? > > BUGS > > □ The README is pretty outdated. This very email is more accurate than > the README page. That's on me. 
I like to document how I'd like to use a > tool before I implement it, so implementation came after the README. > □ Right now, I don't like the name json::output(). If you know a better > name, please let me know. > □ The JSON pointer syntax is 0-indexed as opposed to AWK's 1-indexed > mindset. I did it for compatibility with JSON pointer expressions, but > I believe this decision was a mistake and I'll be changing it later. > □ OFMT isn't respected yet. > □ I don't know how to report invalid JSONs from the stream. I don't know > what'd be the AWK way here. > □ Right now the plug-in always takes over streams. I'm planning to check > for the value of JPAT to decide if the plug-in's getline should be > activated. And to also provide a split()-like function so you could use > the same functionality on a per-record basis. This idea requires > thought and design on how to best proceed. > □ You have to pass the flag --characters-as-bytes (or the shorthand -b) > when invoking gawk. > □ Only newline delimited JSONs are supported right now, but it's possible > to improve the plug-in to automatically recognize RS. > □ The plug-in can't right now insert new fields in the documents. This > operation is also possible with the chosen approach, but I haven't > spent the time here. > □ A handle for DOM-parsed subtrees could be created, but this feature > would also require thought and design. > □ Right now I don't use mmap() when available. mmap() should improve > plug-in performance (although it has already beaten jq performance on > my tests). > □ There isn't MPFR support yet. > > And I hope I didn't get you too excited as the time I have available to > spend on this project for this year is nearing its end and further work > will have to be postponed to later. > > [1] https://tools.ietf.org/html/rfc6901 > > -- > Vinícius dos Santos Oliveira > https://vinipsmaker.github.io/ > > > > _______________________________________________ > Gawkextlib-users mailing list > Gaw...@li... 
> https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > _______________________________________________ > Gawkextlib-users mailing list > Gaw...@li... > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users -- Andrew Schorr e-mail: as...@te... Telemetry Investments, L.L.C. phone: 917-305-1748 152 W 36th St, #402 fax: 212-425-5550 New York, NY 10018-8765 |
From: <jue...@go...> - 2020-09-08 16:53:25
|
Hello Vinícius, after reading your example scripts, your project reminded me of the XML extension that exists for GNU Awk. In both cases AWK supplies the syntax for the scripting and the extension supplies the access mechanism to the data files (JSON / XML). In cases like these, AWK's syntax is really much more readable than the syntax of other tools that were created especially for one kind of new data format and nothing else. After looking at your web page, I got the impression that your project is not an "extension" to GAWK in the established technical sense of GAWK extensions. https://www.gnu.org/software/gawk/manual/html_node/Extension-Intro.html Your source code looks rather like a proposal for a change to the source code of GAWK. This idea (building access to a special file format into GAWK itself) has already been discussed several times. For example, reading CSV, XML, PNG and ZIP files could have been built into GAWK's released source code. Such ideas have been rejected for good reasons. Most importantly, GAWK would grow too fat for small embedded systems, would depend on too many external libraries, and the workload of source code management, testing and documentation would all be put onto the shoulders of the GAWK maintainer. Did you know that mawk is still so popular mostly because it is not as "fat and slow as GAWK has become over the years"? (sorry Arnold, I am exaggerating a bit to make the point) Now you see the way my argument goes: keep file format access outside of the original GAWK source code and put it into an extension. Andrew Schorr has done the same for the extension that reads XML files. Did you consider packaging your source code as an extension and supplying the doc with the packaged extension? Thanks for sending a notice to us. Juergen Kahrs > Hi, > > I'd like to announce a new JSON plug-in that I've been working on the past days to consume JSON streams. > > This new project is inspired by the FPAT gawk's variable. 
As you all know, FS allows you to specify how to find the /field separator/. And FPAT allows you to specify how to find the /field contents/. That's how I designed the plugin. There is a JPAT variable that allows you to specify how to find the contents. It works just like FPAT, but for JSON. > > JPAT is an array and each index will tell how the plug-in should fill the same index in the J array. The values of JPAT are JSON Pointers[1]. For instance, given the following stream: > > {"name": "Beth", "pay_rate": 4.00, "hours_worked": 0} > {"name": "Dan", "pay_rate": 3.75, "hours_worked": 0} > {"name": "Kathy", "pay_rate": 4.00, "hours_worked": 10} > {"name": "Mark", "pay_rate": 5.00, "hours_worked": 20} > {"name": "Mary", "pay_rate": 5.50, "hours_worked": 22} > {"name": "Susie", "pay_rate": 4.25, "hours_worked": 18} > > And the following AWK program: > > BEGIN { > JPAT[1] = "/name" > JPAT[2] = "/pay_rate" > JPAT[3] = "/hours_worked" > } > J[3] > 0 { print J[1], J[2] * J[3] } > > The output will be: > > Kathy 40 > Mark 100 > Mary 121 > Susie 76.5 > > So the idea is to give AWK a /tabular view/ of the array. AWK is great to process tabular data, so I think this fits well with the rest of AWK's design. > > The tool is robust and the fields inside the JSON can come in any order. It doesn't matter for the order you specified. Arbitrary nesting is supported. The JSON is parsed in a one-pass fashion and only decodes the fields you requested. No DOM tree is saved at any time. > > You can update the JSON by calling the function json::output() which will give you a new serialized JSON string as its return. To create a new JSON is a cheap process as its only job is to concatenate strings from the offsets saved during the parsing stage. If a value was originally boolean and you set J[n] to 1 or 0, the field will stay a boolean (but any other numeric value will coerce the type to a double). Values deleted from J are converted to the null JSON value. 
> > You can also use the unwind operation to flatten an array in the document. For instance, given the following data set: > > { "_id" : "jane", "joined" : "2011-03-02", "likes" : ["golf", "racquetball"] } > { "_id" : "joe", "joined" : "2012-07-02", "likes" : ["tennis", "golf", "swimming"] } > > And the following AWK program: > > BEGIN { > JPAT[1] = "/likes" > JUNWIND = 1 > } > { print NR, json::output() } > > The output will be: > > 1 { "_id" : "jane", "joined" : "2011-03-02", "likes" : "golf" } > 2 { "_id" : "jane", "joined" : "2011-03-02", "likes" : "racquetball" } > 3 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "tennis" } > 4 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "golf" } > 5 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "swimming" } > > And this other AWK program: > > BEGIN { > JPAT[1] = "/likes" > JUNWIND = 1 > } > { likes[J[1]] += 1 } > END { > PROCINFO["sorted_in"] = "@val_num_desc" > for (like in likes) { > print like, likes[like] > } > } > > Will output: > > golf 2 > tennis 1 > swimming 1 > racquetball 1 > > The unwind operation idea comes from the MongoDB aggregation framework for which an $unwind operator exists. As each document is synthesized for that array element, the variable JUIND will hold the index from that element in the array. If that field wasn't an array and the document is just being passed through, JUIND has the value 0. If JUIND value is different than 0, JULENGTH is also set with the array length. These two variables let you delimit array elements that have origin in the same document. > > The homepage of the project is: https://gitlab.com/vinipsmaker/gawk-tabjson > > I'd like to know how strange this idea is for AWK. Do you believe it is too far off? Or maybe do you think the AWK spirit for tabular data processing has been properly absorbed here? Which one (it can't be both)? > > BUGS > > * The README is pretty outdated. This very email is more accurate than the README page. That's on me. 
I like to document how I'd like to use a tool before I implement it, so implementation came after the README. > * Right now, I don't like the name json::output(). If you know a better name, please let me know. > * The JSON pointer syntax is 0-indexed as opposed to AWK's 1-indexed mindset. I did it for compatibility with JSON pointer expressions, but I believe this decision was a mistake and I'll be changing it later. > * OFMT isn't respected yet. > * I don't know how to report invalid JSONs from the stream. I don't know what'd be the AWK way here. > * Right now the plug-in always takes over streams. I'm planning to check for the value of JPAT to decide if the plug-in's getline should be activated. And to also provide a split()-like function so you could use the same functionality on a per-record basis. This idea requires thought and design on how to best proceed. > * You have to pass the flag --characters-as-bytes (or the shorthand -b) when invoking gawk. > * Only newline delimited JSONs are supported right now, but it's possible to improve the plug-in to automatically recognize RS. > * The plug-in can't right now insert new fields in the documents. This operation is also possible with the chosen approach, but I haven't spent the time here. > * A handle for DOM-parsed subtrees could be created, but this feature would also require thought and design. > * Right now I don't use mmap() when available. mmap() should improve plug-in performance (although it has already beaten jq performance on my tests). > * There isn't MPFR support yet. > > > And I hope I didn't get you too excited as the time I have available to spend on this project for this year is nearing its end and further work will have to be postponed to later. > > [1] https://tools.ietf.org/html/rfc6901 > > -- > Vinícius dos Santos Oliveira > https://vinipsmaker.github.io/ > > > _______________________________________________ > Gawkextlib-users mailing list > Gaw...@li... 
> https://lists.sourceforge.net/lists/listinfo/gawkextlib-users |
From: Vinícius d. S. O. <vin...@gm...> - 2020-09-08 10:17:36
|
Hi,

I'd like to announce a new JSON plug-in that I've been working on over the past few days to consume JSON streams.

This new project is inspired by gawk's FPAT variable. As you all know, FS allows you to specify how to find the *field separator*. And FPAT allows you to specify how to find the *field contents*. That's how I designed the plug-in. There is a JPAT variable that allows you to specify how to find the contents. It works just like FPAT, but for JSON.

JPAT is an array and each index tells the plug-in how to fill the same index in the J array. The values of JPAT are JSON Pointers[1]. For instance, given the following stream:

{"name": "Beth", "pay_rate": 4.00, "hours_worked": 0}
{"name": "Dan", "pay_rate": 3.75, "hours_worked": 0}
{"name": "Kathy", "pay_rate": 4.00, "hours_worked": 10}
{"name": "Mark", "pay_rate": 5.00, "hours_worked": 20}
{"name": "Mary", "pay_rate": 5.50, "hours_worked": 22}
{"name": "Susie", "pay_rate": 4.25, "hours_worked": 18}

And the following AWK program:

BEGIN {
    JPAT[1] = "/name"
    JPAT[2] = "/pay_rate"
    JPAT[3] = "/hours_worked"
}
J[3] > 0 { print J[1], J[2] * J[3] }

The output will be:

Kathy 40
Mark 100
Mary 121
Susie 76.5

So the idea is to give AWK a *tabular view* of the data. AWK is great at processing tabular data, so I think this fits well with the rest of AWK's design.

The tool is robust: the fields inside the JSON can come in any order, regardless of the order in which you specified them. Arbitrary nesting is supported. The JSON is parsed in a one-pass fashion and only the fields you requested are decoded. No DOM tree is saved at any time.

You can update the JSON by calling the function json::output(), which gives you a new serialized JSON string as its return value. Creating a new JSON is cheap, as its only job is to concatenate strings from the offsets saved during the parsing stage. 
If a value was originally boolean and you set J[n] to 1 or 0, the field will stay a boolean (but any other numeric value will coerce the type to a double). Values deleted from J are converted to the null JSON value.

You can also use the unwind operation to flatten an array in the document. For instance, given the following data set:

{ "_id" : "jane", "joined" : "2011-03-02", "likes" : ["golf", "racquetball"] }
{ "_id" : "joe", "joined" : "2012-07-02", "likes" : ["tennis", "golf", "swimming"] }

And the following AWK program:

BEGIN {
    JPAT[1] = "/likes"
    JUNWIND = 1
}
{ print NR, json::output() }

The output will be:

1 { "_id" : "jane", "joined" : "2011-03-02", "likes" : "golf" }
2 { "_id" : "jane", "joined" : "2011-03-02", "likes" : "racquetball" }
3 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "tennis" }
4 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "golf" }
5 { "_id" : "joe", "joined" : "2012-07-02", "likes" : "swimming" }

And this other AWK program:

BEGIN {
    JPAT[1] = "/likes"
    JUNWIND = 1
}
{ likes[J[1]] += 1 }
END {
    PROCINFO["sorted_in"] = "@val_num_desc"
    for (like in likes) {
        print like, likes[like]
    }
}

Will output:

golf 2
tennis 1
swimming 1
racquetball 1

The unwind operation idea comes from the MongoDB aggregation framework, for which an $unwind operator exists. As each document is synthesized for an array element, the variable JUIND holds the index of that element in the array. If the field wasn't an array and the document is just being passed through, JUIND has the value 0. If JUIND is nonzero, JULENGTH is also set to the array length. These two variables let you delimit array elements that originate from the same document.

The homepage of the project is: https://gitlab.com/vinipsmaker/gawk-tabjson

I'd like to know how strange this idea is for AWK. Do you believe it is too far off? Or do you think the AWK spirit for tabular data processing has been properly absorbed here? Which one (it can't be both)? 
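The tabular view is exactly what classic awk consumes. As a point of comparison (a sketch only, no plug-in involved), the same pay-rate computation works on a pre-flattened, whitespace-separated version of the stream with plain awk:

```shell
# The pay-rate example run over a pre-flattened version of the JSON
# stream -- plain awk, no plug-in required.
printf '%s\n' \
  'Beth 4.00 0' \
  'Dan 3.75 0' \
  'Kathy 4.00 10' \
  'Mark 5.00 20' \
  'Mary 5.50 22' \
  'Susie 4.25 18' |
awk '$3 > 0 { print $1, $2 * $3 }'
# Prints:
#   Kathy 40
#   Mark 100
#   Mary 121
#   Susie 76.5
```

The plug-in's contribution is precisely the flattening step that this sketch assumes has already happened.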
BUGS

- The README is pretty outdated. This very email is more accurate than the README page. That's on me: I like to document how I'd like to use a tool before I implement it, so the implementation came after the README.
- Right now I don't like the name json::output(). If you know a better name, please let me know.
- The JSON Pointer syntax is 0-indexed, as opposed to AWK's 1-indexed mindset. I did it for compatibility with JSON Pointer expressions, but I believe this decision was a mistake and I'll be changing it later.
- OFMT isn't respected yet.
- I don't know how to report invalid JSON from the stream. I don't know what the AWK way would be here.
- Right now the plug-in always takes over streams. I'm planning to check the value of JPAT to decide whether the plug-in's getline should be activated, and also to provide a split()-like function so you can use the same functionality on a per-record basis. This idea requires thought and design on how best to proceed.
- You have to pass the flag --characters-as-bytes (or the shorthand -b) when invoking gawk.
- Only newline-delimited JSON is supported right now, but it's possible to improve the plug-in to recognize RS automatically.
- The plug-in can't currently insert new fields into documents. This operation is also possible with the chosen approach, but I haven't spent the time on it.
- A handle for DOM-parsed subtrees could be created, but this feature would also require thought and design.
- Right now I don't use mmap() when available. mmap() should improve the plug-in's performance (although it already beats jq in my tests).
- There isn't MPFR support yet.

And I hope I didn't get you too excited, as the time I have available to spend on this project this year is nearing its end, and further work will have to be postponed.

[1] https://tools.ietf.org/html/rfc6901

-- Vinícius dos Santos Oliveira https://vinipsmaker.github.io/ |
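[Editor's note: putting the pieces of this thread together, a tentative build-and-run session could look like the following. This is a sketch assembled from the instructions quoted elsewhere in the thread (Meson/Ninja build, AWKLIBPATH, the -l jsonstream load, and the mandatory -b flag); employees.ndjson is a made-up name for the sample stream, and the repo layout may have changed since this was posted.]

```shell
# Fetch and build (Meson generates build.ninja; ninja produces jsonstream.so)
git clone https://gitlab.com/vinipsmaker/gawk-tabjson
cd gawk-tabjson
git submodule update --init
mkdir build && cd build
meson ..
ninja

# Run gawk with the extension; note the mandatory -b flag (see BUGS above)
AWKLIBPATH=$PWD gawk -b -l jsonstream '
BEGIN { JPAT[1] = "/name"; JPAT[2] = "/pay_rate"; JPAT[3] = "/hours_worked" }
J[3] > 0 { print J[1], J[2] * J[3] }' employees.ndjson
```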
From: Manuel C. <mco...@gm...> - 2020-05-28 16:33:42
|
On 28/05/2020 at 13:12, Phil wrote:
> Hi,
>
> I've just come across gawkextlib, having previously used a regex in FPAT to handle CSV properly. It's very useful, but I have one question.
>
> I often modify a specific CSV column based on the value in another column. Without the CSV extension this is done by comparing $n and assigning to $m, where n and m are the test and modified field indexes, respectively.
>
> Whilst there is a nice helper function csvfield() to return fields by column name, there is no equivalent to update fields. However, looking into the source code I can see the internal array _csv_label exists, and I seem to be able to assign back to it.
>
> My question is: is there a more idiomatic way to do this, and if not, is it safe to use _csv_label as in the example below?

I'm afraid not. At least not with the current version. Maybe a future version will make _csv_label visible, probably under a better name like csvindex.

> If the field "target_type" starts with the text "Slow_Target", then split the field "qualifier" on the first underscore and replace the qualifier with the first split-out element:
>
> awk -i csv -i inplace -vCSVMODE=1 'NR>1 && substr(csvfield("target_type"),1,11)=="Slow_Target" { split(csvfield("qualifier"),isin,"_");$_csv_label["qualifier"]=isin[1];csvprint($0);next }{ print CSVRECORD }' example.csv

You can store the field indexes at NR==1:

NR==1 {type=_csv_label["target_type"]; qual=_csv_label["qualifier"]; next}
NR>1 && substr($type,1,11)=="Slow_Target" ... split($qual,isin,"_") ...

> I note this is a little slower than using FPAT and indexes too, but I guess there must be a cost to using the CSV column names and translating them?

The intended advantage of CSVMODE=1 is to simplify computations with the effective field values. Fields are automatically unquoted on input, and requoted by csvprint() on output.
If the required processing just handles CSV text chunks without modifying the quoting, then the direct FPAT approach is probably enough.

> Thanks,
> Phil.

Thanks for the report.

-- Manuel Collado - http://mcollado.z15.es |
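[Editor's note: Manuel's suggestion of caching column indexes at NR==1 can also be sketched without the CSV extension at all, for simple input with no quoted fields. This is a plain-awk sketch; the file name, column names, and data are made up, and it handles none of the quoting that CSVMODE handles for you.]

```shell
cat > example.csv <<'EOF'
target_type,qualifier
Slow_Target_A,abc_def
Fast_Target,xyz_123
EOF

awk -F, -v OFS=, '
NR == 1 {                                  # cache column indexes by header name
    for (i = 1; i <= NF; i++) idx[$i] = i
    type = idx["target_type"]
    qual = idx["qualifier"]
    print
    next
}
substr($type, 1, 11) == "Slow_Target" {    # keep only the part before the first "_"
    split($qual, isin, "_")
    $qual = isin[1]
}
{ print }                                  # record is rebuilt with OFS="," if a field was assigned
' example.csv
```

For the sample data this prints the header unchanged, rewrites abc_def to abc on the Slow_Target row, and passes the Fast_Target row through untouched.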
From: Phil <ph...@be...> - 2020-05-28 12:18:38
|
Hi,

I've just come across gawkextlib, having previously used a regex in FPAT to handle CSV properly. It's very useful, but I have one question.

I often modify a specific CSV column based on the value in another column. Without the CSV extension this is done by comparing $n and assigning to $m, where n and m are the test and modified field indexes, respectively.

Whilst there is a nice helper function csvfield() to return fields by column name, there is no equivalent to update fields. However, looking into the source code I can see the internal array _csv_label exists, and I seem to be able to assign back to it.

My question is: is there a more idiomatic way to do this, and if not, is it safe to use _csv_label as in the example below?

If the field "target_type" starts with the text "Slow_Target", then split the field "qualifier" on the first underscore and replace the qualifier with the first split-out element:

awk -i csv -i inplace -vCSVMODE=1 'NR>1 && substr(csvfield("target_type"),1,11)=="Slow_Target" { split(csvfield("qualifier"),isin,"_");$_csv_label["qualifier"]=isin[1];csvprint($0);next }{ print CSVRECORD }' example.csv

I note this is a little slower than using FPAT and indexes too, but I guess there must be a cost to using the CSV column names and translating them?

Thanks,
Phil. |
From: Andrew J. S. <as...@te...> - 2020-01-05 14:32:04
|
Hi,

On Fri, Jan 03, 2020 at 10:05:28PM +0000, lemur117 wrote:
> Are there any simple tutorials I could follow for building an input parser? The documentation stated to check readdir.c for inspiration, but I am having a hard time understanding how to even compile it. The only way that I've gotten to work is if I do verbatim what is done by "make" on the gawk source code: `gcc -DHAVE_CONFIG_H -I. -I./.. -g -O2 -DNDEBUG -MT readdir.lo -MD -MP -MF .deps/readdir.Tpo -c -o readdir.lo readdir.c`. The library object "readdir.lo" is created, but I don't know how that helps me create my own usable parser.
>
> What I'm looking for, if I'm not asking too much, is something like:
>
> 1. C/C++ code for a trivial input parser, e.g. one that changes each line of input to be the number of characters that appeared on that line ('hello\nworld!\n' becomes '5\n6\n').
> 2. Compilation instructions, including what to do with the output of compilation (object file?).
> 3. Awk code that invokes the new parser.
>
> readdir.c is a candidate for the first item, but I think it has added complexity to fit it in with the rest of the gawk source code build process (e.g. config.h, support for Windows, etc.), which a standalone parser may not need. I'd like something pretty barebones to be able to wrap my feeble mind around.
>
> Your consideration is appreciated.

You can find more examples of input parsers in the gawkextlib project in xml/xml_interface.c and in csv/csv.c. With those in addition to readdir.c, I think you should be able to get a good idea of how this stuff works. Also note that there are test cases for all 3 of these extensions that show how to use them.

Regarding build instructions: I think you simply need to dig a bit more deeply into the "make" output. For example:

bash-4.2$ touch extension/readdir.c
bash-4.2$ make 2>&1 | grep readdir
/bin/sh ./libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I./.. -g -O2 -DARRAYDEBUG -DYYDEBUG -DLOCALEDEBUG -DMEMDEBUG -Wall -fno-builtin -g3 -ggdb3 -MT readdir.lo -MD -MP -MF .deps/readdir.Tpo -c -o readdir.lo readdir.c
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I./.. -g -O2 -DARRAYDEBUG -DYYDEBUG -DLOCALEDEBUG -DMEMDEBUG -Wall -fno-builtin -g3 -ggdb3 -MT readdir.lo -MD -MP -MF .deps/readdir.Tpo -c readdir.c -fPIC -DPIC -o .libs/readdir.o
mv -f .deps/readdir.Tpo .deps/readdir.Plo
/bin/sh ./libtool --tag=CC --mode=link gcc -g -O2 -DARRAYDEBUG -DYYDEBUG -DLOCALEDEBUG -DMEMDEBUG -Wall -fno-builtin -g3 -ggdb3 -module -avoid-version -no-undefined -o readdir.la -rpath /extra_disk/tmp/gawk-test/lib/gawk readdir.lo -lm
libtool: link: rm -fr .libs/readdir.la .libs/readdir.lai .libs/readdir.so
libtool: link: gcc -shared -fPIC -DPIC .libs/readdir.o -lm -g -O2 -g3 -ggdb3 -Wl,-soname -Wl,readdir.so -o .libs/readdir.so
libtool: link: ( cd ".libs" && rm -f "readdir.la" && ln -s "../readdir.la" "readdir.la" )

So, as you can see, after compiling you need to run the linker to build the shared library readdir.so ("gcc -shared -fPIC ..."). It's just 2 gcc commands: one to compile, and another to link. You can ignore the libtool nonsense, assuming you are on Linux. You can then load readdir.so using the -l readdir command-line argument, or @load "readdir" in the awk code, as long as you have placed readdir.so somewhere in the path designated by AWKLIBPATH.

Regards,
Andy |
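[Editor's note: condensed from the libtool output above, the two essential commands are just a compile and a link. This is a sketch; myparser.c is a stand-in for your own extension source, and the -I path must point at wherever gawkapi.h lives on your system.]

```shell
# 1) Compile the extension source to a position-independent object file
gcc -fPIC -c -I/path/to/gawk/headers myparser.c -o myparser.o

# 2) Link it into a shared library that gawk can load at runtime
gcc -shared myparser.o -o myparser.so

# 3) Load it, with myparser.so somewhere on AWKLIBPATH:
#      gawk -l myparser 'BEGIN { ... }' somefile
#    or put  @load "myparser"  in the awk program itself.
```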
From: lemur117 <lem...@pr...> - 2020-01-03 22:05:47
|
Are there any simple tutorials I could follow for building an input parser? The documentation stated to check readdir.c for inspiration, but I am having a hard time understanding how to even compile it. The only way that I've gotten to work is if I do verbatim what is done by "make" on the gawk source code: `gcc -DHAVE_CONFIG_H -I. -I./.. -g -O2 -DNDEBUG -MT readdir.lo -MD -MP -MF .deps/readdir.Tpo -c -o readdir.lo readdir.c`. The library object "readdir.lo" is created, but I don't know how that helps me create my own usable parser.

What I'm looking for, if I'm not asking too much, is something like:

1. C/C++ code for a trivial input parser, e.g. one that changes each line of input to be the number of characters that appeared on that line ('hello\nworld!\n' becomes '5\n6\n').
2. Compilation instructions, including what to do with the output of compilation (object file?).
3. Awk code that invokes the new parser.

readdir.c is a candidate for the first item, but I think it has added complexity to fit it in with the rest of the gawk source code build process (e.g. config.h, support for Windows, etc.), which a standalone parser may not need. I'd like something pretty barebones to be able to wrap my feeble mind around.

Your consideration is appreciated.

-Jon

Sent with ProtonMail Secure Email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, January 3, 2020 11:58 AM, lemur117 via Gawkextlib-users <gaw...@li...> wrote:

> Thanks! This looks exactly like what I need. I'll play around with it.
>
> -Jon
>
> Sent with ProtonMail Secure Email.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Friday, January 3, 2020 10:15 AM, Andrew J. Schorr as...@te... wrote:
>
> > Hi,
> > On Fri, Jan 03, 2020 at 03:50:00PM +0000, lemur117 via Gawkextlib-users wrote:
> >
> > > Is it possible for a dynamic extension to preprocess a file before feeding it
> > > into awk's pattern-action rules?
> >
> > Yes! See the docs here:
> > https://www.gnu.org/software/gawk/manual/html_node/Input-Parsers.html
> > Regards,
> > Andy
>
> Gawkextlib-users mailing list
> Gaw...@li...
> https://lists.sourceforge.net/lists/listinfo/gawkextlib-users |
See the docs here: > > https://www.gnu.org/software/gawk/manual/html_node/Input-Parsers.html > > Regards, > > Andy > > Gawkextlib-users mailing list > Gaw...@li... > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users |
From: lemur117 <lem...@pr...> - 2020-01-03 17:59:24
|
Thanks! This looks exactly like what I need. I'll play around with it.

-Jon

Sent with ProtonMail Secure Email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, January 3, 2020 10:15 AM, Andrew J. Schorr <as...@te...> wrote:

> Hi,
>
> On Fri, Jan 03, 2020 at 03:50:00PM +0000, lemur117 via Gawkextlib-users wrote:
>
> > Is it possible for a dynamic extension to preprocess a file before feeding it
> > into awk's pattern-action rules?
>
> Yes! See the docs here:
>
> https://www.gnu.org/software/gawk/manual/html_node/Input-Parsers.html
>
> Regards,
> Andy |
From: Andrew J. S. <as...@te...> - 2020-01-03 16:15:50
|
Hi,

On Fri, Jan 03, 2020 at 03:50:00PM +0000, lemur117 via Gawkextlib-users wrote:
> Is it possible for a dynamic extension to preprocess a file before feeding it
> into awk's pattern-action rules?

Yes! See the docs here:

https://www.gnu.org/software/gawk/manual/html_node/Input-Parsers.html

Regards,
Andy |
From: lemur117 <lem...@pr...> - 2020-01-03 15:50:29
|
Is it possible for a dynamic extension to preprocess a file before feeding it into awk's pattern-action rules? For instance, if the input files are compressed, the preprocessor would decompress them; if they are encoded, it would decode them; and if they are encrypted, it would decrypt them.

I have built many shell scripts to preprocess files externally before providing them to awk as input, but there have always been disadvantages. So I am exploring the possibility of implementing a preprocessor as a dynamic extension to mitigate those disadvantages.

As a simple example, assume I have three gzipped files (file1.gz, file2.gz, file3.gz) that I want awk to operate on, and a compiled C program, named "my_gunzip", that can unzip a file to stdout. Can a dynamic extension be built that would allow awk to be invoked on the input as:

"awk -f my_script.awk file1.gz file2.gz file3.gz"

where my_script.awk calls "my_gunzip" to preprocess each file before feeding it to awk's pattern-action rules?

Thanks for your consideration.

Sent with [ProtonMail](https://protonmail.com) Secure Email. |
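[Editor's note: the external shell-pipeline workaround the poster describes can be sketched as follows, using gzip in place of the hypothetical my_gunzip, and reusing the 'hello\nworld!\n' becomes '5\n6\n' line-length example from earlier in this thread.]

```shell
# Create a small compressed input file
printf 'hello\nworld!\n' > file1
gzip -f file1                        # replaces file1 with file1.gz

# awk never sees the compressed bytes: gzip decompresses to stdout,
# and awk reads plain text on stdin
gzip -dc file1.gz | awk '{ print length($0) }'
```

This prints 5 and then 6. The disadvantage is that awk no longer sees the individual input file names, which is one motivation for doing the decompression inside an input-parser extension instead.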
From: <jue...@go...> - 2019-12-27 14:14:59
|
Hello, our mailing list for *users* is still working, and we have a new subscriber (Galen Tackett). Galen sent the following question to the *developers* list, which has not existed since the re-organization of the mailing lists at SourceForge some years ago. Please feel free to comment on Galen's question.

Juergen Kahrs

-------- Forwarded Message --------
Subject: [Gawkextlib-developer] Some revisions and corrections for gawk-xml manual
Date: Fri, 27 Dec 2019 13:57:11 +0000 (UTC)
From: Galen Tackett via Gawkextlib-developer <gaw...@li...>
Reply-To: For internal communication among developers <gaw...@li...>
To: gaw...@li... <gaw...@li...>
Cc: Galen Tackett <glt...@ya...>

I have a few corrections and revisions to the gawk-xml manual that I'd like to offer. How should I go about this?

My goal has been to fix typographic errors, correct a few problems with the index and make some additions to it, and clarify the text in a few places by rendering it in more idiomatic English.

Thanks!

Galen |
From: <jue...@go...> - 2019-09-22 18:53:22
|
Hello, the attachment contains a mail that was automatically discarded by the SourceForge mailing system. The reason for the automatic discard was its length, which exceeded our limit of 40 KB. I have increased the limit to 100 KB, so this posting should be accepted by the mailing system and reach all subscribers. |
From: Andrew J. S. <as...@te...> - 2019-09-22 17:45:32
|
Hi, First of all, please group reply. The gawkextlib project is a collective effort. I will not respond to any further personal emails. Second, you need to fork the repo, apply your changes, and then submit a merge request. See the sourceforge git docs here: https://sourceforge.net/p/forge/documentation/Git/ There's a video here: https://www.youtube.com/watch?v=Rp8f0cLmMrI Regards, Andy On Sun, Sep 22, 2019 at 08:29:25PM +0300, Oğuz wrote: > Okay, what should I do right now? How do I make a proper diff so that you guys > can get a working copy of the extension? > > On Sun, Sep 22, 2019 at 6:57 PM Andrew J. Schorr < > as...@te...> wrote: > > Hi, > > It doesn't need a "manual", but it will need a man page explaining > what it does. But maybe somebody else can help with that. > > I'm cc'ing the gawkextlib-users mailing list. Let's please discuss > on list. > > Thanks, > Andy > > On Sun, Sep 22, 2019 at 06:50:51PM +0300, Oğuz wrote: > > Well, I managed to make it compile with my extension. But writing a > manual etc. > > looks like too much work since I suck at English > > > > On Sun, Sep 22, 2019 at 5:52 PM Andrew J. Schorr < > > as...@te...> wrote: > > > > Hi, yes, sorry, I should not have said "donating". The gawkextlib > project > > is simply a place to post GPL'ed gawk libraries. > > > > There is more info here: > > http://gawkextlib.sourceforge.net/ > > and more particularly: > > http://gawkextlib.sourceforge.net/Development.html > > > > Regards, > > Andy > > > > On Sun, Sep 22, 2019 at 07:53:49AM -0600, ar...@sk... wrote: > > > Donating was a poor word. Allowing the project to use your code > would > > > be better, you retain ownership and copyright. > > > > > > I'm cc-ing Andy who can help out in more detail. > > > > > > For a start, clone the gawkextlib project and see if you can add a > > > new library for your code using the scripts there. Then replace > the > > > stub code with your code. Get it all to compile, and then send > Andy > > > a diff. Voila! 
:-) > > > > > > Arnold > > > > > > Oğuz <ogu...@gm...> wrote: > > > > > > > It doesn't say anything about *donating your code*, what is the > > protocol > > > > for that? > > > > > > > > On Sun, Sep 22, 2019 at 3:40 PM Aharon Robbins <ar...@sk... > > > > wrote: > > > > > > > > > Please consider donating your code to the gawkextlib project. > See the > > gawk > > > > > doc > > > > > for details... > > > > > > > > > > Thanks, > > > > > > > > > > Arnold > > > > > > > > > > In article <qm4o7h$ed6$1...@do...> you write: > > > > > >On Tue, 17 Sep 2019 18:01:15 +0000, Kenny McCormack wrote: > > > > > > > > > > > >> In article <qlnfmn$qqv$1...@do...>, > > > > > >> Oðuz <ogu...@gm...> wrote: > > > > > >>>> 4) I did something similar - quite some time ago - > called > > 'spawn' > > > > > >>>> - that provides both 'spawn' and 'exec' > functionality. > > 'spawn' > > > > > >>>> runs the program as a child process, while 'exec' > replaces > > the > > > > > >>>> GAWK program with another program. You might want to > > consider > > > > > >>>> renaming your 'exec' to 'spawn', since 'exec' means, > well, > > exec(). > > > > > >>>> > > > > > >>>> 5) If you're interested in my approach to this > problem, let > > me > > > > > >>>> know, > > > > > >>>> but I'm assuming that since you've gone ahead and done > it > > > > > >>>> yourself, you're more in tune with your own code. > > > > > >>> > > > > > >>>Well, "spawn" sounds way better, thanks! And yes, I would > like to > > see > > > > > >>>your approach. > > > > > >>> > > > > > >>> > > > > > >> Try this command: > > > > > >> > > > > > >> $ wget http://shell.xmission.com:PORT/spawn.zip > > > > > >> > > > > > >> where PORT is 65401. > > > > > >> > > > > > >> I think it contains everything you need. Note that it is > written > > for > > > > > >> the "old" GAWK API (pre-gawk-4.2). 
> > > > > >> > > > > > >> This version supports 3 types of calls: > > > > > >> > > > > > >> 1) split("ls -l foo",A) > > > > > >> spawn(x,0,A[1],A) # Read the source for what to > put in > > for > > > > > >'x'. > > > > > >> 2) exec(A[1],A) > > > > > >> 3) exec("ls","ls","-l","foo") > > > > > >> > > > > > >> Note that for completeness, I should probably extend "spawn" > the > > same > > > > > >> way, so that you can supply the args directly instead of via > the > > array, > > > > > >> but I never got around to doing it. > > > > > >> > > > > > >> Also note that the purpose of spawn_setInterrupt() is kind > of the > > > > > >> opposite of what system() does (and which you re-created in > your > > code). > > > > > >> That is, it allows the child to get killed without killing > the > > parent. > > > > > >> I always found the (usual and standard) behavior of system() > with > > regard > > > > > >> to signals weird and unhelpful. > > > > > > > > > > > >I added support for supplying args via an array, here is the > last > > version > > > > > >if you're interested: > > > > > > > > > > > > > > > > > >/* provide spawn function for GAWK > > > > > > * > > > > > > * build (GCC/clang) > > > > > > * cc -shared -fPIC -o spawn.so spawn.c > > > > > > */ > > > > > > > > > > > >#include <errno.h> > > > > > >#include <signal.h> > > > > > >#include <stddef.h> > > > > > >#include <stdio.h> > > > > > >#include <stdlib.h> > > > > > >#include <string.h> > > > > > >#include <sys/stat.h> > > > > > >#include <sys/types.h> > > > > > >#include <sys/wait.h> > > > > > >#include <unistd.h> > > > > > >#include <gawkapi.h> > > > > > > > > > > > >#define error(how, func) \ > > > > > > how ## fatal(ext_id, "%s: %s", func, strerror(errno)) > > > > > > > > > > > >int plugin_is_GPL_compatible; > > > > > > > > > > > >static const gawk_api_t *api; > > > > > >static awk_ext_id_t ext_id; > > > > > >static const char *ext_version = NULL; > > > > > >static awk_bool_t (*init_func)(void) = NULL; > > > > > > > > > > 
> >/* do_spawn - spawn(arg0[, ...]) > > > > > > * spawn a child process, wait for its termination and > return its > > exit > > > > > >status. > > > > > > * arg0 can be an array, in such case; arg0[0], arg0[1], ... > will > > > > > >constitute the > > > > > > * argument list to the new process > > > > > > * > > > > > > * return value: > > > > > > * * exit status of spawned process - if everything goes > well > > > > > > * * status of spawned process as - if it gets > signaled > > > > > > * described in waitpid(3) > > > > > > * * 127 - if execvp() fails > > > > > > * * -1 - otherwise > > > > > > * > > > > > > * causes GAWK to exit if: > > > > > > * * arg0 is not an array, and an argument after it is of > > non-scalar > > > > > >type, > > > > > > * * arg0 is an array, and an element in it is of > non-scalar > > type, > > > > > > * * malloc() or waitpid() fails > > > > > > */ > > > > > > > > > > > >static awk_value_t * > > > > > >do_spawn (int nargs, awk_value_t *result, awk_ext_func_t > *unused) > > > > > >{ > > > > > > char **args; > > > > > > > > > > > > awk_value_t arg; > > > > > > if (get_argument(0, AWK_ARRAY, &arg)) { > > > > > > awk_array_t a_cookie = arg.array_cookie; > > > > > > get_element_count(a_cookie, (size_t *) &nargs); > > > > > > args = malloc((nargs + 1) * sizeof(char *)); > > > > > > if (! args) { > > > > > > error(,"malloc"); > > > > > > } > > > > > > > > > > > > awk_value_t index; > > > > > > for (int i = 0; i < nargs; i++) { > > > > > > /* note that a missing index will also > cause > > an > > > > > >error, e.g: > > > > > > * > > > > > > * a[0] = "echo" > > > > > > * a[2] = "hello world" > > > > > > * spawn(a) > > > > > > */ > > > > > > make_number(i, &index); > > > > > > if (! 
get_array_element(a_cookie, & > index, > > > > > >AWK_STRING, &arg)) { > > > > > > errno = EINVAL; > > > > > > error(,"get_array_element"); > > > > > > } > > > > > > args[i] = arg.str_value.str; > > > > > > } > > > > > > args[nargs] = NULL; > > > > > > } > > > > > > else { > > > > > > args = malloc((nargs + 1) * sizeof(char *)); > > > > > > if (! args) { > > > > > > error(,"malloc"); > > > > > > } > > > > > > > > > > > > for (int i = 0; i < nargs; i++) { > > > > > > if (! get_argument(i, AWK_STRING, & > arg)) { > > > > > > errno = EINVAL; > > > > > > error(,"get_argument"); > > > > > > } > > > > > > args[i] = arg.str_value.str; > > > > > > } > > > > > > args[nargs] = NULL; > > > > > > } > > > > > > > > > > > > /* handle signals the way system() does. */ > > > > > > struct sigaction sa, intr, quit; > > > > > > sigset_t omask; > > > > > > sa.sa_handler = SIG_IGN; > > > > > > sigemptyset(&sa.sa_mask); > > > > > > sa.sa_flags = 0; > > > > > > sigemptyset(&intr.sa_mask); > > > > > > sigemptyset(&quit.sa_mask); > > > > > > sigaction(SIGINT, &sa, &intr); > > > > > > sigaction(SIGQUIT, &sa, &quit); > > > > > > sigaddset(&sa.sa_mask, SIGCHLD); > > > > > > sigprocmask(SIG_BLOCK, &sa.sa_mask, &omask); > > > > > > > > > > > > pid_t pid = fork(); > > > > > > if (! pid) { > > > > > > sigaction(SIGINT, &intr, NULL); > > > > > > sigaction(SIGQUIT, &quit, NULL); > > > > > > sigprocmask(SIG_SETMASK, &omask, NULL); > > > > > > > > > > > > /* note that unlike system(), > > > > > > * this can be tricked with a tainted PATH. > > > > > > * > > > > > > * if it's a concern, programmer should either > use > > > > > >absolute paths, > > > > > > * or sanitize PATH before calling spawn(). 
> > > > > > * > > > > > > * e.g: > > > > > > * spawn("/usr/bin/rm",$1) > > > > > > * # or, > > > > > > * ENVIRON["PATH"]="/bin:/usr/bin:/usr/ > local/bin" > > > > > > * spawn("rm",$1) > > > > > > */ > > > > > > execvp(args[0], args); > > > > > > error(non, "execvp"); > > > > > > exit(127); > > > > > > } > > > > > > else if (pid > 0) { > > > > > > int status; > > > > > > > > > > > > do { > > > > > > if (waitpid(pid, &status, 0) < 0) { > > > > > > error(,"waitpid"); > > > > > > } > > > > > > } > > > > > > while (! WIFEXITED(status) && ! WIFSIGNALED > (status)); > > > > > > > > > > > > sigaction(SIGINT, &intr, NULL); > > > > > > sigaction(SIGQUIT, &quit, NULL); > > > > > > sigprocmask(SIG_SETMASK, &omask, NULL); > > > > > > > > > > > > if (WIFEXITED(status)) { > > > > > > return make_number(WEXITSTATUS(status), > > result); > > > > > > } > > > > > > else { > > > > > > return make_number(status, result); > > > > > > } > > > > > > } > > > > > > else { > > > > > > error(non, "fork"); > > > > > > } > > > > > > > > > > > > free(args); > > > > > > > > > > > > return make_number(-1, result); > > > > > >} > > > > > > > > > > > >static awk_ext_func_t func_table[] = { > > > > > > { "spawn", do_spawn, 0, 1, awk_true, NULL } > > > > > >}; > > > > > > > > > > > >dl_load_func(func_table, spawn, "") > > > > > > > > > > > > > > > > > > > > > -- > > > > > Aharon (Arnold) Robbins arnold AT skeeve DOT > com > > > > > > > > > > > > -- > > Andrew Schorr e-mail: > > as...@te... > > Telemetry Investments, L.L.C. phone: 917-305-1748 > > 545 Fifth Ave, Suite 1108 fax: 212-425-5550 > > New York, NY 10017-3630 > > > > -- > Andrew Schorr e-mail: > as...@te... > Telemetry Investments, L.L.C. phone: 917-305-1748 > 545 Fifth Ave, Suite 1108 fax: 212-425-5550 > New York, NY 10017-3630 > -- Andrew Schorr e-mail: as...@te... Telemetry Investments, L.L.C. phone: 917-305-1748 545 Fifth Ave, Suite 1108 fax: 212-425-5550 New York, NY 10017-3630 |
From: Andrew J. S. <as...@te...> - 2019-09-22 16:13:20
|
Hi, It doesn't need a "manual", but it will need a man page explaining what it does. But maybe somebody else can help with that. I'm cc'ing the gawkextlib-users mailing list. Let's please discuss on list. Thanks, Andy On Sun, Sep 22, 2019 at 06:50:51PM +0300, Oğuz wrote: > Well, I managed to make it compile with my extension. But writing a manual etc. > looks like too much work since I suck at English > > On Sun, Sep 22, 2019 at 5:52 PM Andrew J. Schorr < > as...@te...> wrote: > > Hi, yes, sorry, I should not have said "donating". The gawkextlib project > is simply a place to post GPL'ed gawk libraries. > > There is more info here: > http://gawkextlib.sourceforge.net/ > and more particularly: > http://gawkextlib.sourceforge.net/Development.html > > Regards, > Andy > > On Sun, Sep 22, 2019 at 07:53:49AM -0600, ar...@sk... wrote: > > Donating was a poor word. Allowing the project to use your code would > > be better, you retain ownership and copyright. > > > > I'm cc-ing Andy who can help out in more detail. > > > > For a start, clone the gawkextlib project and see if you can add a > > new library for your code using the scripts there. Then replace the > > stub code with your code. Get it all to compile, and then send Andy > > a diff. Voila! :-) > > > > Arnold > > > > Oğuz <ogu...@gm...> wrote: > > > > > It doesn't say anything about *donating your code*, what is the > protocol > > > for that? > > > > > > On Sun, Sep 22, 2019 at 3:40 PM Aharon Robbins <ar...@sk...> > wrote: > > > > > > > Please consider donating your code to the gawkextlib project. See the > gawk > > > > doc > > > > for details... 
> > > > > > > > Thanks, > > > > > > > > Arnold > > > > > > > > In article <qm4o7h$ed6$1...@do...> you write: > > > > >On Tue, 17 Sep 2019 18:01:15 +0000, Kenny McCormack wrote: > > > > > > > > > >> In article <qlnfmn$qqv$1...@do...>, > > > > >> Oðuz <ogu...@gm...> wrote: > > > > >>>> 4) I did something similar - quite some time ago - called > 'spawn' > > > > >>>> - that provides both 'spawn' and 'exec' functionality. > 'spawn' > > > > >>>> runs the program as a child process, while 'exec' replaces > the > > > > >>>> GAWK program with another program. You might want to > consider > > > > >>>> renaming your 'exec' to 'spawn', since 'exec' means, well, > exec(). > > > > >>>> > > > > >>>> 5) If you're interested in my approach to this problem, let > me > > > > >>>> know, > > > > >>>> but I'm assuming that since you've gone ahead and done it > > > > >>>> yourself, you're more in tune with your own code. > > > > >>> > > > > >>>Well, "spawn" sounds way better, thanks! And yes, I would like to > see > > > > >>>your approach. > > > > >>> > > > > >>> > > > > >> Try this command: > > > > >> > > > > >> $ wget http://shell.xmission.com:PORT/spawn.zip > > > > >> > > > > >> where PORT is 65401. > > > > >> > > > > >> I think it contains everything you need. Note that it is written > for > > > > >> the "old" GAWK API (pre-gawk-4.2). > > > > >> > > > > >> This version supports 3 types of calls: > > > > >> > > > > >> 1) split("ls -l foo",A) > > > > >> spawn(x,0,A[1],A) # Read the source for what to put in > for > > > > >'x'. > > > > >> 2) exec(A[1],A) > > > > >> 3) exec("ls","ls","-l","foo") > > > > >> > > > > >> Note that for completeness, I should probably extend "spawn" the > same > > > > >> way, so that you can supply the args directly instead of via the > array, > > > > >> but I never got around to doing it. > > > > >> > > > > >> Also note that the purpose of spawn_setInterrupt() is kind of the > > > > >> opposite of what system() does (and which you re-created in your > code). 
> > > > >> That is, it allows the child to get killed without killing the parent.
> > > > >> I always found the (usual and standard) behavior of system() with regard
> > > > >> to signals weird and unhelpful.
> > > > >
> > > > >I added support for supplying args via an array; here is the last version
> > > > >if you're interested:
> > > > >
> > > > >
> > > > >/* provide spawn function for GAWK
> > > > > *
> > > > > * build (GCC/clang)
> > > > > * cc -shared -fPIC -o spawn.so spawn.c
> > > > > */
> > > > >
> > > > >#include <errno.h>
> > > > >#include <signal.h>
> > > > >#include <stddef.h>
> > > > >#include <stdio.h>
> > > > >#include <stdlib.h>
> > > > >#include <string.h>
> > > > >#include <sys/stat.h>
> > > > >#include <sys/types.h>
> > > > >#include <sys/wait.h>
> > > > >#include <unistd.h>
> > > > >#include <gawkapi.h>
> > > > >
> > > > >#define error(how, func) \
> > > > >        how ## fatal(ext_id, "%s: %s", func, strerror(errno))
> > > > >
> > > > >int plugin_is_GPL_compatible;
> > > > >
> > > > >static const gawk_api_t *api;
> > > > >static awk_ext_id_t ext_id;
> > > > >static const char *ext_version = NULL;
> > > > >static awk_bool_t (*init_func)(void) = NULL;
> > > > >
> > > > >/* do_spawn - spawn(arg0[, ...])
> > > > > * spawn a child process, wait for its termination and return its exit
> > > > > * status.
> > > > > * arg0 can be an array; in that case arg0[0], arg0[1], ... will
> > > > > * constitute the argument list to the new process
> > > > > *
> > > > > * return value:
> > > > > *   * exit status of spawned process - if everything goes well
> > > > > *   * status of spawned process as described in waitpid(3) - if it gets signaled
> > > > > *   * 127 - if execvp() fails
> > > > > *   * -1 - otherwise
> > > > > *
> > > > > * causes GAWK to exit if:
> > > > > *   * arg0 is not an array, and an argument after it is of non-scalar type,
> > > > > *   * arg0 is an array, and an element in it is of non-scalar type,
> > > > > *   * malloc() or waitpid() fails
> > > > > */
> > > > >
> > > > >static awk_value_t *
> > > > >do_spawn (int nargs, awk_value_t *result, awk_ext_func_t *unused)
> > > > >{
> > > > >        char **args;
> > > > >
> > > > >        awk_value_t arg;
> > > > >        if (get_argument(0, AWK_ARRAY, &arg)) {
> > > > >                awk_array_t a_cookie = arg.array_cookie;
> > > > >                get_element_count(a_cookie, (size_t *) &nargs);
> > > > >                args = malloc((nargs + 1) * sizeof(char *));
> > > > >                if (! args) {
> > > > >                        error(,"malloc");
> > > > >                }
> > > > >
> > > > >                awk_value_t index;
> > > > >                for (int i = 0; i < nargs; i++) {
> > > > >                        /* note that a missing index will also cause an error, e.g:
> > > > >                         *
> > > > >                         * a[0] = "echo"
> > > > >                         * a[2] = "hello world"
> > > > >                         * spawn(a)
> > > > >                         */
> > > > >                        make_number(i, &index);
> > > > >                        if (! get_array_element(a_cookie, &index, AWK_STRING, &arg)) {
> > > > >                                errno = EINVAL;
> > > > >                                error(,"get_array_element");
> > > > >                        }
> > > > >                        args[i] = arg.str_value.str;
> > > > >                }
> > > > >                args[nargs] = NULL;
> > > > >        }
> > > > >        else {
> > > > >                args = malloc((nargs + 1) * sizeof(char *));
> > > > >                if (! args) {
> > > > >                        error(,"malloc");
> > > > >                }
> > > > >
> > > > >                for (int i = 0; i < nargs; i++) {
> > > > >                        if (! get_argument(i, AWK_STRING, &arg)) {
> > > > >                                errno = EINVAL;
> > > > >                                error(,"get_argument");
> > > > >                        }
> > > > >                        args[i] = arg.str_value.str;
> > > > >                }
> > > > >                args[nargs] = NULL;
> > > > >        }
> > > > >
> > > > >        /* handle signals the way system() does. */
> > > > >        struct sigaction sa, intr, quit;
> > > > >        sigset_t omask;
> > > > >        sa.sa_handler = SIG_IGN;
> > > > >        sigemptyset(&sa.sa_mask);
> > > > >        sa.sa_flags = 0;
> > > > >        sigemptyset(&intr.sa_mask);
> > > > >        sigemptyset(&quit.sa_mask);
> > > > >        sigaction(SIGINT, &sa, &intr);
> > > > >        sigaction(SIGQUIT, &sa, &quit);
> > > > >        sigaddset(&sa.sa_mask, SIGCHLD);
> > > > >        sigprocmask(SIG_BLOCK, &sa.sa_mask, &omask);
> > > > >
> > > > >        pid_t pid = fork();
> > > > >        if (! pid) {
> > > > >                sigaction(SIGINT, &intr, NULL);
> > > > >                sigaction(SIGQUIT, &quit, NULL);
> > > > >                sigprocmask(SIG_SETMASK, &omask, NULL);
> > > > >
> > > > >                /* note that unlike system(),
> > > > >                 * this can be tricked with a tainted PATH.
> > > > >                 *
> > > > >                 * if that is a concern, the programmer should either use
> > > > >                 * absolute paths, or sanitize PATH before calling spawn().
> > > > >                 *
> > > > >                 * e.g:
> > > > >                 * spawn("/usr/bin/rm",$1)
> > > > >                 * # or,
> > > > >                 * ENVIRON["PATH"]="/bin:/usr/bin:/usr/local/bin"
> > > > >                 * spawn("rm",$1)
> > > > >                 */
> > > > >                execvp(args[0], args);
> > > > >                error(non, "execvp");
> > > > >                exit(127);
> > > > >        }
> > > > >        else if (pid > 0) {
> > > > >                int status;
> > > > >
> > > > >                do {
> > > > >                        if (waitpid(pid, &status, 0) < 0) {
> > > > >                                error(,"waitpid");
> > > > >                        }
> > > > >                }
> > > > >                while (! WIFEXITED(status) && ! WIFSIGNALED(status));
> > > > >
> > > > >                sigaction(SIGINT, &intr, NULL);
> > > > >                sigaction(SIGQUIT, &quit, NULL);
> > > > >                sigprocmask(SIG_SETMASK, &omask, NULL);
> > > > >
> > > > >                if (WIFEXITED(status)) {
> > > > >                        return make_number(WEXITSTATUS(status), result);
> > > > >                }
> > > > >                else {
> > > > >                        return make_number(status, result);
> > > > >                }
> > > > >        }
> > > > >        else {
> > > > >                error(non, "fork");
> > > > >        }
> > > > >
> > > > >        free(args);
> > > > >
> > > > >        return make_number(-1, result);
> > > > >}
> > > > >
> > > > >static awk_ext_func_t func_table[] = {
> > > > >        { "spawn", do_spawn, 0, 1, awk_true, NULL }
> > > > >};
> > > > >
> > > > >dl_load_func(func_table, spawn, "")
> > > >
> > > > --
> > > > Aharon (Arnold) Robbins                 arnold AT skeeve DOT com
>
> --
> Andrew Schorr                      e-mail: as...@te...
> Telemetry Investments, L.L.C.      phone: 917-305-1748
> 545 Fifth Ave, Suite 1108          fax: 212-425-5550
> New York, NY 10017-3630

--
Andrew Schorr                      e-mail: as...@te...
Telemetry Investments, L.L.C.      phone: 917-305-1748
545 Fifth Ave, Suite 1108          fax: 212-425-5550
New York, NY 10017-3630
|
From: Andrew J. S. <as...@te...> - 2019-06-17 12:44:56
|
Hi Luis,

You do not need to compile your own gawk if your system is running gawk 5.0,
which is pretty new. In general, your problem seems to be that things are
getting installed in the wrong place. When gawk is compiled, the library
search path is fixed for that binary. In your case, it clearly does not
include /usr/local/lib/gawk. You can override the built-in setting by
configuring a value for AWKLIBPATH in your environment. For more info:
https://www.gnu.org/software/gawk/manual/html_node/AWKLIBPATH-Variable.html

You can see the default value like so:

bash-4.2$ /bin/gawk 'BEGIN {print ENVIRON["AWKLIBPATH"]}'
/usr/lib64/gawk

In conclusion, you must do one of 3 things:

1. Use --prefix to install gawk-xml (and gawkextlib) in your system's
   standard location compiled into the gawk binary.

2. Compile your own version of gawk that will use a compatible value for
   --prefix so that it will be able to find the gawk-xml extension that
   you installed.

3. Configure AWKLIBPATH in your environment so that gawk knows where to
   search for the gawk-xml extension.

Regards,
Andy

On Mon, Jun 17, 2019 at 11:25:08AM +0100, Luis P. Mendes wrote:
> Hi Andy,
>
> I cannot reproduce the problem, even after untarring the gawkextlib
> and gawk-xml tarballs again, without setting any --prefix=/usr/local
> for the configuration script.
> Even the make check for gawk-xml works fine now.
>
> But now, still cannot use the xml tool:
> $ awk -l xml workingdocuments.awk awk_demos/a.xml
> awk: fatal: can't open shared library `xml' for reading (No such file
> or directory)
>
> The compilation arguments for my distro's gawk are:
> configure_args="--with-readline"
>
> $ whereis awk
> awk: /usr/bin/awk /usr/libexec/awk /usr/share/awk
> /usr/share/man/man1p/awk.1p /usr/share/man/man1/awk.1
>
> The installation destination for the xml library:
> /usr/local/lib/gawk/xml.so
>
> Do I need to compile a different gawk myself and place it under
> /usr/local?
> > > Thanks, > > > Luis > > > On 20190616 09:50:04 -0400, Andrew J. Schorr wrote: > > Hi Luis, > > > > The default autoconf prefix is /usr/local. If that's not where you intend > > to install, then you should specify a prefix explicitly. I recommend that > > you set --prefix explicitly instead of relying upon defaults. If you do > > that, does the build succeed? > > > > If you email the entire sequence of commands that you are running, then it > > will enable us to duplicate the problem. > > > > Also, that "Configured with" message seems to be incorrect. I tried > > the same thing, and it does say "Configured with: ../configure --prefix=/usr ...", > > but that's simply not true. That message seems to be a bug in the > > autoconf tools. If you run "grep prefix= config.log", you should see at the end: > > prefix='/usr/local' > > > > Regards, > > Andy > > > > On Sat, Jun 15, 2019 at 11:35:33PM +0100, Luis P. Mendes wrote: > > > Hi Andy, > > > > > > > > > # gawk -V > > > GNU Awk 5.0.0, API: 2.0 > > > > > > > > > I've downloaded and installed the latest gawkextlib I saw from > > > sourceforge: gawkextlib-1.0.4.tar.gz > > > > > > > > > I think I found an error. 
> > > This is in the config.log of gawkextlib: > > > Configured with: /builddir/gcc-8.3.0/configure > > > --build=x86_64-unknown-linux-gnu --enable-fast-character > > > --enable-vtable-verify --prefix=/usr --mandir=/usr/share/man > > > --infodir=/usr/share/info --libexecdir=/usr/ lib --libdir=/usr/lib > > > --enable-threads=posix --enable-__cxa_atexit --disable-multilib > > > --with-system-zlib --enable-shared --enable-lto --enable-plugins > > > --enable-linker-build-id --disable-werror --disable-nls -- > > > enable-default-pie --enable-default-ssp --enable-checking=release > > > --disable-libstdcxx-pch --with-isl --with-linker-hash-style=gnu > > > --disable-libunwind-exceptions --disable-target-libiberty > > > --enable-serial-confi gure > > > --enable-languages=c,c++,objc,obj-c++,fortran,lto,go,ada > > > > > > Please note that prefix is /usr, as I didn't specified any in the > > > configure command. > > > > > > But, > > > > > > # find /usr/ -name libgawkextlib.so.0 > > > /usr/local/lib/libgawkextlib.so.0 > > > > > > # ll /usr/local/lib/libgawkextlib.so > > > lrwxrwxrwx 1 root root 22 jun 15 18:26 /usr/local/lib/libgawkextlib.so > > > -> libgawkextlib.so.0.0.0 > > > > > > It seems that /usr/local/lib was used instead of /usr/lib. 
> > > > > > > > > > > > In the config.log of gawk-xml: > > > Configured with: /builddir/gcc-8.3.0/configure > > > --build=x86_64-unknown-linux-gnu --enable-fast-character > > > --enable-vtable-verify --prefix=/usr --mandir=/usr/share/man > > > --infodir=/usr/share/info --libexecdir=/usr/ lib --libdir=/usr/lib > > > --enable-threads=posix --enable-__cxa_atexit --disable-multilib > > > --with-system-zlib --enable-shared --enable-lto --enable-plugins > > > --enable-linker-build-id --disable-werror --disable-nls -- > > > enable-default-pie --enable-default-ssp --enable-checking=release > > > --disable-libstdcxx-pch --with-isl --with-linker-hash-style=gnu > > > --disable-libunwind-exceptions --disable-target-libiberty > > > --enable-serial-confi gure > > > --enable-languages=c,c++,objc,obj-c++,fortran,lto,go,ada > > > > > > > > > After a make clean in gawk-xml, the make command outputs an error: > > > gawk: xmlbase:14: fatal: load_ext: cannot open library > > > `../.libs/xml.so' (libgawkextlib.so.0: cannot open shared object file: > > > No such file or directory) > > > > > > > > > Thanks, > > > > > > > > > Luis > > > > > > > > > > > > On 20190615 15:41:07 -0400, Andrew J. Schorr wrote: > > > > Hi Luis, > > > > > > > > Which versions of gawk and gawkextlib are you using? > > > > Can you please share the entire sequence of commands you are running > > > > up through the "make check" that fails? It is clearly not configuring > > > > the environment variables properly, so it can't find the extension, > > > > as Jürgen pointed out. > > > > > > > > Regards, > > > > Andy > > > > > > > > On Sat, Jun 15, 2019 at 07:38:34PM +0100, Luis P. Mendes wrote: > > > > > Jürgen, > > > > > > > > > > I'm on Linux, not MacOSX: 4.19.50_1 #1 SMP PREEMPT x86_64 GNU/Linux > > > > > > > > > > > > > > > On Sat, Jun 15, 2019 at 7:29 PM Jürgen Kahrs via Gawkextlib-users < > > > > > gaw...@li...> wrote: > > > > > > > > > > Hello Luis, > > > > > I guess you tried this on MacOSX. 
> > > > > This seems to be the reason for the failed test cases: > > > > > .. LD_LIBRARY_PATH= DYLD_LIBRARY_PATH= gawk > > > > > > > > > > The environment variable DYLD_LIBRARY_PATH should have been > > > > > expanded as $DYLD_LIBRARY_PATH during configuration. > > > > > I am not quite sure where this information got lost, in the source > > > > > code or during configuration on your machine. Any idea ? > > > > > > > > > > > > > > > Jürgen Kahrs > > > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > After some years since last used this wonderful tool, I'd like to > > > > > install gawk-xml. > > > > > As per the README, I've installed gawkextlib without problems. > > > > > Next, `configure` and ``make` for gawk-xml went fine, but not `make > > > > > check`. > > > > > Output of the command below. > > > > > Do I need to install extra packages? > > > > > Thanks > > > > > > > > > > $ make check > > > > > Making check in awklib > > > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > > > gawk-xml-1.1.1/awklib' > > > > > make[1]: Nothing to be done for 'check'. > > > > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > > > gawk-xml-1.1.1/awklib' > > > > > Making check in po > > > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > > > gawk-xml-1.1.1/po' > > > > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > > > gawk-xml-1.1.1/po' > > > > > Making check in packaging > > > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > > > gawk-xml-1.1.1/packaging' > > > > > make[1]: Nothing to be done for 'check'. 
> > > > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > > > gawk-xml-1.1.1/packaging' > > > > > Making check in test > > > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > > > gawk-xml-1.1.1/test' > > > > > > > > > > AWK = LC_ALL=C LANG=C AWKLIBPATH=../.libs:/usr/lib/gawk PATH=/home/lupe > > > > > /bin:/home/lupe/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/ > > > > > sbin:/sbin:/opt/texlive/2019/bin/x86_64-linux:/usr/local/bin/fim:/opt/ > > > > > texlive/2017/bin/x86_64-linux:/home/lupe/go/bin:/home/lupe/.fzf/bin:/ > > > > > usr/local/bin/fim:/opt/texlive/2017/bin/x86_64-linux:/home/lupe/prog/go > > > > > /bin LD_LIBRARY_PATH= DYLD_LIBRARY_PATH= gawk > > > > > > > > > > Locale environment: > > > > > LC_ALL="C" LANG="C" > > > > > > > > > > ======== Starting XML extension tests ======== > > > > > xdocbook > > > > > ./xdocbook.ok _xdocbook diferem: byte 1, linha 1 > > > > > make[1]: [Makefile:574: xdocbook] Error 1 (ignored) > > > > > xdeep2 > > > > > ./xdeep2.ok _xdeep2 diferem: byte 1, linha 1 > > > > > make[1]: [Makefile:602: xdeep2] Error 1 (ignored) > > > > > xattr > > > > > ./xattr.ok _xattr diferem: byte 1, linha 1 > > > > > make[1]: [Makefile:608: xattr] Error 1 (ignored) > > > > > xfujutf8 > > > > > ./xfujutf8.ok _xfujutf8 diferem: byte 1, linha 1 > > > > > make[1]: [Makefile:614: xfujutf8] Error 1 (ignored) > > > > > xotlsjis > > > > > ./xotlsjis.ok _xotlsjis diferem: byte 1, linha 1 > > > > > make[1]: [Makefile:626: xotlsjis] Error 1 (ignored) > > > > > xfujeucj > > > > > ./xfujeucj.ok _xfujeucj diferem: byte 1, linha 1 > > > > > make[1]: [Makefile:620: xfujeucj] Error 1 (ignored) > > > > > ./xload.ok _xload diferem: byte 52, linha 2 > > > > > make[1]: [Makefile:631: xload] Error 1 (ignored) > > > > > xmlinterleave > > > > > ./xmlinterleave.ok _xmlinterleave diferem: byte 1, linha 1 > > > > > make[1]: [Makefile:582: xmlinterleave] Error 1 (ignored) > > > > > beginfile > > > > > ./beginfile.ok 
_beginfile diferem: byte 1, linha 1 > > > > > make[1]: [Makefile:588: beginfile] Error 1 (ignored) > > > > > ======== Done with XML extension tests ======== > > > > > make[2]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > > > gawk-xml-1.1.1/test' > > > > > 9 TESTS FAILED > > > > > make[2]: *** [Makefile:516: pass-fail] Error 1 > > > > > make[2]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > > > gawk-xml-1.1.1/test' > > > > > make[1]: *** [Makefile:555: check] Error 2 > > > > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > > > gawk-xml-1.1.1/test' > > > > > make: *** [Makefile:591: check-recursive] Error 1 > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > Gawkextlib-users mailing list > > > > > Gaw...@li... > > > > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > > > > > > > > > > > > > _______________________________________________ > > > > > Gawkextlib-users mailing list > > > > > Gaw...@li... > > > > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > Gawkextlib-users mailing list > > > > > Gaw...@li... > > > > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > > > > > > > > > > > > > > > _______________________________________________ > > > Gawkextlib-users mailing list > > > Gaw...@li... > > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > > > _______________________________________________ > Gawkextlib-users mailing list > Gaw...@li... > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users -- Andrew Schorr e-mail: as...@te... Telemetry Investments, L.L.C. phone: 917-305-1748 545 Fifth Ave, Suite 1108 fax: 212-425-5550 New York, NY 10017-3630 |
From: Luis P. M. <lui...@gm...> - 2019-06-17 10:25:20
|
Hi Andy,

I cannot reproduce the problem, even after untarring the gawkextlib and
gawk-xml tarballs again, without setting any --prefix=/usr/local for the
configuration script.
Even the make check for gawk-xml works fine now.

But now, still cannot use the xml tool:
$ awk -l xml workingdocuments.awk awk_demos/a.xml
awk: fatal: can't open shared library `xml' for reading (No such file or
directory)

The compilation arguments for my distro's gawk are:
configure_args="--with-readline"

$ whereis awk
awk: /usr/bin/awk /usr/libexec/awk /usr/share/awk
/usr/share/man/man1p/awk.1p /usr/share/man/man1/awk.1

The installation destination for the xml library:
/usr/local/lib/gawk/xml.so

Do I need to compile a different gawk myself and place it under /usr/local?

Thanks,

Luis

On 20190616 09:50:04 -0400, Andrew J. Schorr wrote:
> Hi Luis,
>
> The default autoconf prefix is /usr/local. If that's not where you intend
> to install, then you should specify a prefix explicitly. I recommend that
> you set --prefix explicitly instead of relying upon defaults. If you do
> that, does the build succeed?
>
> If you email the entire sequence of commands that you are running, then it
> will enable us to duplicate the problem.
>
> Also, that "Configured with" message seems to be incorrect. I tried
> the same thing, and it does say "Configured with: ../configure --prefix=/usr ...",
> but that's simply not true. That message seems to be a bug in the
> autoconf tools. If you run "grep prefix= config.log", you should see at the end:
> prefix='/usr/local'
>
> Regards,
> Andy
>
> On Sat, Jun 15, 2019 at 11:35:33PM +0100, Luis P. Mendes wrote:
> > Hi Andy,
> >
> > # gawk -V
> > GNU Awk 5.0.0, API: 2.0
> >
> > I've downloaded and installed the latest gawkextlib I saw from
> > sourceforge: gawkextlib-1.0.4.tar.gz
> >
> > I think I found an error.
> > This is in the config.log of gawkextlib: > > Configured with: /builddir/gcc-8.3.0/configure > > --build=x86_64-unknown-linux-gnu --enable-fast-character > > --enable-vtable-verify --prefix=/usr --mandir=/usr/share/man > > --infodir=/usr/share/info --libexecdir=/usr/ lib --libdir=/usr/lib > > --enable-threads=posix --enable-__cxa_atexit --disable-multilib > > --with-system-zlib --enable-shared --enable-lto --enable-plugins > > --enable-linker-build-id --disable-werror --disable-nls -- > > enable-default-pie --enable-default-ssp --enable-checking=release > > --disable-libstdcxx-pch --with-isl --with-linker-hash-style=gnu > > --disable-libunwind-exceptions --disable-target-libiberty > > --enable-serial-confi gure > > --enable-languages=c,c++,objc,obj-c++,fortran,lto,go,ada > > > > Please note that prefix is /usr, as I didn't specified any in the > > configure command. > > > > But, > > > > # find /usr/ -name libgawkextlib.so.0 > > /usr/local/lib/libgawkextlib.so.0 > > > > # ll /usr/local/lib/libgawkextlib.so > > lrwxrwxrwx 1 root root 22 jun 15 18:26 /usr/local/lib/libgawkextlib.so > > -> libgawkextlib.so.0.0.0 > > > > It seems that /usr/local/lib was used instead of /usr/lib. 
> > > > > > > > In the config.log of gawk-xml: > > Configured with: /builddir/gcc-8.3.0/configure > > --build=x86_64-unknown-linux-gnu --enable-fast-character > > --enable-vtable-verify --prefix=/usr --mandir=/usr/share/man > > --infodir=/usr/share/info --libexecdir=/usr/ lib --libdir=/usr/lib > > --enable-threads=posix --enable-__cxa_atexit --disable-multilib > > --with-system-zlib --enable-shared --enable-lto --enable-plugins > > --enable-linker-build-id --disable-werror --disable-nls -- > > enable-default-pie --enable-default-ssp --enable-checking=release > > --disable-libstdcxx-pch --with-isl --with-linker-hash-style=gnu > > --disable-libunwind-exceptions --disable-target-libiberty > > --enable-serial-confi gure > > --enable-languages=c,c++,objc,obj-c++,fortran,lto,go,ada > > > > > > After a make clean in gawk-xml, the make command outputs an error: > > gawk: xmlbase:14: fatal: load_ext: cannot open library > > `../.libs/xml.so' (libgawkextlib.so.0: cannot open shared object file: > > No such file or directory) > > > > > > Thanks, > > > > > > Luis > > > > > > > > On 20190615 15:41:07 -0400, Andrew J. Schorr wrote: > > > Hi Luis, > > > > > > Which versions of gawk and gawkextlib are you using? > > > Can you please share the entire sequence of commands you are running > > > up through the "make check" that fails? It is clearly not configuring > > > the environment variables properly, so it can't find the extension, > > > as Jürgen pointed out. > > > > > > Regards, > > > Andy > > > > > > On Sat, Jun 15, 2019 at 07:38:34PM +0100, Luis P. Mendes wrote: > > > > Jürgen, > > > > > > > > I'm on Linux, not MacOSX: 4.19.50_1 #1 SMP PREEMPT x86_64 GNU/Linux > > > > > > > > > > > > On Sat, Jun 15, 2019 at 7:29 PM Jürgen Kahrs via Gawkextlib-users < > > > > gaw...@li...> wrote: > > > > > > > > Hello Luis, > > > > I guess you tried this on MacOSX. > > > > This seems to be the reason for the failed test cases: > > > > .. 
LD_LIBRARY_PATH= DYLD_LIBRARY_PATH= gawk > > > > > > > > The environment variable DYLD_LIBRARY_PATH should have been > > > > expanded as $DYLD_LIBRARY_PATH during configuration. > > > > I am not quite sure where this information got lost, in the source > > > > code or during configuration on your machine. Any idea ? > > > > > > > > > > > > Jürgen Kahrs > > > > > > > > > > > > > > > > Hi, > > > > > > > > After some years since last used this wonderful tool, I'd like to > > > > install gawk-xml. > > > > As per the README, I've installed gawkextlib without problems. > > > > Next, `configure` and ``make` for gawk-xml went fine, but not `make > > > > check`. > > > > Output of the command below. > > > > Do I need to install extra packages? > > > > Thanks > > > > > > > > $ make check > > > > Making check in awklib > > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > > gawk-xml-1.1.1/awklib' > > > > make[1]: Nothing to be done for 'check'. > > > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > > gawk-xml-1.1.1/awklib' > > > > Making check in po > > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > > gawk-xml-1.1.1/po' > > > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > > gawk-xml-1.1.1/po' > > > > Making check in packaging > > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > > gawk-xml-1.1.1/packaging' > > > > make[1]: Nothing to be done for 'check'. 
> > > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > > gawk-xml-1.1.1/packaging' > > > > Making check in test > > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > > gawk-xml-1.1.1/test' > > > > > > > > AWK = LC_ALL=C LANG=C AWKLIBPATH=../.libs:/usr/lib/gawk PATH=/home/lupe > > > > /bin:/home/lupe/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/ > > > > sbin:/sbin:/opt/texlive/2019/bin/x86_64-linux:/usr/local/bin/fim:/opt/ > > > > texlive/2017/bin/x86_64-linux:/home/lupe/go/bin:/home/lupe/.fzf/bin:/ > > > > usr/local/bin/fim:/opt/texlive/2017/bin/x86_64-linux:/home/lupe/prog/go > > > > /bin LD_LIBRARY_PATH= DYLD_LIBRARY_PATH= gawk > > > > > > > > Locale environment: > > > > LC_ALL="C" LANG="C" > > > > > > > > ======== Starting XML extension tests ======== > > > > xdocbook > > > > ./xdocbook.ok _xdocbook diferem: byte 1, linha 1 > > > > make[1]: [Makefile:574: xdocbook] Error 1 (ignored) > > > > xdeep2 > > > > ./xdeep2.ok _xdeep2 diferem: byte 1, linha 1 > > > > make[1]: [Makefile:602: xdeep2] Error 1 (ignored) > > > > xattr > > > > ./xattr.ok _xattr diferem: byte 1, linha 1 > > > > make[1]: [Makefile:608: xattr] Error 1 (ignored) > > > > xfujutf8 > > > > ./xfujutf8.ok _xfujutf8 diferem: byte 1, linha 1 > > > > make[1]: [Makefile:614: xfujutf8] Error 1 (ignored) > > > > xotlsjis > > > > ./xotlsjis.ok _xotlsjis diferem: byte 1, linha 1 > > > > make[1]: [Makefile:626: xotlsjis] Error 1 (ignored) > > > > xfujeucj > > > > ./xfujeucj.ok _xfujeucj diferem: byte 1, linha 1 > > > > make[1]: [Makefile:620: xfujeucj] Error 1 (ignored) > > > > ./xload.ok _xload diferem: byte 52, linha 2 > > > > make[1]: [Makefile:631: xload] Error 1 (ignored) > > > > xmlinterleave > > > > ./xmlinterleave.ok _xmlinterleave diferem: byte 1, linha 1 > > > > make[1]: [Makefile:582: xmlinterleave] Error 1 (ignored) > > > > beginfile > > > > ./beginfile.ok _beginfile diferem: byte 1, linha 1 > > > > make[1]: [Makefile:588: beginfile] Error 1 
(ignored) > > > > ======== Done with XML extension tests ======== > > > > make[2]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > > gawk-xml-1.1.1/test' > > > > 9 TESTS FAILED > > > > make[2]: *** [Makefile:516: pass-fail] Error 1 > > > > make[2]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > > gawk-xml-1.1.1/test' > > > > make[1]: *** [Makefile:555: check] Error 2 > > > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > > gawk-xml-1.1.1/test' > > > > make: *** [Makefile:591: check-recursive] Error 1 > > > > > > > > > > > > > > > > _______________________________________________ > > > > Gawkextlib-users mailing list > > > > Gaw...@li... > > > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > > > > > > > > > > _______________________________________________ > > > > Gawkextlib-users mailing list > > > > Gaw...@li... > > > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > > > > > > > > > > > > _______________________________________________ > > > > Gawkextlib-users mailing list > > > > Gaw...@li... > > > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > > > > > > > > > > _______________________________________________ > > Gawkextlib-users mailing list > > Gaw...@li... > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > |
From: Andrew J. S. <as...@te...> - 2019-06-16 13:50:13
|
Hi Luis,

The default autoconf prefix is /usr/local. If that's not where you intend
to install, then you should specify a prefix explicitly. I recommend that
you set --prefix explicitly instead of relying upon defaults. If you do
that, does the build succeed?

If you email the entire sequence of commands that you are running, then it
will enable us to duplicate the problem.

Also, that "Configured with" message seems to be incorrect. I tried
the same thing, and it does say "Configured with: ../configure --prefix=/usr ...",
but that's simply not true. That message seems to be a bug in the
autoconf tools. If you run "grep prefix= config.log", you should see at the end:
prefix='/usr/local'

Regards,
Andy

On Sat, Jun 15, 2019 at 11:35:33PM +0100, Luis P. Mendes wrote:
> Hi Andy,
>
> # gawk -V
> GNU Awk 5.0.0, API: 2.0
>
> I've downloaded and installed the latest gawkextlib I saw from
> sourceforge: gawkextlib-1.0.4.tar.gz
>
> I think I found an error.
> This is in the config.log of gawkextlib:
> Configured with: /builddir/gcc-8.3.0/configure
> --build=x86_64-unknown-linux-gnu --enable-fast-character
> --enable-vtable-verify --prefix=/usr --mandir=/usr/share/man
> --infodir=/usr/share/info --libexecdir=/usr/lib --libdir=/usr/lib
> --enable-threads=posix --enable-__cxa_atexit --disable-multilib
> --with-system-zlib --enable-shared --enable-lto --enable-plugins
> --enable-linker-build-id --disable-werror --disable-nls
> --enable-default-pie --enable-default-ssp --enable-checking=release
> --disable-libstdcxx-pch --with-isl --with-linker-hash-style=gnu
> --disable-libunwind-exceptions --disable-target-libiberty
> --enable-serial-configure
> --enable-languages=c,c++,objc,obj-c++,fortran,lto,go,ada
>
> Please note that prefix is /usr, as I didn't specify any in the
> configure command.
> > But, > > # find /usr/ -name libgawkextlib.so.0 > /usr/local/lib/libgawkextlib.so.0 > > # ll /usr/local/lib/libgawkextlib.so > lrwxrwxrwx 1 root root 22 jun 15 18:26 /usr/local/lib/libgawkextlib.so > -> libgawkextlib.so.0.0.0 > > It seems that /usr/local/lib was used instead of /usr/lib. > > > > In the config.log of gawk-xml: > Configured with: /builddir/gcc-8.3.0/configure > --build=x86_64-unknown-linux-gnu --enable-fast-character > --enable-vtable-verify --prefix=/usr --mandir=/usr/share/man > --infodir=/usr/share/info --libexecdir=/usr/ lib --libdir=/usr/lib > --enable-threads=posix --enable-__cxa_atexit --disable-multilib > --with-system-zlib --enable-shared --enable-lto --enable-plugins > --enable-linker-build-id --disable-werror --disable-nls -- > enable-default-pie --enable-default-ssp --enable-checking=release > --disable-libstdcxx-pch --with-isl --with-linker-hash-style=gnu > --disable-libunwind-exceptions --disable-target-libiberty > --enable-serial-confi gure > --enable-languages=c,c++,objc,obj-c++,fortran,lto,go,ada > > > After a make clean in gawk-xml, the make command outputs an error: > gawk: xmlbase:14: fatal: load_ext: cannot open library > `../.libs/xml.so' (libgawkextlib.so.0: cannot open shared object file: > No such file or directory) > > > Thanks, > > > Luis > > > > On 20190615 15:41:07 -0400, Andrew J. Schorr wrote: > > Hi Luis, > > > > Which versions of gawk and gawkextlib are you using? > > Can you please share the entire sequence of commands you are running > > up through the "make check" that fails? It is clearly not configuring > > the environment variables properly, so it can't find the extension, > > as Jürgen pointed out. > > > > Regards, > > Andy > > > > On Sat, Jun 15, 2019 at 07:38:34PM +0100, Luis P. 
Mendes wrote: > > > Jürgen, > > > > > > I'm on Linux, not MacOSX: 4.19.50_1 #1 SMP PREEMPT x86_64 GNU/Linux > > > > > > > > > On Sat, Jun 15, 2019 at 7:29 PM Jürgen Kahrs via Gawkextlib-users < > > > gaw...@li...> wrote: > > > > > > Hello Luis, > > > I guess you tried this on MacOSX. > > > This seems to be the reason for the failed test cases: > > > .. LD_LIBRARY_PATH= DYLD_LIBRARY_PATH= gawk > > > > > > The environment variable DYLD_LIBRARY_PATH should have been > > > expanded as $DYLD_LIBRARY_PATH during configuration. > > > I am not quite sure where this information got lost, in the source > > > code or during configuration on your machine. Any idea ? > > > > > > > > > Jürgen Kahrs > > > > > > > > > > > > Hi, > > > > > > After some years since last used this wonderful tool, I'd like to > > > install gawk-xml. > > > As per the README, I've installed gawkextlib without problems. > > > Next, `configure` and ``make` for gawk-xml went fine, but not `make > > > check`. > > > Output of the command below. > > > Do I need to install extra packages? > > > Thanks > > > > > > $ make check > > > Making check in awklib > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > gawk-xml-1.1.1/awklib' > > > make[1]: Nothing to be done for 'check'. > > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > gawk-xml-1.1.1/awklib' > > > Making check in po > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > gawk-xml-1.1.1/po' > > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > gawk-xml-1.1.1/po' > > > Making check in packaging > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > gawk-xml-1.1.1/packaging' > > > make[1]: Nothing to be done for 'check'. 
> > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > gawk-xml-1.1.1/packaging' > > > Making check in test > > > make[1]: Entering directory '/home/lupe/recolhidos/gawkextlib/ > > > gawk-xml-1.1.1/test' > > > > > > AWK = LC_ALL=C LANG=C AWKLIBPATH=../.libs:/usr/lib/gawk PATH=/home/lupe > > > /bin:/home/lupe/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/ > > > sbin:/sbin:/opt/texlive/2019/bin/x86_64-linux:/usr/local/bin/fim:/opt/ > > > texlive/2017/bin/x86_64-linux:/home/lupe/go/bin:/home/lupe/.fzf/bin:/ > > > usr/local/bin/fim:/opt/texlive/2017/bin/x86_64-linux:/home/lupe/prog/go > > > /bin LD_LIBRARY_PATH= DYLD_LIBRARY_PATH= gawk > > > > > > Locale environment: > > > LC_ALL="C" LANG="C" > > > > > > ======== Starting XML extension tests ======== > > > xdocbook > > > ./xdocbook.ok _xdocbook diferem: byte 1, linha 1 > > > make[1]: [Makefile:574: xdocbook] Error 1 (ignored) > > > xdeep2 > > > ./xdeep2.ok _xdeep2 diferem: byte 1, linha 1 > > > make[1]: [Makefile:602: xdeep2] Error 1 (ignored) > > > xattr > > > ./xattr.ok _xattr diferem: byte 1, linha 1 > > > make[1]: [Makefile:608: xattr] Error 1 (ignored) > > > xfujutf8 > > > ./xfujutf8.ok _xfujutf8 diferem: byte 1, linha 1 > > > make[1]: [Makefile:614: xfujutf8] Error 1 (ignored) > > > xotlsjis > > > ./xotlsjis.ok _xotlsjis diferem: byte 1, linha 1 > > > make[1]: [Makefile:626: xotlsjis] Error 1 (ignored) > > > xfujeucj > > > ./xfujeucj.ok _xfujeucj diferem: byte 1, linha 1 > > > make[1]: [Makefile:620: xfujeucj] Error 1 (ignored) > > > ./xload.ok _xload diferem: byte 52, linha 2 > > > make[1]: [Makefile:631: xload] Error 1 (ignored) > > > xmlinterleave > > > ./xmlinterleave.ok _xmlinterleave diferem: byte 1, linha 1 > > > make[1]: [Makefile:582: xmlinterleave] Error 1 (ignored) > > > beginfile > > > ./beginfile.ok _beginfile diferem: byte 1, linha 1 > > > make[1]: [Makefile:588: beginfile] Error 1 (ignored) > > > ======== Done with XML extension tests ======== > > > make[2]: Entering 
directory '/home/lupe/recolhidos/gawkextlib/ > > > gawk-xml-1.1.1/test' > > > 9 TESTS FAILED > > > make[2]: *** [Makefile:516: pass-fail] Error 1 > > > make[2]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > gawk-xml-1.1.1/test' > > > make[1]: *** [Makefile:555: check] Error 2 > > > make[1]: Leaving directory '/home/lupe/recolhidos/gawkextlib/ > > > gawk-xml-1.1.1/test' > > > make: *** [Makefile:591: check-recursive] Error 1 > > > > > > > > > > > > _______________________________________________ > > > Gawkextlib-users mailing list > > > Gaw...@li... > > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > > > > > > > _______________________________________________ > > > Gawkextlib-users mailing list > > > Gaw...@li... > > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > > > > > > > > _______________________________________________ > > > Gawkextlib-users mailing list > > > Gaw...@li... > > > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users > > > > > > > _______________________________________________ > Gawkextlib-users mailing list > Gaw...@li... > https://lists.sourceforge.net/lists/listinfo/gawkextlib-users -- Andrew Schorr e-mail: as...@te... Telemetry Investments, L.L.C. phone: 917-305-1748 545 Fifth Ave, Suite 1108 fax: 212-425-5550 New York, NY 10017-3630 |