Thread: [Videlibri-xidel] documentation for syntax to extract two or more fields from same "record" ?
From: Marco F. <mfi...@ne...> - 2023-08-05 07:28:52
Greetings,

(I also posted this on github because I am not sure what the best channel is...)

I am learning Xidel 0.9.8 on Ubuntu. My issue is that I have looked into the documentation, but as far as I can tell it gives no clue (none I recognize, at least) to do this.

I have a JSON file with records that have, among others, id and title fields, e.g.:

#> jq '.' test.json | cut -c1-100 | more
[
  {
    "id": 42,
    "title": "Software is eating the world",

I want to extract with Xidel those two values, producing output lines like this:

ARTICLE: 42 ==> Software is eating the world

or at least like this:

42 ==> Software is eating the world

and I can't find, or recognize, the right syntax to use for what seems a general, very common need to me. The closest I have come to what I want is this:

xidel test.json -e 'for $t in $json/title return string-join(("$t/../id", $t), " ==> ")'

which produces lines like these:

$t/../id ==> Software is eating the world

What is the right way to refer to the id value of the current record?

Thanks!
Marco
From: Reino W. <rwi...@xs...> - 2023-08-05 15:32:23
Hello Marco,

On 2023-08-05T09:09:17+0200, Marco Fioretti <mfi...@ne...> wrote:
> (I also posted this on github because I am not sure what the best channel is...)

It's Benito's project of course, but I'd say that while the Github issue-tracker is a good place for bug-reports, for questions about usage the mailinglist here, the SourceForge discussion forum, or StackOverflow is a better place.

> I am learning Xidel 0.9.8 on Ubuntu.

It's really recommended to use a more up-to-date version! See https://videlibri.sourceforge.net/xidel.html#downloads.

> I have a JSON file with records that have, among others, id and title fields, e.g.:
>
> #> jq '.' test.json | cut -c1-100 | more
> [
>   {
>     "id": 42,
>     "title": "Software is eating the world",
>
> I want to extract with Xidel those two values, producing output lines like this:
>
> ARTICLE: 42 ==> Software is eating the world
> [...]

Can we assume the rest of the JSON looks a bit like this?

[
  {
    "id": 42,
    "title": "Software is eating the world"
  },
  {
    "id": 43,
    "title": "..."
  }
  ...
]

If not, then please specify.

--
Reino
From: M. F. <mfi...@ne...> - 2023-08-05 16:05:41
On Sat, Aug 05, 2023 17:10:39 PM +0200, Reino Wijnsma wrote:
> Can we assume the rest of the JSON looks a bit like this?
>
> [
>   {
>     "id": 42,
>     "title": "Software is eating the world"
>   },
>   {
>     "id": 43,
>     "title": "..."
>   }
>   ...
> ]
>
> If not, then please specify.

Hi Reino,

yes, the JSON does all look like that. There are OTHER fields e.g. url, creation date and so on, but all the records have the same structure.

Looking forward to suggestions. Meanwhile, I will also try to download the newer version.

Thanks,
Marco
From: Reino W. <rwi...@xs...> - 2023-08-05 17:01:42
On 2023-08-05T18:05:28+0200, M. Fioretti <mfi...@ne...> wrote:
> yes, the JSON does all look like that.

Then a simple string-concatenation would suffice:

xidel -s test.json -e '$json()/concat("ARTICLE: ",id," ==> ",title)'

Or with the latest XPath 4 extended-string-syntax:

xidel -s test.json -e '$json()/`ARTICLE: {id} ==> {title}`'

Your JSON is an array, so be sure to use $json(), or $json?* (XPath/XQuery 3 syntax), to iterate over its members.

On 2023-08-05T09:09:17+0200, Marco Fioretti <mfi...@ne...> wrote:
> My issue is that I have looked into the documentation, but as far as I can tell it gives no clue (none I recognize, at least) to do this.

What documentation specifically?
When I first encountered Xidel long ago I thought lots of things were not documented, until I realized it had full support for (at that time) XPath/XQuery 2.0, which has its own documentation.

Last week I made this post <https://sourceforge.net/p/xidel/discussion/help/thread/9bebdbf105/#35a8>, which in my opinion links to a lot of interesting Xidel-specific and general XPath/XQuery information.

--
Reino
From: M. F. <mfi...@ne...> - 2023-08-05 18:24:50
On Sat, Aug 05, 2023 19:01:34 PM +0200, Reino Wijnsma wrote:
> On 2023-08-05T18:05:28+0200, M. Fioretti <mfi...@ne...> wrote:
> > yes, the JSON does all look like that.
>
> Then a simple string-concatenation would suffice:
>
> xidel -s test.json -e '$json()/concat("ARTICLE: ",id," ==> ",title)'
>
> Or with the latest XPath 4 extended-string-syntax:
>
> xidel -s test.json -e '$json()/`ARTICLE: {id} ==> {title}`'

Thanks a LOT. Just tried both commands, and I confirm that they both work as I need.

> Your JSON is an array, so be sure to use $json(), or $json?*...

And THIS is the one (in hindsight, obvious) thing that I was missing.

> What documentation specifically?
> When I first encountered Xidel long ago I thought lots of things were
> not documented, until I realized it had full support for (at that
> time) XPath/XQuery 2.0, which has its own documentation.

Indeed, the problem I have may very well be that even THAT documentation is hard to recognize/navigate (for me, at least). For example, now that I do have the output I asked for, all lines like:

ARTICLE: 42 ==> Software is eating the world

the next step would be to make each "column" fixed length, for readability, i.e. this:

ARTICLE:   42 ==> Software is eating the world
ARTICLE: 3942 ==> Software licenses in the age of AI

instead of:

ARTICLE: 42 ==> Software is eating the world
ARTICLE: 3942 ==> Software licenses in the age of AI

In Perl, C, bash... I'd use sprintf, but is there an XQuery/XPath version of it? Doesn't seem so.

Thanks,
Marco
From: Reino W. <rwi...@xs...> - 2023-08-05 21:29:10
On 2023-08-05T20:24:40+0200, M. Fioretti <mfi...@ne...> wrote:
> the next step would be to make each "column" fixed length, for
> readability, i.e. this:
>
> ARTICLE:   42 ==> Software is eating the world
> ARTICLE: 3942 ==> Software licenses in the age of AI

If it's only about the id, and it doesn't exceed 9999, then the following trick would do:

echo '[
  {
    "id": 1,
    "title": "Software A"
  },
  {
    "id": 10,
    "title": "Software B"
  },
  {
    "id": 100,
    "title": "Software C"
  },
  {
    "id": 1000,
    "title": "Software D"
  }
]' | xidel -se '
  $json()/concat(
    "ARTICLE: ",
    substring("    ",1,4 - string-length(id))||id,
    " ==> ",
    title
  )
'
ARTICLE:    1 ==> Software A
ARTICLE:   10 ==> Software B
ARTICLE:  100 ==> Software C
ARTICLE: 1000 ==> Software D

The substring() call takes as many spaces from the 4-space string as the id is short of 4 characters, and puts them in front of the id.

If you have a situation where you need every "column" to have a fixed width, then it IS possible, but your query will become very difficult. My own hobby-project for example: https://github.com/Reino17/xivid/blob/master/xivid_notes.txt#L1465-L1730.
Another example that comes to mind: https://sourceforge.net/p/xidel/discussion/help/thread/031d881982/#a3e1.

--
Reino
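For comparison with the sprintf approach Marco mentions, the same right-aligned id column can be produced outside XPath by post-processing in the shell. The following is only a minimal sketch and not something posted in this thread: the function name format_articles is hypothetical, and it assumes the input arrives as one "id<TAB>title" pair per line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Xidel): reads "id<TAB>title" lines on
# stdin and prints them with the id right-aligned in a 4-character column.
format_articles() {
  tab=$(printf '\t')            # literal tab, used as the only field separator
  while IFS="$tab" read -r id title; do
    # %4s left-pads the id with spaces to a width of 4, sprintf-style.
    printf 'ARTICLE: %4s ==> %s\n' "$id" "$title"
  done
}

# Sample input mirroring the records used in the thread.
printf '%s\t%s\n' 1 'Software A' 10 'Software B' 100 'Software C' 1000 'Software D' |
  format_articles
```

To feed it from Xidel, one possibility (an assumption, untested here) is to emit the tab with the standard codepoints-to-string() function, e.g. xidel -s test.json -e '$json()/concat(id,codepoints-to-string(9),title)' | format_articles. Unlike the substring() trick, ids longer than 4 characters are printed in full and simply push the column out of alignment.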
From: Reino W. <rwi...@xs...> - 2024-09-28 13:59:02
Hello videlibri-xidel and Marco,

On 2023-08-05T09:09:17+0200, Marco Fioretti <mfi...@ne...> wrote:
> I am learning Xidel 0.9.8 on Ubuntu. [...]

I just found out that you're a freelance author, because by pure coincidence I stumbled upon your article https://www.linux-magazine.com/Issues/2023/276/Xidel.
Great job! A good article for anyone new to Xidel. I have no idea if Benito has already seen it, but I'm sure he'll agree.

--
Reino