Parsing JSON

When you query a wiki page with the Allura REST API, you receive a JSON representation of the requested page. To parse JSON from a shell, I found two good options: jq and the Python json library.

In some initial tests I ran into a problem with jq when getting the markdown text. The default output looked OK, but the raw output, which I actually needed, looked like gibberish. With Python I immediately got the right output, so I decided to stick with Python.

Both jq and the Python json library can do the job with a one-line command, although the Python syntax is more verbose. Even so, getting a string or a list of strings with Python is quite easy.

$ curl -s -k -X GET \
> https://sourceforge.net/rest/p/demo-project/wiki/Project%20Web%20Services-Draft/ \
> | jq -r '.text'
$ curl -s -k -X GET \
> https://sourceforge.net/rest/p/demo-project/wiki/Project%20Web%20Services-Draft/ \
> | python -c "import sys, json; print(json.load(sys.stdin)['text'])"
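For readers who prefer a script over a one-liner, the Python variant can be sketched as a small function. This is only an illustration: the sample payload and the name extract_text are made up for the example, and a real response from the API contains more fields than shown here.

```python
import json

# Hypothetical sample payload standing in for the API response;
# only the 'text' field matters for this example.
SAMPLE = '{"title": "Demo", "text": "# Heading\\n\\nSome *markdown* text."}'


def extract_text(raw):
    """Return the markdown in the 'text' field of a wiki page JSON document."""
    page = json.loads(raw)
    return page["text"]


if __name__ == "__main__":
    # json.loads decodes the escaped newlines, so the markdown is printed
    # as plain multi-line text, comparable to the raw output of jq -r.
    print(extract_text(SAMPLE))
```

In a script that reads from a pipe, json.load(sys.stdin) would take the place of json.loads(SAMPLE), as in the one-liner above.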

The list of attachments in the JSON representation of a SourceForge wiki page is a list of objects with key-value pairs for URL and size. I wanted to retrieve a list of URLs. This is very simple with jq, but I had a hard time finding the solution for Python, mainly because I was expecting a short Python notation for this, while I actually had to add an extra step.

$ curl -s -k -X GET \
> https://sourceforge.net/rest/p/demo-project/wiki/Project%20Web%20Services-Draft/ \
> | jq -r '.attachments[].url'
$ curl -s -k -X GET \
> https://sourceforge.net/rest/p/demo-project/wiki/Project%20Web%20Services-Draft/ \
> | python -c "import sys, json; \
> attachments = json.load(sys.stdin)['attachments']; \
> print('\n'.join([a['url'] for a in attachments]))"
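The extra step in the Python version is the list comprehension that picks the URL out of each attachment object. A minimal sketch of that step on its own follows. The sample data, the example.org URLs, the "bytes" key for the size, and the name attachment_urls are assumptions made for the illustration and are not taken from a real response.

```python
import json

# Hypothetical sample payload; the URLs and the 'bytes' key name
# for the size are assumptions made for this example.
SAMPLE = """
{"attachments": [
    {"url": "https://example.org/wiki/attachment/a.png", "bytes": 1024},
    {"url": "https://example.org/wiki/attachment/b.pdf", "bytes": 2048}
]}
"""


def attachment_urls(raw):
    """Return the 'url' value of every object in the 'attachments' list."""
    page = json.loads(raw)
    # Roughly what jq does with '.attachments[].url': iterate over the
    # list and pick one key from each object. A missing 'attachments'
    # key yields an empty list here instead of an error.
    return [a["url"] for a in page.get("attachments", [])]


if __name__ == "__main__":
    print("\n".join(attachment_urls(SAMPLE)))
```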

While writing this blog post, I was not able to reproduce the problematic jq behaviour when getting the markdown text. I probably made an error in the notation, because now it works fine. So I am definitely switching to jq in the script, because it is simpler to use, cleaner, and makes the script more readable.

Posted by Henk van den Akker 2022-07-01 Labels: Development
