Since version 22.01, WCM supports extracting information from XML and JSON documents. This can be, for example, numbers that are then processed further with the interpreter function and compared against an expected / changed value to create a web change alert. Furthermore, if the server features a REST API, it can be used to access specific information using dedicated server requests.
For XML, WCM supports the XML/XPath notation as implemented in the pugixml library. How does it work? Consider this document for XML:
...then you can access the price of the second book (the index is zero-based!) by this XML/XPath notation: /books/book[1]/price
...which will return: 23.58
For JSON it is similar. Consider this JSON document:
{
  "books":
  [
    {
      "title" : "A Wild Sheep Chase",
      "price" : 22.72
    },
    {
      "title" : "The Night Watch",
      "price" : 23.58
    },
    {
      "title" : "The Comedians",
      "price" : 21.99
    }
  ]
}
...then you can access the price of the second book by this JSON/Pointer notation: /books/1/price
...alternatively you can use JSONPath like this: $.books.1.price
Again, if you let an interpreter operation follow, you can track the prices of items, for example.
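To make the pointer lookup and the follow-up comparison more concrete, here is a minimal sketch in Python. It is not how WCM works internally (WCM uses its own filter and interpreter); the helper names `resolve_pointer` and `price_changed` are made up for illustration, and only the Python standard library is assumed.

```python
import json

# The example document from above, with the prices of three books.
DOCUMENT = """
{
  "books": [
    {"title": "A Wild Sheep Chase", "price": 22.72},
    {"title": "The Night Watch", "price": 23.58},
    {"title": "The Comedians", "price": 21.99}
  ]
}
"""


def resolve_pointer(document, pointer):
    """Walk a JSON Pointer such as /books/1/price (hypothetical helper)."""
    node = document
    # The pointer starts with "/", so the first split element is empty.
    for token in pointer.split("/")[1:]:
        # Undo the two escape sequences that JSON Pointer defines.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(token)]  # array indices are zero-based
        else:
            node = node[token]
    return node


def price_changed(document, pointer, expected):
    """Return True if the value at the pointer differs from the expected one."""
    return resolve_pointer(document, pointer) != expected


books = json.loads(DOCUMENT)
print(resolve_pointer(books, "/books/1/price"))       # 23.58
print(price_changed(books, "/books/1/price", 23.58))  # False
```

The comparison at the end plays the role of the interpreter operation: as long as the extracted value equals the expected one, nothing would be reported.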
For XML, there is something special to consider:
XML/XPath does not work on HTML because HTML is not a valid XML document that XPath can be applied to. For these cases, WCM offers the option to try to convert the HTML into a valid XML document (in effect, it becomes something like XHTML). You have to enable this conversion in the options of the XML/XPath filter. On the other hand, if the webpage is already XML or XHTML, you don't need the conversion and should not apply it, as it may lead to errors.
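As a small illustration of why plain HTML trips up an XML parser, here is a sketch using Python's standard `xml.etree.ElementTree` module. This is not the pugixml parser WCM uses, and the two snippets are invented for the example, but the principle is the same: an unclosed `<br>` is fine in HTML and a hard error in XML.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text):
    """Return True if the text parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Typical HTML: the <br> element is never closed.
html_snippet = "<p>Price: 23.58<br>in stock</p>"
# The XHTML form of the same content: <br/> is self-closing.
xhtml_snippet = "<p>Price: 23.58<br/>in stock</p>"

print(is_well_formed_xml(html_snippet))   # False
print(is_well_formed_xml(xhtml_snippet))  # True
```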
Last edit: Morten MacFly 2023-09-23
Is there a way to select the price of "The Night Watch" if its position in the list may change?
For example, something like:
/books/title:"The Night Watch"/price
If not, this functionality would be exceptionally useful to add!
Last edit: Gitoffthelawn 2024-03-24
Good question. Honest answer: I don't know. I am using the syntax of the JSONCons library (https://danielaparker.github.io/jsoncons/), which ships with many examples (on that homepage and also here: https://github.com/danielaparker/jsoncons/tree/master/examples/src and here: https://github.com/danielaparker/jsoncons?tab=readme-ov-file#E1). You might want to check whether you find something useful there for your purpose, and then I can check if there is a need to change / enhance the implementation.
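For reference, JSONPath in general has filter expressions for this kind of lookup, for example $.books[?(@.title == 'The Night Watch')].price. Whether WCM's JSON filter accepts that expression has not been confirmed in this thread, so treat it as something to try. The selection it describes can be sketched in plain Python like this (the helper name `price_by_title` is made up, and the book order is shuffled on purpose):

```python
import json

# Same books as in the example above, but in a different order.
DOCUMENT = """
{
  "books": [
    {"title": "The Comedians", "price": 21.99},
    {"title": "A Wild Sheep Chase", "price": 22.72},
    {"title": "The Night Watch", "price": 23.58}
  ]
}
"""


def price_by_title(document, title):
    """Return the price of the first book with the given title, else None."""
    for book in document["books"]:
        if book.get("title") == title:
            return book.get("price")
    return None


books = json.loads(DOCUMENT)
print(price_by_title(books, "The Night Watch"))  # 23.58
```

The lookup no longer depends on the position of the book in the list, only on its title.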
Thank you for this update! I think it is somewhat related to my question, right? Anyway, thank you!
Yes, indeed - it could probably solve your problem! For me it was more about supporting REST APIs on servers where you send a query and often get JSON in return. Glad to hear it may be of further use...