This is because sites of this sort misuse XML.

I don't blame them: they want to use XML, but there is no standardized way to consume a never-ending "stream" of XML.

Why?

You can either produce one document, as they do (and as several other protocols do, such as IM protocols that use XML):

<root>
message
message
.... forever
.... you will NEVER see the </root>

 


OR you can produce a stream of messages:

<message> text </message>
<message> text </message>
<message> text </message>

 

 

The problem with the first approach is that, depending on which level of the specs you want to stick to, it pretty much requires you to read to the end tag in order to determine well-formedness. Secondly, most XML processes are simply not built for this case. Streaming XSLT might just be able to do it, but most processes are designed to read the full document to the end, either into memory to construct an in-memory representation or into a database.
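To make the contrast concrete, here is a minimal Python sketch (not Saxon, and not anything the site in question provides). It uses the standard library's xml.etree.ElementTree; the function names and the "message" tag are my own assumptions for illustration. A whole-document parse rejects input whose root never closes, while an incremental pull parser can hand back each child as soon as its end tag arrives. It still can never confirm that the document as a whole is well-formed.

```python
import xml.etree.ElementTree as ET


def parses_as_whole_document(text):
    """Return True only if text is one complete, well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised here because the closing root tag never arrives.
        return False


def finished_messages(chunks, tag="message"):
    """Yield the text of each <message> as soon as its end tag is seen.

    `tag` is a hypothetical element name; chunks may split anywhere,
    even in the middle of a tag.
    """
    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if elem.tag == tag:
                yield elem.text
                # Drop the element's content; an empty shell stays
                # attached to the never-closed root.
                elem.clear()
```

For example, feeding the pieces "<root><mess", "age>one</message><message>tw" and "o</message>" gives back the two message texts even though </root> is never seen.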

 

The second case has no established "standard" representation, i.e. a sequence of XML documents, or even a sequence of XDM values, does not have a standard format. There are serialization specs for what to do when writing a sequence of XDM values, but they are not designed or appropriate for this case. This case needs a pre-processor (or something very deep in the XML processor) to detect the boundaries and return individual messages. That can be difficult when buffering data, when reading from sockets, etc. And XML processors are all supposed to have ONE document as input, so they simply don't do this.

(See http://xml.calldei.com/XDMSerialize for an ongoing discussion and semi-proposal.)
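To illustrate why buffering makes this awkward: a read from a socket can stop in the middle of a message, so a splitter has to carry the leftover bytes over to the next read. A minimal Python sketch, assuming (hypothetically) that the sender ends every message with a newline and never puts a newline inside one; the function name is mine, and real feeds may not be this kind.

```python
def split_on_delimiter(chunks, delimiter=b"\n"):
    """Yield complete messages from an iterable of byte chunks.

    Assumption: `delimiter` ends every message and never occurs inside one.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last delimiter is complete; the final
        # piece may be a partial message and waits for the next chunk.
        *complete, buffer = buffer.split(delimiter)
        for message in complete:
            if message.strip():
                yield message
    # Whatever is still in `buffer` here is an unfinished message and
    # is deliberately not yielded.
```

Each yielded message is then a candidate single document for an ordinary XML processor.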

 

When I was on the Efficient XML W3C group, we had the same issue while attempting to address similar use cases; that time it was an IM protocol that (ab)used XML in the same way ...

 

To accomplish what you want, you need to either

1)      Have a preprocessing step that reads the raw text, splits it into the child XML elements, and sends those to Saxon.
Depending on the source that could be easy or hard: maybe they use a well-defined delimiter
between child nodes; otherwise you have to be as smart as an XML parser to know when you reach node boundaries.
OR

2)      MAYBE Streaming XSLT can address this.   Not sure.
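For option 1 without a delimiter, one way to be "as smart as an XML parser" is to borrow one. The Python sketch below is only an illustration under assumptions: it feeds a synthetic, never-closed root (the name "wrapper" is made up) to the standard library's pull parser before the real data, so the concatenated messages look like children of a single document, and it re-serializes each child as a standalone string that could then be passed to Saxon. It assumes the messages carry no XML declaration or DOCTYPE of their own, and the re-serialized text is equivalent to, not byte-identical with, the input.

```python
import xml.etree.ElementTree as ET


def split_messages(chunks):
    """Yield each top-level element of a concatenated stream as a string.

    `chunks` is an iterable of str pieces that may split anywhere.
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    parser.feed("<wrapper>")  # synthetic root (hypothetical name), never closed
    depth = 0
    wrapper = None
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
                if depth == 1:
                    wrapper = elem  # the synthetic root itself
            else:
                depth -= 1
                if depth == 1:  # a direct child of the wrapper just closed
                    elem.tail = None  # ignore whitespace between messages
                    yield ET.tostring(elem, encoding="unicode")
                    wrapper.remove(elem)  # keep memory use flat
```

Because the parser tracks nesting, a message that contains child elements of the same name is still returned whole.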

 

 

 

From: Christian Schwaderer [mailto:c_schwaderer@hotmail.com]
Sent: Tuesday, June 10, 2014 5:17 AM
To: Mailing list for the SAXON XSLT and XQuery processor
Subject: Re: [saxon] Installing Saxon/C on openSUSE 13.1 64-bit

 

Thanks again, Oneil!
But I don't think that this is the problem. I double checked that entry - and it is there.
Without the line "extension=saxon.so" PHP wouldn't try to "load dynamic library '/usr/lib64/php5/extensions/saxon.so' ". And without loading "saxon.so" - how should PHP know that it has to open "libsaxon.so"?
Maybe PHP doesn't know where to look for "libsaxon.so", because it says "Unknown on line 0". That seems strange to me. Obviously PHP is able to obtain some information from saxon.so; otherwise it wouldn't know the filename "libsaxon.so". But why can't PHP give the right line number? (Or does it start counting lines from zero, and the entry pointing to "libsaxon.so" is in the very first line? Even then, the word "Unknown" remains a problem.)

I hope you find the error by building an openSUSE 13.1 64-bit virtual machine yourself. Unfortunately, I don't think there is anything I could try or do now.

Thanks again and kind regards!
Christian


From: oneil@saxonica.com
Date: Tue, 10 Jun 2014 09:54:25 +0100
To: saxon-help@lists.sourceforge.net
Subject: Re: [saxon] Installing Saxon/C on openSUSE 13.1 64-bit

It looks like the PHP extension has not been added to the php.ini file, i.e. extension=saxon.so

 

I am building myself a virtual box with openSUSE 13.1