[Simple-support] Simple XML framework can't handle foreign markup in Strings - forced use of CDATA
From: Stumpf, J. T. <Tho...@st...> - 2011-06-23 18:16:53
Hi!
I'm quite disappointed:
The Simple XML framework forces the use of superfluous CDATA blocks.
When deserializing XML files whose elements contain markup, e.g. HTML,
the Simple XML framework reduces the HTML to plain text, discarding all
markup:
<description id="eid-191" composed="0">
<div>
<p>The quick brown <a href="www.example.com/fox">fox</a>.</p>
</div>
</description>
and
@Element
String description;
yields "The quick brown fox" in description, but not
<div>
<p>The quick brown <a href="www.example.com/fox">fox</a>.</p>
</div>
This cripples the provided data, is undocumented, and is most probably
unnecessary.
The only way around it is to use CDATA blocks, which is error-prone
and evil:
Every additional CDATA block bloats the XML files. Every CDATA block
makes its substructure impossible to validate. Every CDATA block makes
the generated *and* the read data less suitable for true XML chains.
CDATA is advisable *only* if the substructure is not itself well-formed.
The annotated classes define exactly which parts should be read into
fields, so there is no need to cripple the contained data.
It is a fairly common use case for XML-based markup languages to embed
foreign markup languages, and it is not sensible to have to implement
classes for the whole set of their elements.
It is highly undesirable to be forced to sidestep existing, published
standards.
Additionally, the Simple XML framework obviously should not alter the
content of an element, at least not without a warning, unless a method
for doing so has been provided. Altering the content breaches the
essential rules of object orientation.
For example, the mixed content of an element annotated as an element of
type String can and should simply be stored as a String containing the
markup elements.
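To make the requested behaviour concrete, here is a minimal sketch that uses only the JDK's DOM and Transformer APIs, not Simple itself. The class InnerMarkup and its methods parseRoot and innerXml are hypothetical names chosen for illustration. It contrasts the text-only view of the element with a serialization that keeps the nested markup:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, JDK only; this is not part of the Simple XML framework.
final class InnerMarkup {

    /** Parses the XML string and returns its document element. */
    static Element parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    /** Serializes every child node of the element, keeping nested markup. */
    static String innerXml(Element element) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // Without this, each serialized child would start with an XML declaration.
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            identity.transform(new DOMSource(children.item(i)), new StreamResult(out));
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<description id=\"eid-191\" composed=\"0\">"
                + "<div><p>The quick brown <a href=\"www.example.com/fox\">fox</a>.</p></div>"
                + "</description>";
        Element root = parseRoot(xml);
        // Text-only view: the nested tags are dropped.
        System.out.println("text only  : " + root.getTextContent());
        // Markup-preserving view: the nested tags survive.
        System.out.println("with markup: " + innerXml(root));
    }
}
```

Whether and how something like this can be hooked into Simple's deserialization is not shown here; the sketch only demonstrates that the markup can be kept as a String without any CDATA in the input.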
A decent library might instead offer an optional switch to do so, or
code to strip the supposedly superfluous child nodes.
To preserve a reasonable XML chain and obtain valid XML, only text that
is not well-formed should have to be wrapped in a CDATA block. That is
a sensible use of CDATA.
The decision to write the data as CDATA might even be deferred until
runtime.
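A rough sketch of such a runtime decision, again using only the JDK and not Simple's own API: the hypothetical class CdataOnDemand checks whether a fragment is well-formed by parsing it inside a dummy wrapper element, and falls back to CDATA only when that fails.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, JDK only; this is not part of the Simple XML framework.
final class CdataOnDemand {

    /** Returns true if the fragment parses as XML once wrapped in a dummy root. */
    static boolean isWellFormedFragment(String fragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(
                    new StringReader("<wrapper>" + fragment + "</wrapper>")));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Emits the fragment as-is when well-formed, otherwise inside CDATA. */
    static String toElementContent(String fragment) {
        if (isWellFormedFragment(fragment)) {
            return fragment;
        }
        // "]]>" cannot occur inside a CDATA section, so split it across two sections.
        return "<![CDATA[" + fragment.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void main(String[] args) {
        System.out.println(toElementContent("<p>The quick brown fox.</p>"));
        System.out.println(toElementContent("<p>unclosed <br></p>"));
    }
}
```

With this approach, well-formed HTML fragments stay part of the XML tree and remain open to validation, while only broken markup, such as an unclosed <br>, pays the CDATA price.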
Usage should be transparent and painless both for the developer and for
any alternative provider of XML data.
At the very least there has to be an option (an annotation) that enables
a developer to keep the markup without bloating the code.
As said before, it would be far preferable to preserve the contents of
String elements as provided by default.
For example, I need the contained HTML markup (see above), but I can't
alter the provider.
I'm looking forward to an answer, but judging from previous experience
I unfortunately expect nobody to reply at all, so this might just end
up as a warning to other developers...
Jens