From: David L. <mdl...@gm...> - 2008-11-23 16:47:36
Hello,

I'm new to the XQuery and eXist world, so this may be a very newbie question. I was trying to read an RSS feed with XQuery. I use the for $x in doc("sample.rss") syntax and all is fine. The description section of the feed contains embedded HTML. When this gets serialized, the markup characters come out as entities such as &gt; and &lt;. I wanted to turn around and display this with the markup, but it shows up as &gt;&lt; etc. Is there a way to keep the HTML markup when it serializes?

Thanks,
David
From: Wolfgang <wol...@ex...> - 2008-11-23 21:57:08
Hi,

> I'm new to all this XQuery and eXist world, so this may be a very
> newbie question, but I was trying to read an RSS feed with XQuery. I
> do the for $x in doc("sample.rss") syntax and all is fine. In the
> description section of the feed is embedded HTML. When this gets
> serialized it converts all the entities it finds to &gt;&lt;.

This is not really an eXist issue, but a limitation of RSS. RSS embeds the HTML as text, not XML. Just look at the source code of an arbitrary RSS feed in your web browser and you will find that the HTML is either "escaped" or wrapped in a CDATA section. You would need to pre-process the content and parse the HTML into XML; otherwise you cannot query it.

If you have a choice, I would prefer Atom over RSS. It has proper support for embedding XHTML.

Wolfgang
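
To make the pre-processing step concrete, here is a minimal sketch in Python (standard library only, outside eXist). It is an illustration under assumptions, not the approach used in this thread: the one-item feed in SAMPLE_RSS and the names TagCollector and description_markup are made up for the example. It shows that the description arrives as a plain string, and that a second, HTML-aware parse is needed before the markup can be treated as structure.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical one-item feed; the description holds escaped HTML, as RSS does.
SAMPLE_RSS = """<rss version="2.0"><channel><item>
<title>Example</title>
<description>&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;</description>
</item></channel></rss>"""


class TagCollector(HTMLParser):
    """Collects start-tag names and text chunks from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def description_markup(rss_source):
    """Return the first item's description as a string.

    The XML parser decodes the entities, so the result is HTML source
    text, not a tree of elements.
    """
    root = ET.fromstring(rss_source)
    return root.findtext("./channel/item/description")


# Step 1: the XML parse yields the HTML only as text.
html_text = description_markup(SAMPLE_RSS)

# Step 2: a second parse turns that text into structure.
collector = TagCollector()
collector.feed(html_text)
collector.close()

print(html_text)
print(collector.tags)
```

The same two-step idea applies inside an XQuery processor: first read the description as a string, then hand that string to whatever HTML or XML parsing facility the processor offers.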