Thread: [aKregator-devel] [Bug 112491] New: CDATA in feed is not handled correctly
From: Eckhart Wör. <kd...@ew...> - 2005-09-12 17:19:32
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

http://bugs.kde.org/show_bug.cgi?id=112491
Summary: CDATA in feed is not handled correctly
Product: akregator
Version: unspecified
Platform: unspecified
OS/Version: Linux
Status: NEW
Severity: normal
Priority: NOR
Component: general
AssignedTo: akregator-devel lists sourceforge net
ReportedBy: kde ewsoftware de

Version: 1.2 (using KDE 3.4.2, Kubuntu Package 4:3.4.2-0ubuntu0hoary2 )
Compiler: gcc version 3.3.5 (Debian 1:3.3.5-8ubuntu2)
OS: Linux (i686) release 2.6.10-5-386

In http://www.blogistan.co.uk/qt/atom.xml , <![CDATA[ ... ]]> is used to mask the articles. These CDATA tags belong to the XML file and should therefore not get passed to KHTML. At the moment, they do get passed to KHTML, resulting in strange rendering results.
From: Frank O. <fra...@kd...> - 2005-09-30 06:36:14
http://bugs.kde.org/show_bug.cgi?id=112491

------- Additional Comments From frank.osterfeld kdemail net 2005-09-30 08:36 -------
This example is not Atom-1.0 compliant. In Atom, CDATA does not seem to be valid in <content type="html">, according to http://www.atomenabled.org/developers/syndication/#text

"If type="html", then this element contains entity escaped html.

<title type="html">
AT&amp;amp;T bought &lt;b&gt;by SBC&lt;/b&gt;!
</title>"

So the feed should use escaped HTML instead of CDATA.
From: Eckhart Wör. <kd...@ew...> - 2005-10-23 09:29:21
http://bugs.kde.org/show_bug.cgi?id=112491

------- Additional Comments From kde ewsoftware de 2005-10-23 11:29 -------
http://www.w3.org/TR/2004/REC-xml-20040204/#sec-cdata-sect says:

"[Definition: CDATA sections may occur anywhere character data may occur; they are used to escape blocks of text containing characters which would otherwise be recognized as markup. CDATA sections begin with the string "<![CDATA[" and end with the string "]]>":]"
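The XML definition quoted above implies that a CDATA section and entity-escaped text are two spellings of the same character data, so the "<![CDATA[" and "]]>" markers themselves should never reach the HTML renderer. The following is a minimal sketch of that equivalence in standard C++. The helpers `resolvePredefinedEntities` and `characterData` are hypothetical names made up for this illustration; they are not part of Akregator, librss or Qt, and the sketch assumes the element has no child elements and uses only the five predefined XML entities.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: replace the five predefined XML entity references.
// Anything else (including unknown references) is copied through unchanged.
std::string resolvePredefinedEntities(const std::string& s) {
    struct Entity { const char* ref; char ch; };
    static const Entity table[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'},
        {"&quot;", '"'}, {"&apos;", '\''}
    };
    std::string out;
    std::size_t i = 0;
    while (i < s.size()) {
        bool matched = false;
        if (s[i] == '&') {
            for (const Entity& e : table) {
                const std::string ref(e.ref);
                if (s.compare(i, ref.size(), ref) == 0) {
                    out += e.ch;
                    i += ref.size();
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) out += s[i++];
    }
    return out;
}

// Hypothetical helper: compute the character data of an element from its raw
// source text. Text inside CDATA sections is taken verbatim; text outside
// them has its entity references resolved. The markers are dropped.
std::string characterData(const std::string& raw) {
    const std::string open = "<![CDATA[";
    const std::string close = "]]>";
    std::string out;
    std::size_t pos = 0;
    while (pos < raw.size()) {
        const std::size_t start = raw.find(open, pos);
        if (start == std::string::npos) {
            out += resolvePredefinedEntities(raw.substr(pos));
            break;
        }
        out += resolvePredefinedEntities(raw.substr(pos, start - pos));
        const std::size_t bodyStart = start + open.size();
        const std::size_t end = raw.find(close, bodyStart);
        if (end == std::string::npos) {
            // Unterminated section: this sketch takes the rest verbatim.
            out += raw.substr(bodyStart);
            break;
        }
        assert(end >= bodyStart);
        out += raw.substr(bodyStart, end - bodyStart);
        pos = end + close.size();
    }
    return out;
}
```

Under these assumptions, `characterData("<![CDATA[<b>by SBC</b>]]>")` and `characterData("&lt;b&gt;by SBC&lt;/b&gt;")` both return `<b>by SBC</b>`, which is the string a renderer such as KHTML should be given in either case.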
From: <ow...@bu...> - 2005-11-10 16:13:31
http://bugs.kde.org/show_bug.cgi?id=112491

kde ewsoftware de changed:

What       |Removed  |Added
----------------------------------------------------------------------------
CC         |         |bderidder novell com

------- Additional Comments From kde ewsoftware de 2005-11-10 17:13 -------
*** Bug 116051 has been marked as a duplicate of this bug. ***
From: Frank O. <fra...@kd...> - 2006-01-15 23:16:48
http://bugs.kde.org/show_bug.cgi?id=112491

frank.osterfeld kdemail net changed:

What       |Removed  |Added
----------------------------------------------------------------------------
Status     |NEW      |RESOLVED
Resolution |         |FIXED

------- Additional Comments From frank.osterfeld kdemail net 2006-01-16 00:16 -------
SVN commit 498704 by osterfeld:

fix atom:content parsing: Don't show tags when for Atom 1.0 feeds with escaped HTML in it

BUG: 112491, 117938

 M  +36 -15  tools_p.cpp

--- branches/KDE/3.5/kdepim/akregator/src/librss/tools_p.cpp #498703:498704
@@ -47,21 +47,42 @@
     QDomElement e = node.toElement();
     QString result;
-    if (elemName == "content" && ((e.hasAttribute("mode") && e.attribute("mode") == "xml") || !e.hasAttribute("mode")))
-        result = childNodesAsXML(node);
-    else
-        result = e.text();
-
-    bool hasPre = result.contains("<pre>",false);
-    bool hasHtml = hasPre || result.contains("<"); // FIXME: test if we have html, should be more clever -> regexp
-    if(!isInlined && !hasHtml) // perform nl2br if not a inline elt and it has no html elts
-        result = result = result.replace(QChar('\n'), "<br />");
-    if(!hasPre) // strip white spaces if no <pre>
-        result = result.simplifyWhiteSpace();
-
-    if (result.isEmpty())
-        return QString::null;
-
+    bool doHTMLCheck = true;
+
+    if (elemName == "content") // we have Atom here
+    {
+        doHTMLCheck = false;
+        // the first line is always the Atom 0.3, the second Atom 1.0
+        if (( e.hasAttribute("mode") && e.attribute("mode") == "escaped" && e.attribute("type") == "text/html" )
+        || (!e.hasAttribute("mode") && e.attribute("type") == "html"))
+        {
+            result = KCharsets::resolveEntities(e.text().simplifyWhiteSpace()); // escaped html
+        }
+        else if (( e.hasAttribute("mode") && e.attribute("mode") == "escaped" && e.attribute("type") == "text/plain" )
+        || (!e.hasAttribute("mode") && e.attribute("type") == "text"))
+        {
+            result = e.text().stripWhiteSpace(); // plain text
+        }
+        else if (( e.hasAttribute("mode") && e.attribute("mode") == "xml" )
+        || (!e.hasAttribute("mode") && e.attribute("type") == "xhtml"))
+        {
+            result = childNodesAsXML(e); // embedded XHMTL
+        }
+
+    }
+
+    if (doHTMLCheck) // check for HTML; not necessary for Atom:content
+    {
+        bool hasPre = result.contains("<pre>",false);
+        bool hasHtml = hasPre || result.contains("<"); // FIXME: test if we have html, should be more clever -> regexp
+        if(!isInlined && !hasHtml) // perform nl2br if not a inline elt and it has no html elts
+            result = result = result.replace(QChar('\n'), "<br />");
+        if(!hasPre) // strip white spaces if no <pre>
+            result = result.simplifyWhiteSpace();
+
+        if (result.isEmpty())
+            return QString::null;
+    }
     return result;
 }
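The attribute checks in the commit above amount to a small decision table: in each condition the first alternative covers Atom 0.3 (a "mode" attribute plus a MIME-style "type"), the second covers Atom 1.0 (no "mode", short "type" value). The sketch below restates only that table in standard C++, with a `std::map` standing in for the attributes of the `QDomElement`. The names `ContentKind`, `Attributes` and `classifyAtomContent` are hypothetical and exist only for this illustration; the actual change operates on Qt types directly, as shown in the diff.

```cpp
#include <map>
#include <string>

// Hypothetical labels for the three branches taken in the commit, plus a
// value for attribute combinations that none of the branches matches.
enum class ContentKind { EscapedHtml, PlainText, Xhtml, Unknown };

// Stand-in for the attributes of an atom:content element.
using Attributes = std::map<std::string, std::string>;

// Mirror the conditions from the diff: the first alternative of each test is
// the Atom 0.3 form, the second the Atom 1.0 form.
ContentKind classifyAtomContent(const Attributes& attrs) {
    const auto modeIt = attrs.find("mode");
    const auto typeIt = attrs.find("type");
    const bool hasMode = modeIt != attrs.end();
    const std::string mode = hasMode ? modeIt->second : std::string();
    const std::string type =
        typeIt != attrs.end() ? typeIt->second : std::string();

    if ((hasMode && mode == "escaped" && type == "text/html") ||
        (!hasMode && type == "html")) {
        return ContentKind::EscapedHtml;  // resolve entities, then render
    }
    if ((hasMode && mode == "escaped" && type == "text/plain") ||
        (!hasMode && type == "text")) {
        return ContentKind::PlainText;    // use the text as-is
    }
    if ((hasMode && mode == "xml") || (!hasMode && type == "xhtml")) {
        return ContentKind::Xhtml;        // serialize the child nodes
    }
    return ContentKind::Unknown;          // the commit leaves result empty here
}
```

For example, `classifyAtomContent({{"type", "html"}})` yields `ContentKind::EscapedHtml`, the Atom 1.0 case this bug was reported against, while an element with no attributes falls through to `ContentKind::Unknown`.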
From: Eckhart Wör. <kd...@ew...> - 2006-02-28 20:52:43
http://bugs.kde.org/show_bug.cgi?id=112491

kde ewsoftware de changed:

What       |Removed  |Added
----------------------------------------------------------------------------
Status     |RESOLVED |REOPENED
Component  |general  |feed parser
Resolution |FIXED    |

------- Additional Comments From kde ewsoftware de 2006-02-28 21:52 -------
This bug has only been fixed for Atom, not for RSS, so I am reopening it.
From: <ow...@bu...> - 2006-02-28 20:54:02
http://bugs.kde.org/show_bug.cgi?id=112491

kde ewsoftware de changed:

What       |Removed  |Added
----------------------------------------------------------------------------
CC         |         |roman.cheplyaka gmail com

------- Additional Comments From kde ewsoftware de 2006-02-28 21:53 -------
*** Bug 122857 has been marked as a duplicate of this bug. ***
From: Jan Kundrát <jk...@ge...> - 2006-10-08 20:29:58
http://bugs.kde.org/show_bug.cgi?id=112491

jkt gentoo org changed:

What       |Removed  |Added
----------------------------------------------------------------------------
CC         |         |jkt gentoo org
From: Peter A. <mu...@fr...> - 2007-05-19 13:54:08
http://bugs.kde.org/show_bug.cgi?id=112491

------- Additional Comments From muczy freestart hu 2007-05-19 15:54 -------
Same here.
Gentoo ~amd64
kde 3.5.6

Please fix this annoying bug!
From: Peter A. <mu...@fr...> - 2007-05-19 13:57:20
http://bugs.kde.org/show_bug.cgi?id=112491

muczy freestart hu changed:

What       |Removed  |Added
----------------------------------------------------------------------------
CC         |         |muczy freestart hu