Hi guys,
We've run into an issue where the single quote character gets serialized as ' by the HTML serializers. Attached you'll find a patch and a unit test for this. The unit test also documents the behavior of the XML serializer, which is unchanged by this patch.
Cheers, Oscar
I think the way your posting is rendered reverses your argument, but the title remains correct (at least in my browser).
This will conflict with my patch on bug #118. My patch also fixes this, but it singles out the apostrophe specifically, just in case the special entities list changes later on. I do like that you made a test case for it.
Here's the source code comment where the apostrophe was added to the special entities list:
haha
Lol@the rendering of my post.
It actually is a nice illustration of what I want to avoid. I'm 100% sure I entered & a m p ; in the first sentence but "some" process converted that into a ' (at least that is how it now shows up in the raw html of this page). I'd like to be able to use HtmlCleaner in such a way that it fixes any incorrect HTML, but leaves all characters "as-is".
I've applied your patch and double-checked my test case and all our other tests that use HtmlCleaner, and everything is green :-) I'm OK with closing this issue as a duplicate of 118.
Regarding the JavaScript link in the code comment you mentioned: since all attributes are always serialized/normalized using double quotes, using single quotes inside attribute values should be OK in both HTML and XML. To illustrate, the following test case:
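The test case itself is not shown in this thread. As a stand-in, here is a minimal Java sketch of the point being made. It is not HtmlCleaner code: the class `AttributeEscapeSketch` and the method `escapeDoubleQuotedAttribute` are made-up names, and the sketch assumes the input is a raw, unescaped attribute value. Because the value is always wrapped in double quotes, only the double quote (plus `&`, `<` and `>`) needs escaping, and the apostrophe can pass through untouched.

```java
public class AttributeEscapeSketch {

    // Hypothetical helper (not part of HtmlCleaner): escapes a raw value
    // for use inside a double-quoted attribute. The apostrophe is left
    // as-is, because it cannot terminate a double-quoted value.
    public static String escapeDoubleQuotedAttribute(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c); // includes the apostrophe
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String script = "alert('hi')";
        // The attribute is always written with double quotes.
        System.out.println("<a onclick=\""
                + escapeDoubleQuotedAttribute(script) + "\">x</a>");
        // prints: <a onclick="alert('hi')">x</a>
    }
}
```

The resulting markup is valid as both HTML and XML, which is why escaping the apostrophe in attribute values looks unnecessary as long as the serializer never emits single-quoted attributes.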
s / & a m p ; / & a p o s ;
Sigh, Friday afternoon here ...
Hi Oscar,
I've applied Seanster's patch for issue 118.
The test case for XmlSerialiser is a good one. I've added it to the serialisation test cases, but commented out for now until I've figured out whether there are any side effects of applying the rules to Xml serialisers as well as Html.
S