
#17 [PATCH] Allow lenient parsing for original entities

Milestone: 2.13
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2015-05-15
Created: 2014-09-08
Private: No

Many older HTML documents contain entities that are not terminated by a semicolon. These should nonetheless be recognized as entities, and most modern browsers do recognize them (I've tested Chrome, Firefox, and Safari). This appears to apply only to the original set of entities representing ISO Latin-1 characters (code < 256).
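To illustrate the behaviour being requested, here is a minimal sketch in Python. It is not the attached patch. It assumes that "original entities" means the named entities in Python's `html.entities.name2codepoint` with a code point below 256, and the helper name `decode_lenient` is hypothetical. As a conservative choice, which is stricter than what browsers do in body text, an unterminated name is left alone when a letter or digit follows it, so longer entities such as `&notin;` are not mangled.

```python
import re
from html.entities import name2codepoint

# Assumption: the "original" entities are the named entities whose
# code point is below 256 (quot, amp, lt, gt and the Latin-1 range).
LENIENT_ENTITIES = {
    name: chr(cp) for name, cp in name2codepoint.items() if cp < 256
}

# Longer names first, so that e.g. "sup1" is tried before "sup".
_NAMES = sorted(LENIENT_ENTITIES, key=len, reverse=True)

# Accept the name followed by ";" or, leniently, by anything that is
# not a letter or digit (including end of input).
_PATTERN = re.compile(
    r"&(" + "|".join(re.escape(n) for n in _NAMES) + r")(?:;|(?![A-Za-z0-9]))"
)


def decode_lenient(text):
    """Replace Latin-1 named entities, with or without a trailing semicolon."""
    return _PATTERN.sub(lambda m: LENIENT_ENTITIES[m.group(1)], text)


if __name__ == "__main__":
    print(decode_lenient("&copy 2014 Caf&eacute &amp; Bar&reg"))
```

Entities outside this set, such as `&alpha;`, pass through unchanged in this sketch. A real parser would handle them separately and would still require the semicolon.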

See also:
http://stackoverflow.com/questions/18689230/why-do-html-entity-names-with-dec-255-not-require-semicolon
http://stackoverflow.com/questions/15532252/why-is-reg-being-rendered-as-%C2%AE-without-the-bounding-semicolon

1 Attachments

Discussion

  • Scott Wilson

    Scott Wilson - 2014-09-09

    Great idea, Shaun. This allows us to "upgrade" older HTML docs to use valid XML/XHTML entities rather than just treating them as text. I'll review the patch for inclusion in the next release.

     
  • Scott Wilson

    Scott Wilson - 2014-10-31

    Hi Shaun,

    I can't apply your patch as it's a collection of "partial" patches in GNU format rather than a unified patch. Can you create a unified patch using "svn diff" or "diff -u"?
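For readers unfamiliar with the unified format requested here, the following sketch uses Python's standard `difflib` module to produce output in the same shape that `diff -u` emits. The helper name `make_unified_patch`, the file name `Example.txt`, and the sample lines are illustrative assumptions and have nothing to do with the actual patch.

```python
import difflib


def make_unified_patch(old_lines, new_lines, path):
    """Return a unified diff between two versions of one file as a string."""
    return "".join(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="a/" + path,
            tofile="b/" + path,
        )
    )


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    old = ["alpha\n", "beta\n", "gamma\n"]
    new = ["alpha\n", "BETA\n", "gamma\n"]
    print(make_unified_patch(old, new, "Example.txt"), end="")
```

A unified patch carries `---` and `+++` file headers and `@@` hunk markers, with removed lines prefixed by `-` and added lines by `+`, so all changes to a file travel in one self-describing block.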

     
  • Scott Wilson

    Scott Wilson - 2015-01-19
    • Group: 2.9 --> 2.11
     
  • Scott Wilson

    Scott Wilson - 2015-05-12
    • Group: 2.11 --> 2.12
     
  • Scott Wilson

    Scott Wilson - 2015-05-15
    • Group: 2.12 --> 2.13
     
