#162 PrettyHtmlSerializer drops trailing space in tags

Group: v2.30
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2023-06-19
Created: 2015-12-16
Creator: C Martin
Private: No

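// cleaner is an HtmlCleaner instance; props is its CleanerProperties
String html = "<b>One </b>Two";  // note the trailing space inside <b>
TagNode node = cleaner.clean(html);
String cleanedHtml = new PrettyHtmlSerializer(props, " ").getAsString(node, "UTF-8");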

will produce cleaned HTML of

<b>One</b>Two

dropping the space inside the tag.

SimpleHtmlSerializer preserves the space.

Discussion

  • Scott Wilson

    Scott Wilson - 2015-12-16

    Yes, I can confirm it does this. The offending code is in PrettyHtmlSerializer.getSingleLineOfChildren where it trims the child content. I think a more appropriate algorithm in this instance is a whitespace collapse function, which turns any amount of whitespace into a single space character.
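
To illustrate the difference between trimming and the whitespace collapse suggested above, here is a minimal standalone sketch. It is not HtmlCleaner code: the class `WhitespaceCollapse` and the helper `collapseWhitespace` are hypothetical names introduced for this example, and it assumes Java's `\s` character class is an acceptable stand-in for HTML whitespace.

```java
import java.util.regex.Pattern;

public class WhitespaceCollapse {
    // One or more whitespace characters as matched by Java's \s
    // (space, tab, newline, carriage return, form feed, vertical tab).
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    /** Hypothetical helper: replaces each run of whitespace with one space; never trims. */
    static String collapseWhitespace(String text) {
        return WHITESPACE_RUN.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        String content = "One \n ";
        // trim() removes the trailing whitespace entirely.
        System.out.println("[" + content.trim() + "]");              // prints [One]
        // Collapsing keeps a single trailing space.
        System.out.println("[" + collapseWhitespace(content) + "]"); // prints [One ]
    }
}
```

With collapsing instead of trimming, the content of `<b>One </b>` would keep its trailing space, so the words would not run together in the serialized output.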

     
    • C Martin

      C Martin - 2015-12-16

      Would the whitespace collapsing fit more logically in the cleaner phase, and not in the renderer? The single/multiple whitespace equivalence is a property of HTML itself, and would apply however it's rendered.

       
  • Scott Wilson

    Scott Wilson - 2015-12-16

    Logically you'd think so; however, in the parsing rules that doesn't seem to be the case. If you put two spaces between words in a tag, then it renders in a browser (e.g. Chrome) as a single space, but if you inspect the DOM via XPath, both spaces are still there. So it's not actually part of the cleaning phase.
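
The point that the parsed tree keeps every space can be sketched with the JDK's own XML parser and XPath engine. This is only an approximation of the browser case described above: it assumes well-formed markup and uses an XML parser rather than an HTML parser, and the class `DomWhitespaceDemo` and helper `textAt` are hypothetical names for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomWhitespaceDemo {
    /** Hypothetical helper: parses well-formed markup and returns the XPath string value. */
    static String textAt(String markup, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            // evaluate(String, Object) returns the string value of the first matching node.
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query markup", e);
        }
    }

    public static void main(String[] args) {
        // Two spaces between the words and one trailing space inside <b>.
        String text = textAt("<p><b>One  Two </b>Three</p>", "/p/b/text()");
        System.out.println("[" + text + "]"); // prints [One  Two ]
    }
}
```

Both inner spaces and the trailing space survive in the text node, which matches the observation that collapsing happens at render time rather than at parse time.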

     
  • Scott Wilson

    Scott Wilson - 2015-12-16

    ... but in any case, this suggests that the best option is for PrettyHtmlSerializer not to trim any spaces in its output.

     
  • Scott Wilson

    Scott Wilson - 2017-02-06
    • Group: v 2.7 --> v2.19
     
  • Scott Wilson

    Scott Wilson - 2017-02-07
    • Group: v2.19 --> v2.20
     
  • Scott Wilson

    Scott Wilson - 2017-05-02
    • Group: v2.20 --> v2.21
     
  • Scott Wilson

    Scott Wilson - 2017-05-11
    • Group: v2.21 --> v2.22
     
  • Scott Wilson

    Scott Wilson - 2018-04-24
    • Group: v2.22 --> v2.23
     
  • Scott Wilson

    Scott Wilson - 2019-09-04
    • Group: v2.23 --> v2.24
     
  • Scott Wilson

    Scott Wilson - 2020-04-29
    • Group: v2.24 --> v2.25
     
  • Scott Wilson

    Scott Wilson - 2021-09-24
    • Group: v2.25 --> v2.26
     
  • Scott Wilson

    Scott Wilson - 2023-04-29
    • Group: v2.26 --> v2.29
     
  • Scott Wilson

    Scott Wilson - 2023-06-19
    • Group: v2.29 --> v2.30
     
