#118 HTML always translating special entities

Milestone: v2.16
Status: closed-fixed
Owner: nobody
Labels: None
Priority: 5
Updated: 2021-04-20
Created: 2014-05-17
Creator: Seanster
Private: No

Hi. Now that the HTML Serializers are back I've tried them out but I can't get them to just leave special entities alone.

Input: &nbsp; or &pound;

Output if (translate special entities) true:   £
Output if (translate special entities) false: (space) or £

There's no scenario where I can get it to output &nbsp; or &pound;
None of the other command line options make any difference.
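
To make the three behaviours at issue concrete, here is a minimal sketch in plain Java that does not use HtmlCleaner at all. The `EntityModes` class, its `Mode` enum and the `render` method are hypothetical names invented for this illustration, and the two-entry entity table is an assumption; none of this reflects how HtmlCleaner handles entities internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EntityModes {
    // Hypothetical modes, named for this sketch only.
    enum Mode { KEEP, TO_CHARACTER, TO_NCR }

    // Tiny illustrative table; a real serializer knows many more entities.
    private static final Map<String, Integer> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put("nbsp", 160);
        ENTITIES.put("pound", 163);
    }

    // Rewrites each known named entity according to the chosen mode.
    static String render(String input, Mode mode) {
        String out = input;
        for (Map.Entry<String, Integer> e : ENTITIES.entrySet()) {
            String named = "&" + e.getKey() + ";";
            int codePoint = e.getValue();
            String replacement;
            switch (mode) {
                case TO_CHARACTER:
                    // Replace the entity with the literal character.
                    replacement = new String(Character.toChars(codePoint));
                    break;
                case TO_NCR:
                    // Replace the entity with a numeric character reference.
                    replacement = "&#" + codePoint + ";";
                    break;
                default:
                    // KEEP: leave the named entity exactly as written.
                    replacement = named;
            }
            out = out.replace(named, replacement);
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "&nbsp; or &pound;";
        for (Mode m : Mode.values()) {
            System.out.println(m + ": " + render(input, m));
        }
    }
}
```

The `KEEP` mode is the behaviour this ticket asks for: the named entities pass through to the output unchanged.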

Related to ticket #107 maybe? escapeXml() needs adjustment? I don't have any experience with Version 2.7 yet (first version with the HTML Serializers back on the command line) but for this example it works as expected.

Also related: transSpecialEntitiesToNCR is not exposed on the command line, but it should have been brought back when the HTML serializers were reinstated. The command line help text should list the HTML serializers too.

Discussion

  • Scott Wilson

    Scott Wilson - 2015-10-23
    • status: open-accepted --> closed-fixed
     
  • Scott Wilson

    Scott Wilson - 2016-08-17

    Hi Seanster,

    In the current trunk (which will become v2.17 when released) I've got this unit test passing:

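        // Imports are not shown in the original post; assumed here: JUnit's
        // Test and assertTrue, org.htmlcleaner.*, and java.io.*
        @Test
        public void nbsp() throws IOException {
            String html = "<b>One&nbsp;</b>Two";

            ByteArrayOutputStream htmlOutputStream = new ByteArrayOutputStream();

            HtmlCleaner cleaner = new HtmlCleaner();
            CleanerProperties props = cleaner.getProperties();
            // Ask the cleaner to leave named entities such as &nbsp; untouched
            props.setTranslateSpecialEntities(false);
            TagNode node = cleaner.clean(html);
            new SimpleHtmlSerializer(props).writeToStream(node, htmlOutputStream);
            String htmlcontent = htmlOutputStream.toString();
            // The entity should survive the clean/serialize round trip unchanged
            assertTrue(htmlcontent.contains("<b>One&nbsp;</b>Two"));
        }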
    

    Does that cover the case?

     