#224 encoding of last "-" character is not working correctly

Status: open
Owner: aditsu
Labels: None
Priority: 5
Updated: 2012-10-08
Created: 2010-08-02
Private: No

In the attached test case, the last '-' character at the end of the paragraph is not encoded correctly.

Discussion

  • The '-' is a Unicode dash character, not a regular '-' sign.
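
    To make the distinction concrete, here is a minimal sketch, assuming the character in question is the en dash (U+2013) that appears in the "incluindo – mas" excerpt quoted later in this thread. The class and method names are hypothetical and are not part of JTidy. It prints the code point and UTF-8 byte length of each character.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only; not part of JTidy.
class DashCheck {
    // Returns a "U+XXXX" label for the first code point of s.
    static String codePointLabel(String s) {
        return String.format("U+%04X", s.codePointAt(0));
    }

    // Returns the number of bytes s occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String enDash = "\u2013"; // assumed to be the character from the test case
        String hyphen = "-";      // regular ASCII hyphen-minus
        System.out.println(codePointLabel(enDash) + " takes " + utf8Length(enDash) + " byte(s) in UTF-8");
        System.out.println(codePointLabel(hyphen) + " takes " + utf8Length(hyphen) + " byte(s) in UTF-8");
    }
}
```

    The en dash is a multi-byte sequence in UTF-8 while the hyphen-minus is a single byte, so the two can look alike on screen and still be handled differently by an encoder.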

     
  • aditsu
    2010-08-02

    I opened both the input and output files in a browser and they look identical. Even the HTML source looks identical, except for different line wrapping. What exactly is the problem?

     
  • I have attached JTidy's output. Have a look at the "incluindo – mas" section. I have shortened the problematic sequence as much as I can. The strange thing is that when you remove some content from the beginning, everything works fine.

     
  • aditsu
    2010-08-02

    What version of JTidy did you use, and what program/command did you run?

     
  • I'm using trunk. I have seen the problem for at least two years but never filed a bug.

     
  • aditsu
    2010-08-17

    suggested patch

  • aditsu
    2010-08-17

    Hi, could you try the attached patch please?
    It seems to fix the problem but I wonder if it breaks anything.
    Btw, it doesn't seem to break any more tests on the java5 branch.

     
  • I'll check in two weeks...

     
  • I couldn't wait. The problem is gone, thank you! I have re-uploaded all files to create a test case (for the old system and this failure).

     
  • aditsu
    2010-08-17

    Haha, thanks, I'll go ahead and commit then (a bit later).

     
  • aditsu
    2010-08-18

    Well, I committed the fix, but the test case you added doesn't work on trunk - it tries to parse "SUMMARY" (message level) as a number.
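
    As a generic illustration of that kind of failure, and not JTidy's actual test harness: in Java, `Integer.parseInt` throws `NumberFormatException` when given a non-numeric token such as "SUMMARY". The sketch below uses a hypothetical helper (`parseLevelOrDefault`) that falls back to a default value instead of failing; how the real test code reads the message level is an assumption here.

```java
// Hypothetical illustration only; the names below do not come from JTidy.
class LevelParse {
    // Parses token as an integer level, returning fallback when it is not numeric.
    static int parseLevelOrDefault(String token, int fallback) {
        try {
            return Integer.parseInt(token.trim());
        } catch (NumberFormatException e) {
            // A symbolic value such as "SUMMARY" ends up here.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseLevelOrDefault("3", -1));
        System.out.println(parseLevelOrDefault("SUMMARY", -1));
    }
}
```

    Whether a fallback, a name-to-number mapping, or a change to the test data is the right fix depends on what the trunk test format expects, which this thread does not state.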