#6 Keeps whitespace in a document.

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2004-01-02
Created: 2004-01-02
Private: No

- Changes the way CondenseWhiteSpace works. With this patch
there are only two modes of operation.

Set to true:
HTML-like parsing (leading, trailing and inner whitespace is
chopped).

Set to false:
Keeps everything except spaces between attributes,
spaces after processing instructions, etc., but basically
all content stays the same.
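
The two modes can be illustrated on a single text node. This is only a rough sketch, not TinyXML code: `read_text` is a hypothetical helper, and it assumes that "chopped" inner whitespace means each inner run collapses to one space.

```python
import re


def read_text(raw: str, condense: bool) -> str:
    """Hypothetical helper showing the two modes on one text node."""
    if condense:
        # HTML-like mode: collapse every whitespace run to one space
        # (assumption), then drop leading and trailing whitespace.
        return re.sub(r"\s+", " ", raw).strip()
    # Preserving mode: the text content is returned untouched.
    return raw


sample = "  Hello,\n   world  "
condensed = read_text(sample, condense=True)
preserved = read_text(sample, condense=False)
print(repr(condensed))
print(repr(preserved))
```

Spaces between attributes and after processing instructions are outside the scope of this sketch, since it only deals with text content.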

This patch should also fix handling of UTF-8. The 2.2.1
release escapes all instances of high-ASCII chars in the
output stream.

If you have a single Japanese character encoded in
UTF-8, it might consist of 3 bytes, e.g. 234, 245, 201;
these became &#234;&#245;&#201;. Subsequent
parsing by another parser will interpret them as three
distinct Unicode characters. This was a one-line fix.
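
The effect can be reproduced outside the library. The sketch below is an assumption-laden illustration, not the 2.2.1 code: `escape_high_bytes` and `decode_char_refs` are hypothetical helpers, and the sample character U+3042 (bytes 227, 129, 130) was chosen here because it is a real three-byte UTF-8 sequence.

```python
import re


def escape_high_bytes(data: bytes) -> str:
    """Hypothetical stand-in for the described behaviour: every byte
    above 127 is written as a numeric character reference."""
    return "".join(chr(b) if b < 128 else f"&#{b};" for b in data)


def decode_char_refs(text: str) -> str:
    """What a later parser does: each &#N; becomes code point N."""
    return re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), text)


original = "\u3042"               # one Japanese character (hiragana A)
utf8 = original.encode("utf-8")   # three bytes: 227, 129, 130
escaped = escape_high_bytes(utf8)
reparsed = decode_char_refs(escaped)

print(list(utf8))
print(escaped)
print(len(original), len(reparsed))
```

After the round trip the single character has turned into three unrelated code points, whereas writing the bytes through unescaped and decoding them as UTF-8 gives back the original character.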

Discussion

  • Simon Harrison

    Simon Harrison - 2004-01-02

    Oh bugger - I've been caught out again - Let's try the last
    paragraph again:

    If you have a single Japanese character encoded in
    UTF-8, it might consist of 3 bytes, e.g. 234, 245, 201;
    these became &#234;&#245;&#201;. Subsequent
    parsing by another parser will interpret them as three
    distinct Unicode characters. This was a one-line fix.

     
  • Simon Harrison

    Simon Harrison - 2004-01-02

    The patch

     
