StatSVN

Tracker: Bugs

5 Missing Encoding Declaration - ID: 1829625
Last Update: Comment added ( archilles_sw )

Hello,

there seems to be a problem with Unicode characters in the generated HTML
files. Subversion stores its log output in UTF-8 (at least in my case) and
browsers usually fall back to ISO-8859-1. Therefore any multibyte characters
(e.g. German umlauts, and generally all non-ASCII characters) are shown
incorrectly, one byte at a time.

The following line in the HTML header should help:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

I use version 0.3.1.
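
The effect described above can be reproduced outside the browser. The following is only a minimal sketch in Python, not StatSVN code; the sample name "Müller" is an assumption chosen for illustration. It encodes a string as UTF-8 and then decodes the same bytes as ISO-8859-1, which is roughly what a browser does when it assumes the wrong charset.

```python
# Illustration only: a commit author name containing an umlaut (hypothetical sample).
original = "Müller"

# Subversion's log output is UTF-8, so "ü" becomes the two bytes 0xC3 0xBC.
utf8_bytes = original.encode("utf-8")

# A reader that assumes ISO-8859-1 maps every byte to one character.
misread = utf8_bytes.decode("iso-8859-1")

# A reader that is told the real charset recovers the original text.
correct = utf8_bytes.decode("utf-8")

print(len(original), len(utf8_bytes))  # 6 characters, 7 bytes
print(misread)                         # MÃ¼ller
print(correct)                         # Müller
```

The two-byte sequence for "ü" shows up as the two characters "Ã¼", which matches the byte-per-byte garbling seen in the generated pages.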


Submitted: Archilles ( archilles_sw ) - 2007-11-10 20:02
Priority: 5
Status: Open
Resolution: None
Assigned: Nobody/Anonymous
Category: Output Layer
Group: None
Visibility: Public


Comments ( 2 )

Date: 2008-06-18 20:19
Sender: archilles_sw


I changed "iso-8859-1" to "utf-8", reloaded the page, and the characters are
displayed correctly. Maybe some (Java) XML parsers do have broken Unicode
implementations, but I'm not a cross-platform expert. Newer parsers should
handle it - at least on Linux my last XML trouble was years ago. MacOS should
be fine too. Don't know about Windows as I use it only for gaming :)

Maybe you could just pass the raw byte stream through on (known) broken
parsers. Unicode is just a way of interpreting the bytes, and the browser does
that in the end anyway. Well, okay, this is dirty and may have security
implications. Perhaps someone experienced in Unicode knows a tip...
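
To make the charset change concrete, here is a rough sketch in Python. It is not how StatCVS/StatSVN builds its pages; `render_page` and `declared_charset` are hypothetical helpers invented for this illustration. The idea is simply that the charset named in the meta tag and the encoding used to write the bytes come from the same value, and that the log message is HTML-escaped rather than passed through raw.

```python
import html
import re


def render_page(log_message: str, charset: str = "utf-8") -> bytes:
    """Hypothetical helper: build a tiny page whose declared charset
    is the same one used to encode the bytes."""
    page = (
        "<html><head>"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}" />'
        "</head><body><p>"
        f"{html.escape(log_message)}"  # escape markup coming from log messages
        "</p></body></html>"
    )
    return page.encode(charset)


def declared_charset(page_bytes: bytes) -> str:
    """Hypothetical helper: read the charset back out of the meta tag."""
    match = re.search(rb"charset=([A-Za-z0-9_-]+)", page_bytes)
    return match.group(1).decode("ascii")


page_bytes = render_page("Änderung: größer <b>")
charset = declared_charset(page_bytes)

# Decoding with the declared charset gives back readable text.
decoded = page_bytes.decode(charset)
print(charset)
print(decoded)
```

Escaping the message is one way to address the security concern mentioned above, since text from the log never reaches the page as live markup.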


Date: 2008-06-15 15:23
Sender: jkealey (Project Admin)


Sorry for the late reply.

Looking at the current output in v0.4, I see:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>

This code is in StatCVS. Given my lack of experience with charsets, I'm
not the best person to decide the best course of action for StatCVS/StatSVN
with regard to a general charset.

All I can say is that, for SVN, the log output appears to always be in
UTF-8, but we've gotten errors from parsers that don't recognize some
characters. (See the comments here:
http://blog.lavablast.com/post/2008/03/Upcoming-StatCVSStatSVN-release.aspx)

Do you think this is a limitation of the XML parsers on some platforms
(which don't support multi-byte UTF-8) or something else?
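
One possible "something else", offered only as a guess and not confirmed anywhere in this thread, is log data that is declared as UTF-8 but actually contains bytes in another encoding. The sketch below uses Python's standard `xml.etree.ElementTree` purely for illustration (StatSVN itself is Java); the two sample documents and the `try_parse` helper are made up. A conforming parser accepts the multi-byte UTF-8 sequence and rejects the stray ISO-8859-1 byte.

```python
import xml.etree.ElementTree as ET

# "für" encoded as real UTF-8: the umlaut is the two bytes 0xC3 0xBC.
valid_utf8 = b'<?xml version="1.0" encoding="UTF-8"?><msg>f\xc3\xbcr</msg>'

# The same text encoded as ISO-8859-1 (single byte 0xFC) but still declared UTF-8.
latin1_bytes = b'<?xml version="1.0" encoding="UTF-8"?><msg>f\xfcr</msg>'


def try_parse(data: bytes):
    """Return the element text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


parsed_valid = try_parse(valid_utf8)
parsed_invalid = try_parse(latin1_bytes)

print(parsed_valid)    # für
print(parsed_invalid)  # None
```

If the reported parser errors look like the second case, the parser would be behaving correctly and the input bytes would be the thing to investigate.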



