From: Brad T. <br...@ar...> - 2008-04-01 02:24:38
Hi Lukas,

Thanks for the detailed report of the problem. I'm not sure I understand all the issues you've mentioned:

#2 and #3 are straightforward and should have been obvious before now, thanks! To make sure I've understood and corrected the issue, I changed the META tag's declared encoding to "utf-8" and added the <%@ page ... %> directive as you suggested to .../templates/UI-header.jsp, which is referenced by all the other HTML-generating JSP files.

I'm not sure I understand what you mean with your first point, though. I just checked and didn't see any non-ASCII characters in any of the .jsp files included with Wayback, so I'm not sure how they would be converted to UTF-8. A colleague just suggested that Windows may add a special 2-byte header to all text files indicating the encoding. Is this the change you're talking about? Am I missing something else obvious?

I'm also not sure what you're saying with #4. I understand that there can sometimes be complications with user-submitted data arriving at the server using the wrong encoding, and thus losing information, but in this case, wouldn't the best option be to assume the encoding declared in the user's HTTP request was correct? Said another way, wouldn't it be too late to alter the encoding after the request has been received by the server? Can you illustrate this problem further, or point me at some online docs describing the problem and solution?

Thanks again!

Brad

Lukáš Matějka wrote:
> Hello,
>
> we've been using and testing Wayback for several years in WebArchiv.cz
> and we're familiar with the fact, that so far IA's done a lot of effort
> in i18n especially in last releases. In particular, we appreciate
> support for language properties and configuration of individual jsp
> pages, nevertheless we're still facing issues with utf-8 encoding. I'd
> like to ask for experiences from others (non-ascii countries) how they
> solved this issue.
>
> In general, with a new release, we always have to make the following changes
> (assuming that we usually store our language properties in utf-8):
>
> 1. Convert all jsp into utf-8
> 2. Add meta tag "<meta http-equiv="Content-Type" content="text/html;
> charset=UTF-8">" to JSP so that the browser can recognize the right encoding
> 3. Add directive <%@ page language="java" pageEncoding="utf-8"
> contentType="text/html;charset=utf-8"%> to each JSP to say that server
> should send response in UTF-8
> 4. if we also want to send a unicode text from form to server we have to
> implement a filter that sets encoding to request
> (req.setCharacterEncoding(encoding);)
>
> With respect to these changes, we're able to customize each release,
> however it might help other non-english speaking countries to
> incorporate this into wayback.
> Or is there any other intent how to treat this issue?
>
> Thanks in advance for reply.
>
> Best Regards
> --
> Lukas Matejka
> WebArchiv.cz
> CZ National Library
>
> _______________________________________________
> Archive-access-discuss mailing list
> Arc...@li...
> https://lists.sourceforge.net/lists/listinfo/archive-access-discuss
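
One way to illustrate point 4 of the quoted message, as a minimal stdlib-only sketch rather than anything taken from Wayback: browsers typically percent-encode form fields using the page's charset but do not name that charset in the request, and the Servlet specification has the container fall back to ISO-8859-1 when the request declares none. The class `FormEncodingDemo` and its `encode`/`decode` helpers are hypothetical names used only for this illustration; they stand in for what the browser and the container do, respectively.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormEncodingDemo {
    // Stand-in for the browser: percent-encode a form value in the given charset.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Stand-in for the container: decode a percent-encoded value in the given charset.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // "Lukas" with the Czech accents, written with escapes to keep this file ASCII.
        String typed = "Luk\u00e1\u0161";

        // What a UTF-8 page would put on the wire for this field.
        String onTheWire = encode(typed, "UTF-8");
        System.out.println(onTheWire); // Luk%C3%A1%C5%A1

        // Decoded with the fallback charset: each UTF-8 byte becomes its own character.
        String fallback = decode(onTheWire, "ISO-8859-1");
        System.out.println(fallback.equals(typed)); // false
        System.out.println(fallback.length());      // 7 instead of 5

        // Decoded with the charset the page actually used: the text survives.
        String intended = decode(onTheWire, "UTF-8");
        System.out.println(intended.equals(typed)); // true
    }
}
```

Under these assumptions the request itself carries no charset to trust, so the server has to supply one. A filter that calls `req.setCharacterEncoding("UTF-8")` before any parameter is read selects the second decoding path; the `ServletRequest` documentation notes that the call has no effect once parameters have already been read.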