Re: [FreeMarker-user] URL-encoding strings
From: Daniel D. <dd...@fr...> - 2004-01-06 23:16:37
Tuesday, January 6, 2004, 10:00:50 PM, Per Cederberg wrote:

> On Tue, 2004-01-06 at 20:46, Daniel Dekany wrote:
>> But which Servlet container does that (use UTF-8 for the encoding)?
>
> You mean for decoding URL:s?

Sometimes the fragments of the URL must be decoded to be interpreted. First, the HTTP server must decode them to know which resource it should use to generate the response. For example, the URLs "foo" and "f%6Fo" must be interpreted as the same, so the "%6F" must be resolved to "o". Then the Servlet container faces the same problem again when it has to choose the servlet based on the path. Then your Servlet code faces it again if it has to interpret the path info... and finally, the problem arises again with parameters sent in QUERY_STRING or in a POST-ed form...

> I actually had a similar problem with POST:s in Tomcat.
> By default I think it did ISO-8859-1, but it does
> respect the standard way of setting the char encoding:
>
>   request.setCharacterEncoding("UTF-8");
>
> Then, of course, one has to hope that the browsers do
> send stuff on the wire in UTF-8. It helps to set the
> page encoding in the HTTP headers (or in META-tags) but
> it provides no guarantee.

I don't remember having found it written anywhere, but browsers use the encoding of the HTML page that contains the <form> to encode the form content. (I don't know what to think about the W3C recommendation that says that UTF-8 should always be used...) So as long as you know what encoding the sending page uses, there is no problem. And usually you use the same encoding for all pages inside the same Web application, so you just say something like "everything is ISO-8859-2". And then ?url should use ISO-8859-2 as well. So ?url should use the output encoding, but since FreeMarker uses Writer-s instead of OutputStream-s, it doesn't know anything about the current output encoding. This is where we are stuck, and where the Web app framework should step in.
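
To make the charset dependence concrete, here is a minimal sketch using the JDK's own java.net.URLEncoder and java.net.URLDecoder. The class name UrlCharsetDemo and its two helper methods are made up for illustration, and URLEncoder does form-style (application/x-www-form-urlencoded) escaping, so treat it only as a stand-in for whatever ?url would do.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Illustrative only: shows that the escaped form of a string depends on the charset.
class UrlCharsetDemo {

    // Percent-encodes the text, converting non-ASCII characters to bytes in the given charset.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Resolves %XX escapes and interprets the resulting bytes in the given charset.
    static String decode(String text, String charset) {
        try {
            return URLDecoder.decode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "%6F" is the escaped form of "o", so both spellings name the same resource.
        System.out.println(decode("f%6Fo", "UTF-8"));          // foo

        String s = "\u0151"; // LATIN SMALL LETTER O WITH DOUBLE ACUTE
        System.out.println(encode(s, "ISO-8859-2"));           // %F5
        System.out.println(encode(s, "UTF-8"));                // %C5%91

        // Decoding with a charset other than the sender's does not restore the character.
        System.out.println(s.equals(decode("%F5", "UTF-8")));  // false
    }
}
```

The same character yields one escaped byte in ISO-8859-2 and two in UTF-8, which is why the receiving side has to know which charset the sending page used.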
But I wonder if we should introduce a url_character_encoding setting or something like that, which the framework can set; if that setting is not set, ?url will cause an error unless you explicitly specify the charset: ?url("ISO-8859-2").

> Or did I get sidetracked now?

Maybe a bit... but actually, the root of the whole problem is the same here: the HTTP protocol forgot to say anything about the encoding of URL-s...

-- 
Best regards,
 Daniel Dekany
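
The fallback rule proposed in this message (use a charset set by the framework, otherwise require an explicit one, otherwise fail) could be sketched roughly as below. UrlEscaper and its methods are hypothetical names invented for this sketch and are not FreeMarker API; java.net.URLEncoder is used only as a convenient stand-in for the actual escaping.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of FreeMarker: models the proposed setting-with-fallback rule.
class UrlEscaper {

    // Charset supplied by the Web app framework, or null if it never set one.
    private String defaultCharset;

    void setDefaultCharset(String charset) {
        this.defaultCharset = charset;
    }

    // Corresponds to a bare ?url: works only if the framework has set a charset.
    String escape(String text) {
        if (defaultCharset == null) {
            throw new IllegalStateException(
                    "No URL charset configured; specify one explicitly");
        }
        return escape(text, defaultCharset);
    }

    // Corresponds to ?url("ISO-8859-2"): the explicit charset always wins.
    String escape(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        UrlEscaper escaper = new UrlEscaper();
        System.out.println(escaper.escape("\u0151", "ISO-8859-2")); // %F5

        escaper.setDefaultCharset("UTF-8"); // what a framework might do at startup
        System.out.println(escaper.escape("\u0151"));               // %C5%91
    }
}
```

Failing loudly when no charset is known avoids silently producing URLs in an encoding the receiving side does not expect.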