From: Zoran V. <zv...@ar...> - 2006-08-14 11:52:00
On 14.08.2006, at 12:53, Michael Lex wrote:

> I think you get Bernd wrong: The problem was that Bernd wanted
> naviserver to return the content in iso-8859-1 encoding. So the number
> of bytes and the number of characters should be equal.
> The Content-Length has to be the number of bytes returned, but
> naviserver computed the value with string bytelength of a utf-8
> string, which was, in Bernd's case, greater than the bytelength of
> the iso8859-1 string.

I believe the best way is to peek at the standard (RFC 2616):

    14.13 Content-Length

       The Content-Length entity-header field indicates the size of the
       entity-body, in decimal number of OCTETs, sent to the recipient or,
       in the case of the HEAD method, the size of the entity-body that
       would have been sent had the request been a GET.

           Content-Length    = "Content-Length" ":" 1*DIGIT

       An example is

           Content-Length: 3495

       Applications SHOULD use this field to indicate the transfer-length
       of the message-body, unless this is prohibited by the rules in
       section 4.4.

This all means that content-length gives the total number of *bytes*
in the response body as it goes over the wire, that is, after the
output encoding has been applied. This also means that for the UTF8
encoded string "mü" it will be 3 and not 2. If "mü" is sent as
ISO8859-1, then the content length would be 2.

All right. I think I get it now. If this is so, then we cannot
possibly give the correct content-length UNLESS we apply the encoding
BEFORE sending any headers and body. We would have to either give the
correct value in the content-length header OR OMIT the content-length
and turn off keepalive for that response.

> So it seems that chunked encoding is the best possible solution. But
> as Gustav said, chunked transfer-encoding is only part of HTTP/1.1 and
> some clients don't understand it.

Yes, chunked encoding seems feasible there.
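To make the chunked idea concrete, here is a minimal sketch in Python (not NaviServer's C code). It assumes the body is available as text pieces. Each piece is converted to the output encoding on its own and framed with its own byte count, so the total length never has to be known before the headers go out. The name `chunked_frames` is hypothetical and is not a NaviServer API.

```python
def chunked_frames(pieces, encoding):
    """Yield HTTP/1.1 chunked transfer-coding frames for text pieces.

    Hypothetical helper for illustration only; not NaviServer code.
    """
    for piece in pieces:
        data = piece.encode(encoding)  # convert this piece only
        if not data:
            continue  # a zero-size chunk would end the body early
        # chunk-size is the number of octets, written in hexadecimal
        yield b"%x\r\n" % len(data) + data + b"\r\n"
    yield b"0\r\n\r\n"  # last-chunk, no trailers


# "m\u00fc" is the "mü" string from the discussion above
latin1 = b"".join(chunked_frames(["m\u00fc"], "iso-8859-1"))
utf8 = b"".join(chunked_frames(["m\u00fc"], "utf-8"))
```

With ISO-8859-1 the single chunk announces 2 octets, with UTF-8 it announces 3, which is the same difference that made the precomputed content-length wrong.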
For clients not supporting chunked responses, we could convert the
entire message beforehand, burning some memory and cycles. As there
are only a few of them out there, this may not be of much importance
anyway. OK, this makes sense.

> Btw: Aolserver doesn't encode "on-the-fly", but in memory. So they
> know the content-length before the content is sent to the recipient.

By "on the fly" I mean that the message is not encoded in its
*entirety* beforehand; rather, it is converted piece by piece (hence
on-the-fly) in Ns_ConnWriteVChars().

So, what do we have now?

A. For HTTP 1.0 clients only, we could/should/must either:

   a. omit content-length and turn keepalive off, leaving the
      browser to drain the connection until EOF.

   b. calculate the content-length in advance by performing the
      conversion of the message in its entirety in memory,
      using the given output encoding.

B. For HTTP 1.1 clients we can turn on chunked encoding if the
   output encoding is specified and is not UTF8 (basically,
   this is what Bernd's workaround does).

Is this right? Are there any other options we may have?

Zoran

> Michael
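The A/B decision above can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name `plan_response`, the `buffer_http10` switch and the tuple form of the HTTP version are all hypothetical, and the sketch assumes that text without an explicit output encoding goes out as UTF-8. It is not how NaviServer implements this.

```python
def plan_response(http_version, body, encoding=None, buffer_http10=True):
    """Return (headers, payload) following options A.a, A.b and B.

    Hypothetical sketch; the names and the buffer_http10 switch are
    assumptions, not NaviServer settings. A payload of None means the
    body is converted and written later, piece by piece.
    """
    needs_conversion = (
        encoding is not None and encoding.lower() not in ("utf-8", "utf8")
    )
    out_encoding = encoding if needs_conversion else "utf-8"

    if needs_conversion and http_version == (1, 1):
        # B: chunked encoding, no total length needed up front
        return {"Transfer-Encoding": "chunked"}, None
    if needs_conversion and not buffer_http10:
        # A.a: no Content-Length, no keepalive, client reads until EOF
        return {"Connection": "close"}, None
    # A.b (or no conversion needed): convert everything, count the bytes
    payload = body.encode(out_encoding)
    return {"Content-Length": str(len(payload))}, payload


headers_11, _ = plan_response((1, 1), "m\u00fc", "iso-8859-1")
headers_10, payload_10 = plan_response((1, 0), "m\u00fc", "iso-8859-1")
```

Option A.b trades memory for a correct header: the whole converted body is held in `payload` before anything is sent, which is the cost mentioned above for clients that cannot read chunked responses.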