From: Gilles D. <gr...@sc...> - 2002-09-05 20:54:46
According to Gabriele Bartolini:
> Hi Romain,
>
> as far as I know, htdig doesn't support it yet, but you could
> easily hack the code to make it work. I have something to complain about
> this way of negotiating a request by the CMS, because HTTP says that when
> no Accept is given, every media type is accepted by the client, but ...
> it's ok.
>
> However, I think this is a good point to analyse for the 3.2 code. We
> should somehow let the Web server know what kind of media types htdig is
> able to understand, by listing all of them (default ones plus those
> managed through external parsers' help).
>
> What d'u think guys?

Well, I certainly don't have a problem with htdig 3.2 having support for
the Accept header in its requests. In fact, it does sound like a good idea.

However, Romain's web site is broken! htdig 3.1.5 is an HTTP/1.0 client,
and in RFC 1945, which defines the HTTP/1.0 protocol, the Accept request
header is only mentioned in an appendix, where it states that this
"... header field can be used to indicate a list..." (Note: can be used,
not MUST be used!) I.e. this is not to be treated as a required header,
and many HTTP/1.0 clients will not put out this header. Any server that
requires this of an HTTP/1.0 client is broken.

Even RFC 2068, which defines HTTP/1.1, says "... can be used ...", and also
"If no Accept header field is present, then it is assumed that the client
accepts all media types." If a web site cannot render content properly
without the Accept header, it is not compliant with this standard.

Fixing htdig to work around this bug may allow htdig to index the site,
but it won't prevent problems with other standards-compliant web clients
navigating this site, if they happen not to put out this header either.
Workarounds for bugs like this should be a last resort, when it's
impossible to fix the real problem, and not a first resort to avoid even
attempting to get at the problem.

> On Wed, 2002-08-28 at 15:14, rl...@bn... wrote:
> > I want to index my web site using htdig.
> >
> > However, my web site, using a CMS, needs the "Accept" HTTP header in
> > order to render the dynamic content properly.
> >
> > htdig does not send this header.
> >
> > How can I define custom HTTP headers for the robot:
> > using htdig.conf?
> > modifying the source code?
> >
> > PS:
> > I am using a compiled htdig v3.1.5 on an AIX v4.3 box

-- 
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
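
To make the two sides of this concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names `effective_accept` and `build_request` are hypothetical and come from neither htdig nor any CMS. The first function shows the server-side default that the RFC 2068 sentence quoted above describes (no Accept header means all media types are acceptable); the second shows a client request in which Accept is optional, which is roughly what the workaround in a crawler would amount to.

```python
def effective_accept(headers):
    """Return the Accept value a server could act on.

    Assumption: `headers` is a plain dict of request header names to values.
    A missing Accept header is treated as "*/*", following the RFC 2068
    statement that the client is then assumed to accept all media types.
    """
    for name, value in headers.items():
        # Header field names are case-insensitive in HTTP.
        if name.lower() == "accept":
            return value
    return "*/*"


def build_request(path, host, accept=None):
    """Build a minimal HTTP/1.0 GET request as a string.

    The Accept line is only emitted when the caller supplies a value;
    a request without it is still a valid HTTP/1.0 request.
    """
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}"]
    if accept is not None:
        lines.append(f"Accept: {accept}")
    # A blank line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"
```

With this sketch, a request built as `build_request("/", "example.com")` carries no Accept line, and a server using `effective_accept` would still serve it as if the client had sent `Accept: */*`. Fixing the server side this way helps every client that omits the header, whereas adding the header in the crawler only helps the crawler.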