From: Geoff H. <ghu...@ws...> - 2002-02-09 03:57:42
At 2:53 PM -0500 2/8/02, William R. Knox wrote:
>In parsedcdate, assume that unqualified dates are at noon instead of
>midnight. If no one is ever more than 12 hours plus or minus UTC, this is
>actually a very easy hack, er, solution.

I seemed to think that this was probably true but figured I'd check
an official timezone map to be sure.

<http://aa.usno.navy.mil/faq/docs/world_tzones.html>

Interestingly, at least the US Navy has two zones at +13 and +14 hours
respectively. Admittedly these are some small islands out in the
Pacific and I don't know how many servers are running out there.

>This seems like the best course of action, though it has the
>disadvantage of increasing the size of the database for all files,
>even though only a limited number use the additional date
>information. I suppose the meta date info could only be populated if
>the file has it, though.

Certainly a META "date" field would only be populated if the document
has it. OTOH, there's a limited amount of overhead for merely adding
a record to each DocumentRef, as it needs to distinguish between each
bit of information stored for a given document.

>The additional advantage of this is that, currently, if the meta tag
>on a file stays the same, I don't think it ever gets reindexed - however,
>the meta tag could stay the same and the contents could change.

No, I really doubt this. It sends the date it has in the database in
an If-Modified-Since request to the server as well as comparing it to
the date returned by the server. If the server doesn't return a date,
it uses the current time on the indexer. The If-Modified-Since test is
only conceivably a problem if someone adds a META date tag with a
time in the *future*.
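
To make the future-date case concrete, here is a rough sketch. It is
not ht://Dig code: serverSendsBody, wouldReindex and
demonstrateFutureMetaDate are made-up names, times are plain seconds,
and the "server" is assumed to do nothing more than compare the two
dates. A real server may well treat a future If-Modified-Since date
differently, for instance by ignoring it.

```cpp
#include <cassert>
#include <ctime>

// Hypothetical stand-in for a server handling If-Modified-Since: the body
// is sent only when the file changed after the date the client supplied.
bool serverSendsBody(std::time_t lastModified, std::time_t ifModifiedSince)
{
    return lastModified > ifModifiedSince;
}

// Hypothetical stand-in for the indexer: the stored date is sent as
// If-Modified-Since, and the document is reindexed only if a body comes back.
bool wouldReindex(std::time_t storedDate, std::time_t fileChangedAt)
{
    return serverSendsBody(fileChangedAt, storedDate);
}

// Walks through both cases with made-up timestamps.
void demonstrateFutureMetaDate()
{
    const std::time_t day = 24 * 60 * 60;
    const std::time_t lastIndexed = 1000 * day;
    const std::time_t edited = lastIndexed + day;  // file edited a day later

    // Stored date is in the past: the later edit is picked up.
    assert(wouldReindex(lastIndexed, edited));

    // Stored date came from a META tag a year ahead: the edit is missed
    // until the real modification time passes the META date.
    const std::time_t futureMeta = lastIndexed + 365 * day;
    assert(!wouldReindex(futureMeta, edited));
}
```

With a stored date at or before the real modification time the edit
gets through; only a stored date later than the edit hides it.
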
The test by htdig is even more restrictive:

    if (doc->ModTime() == ref->DocTime()) // retrieved but not changed

So if the server ignores the If-Modified-Since and sends the document
anyway, it'll only be ignored by ht://Dig if the time is *exactly*
the same, which is pretty unlikely if the document itself was changed.
Setting the document date to the META date only happens after parsing
occurs and no additional checking is performed.

-Geoff