
#3 Rescrape headers now and again to look for changed files

open
nobody
None
5
2004-12-03
2004-12-03
Anonymous
No

If a day's page moves to a new URL, we currently do
rescrape it and update (this happens when daily editions
are bundled into volumes). However, we do not
speculatively recrawl all pages looking for changes.

We should do this every now and again for the entire
backlog of pages, and only update a page if the date
in its HTTP header (presumably Last-Modified) has
changed. An HTTP HEAD request returns just the headers
without the body, so the check itself can be cheap.
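
A minimal sketch of the header check, in Python using only the standard library. It assumes the date in question is the Last-Modified header and that the scraper stores the value seen at the last scrape; the function names fetch_last_modified and has_changed are hypothetical and not taken from the existing scraper.

```python
import urllib.request
from email.utils import parsedate_to_datetime


def fetch_last_modified(url, timeout=10):
    """Send a HEAD request and return the Last-Modified header, or None.

    Hypothetical helper: a HEAD request asks the server for the
    response headers only, so no page body is downloaded.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Last-Modified")


def has_changed(stored, current):
    """Return True if the page should be rescraped.

    stored  -- Last-Modified value recorded at the previous scrape
    current -- Last-Modified value from a fresh HEAD request
    A missing value on either side is treated as "changed", since
    we cannot prove the page is the same.
    """
    if stored is None or current is None:
        return True
    try:
        return parsedate_to_datetime(current) != parsedate_to_datetime(stored)
    except (TypeError, ValueError):
        # Unparseable dates: fall back to comparing the raw strings.
        return current != stored
```

A periodic job could call fetch_last_modified for each URL in the backlog and rescrape only when has_changed returns True. Not every server sends Last-Modified, which is why a missing header is treated as a change here rather than silently skipped.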

Discussion

