
#1 Store HTML diffs, backup XML diffs

open
nobody
None
5
2004-11-02
2004-11-02
No

At the moment, when pages change (currently only
detected when URLs change) we store backwards diffs of
the XML files in scrapedxml/debates/*.diff*. These
need backing up.

We should probably also store and back up backwards
diffs of the HTML.
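
For illustration only, here is a minimal sketch of what a "backwards diff" could look like, using Python's standard difflib. The function name backward_diff and the file labels are hypothetical and not taken from the project's code; the point is just that the diff runs from the current version to the previous one, so the latest file plus the stored diffs are enough to recover older versions.

```python
import difflib


def backward_diff(old_text: str, new_text: str, name: str = "page.xml") -> str:
    """Return a unified diff that turns new_text back into old_text.

    Hypothetical helper: the diff is taken from the current version to
    the previous one, which is what makes it a "backwards" diff.
    """
    new_lines = new_text.splitlines(keepends=True)
    old_lines = old_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        new_lines,
        old_lines,
        fromfile=name + " (current)",
        tofile=name + " (previous)",
    )
    # An empty string means the two versions are identical.
    return "".join(diff)
```

The resulting text is in standard unified diff format, so in principle it could be applied to the current file with a tool such as patch to get the previous version back.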

Discussion

  • Matthew Somerville

    Logged In: YES
    user_id=202102

    All HTML page changes can now be detected, and the diffs are
    stored in cmpages/.

    Are the diffs backed up yet?

     
  • Francis Irving

    Francis Irving - 2005-01-23

    Logged In: YES
    user_id=91098

    At the moment we don't back up diffs at all. We should
    back up the XML diffs, but the HTML ones are less important.
    We need the XML files themselves for old gids, so that URLs
    don't break. We should back up both the diffs and the actual
    XML files.

    Valuable historic data like chgpages (all the old lists of
    ministerships etc.) is in CVS, hence backed up across the
    world via SF.

     
