[Rest2web-develop] Page Caching
From: Michael F. <fuz...@vo...> - 2006-04-09 22:02:16
Hello all,

Thanks to an email from a new user, I've been inspired to implement 'page-caching' a bit earlier than anticipated - it's jumped the queue. :-)

It's not trivial to implement, but not really that hard. I've 'hashed out' a spec, see below.

This pickles all page data as it builds and saves an MD5 hash for each page, as well as a hash of the template in use and the breadcrumbs. When building a page, if all three are identical, then it *doesn't* build the page (it is unchanged), and it can just copy data from the pickle into its own data structures (which other pages may access).

If the page hasn't been built before, or has changed, then it saves the data into a pickle. This is saved out at the end.

I don't know how quickly I'll be able to implement this. It's not really that difficult. Testing will be a bugger of course.

The Hash Format
===============

Pickle each page along with its breadcrumbs.

Also pickle a hash of the contents and a hash for the template in use.

If any page in a directory changes then the index file must be regenerated [#]_.

For each directory we must pickle the list of pages and the list of sub-directories. If this has changed then that directory is marked as changed (attribute on the Processor - reset when a new directory is entered).

So enter a new page. First check if it has a pickle (this may be the first run through). Hash the contents and compare. If changed, generate the page and save the hash.

If not changed, compare the stored hash of the template in use to see if that has changed. Do the same with the breadcrumbs.

If they all check out, then instead of generating the page the relevant stuff can just be copied from the pickle.

If one of these *doesn't* check out, then the page must be generated and the correct data saved into the pickle.

If a page changes then the directory marker needs to be set. Index pages need to check this, and force regeneration if anything has changed. Otherwise they follow the same procedure.
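The per-page check described above could be sketched roughly like this. It is only an illustration under assumptions, not rest2web code: the function names (``md5_hex``, ``page_hashes``, ``process_page``), the ``build`` callable and the keys of the cached entry are all made up for the example, and it uses the modern ``hashlib`` module for MD5.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 hex digest of a string (hypothetical helper)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def page_hashes(content, template, breadcrumbs):
    """Hash the three things the spec compares: contents, template, breadcrumbs."""
    return {
        "content_hash": md5_hex(content),
        "template_hash": md5_hex(template),
        # Assumption: breadcrumbs are hashed via their repr().
        "breadcrumbs_hash": md5_hex(repr(breadcrumbs)),
    }


def process_page(cache, name, content, template, breadcrumbs, build):
    """Return (page_data, rebuilt).

    `cache` maps page names to previously stored entries (the unpickled data).
    `build` is a hypothetical callable that actually generates the page.
    """
    current = page_hashes(content, template, breadcrumbs)
    cached = cache.get(name)  # None on the first run through
    if cached is not None and cached["hashes"] == current:
        # All three hashes check out: reuse the stored data, skip the build.
        return cached["data"], False
    # New or changed page: generate it and store fresh hashes and data.
    data = build(content, template, breadcrumbs)
    cache[name] = {"hashes": current, "data": data}
    return data, True
```

A caller would set the directory's changed marker whenever ``rebuilt`` comes back true, so that the index page knows it has to regenerate.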
When processing a directory, if there are deleted entries, these need to be removed from the pickle and the changed marker set.

---

The hash is kept as ``self.hash_data``. You can test whether it is in use with ``if self.hash:`` (it will be ``None`` if hashing is not in use).

The structure is a dictionary with an entry per directory. (This means that it is not a tree.) The keys are the values used for ``self.dir_as_url``.

Each directory has the following values:

* index - an entry representing the index page
* pages - a dictionary (keyed by page name) representing each page
* dir_list - a list of the files in the directory [#]_

FIXME: How do we handle directories with ``__prune__``? Just skip them and leave their pickle intact, I guess.

.. [#] A later addition will allow you to specify whether a whole directory must be rebuilt if a file in a directory changes.

.. [#] This means that any extra files in the directory will cause the index page to be rebuilt.
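To make the layout concrete, here is a small sketch of the per-directory structure and the deleted-entry rule. Only the ``index``, ``pages`` and ``dir_list`` keys come from the spec; the ``"docs/"`` key, the file names, the empty page entries and the ``prune_deleted`` helper are assumptions for illustration, as is treating page names as file names.

```python
def prune_deleted(hash_data, dir_as_url, current_files):
    """Remove cached pages that no longer exist in the directory.

    Returns True if the directory should be marked as changed.
    """
    entry = hash_data.setdefault(
        dir_as_url, {"index": None, "pages": {}, "dir_list": []}
    )
    changed = False
    # Copy the keys first, since we delete from the dict while looping.
    for name in list(entry["pages"]):
        if name not in current_files:
            del entry["pages"][name]
            changed = True
    # Any difference in the file list also marks the directory as changed.
    new_list = sorted(current_files)
    if entry["dir_list"] != new_list:
        entry["dir_list"] = new_list
        changed = True
    return changed


# One flat dictionary, keyed by directory URL (not a tree).
hash_data = {
    "docs/": {
        "index": {},
        "pages": {"intro.txt": {}, "old.txt": {}},
        "dir_list": ["index.txt", "intro.txt", "old.txt"],
    }
}

# "old.txt" has been deleted from disk since the last build.
changed = prune_deleted(hash_data, "docs/", ["index.txt", "intro.txt"])
```

After the call, the ``old.txt`` entry is gone from ``pages`` and ``changed`` is true, which is what would force the index page for that directory to be regenerated.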