Hello all,
Thanks to an email from a new user, I've been inspired to implement
'page-caching' a bit earlier than anticipated - it's jumped the queue. :-)
It's not trivial to implement, but not really that hard.
I've 'hashed out' a spec, see below. This pickles all page data as it
builds and saves an MD5 hash for each page, as well as a hash of the
template in use and the breadcrumbs.
When building a page, if all three of these are identical, then it
*doesn't* build the page (it is unchanged), and it can just copy data
from the pickle into its own data structures (which other pages may
access).
If the page hasn't been built before, or has changed, then it builds the
page and saves the data into the pickle. The pickle is written out at the
end of the build.
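As a rough illustration of the check described above, here is a minimal sketch. Nothing in it is the real implementation: the names ``md5_hex``, ``page_hashes`` and ``get_page``, the layout of a cache entry, and the ``build`` callable are all assumptions made for the example.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 hex digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def page_hashes(content, template, breadcrumbs):
    """Compute the three hashes that decide whether a page is rebuilt."""
    return {
        "content": md5_hex(content),
        "template": md5_hex(template),
        # repr() gives a stable string for a list of (url, title) tuples
        "breadcrumbs": md5_hex(repr(breadcrumbs)),
    }


def get_page(name, content, template, breadcrumbs, cache, build):
    """Return (page_data, rebuilt), reusing cached data where possible.

    cache maps page name -> {"hashes": ..., "data": ...} (assumed layout).
    build is a hypothetical callable doing the expensive page generation.
    """
    hashes = page_hashes(content, template, breadcrumbs)
    entry = cache.get(name)
    if entry is not None and entry["hashes"] == hashes:
        # All three hashes match: copy from the cache, skip the build.
        return entry["data"], False
    # First run, or something changed: build and store the fresh data.
    data = build(content, template, breadcrumbs)
    cache[name] = {"hashes": hashes, "data": data}
    return data, True
```

Calling ``get_page`` twice with the same inputs should only invoke ``build`` once; changing the template string alone is enough to force a rebuild.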
I don't know how quickly I'll be able to implement this. It's not really
that difficult. Testing will be a bugger of course.
The Hash Format
Pickle each page along with its breadcrumbs.
Also pickle a hash of the contents and a hash of the template in use.
If any page in a directory changes then the index file must be
regenerated [#]_.
For each directory we must pickle the list of pages and the list of
sub-directories.
If this has changed then that directory is marked as changed (an attribute
on the Processor, reset when a new directory is entered).
So enter a new page.
First check if it has a pickle (this may be the first run through).
Hash the contents and compare. If changed, generate the page and save the
hash.
If not changed, we need to compare the hash of the template in use, to see
if that has changed.
Do the same with the breadcrumbs.
If they all check out then instead of generating the page - the relevant
stuff can just be copied from the pickle.
If one of these *doesn't* check out, then the page must be generated, and
the correct data saved into the pickle.
If a page changes - then the directory marker needs to be set.
Index pages need to check this, and force regeneration if anything has
changed.
Otherwise they follow the same procedure.
When processing a directory - if there are deleted entries, these need to be
removed from the pickle and the changed marker set.
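The directory-level bookkeeping above could look something like the following sketch. The function names, the ``dir_entry`` layout and the ``page_changed`` callable (standing in for the per-page hash check) are hypothetical, chosen only to make the example self-contained.

```python
def prune_deleted(dir_entry, current_pages):
    """Drop cached pages that no longer exist on disk.

    Returns True if anything was removed.
    """
    deleted = [name for name in dir_entry["pages"] if name not in current_pages]
    for name in deleted:
        del dir_entry["pages"][name]
    return bool(deleted)


def process_directory(dir_entry, current_pages, page_changed):
    """Return True if the directory's index page must be regenerated.

    page_changed is a callable (name -> bool); it is assumed to return
    True for pages that are new or whose hashes no longer match.
    """
    # Deleted entries set the changed marker straight away.
    changed = prune_deleted(dir_entry, current_pages)
    # Every page is still visited, even once the marker is already set.
    for name in current_pages:
        if page_changed(name):
            changed = True
    return changed
```

The marker starts fresh on each call, which mirrors the idea of resetting it whenever a new directory is entered.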
---
The hash is kept as ``self.hash_data``.
You can test whether it is in use with ``if self.hash:`` (it will be
``None`` if hashing is not in use).
The structure is a dictionary with an entry per directory. (This means
that it is not a tree.)
The keys are the values used for ``self.dir_as_url``.
Each directory has the following values:
index - an entry representing the index page
pages - a dictionary (keyed by page name) representing each page
dir_list - a list of the files in the directory [#]_
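To make the shape concrete, here is a small sketch of what such a dictionary might look like and how it survives a pickle round trip. The directory URLs, file names and the fields inside each page entry (``content_hash``, ``template_hash``) are invented for illustration; only the three top-level keys per directory come from the description above.

```python
import pickle

# Keys stand in for the values used for self.dir_as_url (made-up URLs).
hash_data = {
    "/": {
        "index": {"content_hash": "0" * 32, "template_hash": "1" * 32},
        "pages": {
            "about.html": {"content_hash": "2" * 32, "template_hash": "1" * 32},
        },
        "dir_list": ["index.txt", "about.txt", "logo.png"],
    },
    "/docs/": {
        "index": {"content_hash": "3" * 32, "template_hash": "1" * 32},
        "pages": {},
        "dir_list": ["index.txt"],
    },
}

# Flat, not a tree: a sub-directory is looked up directly by its own key.
docs = hash_data["/docs/"]

# Round-trip through pickle, as would happen between two builds.
restored = pickle.loads(pickle.dumps(hash_data))
```

Keeping the structure flat means a lookup never has to walk parent directories, at the cost of repeating the path prefix in every key.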
FIXME: How do we handle directories with ``__prune__``? Just skip them and
leave their pickle intact, I guess.
.. [#] A later addition will allow you to specify whether a whole directory
   must be rebuilt if a file in a directory changes.
.. [#] This means that any extra files in the directory will cause the
   index page to be rebuilt.