From: Tavis R. <ta...@re...> - 2002-03-31 08:31:40
|
Hi,

I just started doing some work on Cheetah's output caching framework and decided it made more sense to approach this from a WebKit-wide perspective. Here are some rough design notes I've come up with so far. I'd appreciate any feedback.

PROBLEM:
There is no standardized way to cache output in WebKit. As a result, Webware developers have to start from scratch when designing and implementing a caching framework for their application, or they just ignore the benefits of caching.

PROPOSED SOLUTION:
Create a standardized output caching framework for WebKit (+Cheetah) that is easy to use, efficient, flexible, and extendable.

BACKGROUND NOTES:
* Cheetah currently allows a programmer to do fragment output caching of regions of a template or of the string return values of individual $placeholders. The #cache directive and the $*placeholder syntax are used to invoke caching.
* The cached output fragments for each Cheetah template are stored in the Template._cacheData instance variable, which is a simple dictionary. Thus each instance of a template has its own cache data. There is no sharing of this cache data between the multiple instances of a Template servlet that are created when Cheetah templates are loaded into WebKit.
* The cache items are stored in memory and are not persistent between processes.
* Cheetah's current cache implementation doesn't provide for 'specializing caches', which means you can't store a different value of a cached region for a set of special cases: 1 cache value per user, per browser type, per query string value, etc.

REQUIREMENTS:
* should not significantly slow down requests that don't use output caching
* no major changes should be required to Cheetah's syntax
* should work with full-page or page-fragment output caching
* the cache interface should be visible to servlets and to the application class
* should be possible to share cache items between servlets and between threads
* should provide a simple, standardized way to interact with browser and proxy ('gateway' and 'reverse') caches
* would be nice to provide some hooks to allow cached items to be shared between WebKit processes
* would be nice to allow caching full-page output to external files that are accessible directly from the webserver

IMPLEMENTATION PROPOSAL:
* create a singleton CacheStore class that handles the storage and retrieval of output cache items (which may be any Python object):

    class CacheStore:
        def store(self, cacheItem):
            ...
            return cacheKey

        def retrieve(self, cacheKey):
            ...
            return cacheItem

        def hasKey(self, cacheKey):
            ...  # returns True or False

        def invalidate(self, cacheKey):
            ...

* every cached item is assigned a string key that is guaranteed to be unique. This key is assigned by the CacheStore.
* the CacheStore will handle access synchronization
* the CacheStore is accessible via a simple import statement
* the application code is responsible for invalidating cacheItems
* the CacheStore stores the cacheItems in a dictionary attribute
* specialized subclasses of CacheStore can be used to store the cacheItems elsewhere (DB, shared memory, filesystem, ZODB, etc.) and allow cacheItems to be shared between processes. Or you could follow the pattern of SessionDynamicStore and keep the cache item in memory or push it to disk if it hasn't been accessed in a while.
* to allow for full-page output caching, servlet instances could have a special shouldCache() method that tells the Application class if, and for how long, the servlet's output should be cached. The Application class would maintain a mapping of URIs to cacheKeys. For each request it would check whether the URI was cached.

comments??
Tavis
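[Editor's sketch] The CacheStore interface proposed above could be filled in roughly as follows. This is only an illustration under stated assumptions, not code from Cheetah or WebKit: the key format ("cache-N"), the module-level name cacheStore, and the choice of a single threading.Lock are all hypothetical, and it is written for current Python 3 rather than the Python of 2002.

```python
import itertools
import threading


class CacheStore:
    """In-memory sketch of the proposed interface (details are assumptions)."""

    def __init__(self):
        self._items = {}                    # cacheKey -> cacheItem (the dictionary attribute)
        self._lock = threading.Lock()       # the store itself handles access synchronization
        self._counter = itertools.count(1)  # source of unique key suffixes

    def store(self, cacheItem):
        # The store, not the caller, assigns the unique string key.
        with self._lock:
            cacheKey = "cache-%d" % next(self._counter)
            self._items[cacheKey] = cacheItem
        return cacheKey

    def retrieve(self, cacheKey):
        # Raises KeyError if the key was never stored or has been invalidated.
        with self._lock:
            return self._items[cacheKey]

    def hasKey(self, cacheKey):
        with self._lock:
            return cacheKey in self._items

    def invalidate(self, cacheKey):
        # Invalidating an unknown key is a no-op in this sketch.
        with self._lock:
            self._items.pop(cacheKey, None)


# One shared instance per process; importing this name from the module
# gives every servlet the same store (hypothetical name).
cacheStore = CacheStore()
```

A servlet would then do something like key = cacheStore.store(output) once and cacheStore.retrieve(key) on later requests, with the application code calling cacheStore.invalidate(key) when the underlying data changes. A subclass could override the same four methods to keep items in a database or on disk.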
From: Chuck E. <ChuckEsterbrook@StockAlerts.com> - 2002-03-31 13:56:45
|
On Sunday 31 March 2002 01:46 am, Tavis Rudd wrote:
> PROBLEM
> There is no standardized way to cache output in WebKit. As a result,
> Webware developers have to start from scratch when designing and
> implementing caching framework for their application, or they just
> ignore the benefits of caching.

I use this kind of caching all the time, which I find very easy and a good "bang for the buck":

    def foo(self):
        if self._foo is None:
            self._foo = whatever()
        return self._foo

I therefore am paying attention to caching, but not necessarily designing and implementing a caching framework from scratch. I think your PROBLEM statement misrepresents WebKit development when it says that developers have to start from scratch or go without. I do neither.

I'm not saying that your caching framework ideas don't offer some bonuses over "caching in object attributes", just that the PROBLEM statement is too strong.

In any case, good luck. It will be interesting to see if you and Terrel collaborate on this. If you haven't read this yet, you should:

http://jaguar.sourceforge.net/cgi-bin/wiki/DependencyManagementService

I'm certainly open to tweaking and refactoring WebKit to support a good CacheKit.

-Chuck
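[Editor's sketch] For readers who want to run the attribute-caching pattern quoted above, here is a self-contained version. The class name Report, the computeCount counter, and the body of whatever() are hypothetical stand-ins added for illustration; only the shape of foo() comes from the message.

```python
class Report:
    """Caches an expensive result in an instance attribute."""

    def __init__(self):
        self._foo = None        # None marks "not computed yet"
        self.computeCount = 0   # illustration only: counts real computations

    def whatever(self):
        # Stand-in for an expensive computation (hypothetical).
        self.computeCount += 1
        return "expensive result"

    def foo(self):
        # Compute on first use, then reuse the stored value.
        if self._foo is None:
            self._foo = self.whatever()
        return self._foo


report = Report()
first = report.foo()
second = report.foo()   # served from self._foo; whatever() is not called again
```

Two caveats follow directly from the code: the cached value lives on the instance, so each instance computes and holds its own copy, and a legitimate result of None would be recomputed on every call because None doubles as the "empty" marker.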
From: Tavis R. <ta...@re...> - 2002-03-31 18:16:51
|
On Sunday 31 March 2002 05:56, Chuck Esterbrook wrote:
> On Sunday 31 March 2002 01:46 am, Tavis Rudd wrote:
> > PROBLEM
> > There is no standardized way to cache output in WebKit. As a
> > result, Webware developers have to start from scratch when
> > designing and implementing caching framework for their
> > application, or they just ignore the benefits of caching.
>
> I use this kind of caching all the time, which I find very easy and
> a good "bang for the buck":
>
> def foo(self):
>     if self._foo is None:
>         self._foo = whatever()
>     return self._foo

That's great for partial output caching, but not necessarily for full-page output caching. You can do full pages this way, but it's not very efficient. WebKit still has to do a fair bit of processing just to get to the servlet, and this approach stores 10 or more copies of the page output in memory because of the number of live servlet instances. What I'm proposing would shortcut the URI-to-servlet mapping process and store only a single copy.

Yeah, that PROBLEM statement is a bit strong. How about this instead?

There is no standardized way to cache **full-page** output in WebKit. As a result, Webware developers have to start from scratch when designing and implementing a caching framework for their application, or they just ignore the benefits of caching.
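[Editor's sketch] The full-page shortcut described in this reply (look the URI up before locating a servlet, and keep a single copy of the output) might look roughly like this. Everything here is an assumption made for illustration: PageCache, dispatch(), HelloServlet, respond(), and the convention that shouldCache() returns a number of seconds (or None for "don't cache") are hypothetical and are not WebKit's actual Application or servlet API.

```python
import time


class PageCache:
    """Single application-level map of URI -> (expiry time, page output)."""

    def __init__(self, clock=time.monotonic):
        self._pages = {}
        self._clock = clock   # injectable so expiry can be tested

    def get(self, uri):
        entry = self._pages.get(uri)
        if entry is None:
            return None
        expires, output = entry
        if self._clock() >= expires:
            del self._pages[uri]   # expired: drop it and report a miss
            return None
        return output

    def put(self, uri, output, seconds):
        self._pages[uri] = (self._clock() + seconds, output)


class HelloServlet:
    renderCount = 0   # illustration only: counts real page renders

    def shouldCache(self):
        return 60     # cache for 60 seconds; None would mean "don't cache"

    def respond(self):
        HelloServlet.renderCount += 1
        return "<html>hello</html>"


def dispatch(uri, pageCache, servletFactory):
    cached = pageCache.get(uri)
    if cached is not None:
        return cached              # hit: no URI-to-servlet mapping at all
    servlet = servletFactory(uri)  # miss: locate the servlet as usual
    output = servlet.respond()
    seconds = servlet.shouldCache()
    if seconds:
        pageCache.put(uri, output, seconds)
    return output


pageCache = PageCache()
first = dispatch("/hello", pageCache, lambda uri: HelloServlet())
second = dispatch("/hello", pageCache, lambda uri: HelloServlet())
```

In this sketch the second request never constructs or looks up a servlet, and the output is held once in pageCache no matter how many servlet instances are alive. The sketch leaves out locking; a threaded app server would need to guard the dictionary.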
From: Terrel S. <tsh...@uc...> - 2002-04-09 01:31:08
|
On Sun, 2002-03-31 at 05:56, Chuck Esterbrook wrote:
> I'm not saying that your caching framework ideas don't offer some
> bonuses over "caching in object attributes", just that the PROBLEM
> statement is too strong.
>
> In any case, good luck. It will be interesting to see if you and Terrel
> collaborate on this. If you haven't read this yet, you should:
>
> http://jaguar.sourceforge.net/cgi-bin/wiki/DependencyManagementService
>

Thanks for the plug, Chuck.

Tavis, I created the page mentioned by Chuck when we first started discussing the caching strategy in Cheetah about a year ago. It has been sitting fairly dormant since then, until about a month ago when I started kicking it around a bit. Anyone interested in discussing this can hang out on http://jaguar.sourceforge.net/cgi-bin/wiki/ and/or join the jaguar-devel mailing list.

>
> I'm certainly open to tweaking and refactoring WebKit to support a good
> CacheKit.
>

Cool. I think my original plan would be a fairly invasive refactoring, but a solid regression test suite and a lot of benchmarks should make it palatable.

Now for some code to back it up .... 8-)

--
Terrel

p.s. sorry about the slow response.