From: James HK <jam...@gm...> - 2012-07-26 18:41:24
|
Hi,

Well, let me just hook into this discussion, since we have been running a query cache solution for the last two months based on the following:

## Storage engine

We do not use any database table but instead rely on an available caching engine: either APC, memcached, or MW's own object cache.

## Query uniqueness

Before a getQueryResult() is executed, a hash key is generated from

$query->getQueryString() . '#' . $query->getLimit() . '#' . $query->getOffset() . '#' . serialize( $printouts ) . '#' . serialize( $query->sortkeys )

which gives enough depth to ensure comparability among queries.

## Associated objects

While $res->getResults is stored as a single cache object (the simplest of all operations), the more important role goes to associated objects. Associated objects are individual entities (page, property, etc.) that are part of the result set and the condition, each stored as a separate cache object. Each associated object uses its own md5 hash key, which makes it easy to track any object with the same key during an update process. This allows us to build a chain between objects and the queries that requested their involvement.

During each update process (onChangeTitle, onUpdateDataBefore, onDelete, etc.) an object and its hash key (a cheap operation due to the 1:1 relation) is checked against the cache pool; if a cache object exists, the stored array of hash keys points to all queries involving that object. When one of these objects is detected during the update process, the affected query results are purged from the cache ($cache->delete( $row )). After a change, a result is rebuilt from scratch: we do not compare what actually changed, we simply assume a change happened, because the risk of serving an invalid result outweighs the cost of building a new result set.
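The hash-key construction described above can be sketched in Python, using pickle in place of PHP's serialize(); the function name and parameters are illustrative, not the actual SMW code:

```python
import hashlib
import pickle


def query_cache_key(query_string, limit, offset, printouts, sortkeys):
    """Build a cache key from the query string, limit, offset, and the
    serialized printouts and sort keys, mirroring the '#'-joined
    concatenation described above."""
    raw = "#".join([
        query_string,
        str(limit),
        str(offset),
        pickle.dumps(printouts).hex(),
        pickle.dumps(sortkeys).hex(),
    ])
    # md5 keeps the key short and constant-length for the cache backend
    return hashlib.md5(raw.encode("utf-8")).hexdigest()
```

Two textually identical queries with the same limit, offset, printouts, and sort order map to the same key, while any difference in one of those components yields a different key.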
Since we only use in-memory cache objects, we do not have to care about synchronizing table objects; we simply use the hash key as comparator, invalidator, and chaining object.

## Result

The greatest benefit emerges for query results that change only occasionally; for queries with a high turnover due to high velocity on their associated objects, there has been no measurable downside (we use APC or memcached rather than a database).

Cheers, mwjames |
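The object-to-query chaining and purge-on-update flow described in this mail can be sketched as follows; a plain dict stands in for APC/memcached, and the class and method names are hypothetical rather than SMW's actual API:

```python
import hashlib


class QueryCache:
    """Minimal sketch of the chaining scheme: each query result is stored
    under its query hash, and every associated object (page, property, ...)
    stores the set of query hashes it is involved in."""

    def __init__(self):
        self.store = {}  # stands in for APC / memcached / MW object cache

    @staticmethod
    def object_key(title):
        # 1:1 mapping from an object to its md5 hash key
        return hashlib.md5(title.encode("utf-8")).hexdigest()

    def cache_result(self, query_hash, result, associated_titles):
        self.store[query_hash] = result
        # chain each associated object back to the query that used it
        for title in associated_titles:
            okey = self.object_key(title)
            self.store.setdefault(okey, set()).add(query_hash)

    def on_update(self, title):
        """Called from update hooks (onChangeTitle, onDelete, ...): purge
        every query result that involved this object, so it is rebuilt
        from scratch on the next request."""
        okey = self.object_key(title)
        for query_hash in self.store.pop(okey, set()):
            self.store.pop(query_hash, None)
```

An update to any page that appears in a cached result deletes all query results chained to it, without comparing what actually changed.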
From: Jeroen De D. <jer...@gm...> - 2012-07-26 19:12:41
|
Hey James,

This implementation of yours is great, but there are some important differences from having full query management functionality.

Your solution is great because:

* It's very simple
* It has huge benefits for API and Special:Ask queries

While the thing I'm proposing:

* Is complex
* Does not benefit the API and Special:Ask (since it would consciously ignore these to avoid doing unneeded work)

But then again, it solves a different problem than your change:

* it fixes the existing persistent MediaWiki cache for articles to get invalidated at the correct points
* and it persistently caches results for inline queries

So these two caching solutions can live happily next to each other. Once the new one has been implemented, I suggest disabling the current one for inline queries but keeping it for all the rest.

Cheers

--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil.
-- |
From: <jmc...@hy...> - 2012-07-26 20:59:11
|
Hi James and Jeroen,

Would you please explain what is being gained by query caching? It seems to me that simple transclusion of pages stored on wikidata would be just as effective and would require no additional code.

Thanks for your reply - john

On 26.07.2012 12:12, Jeroen De Dauw wrote:
> Hey James,
>
> This implementation of you is great, but there are some important differences with having full query management functionality.
>
> Your solutions is great because:
>
> * It's very simple
> * It has huge benefits for API and Special:Ask queries
>
> While the thing I'm proposing
>
> * Is complex
> * Does not benefit the API and Special:Ask (since it would consciously ignore these to avoid doing not needed work)
>
> But then again, it solves another problem then your change:
>
> * it fixes the existing persistent MediaWiki cache for articles to get invalidated at the correct points
> * and it persistently caches results for inline queries
>
> So these two caching solutions can live happily next to each other. Once the new one has been implemented I suggest disabling the current one for inline queries, but keeping it for all the rest.
>
> Cheers
>
> --
> Jeroen De Dauw
> http://www.bn2vs.com [1]
> Don't panic. Don't be evil.
> --

Links:
------
[1] http://www.bn2vs.com |
From: Jeroen De D. <jer...@gm...> - 2012-07-26 21:10:49
|
Hey,

> Would you pleaes explain what is being gained by query caching?

https://en.wikipedia.org/wiki/Cache_%28computing%29

> It seems to me that simple transclusion of pages stored on wikidata would be just as effective, would require no additional code.

* This has nothing to do with Wikidata
* This has nothing to do with transclusion of pages
* This definitely requires code to work

Cheers

--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil.
-- |
From: <jmc...@hy...> - 2012-07-27 04:12:11
|
I'm sorry, I meant to say that Concepts cache queries, so it suggests there's a mechanism driving the creation of Concept pages. It'd be interesting to hear about that mechanism and whether it can be used with Concept pages in SMW.

Thanks - john

On 26.07.2012 14:10, Jeroen De Dauw wrote:
> Hey,
>
>> Would you pleaes explain what is being gained by query caching?
>
> https://en.wikipedia.org/wiki/Cache_%28computing%29 [1]
>
>> It seems to me that simple transclusion of pages stored on wikidata would be just as effective, would require no additional code.
>
> * This has nothing to do with Wikidata
> * This has nothing to do with transclusion of pages
> * This definitely requires code to work
>
> Cheers
>
> --
> Jeroen De Dauw
> http://www.bn2vs.com [2]
> Don't panic. Don't be evil.
> --

Links:
------
[1] https://en.wikipedia.org/wiki/Cache_%28computing%29
[2] http://www.bn2vs.com |
From: Markus K. <ma...@se...> - 2012-07-27 12:52:52
|
On 27/07/12 05:12, jmc...@hy... wrote:
> I'm sorry, I meant to say that Concepts cache queries, so it suggests
> there's a mechanism driving the creation of Concept pages. It'd be
> interesting to hear about that mechanism and whether it can be used with
> Concept pages in SMW.

Concepts have some similarities in that they are also conceived as "cached queries". The main difference to our current discussion is that we now want to update these caches automatically to ensure that they are always up to date. This will also require us to store more information about each query.

The concept-based caching, in contrast, is a manual approach where users create and update caches. It is intended for queries that are in general too slow to be computed at page render time (which would be necessary in all other cases, even if we maintain a cache of the results once they are computed).

As a side effect, the new solution would also keep track of all queries that are used on the wiki, which could be useful for other purposes (including performance analysis). Right now, there is no good way to find out which queries are used on a wiki.

Markus

> On 26.07.2012 14:10, Jeroen De Dauw wrote:
>
>> Hey,
>>
>> > Would you pleaes explain what is being gained by query caching?
>>
>> https://en.wikipedia.org/wiki/Cache_%28computing%29
>>
>> > It seems to me that simple transclusion of pages stored on wikidata
>> would be just as effective, would require no additional code.
>>
>> * This has nothing to do with Wikidata
>> * This has nothing to do with transclusion of pages
>> * This definitely requires code to work
>>
>> Cheers
>>
>> --
>> Jeroen De Dauw
>> http://www.bn2vs.com
>> Don't panic. Don't be evil.
>> -- |