From: Lionel B. <lio...@bo...> - 2007-01-19 19:21:01
Dave Strickler wrote the following on 19.01.2007 19:41 :
> We are experimenting here with it, with good results on other
> SQL-based PHP software.
>
> I would like to try it with Perl, and have the API, but am having
> trouble setting variables into memcached, and could use some help.
>
> I am using code that takes a SQL command string, turns it into an MD5
> hash, then sees if it can get the data from memcached. If not, it
> looks it up in SQL, and then stores it in memcached.
>
> Anyone try this? I have *NO* idea what I am doing with Perl since I am
> just a PHP guy.
I've used memcached, but with Ruby, not Perl. I have two code-related
comments:
- you shouldn't use an MD5 hash of the statement as the cache key: it's
inefficient and theoretically you could have a collision (sometime in
this century...). The good practice is to wrap the code that fetches and
sets data in the DB so that it accesses memcache with a unique key (you
can reuse the primary key used by SQLgrey).
For example, in the "is_in_from_awl" method you could check for the
presence of the key "from_awl|<sender_name>|<sender_domain>|<src>",
which would be expected to hold the "last_seen" value stored in the DB;
if it is not found, ask the DB.
For the key I use '|' as a separator, with the table name and the
primary key columns as elements, to make sure the key is unique in
memcache.
Then in "put_in_from_awl" you'd build the key the same way and put the
last_seen value with an expiration of
"$self->past_tstamp($self->{sqlgrey}{awl_age})" days.
Don't forget to store and check the timestamp values, or changing your
delays in the configuration would have no effect on entries already
stored in memcache.
- you *must* handle cache expiration (by making entries expire,
explicitly deleting them, or checking their value for validity). By
only wrapping the statements like you do in your code, you don't
handle DELETEs properly; the method described above should be OK. Note:
when you want to handle the connect table, you'll definitely have to
delete entries in memcache when they are moved to the auto-whitelist.
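
To make the two points above concrete, here is a rough sketch in Python
rather than Perl. It is not SQLgrey code: FakeCache is a hypothetical
in-memory stand-in for a memcached client, a plain dict stands in for
the from_awl table, and the function signatures are made up for the
example. It only illustrates the '|'-separated key, the expiration on
put, the timestamp check on read, and the explicit delete.

```python
DAY = 86400  # seconds in a day


class FakeCache:
    """In-memory stand-in for a memcached client (hypothetical, illustration only)."""

    def __init__(self, clock):
        self._clock = clock  # callable returning the current time in seconds
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:  # entry has expired: drop it
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)

    def delete(self, key):
        self._data.pop(key, None)


def awl_key(sender_name, sender_domain, src):
    # Table name plus primary key columns, joined with '|'.
    return "|".join(("from_awl", sender_name, sender_domain, src))


def put_in_from_awl(cache, db, pk, awl_age_days, now):
    # Write to the DB first, then mirror last_seen in the cache with an expiration.
    db[pk] = now
    cache.set(awl_key(*pk), now, awl_age_days * DAY)


def is_in_from_awl(cache, db, pk, awl_age_days, now):
    key = awl_key(*pk)
    last_seen = cache.get(key)
    if last_seen is None:
        last_seen = db.get(pk)  # cache miss: ask the DB
        if last_seen is None:
            return False
        cache.set(key, last_seen, awl_age_days * DAY)
    # Check the stored timestamp so a changed awl_age also applies to cached entries.
    return now - last_seen <= awl_age_days * DAY


def delete_from_awl(cache, db, pk):
    # A DELETE in the DB must also remove the cached entry.
    db.pop(pk, None)
    cache.delete(awl_key(*pk))
```

Because the cached value is the last_seen timestamp itself, lowering
awl_age in the configuration takes effect immediately, even for entries
that were cached under the old, longer expiration.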
In principle, I'm not sure you would gain much from using memcache.
When you don't find an entry in memcache, you have to assume it simply
isn't cached and check the DB instead. So a large subset of the queries
made by SQLgrey will still hit the DB. You could alleviate the problem
by storing negative hits in memcache, but then you'll have to expire
them properly too (and there would be so many of them, nearly never
reused, that you could end up evicting more useful memcache content when
adding them).
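
For what storing negative hits could look like, here is a minimal
sketch, again in Python and again with a hypothetical in-memory cache
instead of a real memcached client. The MISSING sentinel, the lookup
function and the TTL values are assumptions chosen for the example.
Negative entries get a much shorter expiration than positive ones so
they go stale quickly.

```python
MISSING = "__missing__"  # hypothetical sentinel meaning "known to be absent from the DB"


class TinyCache:
    """In-memory stand-in for a memcached client (hypothetical, illustration only)."""

    def __init__(self, clock):
        self._clock = clock  # callable returning the current time in seconds
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:  # entry has expired: drop it
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


def lookup(cache, db, key, hit_ttl=3600, miss_ttl=60):
    """Return (value, db_was_queried); value is None when the key is absent."""
    cached = cache.get(key)
    if cached == MISSING:
        return None, False   # negative hit: the DB is not touched
    if cached is not None:
        return cached, False  # positive hit
    value = db.get(key)       # real miss: ask the DB (a dict stands in for it here)
    if value is None:
        cache.set(key, MISSING, miss_ttl)  # remember the absence, but only briefly
    else:
        cache.set(key, value, hit_ttl)
    return value, True
```

Note the trade-off this makes visible: until the negative entry expires
(or is explicitly overwritten when the row is inserted), a row newly
added to the DB stays invisible to lookups.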
If it works for you, could you please benchmark the results with and
without your modification (the average load on a mail server with a
local DB, with and without your patch)? I don't think we have any
performance problem even with large mail systems, but if we get a good
performance boost, I'll definitely consider adding your patch.
Lionel.