Re: [Lurker-users] lurker db tunables
From: Wesley W. T. <we...@te...> - 2011-01-09 21:54:54
On Sun, Jan 9, 2011 at 1:26 AM, Shane W <shane-from-lists.sourceforge.net@ csy.ca> wrote:

> First, the database is often creating new files and
> deleting which isn't that efficient for rsync to the
> hosting provider. How does lurker decide when to roll a new
> file vs append to an existing when adding a message to an
> archive.

The database files in lurker are immutable. Every time lurker-index is run, a new database file is 'rolled up' with other existing files. Given the nature of the index lurker uses, this is quite efficient for magnetic disks.

As for rsync bandwidth, I wouldn't worry too much. The new database files created correspond to new email. On average, a given key in the database will move through log(N) otherwise unchanging database files, which means it will be rsync'd quite a bit less often than it would be in, say, a BTree or hash table, where nearby changes to the file cause rsync to recopy it.

If I were you, my main concern about rsync would be that it doesn't take a consistent snapshot. If lurker-index runs while rsync runs, you can get a database summary file that lists files which rsync didn't copy. When I take backups, I use LVM to snapshot the disk and back up from that. Another possibility would be to lock the file 'db.writer' while rsync runs, which will block any lurker-index processes, since they must acquire this lock.

> The other issue is database size. I currently have a number
> of lists being archived and the database is around 8%
> larger than the compressed mbox size. Is that typical?

Sounds about right. The database includes most of the content of the original email messages in addition to various meta-data.

> I ask about tunables as $LIBDIR/db looks like:
> 1 8192 255

The first number is the database format version number (1). The second number is the block size (8192). The third number is the maximum key length (255). I wouldn't recommend changing the key length.
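To make the three fields concrete, here is a minimal Python sketch that splits a line like the one quoted above into named values. It is not part of lurker; the names `DbTunables` and `parse_tunables` are made up for illustration, and the sketch assumes the line holds exactly three whitespace-separated integers in the order just described.

```python
from typing import NamedTuple


class DbTunables(NamedTuple):
    """Hypothetical container for the three values on the tunables line."""
    format_version: int   # database format version number
    block_size: int       # block size
    max_key_length: int   # maximum key length


def parse_tunables(line: str) -> DbTunables:
    """Parse a line such as '1 8192 255' into its three integer fields."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    version, block_size, max_key_length = (int(f) for f in fields)
    return DbTunables(version, block_size, max_key_length)


tunables = parse_tunables("1 8192 255")
print(tunables.format_version, tunables.block_size, tunables.max_key_length)
```
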
I suppose you could change the block size, but be advised that values other than 8192 haven't really been tested.

I would be surprised if lurker required so many resources on a modern system that you needed to tune it. It was built to run Debian-scale mailing lists on computers which are now 10 years old...

Do you have a workload where there is a performance problem? I am aware of several changes that would greatly increase the database performance, but never implemented them as they have never been necessary.
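The 'db.writer' locking idea mentioned earlier could be sketched roughly as below: a small Python wrapper that holds an exclusive lock on a file while another command (such as rsync) runs. This is only a sketch under assumptions. The function name `run_while_locked` is hypothetical, and the message does not say which locking primitive lurker-index uses. The sketch takes a POSIX fcntl-style lock via `fcntl.lockf`; if lurker-index uses flock(2) instead, `fcntl.flock` would be the call to try. Test against a scratch copy before relying on it.

```python
import fcntl
import subprocess


def run_while_locked(lock_path, command):
    """Hold an exclusive advisory lock on lock_path while command runs.

    Returns the command's exit status. The lock is only effective against
    processes that take the same kind of lock on the same file (assumed
    here, not confirmed, to be the case for lurker-index and 'db.writer').
    """
    # Append mode gives a writable descriptor without truncating the file.
    with open(lock_path, "a") as lock_file:
        fcntl.lockf(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            return subprocess.run(command).returncode
        finally:
            fcntl.lockf(lock_file, fcntl.LOCK_UN)


# Example call (paths are placeholders, not real lurker locations):
# run_while_locked("/path/to/db.writer", ["rsync", "-a", "SRC/", "DEST/"])
```

Because the lock is advisory, this only helps if every writer cooperates. The LVM snapshot approach does not depend on that assumption.
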