Re: [Lurker-users] Mail list limits
From: Wesley W. T. <we...@te...> - 2010-12-22 09:23:36
On Wed, Dec 22, 2010 at 9:06 AM, Wesley W. Terpstra <we...@te...> wrote:
> On Wed, Dec 22, 2010 at 2:30 AM, Robert Woodworth <ro...@a1...> wrote:
>> I've been experimenting and somewhere close to 3000 messages is where it
>> starts to hang.
> How are you feeding messages to lurker-index, exactly?

Never mind, I found your earlier post. You are running fetchmail to get mail for multiple mailing lists, splitting it into mailboxes based on the 'To:' field, and then running lurker over the entire resulting mailbox. I can see now why lurker is getting slower and slower.

I think from the context of your setup that you still misunderstand how 'lurker-index' is intended to be run. You are supposed to run it on messages as they arrive, >ONE TIME ONLY<. Every time you feed a message to lurker, it puts it into the database again. That means that if you are indexing the same mailbox over and over, your database is getting very, very large as each message gets inserted repeatedly. lurker assumes that those old messages were delivered to the list again and faithfully rearchives them for you.

So, consider your example where you run fetchmail followed by lurker every 10 minutes. If there are 3000 messages in that mailbox, after one day lurker-index will have been run 24*6=144 times. You now have a database that is 144 times the size it should be.

Here's how you should be using lurker-index:

In a typical setup with one email address per list, the MTA delivers the incoming messages one at a time, as they arrive, to lurker-index with the '-m' option. Each time a mail arrives, it (and only it) gets routed to lurker-index.

In a setup with fetchmail and one email address per list, fetchmail gets the email and feeds the (new) messages to lurker-index.

In a setup with fetchmail and one email address shared between all lists (this is your setup), fetchmail gets the email and feeds it to procmail. procmail splits the email stream back into the different lists and feeds each email to lurker-index.

The problem with your setup is that your procmail isn't feeding messages to lurker-index; it is feeding them to a mailbox. Then, you are running lurker-index on the entire mailbox instead of only the new messages. In my earlier response to your postings, I asked you to use this:

> A perhaps better solution is to feed the mail directly to lurker-index
> from your procmail rule.
> /usr/share/doc/lurker/README.procmail describes this setup.
> :0 w
> * ^X-Mailing-List: <exa...@li...>.*
> | lurker-index -l example-list -m

There are two key differences here between my proposal and what you are doing.

1) I split up the email based on a reliable header (X-Mailing-List) instead of the 'To' field. Please find the header that your mailing lists add and filter on that instead.

2) Most importantly, the rule feeds messages directly to lurker-index using the procmail pipe syntax (|). This means that each new email gets routed to lurker-index instead of a mailbox.

Due to point #2, lurker-index only gets invoked on the new messages. There is no need to run lurker-index after running fetchmail; lurker-index already got run by procmail where necessary.

I hope this clears things up.

As for the database corruption, even using lurker-index incorrectly should not be causing this. I'm guessing it has something to do with having multiple lurker-indexes blocking at the same time, and I'm seeing if I can reproduce the problem.
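
For a mailbox shared between several lists, the quoted rule would presumably be repeated once per list. The following is only a minimal sketch under stated assumptions: the list names (first-list, second-list) and the lists.example.org addresses are hypothetical placeholders, and it assumes each list adds an X-Mailing-List header. Substitute whatever header and list names your lists actually use.

```
# ~/.procmailrc (sketch; list names and addresses below are placeholders)

# Mail from the first list: pipe the single incoming message to lurker-index.
# The 'w' flag makes procmail wait for the program and check its exit code.
:0 w
* ^X-Mailing-List: <first-list@lists\.example\.org>.*
| lurker-index -l first-list -m

# Mail from the second list: same pattern, different header value and list name.
:0 w
* ^X-Mailing-List: <second-list@lists\.example\.org>.*
| lurker-index -l second-list -m
```

With rules of this shape, each message reaches lurker-index exactly once, at the moment procmail delivers it, so no separate lurker-index run over a mailbox is needed. Mail that matches neither rule falls through to procmail's normal default delivery.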