Also, subject logging is used so that files aren't randomly deleted, which
keeps block reporting working. (Still, I don't think that's the problem
here; I just wanted to answer your question more thoroughly.)
On Tue, Apr 20, 2010 at 2:51 PM, K Post <nntp.post@...> wrote:
> I had maxbytes set to 8000, trying 4000 now, though I'm afraid of
> going more spammy in the rebuild since there's a TON of spam, but
> proportionally few messages being sent each day (or received as
> Subject logging is on for manual review and retrieval.
> The server seems to become unresponsive not while it goes through all of
> the messages, but between the "Resulting file" line and the "Bayesian
> Pairs" calculation. See below.
> Apr-18-10 20:49:55 Resulting file 'c:/assp/spamdb.rb.tmp' is 6,078,796 bytes
> Apr-18-10 20:53:35 Bayesian Pairs: 253,820 in new mail, 1,750,470 now in list
> What's going on between there?
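As a rough check on the size of that gap, the two timestamps quoted above can be subtracted directly. This is only a sketch: the format string is inferred from those two log lines, and the helper name `gap_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamp layout inferred from the two log lines quoted above (assumption).
LOG_FORMAT = "%b-%d-%y %H:%M:%S"


def gap_seconds(earlier: str, later: str) -> int:
    """Return the whole seconds elapsed between two log timestamps."""
    start = datetime.strptime(earlier, LOG_FORMAT)
    end = datetime.strptime(later, LOG_FORMAT)
    return int((end - start).total_seconds())


# "Resulting file" line versus "Bayesian Pairs" line.
gap = gap_seconds("Apr-18-10 20:49:55", "Apr-18-10 20:53:35")
print(f"{gap} seconds ({gap // 60} min {gap % 60} s)")  # 220 seconds (3 min 40 s)
```

So the unresponsive window between those two lines is a little under four minutes.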
> Thanks for the thoughts.
> On Tue, Apr 20, 2010 at 7:58 AM, Hill, Brett <hillrb@...> wrote:
>> K Post wrote:
>> Ok, ok, I'm an idiot. There, I said it, but I still have questions.
>> As it turns out, the system rebuilds the database at 00:15, 5:15am,
>> AND 8:15pm. So now we know where the load is coming from.
>> There were outages at midnight and 5am consistently too, I just wasn't
>> getting text alerts overnight for outages < 15 minutes.
>> So, new question: Any idea what could be causing the rebuild to kill
>> the server while it processes, and why does my rebuild seem to take
>> about 40 minutes each time?
>> I keep 15,000 files, subject logging on. No databases in use.
>> Is there a reason why you keep (UseSubjectsAsMaillogNames) enabled? Now
>> that you've got 15k messages, there really is no need to have that
>> Your excess NotSpam count is probably what is causing your rebuild to
>> take so long. ASSP still looks through all those extra messages during
>> the rebuild before it deletes the overage. If you disable
>> (UseSubjectsAsMaillogNames), ASSP will maintain a consistent 15K (or
>> thereabouts) thereby reducing the amount of time it takes for the
>> rebuild. I can't say why your ASSP is non-responsive though. Mine's
>> always responsive during a rebuild.
>> My limit is set to 14,500 and the rebuild takes about 13-14 minutes.
>> For whatever reason, I have a hard time keeping max files in my spam
>> dir. I assume it's because they're deleted because of false positives.
>> My server runs Win32 with a 3.4GHz Xeon Proc with 3.5GB of RAM.
>> RebuildSpamDB 220.127.116.11 (1.0.01) started - Tue Apr 20 07:30:01 2010
>> Running in basedirectory 'C:/ASSP'
>> ---ASSP Settings---
>> Use Subject as Maillog Names: Disabled
>> Maxbytes: 4000
>> Maxfiles: 14500
>> ---Cleaning whitelist (c:/assp/whitelist)---
>> whitelist entries older than 1095 days (MaxWhitelistDays) will be removed
>> whitelist before: 20,008
>> whitelist after: 20,008
>> --- Cleaning NoBayesian folders ---
>> entries older than 30 days will be removed
>> starting cleanup old files for folder c:/assp/okmail
>> folder c:/assp/okmail before: 0
>> folder c:/assp/okmail after: 0
>> starting cleanup old files for folder c:/assp/discarded
>> folder c:/assp/discarded before: 376
>> folder c:/assp/discarded deleted: 7
>> folder c:/assp/discarded after: 369
>> starting cleanup old files for folder c:/assp/quarantine
>> folder c:/assp/quarantine before: 405
>> folder c:/assp/quarantine deleted: 7
>> folder c:/assp/quarantine after: 398
>> --- Cleaning corrected spam/notspam folders ---
>> entries older than 1000 days will be removed
>> starting cleanup old files for folder c:/assp/errors/spam
>> folder c:/assp/errors/spam before: 545
>> folder c:/assp/errors/spam after: 545
>> starting cleanup old files for folder c:/assp/errors/notspam
>> folder c:/assp/errors/notspam before: 540
>> folder c:/assp/errors/notspam after:
>> --- Cleaning Bayesian folders ---
>> File Count: 545
>> Imported Files: 545
>> Finished in 4 second(s)
>> File Count: 540
>> Imported Files: 540
>> Finished in 5 second(s)
>> File Count: 14,035
>> removing c:/assp/spam/9760.eml -- 'email@...' is in Whitelist
>> Removed White: 1
>> Imported Files: 14,034
>> Finished in 362 second(s)
>> File Count: 14,499
>> Imported Files: 14,499
>> Finished in 433 second(s)
>> Generating weighted Bayesian tuplets...done
>> Saving rebuilt SPAM database...done
>> Resulting file 'spamdb' is 3,847,902 bytes
>> HELO Blacklist: 275 HELOs
>> Spam Weight: 4,277,804
>> Not-Spam Weight: 4,551,464
>> Corpus norm: 0.9399 (ok - balanced)
>> Corpus correction settings - low:0.9 high:1.2 minimum files:10000
>> minimum days:14
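The "Corpus norm" figure in the log above matches the spam weight divided by the not-spam weight. The sketch below reproduces that arithmetic from the logged numbers. Two things here are assumptions rather than facts from the log: that ASSP derives the norm as this simple ratio, and that "ok - balanced" means the ratio falls between the `low` and `high` correction settings. `parse_count` is a hypothetical helper.

```python
def parse_count(text: str) -> int:
    """Convert a comma-grouped count such as '4,277,804' to an int."""
    return int(text.replace(",", ""))


# Values copied from the rebuild log.
spam_weight = parse_count("4,277,804")
notspam_weight = parse_count("4,551,464")

# Assumption: the norm is the plain spam / not-spam weight ratio.
norm = spam_weight / notspam_weight

# Assumption: "balanced" means the norm lies within the low/high settings.
low, high = 0.9, 1.2
balanced = low <= norm <= high

print(f"norm = {norm:.4f}, balanced = {balanced}")  # norm = 0.9399, balanced = True
```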
>> Total processing time: 820 second(s)
>> Griplist download disabled
>> Downloading C:/ASSP/files/droplist.txt via direct HTTP connection
>> Tue Apr 20 07:43:43 2010: RebuildSpamDB 18.104.22.168 (1.0.01) ended
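To see where the rebuild time goes, the per-folder "Finished in N second(s)" figures can be compared with the reported total and with the start and end timestamps. This is a sketch using only numbers from the log above; the folder headings did not survive in the paste, so the phases are identified by file count only, and the timestamp format string is inferred from the start and end lines.

```python
from datetime import datetime

# (file count, seconds) pairs from the "File Count" / "Finished in" lines.
phases = [(545, 4), (540, 5), (14035, 362), (14499, 433)]
reported_total = 820  # "Total processing time: 820 second(s)"

# Format inferred from the "started" and "ended" lines (assumption).
STAMP = "%a %b %d %H:%M:%S %Y"
started = datetime.strptime("Tue Apr 20 07:30:01 2010", STAMP)
ended = datetime.strptime("Tue Apr 20 07:43:43 2010", STAMP)

wall_clock = int((ended - started).total_seconds())
phase_total = sum(seconds for _, seconds in phases)
two_largest = sum(sorted(seconds for _, seconds in phases)[-2:])

print(f"wall clock:  {wall_clock} s")
print(f"phase total: {phase_total} s of {reported_total} s reported")
print(f"two largest folders: {two_largest / reported_total:.0%} of reported time")
```

On these numbers, the two folders holding roughly 14,000 files each account for nearly all of the rebuild time, which is consistent with the point that the file count drives the duration.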
>> Kind Regards,
>> Assp-test mailing list