Re: [Dspam-user] Global Group Not Training
From: Ed S. <sz...@lo...> - 2010-04-29 21:26:32
>> Trying to combine the 2 ideas above I tried this in the group file:
>> corpususer:classification:*
>>
>> But unfortunately this causes a double free or corruption error in glibc
>> when trying to classify any message. (I saw a ticket on that
>> http://sourceforge.net/tracker/?func=detail&atid=1126467&aid=2990455&group_id=250683
>> and will be posting to that ticket right after this email)
>>
>>
> This issue is fixed in GIT repository. Check out and try again.
>
>

I've checked out the latest copy and the above group file line does not cause the double free error, but it does not seem to be working properly either. Before the return of every email I get this error message repeated several times:

> WARNING: nonstandard use of escape in a string literal
> LINE 1: ...plit_part(split_part(version(),' ',2),'.',1) FROM '\d+')::in...
> ^
> HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.

With the above group file line (corpususer:classification:*) I get no X-DSPAM headers added unless the result is Whitelisted.
With "corpususer:classification:*corpususer" it classifies and inserts headers, but debug does not have any entry showing corpususer.
With "corpususer:merged:*" I do see references to dspam using corpususer.

>
>> My questions are:
>> How should a Global Group be setup to get the results describe in the
>> README?
>> Is there any way to tell if a Global Group is being used?
>>
>>
> A global group is many things in DSPAM. You mean a "classification" group. Right? So the question should be: How to check if a classification group is working.
>

Yes, I believe I'm looking for a classification group: a group that will be used for new users who have no training data, and where a trained user's data is not confident in the result. So what format should I use in the group file to get all users to be in a classification group with corpususer?
And how do I check to see if it's working?

> Can you post the output of:
> dspam_stats -H tes...@te...
> dspam_admin ag pref tes...@te...
> dspam_admin ag pref default
> sed "/^[\t ]*#\|^[\t ]*$/d" /path/to/your/dspam.conf
>
>

$ /usr/local/dspam/bin/dspam_stats -H tes...@te...
tes...@te...:
TP True Positives: 39
TN True Negatives: 60
FP False Positives: 9
FN False Negatives: 29
SC Spam Corpusfed: 0
NC Nonspam Corpusfed: 0
TL Training Left: 2431
SHR Spam Hit Rate 57.35%
HSR Ham Strike Rate: 13.04%
PPV Positive predictive value: 81.25%
OCA Overall Accuracy: 72.26%

$ /usr/local/dspam/bin/dspam_admin ag pref tes...@te...
trainingMode=TOE
spamAction=quarantine
spamSubject=[SPAM]
statisticalSedation=5
enableBNR=on
enableWhitelist=on
signatureLocation=headers
tagSpam=off
tagNonspam=off
showFactors=off
optIn=off
optOut=off
whitelistThreshold=10
makeCorpus=off
storeFragments=off
localStore=
processorBias=on
fallbackDomain=off
trainPristine=off
optOutClamAV=off
ignoreRBLLookups=off
RBLInoculate=off

$ /usr/local/dspam/bin/dspam_admin ag pref default
trainingMode=TOE
spamAction=quarantine
spamSubject=[SPAM]
statisticalSedation=5
enableBNR=on
enableWhitelist=on
signatureLocation=headers
tagSpam=off
tagNonspam=off
showFactors=off
optIn=off
optOut=off
whitelistThreshold=10
makeCorpus=off
storeFragments=off
localStore=
processorBias=on
fallbackDomain=off
trainPristine=off
optOutClamAV=off
ignoreRBLLookups=off
RBLInoculate=off

$ sed "/^[\t ]*#\|^[\t ]*$/d" /usr/local/dspam/etc/dspam.conf
Home /usr/local/dspam/var/dspam
StorageDriver /usr/local/dspam/lib/dspam/libpgsql_drv.so
OnFail error
Trust root
Trust dspam
Trust www-data
Trust mail
Trust mailnull
Trust smmsp
Trust daemon
TrainingMode toe
TestConditionalTraining off
Feature whitelist
Feature tb=5
Algorithm graham burton
Tokenizer chain
PValue bcr
WebStats off
Preference "trainingMode=TOE" # { TOE | TUM | TEFT | NOTRAIN } -> default:teft
Preference "spamAction=quarantine" # { quarantine | tag | deliver } -> default:quarantine
Preference "spamSubject=[SPAM]" # { string } -> default:[SPAM]
Preference "statisticalSedation=5" # { 0 - 10 } -> default:0
Preference "enableBNR=on" # { on | off } -> default:off
Preference "enableWhitelist=on" # { on | off } -> default:on
Preference "signatureLocation=headers" # { message | headers } -> default:message
Preference "tagSpam=off" # { on | off }
Preference "tagNonspam=off" # { on | off }
Preference "showFactors=off" # { on | off } -> default:off
Preference "optIn=off" # { on | off }
Preference "optOut=off" # { on | off }
Preference "whitelistThreshold=10" # { Integer } -> default:10
Preference "makeCorpus=off" # { on | off } -> default:off
Preference "storeFragments=off" # { on | off } -> default:off
Preference "localStore=" # { on | off } -> default:username
Preference "processorBias=on" # { on | off } -> default:on
Preference "fallbackDomain=off" # { on | off } -> default:off
Preference "trainPristine=off" # { on | off } -> default:off
Preference "optOutClamAV=off" # { on | off } -> default:off
Preference "ignoreRBLLookups=off" # { on | off } -> default:off
Preference "RBLInoculate=off" # { on | off } -> default:off
AllowOverride enableBNR
AllowOverride enableWhitelist
AllowOverride fallbackDomain
AllowOverride ignoreGroups
AllowOverride ignoreRBLLookups
AllowOverride localStore
AllowOverride makeCorpus
AllowOverride optIn
AllowOverride optOut
AllowOverride optOutClamAV
AllowOverride processorBias
AllowOverride RBLInoculate
AllowOverride showFactors
AllowOverride signatureLocation
AllowOverride spamAction
AllowOverride spamSubject
AllowOverride statisticalSedation
AllowOverride storeFragments
AllowOverride tagNonspam
AllowOverride tagSpam
AllowOverride trainPristine
AllowOverride trainingMode
AllowOverride whitelistThreshold
AllowOverride dailyQuarantineSummary
PgSQLServer 127.0.0.1
PgSQLUser dspam
PgSQLPass <removed>
PgSQLDb dspam
PgSQLConnectionCache 20
PgSQLUIDInSignature on
PgSQLVirtualTable dspam_virtual_uids
PgSQLVirtualUIDField uid
PgSQLVirtualUsernameField username
IgnoreHeader X-DSPAM-Result
IgnoreHeader X-DSPAM-Processed
IgnoreHeader X-DSPAM-Confidence
IgnoreHeader X-DSPAM-Probability
IgnoreHeader X-DSPAM-Signature
Notifications off
PurgeSignature off # Specified in purge.sql
PurgeNeutral 90
PurgeUnused off # Specified in purge.sql
PurgeHapaxes off # Specified in purge.sql
PurgeHits1S off # Specified in purge.sql
PurgeHits1I off # Specified in purge.sql
SystemLog on
UserLog on
Opt out
ServerPort 24
ServerQueueSize 32
ServerPID /var/run/dspam.pid
ServerMode dspam
ServerPass.Relay1 <removed>
ClientHost 127.0.0.1
ClientPort 24
ClientIdent <removed>
ProcessorURLContext on
ProcessorBias on
StripRcptDomain off

>
>> I'm using dspam 3.9.0 with a postgresql backend, compiled from source
>> with the following options:
>> ../configure --prefix=/usr/local/dspam --sysconfdir=/usr/local/dspam/etc
>> --with-storage-driver=mysql_drv,pgsql_drv
>> --with-mysql-includes=/usr/include/mysql
>> --with-pgsql-includes=/usr/include/postgresql --enable-daemon
>> --enable-debug --enable-virtual-users
>> --enable-preferences-extension --enable-clamav
>>
>> Thanks,
>> Ed
>>
>>
>> corpususer:classification:*corpususer debug output:
>>
>>> 6565: [04/29/2010 14:47:37] No QuarantineAgent option found. Using standard quarantine.
>>> 6565: [04/29/2010 14:47:37] DSPAM Instance Startup
>>> 6565: [04/29/2010 14:47:37] input args: /usr/local/dspam/bin/dspam --stdout --deliver=innocent,spam --user tes...@te... --debug
>>> 6565: [04/29/2010 14:47:37] pass-thru args:
>>> 6565: [04/29/2010 14:47:37] processing user tes...@te...
>>> 6565: [04/29/2010 14:47:37] uid = 0, euid = 0, gid = 0, egid = 8
>>> 6565: [04/29/2010 14:47:37] loading preferences for user tes...@te...
>>> 6565: [04/29/2010 14:47:37] _pgsql_drv_getpwnam: successful returning struct for name: tes...@te...
>>> 6565: [04/29/2010 14:47:37] Loading preferences for uid 3856
>>> 6565: [04/29/2010 14:47:37] Loading preferences for uid 0
>>> 6565: [04/29/2010 14:47:37] Loading preferences for uid 0
>>> 6565: [04/29/2010 14:47:37] default preferences empty. reverting to dspam.conf preferences.
>>> 6565: [04/29/2010 14:47:37] Loading preferences from dspam.conf
>>> 6565: [04/29/2010 14:47:37] using /usr/local/dspam/var/dspam/opt-in/tes...@te....dspam as path
>>> 6565: [04/29/2010 14:47:37] using /usr/local/dspam/var/dspam/opt-out/tes...@te...dspam as path
>>> 6565: [04/29/2010 14:47:37] sedation level set to: 5
>>> 6565: [04/29/2010 14:47:37] _pgsql_drv_getpwnam: successful returning struct for name: tes...@te...
>>> 6565: [04/29/2010 14:47:37] _pgsql_drv_getpwnam returning cached name tes...@te....
>>> 6565: [04/29/2010 14:47:39] Loading 7 BNR patterns
>>> 6565: [04/29/2010 14:47:39] _pgsql_drv_getpwnam returning cached name tes...@te....
>>> 6565: [04/29/2010 14:47:39] Whitelist threshold: 10
>>> <snip tokens>
>>> 6565: [04/29/2010 14:47:39] Graham-Bayesian Probability: 0.002278 Samples: 15
>>> 6565: [04/29/2010 14:47:39] Burton-Bayesian Probability: 0.000018 Samples: 27
>>> 6565: [04/29/2010 14:47:39] no factors specified; using default
>>> 6565: [04/29/2010 14:47:39] Result Confidence: 1.00
>>> 6565: [04/29/2010 14:47:39] _pgsql_drv_getpwnam returning cached name tes...@te....
>>> 6565: [04/29/2010 14:47:39] Control: [10 10] [10 11] Delta: [0 1]
>>> 6565: [04/29/2010 14:47:40] total processing time: 3.01203s
>>> 6565: [04/29/2010 14:47:40] _pgsql_drv_getpwnam returning cached name tes...@te....
>>> 6565: [04/29/2010 14:47:40] _pgsql_drv_getpwnam returning cached name tes...@te....
>>> 6565: [04/29/2010 14:47:40] saving signature as 3856,4bd9d44c65657166613715
>>> 6565: [04/29/2010 14:47:40] _pgsql_drv_getpwnam returning cached name tes...@te....
>>> 6565: [04/29/2010 14:47:40] libdspam returned probability of 0.002278
>>> 6565: [04/29/2010 14:47:40] message result: NOT SPAM
>>> 6565: [04/29/2010 14:47:40] _pgsql_drv_getpwnam returning cached name tes...@te....
>>> 6565: [04/29/2010 14:47:40] delivering message
>>> 6565: [04/29/2010 14:47:40] DSPAM Instance Shutdown. Exit Code: 0
>>>
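
As a sanity check on the dspam_stats figures posted above, the four percentages can be reproduced from the TP/TN/FP/FN counts. This is only a minimal sketch: the formulas are assumptions inferred from the metric labels (they happen to match the reported numbers), not taken from the DSPAM source, and the function name rates is made up for illustration.

```python
def rates(tp, tn, fp, fn):
    """Return (SHR, HSR, PPV, OCA) as percentages rounded to two decimals.

    Assumed definitions (inferred from the labels, not from DSPAM itself):
      SHR = TP / (TP + FN)   spam hit rate
      HSR = FP / (FP + TN)   ham strike rate
      PPV = TP / (TP + FP)   positive predictive value
      OCA = (TP + TN) / all  overall accuracy
    """
    shr = 100.0 * tp / (tp + fn)
    hsr = 100.0 * fp / (fp + tn)
    ppv = 100.0 * tp / (tp + fp)
    oca = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return tuple(round(x, 2) for x in (shr, hsr, ppv, oca))


# Counts copied from the dspam_stats -H output in this message.
print(rates(tp=39, tn=60, fp=9, fn=29))  # (57.35, 13.04, 81.25, 72.26)
```

Since these match, the stats reflect only this user's own 137 classified messages; they do not say anything either way about whether corpususer data was consulted.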