From: Thomas E. <Tho...@th...> - 2018-01-08 16:34:56
Colin,
I'm happy with 8,000 files, although my corpus is still smaller than that.
>I’m not sure why I have MaxFiles at 21,000 when the default is 14,000.
If the time the rebuild takes is OK for you and the detection rate and
correctness are OK, use the 21,000. The 14,000 is IMHO a wild guess.
From time to time I change this value, and also the file-age values,
depending on what the spammers are doing. If I have fewer than 6,000 files
in the corpus, I increase these values until I feel confident.
I'm happy with 1,000,000 records in spamdb and 2,000,000 in hmmdb.
>2018-01-08 12:40:31 start populating Spamdb with 2,844,910 records -
Bayesian check is now disabled!
>2018-01-08 12:43:05 Finished populating Spamdb with 2,844,910 records -
Bayesian check is now enabled!
>2018-01-08 12:44:29 start populating Hidden Markov Model with 7,118,798
records!
>2018-01-08 12:52:22 Finished populating Hidden Markov Model with
7,118,798 records!
These two time ranges are the only ones that matter for the rebuild,
because the database table being updated is not available while its records
are populated. I think the time your system needs to update the database
is acceptable in both cases.
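As a rough illustration, the two windows can be measured straight from the rebuild log. This is only a sketch: the function name `populate_windows` and the regular expression are my own assumptions about the line layout quoted above, not anything taken from assp itself.

```python
import re
from datetime import datetime

# Log lines as quoted above, with the mail line-wrapping undone.
LOG = """\
2018-01-08 12:40:31 start populating Spamdb with 2,844,910 records - Bayesian check is now disabled!
2018-01-08 12:43:05 Finished populating Spamdb with 2,844,910 records - Bayesian check is now enabled!
2018-01-08 12:44:29 start populating Hidden Markov Model with 7,118,798 records!
2018-01-08 12:52:22 Finished populating Hidden Markov Model with 7,118,798 records!
"""

# Assumed layout: "<date> <time> start|Finished populating <table> with ..."
PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (start|finished) populating (.+?) with ",
    re.IGNORECASE,
)

def populate_windows(log_text):
    """Return {table: seconds} between each start/Finished pair."""
    started, windows = {}, {}
    for line in log_text.splitlines():
        match = PATTERN.match(line)
        if not match:
            continue
        when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        phase, table = match.group(2).lower(), match.group(3)
        if phase == "start":
            started[table] = when
        elif table in started:
            windows[table] = int((when - started.pop(table)).total_seconds())
    return windows

print(populate_windows(LOG))
```

For the log above this gives 154 seconds for Spamdb and 473 seconds for the Hidden Markov Model.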
In my experience, 'Max Number of Duplicate File Names - MaxAllowedDups'
matters more than anything else. If you don't control this and you get
10,000 spams a day named 'your image....' or 'your fax....', and all of
them are stored in the corpus, then after two days assp will detect these
mails but no other spam.
Thomas
From: "Colin Waring" <co...@do...>
To: "ASSP development mailing list" <ass...@li...>
Date: 08.01.2018 16:32
Subject: Re: [Assp-test] Meltdown/Spectre
Hi Thomas,
Thanks, I understand why things are the way they are so I checked. I’m not
sure why I have MaxFiles at 21,000 when the default is 14,000. I’ll reduce
it back down if you think that 14,000 is plenty.
All the best,
Colin Waring.
From: Thomas Eckardt [mailto:Tho...@th...]
Sent: 08 January 2018 14:53
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
>I’d be happy to limit
If 'UseSubjectsAsMaillogNames' is used, corpus maintenance ignores
MaxFiles. The same is true if file-age control is used. The rebuild,
however, will still use MaxFiles.
So it should be possible to speed up the rebuild by decreasing the value
of MaxFiles.
Thomas
From: "Colin Waring" <co...@do...>
To: "ASSP development mailing list" <ass...@li...>
Date: 08.01.2018 15:36
Subject: Re: [Assp-test] Meltdown/Spectre
Prior to changing anything, the count was 7,500,000 records.
After increasing maxbytes from 6,000 to 20,000 it dropped to 5,400,000 but
has now gone back up.
root@mail2:/usr/local/assp/store# ls -1 spam/|wc -l
25865
root@mail2:/usr/local/assp/store# ls -1 notspam/|wc -l
245138
root@mail2:/usr/local/assp/store# ls -1 errors/spam/|wc -l
1313
root@mail2:/usr/local/assp/store# ls -1 errors/notspam/|wc -l
2005
I have MaxBayesFileAge set to 31 days.
MaxCorrectedDays set to 365
StoreCompleteMail is set to no limit for resend purposes.
A quick find on the directory (find ./ -type f -mtime +32) doesn’t find
anything so it looks right. The spam/notspam ratio matches the % Non-local
mail blocked which is what I would expect with a corpus limited by time
rather than the number of files.
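The same two checks (`ls -1 ... | wc -l` and `find ./ -type f -mtime +32`) can be sketched in Python for use in a monitoring script. The function names are made up for illustration, and the age test imitates the way find truncates file age to whole days.

```python
import os
import time

def count_files(folder):
    """Count regular files directly inside one corpus folder."""
    return sum(1 for entry in os.scandir(folder) if entry.is_file())

def files_older_than(root, days, now=None):
    """List files under root whose age in whole days exceeds `days`.

    This imitates `find root -type f -mtime +days`, which ignores
    fractional days before comparing.
    """
    now = time.time() if now is None else now
    old = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            age_days = int((now - os.path.getmtime(path)) // 86400)
            if age_days > days:
                old.append(path)
    return sorted(old)
```

With MaxBayesFileAge at 31 days, `files_older_than("store/spam", 32)` returning an empty list would correspond to the empty find result above.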
2018-01-08 10:42:46 info: require approx. 20,999 files (11,046,000 words)
from folder store/spam to get the wanted corpusnorm (1.000)
2018-01-08 11:15:58 Imported Files for HeloBlackList: 21,000
2018-01-08 11:15:58 Imported Files for Bayes/HMM: 21,396
2018-01-08 11:15:58 info: require approx. 17,277 files (8,776,977 words)
from folder store/notspam to get the wanted corpusnorm (1.000)
2018-01-08 12:40:09 Imported Files for HeloBlackList: 21,000
2018-01-08 12:40:09 Imported Files for Bayes/HMM: 22,256
I’m guessing the “require approx.” is just informative and doesn’t
actually limit the number of files that the rebuild process uses,
although it does say it will limit them later:
2018-01-08 12:44:24 Corpus norm: 0.9999 - (very good - balanced)
2018-01-08 12:44:24 Corpus confidence: 1.00000000
2018-01-08 12:44:24 Recommendation: RebuildSpamDB will limit the number of
used messages in your corpus. Excess files will be ingored.
I’d be happy to limit the notspam folder by number of files, the reason I
use number of days is to make files available for resend requests and
people can be away for two weeks at a time and that affects only the spam
folder. I don’t think they can be set differently though.
All the best,
Colin Waring.
From: Thomas Eckardt [mailto:Tho...@th...]
Sent: 08 January 2018 14:03
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
Nice to hear that's fine now.
2018-01-06 23:52:22 Finished populating Hidden Markov Model with 5,418,395
records!
2018-01-08 12:52:22 Finished populating Hidden Markov Model with 7,118,798
records!
This is an increase of about 31% in 37 hours. Did you change anything in
the config?
To me, these record counts look very high. Just so I know: how many files
are in your corpus?
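For reference, the growth figure follows directly from the two log lines quoted above. A small sketch, with helper names of my own choosing:

```python
from datetime import datetime

def growth_percent(before, after):
    """Relative growth from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

def hours_between(start, end, fmt="%Y-%m-%d %H:%M:%S"):
    """Elapsed hours between two log timestamps."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 3600.0

print(round(growth_percent(5_418_395, 7_118_798), 1))
print(hours_between("2018-01-06 23:52:22", "2018-01-08 12:52:22"))
```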
Thomas
From: "Colin Waring" <co...@do...>
To: "ASSP development mailing list" <ass...@li...>
Date: 08.01.2018 14:45
Subject: Re: [Assp-test] Meltdown/Spectre
Thank you,
I’ve got it all configured and set up logwatch to do a summary of anything
else.
Rebuild is populating the db now, though the number of records has gone
back up:
2018-01-08 12:40:31 start populating Spamdb with 2,844,910 records -
Bayesian check is now disabled!
2018-01-08 12:43:05 Finished populating Spamdb with 2,844,910 records -
Bayesian check is now enabled!
2018-01-08 12:44:29 start populating Hidden Markov Model with 7,118,798
records!
2018-01-08 12:52:22 Finished populating Hidden Markov Model with 7,118,798
records!
Thank you for all the help.
All the best,
Colin Waring.
From: Thomas Eckardt [mailto:Tho...@th...]
Sent: 08 January 2018 12:41
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
>Just to double check, can notifyre and nonotifyre use file: and then
multi line files to define multiple entries?
Yes
Do Notify, if log entry matches*
Do NOT Notify, if log entry matches*
notice the * at the end
>I’ll use notifyRe for errors and logwatch for both errors and warnings.
I don't use it for warnings - errors only. Some errors are ignored.
nonotifyre: file contains
user root
\[root.*?\]
name server
droplist
error: authentication failed
upgrade to TLS
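To show how a multi-line file like this can act as one filter, here is a sketch. It assumes each non-empty line is a separate regular expression, that the lines are combined as alternatives, and that matching is case-insensitive. Those are assumptions for illustration, not confirmed assp behaviour, and the function names are hypothetical.

```python
import re

# The entries listed above, one regular expression per line.
NONOTIFY_LINES = r"""
user root
\[root.*?\]
name server
droplist
error: authentication failed
upgrade to TLS
"""

def compile_entries(text):
    """Join all non-empty lines into a single alternation (assumed semantics)."""
    entries = [line.strip() for line in text.splitlines() if line.strip()]
    return re.compile("|".join(f"(?:{entry})" for entry in entries), re.IGNORECASE)

def should_notify(log_line, notify_re, nonotify_re):
    """Notify only if the notify pattern matches and no exclusion matches."""
    return bool(notify_re.search(log_line)) and not nonotify_re.search(log_line)

notify = re.compile(r"error:", re.IGNORECASE)
nonotify = compile_entries(NONOTIFY_LINES)
print(should_notify("[Worker_10001] ERROR: unable to find file", notify, nonotify))
print(should_notify("Error: authentication failed for some user", notify, nonotify))
```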
Thomas
Von: "Colin Waring" <co...@do...>
An: "ASSP development mailing list" <
ass...@li...>
Datum: 08.01.2018 12:46
Betreff: Re: [Assp-test] Meltdown/Spectre
Thank you again,
I’ll use notifyRe for errors and logwatch for both errors and warnings. A
daily summary for warnings should be enough.
Just to double check, can notifyre and nonotifyre use file: and then multi
line files to define multiple entries?
All the best,
Colin Waring.
From: Thomas Eckardt [mailto:Tho...@th...]
Sent: 08 January 2018 11:07
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
ERROR: or Error:
in maillog.txt indicates a major exception - a fix is required.
Warning:
is a minor exception - a fix may be required, unless the warning was
caused by a temporary error.
Info:
marks informational lines.
'NotifyRe' may help to catch events.
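A monitoring script could sort maillog.txt lines into these three classes. The sketch below assumes the keyword appears followed by a colon somewhere in the line, as in the log excerpts in this thread. The labels "major", "minor" and "info" are only my shorthand for the descriptions above.

```python
import re

# ERROR:/Error:, Warning: and Info: as described above.
SEVERITY = re.compile(r"\b(ERROR|Error|Warning|Info):")

LABELS = {"error": "major", "warning": "minor", "info": "info"}

def severity(line):
    """Classify one maillog line; lines without a keyword are 'other'."""
    match = SEVERITY.search(line)
    if not match:
        return "other"
    return LABELS[match.group(1).lower()]

sample = [
    "2018-01-08 00:17:38 [Worker_10001] ERROR: unable to find file "
    "/usr/local/assp/assp_db_import.cfg - cancel import",
    "Jan-08-18 04:14:00 [Worker_10001] Info: version 2.4.3(15119) of file "
    "C:/assp/assp_db_import.cfg is used for the import",
    "Jan-08-18 04:14:49 [Worker_10001] Bulkimport for table spamdb finished",
]
for entry in sample:
    print(severity(entry))
```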
Thomas
From: "Colin Waring" <co...@do...>
To: "ASSP development mailing list" <ass...@li...>
Date: 08.01.2018 11:42
Subject: Re: [Assp-test] Meltdown/Spectre
Thank you Thomas
2018-01-08 00:17:38 [Worker_10001] ERROR: unable to find file
/usr/local/assp/assp_db_import.cfg - cancel import
I’m not sure whether this file was introduced and I missed it or if it has
gone missing somewhere along the way. I’ve downloaded a fresh copy and set
the rebuild running again.
If things like this aren’t in the rebuild log then I need to set up better
monitoring. Are all important error logs in maillog.txt prefixed with
ERROR in the same way or would there be other important keywords to check
for?
All the best,
Colin Waring.
From: Thomas Eckardt [mailto:Tho...@th...]
Sent: 08 January 2018 10:30
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
Colin,
please have a look in maillog.txt and search for both sequences:
Jan-08-18 04:13:55 [Worker_10001] Start populating Spamdb with 1,109,512
records - Bayesian check is now disabled!
Jan-08-18 04:13:55 [Worker_10001] Try to lock Spamdb database in 5
second(s)
Jan-08-18 04:14:00 [Worker_10001] Database import started for table spamdb
Jan-08-18 04:14:00 [Worker_10001] Trying Bulkimport for table spamdb
Jan-08-18 04:14:00 [Worker_10001] Database: MySQL 5.1.72-community-log
Jan-08-18 04:14:00 [Worker_10001] Info: version 2.4.3(15119) of file
C:/assp/assp_db_import.cfg is used for the import
Jan-08-18 04:14:01 [Worker_10001] Added 3000 of 1109512 records (BULK) for
table spamdb - finished in 68 sec
.....
Jan-08-18 04:14:49 [Worker_10001] Bulkimport for table spamdb finished
Jan-08-18 04:14:49 [Worker_10001] Successfully added 1109512 records in to
table spamdb
Jan-08-18 04:14:49 [Worker_10001] Finished populating Spamdb with
1,109,512 records - Bayesian check is now enabled!
Jan-08-18 04:15:11 [Worker_10001] Try to lock HMM databases in 5 second(s)
Jan-08-18 04:15:16 [Worker_10001] Start populating Hidden Markov Model.
HMM-check is disabled for this time!
Jan-08-18 04:15:16 [Worker_10001] Start populating Hidden Markov Model
with 1,858,890 records!
Jan-08-18 04:15:16 [Worker_10001] Database import started for table hmmdb
Jan-08-18 04:15:17 [Worker_10001] Trying Bulkimport for table hmmdb
Jan-08-18 04:15:17 [Worker_10001] Database: MySQL 5.1.72-community-log
Jan-08-18 04:15:18 [Worker_10001] Added 20000 of 1858890 records (BULK)
for table hmmdb - finished in 91 sec
.....
Jan-08-18 04:16:50 [Worker_10001] Bulkimport for table hmmdb finished
Jan-08-18 04:16:50 [Worker_10001] Successfully added 1858890 records in to
table hmmdb
Jan-08-18 04:16:50 [Worker_10001] Finished populating Hidden Markov Model
with 1,858,890 records!
The highlighted lines are not written to the rebuildrun.txt - only to
maillog.txt!
> it is being populated whenever someone reports a message through the
email interface.
There must be something wrong with the database import function. Error
lines should show up in these sequences!
>The only files I have in tmpDB currently are:
>-rw-r--r-- 1 root root 118557988 Jan 8 06:46 rbtmp.hamHMM.chains
>-rw-r--r-- 1 root root 94224002 Jan 8 06:46 rbtmp.hamHMM.totals
>-rw-r--r-- 1 root root 240631755 Jan 8 06:47 rbtmp.spamHMM.chains
>-rw-r--r-- 1 root root 185125930 Jan 8 06:47 rbtmp.spamHMM.totals
That's OK. This is the full Hidden Markov Model. The files are required to
recalculate the model, if mails are reported.
>So I’m missing rbtmp.hamHMM and rbtmp.spamHMM
Those files don't exist if the rebuild runs 'in memory' only.
>Incidentally I have noticed that spamdb.helo.rb.tmp gets created in the
assp working directory not tmpDB – I’m not sure whether it is supposed to
be there?
This will get fixed.
From: "Colin Waring" <co...@do...>
To: "ASSP development mailing list" <ass...@li...>
Date: 08.01.2018 10:34
Subject: Re: [Assp-test] Meltdown/Spectre
So,
As suspected the rebuild debug shows nothing useful at this stage.
I can however now tell where the content of hmmdb is coming from – it is
being populated whenever someone reports a message through the email
interface.
The only files I have in tmpDB currently are:
-rw-r--r-- 1 root root 118557988 Jan 8 06:46 rbtmp.hamHMM.chains
-rw-r--r-- 1 root root 94224002 Jan 8 06:46 rbtmp.hamHMM.totals
-rw-r--r-- 1 root root 240631755 Jan 8 06:47 rbtmp.spamHMM.chains
-rw-r--r-- 1 root root 185125930 Jan 8 06:47 rbtmp.spamHMM.totals
So I’m missing rbtmp.hamHMM and rbtmp.spamHMM
I had a look at the code and saw that the populate part runs the database
import routine against the hash HMMresObj yet the only place the hash is
populated is:
$HMMresObj=tie %HMMres,'BerkeleyDB::Hash',
    (-Filename => "$DBDir/rb_HMMres.bdb",
     -Flags => DB_CREATE,
     -Env => $BDBEnv);
So, how does the database get populated if BDB is off?
That’s about as far as I can get at the moment I think..
Incidentally I have noticed that spamdb.helo.rb.tmp gets created in the
assp working directory not tmpDB – I’m not sure whether it is supposed to
be there?
All the best,
Colin.
From: Colin Waring [mailto:co...@do...]
Sent: 07 January 2018 22:43
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
Rebuild has completed:
mysql> select * from hmmdb;
+------------------+------------------------------------+---------+
| pkey | pvalue | pfrozen |
+------------------+------------------------------------+---------+
| ***COUNT*** | 3 | 0 |
| ***DB-VERSION*** | 2_14315_UAX#29_UAX#15_WordStem2.02 | 0 |
| ***bayesnorm*** | 0.999954300466416 | 0 |
+------------------+------------------------------------+---------+
3 rows in set (0.00 sec)
So nothing in mysql. ASSP status is all green and I can see the above data
by using the edit list button next to hmmdb.
Could DBCacheMaxAge have anything to do with this? It was set to 10.
I’m re-running rebuild with the debug file created and will have to check
in the morning.
From: Colin Waring [mailto:co...@do...]
Sent: 07 January 2018 21:08
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
Hi Thomas,
I’ve checked and RebuildTestMode is not set.
mysql> select count(*) from hmmdb;
+----------+
| count(*) |
+----------+
| 5194934 |
+----------+
1 row in set (3.35 sec)
The count hasn’t changed overnight so it is definitely not updating.
So I’ve dropped hmmdb, spamdb and spamdbhelo. Run a full update on all the
servers including perl modules and then restarted everything. Tables
recreated and now a rebuild is running to hopefully set them up afresh.
Fingers crossed that solves it and hopefully no other tables are affected.
All the best,
Colin.
From: Thomas Eckardt [mailto:Tho...@th...]
Sent: 07 January 2018 19:06
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
Colin, did you set RebuildTestMode? To me, it looks like you did.
mysql> select count(*) from hmmham;
| 1248444 |
mysql> select count(*) from hmmhamtot;
| 1123064 |
mysql> select count(*) from hmmspam;
| 1654660 |
mysql> select count(*) from hmmspamtot;
| 1495532 |
Remove these tables - they were possibly created many many years ago. I
can't remember.
Thomas
From: "Colin Waring" <co...@do...>
To: "ASSP development mailing list" <ass...@li...>
Date: 07.01.2018 19:29
Subject: Re: [Assp-test] Meltdown/Spectre
Hi Thomas,
Maybe I’m misunderstanding what populating is? Is populating when the
temporary db generated by the rebuild are loaded into the mysql server?
I was therefore looking at the mysql server to confirm if any new data was
being put in it.
Is there any debugging I can turn up to get more information on what is
happening at that point? I’m not sure if rebuilddebug.txt would give more
information, I imagine it’d certainly slow down other parts of the
rebuild.
All the best,
Colin.
From: Thomas Eckardt [mailto:Tho...@th...]
Sent: 07 January 2018 17:34
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
>2018-01-06 22:00:00 Maxbytes: 20,000
OK, nearly two hours. That's long; on my system it takes ~30 min.
>2018-01-06 23:51:13 start populating Spamdb with 2,514,865 records -
Bayesian check is now disabled!
>2018-01-06 23:51:18 Finished populating Spamdb with 2,514,865 records -
Bayesian check is now enabled!
There is something wrong: the whole step took 5 seconds for 2.5 million
records, and 5 seconds is just the hardcoded delay.
>2018-01-06 23:52:22 start populating Hidden Markov Model with 5,418,395
records!
>2018-01-06 23:52:22 Finished populating Hidden Markov Model with
5,418,395 records!
Same here: 5.4 million records in less than a second - this is impossible.
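The reasoning can be turned into a simple plausibility check: if a populate step takes no longer than the 5-second lock delay, no time is left for importing any records. This is only a sketch. The threshold comes from the remark above, and the function name is hypothetical.

```python
LOCK_DELAY_SECONDS = 5  # the hardcoded delay mentioned above

def looks_skipped(records, seconds, lock_delay=LOCK_DELAY_SECONDS):
    """True if the populate window leaves no time for importing records."""
    return records > 0 and seconds <= lock_delay

# Figures taken from the rebuild logs quoted in this thread.
print(looks_skipped(2_514_865, 5))    # Spamdb, 23:51:13 to 23:51:18
print(looks_skipped(5_418_395, 0))    # HMM, 23:52:22 to 23:52:22
print(looks_skipped(2_844_910, 154))  # Spamdb, 12:40:31 to 12:43:05
```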
mysql> select count(*) from hmmham;
| 1248444 |
mysql> select count(*) from hmmhamtot;
| 1123064 |
mysql> select count(*) from hmmspam;
| 1654660 |
mysql> select count(*) from hmmspamtot;
| 1495532 |
Where do you get these MySQL tables/records from? There is no option (and
also NO CODE) in assp to tie the temporary HMM tables to mysql. And even
if that were possible, mysql is too slow to build the HMM. There are
only two options in assp for holding the temp HMM tables: BerkeleyDB and
memory.
Thomas
From: "Colin Waring" <co...@do...>
To: "ASSP development mailing list" <ass...@li...>
Date: 07.01.2018 17:51
Subject: Re: [Assp-test] Meltdown/Spectre
So, here is a report from last night’s rebuild.
Logs are:
2018-01-06 22:00:00 Maxbytes: 20,000
2018-01-06 23:51:13 start populating Spamdb with 2,514,865 records -
Bayesian check is now disabled!
2018-01-06 23:51:18 Finished populating Spamdb with 2,514,865 records -
Bayesian check is now enabled!
2018-01-06 23:52:22 start populating Hidden Markov Model with 5,418,395
records!
2018-01-06 23:52:22 Finished populating Hidden Markov Model with 5,418,395
records!
2018-01-06 23:52:22 Total processing time: 6,742 second(s)
2018-01-06 23:52:22 Total processing data: 975.63 Mbyte
So that’s about 15 minutes quicker with double the data processed.
Marginally more Spamdb records and a reduction of HMM records by 2
million.
Still about half the speed of yours though.
All the best,
Colin.
From: Colin Waring [mailto:co...@do...]
Sent: 06 January 2018 20:48
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
I’ll try upping Maxbytes to 20000 and see what happens. I’ve also turned
off usedb4rebuild to see what happens in relation to your other message.
As far as hmmdb goes, I checked everything over and can’t see anything
wrong although the numbers don’t add up to the ones in the log. The db
entries don’t have dates against them so I’m not sure how I would check to
see if they are recent.
-rw-r--r-- 1 root root 0 Jan 5 22:00 BDB-error.txt
-rw-r--r-- 1 root root 434175 Jan 5 22:00 __db.001
-rw-r--r-- 1 root root 3325951 Jan 5 22:00 __db.002
-rw-r--r-- 1 root root 65544191 Jan 5 22:13 __db.003
-rw-r--r-- 1 root root 663552 Jan 6 00:12 rb_Helo.bdb
-rw-r--r-- 1 root root 334389248 Jan 6 00:08 rb_spam.bdb
-rw-r--r-- 1 root root 332099584 Jan 6 00:13 rbtmp.hamHMM.bdb
-rw-r--r-- 1 root root 168296448 Jan 6 00:13 rbtmp.hamHMM.totals.bdb
-rw-r--r-- 1 root root 339763200 Jan 6 00:13 rbtmp.spamHMM.bdb
-rw-r--r-- 1 root root 335945728 Jan 6 00:13 rbtmp.spamHMM.totals.bdb
-rw-r--r-- 1 root root 12288 Jan 5 23:21 trashlist.bdb
mysql> select count(*) from hmmdb;
| 5194934 |
mysql> select count(*) from hmmham;
| 1248444 |
mysql> select count(*) from hmmhamtot;
| 1123064 |
mysql> select count(*) from hmmspam;
| 1654660 |
mysql> select count(*) from hmmspamtot;
| 1495532 |
From: Thomas Eckardt [mailto:Tho...@th...]
Sent: 06 January 2018 06:54
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
> I’m wondering why I have so many more records when Maxbytes is less and
the total data is less.
This is caused by HTML mails - mostly SPAM mails.
Have a look at some spam mails with a size of 20,000 bytes or more.
You'll find some that start with a lot of HTML header stuff (CSS, script
and so on). Most of the time this content is longer than 6,000 bytes
(your MaxBytes setting).
I have seen mails with a size of 25,000 bytes and 10 words of
human-readable content.
ASSP tries to extract the human-readable content of HTML mails for
analysis, but if this is not possible, it uses the data that is available.
The CSS and header content is very different in every mail. Even though
assp normalizes this content, it leads to many more distinct HMMdb and
spamDB records - most of them useless for spam detection.
Have a look in the GUI for 'Use this HTML Parser (HTMLParser)'.
I use HTML::Strip.
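To illustrate why stripping matters, here is a small sketch of pulling only the human-readable text out of an HTML mail body. It uses Python's standard `html.parser` instead of HTML::Strip and is not how assp does it. The class and function names are made up for the example.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text nodes while skipping <style> and <script> content."""

    SKIP = {"style", "script"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def visible_text(html):
    """Return the human-readable words of an HTML fragment."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

mail = ("<html><head><style>p{color:red}</style><script>var x=1;</script>"
        "</head><body><p>Your fax has arrived</p></body></html>")
print(visible_text(mail))
```

In this example the CSS and script text never reach the word list, so only the few readable words would be counted against a byte limit such as MaxBytes.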
My current setting for MaxBytes (20,000) is only part of a long-running
trial. I want to see how detection works with settings from 20,000 to
50,000 bytes in 10,000-byte steps. Each setting is used for ~1 month.
MaxBytes 50,000 has passed the test and was perfect - as expected -
because 100% of spam mails (without an attachment) are fully analyzed and
detected. However, this setting leads to a ~25% performance penalty for
the rebuild task (relative to 20,000) using my corpus.
>CPU Model: Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz
A nice CPU - but with ASSP's single-threaded rebuild task it is slower
than my older Intel(R) Xeon(R) CPU X5680 @ 3.33GHz.
http://cpuboss.com/cpus/Intel-Xeon-X5680-vs-Intel-Xeon-E5-2640-v2
Colin, don't worry about the overall rebuild speed. It runs at night and
it doesn't hurt if it takes an hour more or less. Two steps are time
critical: populating spamDB and populating HMMdb. As you said, "The db part
looks to be fine". But wait ....
It looks like there is something wrong with the temporary rebuild
databases used for HMM. This can also be the cause of a very, very slow
rebuild.
>The rebuild was actually quicker a while back, maybe 40m
>2018-01-05 00:07:42 Start populating Hidden Markov Model. HMM-check is
disabled for this time!
>2018-01-05 00:07:43 Total processing time: 7,663 second(s)
This is a time difference of ONE second - totally impossible - even if
HMMdb is held in RAM !!!!
Is it right that you use BerkeleyDB for the rebuild? If so,
check the 'tmpDB/rebuildDB/BDB-error.txt' file. It should be zero bytes
long!
If in doubt: shut down assp, clean the folder 'tmpDB/rebuildDB/', start
assp, run a rebuild.
Thomas
From: "Colin Waring" <co...@do...>
To: "ASSP development mailing list" <ass...@li...>
Date: 05.01.2018 21:14
Subject: Re: [Assp-test] Meltdown/Spectre
From: Thomas Eckardt [mailto:Tho...@th...]
Sent: 05 January 2018 17:16
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
>>time 7,663 seconds, data 486.61 Mbyte
>This is very slow. To be honest - I'm lost for words!
>My rebuild results are:
Mine are very different
2018-01-04 22:00:00 Maxbytes: 6,000
2018-01-05 00:03:00 start populating Spamdb with 2,466,760 records -
Bayesian check is now disabled!
2018-01-05 00:07:42 Start populating Hidden Markov Model. HMM-check is
disabled for this time!
2018-01-05 00:07:43 Total processing time: 7,663 second(s)
2018-01-05 00:07:43 Total processing data: 486.61 Mbyte
2018-01-05 00:08:37 Uploading Griplist via Direct Connection
The db part looks to be fine considering the times and the extra records
that mine added. I’m wondering why I have so many more records when
Maxbytes is less and the total data is less.
My two MX have directly mounted Gluster replicas running off a Fibre
channel SAN and the rebuild only runs on one.
I have a 4GB tmpDB mounted as tmpfs:
tmpfs 4.0G 1.3G 2.8G 32%
/usr/local/assp/tmpDB
Hardware for each is Citrix XenServer 7.2 running on HP DL servers
CPU Model: Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz
112GB RAM in each with 12GB allocated to each VM
Hard drives aren’t SSD but are on a 1+0 array – I forget how many drives
are in it but there’s a few. SAN is a Dell Powervault, I’d need to check
on the spec.
The VMs are Ubuntu 16.04.3 LTS
16 cores allocated in 4 socket with 4 cores per socket
Primary
top - 20:02:52 up 82 days, 3:40, 1 user, load average: 0.41, 0.18, 0.11
Tasks: 241 total, 1 running, 240 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 0.0 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si,
0.0 st
KiB Mem : 12318500 total, 180648 free, 6131216 used, 6006636
buff/cache
KiB Swap: 8253436 total, 7765076 free, 488360 used. 5702644 avail Mem
Secondary/rebuild
top - 20:02:30 up 66 days, 6:59, 2 users, load average: 0.05, 0.05,
0.07
Tasks: 250 total, 1 running, 249 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 0.1 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si,
0.0 st
KiB Mem : 12318500 total, 448412 free, 7276144 used, 4593944
buff/cache
KiB Swap: 8253436 total, 6071240 free, 2182196 used. 3396112 avail Mem
ASSP uses 2.3g memory
Clamd about 1G
Gluster 2.2G
Perl is v5.22.1. I believe 5.26 is coming in 18.04 LTS at the end of April
according to the release schedule. I’ll plan an upgrade sometime after
that.
The rebuild was actually quicker a while back, maybe 40m but one of the
version changes must have had an impact. I couldn’t say which though as I
only really keep an eye on the amount of data processed and the
norm/confidence.
>From my point of view the real bottleneck for the rebuild task is that
>only one core (thread) is used by this task, even if there are 12 or more
>available.
>Because of this (my bad) software design, the speed of a single core
>matters too much. I have been thinking about changing this for a while.
>I hope I'll get this fixed/improved in 2018.
Improvements are always welcome to make a great product even better.
I hope 2018 is good to you.
All the best,
Colin.
From: "Colin Waring" <co...@do...>
To: "ASSP development mailing list" <ass...@li...>
Date: 05.01.2018 16:01
Subject: Re: [Assp-test] Meltdown/Spectre
Hi Thomas,
Thank you for the input – I do recall previously discussing ISP mode and
realising that it was for bigger deployments than ours.
We have three servers. Two handling inbound and one specifically for
Office 365 relaying. The two inbound probably do about 50,000 messages per
day between them according to infostats.
CPU Usage on both frontends is 1.62% avg and 1.49% avg respectively. I
only have a single MySQL db (general load average is around 0.1 ) and I’ve
been watching the hypervisor reports on its performance. I did set up a
Gluster sync between the two frontends so they have access to the same
corpus without having to do it over the network – that helped with
performance however I’ve never been able to get the rebuild run to be
particularly quick (Last night’s was total processing time 7,663 seconds,
data 486.61 Mbyte). I haven’t brought it up here because it doesn’t really
have much of an effect and it is likely in my setup rather than an ASSP
issue.
So I think I’ll get away with it on my setup, hopefully this information
will be helpful to other people who are trying to figure out if they’ll be
impacted.
All the best,
Colin Waring.
From: Thomas Eckardt [mailto:Tho...@th...]
Sent: 05 January 2018 13:49
To: ASSP development mailing list <ass...@li...>
Subject: Re: [Assp-test] Meltdown/Spectre
I remember an issue at an ISP that used 10 assp instances with one
enterprise MySQL backend cluster, sharing all tables across all instances.
In times of heavy workload (100,000 or even more mails per hour), the MySQL
server was brought to its knees - no matter how many physical resources
were made available. Even holding the complete assp DB in the DB server's
RAM did not solve the problem.
With 100,000 mails per hour and ~50 DB queries per mail (HMMdb and
spamDB), the DB server has to process at least 5 million queries in one
hour.
If we exclude HMMdb and spamDB, depending on the configuration, there can
be an additional 10 to 20 DB queries per mail (for all the other DB
tables). Even this can lead to a very high DB workload!
The URIBL check can also be very resource-expensive (read and write !!!).
Assume a mail with 100 different URIs is seen for the first time: 100
unsuccessful cache DB queries, followed by 100 DNS queries, followed by
100 cache DB writes.
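The load estimate above is plain multiplication. Written out as a sketch (the function names are only for illustration):

```python
def queries_per_hour(mails_per_hour, queries_per_mail):
    """DB queries the shared backend must answer per hour."""
    return mails_per_hour * queries_per_mail

def uribl_ops_first_seen(distinct_uris):
    """Operations for a mail whose URIs are all seen for the first time:
    one cache miss, one DNS query and one cache write per URI."""
    return {
        "cache_reads": distinct_uris,
        "dns_queries": distinct_uris,
        "cache_writes": distinct_uris,
    }

print(queries_per_hour(100_000, 50))  # HMMdb and spamDB lookups
print(queries_per_hour(100_000, 10))  # lower bound for the other tables
print(uribl_ops_first_seen(100))
```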
To prevent this issue, assp V2 has a built-in ISP mode for HMMdb and
spamDB.
In short:
- the corpus of all instances is synchronized to a master instance (rsync
for example)
- HMMdb and spamDB are held in memory in each instance and each worker
- HMMdb and spamDB are built on the master system and are distributed as
files to all other instances using an external script (method of your
choice)
- all other tables are shared in the traditional way - but each instance
uses a configurable DB cache to prevent repeated DB queries for the same
results (for example IP checks, helo ....)
This ISP mode requires at least 16GB RAM per instance if a maximum of 15
SMTP workers is used. Using more than 15 workers in an instance produces
a large overhead without any performance improvement.
Colin, I don't know the workload and configuration of your systems - but
the math is simple.
A possible solution between the standard mode and the ISP mode could be:
- each assp instance has its own DB backend
- all DB backends are synchronized bidirectionally (asynchronously) with a
DB master server cluster
Depending on the overall workload, the DB master server cluster must be an
enterprise cluster or something like that.
If we assume 10 assp instances, each record change in one instance will
lead to one store op and nine write sync ops at the master cluster!
If we assume five DB write ops per mail -> 100,000 mails/h across all
instances -> 500,000 store ops/h + 4.5M sync ops/h at the master cluster.
Yes - the workload at the cluster will be very high, but it is no longer
time critical and will balance out over time.
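The master cluster figures can be reproduced the same way. In this sketch the function name is hypothetical and the fan-out of "instances minus one" sync writes per store is taken from the description above:

```python
def master_cluster_ops(mails_per_hour, writes_per_mail, instances):
    """Return (store_ops, sync_ops) per hour at the master cluster.

    Every record change is stored once and then synced to each of the
    other instances.
    """
    store_ops = mails_per_hour * writes_per_mail
    sync_ops = store_ops * (instances - 1)
    return store_ops, sync_ops

store, sync = master_cluster_ops(100_000, 5, 10)
print(store, sync)
```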
The disadvantage is that the tables in all instances are never 100%
in sync, and the last instance "wins" when writing the same DB record. The
out-of-sync state of the tables across all DB backends grows with the
overall workload.
You may also think about a ring synchronization between the 10 assp
DB backends. The cluster would not be required and the DB backends would
have a manageable workload - but the delay in syncing a single record and
the data inconsistency across all instances would increase.
Thomas
From: "Colin Waring" <co...@do...>
To: "ASSP development mailing list" <ass...@li...>
Date: 05.01.2018 10:45
Subject: [Assp-test] Meltdown/Spectre
Hi All,
I’m wondering if anyone has updated their ASSP/db backends and monitored
the performance impact yet.
I’m currently working on assessing just how bad this is going to be with
how many systems I’ve got to coordinate hypervisor/OS/microcode updates on
so I’m checking around with everyone to see who’s already got some
answers.
All the best,
Colin Waring.
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Assp-test mailing list
Ass...@li...
https://lists.sourceforge.net/lists/listinfo/assp-test
DISCLAIMER:
*******************************************************
This email and any files transmitted with it may be confidential, legally
privileged and protected in law and are intended solely for the use of the
individual to whom it is addressed.
This email was multiple times scanned for viruses. There should be no
known virus in this email!
*******************************************************
|