Thread: [Mixmaster-devel] binary id.log
From: <dis...@sa...> - 2002-04-23 07:49:32
Attachments:
mix29b33idlog.dif
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

I got one old 486/133 and set up linux and mixmaster,
now I'm doing some tests, it can process about 2000 msgs/hour,
seems enough.

But then I found that it spends more time searching for the
message id in id.log than en/decrypting the message.
As I couldn't see any reason why id.log is in ascii format
I converted it to binary and now it is twice as fast:

ascii id.log 86680 entries - 1.8 seconds
binary id.log 115300 entries - 1.2 seconds
rsa 1024 decryption - 0.22 seconds

Also id.log takes less space - one entry uses only 20 bytes instead of 44

Patch for binary id.log in attachment

__
Disastry http://disastry.dhs.org/
-----BEGIN PGP SIGNATURE-----
Version: Netscape PGP half-Plugin 0.15 by Disastry / PGPsdk v1.7.1

iQA/AwUBPMT1gjBaTVEuJQxkEQNO8wCg4J6p6YEFiRn32dskSJCVGUM6d9AAnizp
c25YzGno7tEqeK6wRLQWx+kf
=6Gu0
-----END PGP SIGNATURE-----
From: Bill S. <bil...@po...> - 2002-04-23 09:23:50
|
A long, long time ago, in a Usenet far, far away,
Henry Spencer was not known for being one of the foreigners writing IPSEC,
but for things like being the sysadmin *still* running that PDP-11/44 at UTZOO
and for making C News go much faster than B News did.
It's amazing how much you can speed things up by choosing appropriate
data formats and data structures for your applications,
and by not doing calculations you don't need.
Also Jon Bentley's "Programming Pearls" was an excellent reference on the field.

Depending on how the log file gets managed,
it may be that binary data is a good choice,
or it may not be hard to use a simple database like DB
to keep things in b-trees or other fast searching/sorting structures.
It's probably not convenient to store it sorted (sigh).
Berkeley DB is supported commercially by Sleepycat Software www.sleepycat.com
under a semi-Berkeley-flavored license for non-commercial users,
and it has some free predecessors like -libdbm.

On the other hand, ASCII formats can be much more convenient for some things,
e.g. if you ever want to grep through them for contents.

On one mail server I use, which is probably a P200ish box,
it took me 0.036-0.039 seconds to grep for things in /usr/share/dict/words,
which had ~235000 entries and 2.5MB, so it may be that
there's something about the way you're searching.
("time wc words" reported taking 0.175 sec to run...)
Surprisingly "look" took 0.019 seconds to run the first time,
though 0.005 in future runs. It's got some big advantages from sorting,
but probably it also gained some from pulling the file from disk into cache.

At 09:49 AM 04/23/2002 +0200, dis...@sa... wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: RIPEMD160
>
>I got one old 486/133 and set up linux and mixmaster,
>now I'm doing some tests, it can process about 2000 msgs/hour,
>seems enough.
>
>But then I found that it spends more time searching for the
>message id in id.log than en/decrypting the message.
>As I couldn't see any reason why id.log is in ascii format
>I converted it to binary and now it is twice as fast:
>
>ascii id.log 86680 entries - 1.8 seconds
>binary id.log 115300 entries - 1.2 seconds
>rsa 1024 decryption - 0.22 seconds
>
>Also id.log takes less space - one entry uses only 20 bytes instead of 44
>
>Patch for binary id.log in attachment
>
>__
>Disastry http://disastry.dhs.org/
>-----BEGIN PGP SIGNATURE-----
>Version: Netscape PGP half-Plugin 0.15 by Disastry / PGPsdk v1.7.1
>
>iQA/AwUBPMT1gjBaTVEuJQxkEQNO8wCg4J6p6YEFiRn32dskSJCVGUM6d9AAnizp
>c25YzGno7tEqeK6wRLQWx+kf
>=6Gu0
>-----END PGP SIGNATURE-----
From: cmeclax po'u le cmevi'u ke'u. <cm...@gm...> - 2002-04-23 13:28:22
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

de'i Tuesday 23 April 2002 04:50 la Bill Stewart cusku di'e
> Depending on how the log file gets managed,
> it may be that binary data is a good choice,
> or it may not be hard to use a simple database like DB
> to keep things in b-trees or other fast searching/sorting structures.
> It's probably not convenient to store it sorted (sigh).
> Berkeley DB is supported commercially by Sleepycat Software
> www.sleepycat.com under a semi-Berkeley-flavored license for non-commercial
> users,
> and it has some free predecessors like -libdbm.

The operations in this case are:
1. Given an ID, find whether it is in the log.
2. Given a time, remove all entries older than that.

2 need be done only occasionally, whereas 1 is done whenever a message is
processed. So hashing the log by the ID is the thing to do.

What I'm going to do for pingstats (once some work that has kept me very busy
slows down) is compute from the random ping ID a sequence of positions in the
file and look only at those. For this, you could take the ID and extract two
numbers, then know that that ID is logged in position 197, 457, 723, 988,
etc., and check those locations, and if you find an expired record overwrite
it with the new one.

cmeclax
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8xWFk3/k1hdmG9jMRAuXkAKCQab5xA6I0jWA+29t/4PAsJaEdRACdG0Ro
1ynzpjbbV/2vy0Rx8ZAra+Y=
=2/ok
-----END PGP SIGNATURE-----
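
[Editorial sketch] The idea of turning an ID into a sequence of file positions can be illustrated with ordinary double hashing. This is only one possible reading of the message above, not cmeclax's actual pingstats code: the names `load32` and `probe_positions`, the choice of which ID bytes feed the two numbers, and the prime slot count are all assumptions made for illustration. The example positions in the message are not evenly spaced, so the scheme he had in mind may well differ.

```c
#include <assert.h>
#include <stdint.h>

#define ID_LEN 16  /* assumed ID length in bytes */

/* Read four bytes of the ID as a big-endian 32-bit number. */
static uint32_t load32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Fill pos[0..nprobes-1] with the record slots to inspect for this ID.
 * Two numbers are extracted from the ID: a starting slot and a step.
 * If nslots is prime, any step in 1..nslots-1 visits distinct slots
 * until the sequence has covered the whole file.
 */
static void probe_positions(const unsigned char id[ID_LEN], uint32_t nslots,
                            uint32_t *pos, int nprobes)
{
    uint32_t start, step;
    int i;

    assert(nslots > 2);
    start = load32(id) % nslots;
    step = 1 + load32(id + 4) % (nslots - 1);
    for (i = 0; i < nprobes; i++)
        pos[i] = (uint32_t)((start + (uint64_t)i * step) % nslots);
}
```

A lookup would read only the records at `pos[i] * record_size` and stop at the first match; an insert would overwrite the first expired or empty slot in the same sequence, as the message describes.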
From: Len S. <ra...@qu...> - 2002-04-30 01:59:07
|
On Tue, 23 Apr 2002, Bill Stewart wrote:

> Depending on how the log file gets managed,
> it may be that binary data is a good choice,
> or it may not be hard to use a simple database like DB
> to keep things in b-trees or other fast searching/sorting structures.
> It's probably not convenient to store it sorted (sigh).
> Berkeley DB is supported commercially by Sleepycat Software www.sleepycat.com
> under a semi-Berkeley-flavored license for non-commercial users,
> and it has some free predecessors like -libdbm.
>
> On the other hand, ASCII formats can be much more convenient for some things,
> e.g. if you ever want to grep through them for contents.

id.log exists for replay attack protection, and isn't really useful for
the administrator to see. Storing it in binary seems perfectly fine, and I
am slightly surprised it wasn't already.

Disastry, when you get that patch cleaned up to work on 64 bit systems,
drop me a note. This is definitely a change we want to make. (I agree with
using a db hash, also.)
From: <dis...@sa...> - 2002-05-20 18:03:21
Attachments:
mix29b33idlog2.dif
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

Len Sassaman wrote:
> Disastry, when you get that patch cleaned up to work on 64 bit systems,
> drop me a note. This is definitely a change we want to make. (I agree with
> using a db hash, also.)

here is the new patch, it uses sizeof()s,
and idlog entry now is

    typedef struct {
        char id[16];
        long time;
    } idlog_t;

instead of

    long idbuf[5];

in previous patch

so it should work on 64 bit systems too

__
Disastry http://disastry.dhs.org/
-----BEGIN PGP SIGNATURE-----
Version: Netscape PGP half-Plugin 0.15 by Disastry / PGPsdk v1.7.1

iQA/AwUBPOkeMzBaTVEuJQxkEQNVBQCgxl7Oq6fBcKCCuta8oZsCs7KstSkAoNGQ
Yy+wbZ3Iyg0A39cj3gStg0YD
=qub8
-----END PGP SIGNATURE-----
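
[Editorial sketch] To show how a fixed-size binary record like the `idlog_t` quoted above can be written and searched with `sizeof`-based I/O, here is a minimal sketch. It is not the code from the attached patch, which is not reproduced in this thread; `idlog_append` and `idlog_contains` are hypothetical helper names, and the linear scan is an assumption. Note that because `long` is 4 bytes on some platforms and 8 on others, a file written this way is only readable by a build with the same `long` size and struct padding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Record layout as quoted in the message above. */
typedef struct {
    char id[16];
    long time;
} idlog_t;

/* Append one record to the end of the log; returns 1 on success. */
static int idlog_append(FILE *f, const char id[16], long when)
{
    idlog_t e;

    assert(f != NULL);
    memset(&e, 0, sizeof e);            /* also zeroes any padding bytes */
    memcpy(e.id, id, sizeof e.id);
    e.time = when;
    if (fseek(f, 0L, SEEK_END) != 0)
        return 0;
    return fwrite(&e, sizeof e, 1, f) == 1;
}

/* Scan the whole log for an ID; returns 1 if it is present. */
static int idlog_contains(FILE *f, const char id[16])
{
    idlog_t e;

    assert(f != NULL);
    rewind(f);
    while (fread(&e, sizeof e, 1, f) == 1) {
        if (memcmp(e.id, id, sizeof e.id) == 0)
            return 1;                   /* replayed packet ID */
    }
    return 0;
}
```

Both helpers expect a stream opened for binary update (for example mode `"r+b"` or `"w+b"`). Comparing 16 raw bytes with `memcmp` avoids the text parsing that an ASCII log needs for every entry.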
From: Len S. <ra...@qu...> - 2002-04-23 09:48:39
|
On Tue, 23 Apr 2002 dis...@sa... wrote:

> As I couldn't see any reason why id.log is in ascii format
> I converted it to binary and now it is twice as fast:

Excellent.

BTW, while you're digging around in that part of the code, would you mind
looking at the feasibility of having a config file option for the length
of time an entry stays in id.log?

--Len.
From: <dis...@sa...> - 2002-04-23 09:13:01
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

Len Sassaman wrote:
> On Tue, 23 Apr 2002 dis...@sa... wrote:
>
> > As I couldn't see any reason why id.log is in ascii format
> > I converted it to binary and now it is twice as fast:
>
> Excellent.
>
> BTW, while you're digging around in that part of the code, would you mind
> looking at the feasibility of having a config file option for the length
> of time an entry stays in id.log?
> --Len.

isn't there such option already?
I think IDEXP does (or is supposed to do...) this:

IDEXP   Mixmaster keeps a log of packet IDs to prevent
        replay attacks. IDEXP specifies after which period
        of time old IDs are expired. Default: 7d, minimum:
        5d. If set to 0, no log is kept.

__
Disastry http://disastry.dhs.org/
-----BEGIN PGP SIGNATURE-----
Version: Netscape PGP half-Plugin 0.15 by Disastry / PGPsdk v1.7.1

iQA/AwUBPMUJVDBaTVEuJQxkEQNcvwCfRpkc0PmAUzwPFi/u3j3culM9MRsAn09k
5qaSnuDZHyim9Y0LE1HM+hGe
=ffJ5
-----END PGP SIGNATURE-----
From: Len S. <ra...@qu...> - 2002-04-23 09:20:19
|
On Tue, 23 Apr 2002 dis...@sa... wrote:

> Len Sassaman wrote:
> > On Tue, 23 Apr 2002 dis...@sa... wrote:
> >
> > > As I couldn't see any reason why id.log is in ascii format
> > > I converted it to binary and now it is twice as fast:
> >
> > Excellent.
> >
> > BTW, while you're digging around in that part of the code, would you mind
> > looking at the feasibility of having a config file option for the length
> > of time an entry stays in id.log?
> > --Len.
>
> isn't there such option already?
> I think IDEXP does (or is supposed to do...) this:
>
> IDEXP   Mixmaster keeps a log of packet IDs to prevent
>         replay attacks. IDEXP specifies after which period
>         of time old IDs are expired. Default: 7d, minimum:
>         5d. If set to 0, no log is kept.
>

Erm, yes. It's way too late for me to be still awake.

What I meant to say was...

I'd like to be able to specify a minimum size that the log file must reach
(or a minimum number of entries it must contain) before the time limit on
id log entries starts expiring them.

(Given such a speed improvement from storing it as binary, one would
expect it to be okay for the log file to be far larger than it was
previously.)

--Len.
From: Peter P. <pe...@pa...> - 2002-04-24 11:25:22
|
On Tue, 23 Apr 2002, Len Sassaman wrote:

> > isn't there such option already?
> > I think IDEXP does (or is supposed to do...) this:
> >
> > IDEXP   Mixmaster keeps a log of packet IDs to prevent
> >         replay attacks. IDEXP specifies after which period
> >         of time old IDs are expired. Default: 7d, minimum:
> >         5d. If set to 0, no log is kept.
> >
>
> Erm, yes. It's way too late for me to be still awake.
>
> What I meant to say was...
>
> I'd like to be able to specify a minimum size that the log file must reach
> (or a minimum number of entries it must contain) before the limits on the
> length of time an id log entry expires.

Why? Messages older than IDEXP are dropped anyway (if the client used
the optional sent date in the mix message - all clients that I know do
it).

yours,
peter

--
PGP signed and encrypted   |  .''`.  ** Debian GNU/Linux **
messages preferred.        | : :' :  The universal
                           | `. `'   Operating System
http://www.palfrader.org/  |   `-    http://www.debian.org/
From: cmeclax po'u le cmevi'u ke'u. <cm...@gm...> - 2002-04-23 13:08:11
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

de'i Tuesday 23 April 2002 03:49 la dis...@sa... cusku di'e
> Also id.log takes less space - one entry uses only 20 bytes instead of 44

Are you using four bytes, eight bytes, or whatever the OS uses for the time?
Four bytes will lead to 2038-problems.

cmeclax
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8xVyS3/k1hdmG9jMRAq+oAJ9Z3n0zf5xHaQthYpC2GG8QnTGFMwCeI0fI
97rXxbSMEZc+cEWdAzRDyqY=
=Wt64
-----END PGP SIGNATURE-----
From: <dis...@sa...> - 2002-04-23 13:31:01
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

cmeclax po'u le cmevi'u ke'umri wrote:
> de'i Tuesday 23 April 2002 03:49 la dis...@sa... cusku di'e
> > Also id.log takes less space - one entry uses only 20 bytes instead of 44
>
> Are you using four bytes, eight bytes, or whatever the OS uses for the time?

4 bytes, actually long.
hmm.. so my patch will not work on 64-bit cpu properly :-/

> Four bytes will lead to 2038-problems.

I don't care :->>

__
Disastry http://disastry.dhs.org/
-----BEGIN PGP SIGNATURE-----
Version: Netscape PGP half-Plugin 0.15 by Disastry / PGPsdk v1.7.1

iQA/AwUBPMVFxDBaTVEuJQxkEQPkwgCffpzIwbv9/m4/lwHbvmWSJu5128UAoO0z
yW3Ku7yYf5HwwpROQd69uHCy
=idc7
-----END PGP SIGNATURE-----
From: cmeclax po'u le cmevi'u ke'u. <cm...@gm...> - 2002-04-23 13:53:25
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

de'i Tuesday 23 April 2002 09:30 la dis...@sa... cusku di'e
> > Four bytes will lead to 2038-problems.
>
> I don't care :->>

Actually, if you use circular comparison (subtract the times and check the
sign) it won't cause a 2038-problem, except in the highly unlikely case that
a message is delivered to the next remailer more than 68 years after it was
sent and the remailer hasn't changed address or key in that time.

cmeclax
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8xWdH3/k1hdmG9jMRAg/lAJ9ox1QAnUNwOP3a9dxOoV65opEeNACdEUhd
rp0vLuh9JMcnUA4RvEkvAak=
=CD1z
-----END PGP SIGNATURE-----
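
[Editorial sketch] The circular comparison described above (subtract the times and check the sign) might look like the following when the timestamps are kept as 32-bit values. The function names `time_before` and `id_expired` are hypothetical, and nothing here is taken from Mixmaster itself. The sketch assumes the usual two's-complement conversion from `uint32_t` to `int32_t`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Nonzero if timestamp a is earlier than timestamp b, treating 32-bit
 * times as points on a circle. The answer is meaningful as long as the
 * two times are less than 2^31 seconds (about 68 years) apart, even if
 * the counter has wrapped between them.
 */
static int time_before(uint32_t a, uint32_t b)
{
    /* Unsigned subtraction wraps; the sign of the result gives the order. */
    return (int32_t)(uint32_t)(a - b) < 0;
}

/* Nonzero if an entry logged at `logged` is older than max_age at `now`. */
static int id_expired(uint32_t logged, uint32_t now, uint32_t max_age)
{
    assert(max_age < 0x80000000u);
    return time_before((uint32_t)(logged + max_age), now);
}
```

For example, `time_before(0xFFFFFFF0u, 0x10u)` is true even though the first value is numerically larger, because the counter is taken to have wrapped between the two timestamps.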