bogofilter-cvs Mailing List for bogofilter -- Fast Bayesian Spam Filter
Fast Bayesian spam filter along lines suggested by Paul Graham
Brought to you by:
m-a
From: <m-...@us...> - 2019-05-19 13:55:20
Revision: 7079
          http://sourceforge.net/p/bogofilter/code/7079
Author:   m-a
Date:     2019-05-19 13:55:14 +0000 (Sun, 19 May 2019)
Log Message:
-----------
Lock out SVN repo builds after Git conversion. Update README.

Modified Paths:
--------------
    trunk/bogofilter/README
    trunk/bogofilter/README.svn
    trunk/bogofilter/configure.ac

Modified: trunk/bogofilter/README
===================================================================
--- trunk/bogofilter/README  2019-01-26 08:56:47 UTC (rev 7078)
+++ trunk/bogofilter/README  2019-05-19 13:55:14 UTC (rev 7079)
@@ -132,9 +132,11 @@
 The latest stable version can be downloaded. The development source is
 in a Subversion repository on SourceForge. To download the latest SVN
 source, cd to the directory to which you wish to download and type
-the following command:
+EITHER ONE of the following commands:
 
-    svn checkout http://svn.code.sf.net/p/bogofilter/code/trunk bogofilter
+    git clone https://gitlab.com/mandree/bogofilter.git bogofilter-git
+    OR
+    git clone git://git.code.sf.net/p/bogofilter/git bogofilter-git
 
 b. Building & Installing
 ------------------------
@@ -142,20 +144,14 @@
 To compile and install the standard configuration from a tarball, use
 the following commands:
 
-    cd bogofilter-1.0.1 [change to project directory]
+    cd bogofilter-1.2.5 [change to project directory]
     ./configure         [add configure options as required]
     make all check
     make install        [as root]
 
-To compile and install the standard configuration from SVN, use
-the following commands:
+To compile and install the standard configuration from Git, see
+README.git.
 
-    cd bogofilter       [change to project directory]
-    ./autogen.sh        [add configure options as required]
-    make install        [as root]
-
-Be sure to read README.svn for additional preparations required.
-
 You will need a full set of development tools installed to be able to
 run autogen.sh, including recent automake (1.8) and autoconf (2.59).
 configure does not have these requirements.

Modified: trunk/bogofilter/README.svn
===================================================================
--- trunk/bogofilter/README.svn  2019-01-26 08:56:47 UTC (rev 7078)
+++ trunk/bogofilter/README.svn  2019-05-19 13:55:14 UTC (rev 7079)
@@ -1,26 +1,17 @@
 README.svn -- How to build bogofilter from Subversion (SVN)
 
-$Id$
-(C) 2002,2007,2009 by Matthias Andree. Freely distributable according to
-the terms of the GNU Free Documentation License 1.0. No front- or
-back-matter parts, no invariant parts.
--------------------------------------------------------------------------
+Bogofilter's repository has been converted from SVN to Git on 2019-05-19.
+Please check out from Git instead:
 
-After you have checked out bogofilter from SVN, some files are missing,
-for example, configure, ylwrap and others.
+== New repo on GitLab: https://gitlab.com/mandree/bogofilter ==
+git clone https://gitlab.com/mandree/bogofilter.git bogofilter-git
+OR
+git clone gi...@gi...:mandree/bogofilter.git bogofilter-git
 
-These files can be created automatically with recent autoconf and
-automake versions. You will need autoconf 2.60 and automake 1.9 or
-newer.
+== Dev access on SourceForge: ==
+git clone ssh://m-...@gi.../p/bogofilter/git bogofilter-git
+OR
+git clone https://m-...@gi.../p/bogofilter/git bogofilter-git
 
-To recreate these files, run: autoreconf -i -s -f
-You can optionally add -v to see what autoreconf is doing.
-
-Then proceed as usual: ./configure && make && make check
-and so on.
-
-For less verbose output: ./configure --quiet && make -s && make -s check
-
-Have fun!
--------------------------------------------------------------------------
-end of README.svn
+== Read-only access on SourceForge: ==
+git clone git://git.code.sf.net/p/bogofilter/git bogofilter-git

Modified: trunk/bogofilter/configure.ac
===================================================================
--- trunk/bogofilter/configure.ac  2019-01-26 08:56:47 UTC (rev 7078)
+++ trunk/bogofilter/configure.ac  2019-05-19 13:55:14 UTC (rev 7079)
@@ -20,6 +20,11 @@
 dnl
 AC_INIT([bogofilter],[1.2.5])
 dnl
+AC_MSG_ERROR([
+===========================================================================
+The repository has been converted to Git, this SVN is no longer maintained.
+===========================================================================
+])
 AC_PREREQ([2.68])
 AC_USE_SYSTEM_EXTENSIONS
 AC_CONFIG_SRCDIR([src/bogofilter.c])

This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
From: <m-...@us...> - 2019-01-26 08:56:57
Revision: 7078
          http://sourceforge.net/p/bogofilter/code/7078
Author:   m-a
Date:     2019-01-26 08:56:47 +0000 (Sat, 26 Jan 2019)
Log Message:
-----------
LMDB driver: stricter overflow checking. Provided by Steffen Nurpmeso.

Modified Paths:
--------------
    trunk/bogofilter/src/datastore_lmdb.c

Modified: trunk/bogofilter/src/datastore_lmdb.c
===================================================================
--- trunk/bogofilter/src/datastore_lmdb.c  2018-07-20 23:29:42 UTC (rev 7077)
+++ trunk/bogofilter/src/datastore_lmdb.c  2019-01-26 08:56:47 UTC (rev 7078)
@@ -5,7 +5,7 @@
  * datastore_lmdb.c -- implements the datastore, using LMDB.
  *
  * AUTHORS:
- * Steffen Nurpmeso <st...@sd...> 2018
+ * Steffen Nurpmeso <st...@sd...> 2018, 2019
  * (copied from datastore_kc.c:
  * Gyepi Sam <gy...@pr...> 2003
  * Matthias Andree <mat...@gm...> 2003, 2018
@@ -24,11 +24,11 @@
  * reaches the size limit, the transaction must be aborted, then the
  * environment must be resized, then a new transaction has to be created.
  * Resizing will not shrink, effectively.
- * 3. We assume xmalloc() aborts if out of memory.
- * 4. We assume no token->leng actually exceeds int32_t.
- * 5. mdb_env_get_maxkeysize():
+ * 3. mdb_env_get_maxkeysize():
  * Depends on the compile-time constant #MDB_MAXKEYSIZE. Default 511.
  * We reject any keys which excess this.
+ * 4. We assume xmalloc() aborts if out of memory.
+ * 5. We assume no token->leng actually exceeds int32_t.
  *
  * In order to be able to deal with 2. we need to track all changes that are
  * performed in a txn, so that in case we are running against the wall we are
@@ -114,8 +114,8 @@
 struct a_bflm_txn_cache{
    struct a_bflm_txn_cache *bflmtc_last; /* Up-to-date (stack usage) */
    struct a_bflm_txn_cache *bflmtc_next; /* Needs to be build before use! */
-   char *bflmtc_caster; /* Current caster */
-   char *bflmtc_max; /* Maximum usable byte, exclusive */
+   char *bflmtc_caster; /* Current caster */
+   char *bflmtc_max; /* ..imum usable byte, exclusive */
    /* Actually points to &self[1] TODO [0] or [8], dep. __STDC_VERSION__! */
    char *bflmtc_data;
 };
@@ -615,19 +615,25 @@
    char const *emsg;
    size_t kl, vl, i;
 
-   kl = key->mv_size;
+   if((kl = key->mv_size) >= 0x7FFFFFFFu)
+      goto jeoverflow;
    if(val_or_null != NULL){
-      vl = val_or_null->mv_size;
-      i = (2 * sizeof(uint32_t)) + kl + vl;
+      if((vl = val_or_null->mv_size) >= 0x7FFFFFFFu)
+         goto jeoverflow;
+      if((i = kl + vl) >= 0x7FFFFFFFu - 2 * sizeof(uint32_t))
+         goto jeoverflow;
+      i += 2 * sizeof(uint32_t);
    }else{
       vl = 0;
-      i = sizeof(uint32_t) + kl;
+      if((i = kl) >= 0x7FFFFFFFu - sizeof(uint32_t))
+         goto jeoverflow;
+      i += sizeof(uint32_t);
    }
    i = a_BFLM_TXN_CACHE_ALIGN(i);
 
    /* XXX We actually should abort() the program instead: cannot be handled */
-   if(kl >= 0x7FFFFFFFu || vl >= 0x7FFFFFFFu ||
-         i >= 0x7FFFFFFFu - sizeof(*bflmtcp)){
+   if(i >= 0x7FFFFFFFu - sizeof(*bflmtcp)){
+jeoverflow:
       emsg = "LMDB: entry too large to be stored";
       goto jleave;
    }
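The ordering of the checks in r7078 matters: each length is rejected before it takes part in an addition, so no intermediate sum can wrap around. The following is a minimal sketch of that idea in isolation, not the driver's code. The names `entry_size` and `ENTRY_LIMIT` and the `has_val` parameter are hypothetical; only the 0x7FFFFFFF bound and the one-or-two `uint32_t` length headers are taken from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit mirroring the 0x7FFFFFFFu bound used in the diff. */
#define ENTRY_LIMIT ((size_t)0x7FFFFFFFu)

/* Return header + key + value size, or 0 if the entry is too large.
 * has_val selects two length headers (insert) or one (delete). */
static size_t entry_size(size_t kl, size_t vl, int has_val){
   size_t hdr = (has_val ? 2 : 1) * sizeof(uint32_t);
   size_t i;

   /* The header must be smaller than the limit, otherwise the
    * subtraction further down would itself underflow. */
   assert(hdr < ENTRY_LIMIT);

   if(kl >= ENTRY_LIMIT)
      return 0;
   if(!has_val)
      vl = 0;
   else if(vl >= ENTRY_LIMIT)
      return 0;

   /* Both operands are below 2^31, so the sum stays below 2^32 and
    * cannot wrap even where size_t is only 32 bits wide. */
   i = kl + vl;
   if(i >= ENTRY_LIMIT - hdr)
      return 0;
   return i + hdr;
}
```

Calling `entry_size(3, 5, 1)` yields the two lengths plus two headers, while any operand at or above the limit yields 0. Returning 0 as the error value is a simplification chosen for this sketch; the driver jumps to an error label instead.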
From: <m-...@us...> - 2018-07-20 23:29:44
|
Revision: 7077 http://sourceforge.net/p/bogofilter/code/7077 Author: m-a Date: 2018-07-20 23:29:42 +0000 (Fri, 20 Jul 2018) Log Message: ----------- Further datastore_lmdb update by Steffen Nurpmeso. Modified Paths: -------------- trunk/bogofilter/src/datastore_lmdb.c Modified: trunk/bogofilter/src/datastore_lmdb.c =================================================================== --- trunk/bogofilter/src/datastore_lmdb.c 2018-07-19 20:47:11 UTC (rev 7076) +++ trunk/bogofilter/src/datastore_lmdb.c 2018-07-20 23:29:42 UTC (rev 7077) @@ -8,7 +8,7 @@ * Steffen Nurpmeso <st...@sd...> 2018 * (copied from datastore_kc.c: * Gyepi Sam <gy...@pr...> 2003 - * Matthias Andree <mat...@gm...> 2003 + * Matthias Andree <mat...@gm...> 2003, 2018 * Stefan Bellon <sb...@sb...> 2003-2004 * Pierre Habouzit <mad...@de...> 2007 * Denny Lin <den...@hs...> 2015) @@ -41,18 +41,16 @@ */ /* Alternative implementation: fixed DB size */ -/*#define a_BFLM_FIXED_SIZE (1u << 31)*/ +/*#define a_BFLM_FIXED_SIZE (ULONG_MAX >> (ULONG_MAX != UINT_MAX ? 22 : 1))*/ /* mdb_env_set_maxreaders() */ -#define a_BFLM_MAXREADERS 15 +#define a_BFLM_MAXREADERS 7 #ifndef a_BFLM_FIXED_SIZE - /* DB size grow. Must be a power of two! + /* DB size grow. Must be a power of two (we perform alignment)! * Space it so that a DB load does not run against walls too many times. - * We try _TRIES times to resize for a single new entry before giving up. - * Note that the minimum size must be capable to store the two used DB - * structured without resize etc. */ -# define a_BFLM_GROW (1u << 23) + * We try _TRIES times to resize for a single new entry before giving up */ +# define a_BFLM_GROW (1u << 24) # define a_BFLM_GROW_TRIES 3 /* Size of one chunk of the intermediate txn cache, as above. @@ -60,7 +58,7 @@ * Of course, if a token requires more space, we allocate a larger chunk */ # define a_BFLM_TXN_CACHE_SIZE (1u << 20) - /* An entry consists of an uint32_t describing the length of the key. 
+ /* A cache entry consists of an uint32_t describing the length of the key. * If the high bit is set an uint32_t describing the length of the value * follows. After the data buffers there possibly is alignment pad */ # define a_BFLM_TXN_CACHE_ALIGN(X) \ @@ -74,6 +72,7 @@ #include "common.h" #include <errno.h> +#include <limits.h> #include <lmdb.h> @@ -82,7 +81,6 @@ #include "error.h" #include "paths.h" #include "xmalloc.h" -#include "xstrdup.h" #if MDB_VERSION_FULL < MDB_VERINT(0, 9, 22) # error "Required LMDB version: 0.9.22 or later (0.9.11 may do, but untested)" @@ -114,7 +112,7 @@ #ifndef a_BFLM_FIXED_SIZE struct a_bflm_txn_cache{ - struct a_bflm_txn_cache *bflmtc_last; + struct a_bflm_txn_cache *bflmtc_last; /* Up-to-date (stack usage) */ struct a_bflm_txn_cache *bflmtc_next; /* Needs to be build before use! */ char *bflmtc_caster; /* Current caster */ char *bflmtc_max; /* Maximum usable byte, exclusive */ @@ -123,6 +121,9 @@ }; #endif +static char const a_bflm_db_name_man[] = a_BFLM_DB_NAME_MAN; +static char const a_bflm_db_name_dat[] = a_BFLM_DB_NAME_DAT; + /**/ static struct a_bflm *a_bflm_init(bfpath *bfp, bool rdonly); static int a_bflm__check_create(struct a_bflm *bflmp); @@ -134,18 +135,18 @@ static int a_bflm_txn_commit(void *vhandle); #ifndef a_BFLM_FIXED_SIZE -/**/ +/* A transaction needs to be resized and all modifications in the cache need to + * be replayed, because we have seen MDB_MAP_FULL (or MDB_MAP_RESIZED) */ static bool a_bflm_txn_mapfull(struct a_bflm *bflmp, bool close_cursor); +/* (NULL on success or an error message otherwise) */ +static char const *a_bflm_txn__replay(struct a_bflm *bflmp); + /* Put an entry; it is a deletion if val_or_null is NULL. * Return NULL on success or an error message otherwise */ static char const *a_bflm_txn_cache_put(struct a_bflm *bflmp, MDB_val *key, MDB_val *val_or_null); -/* Replay all the cache operations in order to redo the transaction. 
- * Return NULL on success or an error message otherwise */ -static char const *a_bflm_txn_cache_replay(struct a_bflm *bflmp); - /* Free the recovery stack and possible heap data */ static void a_bflm_txn_cache_free(struct a_bflm *bflmp); #endif /* a_BFLM_FIXED_SIZE */ @@ -192,7 +193,7 @@ memcpy(rv->bflm_filepath = (char*)&rv[1], bfp->filepath, i); rv->bflm_flags = (((DEBUG_DATABASE(1) || getenv("BF_DEBUG_DB") != NULL) - ? a_BFLM_DEBUG : a_BFLM_NONE) | + ? a_BFLM_DEBUG : a_BFLM_NONE) | (rdonly ? a_BFLM_RDONLY : a_BFLM_NONE)); e = mdb_env_create(&rv->bflm_env); @@ -204,7 +205,7 @@ rv->bflm_maxkeysize = mdb_env_get_maxkeysize(rv->bflm_env); /* To acommodate with bogofilter's db_created() mechanism we cannot use the - * unnamed DB which "always exists", but must place data in a named one */ + * unnamed DB which "always exists", but must place data in named ones */ e = mdb_env_set_maxdbs(rv->bflm_env, 2); if(e != MDB_SUCCESS){ emsg = "mdb_env_set_maxdbs()"; @@ -259,6 +260,7 @@ char const *db_name; unsigned int f; int e; + /* TODO compile-time-assert that MDB_CREATE is not 0 */ jredo_txn: e = mdb_txn_begin(bflmp->bflm_env, NULL, 0, &bflmp->bflm_txn); @@ -270,7 +272,7 @@ goto jleave; } - db_name = a_BFLM_DB_NAME_MAN; + db_name = a_bflm_db_name_man; f = 0; jredo_dbi: e = mdb_dbi_open(bflmp->bflm_txn, db_name, f, &bflmp->bflm_dbi); @@ -283,8 +285,8 @@ goto jerr; } - if(f == MDB_CREATE && 0 == strcmp(db_name, a_BFLM_DB_NAME_MAN)){ - db_name = a_BFLM_DB_NAME_DAT; + if(f == MDB_CREATE && db_name == a_bflm_db_name_man){ + db_name = a_bflm_db_name_dat; goto jredo_dbi; } @@ -346,9 +348,9 @@ goto jerr1; } - e = mdb_dbi_open(bflmp->bflm_txn, a_BFLM_DB_NAME_DAT, 0, &bflmp->bflm_dbi); + e = mdb_dbi_open(bflmp->bflm_txn, a_bflm_db_name_dat, 0, &bflmp->bflm_dbi); if(e != MDB_SUCCESS){ - if(e == MDB_NOTFOUND){ + if(e == MDB_NOTFOUND && (bflmp->bflm_flags & a_BFLM_RDONLY)){ bflmp->bflm_flags |= a_BFLM_DB_UNAVAIL; goto junavail; } @@ -476,12 +478,20 @@ mdb_txn_abort(bflmp->bflm_txn); - 
/* Resize map */ + /* Resize map. To be super-safe, synchronize current map size first */ jredo_txn: + mdb_env_set_mapsize(bflmp->bflm_env, 0); /* no error defined */mdb_env_info(bflmp->bflm_env, &envinfo); i = envinfo.me_mapsize; - i += a_BFLM_GROW / 10; - i = (i + (a_BFLM_GROW - 1)) & ~(a_BFLM_GROW - 1); + if((size_t)-1 - i >= a_BFLM_GROW * 2){ + i += a_BFLM_GROW / 10; + i = (i + (a_BFLM_GROW - 1)) & ~(a_BFLM_GROW - 1); + }else if((size_t)-1 - i >= 1024u * 1024u * 2) + i = (size_t)-1 - (1024u * 1024u - 1); + else{ + emsg = "DB size too large"; + goto jerr1; + } e = mdb_env_set_mapsize(bflmp->bflm_env, i); if(e != MDB_SUCCESS){ emsg = "mdb_env_set_mapsize()"; @@ -491,15 +501,13 @@ /* Recreate transaction */ e = mdb_txn_begin(bflmp->bflm_env, NULL, 0, &bflmp->bflm_txn); if(e != MDB_SUCCESS){ - if(e == MDB_MAP_RESIZED){ - mdb_env_set_mapsize(bflmp->bflm_env, 0); + if(e == MDB_MAP_RESIZED) goto jredo_txn; - } emsg = "mdb_txn_begin()"; goto jerr1; } - e = mdb_dbi_open(bflmp->bflm_txn, a_BFLM_DB_NAME_DAT, 0, &bflmp->bflm_dbi); + e = mdb_dbi_open(bflmp->bflm_txn, a_bflm_db_name_dat, 0, &bflmp->bflm_dbi); if(e != MDB_SUCCESS){ emsg = "mdb_dbi_open()"; goto jerr2; @@ -511,7 +519,7 @@ goto jerr2; } - if((emsg = a_bflm_txn_cache_replay(bflmp)) != NULL) + if((emsg = a_bflm_txn__replay(bflmp)) != NULL) goto jerr3; if(bflmp->bflm_flags & a_BFLM_DEBUG) @@ -534,6 +542,72 @@ } static char const * +a_bflm_txn__replay(struct a_bflm *bflmp){ + MDB_val key, val; + char const *emsg; + int e; + uint32_t kl, vl; + char *dp; + struct a_bflm_txn_cache *head, *bflmtcp; + + /* First of all create a list in the right order */ + for(head = NULL, bflmtcp = bflmp->bflm_txn_cache; bflmtcp != NULL; + bflmtcp = bflmtcp->bflmtc_last){ + bflmtcp->bflmtc_next = head; + head = bflmtcp; + } + + /* Then replay, using it */ + for(; head != NULL; head = head->bflmtc_next){ + for(dp = head->bflmtc_data; dp < head->bflmtc_caster;){ + bool isins; + + /* For actual loading always use memcpy() for simplicity. 
+ * (That is: C standard and undefined behaviour, who knows?) */ + memcpy(&kl, dp, sizeof kl); + dp += sizeof kl; + if((isins = ((kl & 0x80000000u) != 0))){ + kl ^= 0x80000000u; + memcpy(&vl, dp, sizeof vl); + dp += sizeof vl; + } + + key.mv_size = kl; + key.mv_data = dp; + dp += kl; + if(isins){ + val.mv_size = vl; + val.mv_data = dp; + dp += vl; + + e = mdb_cursor_put(bflmp->bflm_cursor, &key, &val, 0); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_put()"; + goto jleave; + } + }else{ + e = mdb_cursor_get(bflmp->bflm_cursor, &key, NULL, + MDB_SET_KEY); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_get() for delete"; + goto jleave; + } + e = mdb_cursor_del(bflmp->bflm_cursor, 0); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_del()"; + goto jleave; + } + } + dp = (char*)a_BFLM_TXN_CACHE_ALIGN((uintptr_t)dp); + } + } + + emsg = NULL; +jleave: + return emsg; +} + +static char const * a_bflm_txn_cache_put(struct a_bflm *bflmp, MDB_val *key, MDB_val *val_or_null){ uint32_t ui; char *dp; @@ -600,73 +674,6 @@ return emsg; } -static char const * -a_bflm_txn_cache_replay(struct a_bflm *bflmp){ - /* And replay all the changes we have yet seen */ - MDB_val key, val; - char const *emsg; - int e; - uint32_t kl, vl; - char *dp; - struct a_bflm_txn_cache *head, *bflmtcp; - - /* First of all create a list in the right order */ - for(head = NULL, bflmtcp = bflmp->bflm_txn_cache; bflmtcp != NULL; - bflmtcp = bflmtcp->bflmtc_last){ - bflmtcp->bflmtc_next = head; - head = bflmtcp; - } - - /* Then replay, using it */ - for(; head != NULL; head = head->bflmtc_next){ - for(dp = head->bflmtc_data; dp < head->bflmtc_caster;){ - bool isins; - - /* For actual loading always use memcpy() for simplicity. - * (That is: C standard and undefined behaviour, who knows?) 
*/ - memcpy(&kl, dp, sizeof kl); - dp += sizeof kl; - if((isins = ((kl & 0x80000000u) != 0))){ - kl ^= 0x80000000u; - memcpy(&vl, dp, sizeof vl); - dp += sizeof vl; - } - - key.mv_size = kl; - key.mv_data = dp; - dp += kl; - if(isins){ - val.mv_size = vl; - val.mv_data = dp; - dp += vl; - - e = mdb_cursor_put(bflmp->bflm_cursor, &key, &val, 0); - if(e != MDB_SUCCESS){ - emsg = "mdb_cursor_put()"; - goto jleave; - } - }else{ - e = mdb_cursor_get(bflmp->bflm_cursor, &key, NULL, - MDB_SET_KEY); - if(e != MDB_SUCCESS){ - emsg = "mdb_cursor_get() for delete"; - goto jleave; - } - e = mdb_cursor_del(bflmp->bflm_cursor, 0); - if(e != MDB_SUCCESS){ - emsg = "mdb_cursor_del()"; - goto jleave; - } - } - dp = (char*)a_BFLM_TXN_CACHE_ALIGN((uintptr_t)dp); - } - } - - emsg = NULL; -jleave: - return emsg; -} - static void a_bflm_txn_cache_free(struct a_bflm *bflmp){ struct a_bflm_txn_cache *bflmtcp; @@ -774,8 +781,8 @@ jleave: if(DEBUG_DATABASE(3)) fprintf(dbgout, "LMDB db_get_dbvalue(): %lu <%.*s> -> %d\n", - (unsigned long)token->leng, (int)token->leng, (const char *)token->data, - (e == 0)); + (unsigned long)token->leng, (int)token->leng, + (char const*)token->data, (e == 0)); return e; jerr: if(e != MDB_NOTFOUND){ @@ -803,7 +810,7 @@ if((bflmp = (struct a_bflm *)vhandle) == NULL) goto jleave; - if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) + if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) /* XXX assert instead */ goto jleave; if((size_t)token->leng > bflmp->bflm_maxkeysize){ @@ -843,8 +850,8 @@ jleave: if(DEBUG_DATABASE(3)) fprintf(dbgout, "LMDB db_set_dbvalue(): %lu <%.*s> -> %d\n", - (unsigned long)token->leng, (int)token->leng, (const char *)token->data, - (e == 0)); + (unsigned long)token->leng, (int)token->leng, + (char const*)token->data, (e == 0)); return e; jerr: print_error(__FILE__, __LINE__, "LMDB[%ld]: db_set_dbvalue(), %s: %d, %s", @@ -912,8 +919,8 @@ jleave: if(DEBUG_DATABASE(3)) fprintf(dbgout, "LMDB db_delete(): %lu <%.*s> -> %d\n", - (unsigned long)token->leng, 
(int)token->leng, (const char *)token->data, - (e == 0)); + (unsigned long)token->leng, (int)token->leng, + (char const*)token->data, (e == 0)); return e; jerr: print_error(__FILE__, __LINE__, "LMDB[%ld]: db_delete(), %s: %d, %s", This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
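The map resize step in r7077 adds a fraction of the growth unit and then rounds up to a multiple of it, after first checking that there is enough headroom below the maximum `size_t`. A minimal sketch of that arithmetic follows. `GROW`, `align_up` and `next_mapsize` are hypothetical names, and the fallback branch the driver uses close to the limit is deliberately left out, so this is an illustration of the rounding trick rather than a copy of the resize logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical growth unit; the mask trick below needs a power of two. */
#define GROW ((size_t)1 << 24)

/* Round n up to the next multiple of GROW. */
static size_t align_up(size_t n){
   return (n + (GROW - 1)) & ~(GROW - 1);
}

/* Return the next map size, or 0 if growing could overflow size_t. */
static size_t next_mapsize(size_t cur){
   assert((GROW & (GROW - 1)) == 0); /* power-of-two precondition */

   /* Headroom check first: the additions below stay in range. */
   if((size_t)-1 - cur < GROW * 2)
      return 0;
   return align_up(cur + GROW / 10);
}
```

Because a tenth of the unit is added before rounding, a size that already sits on a multiple of `GROW` always advances to the next multiple instead of staying where it is.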
From: <m-...@us...> - 2018-07-19 20:47:14
|
Revision: 7076 http://sourceforge.net/p/bogofilter/code/7076 Author: m-a Date: 2018-07-19 20:47:11 +0000 (Thu, 19 Jul 2018) Log Message: ----------- Merge lmdb-support branch developments. Modified Paths: -------------- trunk/bogofilter/configure.ac trunk/bogofilter/src/datastore_lmdb.c trunk/bogofilter/src/lexer.c trunk/bogofilter/src/tests/t.bogoutil Property Changed: ---------------- trunk/ Index: trunk =================================================================== --- trunk 2018-07-19 20:45:49 UTC (rev 7075) +++ trunk 2018-07-19 20:47:11 UTC (rev 7076) Property changes on: trunk ___________________________________________________________________ Modified: svn:mergeinfo ## -1 +1 ## -/branches/lmdb-support:7060-7068 \ No newline at end of property +/branches/lmdb-support:7060-7075 \ No newline at end of property Modified: trunk/bogofilter/configure.ac =================================================================== --- trunk/bogofilter/configure.ac 2018-07-19 20:45:49 UTC (rev 7075) +++ trunk/bogofilter/configure.ac 2018-07-19 20:47:11 UTC (rev 7076) @@ -549,8 +549,8 @@ mdb_env_set_maxreaders(env, 1); mdb_env_set_mapsize(env, 4096*42); mdb_env_open(env, "/tmp", 0, 0660); - mdb_txn_begin(env, (void*)0, 0, &txn); - mdb_dbi_open(txn, (void*)0, 0, &dbi); + mdb_txn_begin(env, 0, 0, &txn); + mdb_dbi_open(txn, "", 0, &dbi); ])],,AC_MSG_ERROR(Cannot link to lmdb library.)) LIBS="$saveLIBS" ;; Modified: trunk/bogofilter/src/datastore_lmdb.c =================================================================== --- trunk/bogofilter/src/datastore_lmdb.c 2018-07-19 20:45:49 UTC (rev 7075) +++ trunk/bogofilter/src/datastore_lmdb.c 2018-07-19 20:47:11 UTC (rev 7076) @@ -187,7 +187,7 @@ size_t i; i = strlen(bfp->filepath) +1; - rv = xmalloc(sizeof(*rv) + i); + rv = (struct a_bflm *)xmalloc(sizeof(*rv) + i); memset(rv, 0, sizeof *rv); memcpy(rv->bflm_filepath = (char*)&rv[1], bfp->filepath, i); @@ -283,7 +283,7 @@ goto jerr; } - if(f == MDB_CREATE && db_name == 
a_BFLM_DB_NAME_MAN){ + if(f == MDB_CREATE && 0 == strcmp(db_name, a_BFLM_DB_NAME_MAN)){ db_name = a_BFLM_DB_NAME_DAT; goto jredo_dbi; } @@ -325,7 +325,7 @@ e = DST_OK; - if((bflmp = vhandle) == NULL) + if((bflmp = (struct a_bflm *)vhandle) == NULL) goto jleave; if(bflmp->bflm_flags & a_BFLM_DEBUG) @@ -332,6 +332,7 @@ fprintf(dbgout, "LMDB[%ld]: txn_begin(%p [%s])\n", (long)getpid(), bflmp, bflmp->bflm_filepath); + bflmp->bflm_flags &= ~(a_BFLM_HAS_TXN | a_BFLM_DB_UNAVAIL); jredo_txn: e = mdb_txn_begin(bflmp->bflm_env, NULL, (bflmp->bflm_flags & a_BFLM_RDONLY ? MDB_RDONLY : 0), @@ -383,7 +384,7 @@ e = DST_OK; - if((bflmp = vhandle) == NULL) + if((bflmp = (struct a_bflm *)vhandle) == NULL) goto jleave; if(bflmp->bflm_flags & a_BFLM_DEBUG) @@ -414,7 +415,7 @@ e = DST_OK; - if((bflmp = vhandle) == NULL) + if((bflmp = (struct a_bflm *)vhandle) == NULL) goto jleave; if(bflmp->bflm_flags & a_BFLM_DEBUG) @@ -565,7 +566,7 @@ jcache_new: i += sizeof(*bflmtcp); i = max(i, a_BFLM_TXN_CACHE_SIZE); - dp = (char*)(bflmtcp = xmalloc(i)); + dp = (char*)(bflmtcp = (struct a_bflm_txn_cache *)xmalloc(i)); bflmtcp->bflmtc_last = bflmp->bflm_txn_cache; bflmp->bflm_txn_cache = bflmtcp; bflmtcp->bflmtc_caster = bflmtcp->bflmtc_data = (char*)&bflmtcp[1]; @@ -703,7 +704,7 @@ db_close(void *vhandle){ struct a_bflm *bflmp; - if((bflmp = vhandle) == NULL) + if((bflmp = (struct a_bflm *)vhandle) == NULL) goto jleave; if(bflmp->bflm_flags & a_BFLM_DEBUG) @@ -725,7 +726,7 @@ bool created; struct a_bflm *bflmp; - created = ((bflmp = vhandle) != NULL && + created = ((bflmp = (struct a_bflm *)vhandle) != NULL && (bflmp->bflm_flags & a_BFLM_DB_CREATED) != 0); return created; } @@ -739,7 +740,7 @@ e = DS_NOTFOUND; - if((bflmp = vhandle) == NULL) + if((bflmp = (struct a_bflm *)vhandle) == NULL) goto jleave; if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) @@ -750,7 +751,7 @@ fprintf(dbgout, "LMDB[%ld]: get_dbvalue: key too big " "(> %lu bytes), ignoring %.*s\n", (long)getpid(),(unsigned 
long)bflmp->bflm_maxkeysize, - (int)token->leng, token->data); + (int)token->leng, (const char *)token->data); goto jleave; } @@ -773,7 +774,7 @@ jleave: if(DEBUG_DATABASE(3)) fprintf(dbgout, "LMDB db_get_dbvalue(): %lu <%.*s> -> %d\n", - (unsigned long)token->leng, (int)token->leng, token->data, + (unsigned long)token->leng, (int)token->leng, (const char *)token->data, (e == 0)); return e; jerr: @@ -799,7 +800,7 @@ e = 0; - if((bflmp = vhandle) == NULL) + if((bflmp = (struct a_bflm *)vhandle) == NULL) goto jleave; if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) @@ -810,7 +811,7 @@ fprintf(dbgout, "LMDB[%ld]: set_dbvalue: key too big " "(> %lu bytes), ignoring %.*s\n", (long)getpid(),(unsigned long)bflmp->bflm_maxkeysize, - (int)token->leng, token->data); + (int)token->leng, (const char *)token->data); goto jleave; } @@ -842,7 +843,7 @@ jleave: if(DEBUG_DATABASE(3)) fprintf(dbgout, "LMDB db_set_dbvalue(): %lu <%.*s> -> %d\n", - (unsigned long)token->leng, (int)token->leng, token->data, + (unsigned long)token->leng, (int)token->leng, (const char *)token->data, (e == 0)); return e; jerr: @@ -863,7 +864,7 @@ e = 0; - if((bflmp = vhandle) == NULL) + if((bflmp = (struct a_bflm *)vhandle) == NULL) goto jleave; if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) @@ -874,7 +875,7 @@ fprintf(dbgout, "LMDB[%ld]: delete: key too big " "(> %lu bytes), ignoring %.*s\n", (long)getpid(),(unsigned long)bflmp->bflm_maxkeysize, - (int)token->leng, token->data); + (int)token->leng, (const char *)token->data); goto jleave; } @@ -911,7 +912,7 @@ jleave: if(DEBUG_DATABASE(3)) fprintf(dbgout, "LMDB db_delete(): %lu <%.*s> -> %d\n", - (unsigned long)token->leng, (int)token->leng, token->data, + (unsigned long)token->leng, (int)token->leng, (const char *)token->data, (e == 0)); return e; jerr: @@ -927,7 +928,7 @@ db_flush(void *vhandle){ struct a_bflm *bflmp; - if((bflmp = vhandle) != NULL){ + if((bflmp = (struct a_bflm *)vhandle) != NULL){ int e; if(bflmp->bflm_flags & a_BFLM_DEBUG) @@ -953,7 +954,7 @@ 
rv = EX_OK; - if((bflmp = vhandle) == NULL) + if((bflmp = (struct a_bflm *)vhandle) == NULL) goto jleave; if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) @@ -988,7 +989,7 @@ /* Copy to dbv_key and dbv_val in order to avoid loss upon possible * action on the DB; should not matter, but NUL terminate them */ dbv_key.leng = (uint32_t)(i = key.mv_size); - dbv_key.data = buf = xrealloc(buf, i +1 + val.mv_size +1); + dbv_key.data = buf = (char *)xrealloc(buf, i +1 + val.mv_size +1); memcpy(buf, key.mv_data, i); buf[i++] = '\0'; dbv_val.leng = (uint32_t)val.mv_size; Modified: trunk/bogofilter/src/lexer.c =================================================================== --- trunk/bogofilter/src/lexer.c 2018-07-19 20:45:49 UTC (rev 7075) +++ trunk/bogofilter/src/lexer.c 2018-07-19 20:47:11 UTC (rev 7076) @@ -140,6 +140,7 @@ && count != EOF /* don't skip if inside message/rfc822 */ && msg_state->parent == NULL + && buff->t.leng >= hdrlen && memcmp(buff->t.u.text,spam_header_name,hdrlen) == 0) { count = skip_folded_line(buff); } Modified: trunk/bogofilter/src/tests/t.bogoutil =================================================================== --- trunk/bogofilter/src/tests/t.bogoutil 2018-07-19 20:45:49 UTC (rev 7075) +++ trunk/bogofilter/src/tests/t.bogoutil 2018-07-19 20:47:11 UTC (rev 7076) @@ -89,7 +89,7 @@ unset BOGOFILTER_DIR cd "$TMPDIR" -"$BOGOUTIL" -C -w "$WORDLIST" .MSG_COUNT > /dev/null -"$BOGOUTIL" -C -p "$WORDLIST" .MSG_COUNT > /dev/null +$BOGOUTIL -C -w "$WORDLIST" .MSG_COUNT > /dev/null +$BOGOUTIL -C -p "$WORDLIST" .MSG_COUNT > /dev/null PRINTCORE=$OPC cd - > /dev/null This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
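The lexer.c change in r7076 adds a length test in front of `memcmp()`, so that a buffer shorter than the header name is never read past its end. A minimal sketch of the same guard, assuming a hypothetical helper `has_prefix` that is not part of bogofilter:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if buf (len bytes, not necessarily NUL-terminated) begins
 * with the hdrlen bytes at hdr, else 0.  The length test comes first,
 * and && short-circuits, so memcmp() only runs on long enough input. */
static int has_prefix(const char *buf, size_t len,
      const char *hdr, size_t hdrlen){
   assert(buf != NULL && hdr != NULL);
   return len >= hdrlen && memcmp(buf, hdr, hdrlen) == 0;
}
```

For example, `has_prefix("X-Bo", 4, "X-Bogosity:", 11)` returns 0 without comparing any bytes; the header name here is only an example value.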
From: <m-...@us...> - 2018-07-19 20:45:53
Revision: 7075
          http://sourceforge.net/p/bogofilter/code/7075
Author:   m-a
Date:     2018-07-19 20:45:49 +0000 (Thu, 19 Jul 2018)
Log Message:
-----------
Bugfix sent in by Steffen Nurpmeso.

Modified Paths:
--------------
    branches/lmdb-support/bogofilter/src/datastore_lmdb.c

Modified: branches/lmdb-support/bogofilter/src/datastore_lmdb.c
===================================================================
--- branches/lmdb-support/bogofilter/src/datastore_lmdb.c  2018-07-19 20:43:32 UTC (rev 7074)
+++ branches/lmdb-support/bogofilter/src/datastore_lmdb.c  2018-07-19 20:45:49 UTC (rev 7075)
@@ -332,6 +332,7 @@
       fprintf(dbgout, "LMDB[%ld]: txn_begin(%p [%s])\n",
          (long)getpid(), bflmp, bflmp->bflm_filepath);
 
+   bflmp->bflm_flags &= ~(a_BFLM_HAS_TXN | a_BFLM_DB_UNAVAIL);
 jredo_txn:
    e = mdb_txn_begin(bflmp->bflm_env, NULL,
          (bflmp->bflm_flags & a_BFLM_RDONLY ? MDB_RDONLY : 0),
From: <m-...@us...> - 2018-07-19 20:43:36
Revision: 7074
          http://sourceforge.net/p/bogofilter/code/7074
Author:   m-a
Date:     2018-07-19 20:43:32 +0000 (Thu, 19 Jul 2018)
Log Message:
-----------
Resynch with trunk.

Property Changed:
----------------
    branches/lmdb-support/

Index: branches/lmdb-support
===================================================================
--- branches/lmdb-support  2018-07-19 20:32:03 UTC (rev 7073)
+++ branches/lmdb-support  2018-07-19 20:43:32 UTC (rev 7074)

Property changes on: branches/lmdb-support
___________________________________________________________________
Modified: svn:mergeinfo
## -1 +1 ##
-/trunk:7060-7067
\ No newline at end of property
+/trunk:7060-7073
\ No newline at end of property
From: <m-...@us...> - 2018-07-19 20:32:05
|
Revision: 7073
          http://sourceforge.net/p/bogofilter/code/7073
Author:   m-a
Date:     2018-07-19 20:32:03 +0000 (Thu, 19 Jul 2018)
Log Message:
-----------
Fix several warnings.

Modified Paths:
--------------
    branches/lmdb-support/bogofilter/src/datastore_lmdb.c

Modified: branches/lmdb-support/bogofilter/src/datastore_lmdb.c
===================================================================
--- branches/lmdb-support/bogofilter/src/datastore_lmdb.c 2018-07-19 20:25:19 UTC (rev 7072)
+++ branches/lmdb-support/bogofilter/src/datastore_lmdb.c 2018-07-19 20:32:03 UTC (rev 7073)
@@ -283,7 +283,7 @@
       goto jerr;
    }
 
-   if(f == MDB_CREATE && db_name == a_BFLM_DB_NAME_MAN){
+   if(f == MDB_CREATE && 0 == strcmp(db_name, a_BFLM_DB_NAME_MAN)){
       db_name = a_BFLM_DB_NAME_DAT;
       goto jredo_dbi;
    }
@@ -750,7 +750,7 @@
          fprintf(dbgout, "LMDB[%ld]: get_dbvalue: key too big "
                "(> %lu bytes), ignoring %.*s\n",
             (long)getpid(),(unsigned long)bflmp->bflm_maxkeysize,
-            (int)token->leng, token->data);
+            (int)token->leng, (const char *)token->data);
       goto jleave;
    }
 
@@ -773,7 +773,7 @@
 jleave:
    if(DEBUG_DATABASE(3))
       fprintf(dbgout, "LMDB db_get_dbvalue(): %lu <%.*s> -> %d\n",
-         (unsigned long)token->leng, (int)token->leng, token->data,
+         (unsigned long)token->leng, (int)token->leng, (const char *)token->data,
          (e == 0));
    return e;
 jerr:
@@ -810,7 +810,7 @@
          fprintf(dbgout, "LMDB[%ld]: set_dbvalue: key too big "
                "(> %lu bytes), ignoring %.*s\n",
             (long)getpid(),(unsigned long)bflmp->bflm_maxkeysize,
-            (int)token->leng, token->data);
+            (int)token->leng, (const char *)token->data);
       goto jleave;
    }
 
@@ -842,7 +842,7 @@
 jleave:
    if(DEBUG_DATABASE(3))
       fprintf(dbgout, "LMDB db_set_dbvalue(): %lu <%.*s> -> %d\n",
-         (unsigned long)token->leng, (int)token->leng, token->data,
+         (unsigned long)token->leng, (int)token->leng, (const char *)token->data,
          (e == 0));
    return e;
 jerr:
@@ -874,7 +874,7 @@
          fprintf(dbgout, "LMDB[%ld]: delete: key too big "
                "(> %lu bytes), ignoring %.*s\n",
             (long)getpid(),(unsigned long)bflmp->bflm_maxkeysize,
-            (int)token->leng, token->data);
+            (int)token->leng, (const char *)token->data);
       goto jleave;
    }
 
@@ -911,7 +911,7 @@
 jleave:
    if(DEBUG_DATABASE(3))
       fprintf(dbgout, "LMDB db_delete(): %lu <%.*s> -> %d\n",
-         (unsigned long)token->leng, (int)token->leng, token->data,
+         (unsigned long)token->leng, (int)token->leng, (const char *)token->data,
          (e == 0));
    return e;
 jerr:

This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
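Note on r7073: the first hunk replaces a comparison of two string pointers with strcmp(), and the remaining hunks cast the token buffer before handing it to "%.*s". The sketch below illustrates both patterns in isolation. It is not bogofilter code: DEMO_DB_NAME, demo_is_man_db() and demo_format_token() are hypothetical names chosen for this example, and the buffer is assumed to be passed as a generic pointer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a database name macro; not taken from bogofilter. */
#define DEMO_DB_NAME "BF_MAN"

/* Compare by content: true whenever the characters match, no matter where
 * the string is stored.  Comparing the pointers instead would only test
 * whether both refer to the same storage. */
static int demo_is_man_db(const char *db_name){
   assert(db_name != NULL);
   return strcmp(db_name, DEMO_DB_NAME) == 0;
}

/* Format a length-delimited buffer that need not be NUL-terminated.
 * "%.*s" expects an int precision followed by a char pointer, hence both
 * casts.  Returns what snprintf() returns. */
static int demo_format_token(char *out, size_t outsize,
      const void *data, unsigned int leng){
   return snprintf(out, outsize, "<%.*s>", (int)leng, (const char *)data);
}
```

With these helpers, a name held in a separate array still compares equal to the macro, and only the first leng bytes of a buffer are printed.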
From: <m-...@us...> - 2018-07-19 20:25:23
Revision: 7072
          http://sourceforge.net/p/bogofilter/code/7072
Author:   m-a
Date:     2018-07-19 20:25:19 +0000 (Thu, 19 Jul 2018)
Log Message:
-----------
Make LMDB code valid C++, too.

Modified Paths:
--------------
    branches/lmdb-support/bogofilter/configure.ac
    branches/lmdb-support/bogofilter/src/datastore_lmdb.c

Modified: branches/lmdb-support/bogofilter/configure.ac
===================================================================
--- branches/lmdb-support/bogofilter/configure.ac 2018-07-19 20:23:10 UTC (rev 7071)
+++ branches/lmdb-support/bogofilter/configure.ac 2018-07-19 20:25:19 UTC (rev 7072)
@@ -549,8 +549,8 @@
       mdb_env_set_maxreaders(env, 1);
       mdb_env_set_mapsize(env, 4096*42);
       mdb_env_open(env, "/tmp", 0, 0660);
-      mdb_txn_begin(env, (void*)0, 0, &txn);
-      mdb_dbi_open(txn, (void*)0, 0, &dbi);
+      mdb_txn_begin(env, 0, 0, &txn);
+      mdb_dbi_open(txn, "", 0, &dbi);
    ])],,AC_MSG_ERROR(Cannot link to lmdb library.))
    LIBS="$saveLIBS"
    ;;

Modified: branches/lmdb-support/bogofilter/src/datastore_lmdb.c
===================================================================
--- branches/lmdb-support/bogofilter/src/datastore_lmdb.c 2018-07-19 20:23:10 UTC (rev 7071)
+++ branches/lmdb-support/bogofilter/src/datastore_lmdb.c 2018-07-19 20:25:19 UTC (rev 7072)
@@ -187,7 +187,7 @@
    size_t i;
 
    i = strlen(bfp->filepath) +1;
-   rv = xmalloc(sizeof(*rv) + i);
+   rv = (struct a_bflm *)xmalloc(sizeof(*rv) + i);
    memset(rv, 0, sizeof *rv);
    memcpy(rv->bflm_filepath = (char*)&rv[1], bfp->filepath, i);
 
@@ -325,7 +325,7 @@
 
    e = DST_OK;
 
-   if((bflmp = vhandle) == NULL)
+   if((bflmp = (struct a_bflm *)vhandle) == NULL)
       goto jleave;
 
    if(bflmp->bflm_flags & a_BFLM_DEBUG)
@@ -383,7 +383,7 @@
 
    e = DST_OK;
 
-   if((bflmp = vhandle) == NULL)
+   if((bflmp = (struct a_bflm *)vhandle) == NULL)
       goto jleave;
 
    if(bflmp->bflm_flags & a_BFLM_DEBUG)
@@ -414,7 +414,7 @@
 
    e = DST_OK;
 
-   if((bflmp = vhandle) == NULL)
+   if((bflmp = (struct a_bflm *)vhandle) == NULL)
       goto jleave;
 
    if(bflmp->bflm_flags & a_BFLM_DEBUG)
@@ -565,7 +565,7 @@
 jcache_new:
       i += sizeof(*bflmtcp);
       i = max(i, a_BFLM_TXN_CACHE_SIZE);
-      dp = (char*)(bflmtcp = xmalloc(i));
+      dp = (char*)(bflmtcp = (struct a_bflm_txn_cache *)xmalloc(i));
       bflmtcp->bflmtc_last = bflmp->bflm_txn_cache;
       bflmp->bflm_txn_cache = bflmtcp;
       bflmtcp->bflmtc_caster = bflmtcp->bflmtc_data = (char*)&bflmtcp[1];
@@ -703,7 +703,7 @@
 db_close(void *vhandle){
    struct a_bflm *bflmp;
 
-   if((bflmp = vhandle) == NULL)
+   if((bflmp = (struct a_bflm *)vhandle) == NULL)
       goto jleave;
 
    if(bflmp->bflm_flags & a_BFLM_DEBUG)
@@ -725,7 +725,7 @@
    bool created;
    struct a_bflm *bflmp;
 
-   created = ((bflmp = vhandle) != NULL &&
+   created = ((bflmp = (struct a_bflm *)vhandle) != NULL &&
          (bflmp->bflm_flags & a_BFLM_DB_CREATED) != 0);
    return created;
 }
@@ -739,7 +739,7 @@
 
    e = DS_NOTFOUND;
 
-   if((bflmp = vhandle) == NULL)
+   if((bflmp = (struct a_bflm *)vhandle) == NULL)
       goto jleave;
 
    if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL)
@@ -799,7 +799,7 @@
 
    e = 0;
 
-   if((bflmp = vhandle) == NULL)
+   if((bflmp = (struct a_bflm *)vhandle) == NULL)
       goto jleave;
 
    if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL)
@@ -863,7 +863,7 @@
 
    e = 0;
 
-   if((bflmp = vhandle) == NULL)
+   if((bflmp = (struct a_bflm *)vhandle) == NULL)
       goto jleave;
 
    if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL)
@@ -927,7 +927,7 @@
 db_flush(void *vhandle){
    struct a_bflm *bflmp;
 
-   if((bflmp = vhandle) != NULL){
+   if((bflmp = (struct a_bflm *)vhandle) != NULL){
       int e;
 
       if(bflmp->bflm_flags & a_BFLM_DEBUG)
@@ -953,7 +953,7 @@
 
    rv = EX_OK;
 
-   if((bflmp = vhandle) == NULL)
+   if((bflmp = (struct a_bflm *)vhandle) == NULL)
       goto jleave;
 
    if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL)
@@ -988,7 +988,7 @@
       /* Copy to dbv_key and dbv_val in order to avoid loss upon possible
        * action on the DB; should not matter, but NUL terminate them */
       dbv_key.leng = (uint32_t)(i = key.mv_size);
-      dbv_key.data = buf = xrealloc(buf, i +1 + val.mv_size +1);
+      dbv_key.data = buf = (char *)xrealloc(buf, i +1 + val.mv_size +1);
       memcpy(buf, key.mv_data, i);
       buf[i++] = '\0';
       dbv_val.leng = (uint32_t)val.mv_size;
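Note on r7072: nearly every hunk adds an explicit cast where a void * (an allocator result or an opaque handle) is assigned to a typed pointer. C performs that conversion implicitly, C++ does not, so the cast is what lets the same file compile as either language. A minimal sketch of the pattern follows. struct demo_rec, demo_rec_new() and demo_rec_flags() are hypothetical, and plain malloc() stands in for xmalloc(), so unlike the original this version has to check for NULL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type; only the cast pattern mirrors the commit. */
struct demo_rec{
   char *dr_path;      /* points just behind the struct, into the same block */
   unsigned dr_flags;
};

/* Allocate the struct and a copy of path in one block.
 * The cast on malloc() is redundant in C but required in C++. */
static struct demo_rec *demo_rec_new(const char *path){
   size_t i;
   struct demo_rec *rv;

   assert(path != NULL);
   i = strlen(path) +1;
   rv = (struct demo_rec *)malloc(sizeof(*rv) + i);
   if(rv == NULL)
      return NULL;
   memset(rv, 0, sizeof *rv);
   rv->dr_path = (char *)&rv[1];
   memcpy(rv->dr_path, path, i);
   return rv;
}

/* Callbacks that receive an opaque void * handle need the same cast. */
static unsigned demo_rec_flags(void *vhandle){
   struct demo_rec *drp;

   if((drp = (struct demo_rec *)vhandle) == NULL)
      return 0;
   return drp->dr_flags;
}
```

The caller releases the whole record, including the embedded path copy, with a single free().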
From: <m-...@us...> - 2018-07-19 20:23:12
Revision: 7071
          http://sourceforge.net/p/bogofilter/code/7071
Author:   m-a
Date:     2018-07-19 20:23:10 +0000 (Thu, 19 Jul 2018)
Log Message:
-----------
Fix out-of-bounds memory read found with t.valgrind.

Modified Paths:
--------------
    branches/lmdb-support/bogofilter/src/lexer.c

Modified: branches/lmdb-support/bogofilter/src/lexer.c
===================================================================
--- branches/lmdb-support/bogofilter/src/lexer.c 2018-07-19 20:19:31 UTC (rev 7070)
+++ branches/lmdb-support/bogofilter/src/lexer.c 2018-07-19 20:23:10 UTC (rev 7071)
@@ -140,6 +140,7 @@
            && count != EOF
            /* don't skip if inside message/rfc822 */
            && msg_state->parent == NULL
+           && buff->t.leng >= hdrlen
            && memcmp(buff->t.u.text,spam_header_name,hdrlen) == 0) {
            count = skip_folded_line(buff);
        }
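Note on r7071: the added condition checks the buffer length before memcmp() runs. memcmp() always reads the full number of bytes it is given, so comparing hdrlen bytes against a shorter line reads past the valid data, which is presumably what valgrind reported. The same guard as a standalone helper, with the hypothetical name demo_has_prefix() and parameters that only loosely mirror the lexer fields:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the first hdrlen bytes of text equal name, else 0.
 * text holds leng valid bytes and need not be NUL-terminated, so the length
 * test must come first; && short-circuits, so memcmp() is never called on
 * a buffer that is too short. */
static int demo_has_prefix(const unsigned char *text, size_t leng,
      const char *name, size_t hdrlen){
   assert(text != NULL && name != NULL);
   return leng >= hdrlen && memcmp(text, name, hdrlen) == 0;
}
```

The order of the two operands matters: swapping them would reintroduce the over-read for short lines.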
From: <m-...@us...> - 2018-07-19 20:19:34
Revision: 7070
          http://sourceforge.net/p/bogofilter/code/7070
Author:   m-a
Date:     2018-07-19 20:19:31 +0000 (Thu, 19 Jul 2018)
Log Message:
-----------
Fix t.bogoutil under BF_CHECKTOOL=valgrind.

Modified Paths:
--------------
    branches/lmdb-support/bogofilter/src/tests/t.bogoutil

Modified: branches/lmdb-support/bogofilter/src/tests/t.bogoutil
===================================================================
--- branches/lmdb-support/bogofilter/src/tests/t.bogoutil 2018-07-19 20:07:01 UTC (rev 7069)
+++ branches/lmdb-support/bogofilter/src/tests/t.bogoutil 2018-07-19 20:19:31 UTC (rev 7070)
@@ -89,7 +89,7 @@
 unset BOGOFILTER_DIR
 cd "$TMPDIR"
-"$BOGOUTIL" -C -w "$WORDLIST" .MSG_COUNT > /dev/null
-"$BOGOUTIL" -C -p "$WORDLIST" .MSG_COUNT > /dev/null
+$BOGOUTIL -C -w "$WORDLIST" .MSG_COUNT > /dev/null
+$BOGOUTIL -C -p "$WORDLIST" .MSG_COUNT > /dev/null
 PRINTCORE=$OPC
 cd - > /dev/null
From: <m-...@us...> - 2018-07-19 20:07:05
Revision: 7069 http://sourceforge.net/p/bogofilter/code/7069 Author: m-a Date: 2018-07-19 20:07:01 +0000 (Thu, 19 Jul 2018) Log Message: ----------- Merge LMDB support from ^/branches/lmdb-support. Modified Paths: -------------- trunk/bogofilter/AUTHORS trunk/bogofilter/INSTALL trunk/bogofilter/NEWS trunk/bogofilter/configure.ac trunk/bogofilter/src/Makefile.am trunk/bogofilter/src/tests/t.frame Added Paths: ----------- trunk/bogofilter/src/datastore_lmdb.c Property Changed: ---------------- trunk/ Index: trunk =================================================================== --- trunk 2018-07-19 20:04:35 UTC (rev 7068) +++ trunk 2018-07-19 20:07:01 UTC (rev 7069) Property changes on: trunk ___________________________________________________________________ Added: svn:mergeinfo ## -0,0 +1 ## +/branches/lmdb-support:7060-7068 \ No newline at end of property Modified: trunk/bogofilter/AUTHORS =================================================================== --- trunk/bogofilter/AUTHORS 2018-07-19 20:04:35 UTC (rev 7068) +++ trunk/bogofilter/AUTHORS 2018-07-19 20:07:01 UTC (rev 7069) @@ -57,3 +57,4 @@ Roman Trunov Julius Plenz Denny Lin (KyotoCabinet support) +Steffen Nurpmeso (LMDB support) Modified: trunk/bogofilter/INSTALL =================================================================== --- trunk/bogofilter/INSTALL 2018-07-19 20:04:35 UTC (rev 7068) +++ trunk/bogofilter/INSTALL 2018-07-19 20:07:01 UTC (rev 7069) @@ -18,10 +18,12 @@ 3. SQLite (3.2.6 or newer) http://sqlite.org/ 4. TokyoCabinet http://fallabs.com/tokyocabinet/ 5. KyotoCabinet http://fallabs.com/kyotocabinet/ +6. LMDB Lightning Memory-Mapped Database Manager https://symas.com/lmdb/ You can use --with-database=ARG (choose from db (for Berkeley DB), qdbm, -sqlite) to pick the database backend (you must have installed the -database and the corresponding developer package). db is the default. 
+sqlite, tokyocabinet, kyotocabinet, lmdb) to pick the database backend (you +must have installed the database and the corresponding developer package). +db is the default. If you are using "db", you can use --disable-transactions or --enable-transactions to force the use of 1a or 1b. The default is to Modified: trunk/bogofilter/NEWS =================================================================== --- trunk/bogofilter/NEWS 2018-07-19 20:04:35 UTC (rev 7068) +++ trunk/bogofilter/NEWS 2018-07-19 20:07:01 UTC (rev 7069) @@ -15,6 +15,11 @@ ------------------------------------------------------------------------------- + 2018-07-19 + * Support for using LMDB (Lightning Memory-Mapped Database Manager) + as the database back-end. Suggested, courteously implemented and + contributed by Steffen Nurpmeso, steffen .at. sdaoden.eu. + 2018-07-17 * The Berkeley DB backend driver forgoes DB_NOSYNC in transactional mode, so as to synchronize changes from the logs back into the .db Modified: trunk/bogofilter/configure.ac =================================================================== --- trunk/bogofilter/configure.ac 2018-07-19 20:04:35 UTC (rev 7068) +++ trunk/bogofilter/configure.ac 2018-07-19 20:07:01 UTC (rev 7069) @@ -479,7 +479,7 @@ WITH_DB_ENGINE=db AC_ARG_WITH(database, AS_HELP_STRING([--with-database=ENGINE], - [choose database engine {db|qdbm|sqlite3|tokyocabinet|kyotocabinet} [[db]]]), + [choose database engine {db|qdbm|sqlite3|tokyocabinet|kyotocabinet|lmdb} [[db]]]), [ WITH_DB_ENGINE=$withval ] ) @@ -531,6 +531,29 @@ ])],,AC_MSG_ERROR(Cannot link to kyotocabinet library.)) LIBS="$saveLIBS" ;; + xlmdb) + AC_DEFINE(ENABLE_LMDB_DATASTORE,1, [Enable LMDB datastore]) + DB_TYPE=lmdb + DB_EXT=.lmdb + AC_LIB_LINKFLAGS([lmdb]) + LIBDB="$LIBLMDB" + saveLIBS="$LIBS" + LIBS="$LIBS $LIBDB" + AC_LINK_IFELSE([AC_LANG_PROGRAM([ +#include <lmdb.h> + ], [ + MDB_env *env; + MDB_txn *txn; + MDB_dbi dbi; + mdb_env_create(&env); + mdb_env_set_maxreaders(env, 1); + 
mdb_env_set_mapsize(env, 4096*42); + mdb_env_open(env, "/tmp", 0, 0660); + mdb_txn_begin(env, (void*)0, 0, &txn); + mdb_dbi_open(txn, (void*)0, 0, &dbi); + ])],,AC_MSG_ERROR(Cannot link to lmdb library.)) + LIBS="$saveLIBS" + ;; xqdbm) AC_DEFINE(ENABLE_QDBM_DATASTORE,1, [Enable qdbm datastore]) DB_TYPE=qdbm @@ -681,7 +704,7 @@ LIBS="$saveLIBS" ;; *) - AC_MSG_ERROR([Invalid --with-database argument. Supported engines are db, qdbm, sqlite3, tokyocabinet, kyotocabinet.]) + AC_MSG_ERROR([Invalid --with-database argument. Supported engines are db, qdbm, sqlite3, tokyocabinet, kyotocabinet, lmdb.]) ;; esac @@ -708,6 +731,7 @@ AM_CONDITIONAL(ENABLE_SQLITE_DATASTORE, test "x$WITH_DB_ENGINE" = "xsqlite3") AM_CONDITIONAL(ENABLE_TOKYOCABINET_DATASTORE, test "x$WITH_DB_ENGINE" = "xtokyocabinet") AM_CONDITIONAL(ENABLE_KYOTOCABINET_DATASTORE, test "x$WITH_DB_ENGINE" = "xkyotocabinet") +AM_CONDITIONAL(ENABLE_LMDB_DATASTORE, test "x$WITH_DB_ENGINE" = "xlmdb") dnl Use TRIO to replace missing snprintf/vsnprintf. 
needtrio=0 Modified: trunk/bogofilter/src/Makefile.am =================================================================== --- trunk/bogofilter/src/Makefile.am 2018-07-19 20:04:35 UTC (rev 7068) +++ trunk/bogofilter/src/Makefile.am 2018-07-19 20:07:01 UTC (rev 7069) @@ -195,6 +195,11 @@ datastore_opthelp_dummies.c \ datastore_dummies.c else +if ENABLE_LMDB_DATASTORE +datastore_SOURCE = datastore_lmdb.c \ + datastore_opthelp_dummies.c \ + datastore_dummies.c +else if ENABLE_TRANSACTIONS datastore_SOURCE = datastore_db.c datastore_db_trans.c else @@ -209,6 +214,7 @@ endif endif endif +endif datastore_OBJECT = $(datastore_SOURCE:.c=.o) Copied: trunk/bogofilter/src/datastore_lmdb.c (from rev 7068, branches/lmdb-support/bogofilter/src/datastore_lmdb.c) =================================================================== --- trunk/bogofilter/src/datastore_lmdb.c (rev 0) +++ trunk/bogofilter/src/datastore_lmdb.c 2018-07-19 20:07:01 UTC (rev 7069) @@ -0,0 +1,1022 @@ +/* $Id$ */ + +/* + * NAME: + * datastore_lmdb.c -- implements the datastore, using LMDB. + * + * AUTHORS: + * Steffen Nurpmeso <st...@sd...> 2018 + * (copied from datastore_kc.c: + * Gyepi Sam <gy...@pr...> 2003 + * Matthias Andree <mat...@gm...> 2003 + * Stefan Bellon <sb...@sb...> 2003-2004 + * Pierre Habouzit <mad...@de...> 2007 + * Denny Lin <den...@hs...> 2015) + */ + +/* + * Remarks. + * + * 1. LMDB places anything inside transactions (txn). + * You open an environment (which may contain multiple DBs), create + * a transaction and open a DB in that transaction. + * 2. LMDB is based on a finite-sized memory map. When a writable transaction + * reaches the size limit, the transaction must be aborted, then the + * environment must be resized, then a new transaction has to be created. + * Resizing will not shrink, effectively. + * 3. We assume xmalloc() aborts if out of memory. + * 4. We assume no token->leng actually exceeds int32_t. + * 5. 
mdb_env_get_maxkeysize(): + * Depends on the compile-time constant #MDB_MAXKEYSIZE. Default 511. + * We reject any keys which excess this. + * + * In order to be able to deal with 2. we need to track all changes that are + * performed in a txn, so that in case we are running against the wall we are + * capable to replay all changes after having resized the map. + * + * Alternatively, define a_BFLM_FIXED_SIZE, in which case all the replay code + * is not compiled, but instead the given size is fixed, and any DB overflow + * results in program abortion. Since the DB should only consume disc space + * for those pages which are used, this should not hurt in practice. + */ + +/* Alternative implementation: fixed DB size */ +/*#define a_BFLM_FIXED_SIZE (1u << 31)*/ + +/* mdb_env_set_maxreaders() */ +#define a_BFLM_MAXREADERS 15 + +#ifndef a_BFLM_FIXED_SIZE + /* DB size grow. Must be a power of two! + * Space it so that a DB load does not run against walls too many times. + * We try _TRIES times to resize for a single new entry before giving up. + * Note that the minimum size must be capable to store the two used DB + * structured without resize etc. */ +# define a_BFLM_GROW (1u << 23) +# define a_BFLM_GROW_TRIES 3 + + /* Size of one chunk of the intermediate txn cache, as above. + * Space it so that a DB load does not require all too many. + * Of course, if a token requires more space, we allocate a larger chunk */ +# define a_BFLM_TXN_CACHE_SIZE (1u << 20) + + /* An entry consists of an uint32_t describing the length of the key. + * If the high bit is set an uint32_t describing the length of the value + * follows. 
After the data buffers there possibly is alignment pad */ +# define a_BFLM_TXN_CACHE_ALIGN(X) \ + (((X) + (sizeof(uint32_t) - 1)) & ~(sizeof(uint32_t) - 1)) +#endif /* a_BFLM_FIXED_SIZE */ + +/* The DB names we use: one for our "is-created" event, the other for data */ +#define a_BFLM_DB_NAME_MAN "BF_MAN" +#define a_BFLM_DB_NAME_DAT "BF_DAT" + +#include "common.h" + +#include <errno.h> + +#include <lmdb.h> + +#include "datastore.h" +#include "datastore_db.h" +#include "error.h" +#include "paths.h" +#include "xmalloc.h" +#include "xstrdup.h" + +#if MDB_VERSION_FULL < MDB_VERINT(0, 9, 22) +# error "Required LMDB version: 0.9.22 or later (0.9.11 may do, but untested)" +#endif + +#define UNUSED(x) ((void)(x)) + +enum a_bflm_flags{ + a_BFLM_NONE, + a_BFLM_DEBUG = 1u<<0, + a_BFLM_RDONLY = 1u<<1, + a_BFLM_DB_CREATED = 1u<<2, /* DBs were newly created */ + a_BFLM_DB_UNAVAIL = 1u<<3, /* rdonly open, but no DB exists yet! */ + a_BFLM_HAS_TXN = 1u<<4 +}; + +struct a_bflm{ + char *bflm_filepath; /* bfpath.filepath (points to &self[1]) */ + MDB_env *bflm_env; + MDB_txn *bflm_txn; + MDB_cursor *bflm_cursor; + MDB_dbi bflm_dbi; + uint32_t bflm_flags; + size_t bflm_maxkeysize; /* mdb_env_get_maxkeysize() */ +#ifndef a_BFLM_FIXED_SIZE + struct a_bflm_txn_cache *bflm_txn_cache; /* Stack thereof */ +#endif +}; + +#ifndef a_BFLM_FIXED_SIZE +struct a_bflm_txn_cache{ + struct a_bflm_txn_cache *bflmtc_last; + struct a_bflm_txn_cache *bflmtc_next; /* Needs to be build before use! */ + char *bflmtc_caster; /* Current caster */ + char *bflmtc_max; /* Maximum usable byte, exclusive */ + /* Actually points to &self[1] TODO [0] or [8], dep. __STDC_VERSION__! 
*/ + char *bflmtc_data; +}; +#endif + +/**/ +static struct a_bflm *a_bflm_init(bfpath *bfp, bool rdonly); +static int a_bflm__check_create(struct a_bflm *bflmp); +static void a_bflm_free(struct a_bflm *bflmp); + +/**/ +static int a_bflm_txn_begin(void *vhandle); +static int a_bflm_txn_abort(void *vhandle); +static int a_bflm_txn_commit(void *vhandle); + +#ifndef a_BFLM_FIXED_SIZE +/**/ +static bool a_bflm_txn_mapfull(struct a_bflm *bflmp, bool close_cursor); + +/* Put an entry; it is a deletion if val_or_null is NULL. + * Return NULL on success or an error message otherwise */ +static char const *a_bflm_txn_cache_put(struct a_bflm *bflmp, MDB_val *key, + MDB_val *val_or_null); + +/* Replay all the cache operations in order to redo the transaction. + * Return NULL on success or an error message otherwise */ +static char const *a_bflm_txn_cache_replay(struct a_bflm *bflmp); + +/* Free the recovery stack and possible heap data */ +static void a_bflm_txn_cache_free(struct a_bflm *bflmp); +#endif /* a_BFLM_FIXED_SIZE */ + +static dsm_t /* TODO const*/ a_bflm_dsm = { + /* public -- used in datastore.c */ + &a_bflm_txn_begin, + &a_bflm_txn_abort, + &a_bflm_txn_commit, + /* private -- used in datastore_db_*.c */ + NULL, /* dsm_env_init */ + NULL, /* dsm_cleanup */ + NULL, /* dsm_cleanup_lite */ + NULL, /* dsm_get_env_dbe */ + NULL, /* dsm_database_name */ + NULL, /* dsm_recover_open */ + NULL, /* dsm_auto_commit_flags */ + NULL, /* dsm_get_rmw_flag */ + NULL, /* dsm_lock */ + NULL, /* dsm_common_close */ + NULL, /* dsm_sync */ + NULL, /* dsm_log_flush */ + NULL, /* dsm_pagesize */ + NULL, /* dsm_purgelogs */ + NULL, /* dsm_checkpoint */ + NULL, /* dsm_recover */ + NULL, /* dsm_remove */ + NULL, /* dsm_verify */ + NULL, /* dsm_list_logfiles */ + NULL /* dsm_leafpages */ +}; + +static struct a_bflm * +a_bflm_init(bfpath *bfp, bool rdonly){ + /* No variable array for .bflm_filepath, use same method as in word.h */ + int e; + char const *emsg; + struct a_bflm *rv; + size_t i; 
+ + i = strlen(bfp->filepath) +1; + rv = xmalloc(sizeof(*rv) + i); + memset(rv, 0, sizeof *rv); + memcpy(rv->bflm_filepath = (char*)&rv[1], bfp->filepath, i); + + rv->bflm_flags = (((DEBUG_DATABASE(1) || getenv("BF_DEBUG_DB") != NULL) + ? a_BFLM_DEBUG : a_BFLM_NONE) | + (rdonly ? a_BFLM_RDONLY : a_BFLM_NONE)); + + e = mdb_env_create(&rv->bflm_env); + if(e != MDB_SUCCESS){ + emsg = "mdb_env_open()"; + goto jerr1; + } + + rv->bflm_maxkeysize = mdb_env_get_maxkeysize(rv->bflm_env); + + /* To acommodate with bogofilter's db_created() mechanism we cannot use the + * unnamed DB which "always exists", but must place data in a named one */ + e = mdb_env_set_maxdbs(rv->bflm_env, 2); + if(e != MDB_SUCCESS){ + emsg = "mdb_env_set_maxdbs()"; + goto jerr1; + } + + mdb_env_set_maxreaders(rv->bflm_env, a_BFLM_MAXREADERS); + + /* TODO We may not do this unless going for a huge fixed size, because with + * TODO v0.9.22 a further DB open will then crash in mdb_*_put() after + * TODO a growing _mapsize call! ... 
*/ +#ifdef a_BFLM_FIXED_SIZE + e = mdb_env_set_mapsize(rv->bflm_env, a_BFLM_FIXED_SIZE); + if(e != MDB_SUCCESS){ + emsg = "mdb_env_set_mapsize()"; + goto jerr2; + } +#endif + + e = mdb_env_open(rv->bflm_env, rv->bflm_filepath, MDB_NOSUBDIR, 0660); + if(e != MDB_SUCCESS){ + emsg = "mdb_env_open()"; + goto jerr2; + } + + /* Let us fake a "has been created" event :( */ + if(!(rv->bflm_flags & a_BFLM_RDONLY) && + (e = a_bflm__check_create(rv)) != MDB_SUCCESS){ + emsg = "cannot handle management DB"; + goto jerr2; + } + + if(rv->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: init: %p [%s]\n", + (long)getpid(), rv, rv->bflm_filepath); +jleave: + return rv; + +jerr2: + mdb_env_close(rv->bflm_env); +jerr1: + if(emsg != NULL) + print_error(__FILE__, __LINE__, "LMDB[%ld]: init, %s: %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + xfree(rv); + rv = NULL; + goto jleave; +} + +static int +a_bflm__check_create(struct a_bflm *bflmp){ + char const *db_name; + unsigned int f; + int e; + +jredo_txn: + e = mdb_txn_begin(bflmp->bflm_env, NULL, 0, &bflmp->bflm_txn); + if(e != MDB_SUCCESS){ + if(e == MDB_MAP_RESIZED){ + mdb_env_set_mapsize(bflmp->bflm_env, 0); + goto jredo_txn; + } + goto jleave; + } + + db_name = a_BFLM_DB_NAME_MAN; + f = 0; +jredo_dbi: + e = mdb_dbi_open(bflmp->bflm_txn, db_name, f, &bflmp->bflm_dbi); + if(e != MDB_SUCCESS){ + if(e == MDB_NOTFOUND && f == 0){ + bflmp->bflm_flags |= a_BFLM_DB_CREATED; + f = MDB_CREATE; + goto jredo_dbi; + } + goto jerr; + } + + if(f == MDB_CREATE && db_name == a_BFLM_DB_NAME_MAN){ + db_name = a_BFLM_DB_NAME_DAT; + goto jredo_dbi; + } + + e = mdb_txn_commit(bflmp->bflm_txn); + if(e != MDB_SUCCESS) +jerr: + mdb_txn_abort(bflmp->bflm_txn); +jleave: + return e; +} + +static void +a_bflm_free(struct a_bflm *bflmp){ + if(bflmp != NULL){ +#ifndef a_BFLM_FIXED_SIZE + if(bflmp->bflm_txn_cache != NULL){ + if(DEBUG_DATABASE(1)) + fprintf(dbgout, "LMDB _free(): error: there is txn_cache!\n"); + a_bflm_txn_cache_free(bflmp); + } 
+#endif + + mdb_env_close(bflmp->bflm_env); + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: a_bflm_free(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + xfree(bflmp); + } +} + +static int +a_bflm_txn_begin(void *vhandle){ + char const *emsg; + struct a_bflm *bflmp; + int e; + + e = DST_OK; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: txn_begin(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + +jredo_txn: + e = mdb_txn_begin(bflmp->bflm_env, NULL, + (bflmp->bflm_flags & a_BFLM_RDONLY ? MDB_RDONLY : 0), + &bflmp->bflm_txn); + if(e != MDB_SUCCESS){ + if(e == MDB_MAP_RESIZED){ + mdb_env_set_mapsize(bflmp->bflm_env, 0); + goto jredo_txn; + } + emsg = "mdb_txn_begin()"; + goto jerr1; + } + + e = mdb_dbi_open(bflmp->bflm_txn, a_BFLM_DB_NAME_DAT, 0, &bflmp->bflm_dbi); + if(e != MDB_SUCCESS){ + if(e == MDB_NOTFOUND){ + bflmp->bflm_flags |= a_BFLM_DB_UNAVAIL; + goto junavail; + } + emsg = "mdb_dbi_open()"; + goto jerr2; + } + + e = mdb_cursor_open(bflmp->bflm_txn, bflmp->bflm_dbi, &bflmp->bflm_cursor); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_open()"; + goto jerr2; + } + +junavail: + bflmp->bflm_flags |= a_BFLM_HAS_TXN; + e = DST_OK; +jleave: + return e; + +jerr2: + mdb_txn_abort(bflmp->bflm_txn); +jerr1: + print_error(__FILE__, __LINE__, "LMDB[%ld]: txn_begin(), %s: %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + e = DST_FAILURE; + goto jleave; +} + +static int +a_bflm_txn_abort(void *vhandle){ + struct a_bflm *bflmp; + int e; + + e = DST_OK; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: txn_abort(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + if(!(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL)) + mdb_cursor_close(bflmp->bflm_cursor); + + mdb_txn_abort(bflmp->bflm_txn); + +#ifndef a_BFLM_FIXED_SIZE + a_bflm_txn_cache_free(bflmp); +#endif + + 
bflmp->bflm_flags &= ~a_BFLM_HAS_TXN; +jleave: + return e; +} + +static int +a_bflm_txn_commit(void *vhandle){ + struct a_bflm *bflmp; +#ifndef a_BFLM_FIXED_SIZE + int retries; +#endif + int e; + + e = DST_OK; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: txn_commit(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + if(!(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL)) + mdb_cursor_close(bflmp->bflm_cursor); + +#ifndef a_BFLM_FIXED_SIZE + retries = 0; +jredo: +#endif + e = mdb_txn_commit(bflmp->bflm_txn); + if(e != MDB_SUCCESS){ +#ifndef a_BFLM_FIXED_SIZE + if((e == MDB_MAP_FULL || e == MDB_MAP_RESIZED) && + ++retries <= a_BFLM_GROW_TRIES && + a_bflm_txn_mapfull(bflmp, false)){ + mdb_cursor_close(bflmp->bflm_cursor); + goto jredo; + } +#endif + mdb_txn_abort(bflmp->bflm_txn); + e = MDB_PANIC; + } + +#ifndef a_BFLM_FIXED_SIZE + a_bflm_txn_cache_free(bflmp); +#endif + + bflmp->bflm_flags &= ~a_BFLM_HAS_TXN; + if(e == MDB_SUCCESS) + e = DST_OK; + else{ + print_error(__FILE__, __LINE__, "LMDB[%ld]: txn_commit(): %d, %s", + (long)getpid(), e, mdb_strerror(e)); + e = DST_FAILURE; + } +jleave: + return e; +} + +#ifndef a_BFLM_FIXED_SIZE +static bool +a_bflm_txn_mapfull(struct a_bflm *bflmp, bool close_cursor){ + MDB_envinfo envinfo; + char const *emsg; + int e; + size_t i; + + /* Abort transaction */ + if(DEBUG_DATABASE(1) && (bflmp->bflm_flags & a_BFLM_DB_UNAVAIL)) + exit(EX_ERROR); + + if(close_cursor) + mdb_cursor_close(bflmp->bflm_cursor); + + mdb_txn_abort(bflmp->bflm_txn); + + /* Resize map */ +jredo_txn: + /* no error defined */mdb_env_info(bflmp->bflm_env, &envinfo); + i = envinfo.me_mapsize; + i += a_BFLM_GROW / 10; + i = (i + (a_BFLM_GROW - 1)) & ~(a_BFLM_GROW - 1); + e = mdb_env_set_mapsize(bflmp->bflm_env, i); + if(e != MDB_SUCCESS){ + emsg = "mdb_env_set_mapsize()"; + goto jerr1; + } + + /* Recreate transaction */ + e = mdb_txn_begin(bflmp->bflm_env, NULL, 0, &bflmp->bflm_txn); + 
if(e != MDB_SUCCESS){ + if(e == MDB_MAP_RESIZED){ + mdb_env_set_mapsize(bflmp->bflm_env, 0); + goto jredo_txn; + } + emsg = "mdb_txn_begin()"; + goto jerr1; + } + + e = mdb_dbi_open(bflmp->bflm_txn, a_BFLM_DB_NAME_DAT, 0, &bflmp->bflm_dbi); + if(e != MDB_SUCCESS){ + emsg = "mdb_dbi_open()"; + goto jerr2; + } + + e = mdb_cursor_open(bflmp->bflm_txn, bflmp->bflm_dbi, &bflmp->bflm_cursor); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_open()"; + goto jerr2; + } + + if((emsg = a_bflm_txn_cache_replay(bflmp)) != NULL) + goto jerr3; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: txn_mapfull(%p [%s]): " + "recreated, new size %lu\n", + (long)getpid(), bflmp, bflmp->bflm_filepath, + (unsigned long)envinfo.me_mapsize); + e = 0; +jleave: + return (e == 0); +jerr3: + /* Done by TXN abort mdb_cursor_close(bflmp->bflm_cursor); */ +jerr2: + /* Done by TXN abort mdb_txn_abort(bflmp->bflm_txn); */ +jerr1: + print_error(__FILE__, __LINE__, "LMDB[%ld]: txn_mapfull(): %s, %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + e = 1; + goto jleave; +} + +static char const * +a_bflm_txn_cache_put(struct a_bflm *bflmp, MDB_val *key, MDB_val *val_or_null){ + uint32_t ui; + char *dp; + struct a_bflm_txn_cache *bflmtcp; + char const *emsg; + size_t kl, vl, i; + + kl = key->mv_size; + if(val_or_null != NULL){ + vl = val_or_null->mv_size; + i = (2 * sizeof(uint32_t)) + kl + vl; + }else{ + vl = 0; + i = sizeof(uint32_t) + kl; + } + i = a_BFLM_TXN_CACHE_ALIGN(i); + + /* XXX We actually should abort() the program instead: cannot be handled */ + if(kl >= 0x7FFFFFFFu || vl >= 0x7FFFFFFFu || + i >= 0x7FFFFFFFu - sizeof(*bflmtcp)){ + emsg = "LMDB: entry too large to be stored"; + goto jleave; + } + + /* Do we need to create a new cache chunk entry? 
+ * We are simple and only look into the top of the stack */ + if((bflmtcp = bflmp->bflm_txn_cache) == NULL) + goto jcache_new; + else if(i >= (size_t)(bflmtcp->bflmtc_max - bflmtcp->bflmtc_caster)){ +jcache_new: + i += sizeof(*bflmtcp); + i = max(i, a_BFLM_TXN_CACHE_SIZE); + dp = (char*)(bflmtcp = xmalloc(i)); + bflmtcp->bflmtc_last = bflmp->bflm_txn_cache; + bflmp->bflm_txn_cache = bflmtcp; + bflmtcp->bflmtc_caster = bflmtcp->bflmtc_data = (char*)&bflmtcp[1]; + i -= 2 * sizeof(uint32_t); + bflmtcp->bflmtc_max = &dp[i]; + } + + /* For actual storing always use memcpy() for simplicity. + * (That is: C standard and undefined behaviour, who knows?) */ + dp = bflmtcp->bflmtc_caster; + ui = (uint32_t)kl; + if(val_or_null != NULL) + ui |= 0x80000000u; + memcpy(dp, &ui, sizeof ui); + dp += sizeof ui; + if(val_or_null != NULL){ + ui = (uint32_t)vl; + memcpy(dp, &ui, sizeof ui); + dp += sizeof ui; + } + memcpy(dp, key->mv_data, kl); + dp += kl; + if(vl != 0){ + memcpy(dp, val_or_null->mv_data, vl); + dp += vl; + } + bflmtcp->bflmtc_caster = (char*)a_BFLM_TXN_CACHE_ALIGN((uintptr_t)dp); + + emsg = NULL; +jleave: + return emsg; +} + +static char const * +a_bflm_txn_cache_replay(struct a_bflm *bflmp){ + /* And replay all the changes we have yet seen */ + MDB_val key, val; + char const *emsg; + int e; + uint32_t kl, vl; + char *dp; + struct a_bflm_txn_cache *head, *bflmtcp; + + /* First of all create a list in the right order */ + for(head = NULL, bflmtcp = bflmp->bflm_txn_cache; bflmtcp != NULL; + bflmtcp = bflmtcp->bflmtc_last){ + bflmtcp->bflmtc_next = head; + head = bflmtcp; + } + + /* Then replay, using it */ + for(; head != NULL; head = head->bflmtc_next){ + for(dp = head->bflmtc_data; dp < head->bflmtc_caster;){ + bool isins; + + /* For actual loading always use memcpy() for simplicity. + * (That is: C standard and undefined behaviour, who knows?) 
*/ + memcpy(&kl, dp, sizeof kl); + dp += sizeof kl; + if((isins = ((kl & 0x80000000u) != 0))){ + kl ^= 0x80000000u; + memcpy(&vl, dp, sizeof vl); + dp += sizeof vl; + } + + key.mv_size = kl; + key.mv_data = dp; + dp += kl; + if(isins){ + val.mv_size = vl; + val.mv_data = dp; + dp += vl; + + e = mdb_cursor_put(bflmp->bflm_cursor, &key, &val, 0); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_put()"; + goto jleave; + } + }else{ + e = mdb_cursor_get(bflmp->bflm_cursor, &key, NULL, + MDB_SET_KEY); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_get() for delete"; + goto jleave; + } + e = mdb_cursor_del(bflmp->bflm_cursor, 0); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_del()"; + goto jleave; + } + } + dp = (char*)a_BFLM_TXN_CACHE_ALIGN((uintptr_t)dp); + } + } + + emsg = NULL; +jleave: + return emsg; +} + +static void +a_bflm_txn_cache_free(struct a_bflm *bflmp){ + struct a_bflm_txn_cache *bflmtcp; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: cache_free(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + while((bflmtcp = bflmp->bflm_txn_cache) != NULL){ + bflmp->bflm_txn_cache = bflmtcp->bflmtc_last; + xfree(bflmtcp); + } +} +#endif /* a_BFLM_FIXED_SIZE */ + +dsm_t /* const TODO */ *dsm = &a_bflm_dsm; + +void * +db_open(void *env, bfpath *bfp, dbmode_t open_mode){ + struct a_bflm *bflmp; + UNUSED(env); + + if((bflmp = a_bflm_init(bfp, (open_mode == DS_READ))) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: db_open(%p [%s; rdonly=%d])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath, + !!(bflmp->bflm_flags & a_BFLM_RDONLY)); +jleave: + return bflmp; +} + +void +db_close(void *vhandle){ + struct a_bflm *bflmp; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: db_close(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + a_bflm_free(bflmp); +jleave:; +} + +bool +db_is_swapped(void *vhandle){ + UNUSED(vhandle); 
+ return false; +} + +bool +db_created(void *vhandle){ + bool created; + struct a_bflm *bflmp; + + created = ((bflmp = vhandle) != NULL && + (bflmp->bflm_flags & a_BFLM_DB_CREATED) != 0); + return created; +} + +int +db_get_dbvalue(void *vhandle, const dbv_t *token, dbv_t *value){ + MDB_val key, val; + char const *emsg; + struct a_bflm *bflmp; + int e; + + e = DS_NOTFOUND; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) + goto jleave; + + if((size_t)token->leng > bflmp->bflm_maxkeysize){ + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: get_dbvalue: key too big " + "(> %lu bytes), ignoring %.*s\n", + (long)getpid(),(unsigned long)bflmp->bflm_maxkeysize, + (int)token->leng, token->data); + goto jleave; + } + + key.mv_data = token->data; + key.mv_size = token->leng; + e = mdb_cursor_get(bflmp->bflm_cursor, &key, &val, MDB_SET); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_get()"; + goto jerr; + } + + if(val.mv_size > value->leng){ + emsg = "value storage too small"; + e = ENOSPC; + goto jerr; + } + memcpy(value->data, val.mv_data, value->leng = val.mv_size); + + e = 0; +jleave: + if(DEBUG_DATABASE(3)) + fprintf(dbgout, "LMDB db_get_dbvalue(): %lu <%.*s> -> %d\n", + (unsigned long)token->leng, (int)token->leng, token->data, + (e == 0)); + return e; +jerr: + if(e != MDB_NOTFOUND){ + print_error(__FILE__, __LINE__, "LMDB[%ld]: db_get_dbvalue(), " + "%s: %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + exit(EX_ERROR); + } + e = DS_NOTFOUND; + goto jleave; +} + +int +db_set_dbvalue(void *vhandle, const dbv_t *token, const dbv_t *value){ + MDB_val key, val; + char const *emsg; + struct a_bflm *bflmp; +#ifndef a_BFLM_FIXED_SIZE + int retries; +#endif + int e; + + e = 0; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) + goto jleave; + + if((size_t)token->leng > bflmp->bflm_maxkeysize){ + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: 
set_dbvalue: key too big " + "(> %lu bytes), ignoring %.*s\n", + (long)getpid(),(unsigned long)bflmp->bflm_maxkeysize, + (int)token->leng, token->data); + goto jleave; + } + +#ifndef a_BFLM_FIXED_SIZE + retries = 0; +jredo: +#endif + key.mv_data = token->data; + key.mv_size = token->leng; + val.mv_data = value->data; + val.mv_size = value->leng; + e = mdb_cursor_put(bflmp->bflm_cursor, &key, &val, 0); + if(e != MDB_SUCCESS){ +#ifndef a_BFLM_FIXED_SIZE + if(e == MDB_MAP_FULL && ++retries <= a_BFLM_GROW_TRIES && + a_bflm_txn_mapfull(bflmp, true)) + goto jredo; +#endif + emsg = "mdb_cursor_put()"; + goto jerr; + } + +#ifndef a_BFLM_FIXED_SIZE + if((emsg = a_bflm_txn_cache_put(bflmp, &key, &val)) != NULL) + goto jerr; +#endif + + e = 0; +jleave: + if(DEBUG_DATABASE(3)) + fprintf(dbgout, "LMDB db_set_dbvalue(): %lu <%.*s> -> %d\n", + (unsigned long)token->leng, (int)token->leng, token->data, + (e == 0)); + return e; +jerr: + print_error(__FILE__, __LINE__, "LMDB[%ld]: db_set_dbvalue(), %s: %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + exit(EX_ERROR); +} + +int +db_delete(void *vhandle, const dbv_t *token){ + MDB_val key; + char const *emsg; + struct a_bflm *bflmp; +#ifndef a_BFLM_FIXED_SIZE + int retries; +#endif + int e; + + e = 0; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) + goto jleave; + + if((size_t)token->leng > bflmp->bflm_maxkeysize){ + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: delete: key too big " + "(> %lu bytes), ignoring %.*s\n", + (long)getpid(),(unsigned long)bflmp->bflm_maxkeysize, + (int)token->leng, token->data); + goto jleave; + } + +#ifndef a_BFLM_FIXED_SIZE + retries = 0; +jredo: +#endif + key.mv_data = token->data; + key.mv_size = token->leng; + e = mdb_cursor_get(bflmp->bflm_cursor, &key, NULL, MDB_SET_KEY); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_get()"; + goto jerr; + } + + e = mdb_cursor_del(bflmp->bflm_cursor, 0); + if(e != MDB_SUCCESS){ +#ifndef 
a_BFLM_FIXED_SIZE + /* Should not happen, though */ + if(e == MDB_MAP_FULL && ++retries <= a_BFLM_GROW_TRIES && + a_bflm_txn_mapfull(bflmp, true)) + goto jredo; +#endif + emsg = "mdb_cursor_del()"; + goto jerr; + } + +#ifndef a_BFLM_FIXED_SIZE + if((emsg = a_bflm_txn_cache_put(bflmp, &key, NULL)) != NULL) + goto jerr; +#endif + + e = 0; +jleave: + if(DEBUG_DATABASE(3)) + fprintf(dbgout, "LMDB db_delete(): %lu <%.*s> -> %d\n", + (unsigned long)token->leng, (int)token->leng, token->data, + (e == 0)); + return e; +jerr: + print_error(__FILE__, __LINE__, "LMDB[%ld]: db_delete(), %s: %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + if(e != MDB_NOTFOUND) + exit(EX_ERROR); + e = DS_NOTFOUND; + goto jleave; +} + +void +db_flush(void *vhandle){ + struct a_bflm *bflmp; + + if((bflmp = vhandle) != NULL){ + int e; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: db_flush(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + e = mdb_env_sync(bflmp->bflm_env, true); + if(e != MDB_SUCCESS) + print_error(__FILE__, __LINE__, "LMDB[%ld]: db_flush(): %d, %s", + (long)getpid(), e, mdb_strerror(e)); + } +} + +ex_t +db_foreach(void *vhandle, db_foreach_t hook, void *userdata){ + dbv_t dbv_key, dbv_val; + MDB_val key, val; + char *buf; + MDB_cursor_op cursor_op; + MDB_cursor *fecp; + struct a_bflm *bflmp; + ex_t rv; + + rv = EX_OK; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: db_foreach(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + if(mdb_cursor_open(bflmp->bflm_txn, bflmp->bflm_dbi, &fecp + ) != MDB_SUCCESS){ + rv = EX_ERROR; + goto jleave; + } + + buf = NULL; + for(cursor_op = MDB_FIRST;; cursor_op = MDB_NEXT){ + size_t i; + int e; + + e = mdb_cursor_get(fecp, &key, &val, cursor_op); + if(e != MDB_SUCCESS){ + if(e != MDB_NOTFOUND){ + print_error(__FILE__, __LINE__, "LMDB[%ld]: db_foreach(): " + 
"%d, %s", + (long)getpid(), e, mdb_strerror(e)); + rv = EX_ERROR; + } + break; + } + + /* Copy to dbv_key and dbv_val in order to avoid loss upon possible + * action on the DB; should not matter, but NUL terminate them */ + dbv_key.leng = (uint32_t)(i = key.mv_size); + dbv_key.data = buf = xrealloc(buf, i +1 + val.mv_size +1); + memcpy(buf, key.mv_data, i); + buf[i++] = '\0'; + dbv_val.leng = (uint32_t)val.mv_size; + memcpy(dbv_val.data = &buf[i], val.mv_data, val.mv_size); + i += val.mv_size; + buf[i++] = '\0'; + + rv = hook(&dbv_key, &dbv_val, userdata); + + if(rv != EX_OK) + break; + } + if(buf != NULL) + xfree(buf); + + mdb_cursor_close(fecp); +jleave: + return rv; +} + +const char * +db_version_str(void){ + return MDB_VERSION_STRING; +} + +char const * +db_str_err(int e){ + return mdb_strerror(e); +} + +/* vim:set et sts=4 sw=4 sts=4 tw=79: */ Modified: trunk/bogofilter/src/tests/t.frame =================================================================== --- trunk/bogofilter/src/tests/t.frame 2018-07-19 20:04:35 UTC (rev 7068) +++ trunk/bogofilter/src/tests/t.frame 2018-07-19 20:07:01 UTC (rev 7069) @@ -53,6 +53,7 @@ *Kyoto*) DB_TXN=true ;; *SQLite*) DB_TXN=true ;; *TrivialDB*) DB_TXN=false ;; + *LMDB*) DB_TXN=true ;; *) echo >&2 "Unknown data base type in bogofilter -V: $DB_NAME" exit 1 ;; esac This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
From: <m-...@us...> - 2018-07-19 20:04:39
Revision: 7068
          http://sourceforge.net/p/bogofilter/code/7068
Author:   m-a
Date:     2018-07-19 20:04:35 +0000 (Thu, 19 Jul 2018)
Log Message:
-----------
Resynch trunk.

Property Changed:
----------------
    branches/lmdb-support/

Index: branches/lmdb-support
===================================================================
--- branches/lmdb-support	2018-07-19 20:03:48 UTC (rev 7067)
+++ branches/lmdb-support	2018-07-19 20:04:35 UTC (rev 7068)

Property changes on: branches/lmdb-support
___________________________________________________________________
Modified: svn:mergeinfo
## -1 +1 ##
-/trunk:7060-7063
\ No newline at end of property
+/trunk:7060-7067
\ No newline at end of property
From: <m-...@us...> - 2018-07-19 20:03:51
Revision: 7067
          http://sourceforge.net/p/bogofilter/code/7067
Author:   m-a
Date:     2018-07-19 20:03:48 +0000 (Thu, 19 Jul 2018)
Log Message:
-----------
Add news and build docs about LMDB.

Modified Paths:
--------------
    branches/lmdb-support/bogofilter/INSTALL
    branches/lmdb-support/bogofilter/NEWS

Modified: branches/lmdb-support/bogofilter/INSTALL
===================================================================
--- branches/lmdb-support/bogofilter/INSTALL	2018-07-19 19:56:17 UTC (rev 7066)
+++ branches/lmdb-support/bogofilter/INSTALL	2018-07-19 20:03:48 UTC (rev 7067)
@@ -18,10 +18,12 @@
 3. SQLite (3.2.6 or newer) http://sqlite.org/
 4. TokyoCabinet http://fallabs.com/tokyocabinet/
 5. KyotoCabinet http://fallabs.com/kyotocabinet/
+6. LMDB Lightning Memory-Mapped Database Manager https://symas.com/lmdb/

 You can use --with-database=ARG (choose from db (for Berkeley DB), qdbm,
-sqlite) to pick the database backend (you must have installed the
-database and the corresponding developer package). db is the default.
+sqlite, tokyocabinet, kyotocabinet, lmdb) to pick the database backend (you
+must have installed the database and the corresponding developer package).
+db is the default.

 If you are using "db", you can use --disable-transactions or
 --enable-transactions to force the use of 1a or 1b. The default is to

Modified: branches/lmdb-support/bogofilter/NEWS
===================================================================
--- branches/lmdb-support/bogofilter/NEWS	2018-07-19 19:56:17 UTC (rev 7066)
+++ branches/lmdb-support/bogofilter/NEWS	2018-07-19 20:03:48 UTC (rev 7067)
@@ -15,6 +15,11 @@

 -------------------------------------------------------------------------------

+	2018-07-19
+	* Support for using LMDB (Lightning Memory-Mapped Database Manager)
+	  as the database back-end. Suggested, courteously implemented and
+	  contributed by Steffen Nurpmeso, steffen .at. sdaoden.eu.
+
 	2018-07-17
 	* The Berkeley DB backend driver forgoes DB_NOSYNC in transactional
 	  mode, so as to synchronize changes from the logs back into the .db
From: <m-...@us...> - 2018-07-19 19:56:23
Revision: 7066 http://sourceforge.net/p/bogofilter/code/7066 Author: m-a Date: 2018-07-19 19:56:17 +0000 (Thu, 19 Jul 2018) Log Message: ----------- Update sent by Steffen Nurpmeso, fixes self-test failures. Modified Paths: -------------- branches/lmdb-support/bogofilter/configure.ac branches/lmdb-support/bogofilter/src/datastore_lmdb.c Modified: branches/lmdb-support/bogofilter/configure.ac =================================================================== --- branches/lmdb-support/bogofilter/configure.ac 2018-07-17 23:24:07 UTC (rev 7065) +++ branches/lmdb-support/bogofilter/configure.ac 2018-07-19 19:56:17 UTC (rev 7066) @@ -533,9 +533,6 @@ ;; xlmdb) AC_DEFINE(ENABLE_LMDB_DATASTORE,1, [Enable LMDB datastore]) - AC_MSG_WARN([Note that LMDB support is not yet complete.]) - AC_MSG_WARN([You may wish to choose a different database driver for now.]) - sleep 10 DB_TYPE=lmdb DB_EXT=.lmdb AC_LIB_LINKFLAGS([lmdb]) Modified: branches/lmdb-support/bogofilter/src/datastore_lmdb.c =================================================================== --- branches/lmdb-support/bogofilter/src/datastore_lmdb.c 2018-07-17 23:24:07 UTC (rev 7065) +++ branches/lmdb-support/bogofilter/src/datastore_lmdb.c 2018-07-19 19:56:17 UTC (rev 7066) @@ -21,38 +21,56 @@ * You open an environment (which may contain multiple DBs), create * a transaction and open a DB in that transaction. * 2. LMDB is based on a finite-sized memory map. When a writable transaction - * reaches the size limit, the transaction must be aborted, hen the + * reaches the size limit, the transaction must be aborted, then the * environment must be resized, then a new transaction has to be created. * Resizing will not shrink, effectively. * 3. We assume xmalloc() aborts if out of memory. * 4. We assume no token->leng actually exceeds int32_t. + * 5. mdb_env_get_maxkeysize(): + * Depends on the compile-time constant #MDB_MAXKEYSIZE. Default 511. + * We reject any keys which excess this. 
* * In order to be able to deal with 2. we need to track all changes that are * performed in a txn, so that in case we are running against the wall we are * capable to replay all changes after having resized the map. + * + * Alternatively, define a_BFLM_FIXED_SIZE, in which case all the replay code + * is not compiled, but instead the given size is fixed, and any DB overflow + * results in program abortion. Since the DB should only consume disc space + * for those pages which are used, this should not hurt in practice. */ +/* Alternative implementation: fixed DB size */ +/*#define a_BFLM_FIXED_SIZE (1u << 31)*/ + /* mdb_env_set_maxreaders() */ #define a_BFLM_MAXREADERS 15 -/* Minimum/initial database size, and DB size grow. - * Space it so that a DB load does not run against walls too many times. - * We try _TRIES times to resize for a single new entry before giving up */ -#define a_BFLM_MINSIZE (1u << 21) -#define a_BFLM_GROW (1u << 24) -#define a_BFLM_GROW_TRIES 3 +#ifndef a_BFLM_FIXED_SIZE + /* DB size grow. Must be a power of two! + * Space it so that a DB load does not run against walls too many times. + * We try _TRIES times to resize for a single new entry before giving up. + * Note that the minimum size must be capable to store the two used DB + * structured without resize etc. */ +# define a_BFLM_GROW (1u << 23) +# define a_BFLM_GROW_TRIES 3 -/* Size of one chunk of the intermediate txn cache, as above. - * Space it so that a DB load does not require all too many. - * Of course, if a token requires more space, we allocate a larger chunk */ -#define a_BFLM_TXN_CACHE_SIZE (1u << 20) + /* Size of one chunk of the intermediate txn cache, as above. + * Space it so that a DB load does not require all too many. + * Of course, if a token requires more space, we allocate a larger chunk */ +# define a_BFLM_TXN_CACHE_SIZE (1u << 20) -/* An entry consists of an uint32_t describing the length of the key. 
- * If the high bit is set an uint32_t describing the length of the value - * follows. After the data buffers there possibly is alignment pad */ -#define a_BFLM_TXN_CACHE_ALIGN(X) \ + /* An entry consists of an uint32_t describing the length of the key. + * If the high bit is set an uint32_t describing the length of the value + * follows. After the data buffers there possibly is alignment pad */ +# define a_BFLM_TXN_CACHE_ALIGN(X) \ (((X) + (sizeof(uint32_t) - 1)) & ~(sizeof(uint32_t) - 1)) +#endif /* a_BFLM_FIXED_SIZE */ +/* The DB names we use: one for our "is-created" event, the other for data */ +#define a_BFLM_DB_NAME_MAN "BF_MAN" +#define a_BFLM_DB_NAME_DAT "BF_DAT" + #include "common.h" #include <errno.h> @@ -76,20 +94,25 @@ a_BFLM_NONE, a_BFLM_DEBUG = 1u<<0, a_BFLM_RDONLY = 1u<<1, - a_BFLM_HAS_TXN = 1u<<2 + a_BFLM_DB_CREATED = 1u<<2, /* DBs were newly created */ + a_BFLM_DB_UNAVAIL = 1u<<3, /* rdonly open, but no DB exists yet! */ + a_BFLM_HAS_TXN = 1u<<4 }; struct a_bflm{ char *bflm_filepath; /* bfpath.filepath (points to &self[1]) */ MDB_env *bflm_env; - size_t bflm_mapsize; /* Current notion of DB map size */ MDB_txn *bflm_txn; MDB_cursor *bflm_cursor; MDB_dbi bflm_dbi; uint32_t bflm_flags; + size_t bflm_maxkeysize; /* mdb_env_get_maxkeysize() */ +#ifndef a_BFLM_FIXED_SIZE struct a_bflm_txn_cache *bflm_txn_cache; /* Stack thereof */ +#endif }; +#ifndef a_BFLM_FIXED_SIZE struct a_bflm_txn_cache{ struct a_bflm_txn_cache *bflmtc_last; struct a_bflm_txn_cache *bflmtc_next; /* Needs to be build before use! */ @@ -98,9 +121,11 @@ /* Actually points to &self[1] TODO [0] or [8], dep. __STDC_VERSION__! 
*/ char *bflmtc_data; }; +#endif /**/ -static struct a_bflm *a_bflm_init(bfpath *bfp); +static struct a_bflm *a_bflm_init(bfpath *bfp, bool rdonly); +static int a_bflm__check_create(struct a_bflm *bflmp); static void a_bflm_free(struct a_bflm *bflmp); /**/ @@ -108,6 +133,7 @@ static int a_bflm_txn_abort(void *vhandle); static int a_bflm_txn_commit(void *vhandle); +#ifndef a_BFLM_FIXED_SIZE /**/ static bool a_bflm_txn_mapfull(struct a_bflm *bflmp, bool close_cursor); @@ -122,6 +148,7 @@ /* Free the recovery stack and possible heap data */ static void a_bflm_txn_cache_free(struct a_bflm *bflmp); +#endif /* a_BFLM_FIXED_SIZE */ static dsm_t /* TODO const*/ a_bflm_dsm = { /* public -- used in datastore.c */ @@ -152,9 +179,8 @@ }; static struct a_bflm * -a_bflm_init(bfpath *bfp){ +a_bflm_init(bfpath *bfp, bool rdonly){ /* No variable array for .bflm_filepath, use same method as in word.h */ - MDB_envinfo envinfo; int e; char const *emsg; struct a_bflm *rv; @@ -165,8 +191,10 @@ memset(rv, 0, sizeof *rv); memcpy(rv->bflm_filepath = (char*)&rv[1], bfp->filepath, i); - rv->bflm_flags = ((DEBUG_DATABASE(1) || getenv("BF_DEBUG_DB") != NULL) - ? a_BFLM_DEBUG : a_BFLM_NONE); + rv->bflm_flags = (((DEBUG_DATABASE(1) || getenv("BF_DEBUG_DB") != NULL) + ? a_BFLM_DEBUG : a_BFLM_NONE) | + (rdonly ? a_BFLM_RDONLY : a_BFLM_NONE)); + e = mdb_env_create(&rv->bflm_env); if(e != MDB_SUCCESS){ emsg = "mdb_env_open()"; @@ -173,14 +201,23 @@ goto jerr1; } + rv->bflm_maxkeysize = mdb_env_get_maxkeysize(rv->bflm_env); + + /* To acommodate with bogofilter's db_created() mechanism we cannot use the + * unnamed DB which "always exists", but must place data in a named one */ + e = mdb_env_set_maxdbs(rv->bflm_env, 2); + if(e != MDB_SUCCESS){ + emsg = "mdb_env_set_maxdbs()"; + goto jerr1; + } + mdb_env_set_maxreaders(rv->bflm_env, a_BFLM_MAXREADERS); - /* The "problem" is that we need to set_mapsize() before env_open(), - * otherwise the LMDB default will be used as a default (in 0.9.22). 
- * But since this is cheap at this point just do it.. */ - /* TODO We may not do this because with v0.9.22 a further DB open - * TODO may crash in mdb_*_put() after a growing _mapsize! */ -#if 0 - e = mdb_env_set_mapsize(rv->bflm_env, a_BFLM_MINSIZE); + + /* TODO We may not do this unless going for a huge fixed size, because with + * TODO v0.9.22 a further DB open will then crash in mdb_*_put() after + * TODO a growing _mapsize call! ... */ +#ifdef a_BFLM_FIXED_SIZE + e = mdb_env_set_mapsize(rv->bflm_env, a_BFLM_FIXED_SIZE); if(e != MDB_SUCCESS){ emsg = "mdb_env_set_mapsize()"; goto jerr2; @@ -193,15 +230,16 @@ goto jerr2; } - /* ..then query the actual environment and use the reported map size: - * Note: LMDB documents to reject requests to shrink the real map size! */ - /* no error defined */mdb_env_info(rv->bflm_env, &envinfo); - rv->bflm_mapsize = envinfo.me_mapsize; + /* Let us fake a "has been created" event :( */ + if(!(rv->bflm_flags & a_BFLM_RDONLY) && + (e = a_bflm__check_create(rv)) != MDB_SUCCESS){ + emsg = "cannot handle management DB"; + goto jerr2; + } if(rv->bflm_flags & a_BFLM_DEBUG) - fprintf(dbgout, "LMDB[%ld]: init: %p/%s, mapsize: %lu\n", - (long)getpid(), rv, rv->bflm_filepath, - (unsigned long)rv->bflm_mapsize); + fprintf(dbgout, "LMDB[%ld]: init: %p [%s]\n", + (long)getpid(), rv, rv->bflm_filepath); jleave: return rv; @@ -216,14 +254,58 @@ goto jleave; } +static int +a_bflm__check_create(struct a_bflm *bflmp){ + char const *db_name; + unsigned int f; + int e; + +jredo_txn: + e = mdb_txn_begin(bflmp->bflm_env, NULL, 0, &bflmp->bflm_txn); + if(e != MDB_SUCCESS){ + if(e == MDB_MAP_RESIZED){ + mdb_env_set_mapsize(bflmp->bflm_env, 0); + goto jredo_txn; + } + goto jleave; + } + + db_name = a_BFLM_DB_NAME_MAN; + f = 0; +jredo_dbi: + e = mdb_dbi_open(bflmp->bflm_txn, db_name, f, &bflmp->bflm_dbi); + if(e != MDB_SUCCESS){ + if(e == MDB_NOTFOUND && f == 0){ + bflmp->bflm_flags |= a_BFLM_DB_CREATED; + f = MDB_CREATE; + goto jredo_dbi; + } + goto jerr; 
+ } + + if(f == MDB_CREATE && db_name == a_BFLM_DB_NAME_MAN){ + db_name = a_BFLM_DB_NAME_DAT; + goto jredo_dbi; + } + + e = mdb_txn_commit(bflmp->bflm_txn); + if(e != MDB_SUCCESS) +jerr: + mdb_txn_abort(bflmp->bflm_txn); +jleave: + return e; +} + static void a_bflm_free(struct a_bflm *bflmp){ if(bflmp != NULL){ +#ifndef a_BFLM_FIXED_SIZE if(bflmp->bflm_txn_cache != NULL){ if(DEBUG_DATABASE(1)) fprintf(dbgout, "LMDB _free(): error: there is txn_cache!\n"); a_bflm_txn_cache_free(bflmp); } +#endif mdb_env_close(bflmp->bflm_env); @@ -250,22 +332,25 @@ fprintf(dbgout, "LMDB[%ld]: txn_begin(%p [%s])\n", (long)getpid(), bflmp, bflmp->bflm_filepath); - if(DEBUG_DATABASE(1) && (bflmp->bflm_flags & a_BFLM_HAS_TXN)){ - fprintf(dbgout, "LMDB txn_begin(): error: HAS_TXN!\n"); - e = DST_FAILURE; - goto jleave; - } - +jredo_txn: e = mdb_txn_begin(bflmp->bflm_env, NULL, (bflmp->bflm_flags & a_BFLM_RDONLY ? MDB_RDONLY : 0), &bflmp->bflm_txn); if(e != MDB_SUCCESS){ + if(e == MDB_MAP_RESIZED){ + mdb_env_set_mapsize(bflmp->bflm_env, 0); + goto jredo_txn; + } emsg = "mdb_txn_begin()"; goto jerr1; } - e = mdb_dbi_open(bflmp->bflm_txn, NULL, 0, &bflmp->bflm_dbi); + e = mdb_dbi_open(bflmp->bflm_txn, a_BFLM_DB_NAME_DAT, 0, &bflmp->bflm_dbi); if(e != MDB_SUCCESS){ + if(e == MDB_NOTFOUND){ + bflmp->bflm_flags |= a_BFLM_DB_UNAVAIL; + goto junavail; + } emsg = "mdb_dbi_open()"; goto jerr2; } @@ -276,6 +361,7 @@ goto jerr2; } +junavail: bflmp->bflm_flags |= a_BFLM_HAS_TXN; e = DST_OK; jleave: @@ -304,15 +390,14 @@ fprintf(dbgout, "LMDB[%ld]: txn_abort(%p [%s])\n", (long)getpid(), bflmp, bflmp->bflm_filepath); - if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ - fprintf(dbgout, "LMDB txn_abort(): error: !HAS_TXN!\n"); - e = DST_FAILURE; - goto jleave; - } + if(!(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL)) + mdb_cursor_close(bflmp->bflm_cursor); - mdb_cursor_close(bflmp->bflm_cursor); mdb_txn_abort(bflmp->bflm_txn); + +#ifndef a_BFLM_FIXED_SIZE a_bflm_txn_cache_free(bflmp); +#endif 
bflmp->bflm_flags &= ~a_BFLM_HAS_TXN; jleave: @@ -322,7 +407,10 @@ static int a_bflm_txn_commit(void *vhandle){ struct a_bflm *bflmp; - int e, retries; +#ifndef a_BFLM_FIXED_SIZE + int retries; +#endif + int e; e = DST_OK; @@ -333,28 +421,30 @@ fprintf(dbgout, "LMDB[%ld]: txn_commit(%p [%s])\n", (long)getpid(), bflmp, bflmp->bflm_filepath); - if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ - fprintf(dbgout, "LMDB txn_commit(): error: !HAS_TXN!\n"); - e = DST_FAILURE; - goto jleave; - } + if(!(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL)) + mdb_cursor_close(bflmp->bflm_cursor); - mdb_cursor_close(bflmp->bflm_cursor); - +#ifndef a_BFLM_FIXED_SIZE retries = 0; jredo: +#endif e = mdb_txn_commit(bflmp->bflm_txn); if(e != MDB_SUCCESS){ - if(e == MDB_MAP_FULL && ++retries <= a_BFLM_GROW_TRIES && +#ifndef a_BFLM_FIXED_SIZE + if((e == MDB_MAP_FULL || e == MDB_MAP_RESIZED) && + ++retries <= a_BFLM_GROW_TRIES && a_bflm_txn_mapfull(bflmp, false)){ mdb_cursor_close(bflmp->bflm_cursor); goto jredo; } +#endif mdb_txn_abort(bflmp->bflm_txn); e = MDB_PANIC; } +#ifndef a_BFLM_FIXED_SIZE a_bflm_txn_cache_free(bflmp); +#endif bflmp->bflm_flags &= ~a_BFLM_HAS_TXN; if(e == MDB_SUCCESS) @@ -368,6 +458,7 @@ return e; } +#ifndef a_BFLM_FIXED_SIZE static bool a_bflm_txn_mapfull(struct a_bflm *bflmp, bool close_cursor){ MDB_envinfo envinfo; @@ -375,33 +466,39 @@ int e; size_t i; - if(DEBUG_DATABASE(1) && (bflmp->bflm_flags & a_BFLM_RDONLY)) - fprintf(dbgout, "LDMB txn_mapfull() on RDONLY DB!\n"); + /* Abort transaction */ + if(DEBUG_DATABASE(1) && (bflmp->bflm_flags & a_BFLM_DB_UNAVAIL)) + exit(EX_ERROR); - /* Abort transaction */ if(close_cursor) mdb_cursor_close(bflmp->bflm_cursor); + mdb_txn_abort(bflmp->bflm_txn); /* Resize map */ - i = bflmp->bflm_mapsize; - i += a_BFLM_GROW; +jredo_txn: + /* no error defined */mdb_env_info(bflmp->bflm_env, &envinfo); + i = envinfo.me_mapsize; + i += a_BFLM_GROW / 10; + i = (i + (a_BFLM_GROW - 1)) & ~(a_BFLM_GROW - 1); e = 
mdb_env_set_mapsize(bflmp->bflm_env, i); if(e != MDB_SUCCESS){ emsg = "mdb_env_set_mapsize()"; goto jerr1; } - /* no error defined */mdb_env_info(bflmp->bflm_env, &envinfo); - bflmp->bflm_mapsize = envinfo.me_mapsize; /* Recreate transaction */ e = mdb_txn_begin(bflmp->bflm_env, NULL, 0, &bflmp->bflm_txn); if(e != MDB_SUCCESS){ + if(e == MDB_MAP_RESIZED){ + mdb_env_set_mapsize(bflmp->bflm_env, 0); + goto jredo_txn; + } emsg = "mdb_txn_begin()"; goto jerr1; } - e = mdb_dbi_open(bflmp->bflm_txn, NULL, 0, &bflmp->bflm_dbi); + e = mdb_dbi_open(bflmp->bflm_txn, a_BFLM_DB_NAME_DAT, 0, &bflmp->bflm_dbi); if(e != MDB_SUCCESS){ emsg = "mdb_dbi_open()"; goto jerr2; @@ -419,7 +516,8 @@ if(bflmp->bflm_flags & a_BFLM_DEBUG) fprintf(dbgout, "LMDB[%ld]: txn_mapfull(%p [%s]): " "recreated, new size %lu\n", - (long)getpid(), bflmp, bflmp->bflm_filepath, bflmp->bflm_mapsize); + (long)getpid(), bflmp, bflmp->bflm_filepath, + (unsigned long)envinfo.me_mapsize); e = 0; jleave: return (e == 0); @@ -581,6 +679,7 @@ xfree(bflmtcp); } } +#endif /* a_BFLM_FIXED_SIZE */ dsm_t /* const TODO */ *dsm = &a_bflm_dsm; @@ -589,12 +688,9 @@ struct a_bflm *bflmp; UNUSED(env); - if((bflmp = a_bflm_init(bfp)) == NULL) + if((bflmp = a_bflm_init(bfp, (open_mode == DS_READ))) == NULL) goto jleave; - if(open_mode == DS_READ) - bflmp->bflm_flags |= a_BFLM_RDONLY; - if(bflmp->bflm_flags & a_BFLM_DEBUG) fprintf(dbgout, "LMDB[%ld]: db_open(%p [%s; rdonly=%d])\n", (long)getpid(), bflmp, bflmp->bflm_filepath, @@ -614,9 +710,6 @@ fprintf(dbgout, "LMDB[%ld]: db_close(%p [%s])\n", (long)getpid(), bflmp, bflmp->bflm_filepath); - if(DEBUG_DATABASE(1) && (bflmp->bflm_flags & a_BFLM_HAS_TXN)) - fprintf(dbgout, "LMDB db_close(): error: HAS_TXN!\n"); - a_bflm_free(bflmp); jleave:; } @@ -629,7 +722,12 @@ bool db_created(void *vhandle){ - return (vhandle != NULL); + bool created; + struct a_bflm *bflmp; + + created = ((bflmp = vhandle) != NULL && + (bflmp->bflm_flags & a_BFLM_DB_CREATED) != 0); + return created; } int @@ 
-644,15 +742,18 @@ if((bflmp = vhandle) == NULL) goto jleave; - if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ - emsg = "!HAS_TXN!"; - goto jerr; + if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) + goto jleave; + + if((size_t)token->leng > bflmp->bflm_maxkeysize){ + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: get_dbvalue: key too big " + "(> %lu bytes), ignoring %.*s\n", + (long)getpid(),(unsigned long)bflmp->bflm_maxkeysize, + (int)token->leng, token->data); + goto jleave; } - if(DEBUG_DATABASE(3)) - fprintf(dbgout, "LMDB db_get_dbvalue(): %lu <%.*s>\n", - (unsigned long)token->leng, (int)token->leng, token->data); - key.mv_data = token->data; key.mv_size = token->leng; e = mdb_cursor_get(bflmp->bflm_cursor, &key, &val, MDB_SET); @@ -670,6 +771,10 @@ e = 0; jleave: + if(DEBUG_DATABASE(3)) + fprintf(dbgout, "LMDB db_get_dbvalue(): %lu <%.*s> -> %d\n", + (unsigned long)token->leng, (int)token->leng, token->data, + (e == 0)); return e; jerr: if(e != MDB_NOTFOUND){ @@ -687,7 +792,10 @@ MDB_val key, val; char const *emsg; struct a_bflm *bflmp; - int e, retries; +#ifndef a_BFLM_FIXED_SIZE + int retries; +#endif + int e; e = 0; @@ -694,17 +802,22 @@ if((bflmp = vhandle) == NULL) goto jleave; - if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ - emsg = "!HAS_TXN!"; - goto jerr; + if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) + goto jleave; + + if((size_t)token->leng > bflmp->bflm_maxkeysize){ + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: set_dbvalue: key too big " + "(> %lu bytes), ignoring %.*s\n", + (long)getpid(),(unsigned long)bflmp->bflm_maxkeysize, + (int)token->leng, token->data); + goto jleave; } - if(DEBUG_DATABASE(3)) - fprintf(dbgout, "LMDB db_set_dbvalue(): %lu <%.*s>\n", - (unsigned long)token->leng, (int)token->leng, token->data); - +#ifndef a_BFLM_FIXED_SIZE retries = 0; jredo: +#endif key.mv_data = token->data; key.mv_size = token->leng; val.mv_data = value->data; @@ -711,18 +824,26 @@ 
val.mv_size = value->leng; e = mdb_cursor_put(bflmp->bflm_cursor, &key, &val, 0); if(e != MDB_SUCCESS){ +#ifndef a_BFLM_FIXED_SIZE if(e == MDB_MAP_FULL && ++retries <= a_BFLM_GROW_TRIES && a_bflm_txn_mapfull(bflmp, true)) goto jredo; +#endif emsg = "mdb_cursor_put()"; goto jerr; } +#ifndef a_BFLM_FIXED_SIZE if((emsg = a_bflm_txn_cache_put(bflmp, &key, &val)) != NULL) goto jerr; +#endif e = 0; jleave: + if(DEBUG_DATABASE(3)) + fprintf(dbgout, "LMDB db_set_dbvalue(): %lu <%.*s> -> %d\n", + (unsigned long)token->leng, (int)token->leng, token->data, + (e == 0)); return e; jerr: print_error(__FILE__, __LINE__, "LMDB[%ld]: db_set_dbvalue(), %s: %d, %s", @@ -735,7 +856,10 @@ MDB_val key; char const *emsg; struct a_bflm *bflmp; - int e, retries; +#ifndef a_BFLM_FIXED_SIZE + int retries; +#endif + int e; e = 0; @@ -742,18 +866,22 @@ if((bflmp = vhandle) == NULL) goto jleave; - if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ - emsg = "!HAS_TXN!"; - e = DS_NOTFOUND; - goto jerr; + if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) + goto jleave; + + if((size_t)token->leng > bflmp->bflm_maxkeysize){ + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: delete: key too big " + "(> %lu bytes), ignoring %.*s\n", + (long)getpid(),(unsigned long)bflmp->bflm_maxkeysize, + (int)token->leng, token->data); + goto jleave; } - if(DEBUG_DATABASE(3)) - fprintf(dbgout, "LMDB db_delete(): %lu <%.*s>\n", - (unsigned long)token->leng, (int)token->leng, token->data); - +#ifndef a_BFLM_FIXED_SIZE retries = 0; jredo: +#endif key.mv_data = token->data; key.mv_size = token->leng; e = mdb_cursor_get(bflmp->bflm_cursor, &key, NULL, MDB_SET_KEY); @@ -764,19 +892,27 @@ e = mdb_cursor_del(bflmp->bflm_cursor, 0); if(e != MDB_SUCCESS){ +#ifndef a_BFLM_FIXED_SIZE /* Should not happen, though */ if(e == MDB_MAP_FULL && ++retries <= a_BFLM_GROW_TRIES && a_bflm_txn_mapfull(bflmp, true)) goto jredo; +#endif emsg = "mdb_cursor_del()"; goto jerr; } +#ifndef a_BFLM_FIXED_SIZE if((emsg = 
a_bflm_txn_cache_put(bflmp, &key, NULL)) != NULL) goto jerr; +#endif e = 0; jleave: + if(DEBUG_DATABASE(3)) + fprintf(dbgout, "LMDB db_delete(): %lu <%.*s> -> %d\n", + (unsigned long)token->leng, (int)token->leng, token->data, + (e == 0)); return e; jerr: print_error(__FILE__, __LINE__, "LMDB[%ld]: db_delete(), %s: %d, %s", @@ -811,6 +947,7 @@ MDB_val key, val; char *buf; MDB_cursor_op cursor_op; + MDB_cursor *fecp; struct a_bflm *bflmp; ex_t rv; @@ -819,21 +956,25 @@ if((bflmp = vhandle) == NULL) goto jleave; - if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ - rv = EX_ERROR; + if(bflmp->bflm_flags & a_BFLM_DB_UNAVAIL) goto jleave; - } if(bflmp->bflm_flags & a_BFLM_DEBUG) fprintf(dbgout, "LMDB[%ld]: db_foreach(%p [%s])\n", (long)getpid(), bflmp, bflmp->bflm_filepath); + if(mdb_cursor_open(bflmp->bflm_txn, bflmp->bflm_dbi, &fecp + ) != MDB_SUCCESS){ + rv = EX_ERROR; + goto jleave; + } + buf = NULL; for(cursor_op = MDB_FIRST;; cursor_op = MDB_NEXT){ size_t i; int e; - e = mdb_cursor_get(bflmp->bflm_cursor, &key, &val, cursor_op); + e = mdb_cursor_get(fecp, &key, &val, cursor_op); if(e != MDB_SUCCESS){ if(e != MDB_NOTFOUND){ print_error(__FILE__, __LINE__, "LMDB[%ld]: db_foreach(): " @@ -862,6 +1003,8 @@ } if(buf != NULL) xfree(buf); + + mdb_cursor_close(fecp); jleave: return rv; } This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
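As a reading aid for the a_bflm_txn_mapfull() hunk in revision 7066 above: after an MDB_MAP_FULL error the new map size is the current size plus a tenth of the grow step, rounded up to the next multiple of the grow step. The round-up uses a bit mask, which is why the source comment insists the grow step be a power of two. The sketch below reproduces only that arithmetic; `sketch_next_mapsize` and `SKETCH_GROW` are hypothetical names, and the constant mirrors `a_BFLM_GROW` (1u << 23) from the diff but is typed as size_t here, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a_BFLM_GROW; must be a power of two.
 * Typed as size_t so the mask below covers the full width of size_t. */
#define SKETCH_GROW ((size_t)1 << 23)

/* Return the map size to request after the map of size cur ran full:
 * add 10 percent of the grow step, then round up to a multiple of it. */
static size_t
sketch_next_mapsize(size_t cur){
   size_t i = cur;

   i += SKETCH_GROW / 10;
   /* Round up: valid only because SKETCH_GROW is a power of two */
   i = (i + (SKETCH_GROW - 1)) & ~(SKETCH_GROW - 1);
   return i;
}
```

Adding the tenth before rounding means a map that is already an exact multiple of the grow step still advances by one full step, so every call yields a strictly larger size. For example, a map of 8388608 bytes would grow to 16777216, and one of 10485760 bytes would also land on 16777216.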
From: <m-...@us...> - 2018-07-17 23:24:13
Revision: 7065
          http://sourceforge.net/p/bogofilter/code/7065
Author:   m-a
Date:     2018-07-17 23:24:07 +0000 (Tue, 17 Jul 2018)
Log Message:
-----------
Remove one more obsolete workaround.

Modified Paths:
--------------
    branches/lmdb-support/bogofilter/src/datastore_lmdb.c

Modified: branches/lmdb-support/bogofilter/src/datastore_lmdb.c
===================================================================
--- branches/lmdb-support/bogofilter/src/datastore_lmdb.c	2018-07-17 23:10:31 UTC (rev 7064)
+++ branches/lmdb-support/bogofilter/src/datastore_lmdb.c	2018-07-17 23:24:07 UTC (rev 7065)
@@ -742,12 +742,6 @@
     if((bflmp = vhandle) == NULL)
         goto jleave;

-    /* TODO bogofilter tries to put .WORDLIST_VERSION even into a RDONLY DB.
-     * TODO Since we silently fake set_dbvalue() success for RDONLY, do the
-     * TODO very same for delete(), too */
-    if(bflmp->bflm_flags & a_BFLM_RDONLY)
-        goto jleave;
-
     if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){
         emsg = "!HAS_TXN!";
         e = DS_NOTFOUND;
From: <m-...@us...> - 2018-07-17 23:10:35
Revision: 7064
          http://sourceforge.net/p/bogofilter/code/7064
Author:   m-a
Date:     2018-07-17 23:10:31 +0000 (Tue, 17 Jul 2018)
Log Message:
-----------
Sync with trunk, and drop 'do not write to read-only DB' shortcut.

The latter masked an error in the datastore layer and is no longer
required.

Modified Paths:
--------------
    branches/lmdb-support/bogofilter/NEWS
    branches/lmdb-support/bogofilter/doc/README.db
    branches/lmdb-support/bogofilter/src/datastore.c
    branches/lmdb-support/bogofilter/src/datastore_db.c
    branches/lmdb-support/bogofilter/src/datastore_lmdb.c

Property Changed:
----------------
    branches/lmdb-support/

Index: branches/lmdb-support
===================================================================
--- branches/lmdb-support	2018-07-17 23:06:11 UTC (rev 7063)
+++ branches/lmdb-support	2018-07-17 23:10:31 UTC (rev 7064)

Property changes on: branches/lmdb-support
___________________________________________________________________
Added: svn:mergeinfo
## -0,0 +1 ##
+/trunk:7060-7063
\ No newline at end of property
Modified: branches/lmdb-support/bogofilter/NEWS
===================================================================
--- branches/lmdb-support/bogofilter/NEWS	2018-07-17 23:06:11 UTC (rev 7063)
+++ branches/lmdb-support/bogofilter/NEWS	2018-07-17 23:10:31 UTC (rev 7064)
@@ -15,6 +15,12 @@
 
 -------------------------------------------------------------------------------
 
+    2018-07-17
+    * The Berkeley DB backend driver forgoes DB_NOSYNC in transactional
+      mode, so as to synchronize changes from the logs back into the .db
+      files to keep them up to date and make environments more robust
+      against a loss of log.* files, for instance, when moving databases.
+
     2017-09-18
     * The contrib/spamitarium.pl, originally written by Thomas 'Tom'
       Anderson, was enhanced by Jonathan Kamens and grew a few features.

Modified: branches/lmdb-support/bogofilter/doc/README.db
===================================================================
--- branches/lmdb-support/bogofilter/doc/README.db	2018-07-17 23:06:11 UTC (rev 7063)
+++ branches/lmdb-support/bogofilter/doc/README.db	2018-07-17 23:10:31 UTC (rev 7064)
@@ -51,32 +51,15 @@
   - if using a pre-packaged Berkeley DB version, the packager should have
     applied the patches, check your vendor's update site regularly):
 
-    Sleepycat Software: Berkeley DB 3.1.17: (July 31, 2000)
-    Sleepycat Software: Berkeley DB 3.2.9: (January 24, 2001)
-    Sleepycat Software: Berkeley DB 3.3.11: (July 12, 2001)
-    Sleepycat Software: Berkeley DB 4.0.14: (November 18, 2001)
-    Sleepycat Software: Berkeley DB 4.1.25: (December 19, 2002)
-    Sleepycat Software: Berkeley DB 4.2.52: (December 3, 2003)
-    Sleepycat Software: Berkeley DB 4.3.29: (September 6, 2005)
-    Sleepycat Software: Berkeley DB 4.4.20: (January 10, 2006)
-    Berkeley DB 4.5.20: (September 20, 2006)
-    Berkeley DB 4.6.19: (August 10, 2007)
-    Berkeley DB 4.7.25: (May 15, 2008)
-    Berkeley DB 4.8.24: (August 14, 2009)
-    Berkeley DB 4.8.26: (December 18, 2009)
-    Berkeley DB 5.0.21: (March 30, 2010)
-    Berkeley DB 5.1.19: (August 27, 2010)
-    Berkeley DB 5.2.42: (February 29, 2012)
     Berkeley DB 5.3.21: (May 11, 2012)
 
 Other versions of Berkeley DB between the first and last listed above
 may or may not work but usually they will.
 
-Berkeley DB versions 4.1 and newer are recommended over the previous
-versions, because the newer can detect data corruptions more reliably
-(through the use of checksums that detect partially written data base
-pages); Berkeley DB 4.2 and 4.3 appear a bit faster under load than 4.1
-and older versions, 4.4 and newer have not yet been evaluated for speed.
+Note that versions starting with Berkeley DB 6 changed their license
+to the GNU Affero General Public License, which "requires the operator of a
+network server to provide the source code of the modified version running there
+to the users of that server." (quoting the AGPLv3 preamble).
 
 2.2.1 Upgrading to transactional databases, also from older bogofilter versions
 

Modified: branches/lmdb-support/bogofilter/src/datastore.c
===================================================================
--- branches/lmdb-support/bogofilter/src/datastore.c	2018-07-17 23:06:11 UTC (rev 7063)
+++ branches/lmdb-support/bogofilter/src/datastore.c	2018-07-17 23:10:31 UTC (rev 7064)
@@ -161,7 +161,7 @@
 
     dsh = dsh_init(v);
 
-    if (db_created(v) && ! (open_mode & DS_LOAD)) {
+    if (db_created(v) && ! (open_mode & DS_LOAD) && (open_mode & DS_WRITE)) {
         if (DST_OK != ds_txn_begin(dsh))
             exit(EX_ERROR);
         ds_set_wordlist_version(dsh, NULL);

Modified: branches/lmdb-support/bogofilter/src/datastore_db.c
===================================================================
--- branches/lmdb-support/bogofilter/src/datastore_db.c	2018-07-17 23:06:11 UTC (rev 7063)
+++ branches/lmdb-support/bogofilter/src/datastore_db.c	2018-07-17 23:10:31 UTC (rev 7064)
@@ -899,8 +899,7 @@
     int ret;
     dbh_t *handle = (dbh_t *)vhandle;
     DB *dbp = handle->dbp;
-    /* This is _ONLY_ safe as long as we're logging TXNs */
-    uint32_t flag = (eTransaction == T_ENABLED) ? DB_NOSYNC : 0;
+    uint32_t flag = 0;
 
     assert(handle->magic == MAGIC_DBH);
 

Modified: branches/lmdb-support/bogofilter/src/datastore_lmdb.c
===================================================================
--- branches/lmdb-support/bogofilter/src/datastore_lmdb.c	2018-07-17 23:06:11 UTC (rev 7063)
+++ branches/lmdb-support/bogofilter/src/datastore_lmdb.c	2018-07-17 23:10:31 UTC (rev 7064)
@@ -694,11 +694,6 @@
    if((bflmp = vhandle) == NULL)
       goto jleave;
 
-   /* TODO bogofilter tries to put .WORDLIST_VERSION even into a RDONLY DB.
-    * TODO Therefore silently fake set_dbvalue() success for RDONLY */
-   if(bflmp->bflm_flags & a_BFLM_RDONLY)
-      goto jleave;
-
    if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){
       emsg = "!HAS_TXN!";
       goto jerr;

This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
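Editorial note: the log message for revision 7064 says the removed shortcut "masked an error in the datastore layer". A minimal sketch of how reporting success for a skipped write can hide a caller bug is given here. It uses a hypothetical in-memory store; the `sk_` names and the `SK_ERDONLY` value are illustrative assumptions, not part of bogofilter or LMDB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes, used only in this sketch. */
enum { SK_OK = 0, SK_ERDONLY = -1 };

/* Hypothetical store handle: records the open mode and how many
 * writes were really applied. */
struct sk_store {
    bool rdonly;
    int  writes;
};

/* Behaviour described by the removed TODO comment: pretend the write
 * succeeded when the store is read-only. */
static int sk_put_masking(struct sk_store *s)
{
    assert(s != NULL);
    if (s->rdonly)
        return SK_OK;      /* nothing is stored, yet the caller sees success */
    s->writes++;
    return SK_OK;
}

/* Without the shortcut: the attempt is reported, so a caller that
 * should not have written at all becomes visible. */
static int sk_put_strict(struct sk_store *s)
{
    assert(s != NULL);
    if (s->rdonly)
        return SK_ERDONLY;
    s->writes++;
    return SK_OK;
}
```

With the masking variant, a caller that writes to a read-only store never learns that nothing was stored. The strict variant surfaces the attempt, which is presumably how the datastore.c condition fixed in revision 7063 was found.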
From: <m-...@us...> - 2018-07-17 23:06:14
|
Revision: 7063
          http://sourceforge.net/p/bogofilter/code/7063
Author:   m-a
Date:     2018-07-17 23:06:11 +0000 (Tue, 17 Jul 2018)
Log Message:
-----------
Do not write special tokens to read-only DBs.

When opening databases, bogofilter would attempt to write
.ENCODING/.WORDLIST_VERSION tokens to read-only DBs. This failed with
Steffen Nurpmeso's new LMDB driver, and is generally unexpected behaviour.

Modified Paths:
--------------
    trunk/bogofilter/src/datastore.c

Modified: trunk/bogofilter/src/datastore.c
===================================================================
--- trunk/bogofilter/src/datastore.c	2018-07-17 22:58:55 UTC (rev 7062)
+++ trunk/bogofilter/src/datastore.c	2018-07-17 23:06:11 UTC (rev 7063)
@@ -161,7 +161,7 @@
 
     dsh = dsh_init(v);
 
-    if (db_created(v) && ! (open_mode & DS_LOAD)) {
+    if (db_created(v) && ! (open_mode & DS_LOAD) && (open_mode & DS_WRITE)) {
         if (DST_OK != ds_txn_begin(dsh))
             exit(EX_ERROR);
         ds_set_wordlist_version(dsh, NULL);

This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
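Editorial note: the one-line change in revision 7063 adds a third condition before the special tokens are written. A minimal sketch of that predicate as a standalone helper follows. The `SK_*` mode bits and the function name are stand-ins assumed for illustration; bogofilter's real `DS_*` values are defined in its own headers and may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the open-mode bits (assumed values). */
enum sk_mode {
    SK_READ  = 1u << 0,
    SK_WRITE = 1u << 1,
    SK_LOAD  = 1u << 2
};

/* Hypothetical helper mirroring the patched condition: initialize the
 * special tokens only for a newly created database that is opened for
 * writing and is not being bulk-loaded. */
static bool sk_should_init_tokens(bool created, unsigned open_mode)
{
    /* Only the three mode bits modelled above may be set. */
    assert((open_mode & ~(unsigned)(SK_READ | SK_WRITE | SK_LOAD)) == 0);

    return created
        && !(open_mode & SK_LOAD)
        && (open_mode & SK_WRITE) != 0;
}
```

Before the patch, the equivalent of `sk_should_init_tokens(true, SK_READ)` would have been true, which is the case the log message describes as failing with the LMDB driver.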
From: <m-...@us...> - 2018-07-17 22:58:57
|
Revision: 7062
          http://sourceforge.net/p/bogofilter/code/7062
Author:   m-a
Date:     2018-07-17 22:58:55 +0000 (Tue, 17 Jul 2018)
Log Message:
-----------
Forgo DB_NOSYNC when closing transactional DB, because the original
reason for the flag is unclear, and trim the list of supported
Berkeley DB versions to 5.3.

There is no code to deliberately remove support for other versions,
it's just that 5.3 appears to be the most widespread version these days.
Prior versions can be replaced by 5.3, and 6.x versions have a license
with a stronger copyleft.

If this causes EINVAL to come back in older database versions, we may
want to check that we are not writing to read-only databases - this was
recently found by Steffen Nurpmeso when adding LMDB support.

Modified Paths:
--------------
    trunk/bogofilter/NEWS
    trunk/bogofilter/doc/README.db
    trunk/bogofilter/src/datastore_db.c

Modified: trunk/bogofilter/NEWS
===================================================================
--- trunk/bogofilter/NEWS	2018-07-17 06:57:16 UTC (rev 7061)
+++ trunk/bogofilter/NEWS	2018-07-17 22:58:55 UTC (rev 7062)
@@ -15,6 +15,12 @@
 
 -------------------------------------------------------------------------------
 
+    2018-07-17
+    * The Berkeley DB backend driver forgoes DB_NOSYNC in transactional
+      mode, so as to synchronize changes from the logs back into the .db
+      files to keep them up to date and make environments more robust
+      against a loss of log.* files, for instance, when moving databases.
+
     2017-09-18
     * The contrib/spamitarium.pl, originally written by Thomas 'Tom'
       Anderson, was enhanced by Jonathan Kamens and grew a few features.

Modified: trunk/bogofilter/doc/README.db
===================================================================
--- trunk/bogofilter/doc/README.db	2018-07-17 06:57:16 UTC (rev 7061)
+++ trunk/bogofilter/doc/README.db	2018-07-17 22:58:55 UTC (rev 7062)
@@ -51,32 +51,15 @@
   - if using a pre-packaged Berkeley DB version, the packager should have
     applied the patches, check your vendor's update site regularly):
 
-    Sleepycat Software: Berkeley DB 3.1.17: (July 31, 2000)
-    Sleepycat Software: Berkeley DB 3.2.9: (January 24, 2001)
-    Sleepycat Software: Berkeley DB 3.3.11: (July 12, 2001)
-    Sleepycat Software: Berkeley DB 4.0.14: (November 18, 2001)
-    Sleepycat Software: Berkeley DB 4.1.25: (December 19, 2002)
-    Sleepycat Software: Berkeley DB 4.2.52: (December 3, 2003)
-    Sleepycat Software: Berkeley DB 4.3.29: (September 6, 2005)
-    Sleepycat Software: Berkeley DB 4.4.20: (January 10, 2006)
-    Berkeley DB 4.5.20: (September 20, 2006)
-    Berkeley DB 4.6.19: (August 10, 2007)
-    Berkeley DB 4.7.25: (May 15, 2008)
-    Berkeley DB 4.8.24: (August 14, 2009)
-    Berkeley DB 4.8.26: (December 18, 2009)
-    Berkeley DB 5.0.21: (March 30, 2010)
-    Berkeley DB 5.1.19: (August 27, 2010)
-    Berkeley DB 5.2.42: (February 29, 2012)
     Berkeley DB 5.3.21: (May 11, 2012)
 
 Other versions of Berkeley DB between the first and last listed above
 may or may not work but usually they will.
 
-Berkeley DB versions 4.1 and newer are recommended over the previous
-versions, because the newer can detect data corruptions more reliably
-(through the use of checksums that detect partially written data base
-pages); Berkeley DB 4.2 and 4.3 appear a bit faster under load than 4.1
-and older versions, 4.4 and newer have not yet been evaluated for speed.
+Note that versions starting with Berkeley DB 6 changed their license
+to the GNU Affero General Public License, which "requires the operator of a
+network server to provide the source code of the modified version running there
+to the users of that server." (quoting the AGPLv3 preamble).
 
 2.2.1 Upgrading to transactional databases, also from older bogofilter versions
 

Modified: trunk/bogofilter/src/datastore_db.c
===================================================================
--- trunk/bogofilter/src/datastore_db.c	2018-07-17 06:57:16 UTC (rev 7061)
+++ trunk/bogofilter/src/datastore_db.c	2018-07-17 22:58:55 UTC (rev 7062)
@@ -899,8 +899,7 @@
     int ret;
     dbh_t *handle = (dbh_t *)vhandle;
     DB *dbp = handle->dbp;
-    /* This is _ONLY_ safe as long as we're logging TXNs */
-    uint32_t flag = (eTransaction == T_ENABLED) ? DB_NOSYNC : 0;
+    uint32_t flag = 0;
 
     assert(handle->magic == MAGIC_DBH);
 

This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
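Editorial note: the datastore_db.c hunk in revision 7062 changes which flag is used when the database handle is closed. A minimal sketch of the flag selection before and after the change follows. `SK_DB_NOSYNC` is a placeholder value, not Berkeley DB's real constant, and the `sk_` helpers are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder for the "do not flush on close" flag (assumed value). */
#define SK_DB_NOSYNC UINT32_C(0x1)

/* Before r7062: skip the flush on close whenever transactions are on,
 * relying on the transaction logs alone. */
static uint32_t sk_close_flag_before(bool transactions_enabled)
{
    return transactions_enabled ? SK_DB_NOSYNC : 0;
}

/* After r7062: always request a normal close, regardless of mode. */
static uint32_t sk_close_flag_after(bool transactions_enabled)
{
    (void)transactions_enabled;   /* no longer influences the flag */
    return 0;
}

/* True when a close with this flag would write cached data back to the
 * database file, under this sketch's one-bit model. */
static bool sk_close_syncs(uint32_t flag)
{
    assert((flag & ~SK_DB_NOSYNC) == 0);   /* only one bit is modelled */
    return (flag & SK_DB_NOSYNC) == 0;
}
```

Under this model the transactional case is the only one whose behaviour changes. That matches the NEWS entry, which says the .db files are now kept up to date so that an environment survives the loss of its log.* files.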
From: <m-...@us...> - 2018-07-17 06:57:19
|
Revision: 7061 http://sourceforge.net/p/bogofilter/code/7061 Author: m-a Date: 2018-07-17 06:57:16 +0000 (Tue, 17 Jul 2018) Log Message: ----------- Initial LMDB support, by Steffen Nurpmeso. Not yet complete, make check shows five FAILing tests. Modified Paths: -------------- branches/lmdb-support/bogofilter/AUTHORS branches/lmdb-support/bogofilter/configure.ac branches/lmdb-support/bogofilter/src/Makefile.am branches/lmdb-support/bogofilter/src/tests/t.frame Added Paths: ----------- branches/lmdb-support/bogofilter/src/datastore_lmdb.c Modified: branches/lmdb-support/bogofilter/AUTHORS =================================================================== --- branches/lmdb-support/bogofilter/AUTHORS 2018-07-17 06:43:59 UTC (rev 7060) +++ branches/lmdb-support/bogofilter/AUTHORS 2018-07-17 06:57:16 UTC (rev 7061) @@ -57,3 +57,4 @@ Roman Trunov Julius Plenz Denny Lin (KyotoCabinet support) +Steffen Nurpmeso (LMDB support) Modified: branches/lmdb-support/bogofilter/configure.ac =================================================================== --- branches/lmdb-support/bogofilter/configure.ac 2018-07-17 06:43:59 UTC (rev 7060) +++ branches/lmdb-support/bogofilter/configure.ac 2018-07-17 06:57:16 UTC (rev 7061) @@ -479,7 +479,7 @@ WITH_DB_ENGINE=db AC_ARG_WITH(database, AS_HELP_STRING([--with-database=ENGINE], - [choose database engine {db|qdbm|sqlite3|tokyocabinet|kyotocabinet} [[db]]]), + [choose database engine {db|qdbm|sqlite3|tokyocabinet|kyotocabinet|lmdb} [[db]]]), [ WITH_DB_ENGINE=$withval ] ) @@ -531,6 +531,32 @@ ])],,AC_MSG_ERROR(Cannot link to kyotocabinet library.)) LIBS="$saveLIBS" ;; + xlmdb) + AC_DEFINE(ENABLE_LMDB_DATASTORE,1, [Enable LMDB datastore]) + AC_MSG_WARN([Note that LMDB support is not yet complete.]) + AC_MSG_WARN([You may wish to choose a different database driver for now.]) + sleep 10 + DB_TYPE=lmdb + DB_EXT=.lmdb + AC_LIB_LINKFLAGS([lmdb]) + LIBDB="$LIBLMDB" + saveLIBS="$LIBS" + LIBS="$LIBS $LIBDB" + AC_LINK_IFELSE([AC_LANG_PROGRAM([ 
+#include <lmdb.h> + ], [ + MDB_env *env; + MDB_txn *txn; + MDB_dbi dbi; + mdb_env_create(&env); + mdb_env_set_maxreaders(env, 1); + mdb_env_set_mapsize(env, 4096*42); + mdb_env_open(env, "/tmp", 0, 0660); + mdb_txn_begin(env, (void*)0, 0, &txn); + mdb_dbi_open(txn, (void*)0, 0, &dbi); + ])],,AC_MSG_ERROR(Cannot link to lmdb library.)) + LIBS="$saveLIBS" + ;; xqdbm) AC_DEFINE(ENABLE_QDBM_DATASTORE,1, [Enable qdbm datastore]) DB_TYPE=qdbm @@ -681,7 +707,7 @@ LIBS="$saveLIBS" ;; *) - AC_MSG_ERROR([Invalid --with-database argument. Supported engines are db, qdbm, sqlite3, tokyocabinet, kyotocabinet.]) + AC_MSG_ERROR([Invalid --with-database argument. Supported engines are db, qdbm, sqlite3, tokyocabinet, kyotocabinet, lmdb.]) ;; esac @@ -708,6 +734,7 @@ AM_CONDITIONAL(ENABLE_SQLITE_DATASTORE, test "x$WITH_DB_ENGINE" = "xsqlite3") AM_CONDITIONAL(ENABLE_TOKYOCABINET_DATASTORE, test "x$WITH_DB_ENGINE" = "xtokyocabinet") AM_CONDITIONAL(ENABLE_KYOTOCABINET_DATASTORE, test "x$WITH_DB_ENGINE" = "xkyotocabinet") +AM_CONDITIONAL(ENABLE_LMDB_DATASTORE, test "x$WITH_DB_ENGINE" = "xlmdb") dnl Use TRIO to replace missing snprintf/vsnprintf. 
needtrio=0 Modified: branches/lmdb-support/bogofilter/src/Makefile.am =================================================================== --- branches/lmdb-support/bogofilter/src/Makefile.am 2018-07-17 06:43:59 UTC (rev 7060) +++ branches/lmdb-support/bogofilter/src/Makefile.am 2018-07-17 06:57:16 UTC (rev 7061) @@ -195,6 +195,11 @@ datastore_opthelp_dummies.c \ datastore_dummies.c else +if ENABLE_LMDB_DATASTORE +datastore_SOURCE = datastore_lmdb.c \ + datastore_opthelp_dummies.c \ + datastore_dummies.c +else if ENABLE_TRANSACTIONS datastore_SOURCE = datastore_db.c datastore_db_trans.c else @@ -209,6 +214,7 @@ endif endif endif +endif datastore_OBJECT = $(datastore_SOURCE:.c=.o) Added: branches/lmdb-support/bogofilter/src/datastore_lmdb.c =================================================================== --- branches/lmdb-support/bogofilter/src/datastore_lmdb.c (rev 0) +++ branches/lmdb-support/bogofilter/src/datastore_lmdb.c 2018-07-17 06:57:16 UTC (rev 7061) @@ -0,0 +1,890 @@ +/* $Id$ */ + +/* + * NAME: + * datastore_lmdb.c -- implements the datastore, using LMDB. + * + * AUTHORS: + * Steffen Nurpmeso <st...@sd...> 2018 + * (copied from datastore_kc.c: + * Gyepi Sam <gy...@pr...> 2003 + * Matthias Andree <mat...@gm...> 2003 + * Stefan Bellon <sb...@sb...> 2003-2004 + * Pierre Habouzit <mad...@de...> 2007 + * Denny Lin <den...@hs...> 2015) + */ + +/* + * Remarks. + * + * 1. LMDB places anything inside transactions (txn). + * You open an environment (which may contain multiple DBs), create + * a transaction and open a DB in that transaction. + * 2. LMDB is based on a finite-sized memory map. When a writable transaction + * reaches the size limit, the transaction must be aborted, hen the + * environment must be resized, then a new transaction has to be created. + * Resizing will not shrink, effectively. + * 3. We assume xmalloc() aborts if out of memory. + * 4. We assume no token->leng actually exceeds int32_t. + * + * In order to be able to deal with 2. 
we need to track all changes that are + * performed in a txn, so that in case we are running against the wall we are + * capable to replay all changes after having resized the map. + */ + +/* mdb_env_set_maxreaders() */ +#define a_BFLM_MAXREADERS 15 + +/* Minimum/initial database size, and DB size grow. + * Space it so that a DB load does not run against walls too many times. + * We try _TRIES times to resize for a single new entry before giving up */ +#define a_BFLM_MINSIZE (1u << 21) +#define a_BFLM_GROW (1u << 24) +#define a_BFLM_GROW_TRIES 3 + +/* Size of one chunk of the intermediate txn cache, as above. + * Space it so that a DB load does not require all too many. + * Of course, if a token requires more space, we allocate a larger chunk */ +#define a_BFLM_TXN_CACHE_SIZE (1u << 20) + +/* An entry consists of an uint32_t describing the length of the key. + * If the high bit is set an uint32_t describing the length of the value + * follows. After the data buffers there possibly is alignment pad */ +#define a_BFLM_TXN_CACHE_ALIGN(X) \ + (((X) + (sizeof(uint32_t) - 1)) & ~(sizeof(uint32_t) - 1)) + +#include "common.h" + +#include <errno.h> + +#include <lmdb.h> + +#include "datastore.h" +#include "datastore_db.h" +#include "error.h" +#include "paths.h" +#include "xmalloc.h" +#include "xstrdup.h" + +#if MDB_VERSION_FULL < MDB_VERINT(0, 9, 22) +# error "Required LMDB version: 0.9.22 or later (0.9.11 may do, but untested)" +#endif + +#define UNUSED(x) ((void)(x)) + +enum a_bflm_flags{ + a_BFLM_NONE, + a_BFLM_DEBUG = 1u<<0, + a_BFLM_RDONLY = 1u<<1, + a_BFLM_HAS_TXN = 1u<<2 +}; + +struct a_bflm{ + char *bflm_filepath; /* bfpath.filepath (points to &self[1]) */ + MDB_env *bflm_env; + size_t bflm_mapsize; /* Current notion of DB map size */ + MDB_txn *bflm_txn; + MDB_cursor *bflm_cursor; + MDB_dbi bflm_dbi; + uint32_t bflm_flags; + struct a_bflm_txn_cache *bflm_txn_cache; /* Stack thereof */ +}; + +struct a_bflm_txn_cache{ + struct a_bflm_txn_cache *bflmtc_last; + struct 
a_bflm_txn_cache *bflmtc_next; /* Needs to be build before use! */ + char *bflmtc_caster; /* Current caster */ + char *bflmtc_max; /* Maximum usable byte, exclusive */ + /* Actually points to &self[1] TODO [0] or [8], dep. __STDC_VERSION__! */ + char *bflmtc_data; +}; + +/**/ +static struct a_bflm *a_bflm_init(bfpath *bfp); +static void a_bflm_free(struct a_bflm *bflmp); + +/**/ +static int a_bflm_txn_begin(void *vhandle); +static int a_bflm_txn_abort(void *vhandle); +static int a_bflm_txn_commit(void *vhandle); + +/**/ +static bool a_bflm_txn_mapfull(struct a_bflm *bflmp, bool close_cursor); + +/* Put an entry; it is a deletion if val_or_null is NULL. + * Return NULL on success or an error message otherwise */ +static char const *a_bflm_txn_cache_put(struct a_bflm *bflmp, MDB_val *key, + MDB_val *val_or_null); + +/* Replay all the cache operations in order to redo the transaction. + * Return NULL on success or an error message otherwise */ +static char const *a_bflm_txn_cache_replay(struct a_bflm *bflmp); + +/* Free the recovery stack and possible heap data */ +static void a_bflm_txn_cache_free(struct a_bflm *bflmp); + +static dsm_t /* TODO const*/ a_bflm_dsm = { + /* public -- used in datastore.c */ + &a_bflm_txn_begin, + &a_bflm_txn_abort, + &a_bflm_txn_commit, + /* private -- used in datastore_db_*.c */ + NULL, /* dsm_env_init */ + NULL, /* dsm_cleanup */ + NULL, /* dsm_cleanup_lite */ + NULL, /* dsm_get_env_dbe */ + NULL, /* dsm_database_name */ + NULL, /* dsm_recover_open */ + NULL, /* dsm_auto_commit_flags */ + NULL, /* dsm_get_rmw_flag */ + NULL, /* dsm_lock */ + NULL, /* dsm_common_close */ + NULL, /* dsm_sync */ + NULL, /* dsm_log_flush */ + NULL, /* dsm_pagesize */ + NULL, /* dsm_purgelogs */ + NULL, /* dsm_checkpoint */ + NULL, /* dsm_recover */ + NULL, /* dsm_remove */ + NULL, /* dsm_verify */ + NULL, /* dsm_list_logfiles */ + NULL /* dsm_leafpages */ +}; + +static struct a_bflm * +a_bflm_init(bfpath *bfp){ + /* No variable array for .bflm_filepath, 
use same method as in word.h */ + MDB_envinfo envinfo; + int e; + char const *emsg; + struct a_bflm *rv; + size_t i; + + i = strlen(bfp->filepath) +1; + rv = xmalloc(sizeof(*rv) + i); + memset(rv, 0, sizeof *rv); + memcpy(rv->bflm_filepath = (char*)&rv[1], bfp->filepath, i); + + rv->bflm_flags = ((DEBUG_DATABASE(1) || getenv("BF_DEBUG_DB") != NULL) + ? a_BFLM_DEBUG : a_BFLM_NONE); + e = mdb_env_create(&rv->bflm_env); + if(e != MDB_SUCCESS){ + emsg = "mdb_env_open()"; + goto jerr1; + } + + mdb_env_set_maxreaders(rv->bflm_env, a_BFLM_MAXREADERS); + /* The "problem" is that we need to set_mapsize() before env_open(), + * otherwise the LMDB default will be used as a default (in 0.9.22). + * But since this is cheap at this point just do it.. */ + /* TODO We may not do this because with v0.9.22 a further DB open + * TODO may crash in mdb_*_put() after a growing _mapsize! */ +#if 0 + e = mdb_env_set_mapsize(rv->bflm_env, a_BFLM_MINSIZE); + if(e != MDB_SUCCESS){ + emsg = "mdb_env_set_mapsize()"; + goto jerr2; + } +#endif + + e = mdb_env_open(rv->bflm_env, rv->bflm_filepath, MDB_NOSUBDIR, 0660); + if(e != MDB_SUCCESS){ + emsg = "mdb_env_open()"; + goto jerr2; + } + + /* ..then query the actual environment and use the reported map size: + * Note: LMDB documents to reject requests to shrink the real map size! 
*/ + /* no error defined */mdb_env_info(rv->bflm_env, &envinfo); + rv->bflm_mapsize = envinfo.me_mapsize; + + if(rv->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: init: %p/%s, mapsize: %lu\n", + (long)getpid(), rv, rv->bflm_filepath, + (unsigned long)rv->bflm_mapsize); +jleave: + return rv; + +jerr2: + mdb_env_close(rv->bflm_env); +jerr1: + if(emsg != NULL) + print_error(__FILE__, __LINE__, "LMDB[%ld]: init, %s: %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + xfree(rv); + rv = NULL; + goto jleave; +} + +static void +a_bflm_free(struct a_bflm *bflmp){ + if(bflmp != NULL){ + if(bflmp->bflm_txn_cache != NULL){ + if(DEBUG_DATABASE(1)) + fprintf(dbgout, "LMDB _free(): error: there is txn_cache!\n"); + a_bflm_txn_cache_free(bflmp); + } + + mdb_env_close(bflmp->bflm_env); + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: a_bflm_free(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + xfree(bflmp); + } +} + +static int +a_bflm_txn_begin(void *vhandle){ + char const *emsg; + struct a_bflm *bflmp; + int e; + + e = DST_OK; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: txn_begin(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + if(DEBUG_DATABASE(1) && (bflmp->bflm_flags & a_BFLM_HAS_TXN)){ + fprintf(dbgout, "LMDB txn_begin(): error: HAS_TXN!\n"); + e = DST_FAILURE; + goto jleave; + } + + e = mdb_txn_begin(bflmp->bflm_env, NULL, + (bflmp->bflm_flags & a_BFLM_RDONLY ? 
MDB_RDONLY : 0), + &bflmp->bflm_txn); + if(e != MDB_SUCCESS){ + emsg = "mdb_txn_begin()"; + goto jerr1; + } + + e = mdb_dbi_open(bflmp->bflm_txn, NULL, 0, &bflmp->bflm_dbi); + if(e != MDB_SUCCESS){ + emsg = "mdb_dbi_open()"; + goto jerr2; + } + + e = mdb_cursor_open(bflmp->bflm_txn, bflmp->bflm_dbi, &bflmp->bflm_cursor); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_open()"; + goto jerr2; + } + + bflmp->bflm_flags |= a_BFLM_HAS_TXN; + e = DST_OK; +jleave: + return e; + +jerr2: + mdb_txn_abort(bflmp->bflm_txn); +jerr1: + print_error(__FILE__, __LINE__, "LMDB[%ld]: txn_begin(), %s: %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + e = DST_FAILURE; + goto jleave; +} + +static int +a_bflm_txn_abort(void *vhandle){ + struct a_bflm *bflmp; + int e; + + e = DST_OK; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: txn_abort(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ + fprintf(dbgout, "LMDB txn_abort(): error: !HAS_TXN!\n"); + e = DST_FAILURE; + goto jleave; + } + + mdb_cursor_close(bflmp->bflm_cursor); + mdb_txn_abort(bflmp->bflm_txn); + a_bflm_txn_cache_free(bflmp); + + bflmp->bflm_flags &= ~a_BFLM_HAS_TXN; +jleave: + return e; +} + +static int +a_bflm_txn_commit(void *vhandle){ + struct a_bflm *bflmp; + int e, retries; + + e = DST_OK; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: txn_commit(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ + fprintf(dbgout, "LMDB txn_commit(): error: !HAS_TXN!\n"); + e = DST_FAILURE; + goto jleave; + } + + mdb_cursor_close(bflmp->bflm_cursor); + + retries = 0; +jredo: + e = mdb_txn_commit(bflmp->bflm_txn); + if(e != MDB_SUCCESS){ + if(e == MDB_MAP_FULL && ++retries <= a_BFLM_GROW_TRIES && + a_bflm_txn_mapfull(bflmp, false)){ + 
mdb_cursor_close(bflmp->bflm_cursor); + goto jredo; + } + mdb_txn_abort(bflmp->bflm_txn); + e = MDB_PANIC; + } + + a_bflm_txn_cache_free(bflmp); + + bflmp->bflm_flags &= ~a_BFLM_HAS_TXN; + if(e == MDB_SUCCESS) + e = DST_OK; + else{ + print_error(__FILE__, __LINE__, "LMDB[%ld]: txn_commit(): %d, %s", + (long)getpid(), e, mdb_strerror(e)); + e = DST_FAILURE; + } +jleave: + return e; +} + +static bool +a_bflm_txn_mapfull(struct a_bflm *bflmp, bool close_cursor){ + MDB_envinfo envinfo; + char const *emsg; + int e; + size_t i; + + if(DEBUG_DATABASE(1) && (bflmp->bflm_flags & a_BFLM_RDONLY)) + fprintf(dbgout, "LDMB txn_mapfull() on RDONLY DB!\n"); + + /* Abort transaction */ + if(close_cursor) + mdb_cursor_close(bflmp->bflm_cursor); + mdb_txn_abort(bflmp->bflm_txn); + + /* Resize map */ + i = bflmp->bflm_mapsize; + i += a_BFLM_GROW; + e = mdb_env_set_mapsize(bflmp->bflm_env, i); + if(e != MDB_SUCCESS){ + emsg = "mdb_env_set_mapsize()"; + goto jerr1; + } + /* no error defined */mdb_env_info(bflmp->bflm_env, &envinfo); + bflmp->bflm_mapsize = envinfo.me_mapsize; + + /* Recreate transaction */ + e = mdb_txn_begin(bflmp->bflm_env, NULL, 0, &bflmp->bflm_txn); + if(e != MDB_SUCCESS){ + emsg = "mdb_txn_begin()"; + goto jerr1; + } + + e = mdb_dbi_open(bflmp->bflm_txn, NULL, 0, &bflmp->bflm_dbi); + if(e != MDB_SUCCESS){ + emsg = "mdb_dbi_open()"; + goto jerr2; + } + + e = mdb_cursor_open(bflmp->bflm_txn, bflmp->bflm_dbi, &bflmp->bflm_cursor); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_open()"; + goto jerr2; + } + + if((emsg = a_bflm_txn_cache_replay(bflmp)) != NULL) + goto jerr3; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: txn_mapfull(%p [%s]): " + "recreated, new size %lu\n", + (long)getpid(), bflmp, bflmp->bflm_filepath, bflmp->bflm_mapsize); + e = 0; +jleave: + return (e == 0); +jerr3: + /* Done by TXN abort mdb_cursor_close(bflmp->bflm_cursor); */ +jerr2: + /* Done by TXN abort mdb_txn_abort(bflmp->bflm_txn); */ +jerr1: + print_error(__FILE__, 
__LINE__, "LMDB[%ld]: txn_mapfull(): %s, %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + e = 1; + goto jleave; +} + +static char const * +a_bflm_txn_cache_put(struct a_bflm *bflmp, MDB_val *key, MDB_val *val_or_null){ + uint32_t ui; + char *dp; + struct a_bflm_txn_cache *bflmtcp; + char const *emsg; + size_t kl, vl, i; + + kl = key->mv_size; + if(val_or_null != NULL){ + vl = val_or_null->mv_size; + i = (2 * sizeof(uint32_t)) + kl + vl; + }else{ + vl = 0; + i = sizeof(uint32_t) + kl; + } + i = a_BFLM_TXN_CACHE_ALIGN(i); + + /* XXX We actually should abort() the program instead: cannot be handled */ + if(kl >= 0x7FFFFFFFu || vl >= 0x7FFFFFFFu || + i >= 0x7FFFFFFFu - sizeof(*bflmtcp)){ + emsg = "LMDB: entry too large to be stored"; + goto jleave; + } + + /* Do we need to create a new cache chunk entry? + * We are simple and only look into the top of the stack */ + if((bflmtcp = bflmp->bflm_txn_cache) == NULL) + goto jcache_new; + else if(i >= (size_t)(bflmtcp->bflmtc_max - bflmtcp->bflmtc_caster)){ +jcache_new: + i += sizeof(*bflmtcp); + i = max(i, a_BFLM_TXN_CACHE_SIZE); + dp = (char*)(bflmtcp = xmalloc(i)); + bflmtcp->bflmtc_last = bflmp->bflm_txn_cache; + bflmp->bflm_txn_cache = bflmtcp; + bflmtcp->bflmtc_caster = bflmtcp->bflmtc_data = (char*)&bflmtcp[1]; + i -= 2 * sizeof(uint32_t); + bflmtcp->bflmtc_max = &dp[i]; + } + + /* For actual storing always use memcpy() for simplicity. + * (That is: C standard and undefined behaviour, who knows?) 
*/ + dp = bflmtcp->bflmtc_caster; + ui = (uint32_t)kl; + if(val_or_null != NULL) + ui |= 0x80000000u; + memcpy(dp, &ui, sizeof ui); + dp += sizeof ui; + if(val_or_null != NULL){ + ui = (uint32_t)vl; + memcpy(dp, &ui, sizeof ui); + dp += sizeof ui; + } + memcpy(dp, key->mv_data, kl); + dp += kl; + if(vl != 0){ + memcpy(dp, val_or_null->mv_data, vl); + dp += vl; + } + bflmtcp->bflmtc_caster = (char*)a_BFLM_TXN_CACHE_ALIGN((uintptr_t)dp); + + emsg = NULL; +jleave: + return emsg; +} + +static char const * +a_bflm_txn_cache_replay(struct a_bflm *bflmp){ + /* And replay all the changes we have yet seen */ + MDB_val key, val; + char const *emsg; + int e; + uint32_t kl, vl; + char *dp; + struct a_bflm_txn_cache *head, *bflmtcp; + + /* First of all create a list in the right order */ + for(head = NULL, bflmtcp = bflmp->bflm_txn_cache; bflmtcp != NULL; + bflmtcp = bflmtcp->bflmtc_last){ + bflmtcp->bflmtc_next = head; + head = bflmtcp; + } + + /* Then replay, using it */ + for(; head != NULL; head = head->bflmtc_next){ + for(dp = head->bflmtc_data; dp < head->bflmtc_caster;){ + bool isins; + + /* For actual loading always use memcpy() for simplicity. + * (That is: C standard and undefined behaviour, who knows?) 
*/ + memcpy(&kl, dp, sizeof kl); + dp += sizeof kl; + if((isins = ((kl & 0x80000000u) != 0))){ + kl ^= 0x80000000u; + memcpy(&vl, dp, sizeof vl); + dp += sizeof vl; + } + + key.mv_size = kl; + key.mv_data = dp; + dp += kl; + if(isins){ + val.mv_size = vl; + val.mv_data = dp; + dp += vl; + + e = mdb_cursor_put(bflmp->bflm_cursor, &key, &val, 0); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_put()"; + goto jleave; + } + }else{ + e = mdb_cursor_get(bflmp->bflm_cursor, &key, NULL, + MDB_SET_KEY); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_get() for delete"; + goto jleave; + } + e = mdb_cursor_del(bflmp->bflm_cursor, 0); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_del()"; + goto jleave; + } + } + dp = (char*)a_BFLM_TXN_CACHE_ALIGN((uintptr_t)dp); + } + } + + emsg = NULL; +jleave: + return emsg; +} + +static void +a_bflm_txn_cache_free(struct a_bflm *bflmp){ + struct a_bflm_txn_cache *bflmtcp; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: cache_free(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + while((bflmtcp = bflmp->bflm_txn_cache) != NULL){ + bflmp->bflm_txn_cache = bflmtcp->bflmtc_last; + xfree(bflmtcp); + } +} + +dsm_t /* const TODO */ *dsm = &a_bflm_dsm; + +void * +db_open(void *env, bfpath *bfp, dbmode_t open_mode){ + struct a_bflm *bflmp; + UNUSED(env); + + if((bflmp = a_bflm_init(bfp)) == NULL) + goto jleave; + + if(open_mode == DS_READ) + bflmp->bflm_flags |= a_BFLM_RDONLY; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: db_open(%p [%s; rdonly=%d])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath, + !!(bflmp->bflm_flags & a_BFLM_RDONLY)); +jleave: + return bflmp; +} + +void +db_close(void *vhandle){ + struct a_bflm *bflmp; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: db_close(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + if(DEBUG_DATABASE(1) && (bflmp->bflm_flags & a_BFLM_HAS_TXN)) + fprintf(dbgout, 
"LMDB db_close(): error: HAS_TXN!\n"); + + a_bflm_free(bflmp); +jleave:; +} + +bool +db_is_swapped(void *vhandle){ + UNUSED(vhandle); + return false; +} + +bool +db_created(void *vhandle){ + return (vhandle != NULL); +} + +int +db_get_dbvalue(void *vhandle, const dbv_t *token, dbv_t *value){ + MDB_val key, val; + char const *emsg; + struct a_bflm *bflmp; + int e; + + e = DS_NOTFOUND; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ + emsg = "!HAS_TXN!"; + goto jerr; + } + + if(DEBUG_DATABASE(3)) + fprintf(dbgout, "LMDB db_get_dbvalue(): %lu <%.*s>\n", + (unsigned long)token->leng, (int)token->leng, token->data); + + key.mv_data = token->data; + key.mv_size = token->leng; + e = mdb_cursor_get(bflmp->bflm_cursor, &key, &val, MDB_SET); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_get()"; + goto jerr; + } + + if(val.mv_size > value->leng){ + emsg = "value storage too small"; + e = ENOSPC; + goto jerr; + } + memcpy(value->data, val.mv_data, value->leng = val.mv_size); + + e = 0; +jleave: + return e; +jerr: + if(e != MDB_NOTFOUND){ + print_error(__FILE__, __LINE__, "LMDB[%ld]: db_get_dbvalue(), " + "%s: %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + exit(EX_ERROR); + } + e = DS_NOTFOUND; + goto jleave; +} + +int +db_set_dbvalue(void *vhandle, const dbv_t *token, const dbv_t *value){ + MDB_val key, val; + char const *emsg; + struct a_bflm *bflmp; + int e, retries; + + e = 0; + + if((bflmp = vhandle) == NULL) + goto jleave; + + /* TODO bogofilter tries to put .WORDLIST_VERSION even into a RDONLY DB. 
+ * TODO Therefore silently fake set_dbvalue() success for RDONLY */ + if(bflmp->bflm_flags & a_BFLM_RDONLY) + goto jleave; + + if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ + emsg = "!HAS_TXN!"; + goto jerr; + } + + if(DEBUG_DATABASE(3)) + fprintf(dbgout, "LMDB db_set_dbvalue(): %lu <%.*s>\n", + (unsigned long)token->leng, (int)token->leng, token->data); + + retries = 0; +jredo: + key.mv_data = token->data; + key.mv_size = token->leng; + val.mv_data = value->data; + val.mv_size = value->leng; + e = mdb_cursor_put(bflmp->bflm_cursor, &key, &val, 0); + if(e != MDB_SUCCESS){ + if(e == MDB_MAP_FULL && ++retries <= a_BFLM_GROW_TRIES && + a_bflm_txn_mapfull(bflmp, true)) + goto jredo; + emsg = "mdb_cursor_put()"; + goto jerr; + } + + if((emsg = a_bflm_txn_cache_put(bflmp, &key, &val)) != NULL) + goto jerr; + + e = 0; +jleave: + return e; +jerr: + print_error(__FILE__, __LINE__, "LMDB[%ld]: db_set_dbvalue(), %s: %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + exit(EX_ERROR); +} + +int +db_delete(void *vhandle, const dbv_t *token){ + MDB_val key; + char const *emsg; + struct a_bflm *bflmp; + int e, retries; + + e = 0; + + if((bflmp = vhandle) == NULL) + goto jleave; + + /* TODO bogofilter tries to put .WORDLIST_VERSION even into a RDONLY DB. 
+ * TODO Since we silently fake set_dbvalue() success for RDONLY, do the + * TODO very same for delete(), too */ + if(bflmp->bflm_flags & a_BFLM_RDONLY) + goto jleave; + + if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ + emsg = "!HAS_TXN!"; + e = DS_NOTFOUND; + goto jerr; + } + + if(DEBUG_DATABASE(3)) + fprintf(dbgout, "LMDB db_delete(): %lu <%.*s>\n", + (unsigned long)token->leng, (int)token->leng, token->data); + + retries = 0; +jredo: + key.mv_data = token->data; + key.mv_size = token->leng; + e = mdb_cursor_get(bflmp->bflm_cursor, &key, NULL, MDB_SET_KEY); + if(e != MDB_SUCCESS){ + emsg = "mdb_cursor_get()"; + goto jerr; + } + + e = mdb_cursor_del(bflmp->bflm_cursor, 0); + if(e != MDB_SUCCESS){ + /* Should not happen, though */ + if(e == MDB_MAP_FULL && ++retries <= a_BFLM_GROW_TRIES && + a_bflm_txn_mapfull(bflmp, true)) + goto jredo; + emsg = "mdb_cursor_del()"; + goto jerr; + } + + if((emsg = a_bflm_txn_cache_put(bflmp, &key, NULL)) != NULL) + goto jerr; + + e = 0; +jleave: + return e; +jerr: + print_error(__FILE__, __LINE__, "LMDB[%ld]: db_delete(), %s: %d, %s", + (long)getpid(), emsg, e, mdb_strerror(e)); + if(e != MDB_NOTFOUND) + exit(EX_ERROR); + e = DS_NOTFOUND; + goto jleave; +} + +void +db_flush(void *vhandle){ + struct a_bflm *bflmp; + + if((bflmp = vhandle) != NULL){ + int e; + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: db_flush(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + e = mdb_env_sync(bflmp->bflm_env, true); + if(e != MDB_SUCCESS) + print_error(__FILE__, __LINE__, "LMDB[%ld]: db_flush(): %d, %s", + (long)getpid(), e, mdb_strerror(e)); + } +} + +ex_t +db_foreach(void *vhandle, db_foreach_t hook, void *userdata){ + dbv_t dbv_key, dbv_val; + MDB_val key, val; + char *buf; + MDB_cursor_op cursor_op; + struct a_bflm *bflmp; + ex_t rv; + + rv = EX_OK; + + if((bflmp = vhandle) == NULL) + goto jleave; + + if(DEBUG_DATABASE(1) && !(bflmp->bflm_flags & a_BFLM_HAS_TXN)){ + rv = EX_ERROR; + goto 
jleave; + } + + if(bflmp->bflm_flags & a_BFLM_DEBUG) + fprintf(dbgout, "LMDB[%ld]: db_foreach(%p [%s])\n", + (long)getpid(), bflmp, bflmp->bflm_filepath); + + buf = NULL; + for(cursor_op = MDB_FIRST;; cursor_op = MDB_NEXT){ + size_t i; + int e; + + e = mdb_cursor_get(bflmp->bflm_cursor, &key, &val, cursor_op); + if(e != MDB_SUCCESS){ + if(e != MDB_NOTFOUND){ + print_error(__FILE__, __LINE__, "LMDB[%ld]: db_foreach(): " + "%d, %s", + (long)getpid(), e, mdb_strerror(e)); + rv = EX_ERROR; + } + break; + } + + /* Copy to dbv_key and dbv_val in order to avoid loss upon possible + * action on the DB; should not matter, but NUL terminate them */ + dbv_key.leng = (uint32_t)(i = key.mv_size); + dbv_key.data = buf = xrealloc(buf, i +1 + val.mv_size +1); + memcpy(buf, key.mv_data, i); + buf[i++] = '\0'; + dbv_val.leng = (uint32_t)val.mv_size; + memcpy(dbv_val.data = &buf[i], val.mv_data, val.mv_size); + i += val.mv_size; + buf[i++] = '\0'; + + rv = hook(&dbv_key, &dbv_val, userdata); + + if(rv != EX_OK) + break; + } + if(buf != NULL) + xfree(buf); +jleave: + return rv; +} + +const char * +db_version_str(void){ + return MDB_VERSION_STRING; +} + +char const * +db_str_err(int e){ + return mdb_strerror(e); +} + +/* vim:set et sts=4 sw=4 sts=4 tw=79: */ Modified: branches/lmdb-support/bogofilter/src/tests/t.frame =================================================================== --- branches/lmdb-support/bogofilter/src/tests/t.frame 2018-07-17 06:43:59 UTC (rev 7060) +++ branches/lmdb-support/bogofilter/src/tests/t.frame 2018-07-17 06:57:16 UTC (rev 7061) @@ -53,6 +53,7 @@ *Kyoto*) DB_TXN=true ;; *SQLite*) DB_TXN=true ;; *TrivialDB*) DB_TXN=false ;; + *LMDB*) DB_TXN=true ;; *) echo >&2 "Unknown data base type in bogofilter -V: $DB_NAME" exit 1 ;; esac This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
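The db_foreach() implementation in the commit above copies each key and value out of the LMDB map into one reallocated buffer, NUL-terminating both, before calling the hook. The following is a minimal standalone sketch of that buffer layout only. The names `kv_view` and `pack_pair` are hypothetical and do not appear in bogofilter, and plain `realloc()` stands in for bogofilter's `xrealloc()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for bogofilter's dbv_t; only the fields used here. */
struct kv_view {
    void *data;
    size_t leng;
};

/*
 * Copy key and value into one buffer laid out as: key, NUL, value, NUL.
 * *bufp is grown with realloc() and reused across calls. Returns 0 on
 * success, or -1 if memory is exhausted (*bufp is then left untouched).
 */
static int
pack_pair(char **bufp, const void *key, size_t klen,
          const void *val, size_t vlen,
          struct kv_view *kout, struct kv_view *vout)
{
    char *buf;

    assert(bufp != NULL && kout != NULL && vout != NULL);

    buf = realloc(*bufp, klen + 1 + vlen + 1);
    if (buf == NULL)
        return -1;
    *bufp = buf;

    memcpy(buf, key, klen);
    buf[klen] = '\0';                  /* terminate the key copy */
    memcpy(buf + klen + 1, val, vlen);
    buf[klen + 1 + vlen] = '\0';       /* terminate the value copy */

    kout->data = buf;
    kout->leng = klen;
    vout->data = buf + klen + 1;
    vout->leng = vlen;
    return 0;
}
```

Because the hook only sees the copies, anything it does to the database cannot invalidate the memory it is reading. The caller frees the buffer once after the loop.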
From: <m-...@us...> - 2018-07-17 06:44:02
Revision: 7060
http://sourceforge.net/p/bogofilter/code/7060
Author: m-a
Date: 2018-07-17 06:43:59 +0000 (Tue, 17 Jul 2018)
Log Message:
-----------
Branching for LMDB support.

Added Paths:
-----------
    branches/lmdb-support/
From: <m-...@us...> - 2018-05-28 18:57:56
Revision: 7059
http://sourceforge.net/p/bogofilter/code/7059
Author: m-a
Date: 2018-05-28 18:57:53 +0000 (Mon, 28 May 2018)
Log Message:
-----------
Fix double word.

Modified Paths:
--------------
    trunk/bogofilter/doc/README.db

Modified: trunk/bogofilter/doc/README.db
===================================================================
--- trunk/bogofilter/doc/README.db 2018-04-04 13:12:56 UTC (rev 7058)
+++ trunk/bogofilter/doc/README.db 2018-05-28 18:57:53 UTC (rev 7059)
@@ -120,7 +120,7 @@
 can lose or corrupt up to a few MB of data when the power fails.
 Note: This problem is not specific to bogofilter.
 
-It is possible to sacrifice a bit bit of the the write speed and get
+It is possible to sacrifice a bit of the the write speed and get
 reliability in turn, by switching off the disk's write cache (see
 appendix A for instructions).
From: <m-...@us...> - 2018-04-04 13:12:58
Revision: 7058
http://sourceforge.net/p/bogofilter/code/7058
Author: m-a
Date: 2018-04-04 13:12:56 +0000 (Wed, 04 Apr 2018)
Log Message:
-----------
Silence GCC fallthrough warning.

Modified Paths:
--------------
    trunk/bogofilter/src/bogoutil.c

Modified: trunk/bogofilter/src/bogoutil.c
===================================================================
--- trunk/bogofilter/src/bogoutil.c 2017-09-18 23:43:02 UTC (rev 7057)
+++ trunk/bogofilter/src/bogoutil.c 2018-04-04 13:12:56 UTC (rev 7058)
@@ -715,6 +715,7 @@
 
 	case 'r':
 	    onlyprint = true;
+	    /* fallthrough */
 	case 'R':
 	    flag = M_ROBX;
 	    count += 1;
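The change in revision 7058 relies on GCC accepting a marker comment as a statement that falling from one case label into the next is intended. Here is a small self-contained sketch of the same pattern. The struct `robx_opt` and function `parse_opt` are hypothetical illustrations and are not taken from bogoutil.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result of parsing one option letter. */
struct robx_opt {
    bool onlyprint;
    bool robx;
    int count;
};

static struct robx_opt
parse_opt(int option)
{
    struct robx_opt o = { false, false, 0 };

    switch (option) {
    case 'r':
        o.onlyprint = true;
        /* fallthrough */
    case 'R':
        /* Shared by 'r' and 'R': select the ROBX action. */
        o.robx = true;
        o.count += 1;
        break;
    default:
        break;
    }

    /* Print-only mode must always imply the ROBX action. */
    assert(!o.onlyprint || o.robx);
    return o;
}
```

With GCC 7 or later, `-Wextra` enables `-Wimplicit-fallthrough`, which at its default level treats a comment such as `/* fallthrough */` placed directly before the next case label as an annotation. Without the comment, the 'r' case would draw a warning.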
From: <m-...@us...> - 2017-09-18 23:43:04
Revision: 7057 http://sourceforge.net/p/bogofilter/code/7057 Author: m-a Date: 2017-09-18 23:43:02 +0000 (Mon, 18 Sep 2017) Log Message: ----------- Update contrib/spamitarium.pl to 0.5.2. * Bump version number to 0.5.2. * Refactor the header field output code to be cleaner and warn about empty fields. * Fix two header field references which could cause empty header hashes to be inserted. Modified Paths: -------------- trunk/bogofilter/NEWS trunk/bogofilter/contrib/spamitarium.pl Modified: trunk/bogofilter/NEWS =================================================================== --- trunk/bogofilter/NEWS 2017-09-12 06:28:07 UTC (rev 7056) +++ trunk/bogofilter/NEWS 2017-09-18 23:43:02 UTC (rev 7057) @@ -15,7 +15,7 @@ ------------------------------------------------------------------------------- - 2017-09-12 + 2017-09-18 * The contrib/spamitarium.pl, originally written by Thomas 'Tom' Anderson, was enhanced by Jonathan Kamens and grew a few features. Run perldoc contrib/spamitarium.pl, or spamitarium.pl -h, to read Modified: trunk/bogofilter/contrib/spamitarium.pl =================================================================== --- trunk/bogofilter/contrib/spamitarium.pl 2017-09-12 06:28:07 UTC (rev 7056) +++ trunk/bogofilter/contrib/spamitarium.pl 2017-09-18 23:43:02 UTC (rev 7057) @@ -48,7 +48,7 @@ =cut -my $version = "0.5.1"; +my $version = "0.5.2"; ################################################ ############### Copyleft Notice ################ @@ -442,6 +442,7 @@ ################################################# use Benchmark; +use Data::Dumper; use File::Basename; use Time::Local; use Net::DNS::Resolver; @@ -586,7 +587,7 @@ if ($options =~ /r/) { $start_rcvd = new Benchmark if $options =~ /b/; - $header->{'received'} = process_rcvd($header->{'received'},$return_path || $header->{'return-path'}->[0]->{'value'}); + $header->{'received'} = process_rcvd($header->{'received'},$return_path || ($header->{'return-path'} && $header->{'return-path'}->[0]->{'value'})); 
$end_rcvd = new Benchmark if $options =~ /b/; } @@ -596,7 +597,7 @@ if ($options =~ /t/) { $header->{'x-date-check'}->[0]->{'name'} = "X-Date-Check"; - $header->{'x-date-check'}->[0]->{'value'} = date_check($header->{'date'}->[0]->{'value'},$header->{'received'}->[0]->{'date'}); + $header->{'x-date-check'}->[0]->{'value'} = date_check($header->{'date'}->[0]->{'value'},$header->{'received'} && $header->{'received'}->[0]->{'date'}); } if ($options =~ /p/) @@ -1153,39 +1154,39 @@ return $output; } -sub set_field -{ - my $header = shift; - my $name = shift; - my $output = ""; +sub set_field { + my($header, $name) = @_; + my $output = ""; - if ((defined $header->{$name}) && (ref($header->{$name}) eq "ARRAY")) - { - for (my $x = 0; $x < scalar @{$header->{$name}}; $x++) - { - if (($name eq "received") && ($options =~ /r/)) - { - if (defined $header->{$name}->[$x]->{'sane'} && $header->{$name}->[$x]->{'sane'} =~ /\w/) - { - $output .= $header->{$name}->[$x]->{'name'} . ": " . $header->{$name}->[$x]->{'sane'} . $CRLF; - } - #else { $output .= $header->{$name}->[$x]->{'name'} . ": sanity check failed" . $CRLF; } - } - else { - $output .= $header->{$name}->[$x]->{'name'} . ": " . $header->{$name}->[$x]->{'value'} . $CRLF; - } - } - } - elsif (defined $header->{$name}) - { - $output .= ucfirst($name) . ": " . $header->{$name} . $CRLF; - } - elsif ($req_fields =~ /(?:^|,)$name(?:,|$)/) - { - $output .= ucfirst($name) . ": [no-$name] " . $CRLF; - } - - return $output; + if ($header->{$name}) { + foreach my $header (@{$header->{$name}}) { + if ($name eq "received" and $options =~ /r/) { + if ($header->{'sane'} and $header->{'sane'} =~ /\w/) { + $output .= $header->{'name'} . ": " . $header->{'sane'} . + $CRLF; + } + # else { + # $output .= $header->{'name'} . ": sanity check failed" . + # $CRLF; + # } + } + elsif ($header->{'name'} and defined($header->{'value'})) { + $output .= $header->{'name'} . ": " . $header->{'value'} . 
+ $CRLF; + } + else { + my $dumped = Data::Dumper->new([$header], [qw(header)])-> + Indent(0)->Dump(); + error("warn", "Header for $name, $dumped, is missing name " . + "and/or value?"); + } + } + } + elsif ($req_fields =~ /(?:^|,)$name(?:,|$)/) { + $output .= ucfirst($name) . ": [no-$name] " . $CRLF; + } + + return $output; } ################################################ This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
From: <m-...@us...> - 2017-09-12 06:28:10
Revision: 7056
http://sourceforge.net/p/bogofilter/code/7056
Author: m-a
Date: 2017-09-12 06:28:07 +0000 (Tue, 12 Sep 2017)
Log Message:
-----------
Mention JIK's updates to contrib/spamitarium.pl.

Modified Paths:
--------------
    trunk/bogofilter/NEWS

Modified: trunk/bogofilter/NEWS
===================================================================
--- trunk/bogofilter/NEWS 2017-09-12 06:20:22 UTC (rev 7055)
+++ trunk/bogofilter/NEWS 2017-09-12 06:28:07 UTC (rev 7056)
@@ -15,6 +15,12 @@
 
 -------------------------------------------------------------------------------
 
+	2017-09-12
+	* The contrib/spamitarium.pl, originally written by Thomas 'Tom'
+	  Anderson, was enhanced by Jonathan Kamens and grew a few features.
+	  Run perldoc contrib/spamitarium.pl, or spamitarium.pl -h, to read
+	  its manual.
+
 	2016-01-26
 
 	* Apply patch from Denny Lin, with one fix, to add support for the
From: <m-...@us...> - 2017-09-12 06:20:24
Revision: 7055 http://sourceforge.net/p/bogofilter/code/7055 Author: m-a Date: 2017-09-12 06:20:22 +0000 (Tue, 12 Sep 2017) Log Message: ----------- Another update to spamitarium.pl by Jonathan Kamens. Modified Paths: -------------- trunk/bogofilter/contrib/spamitarium.pl Modified: trunk/bogofilter/contrib/spamitarium.pl =================================================================== --- trunk/bogofilter/contrib/spamitarium.pl 2017-09-11 21:01:35 UTC (rev 7054) +++ trunk/bogofilter/contrib/spamitarium.pl 2017-09-12 06:20:22 UTC (rev 7055) @@ -17,6 +17,14 @@ # the documentation below for the options --return-path, --no-local-received, # --remote-ip, --remote-name, --helo, --local-ip, --local-name, --rcpt, and # --add-local-received. +# * A --timeout option has been added for specifying how long the script should +# wait for input, or 0 to disable the timeout completely. This is primarily +# useful for debugging the script. +# * The header parsing code has been refactored to be cleaner and more robust. +# * In particular, empty header fields are now handled correctly (previously, +# they were appended to the previous header field!). +# * Empty header fields are now included in the output, for more accurate +# bogofilter'ing. # * Typos and such have been cleaned up in the documentation. # * A date-parsing bug which was causing the time zone to be ignored, thus # causing the X-Date-Check header to report an inaccurate delta, has been @@ -40,13 +48,13 @@ =cut -my $version = "0.4.0"; +my $version = "0.5.1"; ################################################ ############### Copyleft Notice ################ ################################################ -# Copyright \xA9 2004 Order amid Chaos, Inc. +# Copyright © 2004 Order amid Chaos, Inc. # Author: Tom Anderson # neo...@or... # @@ -205,6 +213,11 @@ =back +=item B<--timeout> I<seconds> + +How long to wait for input before giving up. Specify 0 to disable the timeout +completely. Primarily for debugging purposes. 
+ =back @@ -531,6 +544,7 @@ "rcpt=s" => \$opt_rcpt, "no-local-received|nolocalreceived" => \$no_local_received, "add-local-received|addlocalreceived" => \$add_local_received, + "timeout=i" => \$timeout, # 0 to disable timeout )) { &usage; exit(1); @@ -549,7 +563,9 @@ { # set an alarm so that we don't hang on an empty STDIN local $SIG{ALRM} = sub { die "timeout" }; - alarm $timeout; + if ($timeout > 0) { + alarm $timeout; + } # parse the header $start_parse = new Benchmark if $options =~ /b/; @@ -589,8 +605,8 @@ { if (defined $header->{'received'}->[$x]->{'spf'} && $header->{'received'}->[$x]->{'spf'} =~ /\w/) { - $header->{'x-spf'}->[$x]->{'name'} = "X-SPF"; - $header->{'x-spf'}->[$x]->{'value'} = $header->{'received'}->[$x]->{'spf'}; + push(@{$header->{'x-spf'}}, + {'name' => "X-SPF", 'value' => $header->{'received'}->[$x]->{'spf'}}); } } } @@ -657,80 +673,66 @@ sub parse_header { - my $header = {}; - my $name = ""; + local($_); + my $header_text = ""; - linein: - while (<STDIN>) - { - alarm 0; - # This is really gross. There is a certain prominent email - # marketing company whose software has a bug in it which causes - # Date: headers to sometimes be terminated with just CR rather than - # CRLF. If we interpret RFC 5321 section 2.3.8 strictly, then we're - # required to treat such a Date: header and the one following it as - # a single header field, but strict adherence to the RFC when that - # results in obviously broken behavior is not the best approach. On - # the other hand, when we're straying from the RFC, we want to do - # so as minimally as possible. Therefore, what we are doing here is - # checking specifically for this exact problem -- Date: headers - # ending with CR rather than CRLF -- and correcting for just that - # one, limited case. - my(@lines); - # This handles input with both CRLF and LF line terminators. 
- if (/^(Date:.*)\r([^\n].*(.)\n)/) { - if ($3 eq "\r") { - @lines = ("$1\r\n", $2); - } - else { - @lines = ("$1\n", $2); - } - } - else { - @lines = ($_); - } + while (<STDIN>) { + last if (/^\r?\n$/); + $header_text .= $_; + } - foreach my $line (@lines) { - chomp($line); + # This is really gross. There is a certain prominent email marketing + # company whose software has a bug in it which causes Date: headers to + # sometimes be terminated with just CR rather than CRLF. If we interpret + # RFC 5321 section 2.3.8 strictly, then we're required to treat such a + # Date: header and the one following it as a single header field, but + # strict adherence to the RFC when that results in obviously broken + # behavior is not the best approach. On the other hand, when we're straying + # from the RFC, we want to do so as minimally as possible. Therefore, what + # we are doing here is checking specifically for this exact problem -- + # Date: headers ending with CR rather than CRLF -- and correcting for just + # that one, limited case. + # This handles input with both CRLF and LF line terminators. 
+ $header_text =~ s/^(Date:.*)\r([^\n].*(.)\n)/$1$3\n$2/m; - # we're done with the header when we've found a blank line - # and the required headers have been found already - last linein if (!defined $line || $line !~ /\S/); - #&& ( - #(defined $header->{'received'} && $header->{'received'}->[0]->{'value'} =~ /\w/) && - #(defined $header->{'subject'} && $header->{'subject'}->[0]->{'value'} =~ /\w/) && - #(defined $header->{'to'} && $header->{'to'}->[0]->{'value'} =~ /\w/) && - #(defined $header->{'from'} && $header->{'from'}->[0]->{'value'} =~ /\w/))); + my(@headers) = split(/\n\b/, $header_text); + + my $header = {}; + my $last_header = undef; - # match header lines - if ($line =~ /^(\S+?):\s*?(\S.*?)$/) - { - my $head = $1; my $value = $2; - $name = $head; - - $name =~ tr/A-Z/a-z/; # header names are case insensitive - chomp($name); - - $value =~ s/\s+?/ /gis; # nix extra spaces & unfold header lines by removing CRLF - $value =~ s/(\S)$/$1 /; - chomp($value); - - # if this header name has already been found, append to the end of the array - my $count = ((defined $header->{$name}) && (ref($header->{$name}) eq "ARRAY"))? scalar @{$header->{$name}} : 0; - - # record this header line - $header->{$name}->[$count]->{'value'} = $value; - $header->{$name}->[$count]->{'name'} = $head; # just for consistency (i.e. pre transforms) - - #print "found $head [$count] = $value$CRLF"; - } - - # if this line doesn't start with "header:", append to last line found (if exists) - elsif ($name && $line =~ /\w/ && $line !~ /^:/) { $line =~ s/\s+?/ /gis; $line =~ s/^\s//; $header->{$name}->[(scalar @{$header->{$name}} - 1)]->{'value'} .= $line if ((defined $header->{$name}) && (ref($header->{$name}) eq "ARRAY")); } + for (@headers) { + s/\s+$//; + s/\s+/ /g; # collapse whitespace + if (s/^(\S+):\s*//) { + my $name = $1; + my $tag = $name; + $tag =~ tr/A-Z/a-z/; # header names are case-insensitive + if (! $_ and $header->{$tag} and + grep(! 
$_->{'value'}, @{$header->{$tag}})) { + # If an empty header field is repeated multiple times, we only + # need to preserve one of them. + next; } - } - - return $header; + push(@{$header->{$tag}}, {'name' => $name, 'value' => $_}); + $last_header = $header->{$tag}->[-1]; + } + else { + # What's the right thing to do here? Either there's no colon or + # there's whitespace before the colon, both of which are RFC + # violations. Our best guess is to append this to the previous + # header. + if (! $last_header) { + error("warn", "Bad initial header line '$_' ignored\n"); + } + else { + error("warn", "Bad header line '$_' appended to preceding '" . + $last_header->{'name'} . "' header field"); + $last_header->{'value'} .= " " . $_; + } + } + } + + return $header; } sub date_check @@ -1169,9 +1171,7 @@ } #else { $output .= $header->{$name}->[$x]->{'name'} . ": sanity check failed" . $CRLF; } } - elsif ($header->{$name}->[$x]->{'value'} and - $header->{$name}->[$x]->{'value'} =~ /\w/) - { + else { $output .= $header->{$name}->[$x]->{'name'} . ": " . $header->{$name}->[$x]->{'value'} . $CRLF; } } This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
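The refactored parse_header() in revision 7055 strips trailing whitespace from each unfolded header, collapses whitespace runs, splits the field name from the value, and lower-cases the name to use as a lookup tag. The sketch below restates those per-line steps in C purely for illustration. It is not a translation of the Perl code, and `split_header` is a hypothetical name. One deliberate simplification: it splits at the first colon, whereas the Perl pattern `^(\S+):` is greedy.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Normalize one unfolded header line in place and split it.
 * On success, writes the lower-cased field name to tag and returns a
 * pointer to the (possibly empty) value inside line. Returns NULL when
 * the line has no colon, has whitespace before the colon, or the name
 * does not fit into tag; a caller might append such a line to the
 * previous field, as the commit does.
 */
static char *
split_header(char *line, char *tag, size_t taglen)
{
    size_t in, out = 0, i, j;

    assert(line != NULL && tag != NULL && taglen > 0);

    /* Collapse every whitespace run to a single space. */
    for (in = 0; line[in] != '\0'; in++) {
        if (isspace((unsigned char)line[in])) {
            if (out == 0 || line[out - 1] != ' ')
                line[out++] = ' ';
        } else {
            line[out++] = line[in];
        }
    }
    /* Drop the single trailing space that may remain. */
    if (out > 0 && line[out - 1] == ' ')
        out--;
    line[out] = '\0';

    /* Field name: non-space characters up to the first colon. */
    for (i = 0; line[i] != '\0' && line[i] != ':' && line[i] != ' '; i++)
        ;
    if (i == 0 || line[i] != ':' || i >= taglen)
        return NULL;

    /* Header names are case-insensitive, so the tag is lower-cased. */
    for (j = 0; j < i; j++)
        tag[j] = (char)tolower((unsigned char)line[j]);
    tag[i] = '\0';

    /* Skip the colon and the optional space after it. */
    i++;
    if (line[i] == ' ')
        i++;
    return line + i;
}
```

An empty field such as `X-Empty:` yields a valid tag and an empty value rather than NULL, which matches the commit's stated goal of preserving empty header fields instead of merging them into the preceding one.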