From: Jakob E. <jab...@gm...> - 2011-03-26 16:07:06
|
Okay this is the patch for the memory overflow issue related to memo fields. There where two issues: 1) an invalid encoding could lead to an infinite loop that would after some time cause an invalid write access 2) the last character of a buffer was set to zero, but actually the character right after the buffer was set. This might not be a problem if MDB_BIND_SIZE is not a power of two, because the allocator might actually allocate more memory than was requested. One issue still remains: Memo fields can contain up to 64K characters, so MDB_BIND_SIZE should be at least 64K*3+1 (an UCS2 character will need up to 3 bytes in UTF-8). However, this will drastically increase memory usage for tables with many columns. In the future, we should rewrite all those parts so that we allocate the memory lazily when reading a memo field, and allocate only as much as we actually need. In some cases, Memo fields could contain more that 64K characters, but let's not worry about that right now. At any rate, here's the patch (I just found out it was already on my github page anyways...) From d8bac7fd263b47b7e9eb146da3ee2ffbcf7f3307 Mon Sep 17 00:00:00 2001 From: jakob <jab...@gm...> Date: Thu, 17 Mar 2011 20:11:06 +0100 Subject: [PATCH] Fixed a buffer overflow with long MEMO fields --- include/mdbtools.h | 2 +- src/libmdb/data.c | 2 +- src/libmdb/iconv.c | 21 ++++++++++++--------- 3 files changed, 14 insertions(+), 11 deletions(-) diff --git a/include/mdbtools.h b/include/mdbtools.h index 8286d77..98da37d 100644 --- a/include/mdbtools.h +++ b/include/mdbtools.h @@ -46,7 +46,7 @@ #define MDB_MAX_IDX_COLS 10 #define MDB_CATALOG_PG 18 #define MDB_MEMO_OVERHEAD 12 -#define MDB_BIND_SIZE 16384 +#define MDB_BIND_SIZE 200000 enum { MDB_PAGE_DB = 0, diff --git a/src/libmdb/data.c b/src/libmdb/data.c index 4109b5a..0e2bb0e 100644 --- a/src/libmdb/data.c +++ b/src/libmdb/data.c @@ -675,7 +675,7 @@ static char *mdb_memo_to_string(MdbHandle *mdb, int start, int size) gint32 row_start, pg_row; size_t len; void *buf, *pg_buf = mdb->pg_buf; - char *text = (char *) g_malloc(MDB_BIND_SIZE); + char *text = (char *) g_malloc(MDB_BIND_SIZE+1); if (size<MDB_MEMO_OVERHEAD) { strcpy(text, ""); diff --git a/src/libmdb/iconv.c b/src/libmdb/iconv.c index d58be97..2f449a3 100644 --- a/src/libmdb/iconv.c +++ b/src/libmdb/iconv.c @@ -65,18 +65,21 @@ mdb_unicode2ascii(MdbHandle *mdb, char *src, size_t slen, char *dest, size_t dle in_ptr = (tmp) ? tmp : src; out_ptr = dest; len_in = (tmp) ? tlen : slen; - len_out = dlen; + len_out = dlen-1; #if HAVE_ICONV //printf("1 len_in %d len_out %d\n",len_in, len_out); while (1) { iconv(mdb->iconv_in, &in_ptr, &len_in, &out_ptr, &len_out); - if ((!len_in) || (errno == E2BIG)) break; - /* Don't bail if impossible conversion is encountered */ - in_ptr += (IS_JET4(mdb)) ? 2 : 1; - len_in -= (IS_JET4(mdb)) ? 2 : 1; - *out_ptr++ = '?'; - len_out--; + /* retry if there still are input bytes and output bytes left */ + if (len_out && len_in && (errno==EILSEQ || errno==EINVAL)) { + in_ptr += (IS_JET4(mdb)) ? 2 : 1; + len_in -= (IS_JET4(mdb)) ? 2 : 1; + *out_ptr++ = '?'; + len_out--; + } else { + break; + } } //printf("2 len_in %d len_out %d\n",len_in, len_out); dlen -= len_out; @@ -84,7 +87,7 @@ mdb_unicode2ascii(MdbHandle *mdb, char *src, size_t slen, char *dest, size_t dle if (IS_JET3(mdb)) { size_t copy_len = len_in; if (copy_len > dlen) - copy_len = dlen; + copy_len = dlen-1; strncpy(out_ptr, in_ptr, copy_len); dlen = copy_len; } else { @@ -97,7 +100,7 @@ mdb_unicode2ascii(MdbHandle *mdb, char *src, size_t slen, char *dest, size_t dle #endif if (tmp) g_free(tmp); - dest[dlen]='\0'; + dest[dlen-1]='\0'; //printf("dest %s\n",dest); return dlen; } -- 1.7.2.3+GitX On 26.03.2011, at 15:01, Alex Mayrhofer wrote: > On 25.03.2011 18:57, Jakob Egger wrote: >> There's a memory overflow bug in the Memo-to-string function (character set conversion) somewhere, I fixed it but I still need to prepare it for github. Maybe I'll have time to sort through my changes on the weekend. Does your database by any chance contain large Memo fields? One workaround could be to compile MDB Tools with a larger bind size (I think the default is 10000, try using 200000). The bind size is declared in mdbtools.h, I think. > > Jakob, > > thanks for that hint. You're right, the database contains some large > memo fields. After increasing the bind size and rebuilding mdb-export > (as well as my PHP script making use of the mdbtools binding) runs > through - great! > > However, i understand that this is a temporary fix? Would be great to > get access to the patch you mentioned... > >> I'd be happy to test your mdb file if you send it to me. > > I'll send it to you in private mail. > > thanks! > > Alex > > ------------------------------------------------------------------------------ > Enable your software for Intel(R) Active Management Technology to meet the > growing manageability and security demands of your customers. Businesses > are taking advantage of Intel(R) vPro (TM) technology - will your software > be a part of the solution? Download the Intel(R) Manageability Checker > today! http://p.sf.net/sfu/intel-dev2devmar > _______________________________________________ > mdbtools-dev mailing list > mdb...@li... > https://lists.sourceforge.net/lists/listinfo/mdbtools-dev |