Re: [mdb-dev] Patches + thoughts


On Friday 18 February 2011 19:38:19 Brian Bruns wrote:
> I'd like to keep quote/escape (particularly quote, I can't think of a
> case where escape is more than one char) as a string since it allows a
> multicharacter delimiter, not as uncommon a convention as you might
> think.

Right.
Attached is a revised version of the export patch.

That does not really answer my question: "Should we escape the escape character?"
My patch does.
If you or anyone else doesn't like it, there is a #define DONT_ESCAPE_ESCAPE to turn it off.
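
To make the question concrete, here is a rough sketch of the behaviour being discussed. It is not the code from the attached patch: quote_field() is a hypothetical helper, and only the DONT_ESCAPE_ESCAPE name comes from the patch. Quote and escape are both strings, so multicharacter delimiters work.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT the code from the patch.
 * Wraps `in` in `quote` and prefixes every embedded `quote` with `escape`.
 * Unless DONT_ESCAPE_ESCAPE is defined, embedded `escape` sequences are
 * prefixed with `escape` as well. The caller frees the result. */
static char *quote_field(const char *in, const char *quote, const char *escape)
{
    size_t qlen, elen, ilen;
    char *out, *p;

    assert(in && quote && escape);
    qlen = strlen(quote);
    elen = strlen(escape);
    ilen = strlen(in);

    /* Worst case: every input byte gains an escape prefix, plus two quotes. */
    out = malloc(ilen * (elen + 1) + 2 * qlen + 1);
    if (!out)
        return NULL;
    p = out;

    memcpy(p, quote, qlen); p += qlen;
    while (*in) {
        if (qlen && strncmp(in, quote, qlen) == 0) {
            memcpy(p, escape, elen); p += elen;
            memcpy(p, quote, qlen);  p += qlen;
            in += qlen;
#ifndef DONT_ESCAPE_ESCAPE
        } else if (elen && strncmp(in, escape, elen) == 0) {
            /* Escape the escape sequence itself. */
            memcpy(p, escape, elen); p += elen;
            memcpy(p, escape, elen); p += elen;
            in += elen;
#endif
        } else {
            *p++ = *in++;
        }
    }
    memcpy(p, quote, qlen); p += qlen;
    *p = '\0';
    return out;
}
```

With quote = " and escape = \, the input a"b\c comes out as "a\"b\\c". With DONT_ESCAPE_ESCAPE defined, the backslash before c would be left alone, and a reader could no longer tell a literal backslash from an escape.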

Also attached is an update for the Binary type (it should not be treated as Memo).

> I agree that the escaping needs some work as it is often not as simple
> as adding a simple leading character, but that gets a little
> complicated and we'd have to bake in some escaping styles rather than
> the generic switch we now have.

I agree.

But I bet the INSERT version of mdb-export doesn't support characters < 32 right now, nor non-UTF-8 characters above 127 (OLE).

I guess the best solution would be to move the INSERT code into the schema export code (backend.c).
The inserts would then happen before the foreign keys are set up. We could also set up the sequences (autonum/serial) automatically at that point, and end up with a full mdb->sql conversion tool covering both schema and data.

I don't really have the time for that. Would that be a good idea?

> On unicode2ascii, I have no clue how string compression would work for
> BIG5, but otherwise it basically punts on the overrun question, it's
> up to the calling function to properly allocate the size which is
> generally MDB_BIND_SIZE, which is 4 times the page size, so that
> should accommodate any code set.

I hit some surprises with props, because at first I was careless about the Unicode conversion expanding the size. I mean column name sizes.
Anyway, I was thinking about Memo fields. I thought the limit was 16k, so mdbtools might fail after UTF-8 conversion.
But the Memo field limit is actually 64k chars (192k bytes in UTF-8), so we are in trouble anyway.
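
The arithmetic, as a small sketch. The constants are assumptions for illustration only: a 4096-byte page and a bind buffer of 4 pages, as described in the quoted text. A UCS-2 code unit needs at most 3 bytes in UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only. */
#define ASSUMED_PAGE_SIZE 4096u
#define ASSUMED_BIND_SIZE (4u * ASSUMED_PAGE_SIZE)

/* Upper bound on UTF-8 bytes for a given number of UCS-2 code units. */
static size_t utf8_worst_case(size_t ucs2_units)
{
    assert(ucs2_units <= (size_t)-1 / 3);  /* guard against overflow */
    return ucs2_units * 3;
}

/* Non-zero if the converted text plus a NUL fits the assumed bind buffer. */
static int fits_bind_buffer(size_t ucs2_units)
{
    return utf8_worst_case(ucs2_units) + 1 <= ASSUMED_BIND_SIZE;
}
```

A 64k-char Memo gives utf8_worst_case(65536) = 196608 bytes, far beyond a 16384-byte buffer, so a fixed-size bind cannot be safe for Memo fields.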

Maybe all LVals should be handled the same way, with no automatic binding. We now have 3 versions of LVal reading in data.c...

Here's another hint: The table/column props internally have:
- Name
- Type (like col_type)
- Buffer (pointer + size)
Right now, we are converting everything to a proper string, meaning we lose the type. We might want a kind of MdbVariant (type + buffer) that could be manipulated directly, e.g. for numeric/date operations.

I have the feeling an MdbVariant might be useful for binding too. We could probably wrap an LVal reader in it.
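
A very rough sketch of what I mean, nothing more. Every name here is hypothetical (the enum, the struct layout, the accessor), and I am assuming integers are stored little-endian in the raw buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type codes; real code would reuse the col_type values. */
enum variant_type { VAR_LONGINT, VAR_TEXT };

/* Hypothetical MdbVariant: a type tag plus the raw, unconverted buffer. */
typedef struct {
    int type;          /* one of enum variant_type */
    const void *buf;   /* raw bytes, not owned by the variant */
    size_t size;       /* number of bytes in buf */
} MdbVariant;

/* Decode a 32-bit integer (assumed little-endian) without going through
 * a string. Returns 0 on success, -1 on a type or size mismatch. */
static int variant_get_int32(const MdbVariant *v, int32_t *out)
{
    const unsigned char *b;

    assert(v && out);
    if (v->type != VAR_LONGINT || v->size < 4)
        return -1;
    b = v->buf;
    *out = (int32_t)((uint32_t)b[0] | ((uint32_t)b[1] << 8) |
                     ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24));
    return 0;
}
```

The string conversion would then become just one more accessor on the variant, and callers that want to compare or compute on values could skip it.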

OK, I'll stop daydreaming now... :)