From: Kyle R. B. <mo...@vo...> - 2003-08-07 00:46:01
Whoops! I forgot to attach the patch files. Please excuse the earlier email.
They should be attached to this one.

Kyle

On Wed, Aug 06, 2003 at 05:40:22PM -0400, Kyle R. Burton wrote:
> Hello again.
>
> In using mdb-export, we've run into a few snags with data that contains
> embedded newlines, carriage returns, tabs and quotes. There are
> options for mdb-export to suppress quoting, and to specify an alternate
> delimiter. Unfortunately these options weren't enough to handle the
> data we were trying to dump from our MDB files.
>
> I modified mdb-export and added a few new options:
>
>   -q <string>     specify a column quoting string (defaults to ")
>   -e <string>     specify an escape string that will be substituted
>                   for a double quote in data (defaults to a pair of
>                   double quotes)
>   -d <delimiter>  specify a column delimiter (default is a comma)
>   -R <eol>        specify a record delimiter (default is a single newline)
>
> I also made some changes to the behavior of mdb-export based on
> these options. The changes preserve the original behavior of
> mdb-export with the default values for the new options.
>
> - the code now looks for quote_string instead of a hard-coded double
>   quote, emitting escape_string in place of quote_string. This
>   means using strstr() instead of single-character comparisons.
> - the header row is now quoted unless -Q is specified. We were seeing
>   column names with all kinds of special characters in them: commas,
>   spaces, etc.
> - escape_string (defaults to ", overridable via a command line switch)
>   is emitted in place of any quote_string values that column data
>   contains. It is not emitted before the quote_string, it is emitted
>   instead of the quote_string, so a double quote can be replaced
>   entirely by another string.
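The quoting and escaping rule described in the list above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the patched C code: the function names are hypothetical, every field is quoted here (mdb-export itself may treat non-text columns differently), and the defaults follow the option list (quote string `"`, escape string `""`).

```python
def format_field(value, quote='"', escape='""'):
    """Wrap value in quote, emitting escape INSTEAD of any embedded quote string."""
    if quote:  # an empty quote string means "no quoting"; nothing to replace
        value = value.replace(quote, escape)
    return f"{quote}{value}{quote}"


def format_row(values, delimiter=",", quote='"', escape='""'):
    """Join the quoted fields of one record with the column delimiter."""
    return delimiter.join(format_field(v, quote, escape) for v in values)


def export_rows(header, rows, delimiter=",", record_sep="\n",
                quote='"', escape='""'):
    """Emit the (quoted) header row, then each data row, each followed by record_sep."""
    lines = [format_row(r, delimiter, quote, escape) for r in [header, *rows]]
    return "".join(line + record_sep for line in lines)
```

With the defaults this reproduces ordinary CSV-style doubling of quotes; with other values the quote string never appears inside a field, which is what makes the output easy to split later.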
>
> For our data processing, we composed a more complex command line:
>
> [mortis@magenta]$ mdb-export -q "'" -e """ -R "
> ***RECORD SEPARATOR***
> " -d " ||delimiter|| " ~/CREDITS_IMPORT.mdb ALL_CREDS |& less
>
> The record separator we specified has embedded newlines in it:
>
>   "\n***RECORD SEPARATOR***\n"
>
> This way, in the Perl code that we're using to wrap the output of mdb-export,
> we can set the input record separator ($/) to that record delimiter. Doing
> that makes delimiting the records very easy. The pre-existing escaping
> features, combined with the ability to specify the quote character and
> the software looking for that character, make parsing the fields very easy
> as well, even in the presence of all the embedded meta characters.
>
>
> I've attached two patches. mdbtools-combined.patch includes the changes
> from the perspective of 0.5rc2 for the whole archive, including the
> patch sent by David Mansfield <mdb...@dm...>.
>
> The second patch, mdb-export.patch2, is just my changes to mdb-export
> (assuming David Mansfield's patch as a baseline).
>
>
> Thanks,
> Kyle
>
> --
>
> ------------------------------------------------------------------------------
> Wisdom and Compassion are inseparable.
>         -- Christmas Humphreys
> mo...@vo...                            http://www.voicenet.com/~mortis
> ------------------------------------------------------------------------------
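For readers wrapping this output in something other than Perl, here is a minimal Python sketch of the parsing side described in the quoted message: split on the record separator, split each record on the column delimiter, then strip the quote string and reverse the escape substitution. The function names are hypothetical, the escape string `<q>` is a made-up placeholder (not the value used on the command line above), and the sketch assumes neither the delimiter nor the record separator ever occurs inside column data.

```python
RECORD_SEP = "\n***RECORD SEPARATOR***\n"  # record delimiter from the message
DELIMITER = " ||delimiter|| "              # column delimiter from the message
QUOTE = "'"                                # quote string from the message
ESCAPE = "<q>"                             # hypothetical escape string


def parse_field(raw, quote=QUOTE, escape=ESCAPE):
    """Strip surrounding quote strings and turn escape strings back into quotes."""
    quoted = (quote and len(raw) >= 2 * len(quote)
              and raw.startswith(quote) and raw.endswith(quote))
    if not quoted:
        return raw  # unquoted field (for example a number): leave untouched
    inner = raw[len(quote):-len(quote)]
    return inner.replace(escape, quote) if escape else inner


def parse_export(text, record_sep=RECORD_SEP, delimiter=DELIMITER,
                 quote=QUOTE, escape=ESCAPE):
    """Return a list of records, each a list of unescaped field strings."""
    records = []
    for chunk in text.split(record_sep):
        if chunk == "":
            continue  # skip empty chunks, e.g. after the final separator
        records.append([parse_field(f, quote, escape)
                        for f in chunk.split(delimiter)])
    return records
```

Because the record separator is a multi-line string, an embedded newline in a field does not end the record, which is the same effect as setting `$/` in Perl. Reversing the escape is only unambiguous if the original data never contains the escape string itself.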