From: Alexander K. <as...@id...> - 2004-08-31 12:16:54
|
>The utility is built on the JDBC C++ binding defined in IscDbc.

How specific to Firebird is this tool, then? Can it be used to dump/load data from any JDBC-compatible database?

>My evolving XML schema looks like this:
>
><?xml version="1.0" encoding="US-ASCII"?>

How do you determine the encoding? I think it should at least be controllable by the user.

>The key questions, I think, are how data is presented.
> My starting point is:
>
> * A table row is represented as a single xml element
> * Each non-null column is presented by an xml attribute

I think that such a scheme is impossible to validate, at least with a DTD. This is very unfortunate, but the only "correct" way IMO is

<row>
  <column name="col1">data 1</column>
  <column name="col2">more data</column>
</row>

Yes, XML _is_ bloated :(

>(I'm sending this as both text and html for readability.
> For those morally opposed to html mail, get a life.)

I receive this list in digest form, so I have to scroll through both the text and the *undecoded* html versions of your e-mails. So while I think I do have a life, it is not made any easier by this ;) |
From: Leyne, S. <Se...@br...> - 2004-08-31 21:20:02
|
Jim,

> My evolving XML schema looks like this:
>
> <?xml version="1.0" encoding="US-ASCII"?>
> <database>
>   <metadata>
>     <table name="MESSAGES">
>       <column name="TRANS_NOTES" type="blob"/>
>       <column name="EXPLANATION" type="blob"/>
>       <column name="ACTION" type="blob"/>
>       <column name="TEXT" type="varchar" precision="118"/>
>       <column name="CODE" type="int"/>
>       <column name="FLAGS" type="smallint"/>
>       <column name="NUMBER" type="smallint"/>
>       <column name="FAC_CODE" type="smallint"/>
>       <column name="SYMBOL" type="varchar" precision="32"/>
>       <column name="ROUTINE" type="varchar" precision="32"/>
>       <column name="MODULE" type="varchar" precision="32"/>
>     </table>
>   </metadata>
>   <data>
>     <rows table="MESSAGES">
>       <row TEXT="Do you want to roll back your updates?"
>            CODE="10351" NUMBER="351" FAC_CODE="1" SYMBOL=""
>            ROUTINE="process_statement"/>
>       <row TEXT="gen_descriptor: dtype not recognized"
>            CODE="10352" NUMBER="352" FAC_CODE="1" ROUTINE="gen_descriptor"/>
>       <row TEXT="MOVQ_move: conversion not done" CODE="10047"
>            NUMBER="47" FAC_CODE="1"/>
>       <row TEXT="BLOB conversion is not supported" CODE="10048"
>            NUMBER="48" FAC_CODE="1" SYMBOL="" ROUTINE=""/>
>       <row TEXT="expected type" CODE="10000" NUMBER="0"
>            FAC_CODE="1"/>
>     </rows>
>   </data>
> </database>
>
> The key questions, I think, are how data is presented. My starting point
> is:
>
> * A table row is represented as a single xml element
> * Each non-null column is presented by an xml attribute

The representation proposed is exactly how I define XML schemas -- not sure you'll think this is a good thing ;-)

You will have a problem with large text/blob columns. There is a size restriction for attribute items.

I don't recall the size limit, off the top of my head. I am investigating and will reply back.

> * Null columns are not represented
> * No special casing of blob values are supported
> * Column attributes are based on JDBC definitions |
From: Jim S. <ja...@ne...> - 2004-08-31 22:16:11
|
Leyne, Sean wrote:

>The representation proposed is exactly how I define XML schemas -- not
>sure you'll think this is a good thing ;-)

Gosh, Sean, I'll just have to deal with it, I guess.

>You will have a problem with large text/blob columns. There is a size
>restriction for attribute items.
>I don't recall the size limit, off the top of my head. I am
>investigating and will reply back.

I suppose we could allow a <column> without a value attribute to mean that the value is enclosed between tags. That would give us the following set of alternatives:

* <row FIRSTNAME="Jim"/>
* <row> <column name="FIRSTNAME" value="Jim"/></row>
* <row> <column name="FIRSTNAME">Jim</column></row>

This doesn't actually bother me. The code to handle the three cases on the load side is approximately three or four lines.

I'd like to hear some more opinions.

--
Jim Starkey
Netfrastructure, Inc.
978 526-1376 |
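The three row forms listed above can be handled by one small loader routine. The following is only a sketch in Python using the standard library parser, not the utility's actual C++ code; the function name `row_values` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def row_values(row):
    """Return {column_name: value} for one <row> element (hypothetical helper)."""
    # Form 1: each column is an attribute of <row>.
    values = dict(row.attrib)
    for col in row.findall("column"):
        if "value" in col.attrib:
            # Form 2: <column name="..." value="..."/>
            values[col.get("name")] = col.get("value")
        else:
            # Form 3: <column name="...">text</column>
            values[col.get("name")] = col.text or ""
    return values


samples = [
    '<row FIRSTNAME="Jim"/>',
    '<row> <column name="FIRSTNAME" value="Jim"/></row>',
    '<row> <column name="FIRSTNAME">Jim</column></row>',
]
for sample in samples:
    print(row_values(ET.fromstring(sample)))
```

All three samples yield the same mapping, which is why supporting them side by side costs the loader very little. A column that appears in none of the forms is simply absent from the result, matching the "null columns are not represented" rule.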
From: Olivier M. <om...@ti...> - 2004-09-01 00:32:59
|
Leyne, Sean wrote: > I don't recall the size limit, off the top of my head. I am > investigating and will reply back. I don't recall having ever read about a size limit in the naming of elements or in the values of attributes. Of course, there is always a practical / system limit. But I don't remember having seen a drastic limit. -- Olivier Mascia |
From: Todd F. <ta...@le...> - 2004-09-06 06:26:25
|
I would make the following modification to your proposed representation:

<?xml version="1.0" encoding="UTF-8"?>
<database>
  <metadata>
    <table name="table1">
      <col name="COL1" type="type1"/>
      <col name="COL2" type="type2"/>
    </table>
    <table name="table2">
      <col name="COL1" type="type1"/>
      <col name="COL2" type="type2"/>
    </table>
  </metadata>
  <data table="table1">
    <row><COL1></COL1><COL2><![CDATA[binary data or otherwise necessary to escape data]]></COL2></row>
    <row><COL1></COL1></row>
    <row><COL1></COL1><COL2><![CDATA[]]></COL2></row>
    <row><COL1></COL1></row>
  </data>
</database>

* Nulls are easily handled by a missing column.
* Blobs can be dealt with using CDATA. You might want to use this for text too, although text might do just as well with escaping.

The trade-off I see with this format over what you had proposed earlier is that this format solves the above two issues, while perhaps adding an additional cost in size by a factor of 2 times the length of each column name. This is balanced, in my opinion, by two factors:

1. XML can be compressed; at least one parser, libxml in particular, can read gzipped files. That means increased disk space is really a moot point.
2. It is typically faster to process XML with fewer attributes. Attribute parsing tends to be slower than simple tag processing.

I would also use UTF-8 encoding over ASCII; doing so should leave open the possibility of your xml being used in many more locales.

-todd

p.s. Perhaps I missed it somewhere, but what is the reason for representing the tables/data in the database in an xml format?
Just curious ;-)

>Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel |
From: Jim S. <ja...@ne...> - 2004-09-06 15:55:27
|
Todd Fisher wrote:

> I would make the following modification to your proposed representation:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <database>
>   <metadata>
>     <table name="table1">
>       <col name="COL1" type="type1"/>
>       <col name="COL2" type="type2"/>
>   <data table="table1">
>     <row><COL1></COL1><COL2><![CDATA[binary data or otherwise
> necessary to escape data]]></COL2></row>
>     <row><COL1></COL1></row>
>     <row><COL1></COL1><COL2><![CDATA[]]></COL2></row>
>     <row><COL1></COL1></row>
>   </data>
> </database>
>
> * Nulls are easily handled by a missing column.
> * Blobs can be dealt with using CDATA. You might want to use this for
> text too, although text might do just as well with escaping.

I don't see the merit of replacing the tag "column" with a column name. It doesn't follow general xml schema rules and it leads to a big problem with column names that are not valid xml tag names.

As for CDATA, it may be useful for a cleaner representation of text with lots of characters that need escapes, but it really doesn't work at all for binary data. All sorts of things tend to break when things like nulls and backspaces are included in more or less ascii files. I'm currently testing for the presence of a byte value below 10 (decimal) and dropping into base64. An example follows:

<?xml version="1.0" encoding="US-ASCII"?>
<database>
  <metadata>
    <table name="RDB$TRIGGERS">
      <column name="RDB$TRIGGER_NAME" type="char" precision="32"/>
      <column name="RDB$RELATION_NAME" type="char" precision="32"/>
      <column name="RDB$TRIGGER_SEQUENCE" type="smallint"/>
      <column name="RDB$TRIGGER_TYPE" type="smallint"/>
      <column name="RDB$TRIGGER_SOURCE" type="clob"/>
      <column name="RDB$TRIGGER_BLR" type="blob"/>
      <column name="RDB$DESCRIPTION" type="clob"/>
      <column name="RDB$TRIGGER_INACTIVE" type="smallint"/>
      <column name="RDB$SYSTEM_FLAG" type="smallint"/>
      <column name="RDB$FLAGS" type="smallint"/>
      <index name="RDB$INDEX_38">
        <column name="RDB$RELATION_NAME"/>
      </index>
      <index name="RDB$INDEX_8" type="unique">
        <column name="RDB$TRIGGER_NAME"/>
      </index>
    </table>
  </metadata>
  <data>
    <rows table="RDB$TRIGGERS">
      <row>
        <column name="RDB$TRIGGER_NAME">RDB$TRIGGER_1</column>
        <column name="RDB$RELATION_NAME">RDB$USER_PRIVILEGES</column>
        <column name="RDB$TRIGGER_SEQUENCE">0</column>
        <column name="RDB$TRIGGER_TYPE">3</column>
        <column name="RDB$TRIGGER_BLR" encoding="base64">/F6.H.**</column>
        <column name="RDB$SYSTEM_FLAG">1</column>
      </row>
      <row>
        <column name="RDB$TRIGGER_NAME">RDB$TRIGGER_8</column>
        <column name="RDB$RELATION_NAME">RDB$USER_PRIVILEGES</column>
        <column name="RDB$TRIGGER_SEQUENCE">0</column>
        <column name="RDB$TRIGGER_TYPE">5</column>
        <column name="RDB$TRIGGER_BLR" encoding="base64">/EUvDFQ.1Z72EWF4GIJAF3xCEIp3/oA/GVBGF26YIYJAEJF7HotTFYZ3H2FH
.oQu9lQ12J72EWFGFIl/J2ZDHZxCEIp33k.FIYF07373H23IGIxCLot/HIIj
3kACIYF072N7FIl2Lot/HIIL..tGF26YFYZ3H2FTHY3BFTw001QL.l7GF26Y
IoJ1JJ77J3ZTEol/IpAJ1UY.Ip3A72RGEItI.UR1.IcIIYF073B3EpJGGJFN
LoBAEJBHFJA2FmwL//7GF26YIoJ1JJ77J3ZTEol/IpAL.l7GF26YIoJ1JJ77
J3ZTEol/IpDz.UI2zkc1/E6/9FQ32Z72EWFHFIBJIYZIKJx1H23HIzzzzzzz
H.**</column>
        <column name="RDB$SYSTEM_FLAG">1</column>
      </row> |
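The "byte value below 10 means base64" rule described in the message above can be sketched as follows. This is an illustration in Python, not the utility's code: it assumes standard base64 (the sample dump appears to use a different alphabet and padding), it assumes non-binary values are plain ASCII, and `column_element` is a hypothetical helper name.

```python
import base64
import xml.etree.ElementTree as ET


def column_element(name, data):
    """Build a <column> element, switching to base64 for binary-looking data."""
    col = ET.Element("column", name=name)
    if any(byte < 10 for byte in data):
        # A byte below 10 (decimal) is taken as a sign of binary content.
        col.set("encoding", "base64")
        col.text = base64.b64encode(data).decode("ascii")
    else:
        # Assumption: anything else is ASCII text; the serializer escapes & and <.
        col.text = data.decode("ascii")
    return col


plain = column_element("RDB$TRIGGER_NAME", b"RDB$TRIGGER_1")
binary = column_element("RDB$TRIGGER_BLR", b"\x05\x00L")
print(ET.tostring(plain, encoding="unicode"))
print(ET.tostring(binary, encoding="unicode"))
```

On the load side, the `encoding` attribute tells the reader whether to decode the element text before storing it.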
From: Arno B. <fir...@ab...> - 2004-09-06 16:31:28
|
Hi Jim,

> As for CDATA, it may be useful for a cleaner representation of text with
> lots of characters that need escapes, but it really doesn't work at all
> for binary data. All sorts of things tend to break when things like
> nulls and backspaces are included in more or less ascii files. I'm
> currently testing for the presence of a byte value below 10 (decimal)
> and dropping into base64. An example follows

I agree that CDATA seems not to be very useful (due to the ]]>). Besides your example, there's another way in xml to represent characters that aren't accepted in their normal form:

-------------------------------------------
4.1 Character and Entity References

[Definition: A character reference refers to a specific character in the ISO/IEC 10646 character set, for example one not directly accessible from available input devices.]

Character Reference
[66] CharRef ::= '&#' [0-9]+ ';'
               | '&#x' [0-9a-fA-F]+ ';'    [WFC: Legal Character]

Well-formedness constraint: Legal Character
-------------------------------------------

For example (this could be the default encoding then):

<column name="RDB$TRIGGER_BLR">�L</column>

At least with the encoding attribute you're flexible :)

represents: blr_version5, blr_leave, 0, blr_eoc

Regards,
Arno Brinkman
ABVisie

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Firebird open source database (based on IB-OE) with many SQL-99 features:
http://www.firebirdsql.org
http://www.firebirdsql.info
http://www.fingerbird.de/
http://www.comunidade-firebird.org/

Support list for Interbase and Firebird users: fir...@ya...
Nederlandse firebird nieuwsgroep: news://80.126.130.81 |
From: Jim S. <ja...@ne...> - 2004-09-07 15:17:36
|
Arno Brinkman wrote:

>I agree that CDATA seems not to be very useful (due to the ]]>). Besides your
>example, there's another way in xml to represent characters that aren't
>accepted in their normal form:
>
>-------------------------------------------
>4.1 Character and Entity References
>
>[Definition: A character reference refers to a specific character in the
>ISO/IEC 10646 character set, for example one not directly accessible from
>available input devices.]
>
>Character Reference
> [66] CharRef ::= '&#' [0-9]+ ';'
>                | '&#x' [0-9a-fA-F]+ ';'    [WFC: Legal Character]

Yes, this could be done, but keep in mind that an escaped character is 4 to 6 bytes where a base64 represented byte is about 1.33 bytes. For large blobs, that's huge.

The larger question, however, is which is the more appropriate representation. There are roughly three cases: mostly characters with some binary, a general mix of character and binary data, and mostly binary. Escaping individual characters would work for the first case, but I think this is extremely rare. BLR falls into the second category, but being able to see the ASCII inside of BLR is probably more dangerous than interesting. For heavily binary stuff like jpegs, gifs, and object files, escaping binary bytes loses heavily.

--
Jim Starkey
Netfrastructure, Inc.
978 526-1376 |
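The size argument above can be checked with a few lines of arithmetic. This sketch in Python compares only encoded length; it assumes standard base64 (3 input bytes become 4 output characters) and the worst case where every byte is written as a decimal character reference. The helper names are invented for the example.

```python
import base64


def charref_size(data):
    """Length of the data if every byte were written as &#N; (decimal)."""
    return sum(len(f"&#{byte};") for byte in data)


def base64_size(data):
    """Length of the data in standard base64, including padding."""
    return len(base64.b64encode(data))


blob = bytes(range(256)) * 12  # 3072 bytes covering every byte value evenly
print(len(blob), charref_size(blob), base64_size(blob))
```

For this evenly distributed blob the character references take more than five bytes per input byte on average, while base64 stays at four thirds, so the gap grows quickly for anything that is mostly binary.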
From: Lester C. <le...@ls...> - 2004-09-06 19:21:29
|
Jim Starkey wrote: > <?xml version="1.0" encoding="US-ASCII"?> > <database> > <metadata> > <table name="RDB$TRIGGERS"> > <column name="RDB$TRIGGER_NAME" type="char" precision="32"/> > <column name="RDB$RELATION_NAME" type="char" precision="32"/> > <column name="RDB$TRIGGER_SEQUENCE" type="smallint"/> > <column name="RDB$TRIGGER_TYPE" type="smallint"/> > <column name="RDB$TRIGGER_SOURCE" type="clob"/> > <column name="RDB$TRIGGER_BLR" type="blob"/> > <column name="RDB$DESCRIPTION" type="clob"/> > <column name="RDB$TRIGGER_INACTIVE" type="smallint"/> > <column name="RDB$SYSTEM_FLAG" type="smallint"/> > <column name="RDB$FLAGS" type="smallint"/> > <index name="RDB$INDEX_38"> > <column name="RDB$RELATION_NAME"/> > </index> > <index name="RDB$INDEX_8" type="unique"> > <column name="RDB$TRIGGER_NAME"/> > </index> > </table> > </metadata> > <data> > <rows table="RDB$TRIGGERS"> > <row> > <column name="RDB$TRIGGER_NAME">RDB$TRIGGER_1</column> > <column name="RDB$RELATION_NAME">RDB$USER_PRIVILEGES</column> > <column name="RDB$TRIGGER_SEQUENCE">0</column> > <column name="RDB$TRIGGER_TYPE">3</column> > <column name="RDB$TRIGGER_BLR" > encoding="base64">/F6.H.**</column> > <column name="RDB$SYSTEM_FLAG">1</column> > </row> > <row> > <column name="RDB$TRIGGER_NAME">RDB$TRIGGER_8</column> > <column name="RDB$RELATION_NAME">RDB$USER_PRIVILEGES</column> > <column name="RDB$TRIGGER_SEQUENCE">0</column> > <column name="RDB$TRIGGER_TYPE">5</column> > <column name="RDB$TRIGGER_BLR" > encoding="base64">/EUvDFQ.1Z72EWF4GIJAF3xCEIp3/oA/GVBGF26YIYJAEJF7HotTFYZ3H2FH > > .oQu9lQ12J72EWFGFIl/J2ZDHZxCEIp33k.FIYF07373H23IGIxCLot/HIIj > 3kACIYF072N7FIl2Lot/HIIL..tGF26YFYZ3H2FTHY3BFTw001QL.l7GF26Y > IoJ1JJ77J3ZTEol/IpAJ1UY.Ip3A72RGEItI.UR1.IcIIYF073B3EpJGGJFN > LoBAEJBHFJA2FmwL//7GF26YIoJ1JJ77J3ZTEol/IpAL.l7GF26YIoJ1JJ77 > J3ZTEol/IpDz.UI2zkc1/E6/9FQ32Z72EWFHFIBJIYZIKJx1H23HIzzzzzzz > H.**</column> > <column name="RDB$SYSTEM_FLAG">1</column> > </row> Looking nice - but what about default="xxx" and 
how about 'Not Null'? Also Primary Key? -- Lester Caine ----------------------------- L.S.Caine Electronic Services |
From: Jim S. <ja...@ne...> - 2004-09-07 15:53:26
|
Lester Caine wrote:

> Looking nice - but what about
> default="xxx"
> and how about 'Not Null'?
>
> Also Primary Key?

Primary key is handled but the samples don't have any, so you don't see it. Not null is now there.

Default value and domain (global field) definitions are more problematic. First, they're not supported by JDBC on which IscDbc is modelled. Second, I don't really know how to differentiate between bona fide and artificial domains. I would like to support the former and not clutter up the world with the latter.

<metadata>
  <table name="EMPLOYEE">
    <column name="SALARY" type="double" scale="-2" nullable="yes"/>
    <column name="HIRE_DATE" type="timestamp" nullable="yes"/>
    <column name="JOB_GRADE" type="smallint" nullable="yes"/>
    <column name="PHONE_EXT" type="varchar" precision="4"/>
    <column name="LAST_NAME" type="varchar" precision="20" nullable="yes"/>
    <column name="EMP_NO" type="smallint" nullable="yes"/>
    <column name="FULL_NAME" type="varchar" precision="37"/>
    <column name="JOB_COUNTRY" type="varchar" precision="15" nullable="yes"/>
    <column name="JOB_CODE" type="varchar" precision="5" nullable="yes"/>
    <column name="DEPT_NO" type="char" precision="4" nullable="yes"/>
    <column name="FIRST_NAME" type="varchar" precision="15" nullable="yes"/>
    <primary_key>
      <column name="EMP_NO"/>
    </primary_key>
    <index name="NAMEX">
      <column name="LAST_NAME"/>
      <column name="FIRST_NAME"/>
    </index>
    <index name="RDB$FOREIGN8">
      <column name="DEPT_NO"/>
    </index>
    <index name="RDB$FOREIGN9">
      <column name="JOB_CODE"/>
      <column name="JOB_GRADE"/>
      <column name="JOB_COUNTRY"/>
    </index>
    <index name="RDB$PRIMARY7" type="unique">
      <column name="EMP_NO"/>
    </index>
  </table>

Note that the primary key is referenced twice, once as a primary key and once as a unique index. Gotta figure out how to handle that. Also have to add foreign key support.

--
Jim Starkey
Netfrastructure, Inc.
978 526-1376 |
From: Leyne, S. <Se...@br...> - 2004-08-31 22:42:30
|
Jim,

> > Oh btw, '&' has to be escaped, both in an attribute value and in the
> > element text. '&amp;' is common for such escape.
>
> And quotes, too, sir.

And that's only the start!

There are a large number of characters that need to be 'escaped' -- I won't list them here.

Sean |
From: Todd F. <ta...@le...> - 2004-09-01 00:24:33
|
Leyne, Sean wrote:

>Jim,
>
>>>Oh btw, '&' has to be escaped, both in an attribute value and in the
>>>element text. '&amp;' is common for such escape.
>>
>>And quotes, too, sir.
>
>And that's only the start!
>
>There are a large number of characters that need to be 'escaped' -- I
>won't list them here.

Why not use a CDATA block where you otherwise need to escape?

-todd |
From: Ann W. H. <aha...@ib...> - 2004-09-01 16:20:31
|
At 08:24 PM 8/31/2004, Todd Fisher wrote: >Why not use a CDATA block where you otherwise need to escape? The goal of the exercise is to have a data representation that can be manipulated with a text editor. CDATA doesn't fit that requirement. Regards, Ann |
From: Thomas M. <tm...@bs...> - 2004-09-01 17:01:28
|
In addition, I believe this would be an add-on (an option) to the current system, so you could still do it the same way you do it today. Please correct me if I am wrong, Ann. So if this doesn't interest you, then ignore the feature.

I like the idea a lot.

Ann W. Harrison wrote:

> At 08:24 PM 8/31/2004, Todd Fisher wrote:
>
>> Why not use a CDATA block where you otherwise need to escape?
>
> The goal of the exercise is to have a data representation that can
> be manipulated with a text editor. CDATA doesn't fit that requirement.
>
> Regards,
>
> Ann

--
Thomas Miller
Delphi Client/Server Certified Developer
BSS Accounting & Distribution Software
BSS Enterprise Accounting FrameWork
http://www.bss-software.com
http://sourceforge.net/projects/dbexpressplus |
From: Olivier M. <om...@ti...> - 2004-09-01 00:25:43
|
Leyne, Sean wrote:

>>>Oh btw, '&' has to be escaped, both in an attribute value and in the
>>>element text. '&amp;' is common for such escape.
>>
>>And quotes, too, sir.
>
> And that's only the start!
>
> There are a large number of characters that need to be 'escaped' -- I
> won't list them here.

Absolutely not, Sean. ;-)

All text that is not markup constitutes the character data of the document. And:

CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)

See this paragraph, it sums it all up quite clearly:

http://www.w3.org/TR/2004/REC-xml-20040204/#syntax

--
Olivier Mascia |
From: Leyne, S. <Se...@br...> - 2004-09-01 01:49:48
|
Olivier,

> > There are a large number of characters that need to be 'escaped' -- I
> > won't list them here.
>
> Absolutely not Sean. ;-)
>
> All text that is not markup constitutes the character data of the
> document. And:
>
> CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
>
> See this paragraph, it resumes it all quite clearly:
>
> http://www.w3.org/TR/2004/REC-xml-20040204/#syntax

Oh yes, _quite clear_ (as clear as mud!)... my apologies!

The list of characters that cannot be found in an attribute value (I know, I've tried) includes:

Quote (")
Apos (')
Lt (<)
Gt (>)
Amp (&)

Not to mention unprintable characters which are valid for CHAR fields but invalid for XML.

Sean |
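For the printable characters in that list, escaping on the dump side is a solved problem in most XML libraries. As a sketch, Python's standard `xml.sax.saxutils` module provides `escape` for element text and `quoteattr` for attribute values; the sample value below is made up. Strictly, the XML grammar only forbids `&`, `<`, and the delimiting quote character inside an attribute value, but escaping more than that is harmless.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

value = 'He said "x < y" & left\nfor the day'

# Element text: &, < and > are replaced by entity references.
element = "<column>" + escape(value) + "</column>"

# Attribute value: quoteattr picks a safe quote character and also
# writes newline, carriage return and tab as character references.
attribute = "<row TEXT=" + quoteattr(value) + "/>"

print(element)
print(attribute)

# Both forms survive a round trip through a parser unchanged.
assert ET.fromstring(element).text == value
assert ET.fromstring(attribute).get("TEXT") == value
```

Unprintable control characters are a separate matter: escaping does not make them legal, so they need a different encoding altogether.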
From: Leyne, S. <Se...@br...> - 2004-09-01 01:54:13
|
Olivier,

> > I don't recall the size limit, off the top of my head. I am
> > investigating and will reply back.
>
> I don't recall having ever read about a size limit in the naming of
> elements or in the values of attributes. Of course, there is always a
> practical / system limit. But I don't remember having seen a drastic
> limit.

My apologies, I was mistaken.

The problem I ran into wasn't with the size but rather that an attribute value CANNOT contain CRLF characters.

This was a problem for our routine since we were trying to keep the XML to as simple a definition as possible (and we wanted to avoid CData or Charset issues) -- the XML was going to be consumed by external/third-party systems/vendors, which could only be expected to have nominal XML support.

Sean |
From: Olivier M. <om...@ti...> - 2004-09-01 15:27:29
|
On Tue, 31 Aug 2004 21:52:19 -0400 "Leyne, Sean" <Se...@br...> wrote: LS> The problem I ran into wasn't with the size but rather that an attribute LS> value CANNOT contain CRLF characters. Specific implementation limitation. CRLF or LF is defined as whitespace, just as space and tab. Whitespace CAN happen in an attribute value. When reading an attribute value, an xml processor is supposed to map each whitespace character to a single space. But I can understand that some implementations might be buggy with that. Anyway, this means that the whitespace of an attribute value is not stable. So not suitable for storing column values. Except if we declare 'our' xml not to be xml (which is possible of course, but then why use xml in the first place ?). http://www.w3.org/TR/2004/REC-xml-20040204/#AVNormalize -- Olivier Mascia |
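The attribute-value normalization described above is easy to observe with any conforming parser. The sketch below uses Python's standard library parser and a made-up `NOTE` attribute: a literal newline inside an attribute value comes back as a space, while the same newline written as a character reference survives.

```python
import xml.etree.ElementTree as ET

# A literal line break inside the attribute value...
literal = ET.fromstring('<row NOTE="line one\nline two"/>')
# ...versus the same line break written as a character reference.
escaped = ET.fromstring('<row NOTE="line one&#10;line two"/>')

print(repr(literal.get("NOTE")))
print(repr(escaped.get("NOTE")))
```

So attributes can carry multi-line values only if the writer emits `&#10;` (and `&#13;`, `&#9;`) rather than the raw characters, which is one more special case for anyone editing the dump by hand.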
From: Henk v. d. M. <hv...@ta...> - 2004-09-01 10:11:28
|
Hello, I am trying to build Firebird 1.5.1. I ran prepare.bat, but run into problems on make_boot.bat I get a lot of errors, the first being: Processing dsql/array.epp Calling GPRE for dsql/array.epp 'C:\firebird2\gen\gpre_boot' is not recognized as an internal or external command, operable program or batch file. Did I forget to install something? Thank you, Henk van der Meer |
From: Peter J. <pj...@wa...> - 2004-09-06 18:47:35
|
> <column name="RDB$TRIGGER_BLR">�L</column> This is not legal (wellformed) XML 1.0 Regards, Peter Jacobi |
From: Arno B. <fir...@ab...> - 2004-09-06 21:40:59
|
Hi,

> > <column name="RDB$TRIGGER_BLR">�L</column>
>
> This is not legal (wellformed) XML 1.0

Why not?
It's well-formed, but in hex it's (and that's what I wanted to write in fact):

<?xml version="1.0" encoding="US-ASCII"?>
<database>
<column name="RDB$TRIGGER_BLR">�L</column>
</database>

You can check it with an on-line XML validator.

Regards,
Arno Brinkman
ABVisie |
From: Peter J. <pj...@wa...> - 2004-09-06 22:07:02
|
Hi Arno, All,

"Arno Brinkman" <fir...@ab...> wrote:
> Why not?
> It's well-formed, but in hex it's (and that's what i wanted to write in
> fact):
>
> <?xml version="1.0" encoding="US-ASCII"?>
> <database>
> <column name="RDB$TRIGGER_BLR">�L</column>
> </database>

No, whether hex or not, it is not well-formed.

This disallows control characters:

Character Range
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

And this paragraph says that text substituted for entities is handled "as though it were part of the document at the location the reference was recognized":

4.4.2 Included
[Definition: An entity is included when its replacement text is retrieved and processed, in place of the reference itself, as though it were part of the document at the location the reference was recognized.] The replacement text MAY contain both character data and (except for parameter entities) markup, which MUST be recognized in the usual way. (The string "AT&amp;T;" expands to "AT&T;" and the remaining ampersand is not recognized as an entity-reference delimiter.) A character reference is included when the indicated character is processed in place of the reference itself.

The question regularly comes up on the mailing lists, e.g.:
http://lists.xml.org/archives/xml-dev/199804/msg00502.html
http://lists.xml.org/archives/xml-dev/199804/msg00504.html

> You can check it with an on-line XML validator.

You can check an XML validator with your input. If it doesn't flag it as not well-formed, it's a rather bad validator. Try xmllint from libxml2.

All things considered, XML is for markup. If you want to encode data structures, there is ASN.1 (only joking, nowadays it must be XML, whether it fits or not).

Regards,
Peter Jacobi |
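The point about character references to control characters can be checked mechanically. In this Python sketch the bytes `&#x5;` and `&#x0;` are stand-ins chosen for illustration (the example in the thread did not survive archiving intact), and `is_well_formed` is a made-up helper around the standard library parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# References to control characters outside the Char production are rejected,
# whether written in decimal or in hex.
print(is_well_formed('<column name="RDB$TRIGGER_BLR">&#x5;&#x0;L</column>'))
print(is_well_formed('<column name="RDB$TRIGGER_BLR">&#5;L</column>'))

# Tab, line feed and carriage return are the only allowed characters below #x20.
print(is_well_formed('<column name="RDB$TRIGGER_BLR">&#x9;&#xA;L</column>'))
```

The first two documents fail and the third parses, which is why arbitrary binary data needs something like base64 rather than per-byte character references in XML 1.0.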
From: Leyne, S. <Se...@br...> - 2004-09-07 17:14:31
|
Jim,

> Default value and domain (global field) definitions are more
> problematic. First, they're not supported by JDBC on which IscDbc is
> modelled. Second, I don't really know how to differentiate between bona
> fide and artificial domains. I would like to support the former and not
> clutter up the world with the latter.
>
> <metadata>
> <table name="EMPLOYEE">
> <column name="SALARY" type="double" scale="-2" nullable="yes"/>

This presentation is 'unexpected' (I didn't expect it).

Although "double" is the true datatype, it is not the SQL datatype which I expected to be used in the DDL (i.e. NUMERIC( 10, 2)). Doesn't this representation lose the accuracy of the data schema?

How would a value with a datatype of NUMERIC( 10, 2) be presented?

<column name="SALARY" type="numeric" size="10" precision="2" nullable="yes"/>???

Further, keeping "small is beautiful" in mind, wouldn't it be better to use 'NotNull="yes"' instead of 'nullable="yes"', since NotNull is the exception more than the norm?

Sean |
From: Jim S. <ja...@ne...> - 2004-09-07 17:55:37
|
Leyne, Sean wrote:

>> <metadata>
>> <table name="EMPLOYEE">
>> <column name="SALARY" type="double" scale="-2" nullable="yes"/>
>
>This presentation is 'unexpected' (I didn't expect it).
>
>Although "double" is the true datatype, it is not the SQL datatype which
>I expected to be used in the DDL (i.e. NUMERIC( 10, 2)). Doesn't this
>representation lose the accuracy of the data schema?

I haven't a clue what the original DDL looked like, but the database created from the standard employee.fbk file has the employee salary field defined with a double (aka long float) and a scale of -2. I report what I see.

>How would a value with a datatype of NUMERIC( 10, 2) be presented?
>
><column name="SALARY" type="numeric" size="10" precision="2"
>nullable="yes"/>???

Probably type="bigint" scale="-2". I've adopted, for better or worse, more or less JDBC conventions and usages. The reasoning is that I believe that the C++ JDBC binding used in IscDbc should become the primary API for Firebird, leaving the dreadful DSQL interface as an exercise in backwards compatibility. JDBC has its fair share of limitations, warts, and flaws, but it's miles ahead of any competitors.

>Further, keeping "small is beautiful" in mind, wouldn't it be better to
>use 'NotNull="yes"' instead of 'nullable="yes"', since NotNull is the
>exception more than the norm?

I have a personal aversion to negativity. Nullable is clear. Not nullable is also clear. Not NotNull violates our innate distrust of double negatives (not, he said pointedly, "long floating" negatives).

It also gives me something to bargain with if/when you come up with a more substantial objection. Remember my stock advice to fledgling language designers: Always pick bad keywords so meddlers have something to change.

--
Jim Starkey
Netfrastructure, Inc.
978 526-1376 |
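The type="bigint" scale="-2" suggestion describes a scaled integer: the stored integer multiplied by ten to the power of the scale gives the SQL value. The sketch below shows that arithmetic in Python with exact decimals. It says nothing about how the utility actually writes such values in the dump, which the thread does not show; the helper names are hypothetical.

```python
from decimal import Decimal


def from_scaled(raw, scale):
    """Turn a stored integer and a (negative) scale into the SQL value."""
    return Decimal(raw).scaleb(scale)


def to_scaled(value, scale):
    """Inverse: the integer that would be stored for a given value and scale."""
    return int(value.scaleb(-scale))


salary = from_scaled(123456, -2)
print(salary)                 # exact decimal, no binary floating point involved
print(to_scaled(salary, -2))
```

Note what the pair (integer type, scale) does not carry: the declared precision. NUMERIC(10,2) and NUMERIC(15,2) could both come out as a scaled integer with scale -2, which is the schema-fidelity concern raised in this exchange.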
From: Leyne, S. <Se...@br...> - 2004-09-07 18:43:01
|
Leyne, Sean wrote:

<metadata>
<table name="EMPLOYEE">
<column name="SALARY" type="double" scale="-2" nullable="yes"/>

This presentation is 'unexpected' (I didn't expect it).

Although "double" is the true datatype, it is not the SQL datatype which I expected to be used in the DDL (i.e. NUMERIC( 10, 2)). Doesn't this representation lose the accuracy of the data schema?

I haven't a clue what the original DDL looked like, but the database created from the standard employee.fbk file has the employee salary field defined with a double (aka long float) and a scale of -2. I report what I see.

<SL> FYI, the original DDL was NUMERIC( 10, 2).

<SL> Then what is the use of the schema information in the XML?

<SL> By your own words, it doesn't provide true backup/restore functionality, since it doesn't care about maintaining the original schema definition (just a generalization of the datatypes).

<SL> The JDBC conventions are more than appropriate for Java and (as you suggest) internal uses, but there are plenty of languages where a NUMERIC( 10, 2) and NUMERIC( 15, 2) would have significantly different meanings, due to the data access components. Therefore, the load/dump utility could produce a database which is incompatible with the target application.

Remember my stock advice to fledgling language designers: Always pick bad keywords so meddlers have something to change.

<SL> Yes, I remember it well! You've gotten me to bite that apple more than once!!!

Sean |