Thread: [OpenFormula-discuss] FYI: Formulas not specified by Microsoft XML, either
From: David A. W. <dwh...@dw...> - 2005-11-07 18:39:28
FYI: I've learned that Microsoft's Brian Jones takes some shots at OpenDocument because it doesn't specify spreadsheet formulas in detail; see:
http://blogs.msdn.com/brian_jones/archive/2005/10/04/477127.aspx

Those in glass houses should not throw stones. As far as I know, Microsoft's XML format is just as unspecified as OpenDocument. In both cases, formulas are a "string" with some examples. Microsoft's XML has EXACTLY the same limitations as OpenDocument in terms of its formula specifications. Thus Jones' claims that they uniquely support "full fidelity formats" are complete nonsense.

Microsoft's formats do manage to exchange data between two identical versions of Microsoft Office. So what? All spreadsheets can do that; there's nothing interesting about it. What is DESIRED is a standard, public format, implementable by ANYONE (without restriction on their licenses, etc.) so that ALL spreadsheets can exchange data. Microsoft's XML format does not even begin to do this. Microsoft's licensing approach makes it illegal to have open competition, or for customers to access their data in arbitrary ways.

The failure of Microsoft's XML formats to specify the formulas means that their approach is not even TECHNICALLY an improvement over OpenDocument in this area. "Whatever Excel appears to do today" is NOT a specification. It changes with each release. And are the bugs in Excel REALLY required? For example, Excel still does not correctly implement AND and OR in an array formula, making these two fundamental functions useless in an important circumstance. A _real_ specification of functions would make it clear what is intended... vs. what is a bug.

Thanks; I thought everyone on this list ought to know.

--- David A. Wheeler
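To make the AND/OR complaint concrete, here is a minimal Python sketch. It is not Excel code. It assumes one common reading of the complaint: that AND collapses all of its array arguments into a single TRUE/FALSE instead of producing one result per row. The function names and sample data are invented for illustration.

```python
from typing import List, Sequence


def and_aggregate(*ranges: Sequence[bool]) -> bool:
    """Model of an AND that flattens every argument into one result.

    Assumption: this mirrors the behavior being complained about.
    """
    return all(value for rng in ranges for value in rng)


def and_elementwise(left: Sequence[bool], right: Sequence[bool]) -> List[bool]:
    """Row-by-row AND, which is what an array-formula author usually wants."""
    return [a and b for a, b in zip(left, right)]


# Two hypothetical columns of per-row conditions.
col_a = [True, True, False]
col_b = [True, False, False]

aggregate_result = and_aggregate(col_a, col_b)
elementwise_result = and_elementwise(col_a, col_b)

print(aggregate_result)    # False
print(elementwise_result)  # [True, False, False]
```

A written specification would have to say which of these two results a conforming implementation must produce; observing one product's output cannot settle whether that output is intended or a bug.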
From: Dennis E. H. <den...@ac...> - 2005-11-07 21:06:00
David, I'm confused.

1. In the page on Brian Jones's blog that you link to (which is based on a concern expressed by Tim Bray of Sun), there is this statement near the end:

   "The Microsoft Office Open XML formats are specifically designed as an XML representation of our full file formats. Everything you can do in our default format is represented as XML. Our formats are primarily designed around viewing, editing and integrating the files with data, formulas, and other application behavior."

2. Later, in a comment responding to information about your work on OpenFormula, <http://blogs.msdn.com/brian_jones/archive/2005/10/04/477127.aspx#477463>, Brian makes a further observation:

   "I'm actually curious about what peoples thoughts were around formulas. One could make an argument that using strings for formulas is the right way to go, but in order to have a shared document format I'm assuming they still need to have everyone agree on a single type of syntax for those strings. Otherwise the formats aren't interoperable."

3. I can't tell from these comments whether the Office "12" Open XML format for Excel will use strings or XML structures for formulas. Until I saw your posting, I had the (unjustified) thought that they intended to use XML structures, although I am sure the possible bloating of the resulting files and the parsing time would be a critical consideration.

4. In the provisional schemas that have been published, I notice that the simple type ST_Formula is based on xs:string, but I am not comfortable enough with XML Schema and the sketchy form of the Office "12" preliminary schema to know the import of that. (For example, is this an entity name or an attribute name, that kind of thing. I haven't dug into their schema enough to understand how it all holds together.)

5. If formulas are in strings, the XML schema cannot define the syntax. That doesn't mean the syntax and semantics are not defined; it means they aren't to be found in the schema.

6. Maybe we should ask Brian for clarification before we speculate so much? Also, where are the examples you are looking at? Are they using the current Office 2003 Excel XML, or are they done with the proposed Office 12 format, which is supposed to be a complete round-trip fidelity-preserving solution compared to the initial steps taken in Office 2003?

- Dennis
From: Dennis E. H. <den...@ac...> - 2005-11-08 01:51:59
A short after-thought:

1. However Microsoft matures the specifications for Office "12" Open XML and details of the elements and attributes, I would think that has nothing to do with the value of producing an OpenFormula specification that is widely acceptable in conjunction with OpenDocument.

2. Our task would seem to be unchanged and maybe we should not allow ourselves to be distracted by suspected Microsoft maneuvers.

- Dennis
From: John.Cowan <jc...@re...> - 2005-11-08 15:15:16
Dennis E. Hamilton scripsit:

> 3. I can't tell from these comments whether the Office "12" Open XML format
> for Excel will use strings or XML structures for formulas.

I think it's clear that they will be strings. "Represented as XML" just means that everything is, one way or another, embedded inside XML elements and attributes. It says nothing about how the attribute values and element content may or may not be structured.

> 4. In the provisional schemas that have been published, I notice that the
> simple type, ST_Formula, is based on xs:string but I am not comfortable
> enough with XML Schema and the sketchy form of the Office "12" preliminary
> schema to know the import of that. (For example, is this an entity name or
> is it an attribute name, that kind of thing. I haven't dug into their
> schema enough to understand how it all holds together.)

Simple types are neither entity names nor attribute names; rather, they are schema-internal labels for particular kinds of character content or attribute values. For example, the simple type xs:integer describes unlimited-precision integers, and indicates that the corresponding character content or attribute value consists solely of digits with a possible leading sign and possible leading or trailing spaces.

In this case, ST_Formula is a subtype of xs:string, meaning that there are no particular interpretive semantics within the framework of XML Schema that can be placed on it. It is neither a number nor a boolean nor a language code nor a URI nor .... Depending on the exact definition of ST_Formula, it may be limited in size or constrained to match a regular expression, though I doubt it in this particular case. (I haven't looked at the schemas, as I don't want to risk having knowledge of patented material, which exposes one to triple damages.)

> 5. If formulas are in strings, the XML schema cannot define the syntax.
> That doesn't mean the syntax and semantics are not defined, it means they
> aren't to be found in the schema.

Correct.

--
A rose by any other name                John Cowan
may smell as sweet,                     http://www.ccil.org/~cowan
but if you called it an onion           http://www.reutershealth.com
you'd get cooks very confused.  --RMS   jc...@re...
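The distinction drawn above, between a value that is merely carried inside XML and a value whose text a schema type actually constrains, can be sketched in Python. This is an illustration under stated assumptions: the `cell` element and its `formula` and `value` attributes are invented and are not taken from any published schema, and the regular expression is only a rough approximation of the xs:integer lexical form described above.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are invented for
# illustration and do not come from any real schema.
DOCUMENT = '<cell formula="=SUM(A1:A3)*2" value=" +42 "/>'

cell = ET.fromstring(DOCUMENT)
formula = cell.get("formula")
value = cell.get("value")

# The XML layer hands the formula back as plain text. Nothing in the parse
# tree marks the function name, the cell range, or the operator.
print(repr(formula))
print(len(list(cell)))  # number of child elements carrying structure

# A type such as xs:integer, by contrast, does constrain the text.
# Rough approximation: optional sign, then digits, ignoring surrounding spaces.
XS_INTEGER = re.compile(r"[+-]?[0-9]+")


def looks_like_xs_integer(text: str) -> bool:
    """Return True if text roughly matches the xs:integer lexical form."""
    return XS_INTEGER.fullmatch(text.strip()) is not None


print(looks_like_xs_integer(value))
print(looks_like_xs_integer(formula))
```

Any grammar for the formula text therefore has to live in a separate specification and a separate parser; a schema that types it as a string cannot supply one.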
From: Dennis E. H. <den...@ac...> - 2005-11-08 17:21:52
Thanks John, that's very helpful.

I don't think the XML schemas (or any schemas) are patentable subject matter. The Microsoft royalty-free patent license applies to Microsoft necessary/essential claims for software that processes data according to those schemas. However, the preview materials are under an EULA, and that does appear to limit use and public discussion of them. That's why I haven't done more than glance over them, notice how thin they seem to be so far, and move on.

The schema is not enough to know what to do and what the intended interpretation of the XML is, of course, or the OASIS OpenDocument specification would be about 40 pages instead of over 700 [;<).

I think there may be a Microsoft newsgroup somewhere to ask further questions, but I haven't sought it out. There is a blog where new features for Office "12" Excel are being discussed:
http://blogs.msdn.com/excel/default.aspx
and there are a few entries about formula-related improvements too.

- Dennis