From: Peter Murray-R. <pm...@ca...> - 2010-03-21 12:54:04
Sorry not to have replied earlier.

CML is designed for extensibility, primarily through other namespaces. It is possible to add foreign attributes and foreign elements (and I assume that gfx does not resolve to http://www.xml-cml.org/schema). So

    <atom gfx:color="#ff0000"/>

is perfectly OK - you can use it to mean whatever you want. CML parsers are allowed to ignore it. Similarly:

    <atom elementType="O">
      <gfx:color>#ff0000</gfx:color>
    </atom>

is allowed. Note that:

    <atom gfx:radius:units="angstrom"/>

is badly formed XML (a namespace-qualified name may contain only one colon) - it could be:

    <atom gfx:radius_units="angstrom"/>

Note that attributes in CML are NOT in the CML namespace - this is an XML feature, not a CML one: an unprefixed attribute is in no namespace, even when a default namespace is declared. Writing:

    <atom cml:elementType="Cl"/>

(where cml resolves to the CML namespace) is NOT recommended and is NOT the same as:

    <atom elementType="Cl"/>

This is the syntax. The philosophy is that we can extend XML languages through community agreement and practice. If you wish to write:

    <atom egon:radius="2.0"/>

you may - the only question is whether other people will understand that and write software for it. We tend to reserve "cmlx" for extensions in JUMBO and CMLLite. There is no definitive list of such extensions but there are a number we use regularly.

In general we extend CML through dictionaries. In

    <atom>
      <property dictRef="gfx:color">
        <scalar type="xsd:string">#ff0000</scalar>
      </property>
      <property dictRef="cml:radius">
        <scalar type="xsd:float" units="units:angstrom">1.2</scalar>
      </property>
    </atom>

the agreed semantics are that there should be namespaced dictionary entries for color and radius. They are further enhanced by the typing (string/float/units) in the XML. In principle it's better to have the typing in the dictionary and we are moving that way. There is no *CML* semantics that says that "gfx:color" should have a given type or form or have a dictionary entry.
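To make the attribute-namespace point and the dictionary convention concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. It is an illustration only, not how JUMBO or any CML tool does it. The gfx namespace URI, the atom ids and the read_properties helper are assumptions made up for the example; only the CML namespace URI comes from the text above.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"
GFX_NS = "http://example.org/gfx"  # hypothetical URI, for illustration only

doc = f"""
<molecule xmlns="{CML_NS}" xmlns:cml="{CML_NS}" xmlns:gfx="{GFX_NS}">
  <atom id="a1" elementType="Cl" gfx:color="#ff0000"/>
  <atom id="a2" cml:elementType="Cl"/>
  <atom id="a3">
    <property dictRef="gfx:color">
      <scalar type="xsd:string">#ff0000</scalar>
    </property>
    <property dictRef="cml:radius">
      <scalar type="xsd:float" units="units:angstrom">1.2</scalar>
    </property>
  </atom>
</molecule>
"""

root = ET.fromstring(doc)
a1, a2, a3 = root.findall(f"{{{CML_NS}}}atom")

# Unprefixed attributes are in no namespace, so the plain name finds a1's
# elementType but not a2's, which was written with the cml: prefix.
plain_a1 = a1.get("elementType")
plain_a2 = a2.get("elementType")
qualified_a2 = a2.get(f"{{{CML_NS}}}elementType")

# The foreign attribute is kept under its own namespace; software that
# does not know about gfx can simply never ask for it.
color_a1 = a1.get(f"{{{GFX_NS}}}color")


def read_properties(atom):
    """Collect dictRef -> (value, units) from <property><scalar> children.

    Hypothetical helper: it only understands xsd:float and treats every
    other type as a string.
    """
    props = {}
    for prop in atom.findall(f"{{{CML_NS}}}property"):
        scalar = prop.find(f"{{{CML_NS}}}scalar")
        text = scalar.text
        value = float(text) if scalar.get("type") == "xsd:float" else text
        props[prop.get("dictRef")] = (value, scalar.get("units"))
    return props


props = read_properties(a3)
print(plain_a1, plain_a2, qualified_a2, color_a1)
print(props)
```

A reader written this way needs no change when someone adds a new dictRef. It just passes the entry along, and only software that knows the dictionary attaches meaning to it.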

2010/3/21 Konstantin Tokarev <an...@ya...>

> Hi all,
> I think this piece of XML cleanly demonstrates excessiveness, low human
> readability and parsing inefficiency:
>
> <atom>
>   <property dictRef="gfx:color">
>     <scalar type="xsd:string">#ff0000</scalar>
>   </property>
>   <property dictRef="cml:radius">
>     <scalar type="xsd:float" units="units:angstrom">1.2</scalar>
>   </property>
> </atom>
>

XML is not always very human readable, but nor are most data formats. A Molfile is not very human readable in places either. A Gaussian archive file is almost human-unreadable. A CDX file is completely human-unreadable. The important thing is that the semantics are unambiguous. I'm afraid that's a necessary tradeoff in the machine age.

> IMHO, this information could be easily stored as
> <atom gfx:color="#ff0000" cml:radius="1.2" cml:radius:units="angstrom"/>
>
> If I'm wrong, please, explain why.
>

It could also be stored as

    red atom radius 1.2A

That's very human readable and rather difficult for a machine without special implicit conventions. It's always a tradeoff.

P.

-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069