From: SourceForge.net <no...@so...> - 2004-08-15 17:34:50
Bugs item #988592, was opened at 2004-07-10 20:58
Message generated for change (Comment added) made by migueljmol
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=379133&aid=988592&group_id=23629

Category: Applet
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Bob Hanson (hansonr)
Summary: cif issues

Initial Comment:
There is one more important CIF format that Jmol needs to be able to read. This comes from the Inorganic Crystal Structure Database, http://www.fiz-informationsdienste.de/en/DB/icsd/

For an example, see http://www.stolaf.edu/people/hansonr/jmol/cif/viewdir.htm and select "ic1166.cif"

To download the file, check for it at http://www.stolaf.edu/people/hansonr/jmol/cif

Basically, this is a slight variant of the Cambridge Crystal Structure Database. Mostly the format of the atom information is different. I note that in this file there can be atoms designated with no coordinates! So we have here:

Re1 Re3+ 4 e 0.0736(1) -.00119(7) 0.07901(8) 0. 1.
Re2 Re3+ 4 e 0.4356(1) 0.06192(7) 0.45604(8) 0. 1.
Cs1 Cs1+ 4 e 0.3809(2) 0.3961(2) 0.3240(2) 0. 1.
Cs2 Cs1+ 4 e 0.1175(2) -.3131(1) -.0456(1) 0. 1.
Cl1 Cl1- 4 e 0.2775(7) -.0799(4) 0.0192(5) 0. 1.
Cl2 Cl1- 4 e -.0577(8) 0.0757(5) 0.2110(5) 0. 1.
Cl3 Cl1- 4 e -.2203(7) 0.0564(4) 0.5429(5) 0. 1.
Cl4 Cl1- 4 e 0.3159(7) -.0246(4) 0.3056(5) 0. 1.
Cl5 Cl1- 4 e 0.4982(7) 0.2012(4) 0.5660(5) 0. 1.
Cl6 Cl1- 4 e 0.1978(7) 0.1484(4) 0.0706(5) 0. 1.
Cl7 Cl1- 4 e 0.0166(7) -.1522(4) 0.1619(5) 0. 1.
Cl8 Cl1- 4 e 0.5887(7) 0.1235(5) 0.3295(5) 0. 1.
O1 O2- 4 e 0.272(4) 0.194(2) 0.338(3) 9.6 1.
H1 H1+ 4 e 0. 2.

Note the H1 H1+ line has no coordinate. A single space is being used, not just generic white space, to separate fields.

The error given is:

Java(TM) Plug-in: Version 1.4.0_01
Using JRE version 1.4.0_01 Java HotSpot(TM) Client VM
User home directory = C:\Documents and Settings\hansonr
Proxy Configuration: Browser Proxy Configuration
FileManager.openFile(icsd_1166.cif)
SmarterModelAdapter:The model resolver thinks:Cif
java.lang.NullPointerException
    at org.jmol.adapter.smarter.ModelReader.parseFloat(ModelReader.java:45)
    at org.jmol.adapter.smarter.CifReader.processAtomSiteLoopBlock(CifReader.java:288)
    at org.jmol.adapter.smarter.CifReader.processLoopBlock(CifReader.java:144)
    at org.jmol.adapter.smarter.CifReader.readModel(CifReader.java:65)
    at org.jmol.adapter.smarter.ModelResolver.resolveModel(ModelResolver.java:57)
    at org.jmol.adapter.smarter.SmarterModelAdapter.openBufferedReader(SmarterModelAdapter.java:55)
    at org.openscience.jmol.viewer.managers.FileManager$FileOpenThread.openReader(FileManager.java:409)
    at org.openscience.jmol.viewer.managers.FileManager$FileOpenThread.openInputStream(FileManager.java:402)
    at org.openscience.jmol.viewer.managers.FileManager$FileOpenThread.run(FileManager.java:379)
    at org.openscience.jmol.viewer.managers.FileManager.openFile(FileManager.java:100)
    at org.openscience.jmol.viewer.JmolViewer.openFile(JmolViewer.java:897)
    at org.openscience.jmol.viewer.script.Eval.load(Eval.java:1566)
    at org.openscience.jmol.viewer.script.Eval.instructionDispatchLoop(Eval.java:337)
    at org.openscience.jmol.viewer.script.Eval.run(Eval.java:281)
    at java.lang.Thread.run(Unknown Source)
error opening file:/D:/js/struc/data/csd/icsd_1166.cif
java.lang.NullPointerException
openFile(icsd_1166.cif) 210 ms
InterruptedException!

But there are more problems.
The following block seems to be causing great difficulty:

loop_
_atom_site_aniso_label
_atom_site_aniso_type_symbol
_atom_site_aniso_U_11
_atom_site_aniso_U_22
_atom_site_aniso_U_33
_atom_site_aniso_U_12
_atom_site_aniso_U_13
_atom_site_aniso_U_23
Re1 Re3+ 0.0046(1) 0.00294(4) 0.00353(6) -.0008(1) 0.0005(1) -.0002(1)
Re2 Re3+ 0.0049(1) 0.00264(4) 0.00456(6) 0.0005(1) 0.0019(1) 0.0005(1)
Cs1 Cs1+ 0.0097(2) 0.00993(14) 0.0068(1) 0.0024(3) 0.0049(3) 0.0037(2)
Cs2 Cs1+ 0.0082(2) 0.00459(9) 0.0080(1) 0.0015(3) 0.0001(3) -.0029(2)
Cl1 Cl1- 0.0059(8) 0.0050(4) 0.0068(5) 0.0028(9) 0.001 (1) -.0014(7)
Cl2 Cl1- 0.0104(9) 0.0054(4) 0.0049(5) 0.0007(11) 0.004 (1) -.0019(7)
Cl3 Cl1- 0.0062(7) 0.0041(3) 0.0074(5) 0.0015(9) 0.004 (1) -.0006(7)
Cl4 Cl1- 0.0088(8) 0.0050(4) 0.0047(4) -.0021(10) -.001 (1) 0.0011(6)
Cl5 Cl1- 0.0083(8) 0.0033(3) 0.0070(5) 0.0007(9) 0.000 (1) -.0004(7)
Cl6 Cl1- 0.0075(8) 0.0035(3) 0.0062(5) -.0038(9) 0.000 (1) -.0002(7)
Cl7 Cl1- 0.0101(9) 0.0033(3) 0.0052(5) -.0028(9) 0.000 (1) 0.0020(6)
Cl8 Cl1- 0.0100(9) 0.0053(4) 0.0056(5) -.0039(10) 0.005 (1) 0.0029(7)

and charges on the type symbol are causing the atom symbol to be misinterpreted.

See http://www.stolaf.edu/people/hansonr/jmol/cif/ic1166b.cif for what this should (probably) look like. (It's not a great data set.)

Similarly:
http://www.stolaf.edu/people/hansonr/jmol/cif/ic30516.cif
and
http://www.stolaf.edu/people/hansonr/jmol/cif/ic30516b.cif

Bob Hanson

----------------------------------------------------------------------

>Comment By: Miguel (migueljmol)
Date: 2004-08-15 19:34

Message:
Logged In: YES
user_id=1050060

Bob,

Did you get any feedback from the Cambridge Crystal Structure folks? Have we reached resolution on this issue?

----------------------------------------------------------------------

Comment By: Miguel (michaelthoward)
Date: 2004-07-15 08:37

Message:
Logged In: YES
user_id=608250

Regarding determining element type ...
I will modify the Jmol CifReader so that it does a better job of determining the element symbol.

----------------------------------------------------------------------

Comment By: Bob Hanson (hansonr)
Date: 2004-07-15 01:51

Message:
Logged In: YES
user_id=1082841

OK, Peter makes a very good distinction there.

a) Format: I agree 100% that the file is invalid because of the improper use of white space. The IUCr's own CIF checker failed this one, so that's something I'm sure they will look into.

b) Semantics: I suggest Jmol be somewhat more flexible in reading atom names. At the very least, we should strip ![A-Z|a-z] from the _atom_site_type_symbol to determine the element symbol. No need for perfection here. Some of these files will be very odd, since two atoms can occupy the same position, as in "Ni2+Fe3+" for an atom name.

I have only looked at a few of these IUCr CIF files; I suspect that mostly they are just fine and that this one was an exception.

Bob Hanson

----------------------------------------------------------------------

Comment By: Peter Murray-Rust (petermr)
Date: 2004-07-14 13:11

Message:
Logged In: YES
user_id=125666

_atom_type_symbol
Name '_atom_type_symbol'
Category: atom_type
Data type: char
Must appear in a looped list
May match a value of '_atom_site_type_symbol'
Examples: C Cu2+ H(SDS) dummy FeNi

<bob>
Date: 2004-07-11 20:38
Sender: nobody
Logged In: NO

I will contact ICSD and enquire, but it is quite possible that they are doing exactly this. IMHO, their software reads the file. So should Jmol. Standards aside...

Bob Hanson
</bob>

We have to be very tough on this (I am part of the COMCIFs committee process and if there is a problem I will take it up). There are two layered aspects:

- syntax. Does it conform with the CIF specification? (This is similar to whether an XML document is well-formed.) The first example didn't. It is therefore invalid.
There are a range of CIF checker tools on the IUCr site and if any of them flag the CIF as invalid then IMO ICSD have the problem.

- semantics. This is whether the value is reasonable. (The second example seems well formed at first glance.) Unfortunately there is less experience in developing semantics tools for CIF.

It sounds as if the "CL1-" is causing problems. This refers to an atom_type_symbol, which is defined as:

Definition
The code used to identify the atom specie(s) representing this atom type. Normally this code is the element symbol. The code may be composed of any character except an underline with the additional proviso that digits designate an oxidation state and must be followed by a + or - character.

It would be useful if CIF had used a regular expression for this but... In any case "Cl1-" appears to be a semantically valid label. So if this is the problem and if it is in Jmol and if Jmol wishes to read this type of semantics then Jmol needs adjusting.

Note that it is not easy to write code for transformation of CIF semantics, which is why I do it all in XML.

PeterMR

----------------------------------------------------------------------

Comment By: Miguel (migueljmol)
Date: 2004-07-14 11:23

Message:
Logged In: YES
user_id=1050060

Please follow up with the ICSD folks and see what they say.

Proliferation of support for *invalid* files is a major problem. We don't do anybody any favors by being *flexible* ... it just comes back to bite you.

The IUCr has worked very hard to try to promote the CIF standard. And they claim to enforce (legally) that people who claim to support CIF actually do so.

If they have software that reads invalid CIF files then they should be forced to fix it ...
first with a gentle nudge and then (if necessary) with a report to the CIF-police (IUCr)

Miguel

----------------------------------------------------------------------

Comment By: Miguel (migueljmol)
Date: 2004-07-12 12:01

Message:
Logged In: YES
user_id=1050060

I am glad that PeterMR saw this ... I was going to send it to him and ask him if it was valid.

I do not believe that there is any way that we can reliably read this file in the context of a CIF reader. I believe that there is nothing special about newline characters within the context of a data loop. That is, some files have newline characters to separate the data values associated with a single atom. Therefore, the reader cannot reliably 'guess' when to move to a new atom.

I understand that you are going to ask ICSD about this file ... let's see what their stance is ...

Miguel

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2004-07-12 05:38

Message:
Logged In: NO

I will contact ICSD and enquire, but it is quite possible that they are doing exactly this. IMHO, their software reads the file. So should Jmol. Standards aside...

Bob Hanson

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2004-07-11 19:00

Message:
Logged In: NO

This is an invalid CIF. CIFs have two data structures, loops and items. In a loop like this all rows must have the same number of (whitespace-separated) fields. Note that the number of fields in each row, and their semantics, is given by the list of names in the loop, whose length must equal the number of fields in each row. For more information on CIF syntax, from which no deviations are allowed, see http://www.iucr.org

PeterMR

NB I doubt that the ICSD is emitting invalid CIFs on a systematic basis.

----------------------------------------------------------------------
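
For readers following the thread, three of the points raised above can be sketched in Java: newlines are not special inside a loop, so the value count must be a multiple of the number of loop names; a missing value should not reach the float parser as null; and the element symbol has to be recovered from a type symbol such as "Cl1-". This is only a minimal illustration under stated assumptions, not Jmol's actual CifReader code. The class and method names (CifLoopSketch, elementSymbol, parseCifFloat, splitLoopRows) are hypothetical, the tokenizer ignores quoted strings and semicolon-delimited text fields, and taking the leading capital-plus-optional-lowercase prefix is a variant of the strip-non-letters suggestion rather than the suggestion itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the Jmol source.
public class CifLoopSketch {

    // One uppercase letter, optionally followed by one lowercase letter.
    private static final Pattern ELEMENT_PREFIX = Pattern.compile("^[A-Z][a-z]?");

    /** Returns the element-like prefix of a type symbol ("Cl" for "Cl1-"), or null if none. */
    public static String elementSymbol(String typeSymbol) {
        if (typeSymbol == null) {
            return null;
        }
        Matcher m = ELEMENT_PREFIX.matcher(typeSymbol.trim());
        return m.find() ? m.group() : null;
    }

    /** Parses a CIF number such as "0.0736(1)", dropping the parenthesised uncertainty. */
    public static float parseCifFloat(String token) {
        if (token == null) {
            return Float.NaN; // missing value: report NaN instead of throwing
        }
        int paren = token.indexOf('(');
        String number = paren >= 0 ? token.substring(0, paren) : token;
        try {
            return Float.parseFloat(number);
        } catch (NumberFormatException e) {
            return Float.NaN;
        }
    }

    /**
     * Splits loop data into rows of fieldCount values. Newlines are treated as
     * ordinary whitespace, so only the total number of values can be checked.
     */
    public static List<String[]> splitLoopRows(int fieldCount, String data) {
        String trimmed = data.trim();
        String[] tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        if (fieldCount <= 0 || tokens.length % fieldCount != 0) {
            throw new IllegalArgumentException(
                tokens.length + " values cannot be split into rows of " + fieldCount);
        }
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < tokens.length; i += fieldCount) {
            rows.add(Arrays.copyOfRange(tokens, i, i + fieldCount));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Two rows copied from the anisotropic loop quoted in the report (8 loop names).
        String good = "Re1 Re3+ 0.0046(1) 0.00294(4) 0.00353(6) -.0008(1) 0.0005(1) -.0002(1)";
        String bad = "Cl1 Cl1- 0.0059(8) 0.0050(4) 0.0068(5) 0.0028(9) 0.001 (1) -.0014(7)";

        for (String[] row : splitLoopRows(8, good)) {
            System.out.println(elementSymbol(row[1]) + " U11=" + parseCifFloat(row[2]));
        }
        try {
            splitLoopRows(8, bad); // "0.001 (1)" splits into two tokens, giving 9 values
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The Cl1 row is rejected because the space inside "0.001 (1)" yields nine whitespace-separated values for eight loop names, which may be the white-space problem the IUCr checker flagged. A reader that wanted to be lenient would have to guess at row boundaries, which is the reliability concern raised in the thread.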