From: Bob H. <ha...@st...> - 2006-04-29 16:57:44
My guiding principles include:

a) We should represent the data within Jmol as close to its native format as possible (strings where strings, floats where floats, ints where ints, etc.).

b) The JSON model is easily "human-readable", easy to implement, and well discussed and documented in books and online papers.

c) The JSON model allows for a very restricted set of data structures, which, I think, is in general agreement with what you are suggesting, Miguel. The JSON model involves only two allowable primary data types: sequential and associative array literals (Vectors and Hashtables). These arrays may contain only five data types: strings, numbers, booleans, sequential array literals, and associative array literals.

d) Note that there is a very simple JSON<-->XML mapping. These two formats differ primarily in notation. Not surprisingly, the JSON model is essentially the same as the XML model.

e) The Java equivalent to JSON used by bob200603.PropertyManager involves Vectors and Hashtables. Within those, one can have the same five data types, with "number" being specified as Float or Integer, of course. Three additional data types were needed within the Java structure: Point3f, Vector3f, and Matrix3f. I believe these nine data types are sufficient to handle molecular file data. I've looked at a lot of files, and I don't see anything more complicated than that.

f) The current (bob200603) auxiliaryInfo implementation follows this model mainly so that the web designer has easy access to the file data. This is the key: the system we implement should be simple enough, yet flexible enough, that real data can be passed simply and easily. It should mirror the data types found in real data files, namely: Vector, Hashtable, String, Float, Integer, Boolean, Point3f, Vector3f, and Matrix3f.

g) So, in my opinion, these are the nine data types we should implement in auxiliaryInfo.

h) bob200603.PropertyManager.toJSON(String infoType, Object info) provides a good list of these sorts of objects that need passing. Note that object types Point3f, Vector3f, and Matrix3f are first converted to JavaScript sequential array literals to be consistent with the JSON model.

Miguel wrote:

>[clip]
>
>* I would be useful for me to get some understanding of the types of data
>that we would expect to come through this interface in the future ...
>
>   symmetry
>   vibration/resonance stuff (?)
>

Probably the best place to look is at

http://www.stolaf.edu/people/hansonr/jmol/test/proto/spartan.htm

Just click on [modelInfo]. When the list appears, it may be easier to read if you click on "new window" to look at it (scroll down to find this if you don't see it). You should get a pop-up window with information looking something like that below.

modelInfo=new Array()
modelInfo.modelSetHasVibrationVectors=true
modelInfo.modelCount=31
modelInfo.modelSetAuxiliaryInfo=new Array()
modelInfo.modelSetAuxiliaryInfo.S_ROT=0.02066953
modelInfo.modelSetAuxiliaryInfo.TOTAL_WALL_TIME=0.7565
modelInfo.modelSetAuxiliaryInfo.H_TOT=71.35714
modelInfo.modelSetAuxiliaryInfo.CPKAREA=108.4228
modelInfo.modelSetAuxiliaryInfo.H_VIB=0.7278247
modelInfo.modelSetAuxiliaryInfo.C_ROT=2.9807887
modelInfo.modelSetAuxiliaryInfo.CHELP_PRMS=3.1538928
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES=new Array()
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[0]=0.1415481
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[1]=-0.14394641
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[2]=-0.14394641
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[3]=-0.1393379
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[4]=-0.1393379
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[5]=-0.1393379
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[6]=-0.1393379
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[7]=0.1405371
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[8]=0.1405371
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[9]=0.1405371
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[10]=0.1405371
modelInfo.modelSetAuxiliaryInfo.ESPCHARGES[11]=0.1415481
modelInfo.modelSetAuxiliaryInfo.S_TRAN_0=0.038979057
modelInfo.modelSetAuxiliaryInfo.SCF_ITERS=8
modelInfo.modelSetAuxiliaryInfo.H_ROT=0.8887221
modelInfo.modelSetAuxiliaryInfo.MOMENT_I_cm=new Array()
modelInfo.modelSetAuxiliaryInfo.MOMENT_I_cm[0]=0.1931196
modelInfo.modelSetAuxiliaryInfo.MOMENT_I_cm[1]=0.1931196
modelInfo.modelSetAuxiliaryInfo.MOMENT_I_cm[2]=0.09655979
modelInfo.modelSetAuxiliaryInfo.WEIGHT=78.114
modelInfo.modelSetAuxiliaryInfo.FREQ_QUALITY=1
modelInfo.modelSetAuxiliaryInfo.CHELP_INFO=1025.501
modelInfo.modelSetAuxiliaryInfo.HOMO_N=21
modelInfo.modelSetAuxiliaryInfo.TOTAL_MASS=78.04695
etc...

The bob200603 SpartanSmolReader simply reads all the data structures in the file into atomSetCollectionAuxiliaryInfo. This, I believe, should be our model -- not worrying about "what data people will want" but, rather, "delivering the data present." (Within limits.) That, in my opinion, will be the best "good-neighbor" policy.

>Q: What other kinds of data would we reasonably expect to want to read
>from data files within the next 2 years?
>

See above. Now: dipole moments, vibrations, frequencies, energies. Soon: 2D data involving calculation maps (a grid of frames rather than a linear sequence of frames).

>* I need to better understand the distinction between which properties are
>"AtomSet" and which are "AtomSetCollection"
>

So do we all, so do we all. I think this is the most important thing to get down and agreed upon. It is possible that some properties are "default" properties of AtomSetCollection and only selectively overwritten by AtomSet. The fact that we are implementing atom sets in two totally different contexts -- actual models and vibrations of the same model -- complicates the issue.

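The "default, selectively overwritten" idea could look something like the following. This is only a minimal sketch, not the actual Jmol code: the class name AuxiliaryInfoLookup, the method getProperty, and the override value 12 are all hypothetical, and the two maps merely stand in for modelSetAuxiliaryInfo (collection-wide) and modelAuxiliaryInfo (per atom set).

```java
import java.util.Hashtable;
import java.util.Map;

/** Hypothetical sketch: per-model properties that fall back to collection-wide defaults. */
final class AuxiliaryInfoLookup {
    private AuxiliaryInfoLookup() {}

    /**
     * Returns the value stored for the specific atom set if there is one;
     * otherwise returns the collection-wide default (null if neither has the key).
     */
    static Object getProperty(Map<String, Object> modelSetInfo,
                              Map<String, Object> modelInfo,
                              String key) {
        Object specific = modelInfo.get(key);
        return specific != null ? specific : modelSetInfo.get(key);
    }

    public static void main(String[] args) {
        // Collection-wide values, taken from the modelInfo listing above.
        Map<String, Object> modelSetInfo = new Hashtable<>();
        modelSetInfo.put("WEIGHT", Float.valueOf(78.114f));
        modelSetInfo.put("SCF_ITERS", Integer.valueOf(8));

        // One atom set overrides a single property (the value 12 is made up).
        Map<String, Object> modelInfo = new Hashtable<>();
        modelInfo.put("SCF_ITERS", Integer.valueOf(12));

        System.out.println(getProperty(modelSetInfo, modelInfo, "WEIGHT"));    // falls back
        System.out.println(getProperty(modelSetInfo, modelInfo, "SCF_ITERS")); // overridden
    }
}
```

With a lookup like this, a reader would only need to store a value on an individual atom set when it actually differs from the collection-wide one.
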
The difference is between, say, having:

AtomSetCollection
  .AtomSet[]
  .Frequencies[]
  .Energies[]

or

AtomSetCollection
  .AtomSet[]
    .Frequency
    .Energy

That make sense? In the first case, Frequencies and Energies are properties of AtomSetCollection, even though really they are correlated 1:1 with AtomSets. This is analogous to the way "Dots" and "Labels" are parallel sets of classes even though they correspond 1:1 with atoms. In the second case, these properties are elements of each individual atom set.

In general, I think most properties should be properties of the AtomSetCollection, whether they correspond 1:1 with an AtomSet or not. This is because we cannot know a priori (and Jmol may have no business knowing) whether a vector in a data file corresponds to what we are going to call "atom sets".

Here's a good example involving vibration: the data file will list vibrations 0,1,2,3,....,N-1, where N is the number of vibrations. But we have implemented N+1 atom sets when vibrations are present. So any file data associated with vibrations is NOT necessarily associated with that first "base" atom set. It would make more sense to leave that data in the form it has in the file and not apply Jmol-derived constructs here.

BUT in the other context, independent models in PDB and CIF files, it might make more sense to say: "All properties should be properties of the specific atom set (modelAuxiliaryInfo in Viewer), and none should be properties of the atom set collection as a whole (modelSetAuxiliaryInfo in Viewer)." (Because there is no fundamental common concept like "vibrational modes of a molecule" linking these models -- they are just independent objects.)

So, perhaps a compromise depends on where in the file the data is present. If it is present within

_data
//////
_data
//////
...

as in CIF files, then I would argue that it should be a model-specific (AtomSet) property.

If it is defined for the data as a whole, as in CSF files:

object_class dihedral
property dflag      Linus noUnit 0 1 HEX
property rflag      Linus noUnit 0 1 HEX
property angle      Linus degree 2 1 FLOAT
property AngleRange Linus degree 3 2 FLOAT
property Steps      Linus noUnit 0 1 INTEGER
ID dflag rflag angle   AngleRange       Steps
1  0x1   0x0      0.00 -179.000 61.000  2
2  0x1   0x0   -180.00 -179.000 61.000  2
3  0x1   0x0    180.00 -179.000 61.000  2

then it should be more broadly a model set (AtomSetCollection) property.

Bob
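
For reference, the conversion described in (h) might be sketched roughly as follows. This is an illustration under stated assumptions, not the bob200603.PropertyManager code: the class JsonSketch and its methods are hypothetical, float[] stands in for Point3f/Vector3f and float[][] for Matrix3f so that the example needs nothing beyond the standard library, keys are sorted only to make the output repeatable, and string escaping covers just quotes and backslashes.

```java
import java.util.Hashtable;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.Vector;

/** Hypothetical sketch of serializing the nine auxiliaryInfo types to JSON text. */
final class JsonSketch {
    private JsonSketch() {}

    static String toJson(Object info) {
        if (info instanceof String) {
            return quote((String) info);
        }
        if (info instanceof Float || info instanceof Integer || info instanceof Boolean) {
            return info.toString(); // NaN and infinities are not handled in this sketch
        }
        if (info instanceof float[]) { // stand-in for Point3f / Vector3f
            float[] xyz = (float[]) info;
            StringBuilder sb = new StringBuilder("[");
            for (int i = 0; i < xyz.length; i++) {
                if (i > 0) sb.append(',');
                sb.append(xyz[i]);
            }
            return sb.append(']').toString();
        }
        if (info instanceof float[][]) { // stand-in for Matrix3f: an array of rows
            float[][] rows = (float[][]) info;
            StringBuilder sb = new StringBuilder("[");
            for (int i = 0; i < rows.length; i++) {
                if (i > 0) sb.append(',');
                sb.append(toJson(rows[i]));
            }
            return sb.append(']').toString();
        }
        if (info instanceof List) { // Vector implements List
            StringBuilder sb = new StringBuilder("[");
            boolean first = true;
            for (Object item : (List<?>) info) {
                if (!first) sb.append(',');
                sb.append(toJson(item));
                first = false;
            }
            return sb.append(']').toString();
        }
        if (info instanceof Map) { // Hashtable implements Map
            Map<String, Object> sorted = new TreeMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) info).entrySet()) {
                sorted.put(String.valueOf(e.getKey()), e.getValue());
            }
            StringBuilder sb = new StringBuilder("{");
            boolean first = true;
            for (Map.Entry<String, Object> e : sorted.entrySet()) {
                if (!first) sb.append(',');
                sb.append(quote(e.getKey())).append(':').append(toJson(e.getValue()));
                first = false;
            }
            return sb.append('}').toString();
        }
        throw new IllegalArgumentException("Unsupported type: " + info);
    }

    /** Wraps a string in double quotes, escaping backslashes and double quotes only. */
    private static String quote(String s) {
        return '"' + s.replace("\\", "\\\\").replace("\"", "\\\"") + '"';
    }

    public static void main(String[] args) {
        Vector<Object> moments = new Vector<>();
        moments.add(Float.valueOf(0.1931196f));
        moments.add(Float.valueOf(0.09655979f));

        Hashtable<String, Object> modelInfo = new Hashtable<>();
        modelInfo.put("modelCount", Integer.valueOf(31));
        modelInfo.put("modelSetHasVibrationVectors", Boolean.TRUE);
        modelInfo.put("MOMENT_I_cm", moments);
        modelInfo.put("examplePoint", new float[] {1.0f, 0.5f, -2.0f}); // made-up value

        System.out.println(toJson(modelInfo));
    }
}
```

Anything outside the nine types is rejected with an exception rather than guessed at, which keeps the structure as restricted as the JSON model itself.
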