From: Eric B. <er...@pi...> - 2017-12-12 17:38:40
Hello everyone,

I think the workshop was relatively productive, even though there were only a few concrete products. Karol, it was good to finally meet in person. There is supposed to be a report eventually, but it hasn’t materialized yet. Current written results are located here <https://drive.google.com/drive/u/0/folders/0BwQDzp1VrB3veDdGRS1aTHI1d1E>.

Here is a brief summary and my thoughts.

- MolSSI has pledged to provide a definition of the schema via http://json-schema.org/, tools for validation, including a curl’able server, and tools for translating between AO normalization and ordering conventions.
- Like Chemical JSON, we are (at first) going with a “big” or “monolithic” layout. That means all atom coordinates are stored together, all basis set definitions are stored together, all atomic charges are together, etc. The opposite would be to have each atom carry its own coordinates, charge, basis functions, etc. This is something like a procedural/imperative vs. object-oriented structure, though accessing attributes only changes from frame.xyz[4] to frame.atoms[4].xyz and so on.
- Like Chemical JSON, I think we also agreed on completely flat arrays. I am unsure if we’re doing C or Fortran ordering or what the precedent is. I suspect CJSON is row-major. Whatever the choice, it should *not* vary on a per-attribute basis.
- There is an open question about how to store data for multiple geometries. Should each geometry carry its own coordinates and MO coefficients, or should those be up a level with an additional pre-index, like how cclib does it? I also have no idea how compressed/binary data will work.
- There will be different levels of adherence to the spec. The bare minimum we should adhere to, and a concrete goal, is the “Molden” level, meaning we can read and write everything that the Molden format can handle.
- There are going to be vendor-specific extensions, allowing arbitrary keys to be added to the schema, similar to how we are going to redo attributes to hold arbitrary program-specific properties.
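To make the layout and ordering points above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the key names (symbols, xyz, charges), the helper functions, and the water-like numbers are all made up, and none of them come from the MolSSI schema or Chemical JSON. Row-major ordering is assumed only because that is the guess stated above.

```python
from dataclasses import dataclass
from typing import List


def c_index(row: int, col: int, ncols: int) -> int:
    """Offset of (row, col) in a flat row-major (C-ordered) array."""
    return row * ncols + col


def fortran_index(row: int, col: int, nrows: int) -> int:
    """Offset of (row, col) in a flat column-major (Fortran-ordered) array."""
    return col * nrows + row


# "Monolithic" layout: one flat array per property for the whole frame.
# All key names and values here are hypothetical.
frame = {
    "symbols": ["O", "H", "H"],
    "xyz": [0.0, 0.0, 0.117, 0.0, 0.757, -0.469, 0.0, -0.757, -0.469],
    "charges": [-0.8, 0.4, 0.4],
}


def atom_xyz(frame: dict, iatom: int) -> List[float]:
    """Coordinates of one atom, assuming a row-major (natom, 3) array."""
    start = c_index(iatom, 0, 3)
    return frame["xyz"][start:start + 3]


@dataclass
class Atom:
    """Per-atom ("object-oriented") view of the same data."""
    symbol: str
    xyz: List[float]
    charge: float


def to_atoms(frame: dict) -> List[Atom]:
    """Regroup the monolithic arrays so that each atom carries its own data."""
    return [
        Atom(symbol, atom_xyz(frame, i), frame["charges"][i])
        for i, symbol in enumerate(frame["symbols"])
    ]


def geom_atom_xyz(xyz: List[float], natom: int, igeom: int, iatom: int) -> List[float]:
    """Coordinates of one atom in one geometry, assuming a flat row-major
    (ngeom, natom, 3) array, i.e. a leading geometry "pre-index"."""
    start = (igeom * natom + iatom) * 3
    return xyz[start:start + 3]


atoms = to_atoms(frame)
# Two copies of the same geometry stand in for a two-step trajectory.
trajectory = frame["xyz"] + frame["xyz"]
```

With these definitions, atom_xyz(frame, 1) and atoms[1].xyz return the same three numbers, which is the frame.xyz[4] vs. frame.atoms[4].xyz distinction from the summary. The two index helpers show why the ordering has to be fixed once for the whole schema: the same (row, col) pair maps to different offsets under the two conventions.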
We had a few discussions about provenance, the storage of which will map to metadata. See here <https://docs.google.com/document/d/11-Q9UpcgCOf_ssKlyKLzhNf8C7g-QCnzElP8hMvwG7g/edit#heading=h.exa5ufsjljlk> for an example from Geoff and myself.

- The path forward is not super clear, and there has been little activity in the schema repository. Bob Hanson pledged a sort of living implementation for the community to discuss, which is already available. I had also agreed to do the same thing.

My cclib-related questions and proposal:

- Is our Molden writer feature-complete? Along with all the CJSON work already done, it will make a great starting template. If it isn’t, we should finish that first. As an aside, I have bits of a Molden reader and reorganizer somewhere…
- We have a request in issue 451 <https://github.com/cclib/cclib/issues/451> to add more things to metadata. There is currently no overlap with what Bob has added to Jmol. I propose we just start adding things to metadata within some reasonable structure. It looks like provenance and input-related things will go entirely in metadata? We could discuss it in the issue thread, since we have an actual user.

Eric

On Mon, Nov 20, 2017 at 5:55 PM, Karol Langner <kar...@gm...> wrote:

> Yes, looking forward to being there!
>
> On Fri, Nov 17, 2017 at 8:02 AM, Eric Berquist <er...@pi...> wrote:
>
>> Based on the schedule and working group outlines, I think we will learn a
>> lot more by the time the workshop is done, and I'll write up a
>> detailed report for us. Right now, I think the most we can ask for is a
>> well-defined schema with good documentation, which would be more useful to
>> us than monetary support.
>>
>> I can't think of any other input right now specifically for MolSSI. We're
>> all reasonably responsive for communication, in case they require something
>> from us.
>> Since we've been discussing it recently with external pushes to
>> add more attributes and metadata, we should set a target date for releasing
>> v2.0. I think it should be either when we have an initial implementation of
>> the MolSSI spec, or the end of spring 2018, whichever comes first. That
>> means we should set a date for v1.5.3, maybe by the end of December.
>>
>> Karol, I noticed your name as part of some working groups. Are you
>> actually attending in person at all?
>>
>> Eric
>>
>> On Sun, Oct 15, 2017 at 3:59 AM, Karol Langner <kar...@gm...>
>> wrote:
>>
>>> I think this is a valuable initiative, please push it forward as much as
>>> possible! Is there any specific input or decision from us (cclib as a
>>> whole) that you would like on this topic?
>>>
>>> On Fri, Oct 13, 2017 at 10:41 AM, Eric Berquist <er...@pi...> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I've been invited to attend a MolSSI workshop on quantum chemistry
>>>> schema at LBNL on 11/30 and 12/1; I've attached a PDF of the solicitation.
>>>> Here is Daniel Smith's email to me:
>>>>
>>>>
>>>>> During a MolSSI Interoperability Workshop this year, one topic that
>>>>> came up was an interoperable schema between various quantum chemistry
>>>>> programs so that users and developers could have a unified interface to
>>>>> move data in and out of these very large programs as opposed to processing
>>>>> ASCII files and building custom inputs. To this end, we have been tweaking
>>>>> a base schema and talking to the creators of the many different schema
>>>>> already out there in the hope of unifying these diverse groups. We hope to
>>>>> pull in approximately 30 quantum chemistry developers from a broad set of
>>>>> backgrounds and programs to make this a reality.
>>>>>
>>>>> The current version and primary discussion of the schema can be found
>>>>> on GitHub here: https://github.com/MolSSI/QC_JSON_Schema
>>>>>
>>>>> We would encourage your entire community to discuss the schema in its
>>>>> current form in order to spur more discussion and tune the overall scope of
>>>>> the schema on the GitHub page, otherwise feel free to email me back
>>>>> personally if you have any questions. For the workshop participant, we are
>>>>> looking for one developer that would represent your community to help
>>>>> finalize the schema and decide on future governance and communication plans.
>>>>>
>>>>
>>>> As of right now, I am not representing cclib since we should come to
>>>> some consensus decision about our path. I do feel it gives us more exposure,
>>>> which is good, and would push development a bit, which could be a pro or a
>>>> con. We are well-positioned to implement (now or soon) all of their
>>>> requirements
>>>> (https://github.com/MolSSI/QC_JSON_Schema/blob/master/Requirements.md),
>>>> especially for QM packages that will certainly not support the schema
>>>> directly. A substantial amount of work was already done by Sanjeed during
>>>> GSoC last year as part of CJSON, which itself is being unified. Our
>>>> transition to more modular attributes, which we've already started
>>>> discussing, can only make this easier.
>>>>
>>>> Their repository is just a few Markdown files and is worth reading. As
>>>> far as what our obligation would be, I think it would be to implement their
>>>> spec, with their development assistance if need be. Implicit in my
>>>> invitation to the workshop is that I'd do most of the heavy lifting. In
>>>> particular, since large data (MO coefficients, densities, response vectors,
>>>> ...) will need to be stored, we will probably need an HDF5 interface that
>>>> can be optional, similar to how our other bridges are already optional. One
>>>> question is whether or not there will be a fallback non-binary
>>>> representation for these fields.
>>>>
>>>> Please let me know your thoughts/questions/suggestions/concerns.
>>>>
>>>> Eric
>>>>
>>>> _______________________________________________
>>>> cclib-devel mailing list
>>>> ccl...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/cclib-devel