[GeneX-dev] ["Miller, Michael" <MMiller@rii.com>] Tech Meeting Notes and Teleconference 3/14/01?

Hey All,

This is the report from the OMG meeting that just happened in Irvine,
CA. It was a surprisingly productive meeting.

I still feel that the open source process (release early, release
often) is a better approach than the OMG process. But since many
corporate players feel very uncomfortable with that approach, I guess
the OMG process is as good as we will get.

Executive Summary
=================
* We were extremely productive and were able to agree on big pieces
  of the data format. These pieces had to do with sample tracking and
  higher-level annotations.
* We spent most of the first third of the time discussing the glossary
  and agreeing on nomenclature. This was a GoodThing (TM). We now have
  a fairly broad group of academics (MGED) and three major corporate
  players (Rosetta, NetGenix, and Agilent) agreeing on terminology. It
  is my assigned duty to write up a new glossary, and when I do I will
  distribute it to this list.
* The big discussion came down to the actual data encoding for the
  spot values. The two major options were Rosetta's very verbose but
  explicit row-by-row XML encoding (much like GeneXML) or MAML's very
  loose, flexible, and untested matrix approach. Paul and I agreed to
  implement some examples of the matrix approach before we decide
  anything, but it seemed likely that support for the row-based
  approach would be mandatory and the matrix approach optional.
* We decided that a description of spots on an ArrayLayout (which we
  agreed to call an ArrayPattern) needed to include at least three
  levels: 1) the SequenceFeature -- the nucleic acid (or other stuff)
  bound to the array; 2) the Reporter -- the higher-level entity that
  the SequenceFeature was meant to represent (we used to call this the
  *canonical* sequence feature); 3) the BiosequenceCluster -- for
  example a UniGene set.
* We discovered that Rosetta's and NetGenix's submissions had missed
  the concept of between-array comparisons (e.g. ratios of two
  different time points). Michael Miller was going to address this.
* Rosetta had an elegant way to encode spot position which did not
  require massive numbers of attributes.
* There was an enormous amount of technical discussion about what
  would be the *normative* output of this group's effort, i.e. what is
  the actual document that will hold the specification. Should it be
  UML, a CORBA IDL, or XML? We decided that we will definitely produce
  UML, and that the UML can be used to generate an IDL and an XML
  specification, but we are likely to produce both the IDL and XML
  specifications as well.
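
To make the row-based versus matrix trade-off concrete, here is a
rough Python sketch of the two styles. Everything in it is made up for
illustration: the element names (SpotValues, Spot, SpotMatrix), the
attribute names, and the sample intensities are my assumptions, not
what Rosetta's submission or MAML actually specify.

```python
import xml.etree.ElementTree as ET

# Hypothetical spot measurements: (row, column, intensity).
spots = [(1, 1, 120.5), (1, 2, 98.0), (2, 1, 101.25), (2, 2, 87.5)]


def encode_rows(spots):
    """Row-based style: one explicit, self-describing element per spot."""
    root = ET.Element("SpotValues")
    for row, col, value in spots:
        ET.SubElement(root, "Spot", row=str(row), col=str(col), value=str(value))
    return ET.tostring(root, encoding="unicode")


def encode_matrix(spots):
    """Matrix style: dimensions declared once, values as delimited text."""
    n_rows = max(row for row, _, _ in spots)
    n_cols = max(col for _, col, _ in spots)
    grid = [[""] * n_cols for _ in range(n_rows)]
    for row, col, value in spots:
        grid[row - 1][col - 1] = str(value)
    root = ET.Element("SpotMatrix", rows=str(n_rows), cols=str(n_cols))
    root.text = "\n".join(" ".join(line) for line in grid)
    return ET.tostring(root, encoding="unicode")


def decode_matrix(xml_text):
    """Recover (row, column, intensity) tuples from the matrix style."""
    root = ET.fromstring(xml_text)
    n_cols = int(root.get("cols"))
    decoded = []
    for r, line in enumerate(root.text.strip().splitlines(), start=1):
        cells = line.split()
        if len(cells) != n_cols:
            raise ValueError(f"row {r} has {len(cells)} cells, expected {n_cols}")
        for c, cell in enumerate(cells, start=1):
            decoded.append((r, c, float(cell)))
    return decoded


rows_xml = encode_rows(spots)
matrix_xml = encode_matrix(spots)
```

The row form repeats the position with every value, so a reader needs
no outside knowledge to interpret it. The matrix form is much smaller,
but a value's position is implied only by where it sits in the text,
which is why it needs real test examples before anyone relies on it.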

So, all in all, we learned a lot, and it was worth the trip.
jas.