From: Mark F. B. <sa...@co...> - 2004-12-07 14:22:43
We have made good progress on AnIML, but I think it may be worth a short
pause for a major structural review before we push ahead to release AnIML
1.0. We have too little XML Schema expertise, and especially too few people
who are expert in all of analytical chemistry, XML Schema, and programming.
It might be worthwhile to ask Murray Rust for an opinion. Remember that the
XML instances may never exist as documents, but only as byte streams across
a network as information moves from its container to a viewer. I would not
enjoy writing a fully functional AnIML viewer based on our current schema.
Suggestions for Improvement of the AnIML Schema
1. Improve database-table-to-schema translation, perhaps using a simple
data-table schema with relations instead of hierarchical encoding. Add a
database key (long or int) to every table as its primary key. I can provide
an example alternate schema constructed this way.
2. AnIML is too complex. It is nearly impossible to grasp it without
XMLSpy, and even then it is frustratingly difficult. Consider trimming down
to what we know we need, permitting core extension.
a. Remove recursive nesting of PageSet, ExperimentStepSet, and
ParameterCategorySet until it is proven necessary.
b. Remove empty containers that confuse and complicate an already
complicated schema. These are simply containers of 1 to many containers
which themselves hold no information. However, as many of these have
SignableItems, another approach to signing might be needed. Needs thought.
This would trim the number of elements from 45 to 33, but some of these
appear multiple times or recursively, so the effect would be greater than it
appears.
i. AnIML
ii. SampleSet: container of Sample
iii. ParameterCategorySet: container of ParameterCategory
iv. ParameterSet: container of Parameter
v. Template (simply add a flag, isTemplate, to ExperimentStep): either
wrong or confused in schema as it derives by extension
vi. SamplesUsed: container of Sample
vii. References: container of IndexRef
viii. ExperimentStepSet and possibly MeasurementData: container of
ExperimentStep
ix. PageSet: container of Page
x. VectorSet: container of Vector
xi. AuditTrail (or LogEntry): container of LogEntry
xii. Signatures: container of Signature
c. Consider removing attributes and moving info within complexTypes.
3. Make AnIML less flexible in the ways it can be filled, or there will
be 10 ways to insert chromatographic data. Use stronger typing.
4. Make the method of extension much clearer, providing examples in the
Specification.
5. Consider reworking referencing for so-called hierarchical data,
making it easier to get the point index, the point value, as well as
pointRanges, pointValueRanges, etc. for spectra from a chromatogram (for
example).
6. Add information on how to validate, certify and test AnIML
extensions.
7. Clarify the rationale for using Technique xml files instead of
Technique schema extensions of the core; it is currently not recorded
anywhere.
8. Trim JCAMP from Technique and create a JCAMP extension.
9. Add filename and file URL references if they are not already there
somewhere (I can't find them).
10. Values encoded in Base64 are not limited or specified; there is
currently no way to know into what we should decode our Base64 (float32,
float64, etc.).
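
To illustrate item 10, here is a minimal Python sketch of the ambiguity. It
is not part of AnIML: the function name decode_values, the struct type
codes used to name the value type, and the little-endian byte order are all
assumptions made for the example.

```python
import base64
import struct


def decode_values(payload: str, fmt: str) -> tuple:
    """Decode a Base64 string into numbers of one declared type.

    fmt is a struct type code: "f" for float32, "d" for float64.
    Little-endian byte order is an assumption made for this sketch.
    """
    raw = base64.b64decode(payload)
    size = struct.calcsize("<" + fmt)
    if len(raw) % size != 0:
        raise ValueError(f"{len(raw)} bytes is not a multiple of {size}")
    count = len(raw) // size
    return struct.unpack(f"<{count}{fmt}", raw)


# Two float64 values, encoded the way a writer might emit them.
payload = base64.b64encode(struct.pack("<2d", 1.0, 2.0)).decode("ascii")

as_f64 = decode_values(payload, "d")  # the type the writer intended
as_f32 = decode_values(payload, "f")  # a wrong guess still decodes

print(as_f64)
print(as_f32)
```

Both calls succeed, but the float32 reading returns four values that have
nothing to do with the two that were written. A reader cannot detect the
mistake from the bytes alone, which is why the value type (and probably the
byte order) would have to be stated alongside the encoded data.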
I suspect with help we could amplify this list,
regards, Mark