Thread: [Animl-develop] Signatures, Inheritable, Produced/Consumed,

Hi Maren and Mark,

Please see my earlier email about the XYZSet container tags. It may be 
beneficial to keep them.

> However, I'm wondering whether we really need the possibility to sign 
> single elements in the AnIML file, or whether it would suffice to be able 
> to sign the whole document. Any thoughts on that?

Signing only whole documents may be too strict a limitation. The 
philosophy in most Part 11-compliant implementations is that everybody 
signs the data she/he is responsible for. So one person (e.g. a chemist) 
could have created the samples and somebody else (e.g. a lab technician) 
could run the experiment. A third person is responsible for calibration, 
etc. So that's one thing.

Another point is that parts of a document could be generated at 
different points in time. Here it is a critical feature to preserve the 
original signatures, even when data is later added in other parts of the 
document (not covered by the initial signature).

A third and very interesting point: in the future, we could consider 
instruments that directly sign their result data. I talked to a few 
folks (instrument manufacturers) at the LIMS Conference in Barcelona 
last September, and this is something they get asked about in regulated 
environments. Right now, only AnIML provides a (non-proprietary) 
solution to this problem.

> - In the Technique Schema, we have an attribute "maxOccurs" and an 
> attribute "modality". It would be more consistent to replace "modality" 
> with "minOccurs" (=0 or 1).

Sounds good.

> - What is the exact meaning of the attributes "inheritable" and 
> "upwardsInherited"? These seem rather technical to me; what do we need 
> them for?

"inheritable" is used with nested techniques. It indicates whether a 
technique can inherit a sample from the surrounding experiment step.

Example:

LC
  +-- MS (inherits sample from LC)

Here we don't have to explicitly declare the sample consumed by the MS, 
because the MS ExperimentStep is attached to the chromatogram page and 
refers to a point (or range) of the LC time axis. So we know where the 
sample comes from without having to create an explicit entry for it in 
the SampleSet section.

For this to work, the MS technique definition needs to set 
"inheritable=true" for the run sample.
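
A minimal sketch of the lookup this implies, assuming a simplified 
in-memory model. The Step class, its fields, and resolve_sample() are 
hypothetical names for illustration only; they are not part of the AnIML 
schema or of any existing library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    """Hypothetical stand-in for an ExperimentStep (not the real schema)."""
    name: str
    sample: Optional[str] = None     # explicitly declared sample, if any
    inheritable: bool = False        # may the sample come from the parent?
    parent: Optional["Step"] = None  # surrounding experiment step


def resolve_sample(step: Step) -> Optional[str]:
    """Return the step's own sample, or the inherited one if allowed."""
    if step.sample is not None:
        return step.sample
    if step.inheritable and step.parent is not None:
        return resolve_sample(step.parent)
    return None


lc = Step("LC", sample="S1")
ms = Step("MS", inheritable=True, parent=lc)       # no explicit sample
ms_strict = Step("MS", inheritable=False, parent=lc)

inherited = resolve_sample(ms)         # falls back to the LC sample
not_inherited = resolve_sample(ms_strict)  # nothing to fall back to
```

With inheritable set, the nested MS step resolves to the LC sample; 
without it, the sample would have to be declared explicitly.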

> - What is the benefit of assigning "consumed" or "produced" to a sample?

It allows us to easily track the material flow in the experiment. We can 
see how a sample is created by looking at the ExperimentStep that 
"produced" it. We can also find out what happened to it by looking at 
all the steps that "consumed" the sample.

One important consequence: If a sample is "consumed" in a step, the 
result data pages of that step will tell us something we measured about 
the sample. If a sample is "produced", that step did not measure the 
sample but merely produced the material, so we will typically need to 
take additional steps to measure its characteristics.

So we can chain together experiment steps using the produced/consumed 
concept. This very feature allows us to cover the lab workflow, so it's 
one of the most important attributes in the core schema. :-)
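
To illustrate the chaining, here is a rough sketch that indexes sample 
references by role and walks backwards from a sample to the steps that 
led to it. The flat list of (step, role, sample) tuples, the step and 
sample names, and both functions are assumptions made for this example; 
a real AnIML document would have to be parsed into such a form first.

```python
from collections import defaultdict

# Hypothetical flattened view: (step name, role, sample id)
refs = [
    ("Synthesis", "produced", "S1"),
    ("LC", "consumed", "S1"),
    ("LC", "produced", "S1-fraction"),
    ("MS", "consumed", "S1-fraction"),
]


def index_flow(refs):
    """Map each sample to the step that produced it and the steps that consumed it."""
    produced_by = {}
    consumed_by = defaultdict(list)
    for step, role, sample in refs:
        if role == "produced":
            produced_by[sample] = step
        elif role == "consumed":
            consumed_by[sample].append(step)
    return produced_by, dict(consumed_by)


def upstream_steps(sample, refs):
    """Walk backwards from a sample through the steps that led to it."""
    produced_by, _ = index_flow(refs)
    consumed_in = defaultdict(list)  # step -> samples it consumed
    for step, role, s in refs:
        if role == "consumed":
            consumed_in[step].append(s)
    chain, todo, seen = [], [sample], set()
    while todo:
        current = todo.pop()
        step = produced_by.get(current)
        if step is None or step in seen:
            continue  # no known origin, or step already visited
        seen.add(step)
        chain.append(step)
        todo.extend(consumed_in[step])
    return chain


produced_by, consumed_by = index_flow(refs)
lineage = upstream_steps("S1-fraction", refs)
```

Here produced_by answers "how was this sample created?" and consumed_by 
answers "what happened to it afterwards?".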

> - Parameters are usually not stored as binary data. Is it useful to have 
> float and double data types for them? 

Yes. Using float and double as a data type does not mean that the data 
is stored in binary/base64. The digits are stored in plain text. Some 
instruments deliver IEEE floating point values, so having these types is 
certainly a good idea.

> Storing non-binary data as floats or 
> doubles incurs a loss of precision as the decimal numbers are converted 
> internally to the closest binary number, which is not always exact. We 
> might want to be able to use the XML datatype xs:decimal as well.

You are right about the rounding. Perhaps I misunderstood your previous 
paragraph.
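
The rounding is easy to demonstrate. The sketch below uses Python's 
decimal module purely as an illustration of binary versus decimal 
storage; it is not an AnIML API, and the value "0.1" is just an example.

```python
from decimal import Decimal

text = "0.1"                  # digits as they would appear in the XML text
as_double = float(text)       # nearest IEEE 754 double
as_decimal = Decimal(text)    # exact decimal value

# Decimal(float) exposes the exact binary value the double actually holds.
double_is_exact = Decimal(as_double) == as_decimal

# The same effect shows up in simple arithmetic.
double_sum_ok = (0.1 + 0.2 == 0.3)
decimal_sum_ok = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(double_is_exact, double_sum_ok, decimal_sum_ok)
```

The double only approximates 0.1, while the decimal value round-trips 
the original digits exactly.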

I've looked at xs:decimal and it seems interesting for us. From what I 
gather, this type supports arbitrary-precision numbers. This is somewhat 
problematic to handle in software, since most programming languages have 
no built-in primitive type that maps directly to it. I know that Java, 
C++, and .NET have classes available to encapsulate it. Nevertheless, 
implementation tends to be hairy.

But my gut feeling would be that xs:decimal should be put in.

> Another issue (which we probably can't solve in XML) is that you cannot 
> specify how high the precision of your data is. "1.200" is something 
> different from "1.2" as the two zeroes tell you that the precision is 
> three decimal digits. From what I've found in an XML book, however, it 
> looks like XML views the trailing zeroes as "non-significant".

That's true. It's also in the Schema spec for the decimal type.
http://www.w3.org/TR/xmlschema-2/#decimal
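
As an analogy only, Python's Decimal behaves in a similar way: the two 
literals compare as the same value, even though the written digits 
differ. This is a sketch of the behaviour, not a statement about how any 
XML parser handles xs:decimal.

```python
from decimal import Decimal

a = Decimal("1.200")
b = Decimal("1.2")

same_value = (a == b)            # both denote the same number
kept_digits = str(a)             # the digits as originally written
normalized = str(a.normalize())  # trailing zeros stripped

print(same_value, kept_digits, normalized)
```

So if the number of significant digits matters, it would have to be 
carried separately or the original text preserved, because the numeric 
value alone does not retain it.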

Best regards,
Burkhard
