From: Burkhard S. <b_...@us...> - 2005-01-27 01:42:21
Hi Maren and Mark,

please see my earlier email about the XYZSet container tags. It may be
beneficial to keep them.

> However, I'm wondering whether we really need the possibility to sign
> single elements in the AnIML file, or whether it would suffice to be able
> to sign the whole document. Any thoughts on that?

Signing only whole documents may be too strict a limitation. The
philosophy in most Part 11-compliant implementations is that everybody
signs the data she/he is responsible for. So one person (e.g. a chemist)
could have created the samples and somebody else (e.g. a lab technician)
could run the experiment. A third person is responsible for calibration,
etc. So that's one thing.

Another point is that parts of a document could be generated at
different points in time. Here it is a critical feature to preserve the
original signatures, even when data is later added in other parts of the
document (not covered by the initial signature).

A third, and very interesting point: In the future, we could consider
instruments that directly sign their result data. I've talked to a few
folks (instrument manufacturers) at the LIMS Conference in Barcelona
last September, and this is something they get asked about in regulated
environments. And right now, only AnIML provides a (non-proprietary)
solution to this problem.

> - In the Technique Schema, we have an attribute "maxOccurs" and an
> attribute "modality". It would be more consistent to replace "modality"
> with "minOccurs" (=0 or 1).

Sounds good.

> - What is the exact meaning of the attributes "inheritable" and
> "upwardsInherited"? These seem rather technical to me; what do we need
> them for?

"inheritable" is used with nested techniques. It indicates whether a
technique can inherit a sample from the surrounding experiment step.

Example:

  LC
   +-- MS (inherits sample from LC)

Here we don't have to explicitly declare the sample consumed by the MS,
because the MS ExperimentStep is attached to the chromatogram page and
refers to a point (or range) of the LC time axis. So we know where the
sample comes from without having to create an explicit entry for it in
the SampleSet section. For this to work, the MS technique definition
needs to set "inheritable=true" for the run sample.

> - What is the benefit of assigning "consumed" or "produced" to a sample?

It allows us to easily track the material flow in the experiment. We can
see how a sample is created by looking at the ExperimentStep that
"produced" it. We can also find out what happened to it by looking at
all the steps that "consumed" the sample.

One important consequence: If a sample is "consumed" in a step, the
result data pages of that step will tell us something we measured about
the sample. If a sample is "produced", that step did not measure the
sample but merely produced the material; i.e. we will typically need to
take additional steps to measure its characteristics.

So we can chain together experiment steps using the produced/consumed
concept. This very feature allows us to cover the lab workflow, so it's
one of the most important attributes in the core schema. :-)

> - Parameters are usually not stored as binary data. Is it useful to have
> float and double data types for them?

Yes. Using float and double as a data type does not mean that the data
is stored in binary/base64. The digits are stored in plain text. Some
instruments deliver IEEE floating point values, so having these types is
certainly a good idea.

> Storing non-binary data as floats or
> doubles incurs a loss of precision as the decimal numbers are converted
> internally to the closest binary number, which is not always exact. We
> might want to be able to use the XML datatype xs:decimal as well.

You are right on the rounding. Perhaps I got you wrong in the last
paragraph.

I've looked at xs:decimal and it seems interesting for us. From what I
gather this type supports arbitrary precision numbers. This is somewhat
problematic to handle in software, since most programming languages have
no built-in primitive type that maps to it directly. I know that Java,
C++, and .NET have classes available to encapsulate it. Nevertheless,
implementation tends to be hairy. But my gut feeling would be that
xs:decimal should be put in.

> Another issue (which we probably can't solve in XML) is that you cannot
> specify how high the precision of your data is. "1.200" is something
> different from "1.2" as the two zeroes tell you that the precision is
> three decimal digits. From what I've found in an XML book, however, it
> looks like XML views the trailing zeroes as "non-significant".

That's true. It's also in the Schema spec for the decimal type.
http://www.w3.org/TR/xmlschema-2/#decimal

Best regards,
Burkhard
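To make the float vs. decimal point from the message above concrete, here is a minimal sketch in Python. It uses Python's standard `decimal.Decimal` class purely as a stand-in for an arbitrary-precision decimal type; it is an illustration, not a statement about how any AnIML tool handles xs:decimal. All variable names are made up for the example.

```python
from decimal import Decimal

# Binary floating point: "1.200" and "1.2" parse to the same double,
# so the trailing zeros (the stated precision) are lost on reading.
float_a = float("1.200")
float_b = float("1.2")
floats_identical = repr(float_a) == repr(float_b)

# Binary floats also cannot represent many decimal fractions exactly.
float_sum_exact = (0.1 + 0.2 == 0.3)

# A decimal type keeps decimal fractions exact.
decimal_sum_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Python's Decimal compares 1.200 and 1.2 as equal in value, but it still
# remembers how many fractional digits were written in the text.
dec_a = Decimal("1.200")
dec_b = Decimal("1.2")
same_value = (dec_a == dec_b)
fraction_digits_a = -dec_a.as_tuple().exponent
fraction_digits_b = -dec_b.as_tuple().exponent
```

Note that retaining the written digits is a property of this particular Python class. As the Schema spec linked above says, xs:decimal itself treats trailing zeros as non-significant, so a file format that needs to record precision would have to carry it some other way.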
From: Burkhard S. <b_...@us...> - 2005-01-27 00:08:40
Hi Anh Dao and Maren,
the ds:Reference mechanism is the right way to go. That way you can
point to the item(s) you want signed.
I spent some more time thinking about whether we really need the
container tags (XYZSet). The following example came to mind. Let's assume
we have a VectorSet vs1 containing two Vectors v1 and v2, like this:
Scenario 1 (with XYZSets):
  <Page>
    <VectorSet id="vs1">
      <Vector id="v1"> ... </Vector>
      <Vector id="v2"> ... </Vector>
    </VectorSet>
    ...
  </Page>
Scenario 2 (without XYZSets):
  <Page>
    <Vector id="v1"> ... </Vector>
    <Vector id="v2"> ... </Vector>
    ...
  </Page>
Now the question is: Is a signature over [vs1] in scenario 1 equivalent
to a signature over [v1, v2] in scenario 2? The surprising answer is "no".
In both scenarios, the signatures prevent the modification of the data
in v1 and v2. But in scenario 2, we could easily introduce another
Vector v3 without invalidating the signature - because v3 is outside the
scope of the signature:
  <Page>
    <Vector id="v1"> ... </Vector>
    <Vector id="v2"> ... </Vector>
    <Vector id="v3"> ... </Vector>
    ...
  </Page>
In scenario 1, doing this would make the signature invalid (which is the
desired behavior).
So taking out the container tags (XYZSets) would significantly weaken
the security of our digital signature mechanism: It allows data to be
added without the ability to detect it (unless you sign the entire
surrounding element - which will not always be possible). This will make
the 21 CFR Part 11 folks very unhappy.
Therefore I would tend to leave the container tags in.
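The difference in signature scope between the two scenarios can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: a plain SHA-256 digest over each referenced element stands in for a ds:Reference digest, and there is no canonicalization, key, or real XML-DSig processing. The helper names (`digest`, `find_by_id`, `sign`, `verify`) and the vector contents are invented for the example.

```python
import hashlib
import xml.etree.ElementTree as ET

def digest(element):
    """SHA-256 over the serialized element (stand-in for a Reference digest)."""
    return hashlib.sha256(ET.tostring(element)).hexdigest()

def find_by_id(root, ident):
    """Return the first element whose id attribute equals ident, or None."""
    for el in root.iter():
        if el.get("id") == ident:
            return el
    return None

def sign(root, ids):
    """Record one digest per referenced id."""
    return {i: digest(find_by_id(root, i)) for i in ids}

def verify(root, signed):
    """True if every referenced element still exists with the same digest."""
    for ident, expected in signed.items():
        el = find_by_id(root, ident)
        if el is None or digest(el) != expected:
            return False
    return True

# Scenario 1: vectors wrapped in a VectorSet; the signature covers vs1.
with_sets = ET.fromstring(
    '<Page><VectorSet id="vs1">'
    '<Vector id="v1">1 2</Vector><Vector id="v2">3 4</Vector>'
    '</VectorSet></Page>'
)
# Scenario 2: no container; the signature covers v1 and v2 individually.
without_sets = ET.fromstring(
    '<Page><Vector id="v1">1 2</Vector><Vector id="v2">3 4</Vector></Page>'
)

sig1 = sign(with_sets, ["vs1"])
sig2 = sign(without_sets, ["v1", "v2"])

# Someone adds a third vector v3 after signing.
ET.SubElement(with_sets.find("VectorSet"), "Vector", {"id": "v3"}).text = "5 6"
ET.SubElement(without_sets, "Vector", {"id": "v3"}).text = "5 6"

scenario1_valid = verify(with_sets, sig1)     # container content changed
scenario2_valid = verify(without_sets, sig2)  # v1 and v2 are untouched
```

In this sketch the signature over vs1 no longer verifies once v3 is added, while the signatures over v1 and v2 still verify, which is the undetected addition described above.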
Best wishes,
Burkhard
Anh Dao Nguyen wrote:
>
> Hi Maren,
>
> I have developed a small digital signature signing and verification
> program which allows signing numerous data objects (elements) all at
> once. If the purpose of the “XYZSet” containers was to be able
> to sign several elements of similar type at one time, then they would be
> unnecessary. You are right, we can theoretically remove all the
> Set containers from the Core Schema.
>
> At the beginning of the application development, I used the
> “Object” element concept (according to paragraph 2.3 of the
> DSig Specification) to sign multiple data objects. That means the data
> being signed must be placed inside the “Object” element.
>
> For instance:
> ….
> <Sample derived="false" *id="signitem1"* sampleID="SRM936">
> …
> </Sample>
> …….
> <Signatures>
> <Signature Id="sig0">
> <ds:SignedInfo>
> <ds:Reference URI="*#signObject1*">
> …
> </ds:Reference>
> </ds:SignedInfo>
> <ds:*Object* Id="*signObject1*">
> <Sample derived="false" *id="signitem1"*
> sampleID="SRM936">
> …
> </Sample>
>
> </ds:Object>
> </Signature>
> </Signatures>
>
>
> As everyone already knows, each ID must be unique within an
> AnIML document. Therefore I had to remove the original data of the
> signed element (outside of the signature element) from the AnIML file.
> Otherwise, we would have a duplicate ID in the AnIML file (id="signitem1").
> Finally, I decided to use the “Reference” element within the “SignedInfo”
> element to refer to the signed data by using the “URI” attribute (see the
> attached file).
>
> ….
> <Sample derived="false" id="*signitem1*" sampleID="SRM936">
> …
> </Sample>
> …….
> <Signatures>
> <Signature Id="sig0">
> <ds:SignedInfo>
> <ds:Reference *URI="#signitem1"*>
> …
> </ds:Reference>
> </ds:SignedInfo>
>
> </Signature>
> </Signatures>
>
>
> Anh Dao
>
>
>
>
> At 05:57 AM 1/26/2005, Mar...@wa... wrote:
>
>> Hi Mark,
>>
>> > But let's go on and see if we can make a better hierarchical model.
>> OK. I had a look at the XML-DSIG Specification
>> (http://www.w3.org/TR/xmldsig-core/) yesterday, and it looks like we can
>> get rid of the "XYZSet" containers in the Core Schema:
>> The idea of these was, as far as I remember, to be able to sign several
>> elements of the same kind at once.
>> XML-DSIG has an element called "Signature" with several children. The one
>> that tells you what you are signing is "SignedInfo/Reference", which has
>> an optional "URI" attribute where you state the URI of the object you're
>> signing. Now, paragraph 2.3 of the Spec says that you can include
>> multiple
>> "Reference" elements within "SignedInfo" to sign multiple data objects.
>> Judging from this, we don't need the containers any more. I made a new
>> core schema based on animl-core 1.05 where I took the container elements
>> out:
>>
>> By the way, I am not sure any more whether this is the version with the
>> corrected references or not. Maybe Anh Dao can help clarify this.
>>
>> However, I'm wondering whether we really need the possibility to sign
>> single elements in the AnIML file, or whether it would suffice to be able
>> to sign the whole document. Any thoughts on that?
>>
>> BTW, The technique schema remains untouched by the changes I made.
>>
>> I have some more questions/suggestions about schema details:
>>
>> - In the Technique Schema, we have an attribute "maxOccurs" and an
>> attribute "modality". It would be more consistent to replace "modality"
>> with "minOccurs" (=0 or 1).
>>
>> - What is the exact meaning of the attributes "inheritable" and
>> "upwardsInherited"? These seem rather technical to me; what do we need
>> them for?
>>
>> - What is the benefit of assigning "consumed" or "produced" to a sample?
>>
>> - Parameters are usually not stored as binary data. Is it useful to have
>> float and double data types for them? Storing non-binary data as
>> floats or
>> doubles incurs a loss of precision as the decimal numbers are converted
>> internally to the closest binary number, which is not always exact. We
>> might want to be able to use the XML datatype xs:decimal as well.
>> Another issue (which we probably can't solve in XML) is that you cannot
>> specify how high the precision of your data is. "1.200" is something
>> different from "1.2" as the two zeroes tell you that the precision is
>> three decimal digits. From what I've found in an XML book, however, it
>> looks like XML views the trailing zeroes as "non-significant".
>>
>>
>> > I don't think I had a response on the need for the concept of an
>> > "analysis" in AnIML
>> I'm going to look into this soon. Maybe Mark Mullins can tell us about
>> his
>> experiences implementing AnIML?
>>
>>
>> Maren.
>>
>>
>> Mit freundlichen Grüßen / Best regards
>>
>> Dr. Maren Fiege
>> Product Manager
>>
>> --------------------------------------------------------------
>> Waters Informatics
>> Europaallee 27, D-50226 Frechen, Germany
>> Tel. +49 2234 9207 - 0 Fax. +49 2234 9207-99
>> Reply to: mar...@wa...
>> http://www.creonlabcontrol.com <http://www.creonlabcontrol.com/>
>> http://www.watersinformatics.net <http://www.watersinformatics.net/>
>> --------------------------------------------------------------
>> ===========================================================
>>
>> The information in this email is confidential, and is intended solely
>> for the addressee(s). Access to this email by anyone else is
>> unauthorized and therefore prohibited. If you are not the intended
>> recipient you are notified that disclosing, copying, distributing or
>> taking any action in reliance on the contents of this information is
>> strictly prohibited and may be unlawful.
>>
>> ===========================================================