Re: [pyasn1-users] Asn1Item.hasValue()/.isValue
From: Ilya E. <il...@gl...> - 2017-04-10 21:31:28
Hi Sergey,

> On 9 Apr 2017, at 20:43, Sergey Matveev <sta...@st...> wrote:
>
> *** Ilya Etingof <il...@gl...> [2017-04-09 21:22]:
>> Oh, right, that information is now lost… The `sq['certificates'].isValue` property returns True only if
>> there is a non-empty SET OF sequence and False otherwise.
>>
>> Could you please elaborate what is the use case? When it is useful to distinguish situation when optional value was not present in serialization from empty SET OF value being encountered there?
>
> The situation is to strictly check user's input, to allow only and only
> single representation of semantic context. For example if semantic
> context is "no certificates", then only 'certificates' field absence is
> allowable way to represent that context. So... some kind of "DER" for
> context -- only one way to reflect the context in binary form.
>
> Well, I assume it can be achieved with value constrained schemes, like
> SET OF with ValueSizeConstraint(1, ...), but in CMS CertificateSet
> structure are not constrained.
>
> If I save in database the context like 'user send me that kind (content
> type) of data (content itself), with the following signature, digest,
> bla bla bla and no certificates -- then I can reconstruct original
> received DER data based on that context. With ambiguity (because of CMS
> standard, not the DER or something related to pyasn1) of that
> representation I can not do that. So I have to deal with that standards
> and strictly verify user's input. Moreover if I set ValueSizeConstraint
> then I will get error that user's supplied data can not be decoded, that
> it does not reflect the schema -- rather scary exception that is not
> correct according to CMS standard.
>
> And at last, if component existence information is lost, then that means
> that I can not tell what exactly was in decoded ASN.1 data :-).

I see and I do not envy you….
%-\

How about this: what if we make the pyasn1 decoders build a map of asn1_object -> original_substrate, so that once the decoder is done you could look up the original DER encoding (possibly in the form of a TLV for convenience) from which a given asn1_object was recovered?

In your use case, such a map lookup for the "certificates" field would return:

a) empty "TLV" if no SET OF was encoded
b) zero "L" if an empty SET OF was encoded
c) non-zero "L" if a SET OF and its components were encoded

In pseudocode:

    sq, asn1_object_map, rest_of_substrate = decode(substrate, asn1Spec=SomeAsn1Structure())

    try:
        set_of_substrate = asn1_object_map[sq["certificates"].someUniqueAsn1ObjectId]
    except KeyError:
        print("certificates are not present")
    else:
        if set_of_substrate.length:
            print("certificates are present")
        else:
            print("certificates are empty")

Such a map might have other uses. For example, the encoder could pull fragments of substrate from it (whenever present) and use them instead of building new substrate from ASN.1 objects.

Another idea is that we could also reverse this map (so it would be substrate -> asn1_object) and give it to the decoder, which could then look up whole branches of the ASN.1 object tree by substrate fragments. In use cases like decode-modify-encode or decoding parts of an ASN.1 tree this may give a significant performance improvement.

WDYT?

>> Thing is that `sq['certificates'] is None` check was somewhat unreliable -- it depends on the order of the elements being recovered (only applicable to SET, not SEQUENCE).
>
> Could you please explain more deeply that? If I decode some bytes, then
> somehow I "is None" check can fail? I do not clearly understand that case.

So we assume that `sq["certificates"] is None` is an indication of absent SET OF substrate for the "certificates" field. That assumption may fail if there is any other field beyond the "certificates" one (which may not be the case here).
When that "next-to-certificates" field gets initialized, "certificates" would be initialized as well, thus failing our assumption. This may happen if "certificates" is optional (so it's not present in the encoding) but "next-to-certificates" is present. Note that I'm not referring to the CMS use case here, just thinking over this approach in general.

>> Secondly, user could accidentally initialize N-th component by setting N+M-th component in the SEQUENCE. This is the rationale for changing components initialization logic -- now components get initialized all at once on first reference to any component.
>
> Yeah, that is known problem :-). But only applicable to manual pyasn1
> objects initialization, is it? No problems here with decode()-ing?

Right, that should not happen, but it's never guaranteed. So it feels like shaky ground in the long run. ;-)
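
The three-way distinction proposed earlier in this message (field absent, empty SET OF, non-empty SET OF) can be sketched without pyasn1 at all. What follows is a minimal stdlib-only illustration, not the proposed decoder API: `read_tlv`, `map_components` and `certificates_state` are hypothetical helper names, the map is keyed by tag octet rather than by ASN.1 object, and the field of interest is assumed to carry the [0] context tag (0xA0), as `certificates` does in CMS SignedData. Only single-octet tags and definite lengths are handled.

```python
def read_tlv(data, offset=0):
    """Return (tag, value_octets, next_offset) for the TLV at `offset`."""
    if offset + 2 > len(data):
        raise ValueError("truncated TLV header")
    tag = data[offset]
    if tag & 0x1F == 0x1F:
        raise ValueError("high-tag-number form is not handled in this sketch")
    length = data[offset + 1]
    pos = offset + 2
    if length & 0x80:
        # Long form: low 7 bits give the number of length octets that follow.
        count = length & 0x7F
        if count == 0:
            raise ValueError("indefinite length is not valid DER")
        length = int.from_bytes(data[pos:pos + count], "big")
        pos += count
    end = pos + length
    if end > len(data):
        raise ValueError("truncated TLV value")
    return tag, data[pos:end], end


def map_components(substrate):
    """Map each top-level component tag of a SEQUENCE to its value octets.

    Simplification: a repeated tag overwrites the earlier entry.
    """
    tag, body, _ = read_tlv(substrate)
    if tag != 0x30:
        raise ValueError("expected a SEQUENCE")
    components = {}
    offset = 0
    while offset < len(body):
        tag, value, offset = read_tlv(body, offset)
        components[tag] = value
    return components


# Assumption: the optional field is tagged [0] IMPLICIT (constructed).
CERTIFICATES_TAG = 0xA0


def certificates_state(substrate):
    """Classify the optional field as 'absent', 'empty' or 'present'."""
    try:
        value = map_components(substrate)[CERTIFICATES_TAG]
    except KeyError:
        return "absent"   # no TLV was encoded at all
    return "present" if value else "empty"  # zero vs non-zero "L"
```

With hand-built substrates, `certificates_state(bytes.fromhex("3003020101"))` gives "absent", `bytes.fromhex("3005020101a000")` gives "empty", and `bytes.fromhex("3007020101a0023000")` gives "present". A real map would have to be built by the decoder itself and keyed per ASN.1 object, as suggested in the message.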