From: Burkhard S. <b_...@us...> - 2004-12-16 14:04:11
Mark,

>>>and EncodedDataSet to 0 to 1 per Vector. If we do NOT do that, the
>>>VectorSet length becomes problematic as there is no guarantee that all
>>>ValueSets in a Vector are the same length.
>>
>> I think my example illustrates that not all ValueSets need to share the
>> same length. We are currently allowing an unlimited number of ValueSets
>> per Vector to permit storage of data with "holes" / sparse data.
>
> I don't think that is completely true. The only ValueSet that absolutely
> requires a length is the AutoIncrementedValueSet as you cannot AutoIncrement
> without knowing how many times to increment!!

I agree that we need to know how many times to increment. How can we find
that out in the current structure? By calculating endOffset-startOffset in
the AutoIncrementedValueSet.

Proposal:
- make startOffset and endOffset required for all ValueSets (they are
  optional right now)
- leave the number of *ValueSets unbounded (as is)

Justification:
We need the offsets anyway in the case of sparse / non-continuous data. If
we make them mandatory, we can use them not only to "align the data points"
but also to determine the number of values to generate in the
AutoIncrementedValueSet.

> The number of increments is
> being taken from the VectorSet @length. However, if each
> AutoIncrementedValueSet has a different number of increments, then we are
> stuck. There are options:

Taking the length from endOffset-startOffset will do it. It also leaves one
less point of possible inconsistency in the file.

> take care, I look forward to your recursive parser in an infinitely flexible
> AnIML kingdom!

AnIML kingdom -- nice. ;-)

Best wishes,
Burkhard
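
A minimal sketch of the proposal above, in Python, in case it helps to see the arithmetic. It is only an illustration under stated assumptions: the field names start_value and increment are hypothetical, and endOffset is treated as exclusive so that the count is exactly endOffset-startOffset as written above (if endOffset were meant to be inclusive, the count would need a +1).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AutoIncrementedValueSet:
    # Illustrative stand-in for the element discussed above; start_value and
    # increment are assumed names, not taken from the schema.
    start_offset: int
    end_offset: int   # assumed exclusive, so count == end_offset - start_offset
    start_value: float
    increment: float

    def count(self) -> int:
        # Number of values to generate, derived from the offsets alone
        # (no dependency on a VectorSet-level length).
        n = self.end_offset - self.start_offset
        if n < 0:
            raise ValueError("endOffset must not be smaller than startOffset")
        return n

    def values(self) -> List[float]:
        # Generate start_value, start_value + increment, ... count() times.
        return [self.start_value + i * self.increment for i in range(self.count())]


vs = AutoIncrementedValueSet(start_offset=10, end_offset=14,
                             start_value=0.0, increment=0.5)
print(vs.count(), vs.values())
```

With both offsets mandatory, each AutoIncrementedValueSet carries its own length, so ValueSets of different lengths in the same Vector do not conflict.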