I will use an example to explain the difference between a feature's label and its value.
Suppose you have 4 labels (numbered from 0) - NN, J, VB, DT - for
doing POS tagging. Now, a sequence can start with only some specific
labels, and you want to encode this fact as a feature. (Let's call it
There will be four features in all for this feature type (one for each label).
During training of this feature, you determine which of
the labels occur at pos = 0 by looking at the actual labels in the
training data. Suppose it turns out that only labels 0 and 3 occur
at the starting position (i.e. NN and DT).
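
As a rough illustration of this training step, here is a minimal standalone sketch in Java. It does not use the package's own classes: the class name StartLabelTraining, the method learnStartLabels, and the toy label sequences are all made up for this example, and each training sequence is assumed to be given simply as an array of label ids.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any library: learns which labels
// are seen at position 0 of the training sequences.
class StartLabelTraining {

    // allowed[y] is true if label y occurs at pos = 0 in at least one sequence.
    static boolean[] learnStartLabels(List<int[]> labelSequences, int numLabels) {
        boolean[] allowed = new boolean[numLabels];
        for (int[] seq : labelSequences) {
            if (seq.length > 0) {
                allowed[seq[0]] = true; // only the first label of each sequence matters
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        // Labels: 0 = NN, 1 = J, 2 = VB, 3 = DT (toy data invented for illustration)
        List<int[]> train = Arrays.asList(
            new int[] {3, 1, 0, 2},
            new int[] {0, 2, 3, 0},
            new int[] {3, 0, 2});
        // Labels 0 and 3 are the only ones that start a sequence here.
        System.out.println(Arrays.toString(learnStartLabels(train, 4)));
    }
}
```

This is the only place where the actual labels of the data are read; the resulting boolean array is all that the firing step needs.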
Now, while firing features in startScanFeaturesAt(DataSequence dataSeq, int prev, int pos),
if (pos != 0) you do not fire any features and return false.
Implicitly, that means that all the feature values for pos != 0 are 0.
For pos = 0, you fire two features, one for label = 0 and one for label = 3 (setting the feature value = 1 for each).
For the rest of the labels no feature is fired, which implicitly means
that their feature value is 0. You can achieve the same effect by firing
all four features with explicit values, like this:
i) yend = 0, val = 1, ystart = -1, id = 0
ii) yend = 1, val = 0, ystart = -1, id = 1
iii) yend = 2, val = 0, ystart = -1, id = 2
iv) yend = 3, val = 1, ystart = -1, id = 3
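
The firing logic described above could look roughly like the following sketch. It is a simplified standalone stand-in, not the package's real FeatureType: the class StartFeatureSketch, the nested Feature holder, and the fireAll helper are hypothetical, and the DataSequence argument is dropped because this feature depends only on pos. The field names id, ystart, yend and val follow the list above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a start-label feature type.
class StartFeatureSketch {

    // Minimal holder for one fired feature (assumed shape, for illustration only).
    static class Feature {
        final int id, ystart, yend;
        final double val;
        Feature(int id, int ystart, int yend, double val) {
            this.id = id; this.ystart = ystart; this.yend = yend; this.val = val;
        }
    }

    private final boolean[] allowedAtStart; // learned during training
    private int nextLabel;                  // next label for which a feature will fire

    StartFeatureSketch(boolean[] allowedAtStart) {
        this.allowedAtStart = allowedAtStart.clone();
        this.nextLabel = allowedAtStart.length; // nothing to fire until a scan starts
    }

    // Note that no actual (gold) label is passed in: only the position is used.
    boolean startScanFeaturesAt(int prev, int pos) {
        if (pos != 0) {
            nextLabel = allowedAtStart.length; // no features: value is implicitly 0
            return false;
        }
        nextLabel = 0;
        skipDisallowed();
        return hasNext();
    }

    boolean hasNext() {
        return nextLabel < allowedAtStart.length;
    }

    Feature next() {
        // ystart = -1 because the feature does not depend on the previous label.
        Feature f = new Feature(nextLabel, -1, nextLabel, 1.0);
        nextLabel++;
        skipDisallowed();
        return f;
    }

    private void skipDisallowed() {
        while (nextLabel < allowedAtStart.length && !allowedAtStart[nextLabel]) {
            nextLabel++;
        }
    }

    // Convenience helper: collects everything fired at one position.
    static List<Feature> fireAll(StartFeatureSketch type, int prev, int pos) {
        List<Feature> fired = new ArrayList<>();
        if (type.startScanFeaturesAt(prev, pos)) {
            while (type.hasNext()) {
                fired.add(type.next());
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        StartFeatureSketch type =
            new StartFeatureSketch(new boolean[] {true, false, false, true});
        for (Feature f : fireAll(type, -1, 0)) {
            System.out.println("id=" + f.id + " ystart=" + f.ystart
                + " yend=" + f.yend + " val=" + f.val);
        }
        System.out.println("fired at pos 1: " + fireAll(type, 0, 1).size());
    }
}
```

With labels 0 and 3 allowed, the scan at pos = 0 fires features for yend = 0 and yend = 3 only, and any other position fires nothing. Skipping labels 1 and 2 has the same effect as firing them with val = 0, since a feature with value 0 contributes nothing to the score.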
And just to clarify, the actual label of the data is seen only during
training. (In a FeatureType, the actual label is seen only in
the train() method, and not while firing the features.)
When you are firing features, you fire them for all possible labels for
which the feature holds true. If you search the previous postings on
this mailing list, you will find some explanation of this.