> Dear Dr. Sarawagi,
> I am working in the Information Extraction research group of Thomson
Corp. and recently got a chance to use the CRF package that you have
created. Thanks for the excellent implementation; it is a very useful
module, and personally I feel more comfortable with it than with any
other package I have tried.
> However, I have a few very basic questions regarding the code,
especially about the usage and values of the weight vector (lambda)
during training. I would be grateful if you could clarify them.
> Firstly, I started working with the same data corpus (address
sequences) that you use in the sample example. From my understanding of
the original McCallum paper, I thought the lambda (weight) vector would
have the same length as the number of feature functions. I generated 4
state feature functions (based on the address data) and 3 transition
feature functions, so the weight vector has a length of 7 in my case.
Whenever I find a feature in the training data (that is, if the data
satisfies a particular boolean feature function), I update the lambda at
that feature function's index. Whereas in the Java CRF implementation,
the weight vector has a length equal to the total number of possible
features (I think it is 220). This is a little confusing to me.
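To illustrate the likely source of the discrepancy: in many CRF implementations each raw feature template is expanded into one indexed feature per (template, label) pair, and transitions into one per (previous label, label) pair, so the weight vector is much longer than the template count. The sketch below is hypothetical (the class name, the label set HouseNo/Street/City, and the template strings are all assumptions, not the package's actual internals), but it shows how 4 templates and 3 labels already yield 21 weights rather than 7.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: why the weight vector is longer than the number
// of feature *templates*. Each (template, label) combination becomes its
// own indexed feature with its own lambda entry.
public class FeatureIndexSketch {
    // assumed label set for an address-segmentation task
    static final String[] LABELS = {"HouseNo", "Street", "City"};
    // assumed raw feature templates observed in training
    static final String[] TEMPLATES = {"WORD=oak", "WORD=st", "ALL_CAPS", "IS_DIGIT"};

    static Map<String, Integer> buildIndex() {
        Map<String, Integer> index = new LinkedHashMap<>();
        // state features: one weight per (template, label) pair
        for (String t : TEMPLATES)
            for (String y : LABELS)
                index.put(t + "|" + y, index.size());
        // transition features: one weight per (prevLabel, label) pair
        for (String yPrev : LABELS)
            for (String y : LABELS)
                index.put("EDGE|" + yPrev + "->" + y, index.size());
        return index;
    }

    public static void main(String[] args) {
        // 4 templates x 3 labels + 3 x 3 transitions = 21 weights, not 4 + 3 = 7
        System.out.println(buildIndex().size()); // prints 21
    }
}
```

Under this reading, 220 would simply be the number of instantiated (template, label) features the trainer discovered in the address corpus.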
> Secondly, I found that the generated features are composed of several
functions (all caps, the alphanumeric property, the "word" itself,
etc.). Based on the feature index into the weight vector, you apply the
Viterbi algorithm. My question is: when looking up an index in the
weight vector during evaluation, how do you match it against the index
of the trained weight vector? In your training implementation every word
is a feature; if an unseen word (which is very possible) occurs during
testing, do you index this word as an "unseen feature"?
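One common answer to this question, which may or may not match the package's behavior, is that no "unseen feature" index is needed at all: the label score is a sum over the features that fire and are present in the trained index, so a never-seen word feature simply contributes nothing, while the word's generic properties (all caps, digits, etc.) still fire. The sketch below assumes this design; the class and feature names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of test-time scoring: features absent from the
// trained index are skipped (effective weight 0), so unseen words need
// no special "unseen" slot -- only their generic properties score.
public class UnseenWordSketch {
    static double score(List<String> firedFeatures, Map<String, Double> lambda) {
        double s = 0.0;
        for (String f : firedFeatures) {
            Double w = lambda.get(f);  // null for features never seen in training
            if (w != null) s += w;     // unseen features simply add nothing
        }
        return s;
    }

    public static void main(String[] args) {
        Map<String, Double> lambda = new HashMap<>();
        lambda.put("WORD=boston|City", 1.5);
        lambda.put("ALL_CAPS|City", 0.8);
        // "DALLAS" was never seen, so WORD=dallas|City has no trained weight;
        // its ALL_CAPS property still fires and is scored.
        double s = score(Arrays.asList("WORD=dallas|City", "ALL_CAPS|City"), lambda);
        System.out.println(s); // prints 0.8
    }
}
```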
> Lastly, a single word can give rise to several features (the "word"
itself, whether it is alphanumeric, whether it starts with a capital
letter, etc.). When finding the weight-vector index to match during
evaluation, how do you select which feature index to use? Do you give
more weight to any particular feature (say, if the word matches the
training data, it has higher priority) than to the others?
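A plausible resolution, again offered as an assumption rather than a description of the actual package: no single index is selected and no hand-set priority exists. Every feature that fires contributes its weight to the label's score, and training itself determines relative importance through the learned lambda magnitudes (a specific word feature typically earns a larger weight than a generic property). The names below are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: all simultaneously-firing features are summed;
// the learned lambdas, not any manual priority, weight them.
public class MultiFeatureSketch {
    static double labelScore(List<String> fired, Map<String, Double> lambda) {
        double s = 0.0;
        for (String f : fired)
            s += lambda.getOrDefault(f, 0.0);
        return s;
    }

    public static void main(String[] args) {
        // assumed weights produced by training
        Map<String, Double> lambda = new HashMap<>();
        lambda.put("WORD=boston|City", 2.0);  // specific word feature: large weight
        lambda.put("STARTS_CAPS|City", 0.5);  // generic property: smaller weight
        lambda.put("ALPHA|City", 0.1);

        // "Boston" fires all three features at once; their weights are summed.
        List<String> fired = Arrays.asList("WORD=boston|City", "STARTS_CAPS|City", "ALPHA|City");
        System.out.println(labelScore(fired, lambda)); // prints 2.6
    }
}
```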
> Thanks in advance,