From: Amit J. <ami...@gm...> - 2007-01-25 04:24:20
Hi,

A small correction. There is an error in the example given in point (2). The correct one is:

Now consider the global feature vector F() for the different possible label sequences:

a)
For Y = [ 0 0 0 0 ]
F(Y|X) = f(y=0, pos=0) + f(y=0, pos=1) + f(y=0, pos=2) + f(y=0, pos=3)
F(Y|X) = [ 1 0 ] + [ 0 0 ] + [ 1 0 ] + [ 0 0 ]
F(Y|X) = [ 2 0 ]

b)
For Y = [ 1 0 0 0 ]
F(Y|X) = f(y=1, pos=0) + f(y=0, pos=1) + f(y=0, pos=2) + f(y=0, pos=3)
F(Y|X) = [ 0 1 ] + [ 0 0 ] + [ 1 0 ] + [ 0 0 ]
F(Y|X) = [ 1 1 ]

-regards
Amit

On 1/25/07, Amit Jaiswal <ami...@gm...> wrote:
>
> Hi,
> A quick reply for now (a more detailed reply will follow in a day or two).
>
> 1. If there are n classes, then there are n features for a particular
> "property". If you look at the definition of a feature, it is a function
> of both the property you want to represent and the state/label. For
> example, if the property is "isCapitalized" and there are 2 classes, then
> there will be 2 features (with different feature ids):
> i) isCapitalized is true and state = 0 (featureID = 0)
> ii) isCapitalized is true and state = 1 (featureID = 1)
>
> 2. If you look at the maximum log-likelihood equation in the paper
> (Shallow Parsing), you will see that the numerator contains the label
> sequence of each training instance, and the denominator contains a
> normalizing term over all possible label sequences.
>
> At a given position (say x=0), we fire features for all the states in
> which the feature is true (in all the FeatureTypes classes).
>
> Then in the CRF trainer, while computing F(Y|X) for a particular label
> sequence Y, we take only those feature values whose state matches the
> state at that position in the label sequence.
>
> For example, say n=2 [ 0 = other, 1 = NounPhrase ], the feature is
> "isCapitalized", and the data sequence is "Today is Thursday ."
> Training data = Y = [ 1 0 1 0 ]
>
> We fire 2 features at each position in the sequence.
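Concretely, the corrected computation from point (2) above can be sketched in a few lines of illustrative Python. This is not the Java package's API, just the arithmetic of summing per-position feature vectors into the global F(Y|X):

```python
# Illustrative sketch (not the CRF package's actual code): the global
# feature vector F(Y|X) for the "isCapitalized" example with 2 labels
# (0 = other, 1 = NounPhrase). X = "Today is Thursday ." is capitalized
# at positions 0 and 2.

def local_features(is_capitalized, y, num_labels=2):
    """Per-position feature vector: one slot per (isCapitalized, label)
    pair. Slot y fires only when the word is capitalized AND its label
    in the candidate sequence is y."""
    f = [0] * num_labels
    if is_capitalized:
        f[y] = 1
    return f

def global_features(capitalized, labels, num_labels=2):
    """F(Y|X): the sum of the per-position feature vectors."""
    F = [0] * num_labels
    for cap, y in zip(capitalized, labels):
        for i, v in enumerate(local_features(cap, y, num_labels)):
            F[i] += v
    return F

caps = [True, False, True, False]           # "Today is Thursday ."
print(global_features(caps, [0, 0, 0, 0]))  # [2, 0]
print(global_features(caps, [1, 0, 0, 0]))  # [1, 1]
```

The two printed vectors match cases (a) and (b) of the corrected example.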
> So at pos = 0, where the word is capitalized, the features we fire are:
> i) isCapitalized=1 and y=0 (featureId=0)
> ii) isCapitalized=1 and y=1 (featureId=1)
>
> Note that the length of the feature vector = 2.
>
> Similarly, at the rest of the positions we fire features depending on
> whether the word at that position is capitalized or not.
>
> Now consider the global feature vector F() for the different possible
> label sequences:
> a)
> For Y = [ 0 0 0 0 ]
> F(Y|X) = f(y=0, pos=0) + f(y=0, pos=1) + f(y=0, pos=2) + f(y=0, pos=3)
> F(Y|X) = [ 1 0 ] + [ 0 0 ] + [ 1 0 ] + [ 0 0 ]
> F(Y|X) = [ 2 0 ]
>
> b)
> For Y = [ 1 0 0 0 ]
> F(Y|X) = f(y = 0, pos=0) + f(y=0, pos=1) + f(y=0, pos=2) + f(y=0, pos=3)
> F(Y|X) = [ 0 1 ] + [ 0 0 ] + [ 1 0 ] + [ 0 0 ]
> F(Y|X) = [ 1 1 ]
>
> If you look at the equations in the paper, it is this global feature
> vector F(,) which is used. So for each possible label sequence Y, such an
> F(Y|X) needs to be computed; thus we fire features for all the possible
> states, and the CRF trainer takes care of generating the proper F(Y|X)
> for a particular label sequence Y.
>
> 3. The answer to your second question is fairly simple. All the features
> are mapped to a contiguous array and each feature is given a unique id.
> You can look at the iitb.Model.FeatureGenImpl class for the
> implementation details.
>
> 4. Now about unseen words encountered during testing. WordFeatures is
> the feature type that fires all the word features. There is an integer
> parameter called RARE_THRESHOLD. Any word that is not seen at least
> RARE_THRESHOLD times in the training data is considered a rare/unknown
> word and is not fired as a feature. There is another feature type called
> UnknownFeatures which is fired only for such rare words.
>
> So UnknownFeatures is fired for any word that is seen only in the test
> data (because its frequency in the training data would be 0 and thus
> less than RARE_THRESHOLD).
>
> 5. The last question is a bit confusing.
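The rare-word handling described in point 4 can be sketched as follows. This is illustrative Python, not the package's actual WordFeatures/UnknownFeatures classes, and the RARE_THRESHOLD value here is just an example:

```python
# Illustrative sketch of the rare-word idea from point 4 (names and the
# threshold value are examples, not the package's actual code): words
# seen fewer than RARE_THRESHOLD times in training do not get their own
# word feature; a shared unknown-word feature fires for them instead.
from collections import Counter

RARE_THRESHOLD = 2  # example value

def build_vocab(training_words):
    """Keep only words frequent enough in training to get a word feature."""
    counts = Counter(training_words)
    return {w for w, c in counts.items() if c >= RARE_THRESHOLD}

def word_feature(word, vocab):
    """Return the feature fired for this word. An unseen test word has
    training count 0 < RARE_THRESHOLD, so it always maps to the
    unknown-word feature."""
    return ("WORD", word) if word in vocab else ("UNKNOWN",)

vocab = build_vocab(["the", "the", "market", "market", "rally"])
print(word_feature("the", vocab))    # ('WORD', 'the')
print(word_feature("rally", vocab))  # ('UNKNOWN',)  seen only once
print(word_feature("zebra", vocab))  # ('UNKNOWN',)  never seen in training
```

The last line shows why test-only words land on the unknown-word feature: their training frequency is necessarily below the threshold.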
> First try to understand the meaning of the feature vector and how it is
> implemented in the CRF package. If you need any specific detail about
> any class/package, then please send a mail.
>
> 6. Excellent documentation on using the CRF package is available at
> http://crf.sourceforge.net/introduction/
>
> Hope this helps.
>
> -amit
>
> On 1/25/07, crf...@li... <crf...@li...> wrote:
> >
> > Today's Topics:
> >
> >    1. CRF Implementation (deb...@th...)
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Wed, 24 Jan 2007 09:13:50 -0500
> > From: <deb...@th...>
> > Subject: [Crf-users] CRF Implementation
> > To: <crf...@li...>
> >
> > Dear Dr. Sarawagi,
> >
> > I am working in the Information Extraction research group of Thomson
> > Corp. and recently got a chance to use the CRF package that you have
> > created. Thanks for the excellent implementation; it is a very useful
> > module, and personally I find it more comfortable than any other
> > available implementation.
> >
> > However, I have a few very basic doubts regarding the code, especially
> > about the usage and values of the weight vector (lambda) during the
> > training procedure. I would be grateful if you could clarify them.
> > Firstly, I started working with the same data corpus (address
> > sequences) used in the sample example. From my understanding of the
> > original McCallum paper, I thought the lambda (weight) vector would
> > have the same length as the number of feature functions. I generated 4
> > state feature functions (depending on the address data) and 3
> > transition (emission) feature functions, so the weight vector has a
> > length of 7 in my case. Whenever I find any feature in the training
> > data (that is, if the data satisfies a particular boolean feature
> > function), I update the lambda at that feature function's index.
> > Whereas in the Java CRF implementation, the weight vector has a length
> > equal to the total number of possible features (I think it is 220).
> > This is a little confusing for me.
> >
> > Secondly, I noticed that the generated features are composed of some
> > functions (all caps, the alphanumeric property, the "word" itself,
> > etc.). Based on the "feature index" into the weight vector, you apply
> > the Viterbi algorithm. My doubt is: while finding the index into the
> > weight vector during evaluation, how do you match the index of the
> > trained weight vector? In the training implementation every word is a
> > feature in your case; if some unseen word occurs during testing (which
> > is very possible), do you index this word as an "unseen feature"?
> >
> > Lastly, many words can be represented by various features (such as the
> > "word" itself, alphanumeric or not, starts with caps, etc.). While
> > finding the index into the weight vector during evaluation, how do you
> > select which feature index to use? Do you give more weight to any
> > particular feature (say, if the word matches the training data, it has
> > higher priority) than the others?
> > Thanks in advance,
> >
> > Regards,
> > Debanjan
> >
> > End of Crf-users Digest, Vol 5, Issue 3
> > ***************************************