Numeric features in CRF

Help
2011-12-19
2013-04-26
  • Hi,

    Does the CRFLearner work with numeric features?

    Thanks.

     
  • Frank Lin
    2011-12-21

    It should! If for some reason it fails please let us know!

     
  • I had a whole bunch of binary features, which were working fine. Then I added a numeric feature, and now I get the following output from the CRFLearner:

    Property: ll
    Number of features :29846
    Iteration 0 log-likelihood -48307.080944733694 norm(grad logli) NaN norm(x) 0.0
    Iteration 1 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 2 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 3 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 4 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 5 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 6 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 7 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 8 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 9 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 10 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 11 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 12 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 13 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 14 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 15 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 16 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 17 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 18 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 19 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    Iteration 20 log-likelihood NaN norm(grad logli) NaN norm(x) NaN
    CRF: lbfgs failed.
    Line search failed. See documentation of routine mcsrch. Error return of line search: info = 3 Possible causes: function or gradient are incorrect, or incorrect tolerances. (iflag == -1)

    Possible reasons could be:
    1. Bug in the feature generation or data handling code
    2. Not enough features to make observed feature value==expected value

    I was looking into the documentation to see if it said anything about numeric features, but couldn't find anything. I noticed something about "sparse" numeric features. What are those?

    Thanks.

     
  • Frank Lin
    2011-12-22

    Can you send me a sample of your input data to minorthird? For example, maybe twenty instances, including ones with numeric features that failed.

     
  • Here are the first three sequences in the sequential data output:

    k ABC19980108.1830.0711/0 eO  lemma.notfound lemma.pos.noun toks.orth.capped next.lemma.eq.news next.lemma.pos.noun svm.stacked.prediction=-2.422762427891579
    k ABC19980108.1830.0711/0 eO  lemma.eq.news lemma.pos.noun toks.orth.letters next.lemma.eq.story next.lemma.pos.noun prev.lemma.pos.noun pos.noun.feat.sem.abst svm.stacked.prediction=-2.6101074341910797
    k ABC19980108.1830.0711/0 eO  lemma.eq.story lemma.pos.noun toks.orth.letters next.lemma.eq.on next.lemma.pos.prep prev.lemma.eq.news prev.lemma.pos.noun pos.noun.feat.sem.abst pos.noun.feat.sem.artf svm.stacked.prediction=-1.8876929399723312
    k ABC19980108.1830.0711/0 eO  lemma.eq.on lemma.pos.prep toks.orth.letters next.lemma.eq.the next.lemma.pos.det prev.lemma.eq.story prev.lemma.pos.noun bigram.grm.pn.on.hand svm.stacked.prediction=-1.9747840817792566
    k ABC19980108.1830.0711/0 eO  lemma.eq.the lemma.pos.det toks.orth.letters next.lemma.eq.other next.lemma.pos.adj prev.lemma.eq.on prev.lemma.pos.prep svm.stacked.prediction=-2.3550884371708793
    k ABC19980108.1830.0711/0 eO  lemma.eq.other lemma.pos.adj toks.orth.letters next.lemma.eq.hand next.lemma.pos.noun prev.lemma.eq.the prev.lemma.pos.det svm.stacked.prediction=-2.402346942499232
    k ABC19980108.1830.0711/0 eO  lemma.eq.hand lemma.pos.noun toks.orth.letters next.lemma.eq., next.lemma.pos.verb prev.lemma.eq.other prev.lemma.pos.adj unigram.grm.pnhd.hand bigram.grm.pn.on.hand svm.stacked.prediction=-2.0037983323521518
    k ABC19980108.1830.0711/0 eO  lemma.eq., lemma.pos.verb next.lemma.eq.it next.lemma.pos.noun prev.lemma.eq.hand prev.lemma.pos.noun pos.verb.feat.sem.abst pos.verb.feat.sem.artf pos.verb.feat.syn.auxv pos.verb.feat.syn.vfin svm.stacked.prediction=-1.8584796994223285
    k ABC19980108.1830.0711/0 eO  lemma.eq.it lemma.pos.noun next.lemma.eq.turn next.lemma.pos.verb prev.lemma.eq., prev.lemma.pos.verb unigram.grm.sbj.it bigram.grm.sv.it.be bigram.grm.sv.it.turn svm.stacked.prediction=-1.8699858902489017
    k ABC19980108.1830.0711/0 eB  lemma.eq.turn lemma.pos.verb toks.orth.letters next.lemma.eq.out next.lemma.pos.prep prev.lemma.eq.it prev.lemma.pos.noun unigram.grm.vrb.turn bigram.grm.sv.it.turn pos.verb.feat.sem.vchng svm.stacked.prediction=-1.7869181251673862
    k ABC19980108.1830.0711/0 eO  lemma.eq.out lemma.pos.prep toks.orth.letters next.lemma.eq.to next.lemma.pos.infto prev.lemma.eq.turn prev.lemma.pos.verb svm.stacked.prediction=-1.882012947734622
    k ABC19980108.1830.0711/0 eO  lemma.eq.to lemma.pos.infto toks.orth.letters next.lemma.eq.be next.lemma.pos.verb prev.lemma.eq.out prev.lemma.pos.prep svm.stacked.prediction=-1.882012947734622
    k ABC19980108.1830.0711/0 eO  lemma.eq.be lemma.pos.verb toks.orth.letters next.lemma.eq.another next.lemma.pos.det prev.lemma.eq.to prev.lemma.pos.infto unigram.grm.vrb.be bigram.grm.sv.it.be pos.verb.feat.syn.auxv svm.stacked.prediction=-1.8699805836050514
    k ABC19980108.1830.0711/0 eO  lemma.eq.another lemma.pos.det toks.orth.letters next.lemma.eq.very next.lemma.pos.qual prev.lemma.eq.be prev.lemma.pos.verb svm.stacked.prediction=-2.0887324634559725
    k ABC19980108.1830.0711/0 eO  lemma.eq.very lemma.pos.qual toks.orth.letters next.lemma.eq.bad next.lemma.pos.adj prev.lemma.eq.another prev.lemma.pos.det svm.stacked.prediction=-2.3144706567629356
    k ABC19980108.1830.0711/0 eB  lemma.eq.bad lemma.pos.adj toks.orth.letters next.lemma.eq.financial next.lemma.pos.adj prev.lemma.eq.very prev.lemma.pos.qual svm.stacked.prediction=-2.120477233729585
    k ABC19980108.1830.0711/0 eO  lemma.eq.financial lemma.pos.adj toks.orth.letters next.lemma.eq.week next.lemma.pos.noun prev.lemma.eq.bad prev.lemma.pos.adj svm.stacked.prediction=-2.120477233729585
    k ABC19980108.1830.0711/0 eO  lemma.eq.week lemma.pos.noun toks.orth.letters next.lemma.eq.for next.lemma.pos.prep prev.lemma.eq.financial prev.lemma.pos.adj pos.noun.feat.sem.abst svm.stacked.prediction=-1.8020659675938702
    k ABC19980108.1830.0711/0 eO  lemma.eq.for lemma.pos.prep toks.orth.letters next.lemma.eq.Asia next.lemma.pos.noun prev.lemma.eq.week prev.lemma.pos.noun bigram.grm.pn.for.Asia svm.stacked.prediction=-1.9685392056324356
    k ABC19980108.1830.0711/0 eO  lemma.eq.Asia lemma.pos.noun toks.orth.capped toks.orth.letters prev.lemma.eq.for prev.lemma.pos.prep unigram.grm.pnhd.Asia bigram.grm.pn.for.Asia svm.stacked.prediction=-1.9834361256408992
    *
    k ABC19980108.1830.0711/1 eO  lemma.eq.the lemma.pos.det toks.orth.letters next.lemma.eq.financial next.lemma.pos.adj svm.stacked.prediction=-2.455987111602126
    k ABC19980108.1830.0711/1 eO  lemma.eq.financial lemma.pos.adj toks.orth.letters next.lemma.pos.noun prev.lemma.eq.the prev.lemma.pos.det svm.stacked.prediction=-2.541219068970979
    k ABC19980108.1830.0711/1 eB  lemma.pos.noun toks.orth.letters next.lemma.eq.from next.lemma.pos.prep prev.lemma.eq.financial prev.lemma.pos.adj bigram.grm.sv.assistance.help pos.noun.feat.sem.act pos.noun.feat.sem.activity svm.stacked.prediction=-2.076530310374684
    k ABC19980108.1830.0711/1 eO  lemma.eq.from lemma.pos.prep toks.orth.letters next.lemma.eq.the next.lemma.pos.det prev.lemma.pos.noun bigram.grm.pn.from.World_Bank svm.stacked.prediction=-2.131454094717844
    k ABC19980108.1830.0711/1 eO  lemma.eq.the lemma.pos.det toks.orth.letters next.lemma.pos.noun prev.lemma.eq.from prev.lemma.pos.prep svm.stacked.prediction=-2.45833963625058
    k ABC19980108.1830.0711/1 eO  lemma.pos.noun next.lemma.eq.and next.lemma.pos.noun prev.lemma.eq.the prev.lemma.pos.det bigram.grm.pn.from.World_Bank svm.stacked.prediction=-2.14478854280412
    k ABC19980108.1830.0711/1 eO  lemma.eq.and lemma.pos.noun toks.orth.letters next.lemma.eq.the next.lemma.pos.det prev.lemma.pos.noun svm.stacked.prediction=-2.0843192161495776
    k ABC19980108.1830.0711/1 eO  lemma.eq.the lemma.pos.det toks.orth.letters next.lemma.pos.noun prev.lemma.eq.and prev.lemma.pos.noun svm.stacked.prediction=-2.4916212640210147
    k ABC19980108.1830.0711/1 eO  lemma.pos.noun next.lemma.eq.be next.lemma.pos.verb prev.lemma.eq.the prev.lemma.pos.det bigram.grm.sv.International_Monetary_Fund.help svm.stacked.prediction=-2.117918451990428
    k ABC19980108.1830.0711/1 eO  lemma.eq.be lemma.pos.verb toks.orth.letters next.lemma.eq.not next.lemma.pos.adv prev.lemma.pos.noun pos.verb.feat.syn.auxv pos.verb.feat.syn.vfin svm.stacked.prediction=-1.8584796994223285
    k ABC19980108.1830.0711/1 eO  lemma.eq.not lemma.pos.adv toks.orth.letters next.lemma.eq.help next.lemma.pos.verb prev.lemma.eq.be prev.lemma.pos.verb svm.stacked.prediction=-2.3486275741290843
    k ABC19980108.1830.0711/1 eB  lemma.eq.help lemma.pos.verb toks.orth.letters prev.lemma.eq.not prev.lemma.pos.adv unigram.grm.vrb.help bigram.grm.sv.International_Monetary_Fund.help bigram.grm.sv.assistance.help svm.stacked.prediction=-1.833622663454806
    *
    k ABC19980108.1830.0711/2 eO  lemma.eq.in lemma.pos.prep toks.orth.letters next.lemma.eq.the next.lemma.pos.det bigram.grm.pn.in.hour svm.stacked.prediction=-1.9945846293151008
    k ABC19980108.1830.0711/2 eO  lemma.eq.the lemma.pos.det toks.orth.letters next.lemma.eq.last next.lemma.pos.adj prev.lemma.eq.in prev.lemma.pos.prep svm.stacked.prediction=-2.3864239986327953
    k ABC19980108.1830.0711/2 eO  lemma.eq.last lemma.pos.adj toks.orth.letters next.lemma.pos.noun prev.lemma.eq.the prev.lemma.pos.det svm.stacked.prediction=-2.432432225647196
    k ABC19980108.1830.0711/2 eO  lemma.pos.noun next.lemma.eq.hour next.lemma.pos.noun prev.lemma.eq.last prev.lemma.pos.adj svm.stacked.prediction=-2.432432225647196
    k ABC19980108.1830.0711/2 eO  lemma.eq.hour lemma.pos.noun toks.orth.letters next.lemma.eq.the next.lemma.pos.det prev.lemma.pos.noun unigram.grm.pnhd.hour bigram.grm.pn.in.hour pos.noun.feat.sem.abst svm.stacked.prediction=-2.000383874818899
    k ABC19980108.1830.0711/2 eO  lemma.eq.the lemma.pos.det toks.orth.letters next.lemma.eq.value next.lemma.pos.noun prev.lemma.eq.hour prev.lemma.pos.noun svm.stacked.prediction=-2.5060741369413675
    k ABC19980108.1830.0711/2 eO  lemma.eq.value lemma.pos.noun toks.orth.letters next.lemma.eq.of next.lemma.pos.prep prev.lemma.eq.the prev.lemma.pos.det unigram.grm.sbj.value bigram.grm.sv.value.fall pos.noun.feat.sem.abst svm.stacked.prediction=-2.0843192161495776
    k ABC19980108.1830.0711/2 eO  lemma.eq.of lemma.pos.prep toks.orth.letters next.lemma.eq.the next.lemma.pos.det prev.lemma.eq.value prev.lemma.pos.noun bigram.grm.pn.of.stock_market svm.stacked.prediction=-2.215272471402582
    k ABC19980108.1830.0711/2 eO  lemma.eq.the lemma.pos.det toks.orth.letters next.lemma.eq.Indonesian next.lemma.pos.adj prev.lemma.eq.of prev.lemma.pos.prep svm.stacked.prediction=-2.543989006452915
    k ABC19980108.1830.0711/2 eO  lemma.eq.Indonesian lemma.pos.adj toks.orth.capped toks.orth.letters next.lemma.eq.stock_market next.lemma.pos.noun prev.lemma.eq.the prev.lemma.pos.det svm.stacked.prediction=-2.593338196120829
    k ABC19980108.1830.0711/2 eO  lemma.eq.stock_market lemma.pos.noun next.lemma.eq.have next.lemma.pos.verb prev.lemma.eq.Indonesian prev.lemma.pos.adj unigram.grm.pnhd.stock_market bigram.grm.pn.of.stock_market svm.stacked.prediction=-2.212157264731229
    k ABC19980108.1830.0711/2 eO  lemma.eq.have lemma.pos.verb toks.orth.letters next.lemma.eq.fall next.lemma.pos.verb prev.lemma.eq.stock_market prev.lemma.pos.noun pos.verb.feat.syn.auxv pos.verb.feat.syn.vfin svm.stacked.prediction=-1.8584796994223285
    k ABC19980108.1830.0711/2 eB  lemma.eq.fall lemma.pos.verb toks.orth.letters next.lemma.eq.by next.lemma.pos.prep prev.lemma.eq.have prev.lemma.pos.verb unigram.grm.vrb.fall bigram.grm.sv.value.fall pos.verb.feat.syn.ven svm.stacked.prediction=-1.806266552304277
    k ABC19980108.1830.0711/2 eO  lemma.eq.by lemma.pos.prep toks.orth.letters next.lemma.pos.adv prev.lemma.eq.fall prev.lemma.pos.verb svm.stacked.prediction=-1.9529181017829333
    k ABC19980108.1830.0711/2 eO  lemma.pos.adv prev.lemma.eq.by prev.lemma.pos.prep svm.stacked.prediction=-2.226826116294806

    Each instance has one numeric feature, while the remaining features are all binary.
    I should mention that the CRF model was not trained directly from this file. It was trained in Java code by running the CRF learner over a SequenceDataset constructed in the code; the file above was then written out with the saveAs() method on that SequenceDataset.
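
    Roughly, the training setup looks like the sketch below (simplified, and with the class and method names written from memory against the minorthird API rather than copied from my actual code, so treat them as approximate):

    import java.io.File;
    import edu.cmu.minorthird.classify.*;             // Example, MutableInstance, Feature, ClassLabel
    import edu.cmu.minorthird.classify.sequential.*;  // SequenceDataset, CRFLearner, SequenceClassifier

    public class TrainStackedCrf {
        public static void main(String[] args) throws Exception {
            SequenceDataset dataset = new SequenceDataset();

            // One Example per token; each token instance carries its binary features
            // plus the single numeric feature (the stacked SVM score).
            MutableInstance token = new MutableInstance();
            token.addBinary(new Feature("lemma.pos.noun"));
            token.addBinary(new Feature("toks.orth.capped"));
            token.addNumeric(new Feature("svm.stacked.prediction"), -2.422762427891579);
            Example[] sentence = { new Example(token, new ClassLabel("eO")) /* ...more tokens... */ };
            dataset.addSequence(sentence);

            // Train the CRF over the sequence dataset, then write the dataset out
            // to produce the file shown above.
            CRFLearner learner = new CRFLearner();
            SequenceClassifier crf = learner.batchTrain(dataset);   // the trained model
            dataset.saveAs(new File("sequential-data.txt"));
        }
    }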

    • Sid.
     
  • Frank Lin
    2011-12-27

    Hi Sid,

    Try turning off "useHighPrecisionArithmetic" in the options for CRFLearner, which can be accessed via the GUI or via the set/get methods on the CRFLearner class if you are using the minorthird API. It works for me on this small example data.
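
    If you are going through the API rather than the GUI, it should be something along these lines (I am assuming the usual bean-style setter behind the GUI option editor here; the exact method name may differ in your copy of minorthird):

    // Construct the learner as usual, then flip the option off before training.
    CRFLearner learner = new CRFLearner();
    // Assumed setter name, following the set/get convention the GUI editor relies on;
    // check the CRFLearner javadoc in your version for the exact name.
    learner.setUseHighPrecisionArithmetic(false);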

    This happens because useHighPrecisionArithmetic is ON by default. When it is on, training is done in log space so that differences between features with very small real values are preserved. The downside is that any negative feature value (in this case your stacked SVM prediction) becomes NaN in log space, and LBFGS fails.
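
    You can see the problem in miniature: once a negative value goes through the log it becomes NaN, and NaN then poisons everything downstream of it in the optimizer:

    double prediction = -2.422762427891579;   // one of your svm.stacked.prediction values
    double logSpace = Math.log(prediction);   // log of a negative number is NaN
    System.out.println(logSpace);             // prints NaN
    System.out.println(logSpace + 1.0);       // still NaN -- it propagates through the LBFGS updates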

    I realize now that perhaps it should be turned off by default. We will make a note of this in the next version. Let us know if that helps.

     
  • Thanks for your help with this.

    I turned off high precision arithmetic, and that did work. However, the overall accuracy of the CRF seems lower. I tried two runs using only binary features - one with high precision arithmetic and one without - and there is a drop in model accuracy of about 1% for my task. Adding the numeric feature then dropped accuracy even further, but it did successfully train a model.

    I think keeping high precision arithmetic on by default makes sense if model accuracy is better that way. If the only requirement is that numeric features must always be positive, maybe that could be made explicit in the documentation?

    Thanks again for your help with this.

    • Sid.
     
  • Frank Lin
    2012-01-06

    It is actually a bug - the optimization routine should keep track of the sign and the magnitude separately and do the high-precision arithmetic accordingly, so that users can use whatever feature values they want.

    For the next release we'd like to fix it so that high-precision arithmetic works for both positive and negative features. For now we'll have to work around it…
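
    In the meantime, one workaround (besides turning the option off) is the one you hinted at: keep the numeric feature strictly positive, for example by squashing or shifting the stacked SVM score before adding it to the instance. A rough sketch, with the squashing function just an illustrative choice and the addNumeric/Feature names recalled from the minorthird API:

    MutableInstance token = new MutableInstance();
    double raw = -2.422762427891579;                  // raw svm.stacked.prediction value
    double positive = 1.0 / (1.0 + Math.exp(-raw));   // logistic squash into (0, 1), so never negative
    token.addNumeric(new Feature("svm.stacked.prediction"), positive);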