From: Roger P. M. <ro...@ya...> - 2006-07-03 06:49:08
(Forwarded, as it didn't reach everybody the first time.)
=================================

Replies are inline.

Anthony Liu wrote:

>Exception in thread "main"
>java.lang.ArrayIndexOutOfBoundsException: -1
> at iitb.CRF.Viterbi.viterbiSearch(Viterbi.java:168)
> at iitb.CRF.Viterbi.bestLabelSequence(Viterbi.java:137)
> at iitb.CRF.CRF.apply(CRF.java:118)
> at iitb.Segment.Segment.segment(Segment.java:191)
> at iitb.Segment.Segment.doTest(Segment.java:252)
> at iitb.Segment.Segment.test(Segment.java:236)
> at iitb.Segment.Segment.main(Segment.java:58)
>
>I am actually using this Segment application for my
>named entity recognition project. I am not sure if it
>is gonna work.
>

This happens when your dataSeq[] vector (the X vector) is empty. Check
whether there is an empty test instance in your corpus.

>My training data look like below (faked by me in 2
>minutes, so pls just assume its validity).
>
>Bangalore/LOC ,/PUNC which/WH is/IS essentially/ADV
>a/A brand/NN that/P has/VB3 been/VBP created/VBP
>painfully/ADV in/P the/DT last/ADJ so/ADV many/ADJ
>years/NN, is/IS fast/ADV disappearing/VBG ,/PUNC
>"/PUNC said/VBD Rajendra Misra/PER, managing/ADJ
>director/NN of/P private/ABJ equity/NN firm/NN Tenet
>Holdings Private/ORG ./PUNC
>
>Any good idea to share? Thanks.
>

Using Segment directly may not be a good idea; you may need to customize
it (change the code) for your use. But it's a good example application.

We have developed two NER tasks around this CRF code, and both return
good results. The problem is that the work is to be used commercially
and hence cannot be shared.

I can send you detailed instructions for writing your own application,
if you can wait for another 8 hours :-)

-regards,
Roger
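
A minimal sketch of the empty-instance check suggested above, for anyone who wants to scan a corpus before running the test. It assumes the test instances have already been read into memory as one string per instance; that layout and the helper name findEmptyInstances are assumptions made for illustration, and nothing here is part of the iitb.CRF or iitb.Segment code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EmptyInstanceCheck {
    // Hypothetical helper (not part of iitb.CRF): returns the 0-based indices
    // of instances that contain no tokens, i.e. are empty or whitespace only.
    // Assumption: each list element holds exactly one test instance.
    static List<Integer> findEmptyInstances(List<String> instances) {
        List<Integer> empty = new ArrayList<Integer>();
        for (int i = 0; i < instances.size(); i++) {
            if (instances.get(i).trim().isEmpty()) {
                empty.add(i);
            }
        }
        return empty;
    }

    public static void main(String[] args) {
        List<String> corpus = Arrays.asList(
            "Bangalore/LOC ,/PUNC which/WH",
            "   ",                      // blank instance: would give an empty X vector
            "is/IS fast/ADV");
        System.out.println(findEmptyInstances(corpus)); // prints [1]
    }
}
```

Any index reported this way points at an instance that is worth removing or repairing before the corpus is passed to the tester.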