
Creating a SentenceDetector in 1.5.0

Help
Anonymous
2010-09-25
2013-04-16
  • Anonymous

    Anonymous - 2010-09-25

    Hi,

    Can anyone point me in the right direction for creating a SentenceDetector in 1.5.0? Previously, I created it like this in JRuby:

    @detector = SentenceDetector.new(File.expand_path("models/EnglishSD.bin.gz"))

    I see that SentenceDetector is now an interface. I think I need to create a SentenceModel now, and I think I should be using one of the new models like en.sent.bin?

    Any pointers would be greatly appreciated.

    Paul

     
  • James Kosin

    James Kosin - 2010-09-25

    Paul,

    In Java:

                    sdetector = new SentenceDetectorME(new SentenceModel(getClass().getResourceAsStream(sentdetector)));
    

    I'm guessing that in JRuby:

                   @detector = SentenceDetectorME.new( SentenceModel.new(File.expand_path(...)))
    

    Hope this helps…
    James

     
  • Anonymous

    Anonymous - 2010-09-25

    Thanks for your help James.

    Could you tell me what has happened to the getModel method of the name finder? I have examined the source of NameFinderME and it is no longer there. Here is my code, which has been updated to create the new SentenceDetector and the new NameFinderME:

    @formatter = formatter
    file = FileInputStream.new(File.join(File.dirname(__FILE__), '../models/en-sent.bin'))
    sentModel = SentenceModel.new(file)
    @detector = SentenceDetectorME.new(sentModel)
    @finders = %w{en-ner-person en-ner-location en-ner-organization}.map do |model|

    end
    @tokenizer = SimpleTokenizer.new

    getModel is no longer part of NameFinderME.

    Is there an alternative method or can someone enlighten me as to how things have changed?

    Thank you very much for your help, I know these questions can be annoying for regulars!

    Paul

     
  • James Kosin

    James Kosin - 2010-09-25

    Paul,

    I'm not sure what has changed exactly.  I'm a bit new myself.

    Basically, each detector (SentenceDetectorME, NameFinderME, etc.) is responsible for keeping a handle on the model itself; however, it is the main application that passes the model into the class.  This allows multiple detectors using the same model to run at once, which is cleaner than having, say, 20-30 copies of the same exact model.

    If you need to keep a handle on the model, just assign it to a variable before passing it to the detector; that lets you reuse the model.
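
A minimal sketch of the "load once, reuse everywhere" idea described above. ModelCache and its loader block are hypothetical names, not part of OpenNLP; the class is plain Ruby so that it runs anywhere, and the application decides how a model is actually built from a path.

```ruby
# Hypothetical helper (not part of OpenNLP): loads each model at most once
# and hands the same object back to every caller that asks for that path.
class ModelCache
  # The loader block receives a path and returns the loaded model.
  def initialize(&loader)
    raise ArgumentError, "a loader block is required" unless loader
    @loader = loader
    @models = {}
  end

  # Returns the cached model for path, calling the loader only on first use.
  def fetch(path)
    @models.fetch(path) { @models[path] = @loader.call(path) }
  end
end
```

Under JRuby, and assuming the 1.5.0 classes used elsewhere in this thread, the loader could be something like `ModelCache.new { |path| TokenNameFinderModel.new(FileInputStream.new(path)) }`, with each `NameFinderME.new(cache.fetch(path))` call then receiving the same model object.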

    You're probably going to have to go through all your code.  Jorn or Jason could probably explain what has changed in greater detail.

    James

     
  • Anonymous

    Anonymous - 2010-09-25

    I should mention, this is how it used to be called:

    @finders = %w{person location date organization}.map do |model|
      [model, NameFinderME.new(BinaryGISModelReader.new(java.io.File.new("models/#{model}.bin.gz")).getModel)]
    end
    

    I was passing in a BinaryGISModelReader.

    Perhaps, I should just be using a GenericModel or something now?

    Things seem quite different now.

    Cheers

    Paul

     
  • Anonymous

    Anonymous - 2010-09-25

    Thank you for your help James.

     
  • Anonymous

    Anonymous - 2010-09-25

    OK, we are in business with the following code:

      def initialize(formatter)
        @formatter = formatter
        file = FileInputStream.new(File.join(File.dirname(__FILE__), '../models/en-sent.bin'))
        sentModel = SentenceModel.new(file)
        @detector = SentenceDetectorME.new(sentModel)
        @finders = %w{en-ner-person en-ner-location en-ner-organization}.map do |model|
           [model, NameFinderME.new(TokenNameFinderModel.new(FileInputStream.new("../../models/#{model}.bin")))]
        end
        @tokenizer = SimpleTokenizer.new
      end
    

    The BinaryGISModelReader does not like the new model files, but I was able to create the models using TokenNameFinderModel.
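
One detail the code above leaves open is that the FileInputStream objects are never closed. A minimal sketch of a helper that closes the stream once the block has finished with it follows; with_stream is a hypothetical name, and StringIO stands in for the Java stream so the example runs on plain Ruby.

```ruby
require 'stringio'

# Hypothetical helper: yields the stream to the block, returns the block's
# result, and closes the stream even if the block raises.
def with_stream(stream)
  yield stream
ensure
  stream.close if stream
end

# Stand-in demonstration. Under JRuby the stream would be a
# java.io.FileInputStream and the block would build the model from it.
io = StringIO.new("model bytes")
contents = with_stream(io) { |s| s.read }
```

Under JRuby this would presumably be used as `with_stream(FileInputStream.new(path)) { |s| TokenNameFinderModel.new(s) }`, on the assumption that the model is fully read by the time its constructor returns.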

    Sorry to waste anybody's time.

    Paul

     
  • Anonymous

    Anonymous - 2010-09-27

    Thank you for the link and the advice.

     
