Problem in sentence detection and tokenization with OpenNLP

2016-04-06
  • Nooshin Taheri

    Nooshin Taheri - 2016-04-06

    Hello all,
    Before I start my question, thank you in advance for your support.

    I am a newbie in Java and OpenNLP, so I am truly sorry if my questions don't make sense.
    I want to do something like this:
    I have an XML file as input. First of all, I want to read the whole document and pass it to my classes for some operations. I have a problem in this part: I can read the XML file with Java functions, but not the whole document, only about half of it.

    Second, leaving the input aside, I wrote some code that uses the OpenNLP models. The first one is sentence detection; I wrote the code and it works. But when I call it from the main method (it has to return a String[]), I get this error:
    [Ljava.lang.String;@5b2133b1
    I can't understand why. When the method is void, there is no error and it works perfectly.
    Then I need to tokenize each sentence. The input of the tokenization step is the output of sentence detection, but I don't know how to handle it.

    I am really thankful for your responses.

     
    • niceday

      niceday - 2016-04-06

      Hi, I'm not sure why you are reading from an XML file. Does the XML file contain the whole document, or does it contain the sentences or paragraphs wrapped in XML tags? To be honest, I would tend to drop the XML file, put your text into a single plain-text file, open an input stream reader on it, and pass the text to the sentence detector and tokenizer.
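      A minimal sketch of the "single text file" idea, using only the standard library. The class name ReadWholeFile and the file name input.txt are made up for illustration. It reads the whole file into one String, which should avoid getting only part of the document:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of OpenNLP.
class ReadWholeFile {

    // Reads every byte of the file and decodes it as UTF-8,
    // so the returned String holds the complete document.
    static String readAll(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "input.txt" is an assumed file name; pass a real path as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "input.txt");
        String text = readAll(path);
        System.out.println(text.length() + " characters read");
    }
}
```

      The resulting String is what you would then hand to the sentence detector.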
      Hope it helps.
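
      On the [Ljava.lang.String;@5b2133b1 output: assuming your main method prints the returned array directly, that is not an error. It is what Java prints for a String[] by default (the array type plus a hash code). Arrays.toString shows the contents instead, and a loop lets you feed each sentence to the tokenizer. In the sketch below, detectSentences and tokenize are simple stand-ins that I made up so the example runs without model files. With OpenNLP you would call SentenceDetectorME.sentDetect(text) and TokenizerME.tokenize(sentence) in their place.

```java
import java.util.Arrays;

// Hypothetical class; the two helper methods are stand-ins, not OpenNLP code.
class SentenceThenTokens {

    // Stand-in for sentence detection: splits after '.', '!' or '?'.
    // With OpenNLP this would be sentenceDetector.sentDetect(text).
    static String[] detectSentences(String text) {
        return text.trim().split("(?<=[.!?])\\s+");
    }

    // Stand-in for tokenization: splits on whitespace.
    // With OpenNLP this would be tokenizer.tokenize(sentence).
    static String[] tokenize(String sentence) {
        return sentence.trim().split("\\s+");
    }

    // Tokenizes every sentence; row i holds the tokens of sentence i.
    static String[][] tokenizeAll(String[] sentences) {
        String[][] tokens = new String[sentences.length][];
        for (int i = 0; i < sentences.length; i++) {
            tokens[i] = tokenize(sentences[i]);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String[] sentences = detectSentences("Hello all. I am new to OpenNLP.");

        // Prints the array type and a hash code, e.g. [Ljava.lang.String;@...
        System.out.println(sentences);

        // Prints the contents: [Hello all., I am new to OpenNLP.]
        System.out.println(Arrays.toString(sentences));

        // The output of sentence detection is the input of tokenization.
        for (String sentence : sentences) {
            System.out.println(Arrays.toString(tokenize(sentence)));
        }
    }
}
```

      So the method can keep returning String[]; only the way the result is printed needs to change.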

