
A question about the coreference package

jianlee
2006-04-03
2013-04-16
  • jianlee

    jianlee - 2006-04-03

    Can the coreference package resolve definite noun phrase anaphora?

    I am using the OpenNLP coref package and have tried some of its examples. It finds the relations in pronominal anaphora. For example:

    <strong>Researchers</strong> from many different places attended the conference. <strong>They</strong> discuss experiment results with each other after the meeting.

    The relation between "Researchers" and "They" was found by the package successfully.

    However, it can't find the relations in definite noun phrase anaphora. For example:

    <strong>Bioengineering researchers</strong> from many different places attended the conference. <strong>The participants</strong> discuss experiment results with each other.

    It can't find the relation between "Bioengineering researchers" and "The participants" in this example.

    Any help appreciated.

    Jianlee

     
    • Thomas Morton

      Thomas Morton - 2006-04-04

      Hi,
         The module does try and resolve these but misses this case.  Specifically, it's only 23% sure these are coreferent.

      The participants -> [ Bioengineering researchers from many different places ] (male,male) 0.2338269273799131 [default, sim.compatible, gen.compatible, num.compatible, all.compatible, pt=BOS, pw=BOS, nt=VBP, nw=discuss, bnt=VBP, bnw=discuss, hd=0, de=3, ds=1]

      Definite NPs are much harder than pronouns and often have no explicit antecedent when the entity is inferable: "We walked up to the house. The door was open." This is even more the case with plural definite NPs (like the one in your example), since the antecedent may be split: "Tom went to the store and met Bill. The boys then had lunch together." The annotated data on which this model was trained doesn't account for such phenomena.

      The model is essentially playing it safe and not positing a relationship here because in many other cases this is correct. 

      Hope this helps...Tom
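
For anyone who wants to work with the debug line quoted above programmatically, here is a minimal sketch of a parser for it. The layout (anaphor, bracketed antecedent, gender pair, probability, feature list) is inferred only from the single line in this thread, so other lines may differ. The names `CorefCandidate` and `parse_debug_line` are hypothetical and are not part of OpenNLP.

```python
import re
from typing import List, NamedTuple, Tuple


class CorefCandidate(NamedTuple):
    """One candidate link, as read from a debug line (hypothetical type)."""
    anaphor: str
    antecedent: str
    genders: Tuple[str, ...]
    probability: float
    features: List[str]


# Pattern inferred from the one example line in this thread (an assumption):
#   <anaphor> -> [ <antecedent> ] (<g1>,<g2>) <probability> [<f1>, <f2>, ...]
_DEBUG_LINE = re.compile(
    r"^(?P<anaphor>.+?) -> \[ (?P<antecedent>.+?) \] "
    r"\((?P<genders>[^)]*)\) (?P<prob>\S+) \[(?P<features>.*)\]$"
)


def parse_debug_line(line: str) -> CorefCandidate:
    """Split a debug line into its parts; raise ValueError if it does not fit."""
    match = _DEBUG_LINE.match(line.strip())
    if match is None:
        raise ValueError("line does not match the assumed layout: %r" % line)
    return CorefCandidate(
        anaphor=match["anaphor"],
        antecedent=match["antecedent"],
        genders=tuple(g.strip() for g in match["genders"].split(",")),
        probability=float(match["prob"]),
        features=[f.strip() for f in match["features"].split(",")],
    )
```

Parsing the line from this post would give a candidate whose `probability` field is the 0.23 value discussed above, which a caller could then compare against whatever cutoff suits their own precision/recall needs.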

       
      • jianlee

        jianlee - 2006-04-17

        Anaphora resolution systems rely on syntactic, semantic or statistical clues to identify the antecedent of an anaphor.

        I am wondering which strategies are used in your anaphora resolution system. Can you give me some references about your system?

        Thank you...

        Jianlee

         
        • Thomas Morton

          Thomas Morton - 2006-04-17

          Hi,
             You'll probably find this thread and the referenced information helpful.

          https://sourceforge.net/forum/forum.php?thread_id=1456314&forum_id=9943

          Hope this helps, post back if it doesn't.  Thanks..Tom

           
    • gyrocyclist

      gyrocyclist - 2006-05-13

      Aha! Tom, your answer is enlightening -- can you tell me from what class/data structure I can retrieve the bracketed information:

      The participants -> [ Bioengineering researchers from many different places ] (male,male) 0.2338269273799131 [default, sim.compatible, gen.compatible, num.compatible, all.compatible, pt=BOS, pw=BOS, nt=VBP, nw=discuss, bnt=VBP, bnw=discuss, hd=0, de=3, ds=1]

      I assume there's some confidence setting somewhere that drops this anaphor due to the low confidence. Where is that setting, and how can I change it?

      Extended commentary: suppose I'm comparing the outputs of N anaphor and/or coref resolution codes, and suppose I also have additional data (e.g., a semantic tagger that I think is the cat's pajamas).  In that case I would want to know everything that OpenNLP found, not only what was left after discarding, so that I can compare everything that OpenNLP has deduced with everything that all the other codes have deduced, and then employ my own reconciliation/dropping strategy.  At the moment my goal is very high precision, and my heuristic is: if every system agrees that 'x' is true, then 'x' is most likely true.

      thanks,
      David H.
      Center for Applied Scientific Computing (CASC)
      Lawrence Livermore National Lab. (LLNL)
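
The agreement heuristic described above can be sketched as a set intersection over the links each system proposes. This is only an illustration under stated assumptions: mentions are represented as plain strings (a real comparison would more likely use character or token offsets), links are treated as unordered pairs, and the names `make_link` and `unanimous_links` are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set

# A coreference link as an unordered pair of mention strings (an assumption;
# offsets would be more robust when the same string occurs more than once).
Link = FrozenSet[str]


def make_link(mention_a: str, mention_b: str) -> Link:
    """Build an order-independent link between two mentions."""
    return frozenset((mention_a, mention_b))


def unanimous_links(system_outputs: Iterable[Iterable[Link]]) -> Set[Link]:
    """Keep only the links that every system proposed (high-precision filter)."""
    outputs: List[Set[Link]] = [set(links) for links in system_outputs]
    if not outputs:
        return set()
    return set.intersection(*outputs)


# Example: two hypothetical systems run on the sentences from this thread.
system_a = {
    make_link("Researchers", "They"),
    make_link("Bioengineering researchers", "The participants"),
}
system_b = {make_link("They", "Researchers")}

agreed = unanimous_links([system_a, system_b])
```

Here `agreed` contains only the "Researchers"/"They" link, since that is the only one both systems proposed. Requiring unanimity trades recall for precision, which matches the goal stated in the post; a majority vote would be a looser variant.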

       
