More than one parse with ccg-build

Help
Jeezri
2015-11-05
2015-11-06
  • Jeezri

    Jeezri - 2015-11-05

    Hi,

    I'm working on a school project and I want to use OpenCCG. I've got everything up and running and I've been able to run all the examples/test cases. However, for my final project in school I want to research ambiguities in natural language and I've been wondering how I would have to tweak OpenCCG's settings in order for it to produce more than one parse for ambiguous sentences.

    That is, for example, for the classic 'the kids saw the man with the telescope' I only get one lf (logical form?) but there are more. How would I be able to get the additional/extra readings?

    Sorry if this is totally obvious and easy, but Java is still kind of new to me and my teacher hasn't been able to help me.

     
  • Michael White

    Michael White - 2015-11-05

    Hello Jeezri

    I'm not sure from your message how exactly you're running the parser, but in any case you need to ensure that the preference settings are set in a way that will return multiple parses. If you run tccg then enter :sh, it will show you the current settings, and :h will show the commands for changing them. The :all command shows all parse results in tccg, and the :ppv N command sets the parser pruning value to N when running the statistical parser (where N must be greater than 1 to get multiple parses back). These preference settings persist across tools, so you should change the settings in tccg and try again.

    Mike

     
  • Jeezri

    Jeezri - 2015-11-05

    Hello!

    Thank you very much for your quick reply. Currently I'm running the parser by invoking it through the build-ps.xml file on novel data (in the ccgbank directory). Does this use the settings of tccg as well?

    I am able to get multiple parses by modifying the settings in tccg and invoking tccg in a directory with a grammar.xml, but I still just get one parse when running ccg-build build-ps.xml on novel data.

    Again, thank you very much!

     
  • Michael White

    Michael White - 2015-11-05

    Ah, when parsing novel text the preferences are set according to what's in ccgbank/models/parser/parse.prefs, overriding whatever might've been set using tccg interactively. But unless you've changed that file, it should have a parser pruning value of 7. Looking at src/opennlp/ccg/Parse.java, I see that what you need to do is use the -nbestListSize N switch with N > 1.

      <target name="test-parser-novel" depends="check-test-parser-novel" unless="check-test-parser-novel.uptodate">
        <echo>Loading parse.prefs</echo>
        <java classname="opennlp.ccg.TextCCG">
          <arg value="-importprefs"/> <arg value="${parser.models.dir}/parse.prefs"/>
        </java>
        <echo>Parsing ${novel.file}.dir/nertext-nolabs to ${novel.file}.dir/tb.xml</echo>
        <java classname="opennlp.ccg.Parse" output="${novel.file}.dir/parse.log">
          <arg value="-g"/> <arg value="${novel.file}.dir/extract/grammar.xml"/>
          <arg value="-stconfig"/> <arg value="${supertagger.models.dir}/st.config"/>
          <arg value="-parsescorer"/> <arg value="plugins.MyGenSynScorer"/>
          <arg value="${novel.file}.dir/nertext-nolabs"/>
          <arg value="${novel.file}.dir/tb.xml"/>
        </java>
      </target>
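
    For illustration, the switch could be passed to the Parse task roughly as follows. This is only a sketch and is not copied from build-ps.xml: the other arguments are taken from the target above, and it assumes a property named nbest.list.size is available to the build (the value 10 is an arbitrary example).

```xml
<!-- Sketch only: the opennlp.ccg.Parse call from the target above, with -nbestListSize added. -->
<!-- Assumption: nbest.list.size is defined like this somewhere in the build file; 10 is an arbitrary value > 1. -->
<property name="nbest.list.size" value="10"/>
<java classname="opennlp.ccg.Parse" output="${novel.file}.dir/parse.log">
  <arg value="-g"/> <arg value="${novel.file}.dir/extract/grammar.xml"/>
  <arg value="-stconfig"/> <arg value="${supertagger.models.dir}/st.config"/>
  <arg value="-parsescorer"/> <arg value="plugins.MyGenSynScorer"/>
  <!-- Request an n-best list instead of only the single best parse -->
  <arg value="-nbestListSize"/> <arg value="${nbest.list.size}"/>
  <arg value="${novel.file}.dir/nertext-nolabs"/>
  <arg value="${novel.file}.dir/tb.xml"/>
</java>
```

    Since Ant gives properties set on the command line precedence over those in the build file, passing -Dnbest.list.size=N alongside -Dnovel.file should override the example value, provided the build file actually uses a property with that name.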
    
     
  • Jeezri

    Jeezri - 2015-11-05

    Sadly, even when changing <arg value="-nbestListSize"/> <arg value="${nbest.list.size}"/> to something like <arg value="-nbestListSize"/> <arg value="100"/> (or the corresponding value in parse.prefs) I still only get numOfParses=1 in tb.xml for something like 'the kids saw the man with the telescope'. Am I thinking about this all wrong, and does this have nothing to do with the output that is written to tb.xml? Is it correct to just change the value and then re-run 'ccg-build -Dnovel.file=PATH -f build-ps.xml'?

    Sorry for wasting your time but, again, thank you very much for your help and time.

     
  • Michael White

    Michael White - 2015-11-05

    Ah, looking at this again my guess is that you're not running it with the right build target. Try:

    ccg-build -Dnovel.file=data/novel/ambiguous -f build-ps.xml test-parser-novel-nbest &> logs/log.ps.test.novel.ambig.nbest
    

    where data/novel/ambiguous contains:

    Students who procrastinate often fail.
    I saw a mouse on the table with hairy legs.
    

    Note that something seems to be going wrong in my version with duplicate detection, but you should see some distinct parses with different attachments of 'often' and 'with'.

     
  • Jeezri

    Jeezri - 2015-11-06

    Ah, what an embarrassing oversight! That works perfectly, thank you very much!

     
