Java exception when loading grammar with tccg

Help
Zhao Meng
2016-07-20
2016-07-20
  • Zhao Meng

    Zhao Meng - 2016-07-20

    Sorry to bother you again.

    I was trying to load the openccgbank grammar into tccg, but I got the following error message:

    Loading grammar from URL: file:/home/zhao/Desktop/test/grammar.xml
    Skipping family: np_~1\np_1/*punct[,]/*
    java.util.NoSuchElementException
    Loading supercat combos from /home/zhao/Desktop/test/info/combos-train
    Grammar 'openccgbank' loaded.
    
    Enter strings to parse.
    Type ':r' to realize selected reading of previous parse.
    Type ':h' for help on display options and ':q' to quit.
    You can use the tab key for command completion, 
    Ctrl-P (prev) and Ctrl-N (next) to access the command history, 
    and emacs-style control keys to edit the line.
    

    However, even though there was a Java exception, tccg seemed to work fine.

    BTW, I want to build a dialogue system using OpenCCG. Is it possible to give OpenCCG just a few essential words and have it output a full sentence? I tried this with the openccgbank grammar as follows:

    First, I generate an XML file describing the LF of a sentence using tccg's 2xml command.

    Then, I delete some words from the resulting XML file (such as "this", "is", etc.).

    Next, I use ccg-realize to realize it. (At this point the XML file only contains essential words such as "food" and "good".)

    (By deleting words, I mean deleting elements like the following, taking "this" as an example:

          <rel name="Arg0">
            <node id="w0" pred="this"/>
          </rel>
    

    )
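
    For reference, the deletion step described above could be scripted roughly as follows. This is only a minimal sketch: the surrounding `<lf>` wrapper and the sample LF are hypothetical (only the `<rel>`/`<node>` fragment comes from the post), `drop_preds` is an illustrative helper and not part of OpenCCG, and it reproduces the editing step only; it does not make the pruned LF realizable.

```python
import xml.etree.ElementTree as ET

# Hypothetical LF, modeled on the <rel>/<node> fragment quoted in the post.
# Real tccg output may be structured differently.
SAMPLE_LF = """
<lf>
  <node id="w1" pred="be">
    <rel name="Arg0">
      <node id="w0" pred="this"/>
    </rel>
    <rel name="Arg1">
      <node id="w3" pred="food">
        <rel name="Mod">
          <node id="w2" pred="good"/>
        </rel>
      </node>
    </rel>
  </node>
</lf>
"""


def drop_preds(root, preds):
    """Remove every <rel> whose direct <node> child has a pred in `preds`.

    Returns the list of predicates that were removed, in document order.
    """
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    removed = []
    for rel in list(root.iter("rel")):
        node = rel.find("node")
        if node is not None and node.get("pred") in preds:
            parent = parent_of.get(rel)
            if parent is not None:
                parent.remove(rel)
                removed.append(node.get("pred"))
    return removed


root = ET.fromstring(SAMPLE_LF)
removed = drop_preds(root, {"this"})
remaining = [n.get("pred") for n in root.iter("node")]
print(removed, remaining)
print(ET.tostring(root, encoding="unicode"))
```

    Removing a `<rel>` this way also removes everything nested under it, so the head predicate (here "be") is left without one of its arguments.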

    But this approach doesn't seem to work. The realizer either throws an exception or outputs sentences that do not make sense.

    So does this mean I need a content planner that can produce an LF representing the whole sentence I intend to realize?

     

    Last edit: Zhao Meng 2016-07-20
  • Michael White

    Michael White - 2016-07-20

    The warning with the Java exception represents cruft that can be ignored. The broad coverage grammar is extracted from the CCGbank in a way that yields some ill-formed categories; in principle the grammar extraction process could be refined to avoid yielding these categories but that has not been a priority.

    The realizer is essentially designed to produce strings that result from reversing the derivations used in parsing. Thus the realizer won't work unless it is given logical forms that are the same as the parser would produce. These are typically produced by a content (or text) planner and a sentence planner. See this CL article (http://aclweb.org/anthology/J/J10/J10-2001.pdf) for an example that shows how this can be done at an overview level (though note it's not necessary to deal with prosody).

    Previous dialogue systems using OpenCCG have used small, hand-crafted grammars. Using the broad coverage grammar in a dialogue system may be a bit slow. If you keep working in this direction, it may make sense to strip the grammar down to just what's needed to cover a set of representative examples. There are no scripts to do this at present, but in principle it should be a straightforward programming task.

    OpenCCG is designed to support experimentation with precise grammars capable of parsing and generating complex texts with high quality. For many dialogue system needs, grammar-free methods may be adequate, especially when there is little structure to the desired texts beyond a conjunction of simple facts. See for example the neural net approaches from Steve Young's lab (http://mi.eng.cam.ac.uk/~sjy/).

     
    • Zhao Meng

      Zhao Meng - 2016-07-20

      Many thanks~

       
