Parse an input file?

Help
sracioppa
2016-01-21
2016-01-22
  • sracioppa

    sracioppa - 2016-01-21

    Hello,
    Using tccg, I can parse a single example sentence and display its semantics, and :2xml saves that single example to XML. Is it possible to use a list of sentences (e.g. a testbed) as input and save all of the parsed semantics to a file? If so, how?
    Thank you!

     
  • Michael White

    Michael White - 2016-01-21

    Hi

    There is a program that parses a file with one sentence per line and outputs a testbed file: https://github.com/OpenCCG/openccg/blob/master/src/opennlp/ccg/Parse.java

    At present it seems to be missing a command-line script in $OPENCCG_HOME/bin/, but it would be easy to add ccg-parse (and ccg-parse.bat) scripts to run it. If you create these, you could even issue a pull request on GitHub to add them to the current dev version.

     
  • sracioppa

    sracioppa - 2016-01-22

    Hi Michael, thank you!
    By "output a testbed", do you mean that I could save the semantic output of every input line to a file?

     
  • Michael White

    Michael White - 2016-01-22

    Right, it saves the LF output of each parse to a testbed file.

    I've added ccg-parse scripts to the master branch on GitHub, but I'm reminded that the program is currently set up in a way that requires a supertagging and parsing model. As such, it's easier to use the ant-based script for parsing novel text, which also does the necessary pre-processing; it is described in docs/ccgbank-README in the section "Using the pre-built English models". Otherwise, if you're not using a statistical model, it will perhaps make sense to write a simpler version of the program in src/opennlp/ccg/Parse.java.
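
    As a rough illustration of what such a simpler version might look like, here is a minimal sketch of the outer loop only: read one sentence per line, parse each, and collect the results. It deliberately does not call any OpenCCG classes. The class name BatchParseSketch, the method parseAll, and the parseToLf function are all hypothetical placeholders; parseToLf stands in for whatever grammar-only parsing call you would wire in, and the tab-separated output is an assumption for illustration, not the real testbed XML format.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: batch-parse a file with one sentence per line.
// Nothing here is part of OpenCCG; the parser is passed in as a function.
class BatchParseSketch {

    /**
     * Applies parseToLf to every non-blank line and returns one
     * "sentence<TAB>LF" record per sentence. A sentence whose parse
     * throws is recorded as a failure instead of aborting the batch.
     */
    public static List<String> parseAll(List<String> lines,
                                        Function<String, String> parseToLf) {
        List<String> records = new ArrayList<>();
        for (String line : lines) {
            String sentence = line.trim();
            if (sentence.isEmpty()) {
                continue; // skip blank lines in the input list
            }
            String lf;
            try {
                lf = parseToLf.apply(sentence);
            } catch (RuntimeException e) {
                lf = "PARSE_FAILED: " + e.getMessage();
            }
            records.add(sentence + "\t" + lf);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: BatchParseSketch <sentences.txt> <output.txt>");
            System.exit(1);
        }
        List<String> lines =
            Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);

        // Placeholder only: replace with a real call into your parser that
        // returns the logical form (LF) of the best parse as a string.
        Function<String, String> parseToLf = s -> "<LF for: " + s + ">";

        Files.write(Paths.get(args[1]), parseAll(lines, parseToLf),
                    StandardCharsets.UTF_8);
    }
}
```

    Passing the parser in as a function keeps the file handling separate from the grammar code, so the loop can be tried out before deciding how to produce and serialize the LFs.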

     
