Name                                                   Modified    Size      Downloads / Week
shortcuts                                              2011-12-03            0
ReadMe.txt                                             2012-03-18  5.8 kB    0
standford-pos_model.zip                                2012-03-18  10.2 MB   1
sweetonionCCG2PTBconverter_v0.1.3.zip                  2012-03-18  2.7 MB    1
ccg2ptb_model_v0.1.3.zip                               2012-03-18  4.3 MB    4
readme.txt                                             2011-12-21  3.8 kB    0
sweetonionCCG2PTBconverter_linux_64_bin_v0.1.2.tar.gz  2011-12-21  2.5 MB    0
standford-pos_model.tar.gz                             2011-12-21  12.9 MB   0
ccg2ptb_model_v0.1.2.tar.gz                            2011-12-21  3.5 MB    0
sweetonionCCG2PTBconverter_win7_64_bin_v0.1.2.zip      2011-12-21  2.4 MB    0
sweetonionCCG2PTBconverter_win7_32_bin_v0.1.2.zip      2011-12-21  2.4 MB    0
ccg2ptb_model_v0.1.2.zip                               2011-12-21  3.5 MB    0
sweetonionCCG2PTBconverter_win7_32_bin_v0.1.zip        2011-12-04  18.8 MB   0
sweetonionCCG2PTBconverter_win7_64_bin_v0.1.zip        2011-12-04  18.8 MB   0
sweetonionCCG2PTBconverter_linux_64_bin_v0.1.tar.gz    2011-12-04  18.9 MB   0
Totals: 15 Items                                                   101.0 MB  6
SweetOnionCCG2PTBConverter: A tool that converts CCG derivations to PTB trees

Menu:
1 Change Log
2 Summary
3 Features
4 System requirements
5 System tested on
6 Sample input
7 Install
8 Run
9 Source Code & License
-------------------------------------------------------------------------------------------------------------------
Change Log:
0.1.0
0.1.1 Correct command-line instructions.
0.1.2 Put the models outside the package.
0.1.3 Add validation that internal nodes are not assigned POS tags.
      Replace the OpenNLP source code with the OpenNLP jars.
      Further development.
-------------------------------------------------------------------------------------------------------------------
Summary:
This tool converts CCG derivations to PTB trees using a maximum entropy model implemented with OpenNLP, and can also visualize the tree graphs. The main technical contribution is an effective conversion method that achieves an F-score over 95%.
-------------------------------------------------------------------------------------------------------------------
Features:
Visualize PTB trees.
Visualize CCG derivations and the converted PTB results for the different POS tags used.
Evaluate the converted results with the EVALB evaluation script.
-------------------------------------------------------------------------------------------------------------------
System requirements:
Linux
Windows
-------------------------------------------------------------------------------------------------------------------
Successfully tested on:
Linux: Ubuntu 2.6.32-24-generic (64bit)
Linux: Ubuntu 2.6.32-26-generic (32bit)
Windows: 7 (64bit)
Windows: 7 (32bit)
-------------------------------------------------------------------------------------------------------------------
Sample input:
CCG derivations and PTB trees, one sentence per line.
The data folder includes 499 sentences that you can try.

Sample input CCG:
{S[dcl] {S[dcl] {NP {NP {NP {NP {N {N/N Pierre}{N Vinken}}}{, ,}}{NP\NP {S[adj]\NP {NP {N {N/N 61}{N years}}}{(S[adj]\NP)\NP old}}}}{, ,}}{S[dcl]\NP {(S[dcl]\NP)/(S[b]\NP) will}{S[b]\NP {S[b]\NP {(S[b]\NP)/PP {((S[b]\NP)/PP)/NP join}{NP {NP[nb]/N the}{N board}}}{PP {PP/NP as}{NP {NP[nb]/N a}{N {N/N nonexecutive}{N director}}}}}{(S\NP)\(S\NP) {((S\NP)\(S\NP))/N[num] Nov.}{N[num] 29}}}}}{. .}}

Sample input PTB:
(S (NP-SBJ (NP (NNP Pierre) (NNP Vinken) )(, ,) (ADJP (NP (CD 61) (NNS years) )(JJ old) )(, ,) )(VP (MD will) (VP (VB join) (NP (DT the) (NN board) )(PP-CLR (IN as) (NP (DT a) (JJ nonexecutive) (NN director) ))(NP-TMP (NNP Nov.) (CD 29) )))(. .) )

Sample input PTB pos:
Pierre/NNP Vinken/NNP ,/, 61/CD years/NNS old/JJ ,/, will/MD join/VB the/DT board/NN as/IN a/DT nonexecutive/JJ director/NN Nov./NNP 29/CD ./.
-------------------------------------------------------------------------------------------------------------------
Install:
On Linux (Ubuntu):
Install JDK 1.6 or later.
Install Graphviz with "sudo apt-get install graphviz".
Download sweetonionCCG2PTBconverter_v0.1.3.zip.
Download standford-pos_model.zip and extract the POS tagging models into the folder stanford-pos under the models directory.
Download ccg2ptb_model_v0.1.3.zip and extract the CCG2PTB models into the folder ccg2ptb under the models directory.

On Windows (7):
Install JDK 1.6 or later.
Install Graphviz.
Download sweetonionCCG2PTBconverter_v0.1.3.zip.
Download standford-pos_model.zip and extract the POS tagging models into the folder stanford-pos under the models directory.
Download ccg2ptb_model_v0.1.3.zip and extract the CCG2PTB models into the folder ccg2ptb under the models directory.

Graphviz is used to visualize the tree graphs; it is not needed if you do not use the GUI. You can train and test from the command line, as described in the Run section.
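The install steps above can be sanity-checked with a short script. The following is a minimal Python sketch, not part of the tool: the function name missing_items and the idea of looking for java and Graphviz's dot on the PATH are assumptions made for illustration. Only the two model folder names come from the install steps.

```python
import shutil
from pathlib import Path

# Folder names taken from the install steps; everything else here is illustrative.
REQUIRED_DIRS = ("models/stanford-pos", "models/ccg2ptb")


def missing_items(root, need_gui=True, which=shutil.which):
    """Return a list of human-readable problems found under `root`."""
    problems = []
    root = Path(root)
    for rel in REQUIRED_DIRS:
        folder = root / rel
        # A model folder counts as installed only if it exists and is non-empty.
        if not folder.is_dir() or not any(folder.iterdir()):
            problems.append(f"missing or empty folder: {rel}")
    # Assumption: a usable JDK exposes `java` on the PATH.
    if which("java") is None:
        problems.append("java not found on PATH")
    # Graphviz (its `dot` executable) is only needed for the GUI visualization.
    if need_gui and which("dot") is None:
        problems.append("Graphviz 'dot' not found on PATH")
    return problems


if __name__ == "__main__":
    for line in missing_items(".") or ["layout looks complete"]:
        print(line)
```

Run it from the extracted package folder. Pass need_gui=False to skip the Graphviz check when only the command-line mode is used.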
-------------------------------------------------------------------------------------------------------------------
Run:
Runnable jars are in the folder. First, cd to that folder.

GUI:
java -Xmx1600m -jar sweetonionccg2ptb.jar

Command line:
Train:
java -cp sweetonionccg2ptb.jar integration.CCG2PTBConverter -trainccg ccgtrainfilename -trainptb ptbtrainfilename -trainpos ptbtrainfilePOS -model1 models/ccg2ptb/m1-4-200-lapos-valid-all.bin -model2 models/ccg2ptb/m2-4-200-7-300-lapos-valid-all.bin -cutoff1 4 -cutoff2 7 -iteration1 200 -iteration2 300

Test:
java -cp sweetonionccg2ptb.jar integration.CCG2PTBConverter -testccg data/ccg_sample_test -testpos data/ptb_sample_test_lapos_auto -model1 models/ccg2ptb/m1-4-200-lapos-valid-all.bin -model2 models/ccg2ptb/m2-4-200-7-300-lapos-valid-all.bin -resultfile results/rs_lapos_4_200_7_300_sampletest

Evaluate:
Use the EVALB evaluation script.
Windows:
> evalb.exe -p COLLINS.prm data/ptb_sample_test results/rs_lapos_4_200_7_300_sampletest
Linux:
> ./evalb -p COLLINS.prm data/ptb_sample_test results/rs_lapos_4_200_7_300_sampletest

Add "-posSplit yourSeparator" if your POS file does not use / as the separator.
-------------------------------------------------------------------------------------------------------------------
Source Code:
Source code is in the src folder. You should include lib in your build path.

License:
The software is available for non-commercial purposes under the Apache License, Version 2.0.
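
Example: checking that a POS file matches its PTB trees

Since the POS input uses a word/TAG format with a configurable separator (see the -posSplit note under Run), it can help to confirm that a POS line lines up with the leaves of the corresponding PTB tree before training. The following is a minimal Python sketch under stated assumptions, not code from this tool: the names LEAF, ptb_leaves and pos_tokens are hypothetical, and the regular expression assumes that every leaf has the form "(TAG word)" with no brackets inside the word.

```python
import re

# Matches a PTB leaf such as "(NNP Pierre)" or "(, ,)": a tag, one space, a word.
# Internal nodes never match because their label is followed by "(".
LEAF = re.compile(r"\(([^\s()]+) ([^\s()]+)\)")


def ptb_leaves(tree_line):
    """Return (word, tag) pairs in sentence order from one bracketed PTB line."""
    return [(word, tag) for tag, word in LEAF.findall(tree_line)]


def pos_tokens(pos_line, sep="/"):
    """Return (word, tag) pairs from one 'word<sep>TAG' line.

    Splitting on the last separator keeps words that contain it (e.g. "1/2") intact.
    """
    return [tuple(tok.rsplit(sep, 1)) for tok in pos_line.split()]


# Fragment of the sample sentence from the Sample input section.
TREE = "(NP-SBJ (NP (NNP Pierre) (NNP Vinken) )(, ,) (ADJP (NP (CD 61) (NNS years) )(JJ old) )(, ,) )"
POS = "Pierre/NNP Vinken/NNP ,/, 61/CD years/NNS old/JJ ,/,"

if __name__ == "__main__":
    print(ptb_leaves(TREE) == pos_tokens(POS))
```

For a POS file that uses another separator, pass it as sep, for example pos_tokens(line, sep="_"), mirroring the value given to -posSplit.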
Source: ReadMe.txt, updated 2012-03-18
