An end-to-end system that catalogs and audits the flow and transformation of textual data through a series of stages, producing linguistic corpora of parallel text phrases for use in training a statistical machine translation system.
v1.0 (alpha)
Initial release of Corpus Processor.

v1.0.1 (alpha) 2008-12-16
1) Added "if not exist "%~dp0\~datetime.tmp" goto :TERMINATE" to the functional loop in all process.cmd files. This lets users finish processing the current file and shut down the process before a new file starts, simply by deleting the ~datetime.tmp file.
2) Replaced the "net use" function with the "pushd" and "popd" functions and eliminated the drive letter command line parameter.
3) Deleted all .lnk shortcuts and pointed all scheduled tasks at the process.cmd file. The .lnk files had the advantage that the command prompt title bar showed the name of the .lnk file (corresponding to the stage). However, they prevented the scheduler from showing the status of the scheduled task. Corpus Processor is intended for unattended operation when no one is logged in, so monitoring scheduled task status matters more than seeing the stage in the command prompt title bar.
4) Changed the ERRORLEVEL setting to accommodate the new evaluation version of TextPipe.
5) Minor changes to other files.
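The stop-file check described in item 1 of the v1.0.1 notes can be illustrated with a minimal sketch. This is not the project's code (Corpus Processor uses process.cmd batch files); it is a Python rendering of the same pattern, and the function name `process_until_stopped` and its parameters are assumptions made up for this illustration.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def process_until_stopped(
    files: Iterable[Path],
    sentinel: Path,
    handle: Callable[[Path], None],
) -> List[Path]:
    """Process files one at a time while the sentinel file exists.

    The sentinel is checked only between files, so a file that is
    already being processed always finishes before the loop ends.
    (Hypothetical helper; not part of Corpus Processor.)
    """
    done: List[Path] = []
    for path in files:
        # Same idea as the batch check:
        #   if not exist "...~datetime.tmp" goto :TERMINATE
        if not sentinel.exists():
            break
        handle(path)
        done.append(path)
    return done
```

Deleting the sentinel while a file is being handled does not interrupt that file; the loop simply stops before picking up the next one, which matches the behavior the release note describes.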
Copyright © 2009 Geeknet, Inc. All rights reserved.