Anusaaraka Git
Status: Beta
Brought to you by:
sriramchaudhury
1. Pre requisites: English-Hindi Anusaaraka requires the following resources. -- Java 1.6.0 -- perl, -- python, -- flex, -- apertium (with lttoolbox 3.0.4 or above) -- xsltproc -- gdbm, and -- gcc -------------------------------------------------------------------------------------------------------------- 2. Install: To install anu_testing from the source code, run tar -xvzf <file name> (Ex : tar -xvzf anu_testing.tgz) Add following line to your ~/.bash_profile (or) ~/.bashrc ========================================================= export HOME_anu_test=path_of_anu_installation export PATH=$HOME_anu_test/bin:$PATH export HOME_anu_tmp=path_for_tmp_dir export HOME_anu_provisional_wsd_rules=path_for_user_wsd_rules(provisional_wsd_rules) (Note : Along with anu_testing we are giving provisional_wsd_rules. Move this directory wherever you want and set that path for provisional_wsd_rules) export LD_LIBRARY_PATH=/usr/local/lib/ ( for Shared libraries ) Ex: == export HOME_anu_test=$HOME/svn/14-05-09/anu_testing export PATH=$HOME_anu_test/bin:$PATH export HOME_anu_tmp=$HOME/tmp_anu_dir export HOME_anu_provisional_wsd_rules=$HOME/svn/09-05-09/provisional_wsd_rules export LD_LIBRARY_PATH=/usr/local/lib/ compile: ======== cd anu_testing shell_scripts/anu_compile.sh Note :: Please run remove_out-files.sh before recompiling. ---------------------------------------------------------------------------------------------------------------- 3. To remove out files shell_scripts/remove_out-files.sh ----------------------------------------------------------------------------------------------------------------- 4. Run: sample input file is given below: <TITLE> test. </TITLE> <p> This is a sample file for anusaaraka. </p> Command to run -------------- Anusaaraka.sh <file_name> Ex : Anusaaraka.sh verified To run the sentence for partucular linkage : ------------------------------------------- Anusaaraka.sh <file_name> <linkage_number> Ex : Anusaaraka.sh test 2 --- running test file for second linkage. ------------------------------------------------------------------------------------------------------------------- 5. Output : The "tmp" directory is created for given input file. The facts for the particular sentences are stored in their respective directories. (Ex: for second paragraph first sentence tmp/inputfile_tmp/2.1 dir ) To view the facts : $HOME_anu_tmp/tmp/<file_name_tmp>/<sentence_no>/all_facts Ex: $HOME_anu_tmp/tmp/verified_tmp/2.1/all_facts To view the output, open html file through a browser by the command : firefox <filename.html> Ex: firefox verified.html (html o/p can be viewed in the directory where you have run) ------------------------------------------------------------------------------------------------------------------- 6. To view the linkage diagram To run the program issue the Unix command: $ anu_link-parser.sh The program can run in batch mode for testing the system on a large number of sentences. The following command runs the parser on a file. $ anu_link-parser.sh < <input_file> ------------------------------------------------------------------------------------------------------------------- Categories: The categories.txt file contains category information in the following format: <Long Form> <POS Tag> <Crude Category> Punctuation: The following words are used for the respective punctuations: , comma . full_stop ; semi_colon : colon ' single_quote " double_quote ? question_mark ! exclamation '' 2_single_quotes -------------------------------------------------------------------------------------------------------------------- The sentences with correct translation are in $HOME_anu_test/verified file . --------------------------------------------------------------------------------------------------------------------