Anusaaraka Git
Status: Beta
Brought to you by:
sriramchaudhury
1. Prerequisites:
English-Hindi Anusaaraka requires the following resources.
-- Java 1.6.0
-- perl
-- python
-- flex
-- apertium (with lttoolbox 3.0.4 or above)
-- xsltproc
-- gdbm
-- gcc
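A quick check that the command-line tools listed above are on your PATH can be sketched as follows. The helper name missing_tools is hypothetical, and the command names are assumptions (lt-comp is the compiler that ships with lttoolbox). gdbm is a library rather than a command, so it is not checked here, and the sketch does not verify version numbers.

```shell
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of Anusaaraka.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    return 0
}

# Command names assumed for the prerequisites listed above.
missing_tools java perl python flex apertium lt-comp xsltproc gcc
```

If nothing is printed, all of the named commands were found.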
--------------------------------------------------------------------------------------------------------------
2. Install:
To install anu_testing from the source code, run
tar -xvzf <file name> (Ex : tar -xvzf anu_testing.tgz)
Add the following lines to your ~/.bash_profile (or) ~/.bashrc
==============================================================
export HOME_anu_test=path_of_anu_installation
export PATH=$HOME_anu_test/bin:$PATH
export HOME_anu_tmp=path_for_tmp_dir
export HOME_anu_provisional_wsd_rules=path_for_user_wsd_rules(provisional_wsd_rules)
(Note : provisional_wsd_rules is distributed along with anu_testing. Move this directory wherever you want
and set HOME_anu_provisional_wsd_rules to its path.)
export LD_LIBRARY_PATH=/usr/local/lib/ (for shared libraries)
Ex:
==
export HOME_anu_test=$HOME/svn/14-05-09/anu_testing
export PATH=$HOME_anu_test/bin:$PATH
export HOME_anu_tmp=$HOME/tmp_anu_dir
export HOME_anu_provisional_wsd_rules=$HOME/svn/09-05-09/provisional_wsd_rules
export LD_LIBRARY_PATH=/usr/local/lib/
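After editing your profile and opening a new shell, you may want to confirm that the variables are actually set. The sketch below is one way to do that; unset_anu_vars is a hypothetical helper, and it only checks the three HOME_anu_* variables named above.

```shell
# Print the name of each Anusaaraka variable that is unset or empty.
# unset_anu_vars is a hypothetical helper, not part of the distribution.
unset_anu_vars() {
    for name in HOME_anu_test HOME_anu_tmp HOME_anu_provisional_wsd_rules; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
    return 0
}

unset_anu_vars
```

No output means all three variables have a value.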
compile:
========
cd anu_testing
shell_scripts/anu_compile.sh
Note : Run shell_scripts/remove_out-files.sh before recompiling.
----------------------------------------------------------------------------------------------------------------
3. To remove out files
shell_scripts/remove_out-files.sh
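Since the out files have to be removed before recompiling, the two steps can be combined. This is a minimal sketch, assuming both scripts are run from the top of the installation directory as in the commands above; anu_rebuild is a hypothetical wrapper, not a script shipped with Anusaaraka.

```shell
# Remove old out files, then recompile; stop if the cleanup step fails.
# anu_rebuild is a hypothetical wrapper around the two scripts named in
# this README. The body runs in a subshell, so your current directory
# is left unchanged.
anu_rebuild() (
    cd "${1:-$HOME_anu_test}" || exit 1
    shell_scripts/remove_out-files.sh && shell_scripts/anu_compile.sh
)
```

Called without an argument, anu_rebuild uses $HOME_anu_test as the installation directory.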
-----------------------------------------------------------------------------------------------------------------
4. Run:
A sample input file is given below:
<TITLE> test. </TITLE>
<p>
This is a sample file for anusaaraka.
</p>
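To try this out, the sample above can be written to a file from the shell. The sketch below does only that; write_sample_input is a hypothetical helper and the file name you pass to it is your choice.

```shell
# Write the sample input shown above to the given path.
# write_sample_input is a hypothetical helper; it is not part of Anusaaraka.
write_sample_input() {
    cat > "$1" <<'EOF'
<TITLE> test. </TITLE>
<p>
This is a sample file for anusaaraka.
</p>
EOF
}

# Example: write_sample_input test
```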
Command to run
--------------
Anusaaraka.sh <file_name> Ex : Anusaaraka.sh verified
To run a sentence for a particular linkage :
--------------------------------------------
Anusaaraka.sh <file_name> <linkage_number>
Ex : Anusaaraka.sh test 2 --- runs the file "test" for the second linkage.
-------------------------------------------------------------------------------------------------------------------
5. Output :
A "tmp" directory is created under $HOME_anu_tmp for the given input file. The facts for each sentence are stored in
that sentence's own directory. (Ex: for the first sentence of the second paragraph, the tmp/inputfile_tmp/2.1 dir )
To view the facts :
$HOME_anu_tmp/tmp/<file_name_tmp>/<sentence_no>/all_facts Ex: $HOME_anu_tmp/tmp/verified_tmp/2.1/all_facts
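The path layout above can be wrapped in a small function so that you only supply the file name and the sentence number. This is a sketch that follows the pattern shown in the example and nothing more; anu_facts_path is a hypothetical helper.

```shell
# Build the path of the all_facts file for one sentence, following the
# layout $HOME_anu_tmp/tmp/<file_name>_tmp/<sentence_no>/all_facts.
# anu_facts_path is a hypothetical helper, not part of Anusaaraka.
anu_facts_path() {
    echo "$HOME_anu_tmp/tmp/${1}_tmp/$2/all_facts"
}

# Example: less "$(anu_facts_path verified 2.1)"
```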
To view the output, open the html file in a browser with the command :
firefox <filename.html>
Ex: firefox verified.html (the html o/p is written to the directory from which you ran Anusaaraka.sh)
-------------------------------------------------------------------------------------------------------------------
6. To view the linkage diagram
To run the program, issue the Unix command:
$ anu_link-parser.sh
The program can run in batch mode for testing the system on a large
number of sentences. The following command runs the parser on
a file.
$ anu_link-parser.sh < <input_file>
-------------------------------------------------------------------------------------------------------------------
Categories: The categories.txt file contains category information in the following format:
<Long Form> <POS Tag> <Crude Category>
Punctuation: The following words are used for the respective punctuation marks:
, comma
. full_stop
; semi_colon
: colon
' single_quote
" double_quote
? question_mark
! exclamation
'' 2_single_quotes
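The table above can be expressed as a lookup function, which may be handy when preparing or checking files by script. This is only an illustration of the mapping; punct_name is a hypothetical helper and is not how Anusaaraka itself performs the substitution.

```shell
# Print the word used for a punctuation mark, as listed in the table above.
# punct_name is a hypothetical helper; it prints nothing for any other input.
punct_name() {
    case "$1" in
        ,)    echo comma ;;
        .)    echo full_stop ;;
        \;)   echo semi_colon ;;
        :)    echo colon ;;
        "'")  echo single_quote ;;
        '"')  echo double_quote ;;
        \?)   echo question_mark ;;
        '!')  echo exclamation ;;
        "''") echo 2_single_quotes ;;
    esac
}

# Example: punct_name ';'
```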
--------------------------------------------------------------------------------------------------------------------
The sentences with correct translations are in the $HOME_anu_test/verified file.
--------------------------------------------------------------------------------------------------------------------