| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.rst | 2015-08-26 | 9.9 kB | |
| Totals: 1 Item | 9.9 kB | 0 |
python-il-parser
Parser for Indian Languages
Installation
Dependencies
python-il-parser requires NumPy, GraphViz and PyDot.
On Ubuntu, the dependencies can be installed with:

    sudo apt-get install python-numpy
    sudo apt-get install graphviz
    sudo apt-get install python-pydot
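Before building the parser, it can help to confirm that the dependencies are actually visible to the interpreter. The following is a minimal sketch, not part of python-il-parser: the helper name `missing_dependencies` is hypothetical, it assumes NumPy and PyDot are importable as `numpy` and `pydot`, that Graphviz is detected through its `dot` executable, and that it runs under Python 3.

```python
import importlib.util
import shutil


def missing_dependencies(modules=("numpy", "pydot"), executables=("dot",)):
    """Return the names of modules and executables that cannot be found.

    Hypothetical helper for illustration; it only looks the names up and
    does not import or run anything.
    """
    # find_spec returns None when a top-level module is not installed.
    missing = [name for name in modules
               if importlib.util.find_spec(name) is None]
    # shutil.which returns None when an executable is not on PATH.
    missing += [name for name in executables if shutil.which(name) is None]
    return missing
```

Calling `missing_dependencies()` returns an empty list when everything is found; any names it returns point to the package that still needs to be installed.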
Download
Download python-il-parser from SourceForge.
Install
    tar -xvzf python-il-parser.tar.gz
    cd python-il-parser
    gunzip models/*
    sudo python setup.py install
Example
>>> from ilparser import ilparser
>>> with open('tests/sample.conll') as fp:
...     sentences = fp.read()
...
>>> print(sentences)
1 इसके यह pn PRP cat-pn|gen-any|num-sg|pers-3|case-o|vib-0_अतिरिक्त|tam-ke|chunkId-NP|chunkType-head|stype-|voicetype- _ _ _ _
2 अतिरिक्त अतिरिक्त psp PSP cat-psp|gen-|num-|pers-|case-|vib-|tam-|chunkId-NP|chunkType-child|stype-|voicetype- _ _ _ _
3 गुग्गुल गुग्गुल n NNPC cat-n|gen-m|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP2|chunkType-child|stype-|voicetype- _ _ _ _
4 कुंड कुंड n NNP cat-n|gen-m|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP2|chunkType-head|stype-|voicetype- _ _ _ _
5 , COMMA punc SYM cat-punc|gen-|num-|pers-|case-|vib-|tam-|chunkId-NP2|chunkType-child|stype-|voicetype- _ _ _ _
6 भीम भीम n NNPC cat-n|gen-m|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP3|chunkType-child|stype-|voicetype- _ _ _ _
7 गुफा गुफा n NNP cat-n|gen-f|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP3|chunkType-head|stype-|voicetype- _ _ _ _
8 तथा तथा avy CC cat-avy|gen-|num-|pers-|case-|vib-|tam-|chunkId-CCP|chunkType-head|stype-|voicetype- _ _ _ _
9 भीमशिला भीमशिला n NNP cat-n|gen-f|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP4|chunkType-head|stype-|voicetype- _ _ _ _
10 भी भी avy RP cat-avy|gen-|num-|pers-|case-|vib-|tam-|chunkId-NP4|chunkType-child|stype-|voicetype- _ _ _ _
11 दर्शनीय दर्शनीय adj JJ cat-adj|gen-any|num-any|pers-|case-d|vib-|tam-|chunkId-NP5|chunkType-child|stype-|voicetype- _ _ _ _
12 स्थल स्थल n NN cat-n|gen-m|num-pl|pers-3|case-d|vib-0|tam-0|chunkId-NP5|chunkType-head|stype-|voicetype- _ _ _ _
13 हैं है v VM cat-v|gen-any|num-pl|pers-3|case-|vib-है|tam-hE|chunkId-VGF|chunkType-head|stype-declarative|voicetype-active 0 root _ _
14 । । punc SYM cat-punc|gen-|num-|pers-|case-|vib-|tam-|chunkId-BLK|chunkType-head|stype-|voicetype- _ _ _ _

1 इसकी यह pn PRP cat-pn|gen-f|num-sg|pers-3|case-o|vib-का|tam-kA|chunkId-NP|chunkType-head|stype-|voicetype- _ _ _ _
2 ऊँचाई ऊँचाई n NN cat-n|gen-f|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP2|chunkType-head|stype-|voicetype- _ _ _ _
3 केवल केवल avy RP cat-avy|gen-|num-|pers-|case-|vib-|tam-|chunkId-NP3|chunkType-child|stype-|voicetype- _ _ _ _
4 1982 1982 num QC cat-num|gen-any|num-any|pers-|case-any|vib-|tam-|chunkId-NP3|chunkType-child|stype-|voicetype- _ _ _ _
5 मीटर मीटर n NN cat-n|gen-m|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP3|chunkType-head|stype-|voicetype- _ _ _ _
6 है है v VM cat-v|gen-any|num-sg|pers-3|case-|vib-है|tam-hE|chunkId-VGF|chunkType-head|stype-declarative|voicetype-active 0 root _ _
7 । । punc SYM cat-punc|gen-|num-|pers-|case-|vib-|tam-|chunkId-BLK|chunkType-head|stype-|voicetype- _ _ _ _
>>>
>>>
>>> parser = ilparser(out_dir="output-trees", plot=True)
>>> # plot: set to True to plot the output parse trees
... # out_dir: directory for the plotted trees; pass it when plot is True
... # the default plot directory is /home/user/output-trees
... # if the plot directory already exists, it is emptied before new plots are written to it,
... # so make sure it does not contain any important files
...
>>> parsed_sents = parser.getParse(sentences)
>>> print(parsed_sents)
1 इसके यह pn PRP case-o|vib-0_अतिरिक्त|cp-|psd-|cat-pn|pers-3|num-sg|stype-|voicetype-|tam-ke|sem-|chunkId-NP|gen-any|chunkType-head 13 k7p _ _
2 अतिरिक्त अतिरिक्त psp PSP case-|vib-|cp-|psd-|cat-psp|pers-|num-|stype-|voicetype-|tam-|sem-|chunkId-NP|gen-|chunkType-child 1 lwg__psp _ _
3 गुग्गुल गुग्गुल n NNPC case-d|vib-0|cp-|psd-|cat-n|pers-3|num-sg|stype-|voicetype-|tam-0|sem-|chunkId-NP2|gen-m|chunkType-child 4 pof__cn _ _
4 कुंड कुंड n NNP case-d|vib-0|cp-|psd-|cat-n|pers-3|num-sg|stype-|voicetype-|tam-0|sem-|chunkId-NP2|gen-m|chunkType-head 8 ccof _ _
5 , COMMA punc SYM case-|vib-|cp-|psd-|cat-punc|pers-|num-|stype-|voicetype-|tam-|sem-|chunkId-NP2|gen-|chunkType-child 4 rsym _ _
6 भीम भीम n NNPC case-d|vib-0|cp-|psd-|cat-n|pers-3|num-sg|stype-|voicetype-|tam-0|sem-|chunkId-NP3|gen-m|chunkType-child 7 pof__cn _ _
7 गुफा गुफा n NNP case-d|vib-0|cp-|psd-|cat-n|pers-3|num-sg|stype-|voicetype-|tam-0|sem-|chunkId-NP3|gen-f|chunkType-head 8 ccof _ _
8 तथा तथा avy CC case-|vib-|cp-|psd-|cat-avy|pers-|num-|stype-|voicetype-|tam-|sem-|chunkId-CCP|gen-|chunkType-head 12 nmod _ _
9 भीमशिला भीमशिला n NNP case-d|vib-0|cp-|psd-|cat-n|pers-3|num-sg|stype-|voicetype-|tam-0|sem-|chunkId-NP4|gen-f|chunkType-head 8 ccof _ _
10 भी भी avy RP case-|vib-|cp-|psd-|cat-avy|pers-|num-|stype-|voicetype-|tam-|sem-|chunkId-NP4|gen-|chunkType-child 9 lwg__rp _ _
11 दर्शनीय दर्शनीय adj JJ case-d|vib-|cp-|psd-|cat-adj|pers-|num-any|stype-|voicetype-|tam-|sem-|chunkId-NP5|gen-any|chunkType-child 12 nmod__adj _ _
12 स्थल स्थल n NN case-d|vib-0|cp-|psd-|cat-n|pers-3|num-pl|stype-|voicetype-|tam-0|sem-|chunkId-NP5|gen-m|chunkType-head 13 k1s _ _
13 हैं है v VM case-|vib-है|cp-|psd-|cat-v|pers-3|num-pl|stype-declarative|voicetype-active|tam-hE|sem-|chunkId-VGF|gen-any|chunkType-head 0 root _ _
14 । । punc SYM case-|vib-|cp-|psd-|cat-punc|pers-|num-|stype-|voicetype-|tam-|sem-|chunkId-BLK|gen-|chunkType-head 13 rsym _ _

1 इसकी यह pn PRP case-o|vib-का|cp-|psd-|cat-pn|pers-3|num-sg|stype-|voicetype-|tam-kA|sem-|chunkId-NP|gen-f|chunkType-head 2 r6 _ _
2 ऊँचाई ऊँचाई n NN case-d|vib-0|cp-|psd-|cat-n|pers-3|num-sg|stype-|voicetype-|tam-0|sem-|chunkId-NP2|gen-f|chunkType-head 6 k1 _ _
3 केवल केवल avy RP case-|vib-|cp-|psd-|cat-avy|pers-|num-|stype-|voicetype-|tam-|sem-|chunkId-NP3|gen-|chunkType-child 4 lwg__rp _ _
4 1982 1982 num QC case-any|vib-|cp-|psd-|cat-num|pers-|num-any|stype-|voicetype-|tam-|sem-|chunkId-NP3|gen-any|chunkType-child 5 nmod__adj _ _
5 मीटर मीटर n NN case-d|vib-0|cp-|psd-|cat-n|pers-3|num-sg|stype-|voicetype-|tam-0|sem-|chunkId-NP3|gen-m|chunkType-head 6 k1s _ _
6 है है v VM case-|vib-है|cp-|psd-|cat-v|pers-3|num-sg|stype-declarative|voicetype-active|tam-hE|sem-|chunkId-VGF|gen-any|chunkType-head 0 root _ _
7 । । punc SYM case-|vib-|cp-|psd-|cat-punc|pers-|num-|stype-|voicetype-|tam-|sem-|chunkId-BLK|gen-|chunkType-head 6 rsym _ _
>>>
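The parsed output keeps the ten-column layout of the input and fills in the seventh and eighth columns with the head token ID and the dependency label. The sketch below shows one way to read that text back into Python structures and list the arcs. It is not part of python-il-parser: the function names, the field names (`cpos`, `pos`, and so on, borrowed from the CoNLL-X convention) and the rule that token ID 1 starts a new sentence are assumptions made for illustration, and the code is written for Python 3.

```python
# Rows copied from the second parsed sentence above; the feature column is
# abbreviated here to keep the sample short.
SAMPLE = """\
1 इसकी यह pn PRP cat-pn|gen-f|chunkId-NP 2 r6 _ _
2 ऊँचाई ऊँचाई n NN cat-n|gen-f|chunkId-NP2 6 k1 _ _
3 केवल केवल avy RP cat-avy|gen-|chunkId-NP3 4 lwg__rp _ _
4 1982 1982 num QC cat-num|gen-any|chunkId-NP3 5 nmod__adj _ _
5 मीटर मीटर n NN cat-n|gen-m|chunkId-NP3 6 k1s _ _
6 है है v VM cat-v|gen-any|chunkId-VGF 0 root _ _
7 । । punc SYM cat-punc|gen-|chunkId-BLK 6 rsym _ _
"""

# Assumed names for the first eight columns (the last two are unused "_").
FIELDS = ("id", "form", "lemma", "cpos", "pos", "feats", "head", "deprel")


def parse_feats(feats):
    """Turn 'cat-n|gen-m|stype-' into {'cat': 'n', 'gen': 'm', 'stype': ''}."""
    result = {}
    for item in feats.split("|"):
        if item:
            # Split on the first hyphen only, so values may contain hyphens.
            key, _, value = item.partition("-")
            result[key] = value
    return result


def read_conll(text):
    """Split CoNLL-style text into sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        cols = line.split()
        if len(cols) != 10:  # skip blank or malformed lines
            continue
        if cols[0] == "1" and current:  # token ID 1 starts a new sentence
            sentences.append(current)
            current = []
        token = dict(zip(FIELDS, cols))
        token["id"] = int(token["id"])
        # Unparsed input has "_" in the head column.
        token["head"] = None if token["head"] == "_" else int(token["head"])
        token["feats"] = parse_feats(token["feats"])
        current.append(token)
    if current:
        sentences.append(current)
    return sentences


def arcs(sentence):
    """Return (dependent, relation, head) triples; head 0 is shown as 'ROOT'."""
    forms = {tok["id"]: tok["form"] for tok in sentence}
    return [(tok["form"], tok["deprel"], forms.get(tok["head"], "ROOT"))
            for tok in sentence if tok["head"] is not None]


for dependent, relation, head in arcs(read_conll(SAMPLE)[0]):
    print(dependent, relation, head)
```

Splitting on whitespace rather than tabs is a deliberate choice here, since the feature column contains no spaces in the examples shown; `read_conll(parsed_sents)` would apply the same reading to the full parser output, assuming `getParse` returns text in the format printed above.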
Contact
Riyaz Ahmad Bhat
PHD-CL IIITH, Hyderabad
riyaz.bhat@research.iiit.ac.in

Irshad Ahmad Bhat
MS-CSE IIITH, Hyderabad
irshad.bhat@research.iiit.ac.in