Name | Modified | Size | Downloads / Week |
---|---|---|---|
1.1 | 2016-04-18 | ||
SegTagParsing | 2014-12-17 | ||
1.0 | 2014-10-03 | ||
libcn_nlp_Tagger.so | 2014-10-08 | 548.6 kB | |
libcn_nlp_Parser.so | 2014-10-08 | 798.4 kB | |
readme.txt | 2014-10-08 | 2.1 kB | |
Totals: 6 Items | | 1.3 MB | 0 |
Update log for 1.1: In Version 1.1, only the model file "model/model_parser_arceager_mvt_origin_autopos" and the DLL file "cn_nlp_Parser.dll" have been updated. If you have already downloaded Version 1.0, copy the two new files into the directory where Version 1.0 is located, replacing the two older files.

Usage:
java -Xmx5000m -jar ZORE_pmt.jar sProcessType:tagOrigin inputFile:test/finance_example.txt outputFile:result_origin_finance/relation.xml nMaxSentenceCount:500000 corpusType:finance

(1) System requirements: This system runs on 64-bit Windows with Java 1.7+ and is suitable for processing simplified contemporary Chinese. To run ZORE on Linux (Fedora), you only need to copy the two .so files (libcn_nlp_Tagger.so and libcn_nlp_Parser.so) into the ZORE directory on your Fedora system.

(2) The input file: The inputFile should be raw Chinese text in UTF-8 encoding. The system segments each line of text into sentences. In the input file, letters (a...z) and numbers (0, 1, ..., 9) should be half-width (ASCII) characters.

(3) The output file: The outputFile is an XML file that contains the original sentences together with the relations extracted from them. A sentence may contain several relations. Each relation consists of a predicate and two or more arguments. An argument is typically a noun phrase that contains one or more words, with one word as its head.

(4) The user dictionary: The system supports a user-defined word list through the file "userdict.txt". In this file, each line contains a word and a part-of-speech tag. In most cases, you can use n as the part-of-speech tag for nouns and v for verbs. In particular, nr, ns and nt can be used for person names, locations and organization names, respectively. Each entry in this file is treated as a single word during analysis, which can reduce word segmentation errors and, in turn, improve relation extraction.

(5) Citation: If you use this system in your paper, please cite our EMNLP 2014 paper: Likun Qiu and Yue Zhang. ZORE: A Syntax-based System for Chinese Open Relation Extraction. In Proceedings of EMNLP 2014.
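
The usage line above can also be assembled from a script, which helps when processing several files. The following is a minimal Python sketch, not part of ZORE itself: the helper names build_zore_command and run_zore are hypothetical, and the only argument values known to be valid are the ones shown in the usage line (tagOrigin, finance, and so on); other values are not documented here.

```python
import subprocess


def build_zore_command(input_file, output_file,
                       corpus_type="finance",
                       process_type="tagOrigin",
                       max_sentences=500000,
                       heap_mb=5000,
                       jar="ZORE_pmt.jar"):
    """Build the argument list for the ZORE command shown in the readme.

    Defaults mirror the readme's usage line; whether other values are
    accepted by ZORE is an assumption the caller has to verify.
    """
    return [
        "java", f"-Xmx{heap_mb}m", "-jar", jar,
        f"sProcessType:{process_type}",
        f"inputFile:{input_file}",
        f"outputFile:{output_file}",
        f"nMaxSentenceCount:{max_sentences}",
        f"corpusType:{corpus_type}",
    ]


def run_zore(zore_dir, input_file, output_file, **options):
    """Run ZORE from its own directory so relative paths resolve as in the readme."""
    cmd = build_zore_command(input_file, output_file, **options)
    subprocess.run(cmd, cwd=zore_dir, check=True)


# Reproduces the readme's example command as a list of arguments.
example_cmd = build_zore_command("test/finance_example.txt",
                                 "result_origin_finance/relation.xml")
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.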
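
Item (2) asks that letters and numbers in the input file be half-width. Chinese text often contains full-width forms such as "Ａ" or "１", so a preprocessing pass may be useful. The sketch below is one possible way to do this in Python and is not part of ZORE; the function names are hypothetical. It converts only full-width letters and digits and leaves full-width punctuation untouched, since the readme does not say how punctuation should be handled.

```python
# Full-width forms U+FF01..U+FF5E sit at a fixed offset from ASCII 0x21..0x7E.
FULLWIDTH_OFFSET = 0xFEE0


def to_halfwidth(text):
    """Replace full-width letters and digits with their half-width ASCII forms."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0xFF01 <= code <= 0xFF5E:
            ascii_ch = chr(code - FULLWIDTH_OFFSET)
            # Convert only letters and digits; keep punctuation as it is.
            if ascii_ch.isalnum():
                out.append(ascii_ch)
                continue
        out.append(ch)
    return "".join(out)


def normalize_file(src_path, dst_path):
    """Read a UTF-8 text file and write a half-width-normalized UTF-8 copy."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(to_halfwidth(line))
```

For example, normalize_file("raw.txt", "test/input.txt") would produce a file that can then be passed as inputFile; both file names here are placeholders.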
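
The user dictionary described in item (4) can be generated from a word list. The following Python sketch is only an illustration with hypothetical function names. The readme does not state which separator stands between the word and its tag, nor the file encoding; a single space and UTF-8 are assumed here, so check the userdict.txt shipped with ZORE before relying on this.

```python
def format_userdict_lines(entries, sep=" "):
    """Turn (word, tag) pairs into userdict lines.

    The separator is an assumption; the readme only says each line
    holds a word and a part-of-speech tag.
    """
    lines = []
    for word, tag in entries:
        if not word or any(c.isspace() for c in word):
            raise ValueError(f"invalid dictionary word: {word!r}")
        lines.append(f"{word}{sep}{tag}")
    return lines


def write_userdict(entries, path="userdict.txt", sep=" "):
    """Write the entries to a userdict file (UTF-8 assumed)."""
    with open(path, "w", encoding="utf-8") as f:
        for line in format_userdict_lines(entries, sep):
            f.write(line + "\n")


# Tags follow the readme: nr = person name, ns = location,
# nt = organization name, n = noun, v = verb.
sample_entries = [
    ("张三", "nr"),
    ("北京", "ns"),
    ("北京大学", "nt"),
]
sample_lines = format_userdict_lines(sample_entries)
```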