Tool for extracting structured information from Estonian-language text
A lot of information is available in the form of unstructured free text. Pattern-based fact extraction is one approach to information retrieval that tries to extract this information in a structured form usable by other data mining algorithms. This software lets you build and apply models that extract instances of different relations from Estonian text. A relation can describe any link between entities in the text. For instance, a birthday relation describes the connection between persons and their birth dates.
- preprocessing scripts with deep linguistic analysis
- GUI tool for manual annotation, which additionally uses active learning to speed up the process
- scripts for training relation models and applying them to different corpora
- simple web front end with an embedded server, which makes the software more convenient to use
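
To make the idea of a relation concrete, here is a minimal Python sketch of the birthday example mentioned above. It is not this software's implementation: the project builds models on top of deep linguistic analysis, whereas the sketch uses a single hand-written regular expression. The names `Fact`, `BIRTHDAY_PATTERN` and `extract_birthdays` are hypothetical and chosen only for illustration.

```python
import re
from typing import List, NamedTuple


class Fact(NamedTuple):
    """One extracted instance of a relation (hypothetical structure)."""
    relation: str
    person: str
    date: str


# Assumed surface pattern: "<Firstname Lastname> sündis <day>. <month> <year>",
# where "sündis" is Estonian for "was born". A real system would rely on
# linguistic analysis rather than a fixed string like this.
BIRTHDAY_PATTERN = re.compile(
    r"(?P<person>[A-ZÕÄÖÜ][a-zõäöü]+(?: [A-ZÕÄÖÜ][a-zõäöü]+)+)"
    r" sündis "
    r"(?P<date>\d{1,2}\. [a-zõäöü]+ \d{4})"
)


def extract_birthdays(text: str) -> List[Fact]:
    """Return every (person, birth date) pair matched by the pattern."""
    return [
        Fact("birthday", m.group("person"), m.group("date"))
        for m in BIRTHDAY_PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    sample = "Lennart Meri sündis 29. märtsil 1929 Tallinnas."
    for fact in extract_birthdays(sample):
        print(fact)
```

The output is a list of structured records rather than free text, which is the form that downstream data mining algorithms can consume. A single fixed pattern misses most phrasings, which is one reason to train models from annotated examples instead.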