wordTabulator is a text-analysis program. It generates an index of word elements extracted from a defined set of texts. Word elements may be words, N-grams, or phrases (syntagms). The program can process texts in either an ordinary single-byte (ANSI) encoding or the multibyte UTF-8 encoding.
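To make the idea of an index of word elements concrete, here is a minimal Python sketch that counts words and N-grams in a text. It is not wordTabulator's own code or interface: the function names `tokenize` and `ngram_index` and the simple tokenization rule are assumptions made for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed rule: lowercase the text and keep runs of Unicode word characters.
    return re.findall(r"\w+", text.lower())


def ngram_index(text, n=1):
    # Hypothetical helper: count every run of n consecutive tokens.
    tokens = tokenize(text)
    grams = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return Counter(grams)


sample = "the cat sat on the mat"
words = ngram_index(sample, n=1)    # index of single words
bigrams = ngram_index(sample, n=2)  # index of 2-grams

# Print the word index in alphabetical order with frequencies.
for word in sorted(words):
    print(word, words[word])
```

With `n=1` the result is a plain word-frequency index; larger values of `n` give N-gram indexes over the same token stream.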
Features
- ANSI and UTF-8 encodings
- NCR codes, HTML named entities, multi-language texts
- search by regular expressions
- set operations on source texts: subtraction, intersection and union
- morphology module for the Russian language
- different output formats for the index; true alphabetical ordering
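The set operations and entity handling listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's actual implementation: the helper `vocabulary` is hypothetical, and decoding of NCR codes and HTML named entities is delegated to the standard-library function `html.unescape`.

```python
import html
import re


def vocabulary(text):
    # Hypothetical helper: decode NCR codes and named entities,
    # then collect the distinct lowercase words of the text.
    decoded = html.unescape(text)
    return set(re.findall(r"\w+", decoded.lower()))


a = vocabulary("red green blue caf&eacute;")
b = vocabulary("green blue yellow")

subtraction = a - b   # words found only in the first text
intersection = a & b  # words shared by both texts
union = a | b         # words found in either text

print(sorted(subtraction))
print(sorted(intersection))
print(sorted(union))
```

Using sets of distinct words keeps the three operations to one operator each; an index with frequencies would need a rule for combining counts, which the source does not specify.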
License
GNU General Public License version 2.0 (GPLv2)
User Reviews
A quote from the license: "2. The source code is the property of the author and may not be modified by decompiling the binary code of the program modules."