wordTabulator is a text analysis program. It generates an index of word elements extracted from a defined set of texts. Word elements may be words, N-grams (of a defined size), or phrases (syntagms). The program can process texts in ordinary single-byte (ANSI) encodings as well as in multibyte UTF-8.
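
To illustrate what word and N-gram elements look like, here is a minimal Python sketch. It is not wordTabulator's own code: the function names `tokenize` and `ngrams` are hypothetical, and it assumes that N-grams are built from consecutive words, which the description above does not specify.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical sample text, for illustration only.
tokens = tokenize("The cat sat on the mat")
bigrams = ngrams(tokens, 2)
```

Here `tokens` holds the six words of the sample sentence and `bigrams` holds its five overlapping word pairs.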

Source texts are defined as a set of plain text files or HTML/XML/SGML documents. For marked-up documents, the program can strip the markup and keep only the content. You can also restrict processing to the content inside selected paired tags, or exclude that content from processing.
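
The filtering described above can be approximated with Python's standard `html.parser` module. This is only a sketch under assumptions: the class `TextFilter`, the function `extract_text`, and the mode names "all", "only" and "skip" are invented for this example and do not correspond to wordTabulator options.

```python
from html.parser import HTMLParser

class TextFilter(HTMLParser):
    """Collect text content, optionally keeping or skipping one paired tag."""

    def __init__(self, tag=None, mode="all"):
        # mode (hypothetical): "all" keeps every text node, "only" keeps text
        # inside <tag>...</tag>, "skip" drops text inside <tag>...</tag>.
        super().__init__(convert_charrefs=True)
        self.tag = tag
        self.mode = mode
        self.depth = 0      # nesting depth inside the selected tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        inside = self.depth > 0
        keep = (self.mode == "all"
                or (self.mode == "only" and inside)
                or (self.mode == "skip" and not inside))
        if keep and data.strip():
            self.chunks.append(data.strip())

def extract_text(markup, tag=None, mode="all"):
    """Return the text of the markup with tags removed, joined by spaces."""
    parser = TextFilter(tag, mode)
    parser.feed(markup)
    parser.close()
    return " ".join(parser.chunks)

# Hypothetical sample document.
sample = "<html><body><h1>Title</h1><p>First para.</p><p>Second para.</p></body></html>"
```

Calling `extract_text(sample, "p", "only")` keeps just the paragraph text, while `extract_text(sample, "h1", "skip")` drops the heading and keeps the rest.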

As an additional feature, you can analyse a pair of text sets and compare them by their common or differing elements.
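
Comparing two text sets by common or differing elements amounts to set operations on their vocabularies. The Python sketch below shows the idea with the built-in `set` type. The helper `vocabulary` and the sample strings are hypothetical, and the program's actual comparison logic may differ.

```python
import re

def vocabulary(text):
    """Return the set of distinct lowercase words in a text."""
    return set(re.findall(r"\w+", text.lower()))

# Two hypothetical one-line "text sets".
a = vocabulary("the quick brown fox")
b = vocabulary("the lazy brown dog")

common = a & b        # intersection: elements found in both sets
only_in_a = a - b     # subtraction: elements found in a but not in b
combined = a | b      # union: elements found in either set
```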

The output word index may be generated in HTML format, containing the frequency of each text element and links to the original content. It may also be generated as a plain text file. Words in the index are ordered alphabetically, by value, or by frequency.
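
A frequency index with two of these orderings can be sketched in Python with `collections.Counter`. The names `build_index` and `ordered` are hypothetical. Ordering "by value" is not modelled here because the description does not say what it means, and the alphabetical order shown is plain code-point order rather than language-aware collation.

```python
import re
from collections import Counter

def build_index(text):
    """Count how often each lowercase word occurs in the text."""
    return Counter(re.findall(r"\w+", text.lower()))

def ordered(index, by="alphabet"):
    """Return (word, count) pairs in the requested order."""
    if by == "alphabet":
        return sorted(index.items())
    if by == "frequency":
        # Most frequent first; ties are broken alphabetically.
        return sorted(index.items(), key=lambda kv: (-kv[1], kv[0]))
    raise ValueError("unsupported ordering: " + by)

# Hypothetical sample text.
index = build_index("to be or not to be")
```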

Features

  • support for ANSI and UTF-8 encodings
  • support for NCR codes (numeric character references) and named HTML entities
  • regular expressions
  • source texts in different languages
  • set operations on source texts (subtraction, intersection and union)
  • morphology module for Russian
  • three different formats of output index
  • three different types of word elements (words, N-grams and phrases)
  • context browser
  • true alphabetical ordering
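
As an illustration of the NCR and named-entity support listed above, the Python standard library function `html.unescape` resolves both kinds of reference before words are extracted. This is a sketch of the general technique, not of how wordTabulator does it; the wrapper name `decode_entities` is hypothetical.

```python
import html

def decode_entities(text):
    """Replace numeric character references and named HTML entities."""
    return html.unescape(text)

# Decimal NCR, hexadecimal NCR and a named entity, respectively.
cyrillic = decode_entities("&#1055;&#x440;")
accented = decode_entities("caf&eacute;")
```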

Additional Project Details

Languages

English, Russian

Intended Audience

Science/Research, End Users/Desktop

User Interface

Win32 (MS Windows)

Programming Language

C

Registered

2009-02-03