User Manual for TEXminer 1.0 by gearwheelsoft beta Dec 2018
TEXminer analyzes Texts in Unicode Format.
Before importing, save your Text in Unicode/UTF-8 Format so that all characters are read correctly, or import from a PDF File.
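If a Text is stored in a legacy encoding, it can be re-encoded before import. The following is a minimal Python sketch and not part of TEXminer; the function name convert_to_utf8 and the default source encoding "cp1252" are assumptions chosen for illustration.

```python
from pathlib import Path


def convert_to_utf8(source: Path, target: Path, source_encoding: str = "cp1252") -> None:
    """Re-encode a text file as UTF-8.

    source_encoding is an assumption: set it to the encoding the file
    was actually saved in, otherwise characters will be misread.
    """
    text = source.read_text(encoding=source_encoding)
    target.write_text(text, encoding="utf-8")
```

Example call: convert_to_utf8(Path("input.txt"), Path("input_utf8.txt")), where both file names are placeholders.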
The Text Database can be saved as XML, which stores the original Text, the Sentence and Word Lists and additional Parameters (e.g. Abbreviations).
Most Functions are universal for all Languages:
- Letter Frequency Analysis (19 Languages - extensible)
- Cooccurrence Analysis of Word-Pairs (universal)
- Determination of Central Expressions (universal)
- Thematic Model Statistics (5 Languages - fixed data)
- Database Fingerprint Comparison (dependent on Thematic Model)
The Thematic Models also include Semantic Groups, which have been extended (2015).
The Thematic Models for Technical Terms have been extended (2015).
The Thematic Models for 1st additional Standard Vocabulary have been extended (2015).
The Thematic Models for 2nd additional Standard Vocabulary have been extended (2017).
The Thematic Models for 3rd additional Standard Vocabulary have been extended (2018).
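The Letter Frequency Analysis listed above can be illustrated with a short Python sketch. It is not TEXminer's implementation (the program itself is a VB.NET project); the function name letter_frequencies and the choice to ignore case and non-letter characters are assumptions made for this example.

```python
from collections import Counter
from typing import Dict


def letter_frequencies(text: str) -> Dict[str, float]:
    """Return the relative frequency of each letter in text.

    Assumptions of this sketch: case is ignored and every character
    that is not a Unicode letter (digits, spaces, punctuation) is skipped.
    """
    letters = [ch.lower() for ch in text if ch.isalpha()]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    # most_common() orders the result from the most to the least frequent letter
    return {letter: count / total for letter, count in counts.most_common()}
```

Comparing such a frequency table with reference tables for known Languages is one common way to guess the Language of a Text; whether TEXminer uses its Letter Frequencies Data in exactly this way is not stated here.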
------------
Key Features
------------
- Generic Processing of Unicode/UTF8 coded Texts
- Letter Frequency Analysis
- Generation of a Text Database using Abbreviations Lists and Stop-Word Lists in 19 Languages:
Bulgarian, Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hungarian,
Italian, Norwegian, Polish, Portuguese, Russian, Romanian, Spanish, Swedish and Turkish;
- Other Languages can be processed by creating new Entries or Lists:
the Letter Frequencies Data, Abbreviations Lists and Stop-Word Lists are extensible
- Searching for Words
- Calculation of Cooccurrences
- Determination of Central Expressions
- Calculation of Thematic Model Statistics using Thematic Language Models
containing up to 65572 entries in 5 Languages: English, German, French, Spanish and Russian
- Comparison of Database Fingerprints
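
The Calculation of Cooccurrences with a Stop-Word List can be sketched in Python as follows. This is an illustration under stated assumptions, not TEXminer's algorithm: the function name cooccurrences, the sentence as the cooccurrence window, and the simple split at ".", "!" and "?" are all choices made for this example.

```python
import re
from collections import Counter
from itertools import combinations
from typing import Set


def cooccurrences(text: str, stop_words: Set[str]) -> Counter:
    """Count how often two words appear together in the same sentence.

    Assumptions of this sketch: sentences end at '.', '!' or '?',
    words are lowercased, and each pair is counted once per sentence.
    """
    counts: Counter = Counter()
    for sentence in re.split(r"[.!?]+", text):
        # Unique words of the sentence, without stop words
        words = {w for w in re.findall(r"\w+", sentence.lower()) if w not in stop_words}
        # Sorting makes every pair appear in one fixed order, e.g. ("cat", "mouse")
        for pair in combinations(sorted(words), 2):
            counts[pair] += 1
    return counts
```

For example, cooccurrences("The cat saw the mouse. The mouse saw the cat.", {"the"}) counts the pair ("cat", "mouse") twice. The simple split would cut a sentence at the period of an abbreviation such as "e.g.", which illustrates why an Abbreviations List is useful when the Text is divided into sentences.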
------------
Installation
------------
- extract the Project ZIP file to a new folder
- open the VB.NET Project with MS Visual Basic 2010 (or higher/Express Edition) or start the EXE file in the bin/Debug directory (.NET Framework required)
for optional Database Serialization:
- install SQLite for .NET2010 (or other SQLite bundle/wrapper appropriate for your Visual Studio Version):
* Win32: sqlite-netFx40-setup-bundle-x86-2010-1.0.91.0.exe and the .NET Wrapper SQLite-1.0.66.0-setup.exe (System.Data.SQLite)
* Win64: sqlite-netFx40-setup-bundle-x64-2010-1.0.91.0 and the .NET Wrapper SQLite-1.0.66.0-setup.exe (System.Data.SQLite)
- if you upgrade to a higher .NET Version, please download the appropriate Setup Bundle from the Web Site "sqlite.org"
(known problem: SQLite may not work; use XML Serialization as default)
- the PDF Import Function uses the Open Source PDF Software "iTextSharp" (see bin/iTextSharp directory for more Information)
------------------
Use of the Program
------------------
see TEXminerHelp.htm in the bin/Debug directory or click Menu Help - HTML Help in the Main Window
gearwheelsoft2
Dec 2018