I have just released a new .NET 4.0 library: Html2Xhtml - a binding to the HTML-to-XHTML converter of the same name, which is written in C.
It runs 2 to 4 times faster than Chilkat's HTML-to-XML library, which I evaluated for the local reconstruction of a large online database of the European Union.
0.1.3.4 Release Notes
This little revision enables Concordancer to save results in plain text as well as in XML.
See feature request: https://sourceforge.net/tracker2/?func=detail&aid=2576942&group_id=173414&atid=865371
0.1.3.3 Release Notes
Tenka.Text now runs without any glitches on Linux and other platforms using Mono 2.2, FIXED 2105611, 0133A, 0133B.
This is a minor revision for the users of .NET Framework on Windows, FIXED 1817987, 0133C.
Note:
This is a maintenance/bug-fix release.
0.1.3.2 Release Notes
This is a minor revision which features new icons, fixes bugs in the new select files dialog introduced in 0.1.3.1, and finally fixes the check-for-updates behavior.
0.1.3.1 Release Notes
This is a minor revision which fixes a bug (1794829) that caused the program to crash on Mono/Mac OS X and remedies the effects of another bug (1789682) that caused problems when handling files with diacritics in unusual encodings.
Thanks to Michal Kren and codeman38.
------
SVN267
0.1.3 Release Notes
This version of Tenka Text introduces dynamism into Wordlister. It’s the first release to experiment with interface design ideas that have proven themselves in professional integrated development environments. The main feature of this release is the explorer control in Wordlister, a tree view that performs organizational tasks much like the Solution Explorer control in Visual Studio. The Wordlister explorer lets you add, remove or switch between frequency lists dynamically. Because the graphical user interfaces of integrated development environments are designed for the highly complicated usage scenarios and heavy workloads that developers tend to have, they stand as great examples for computational linguists who aim to deliver equally professional solutions to corpus research questions. Tenka Text will continue to introduce more and more of these concepts to its graphical user interface in the future, with the aim of becoming the first and best open-source “Rapid Corpus Analysis” (RCA) tool available.
This is the first versioned release of Tenka Text.
CLASS LIBRARY
The class library has undergone great changes, one of which is the brand-new set of segmenter customization classes, which can be seen at:
and read about at:
GRAPHICAL USER INTERFACE
Main Window
WordLister update.
Improved the numeric display, fixed a bug, and added an option that lets the user choose whether token or type lengths are displayed in the statistics view.
This is an implementation of the idea first mentioned here:
http://tenkatext.blogspot.com/2007/02/frequency-list-statistics-view.html
Corresponding SVN revision: 96
This version introduces:
----------
WordLister
----------
1) dramatic performance improvements in frequency listing (you can read more about this at http://tenkatext.blogspot.com)
2) statistics view in frequency listing
Source files are stored in the subversion repository of the project. Corresponding SVN revision: 75
^_^ Hello again!
My new project development blog is at http://tenkatext.blogspot.com, where you can read the latest about Tenka Text.
I'm in the process of finalizing the object model of frequency lists and improving their performance. This is alpha-stage serious stuff I'm talking about here!
Check out http://tenkatext.blogspot.com to see how Tenka Text now outperforms WordSmith Tools 4!
This version introduces:
1) frequency listing for word clusters
2) an updated grid control (widget) in WordLister
3) sub-GUI-level support for keyness calculations
Browse http://tenkatext.sourceforge.net/tp2/ for the first application of keyness calculations introduced in this version.
Source files are now stored in the project's SVN repository. Corresponding SVN revision: 17
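For readers unfamiliar with the term, frequency listing for word clusters means counting runs of adjacent words rather than single words. The following is only a minimal Python sketch of that idea, not Tenka Text's own code: the function name `cluster_frequencies`, its `size` parameter, and the pre-tokenized input are all assumptions made for illustration.

```python
from collections import Counter


def cluster_frequencies(tokens, size=2):
    """Count every run of `size` adjacent tokens (hypothetical helper)."""
    # Zip the token list against shifted copies of itself to get
    # all overlapping windows of the requested size.
    windows = zip(*(tokens[i:] for i in range(size)))
    return Counter(" ".join(window) for window in windows)


if __name__ == "__main__":
    words = "the cat sat on the cat".split()
    for cluster, count in cluster_frequencies(words).most_common():
        print(count, cluster)
```

A real tool would also have to decide how clusters behave across sentence or file boundaries, which this sketch ignores.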
Word lists now calculate standardized token/type ratios. Several concordancing-related bug fixes.
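A standardized ratio is commonly computed by splitting the text into fixed-size chunks, taking the ratio of distinct words (types) to running words (tokens) in each chunk, and averaging the results, so that texts of different lengths can be compared. The Python sketch below illustrates that common definition only; the chunk size of 1000, the percentage scale, and the function name are assumptions, and Tenka Text's actual calculation may differ.

```python
def standardized_ttr(tokens, chunk_size=1000):
    """Mean type/token ratio, in percent, over consecutive full chunks.

    Hypothetical helper: a trailing partial chunk is ignored, and None
    is returned when the text is shorter than one chunk.
    """
    ratios = []
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        chunk = tokens[start:start + chunk_size]
        # Distinct words in the chunk divided by the chunk length.
        ratios.append(len(set(chunk)) / chunk_size)
    if not ratios:
        return None
    return 100.0 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    sample = ["a", "a", "a", "a", "b", "c", "d", "e", "f"]
    print(standardized_ttr(sample, chunk_size=4))
```

Averaging per-chunk ratios avoids the well-known tendency of the plain ratio to fall as a text gets longer.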
WordLister tool now available for use! :) See it in action at the project homepage: http://tenkatext.sourceforge.net
Now it is possible to do batch word listing! Check the command "Test WordLister" under the new "Pre-alpha Test" menu. See it in action: http://tenkatext.sourceforge.net/screens/
I have improved the concordancing performance at sub-interface level. Currently looking for a reliable open-source control to replace the DataGridView of Windows Forms.
I have made a great discovery today: Tenka Text Class Libraries run on any OS on which Mono is installed!
See it running on Linux for yourselves: http://tenkatext.sourceforge.net/screens/
The open-source corpus analysis software Tenka Text has been released for the first time. It rivals WordSmith Tools from Mike Scott.