minihuff is a data compression library that builds a static frequency table and stores it at both ends of a connection. Because the table does not have to travel with each message, compression stays effective even for very small pieces of data, provided they share similar entropy characteristics.
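The shared-table idea described above can be illustrated with a minimal sketch of static Huffman coding. This is not minihuff's actual API: the function names (train_table, build_code, encode, decode) and the add-one smoothing are assumptions made for illustration only. Both ends build the same code from the same table, so only the bit count and the packed bits cross the connection.

```python
import heapq
from collections import Counter


def train_table(samples):
    """Hypothetical helper: count byte frequencies over representative samples.

    Every byte value gets a count of at least 1 so that any message can be
    encoded, including bytes never seen in the samples.
    """
    counts = Counter()
    for sample in samples:
        counts.update(sample)
    return {b: counts[b] + 1 for b in range(256)}


def build_code(freqs):
    """Build a Huffman code (symbol -> bit string) from a frequency table.

    The unique tie-breaker makes the construction deterministic, so both
    ends of the connection derive identical codes from the same table.
    """
    heap = [(w, i, sym) for i, (sym, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))
        next_id += 1
    code = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):  # internal node: (left, right)
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:  # leaf: a byte value
            code[tree] = prefix

    walk(heap[0][2], "")
    return code


def encode(data, code):
    """Return (bit count, packed bytes); the table itself is not included."""
    bits = "".join(code[b] for b in data)
    padded = bits + "0" * (-len(bits) % 8)
    payload = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return len(bits), payload


def decode(nbits, payload, code):
    """Invert encode() using the same code built from the shared table."""
    inverse = {bits: sym for sym, bits in code.items()}
    stream = "".join(f"{byte:08b}" for byte in payload)[:nbits]
    out = bytearray()
    current = ""
    for bit in stream:
        current += bit
        if current in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    return bytes(out)


# Both ends hold the same table, built once from sample traffic.
shared_table = train_table([b"GET /index.html", b"GET /about.html"])
sender_code = build_code(shared_table)
receiver_code = build_code(shared_table)

nbits, payload = encode(b"GET /home.html", sender_code)
restored = decode(nbits, payload, receiver_code)
```

How much a message shrinks depends on how closely it matches the data the table was trained on, which is why the approach suits streams of short messages with similar content.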
Toke is a web mining toolkit for Java for exploring, indexing, and searching the web. Toke allows you to crawl public or private web sites in order to create web statistics, Pajek web graphs, Lucene indexes, and word frequency files for data clustering.