Irudiko is a C++ library for generating Locality Sensitive Hashing (LSH) sketches from textual and web documents. It is designed mainly for HTML pages and also includes optimizations for English or Italian documents.
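To illustrate what an LSH sketch of a document can look like, here is a minimal sketch of one common technique, min-hashing over word shingles. It is not Irudiko's actual algorithm or API: the function names (`word_shingles`, `minhash_sketch`, `estimated_similarity`), the FNV-1a base hash, and the mixing step are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split text on whitespace and return every run of
// k consecutive words (a "shingle") as a single string.
std::vector<std::string> word_shingles(const std::string& text, std::size_t k) {
    std::istringstream in(text);
    std::vector<std::string> words;
    for (std::string w; in >> w;) words.push_back(w);

    std::vector<std::string> shingles;
    for (std::size_t i = 0; i + k <= words.size(); ++i) {
        std::string s = words[i];
        for (std::size_t j = 1; j < k; ++j) s += ' ' + words[i + j];
        shingles.push_back(s);
    }
    return shingles;
}

// 64-bit FNV-1a, used here as the base hash (an assumption, not Irudiko's choice).
std::uint64_t fnv1a(const std::string& s) {
    std::uint64_t h = 14695981039346656037ull;
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ull;
    }
    return h;
}

// splitmix64 finalizer: derives the i-th hash function from the base hash.
std::uint64_t mix(std::uint64_t z) {
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ull;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBull;
    return z ^ (z >> 31);
}

// Min-hash sketch: for each of num_hashes hash functions, keep the smallest
// hash value seen over all shingles. Texts with fewer than k words produce
// a sketch filled with the maximum value.
std::vector<std::uint64_t> minhash_sketch(const std::string& text,
                                          std::size_t k,
                                          std::size_t num_hashes) {
    std::vector<std::uint64_t> sketch(num_hashes,
                                      std::numeric_limits<std::uint64_t>::max());
    for (const std::string& s : word_shingles(text, k)) {
        const std::uint64_t base = fnv1a(s);
        for (std::size_t i = 0; i < num_hashes; ++i) {
            const std::uint64_t h = mix(base + (i + 1) * 0x9E3779B97F4A7C15ull);
            sketch[i] = std::min(sketch[i], h);
        }
    }
    return sketch;
}

// Fraction of matching positions; this estimates the Jaccard similarity
// of the two shingle sets.
double estimated_similarity(const std::vector<std::uint64_t>& a,
                            const std::vector<std::uint64_t>& b) {
    if (a.empty() || a.size() != b.size()) return 0.0;
    std::size_t equal = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] == b[i]) ++equal;
    }
    return static_cast<double>(equal) / static_cast<double>(a.size());
}
```

Two documents are then compared by their fixed-size sketches rather than by their full text, which is what makes near-duplicate detection cheap on large collections.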
CHANGELOG as of September 14, 2007
Compared with version 0.4.1, this release improves hash function quality by replacing the old hash function with the DJB hash function. The major addition, however, is the ability to calculate a set of layout-related shingles in order to refine the cleanup step. Further improvements are expected in the near future. For information, bug reports, suggestions, requests for improvements, or simply greetings, do not hesitate to contact me by email.
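For readers unfamiliar with the DJB hash, the following is a minimal sketch of the classic "djb2" string hash attributed to Daniel J. Bernstein. Whether Irudiko uses this exact variant, the 5381 seed, or a 32-bit width is an assumption; the function name `djb2` is chosen here for illustration only.

```cpp
#include <cstdint>
#include <string>

// Classic djb2 string hash: start from 5381 and, for every byte,
// multiply the running hash by 33 and add the byte value.
// Assumption: this is the common textbook variant, not necessarily
// the one shipped in Irudiko.
std::uint32_t djb2(const std::string& text) {
    std::uint32_t hash = 5381u;
    for (unsigned char c : text) {
        hash = hash * 33u + c;  // unsigned arithmetic wraps modulo 2^32
    }
    return hash;
}
```

The hash is cheap to compute, since it needs one multiply and one add per byte, which matters when every shingle of every document has to be hashed.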
CHANGELOG as of April 26, 2006
This version is a small improvement over version 0.4 in both performance and (especially) memory usage. It completes a sketching phase about 20% faster than release 0.4 (tests were run on the fileset included in the package). Both IrudikoGenericReader and IrudikoSketchGenerator can now be set up again after they have been initialized, so there is no need to create new class instances to reset them. In addition, a bug that caused the library's memory usage to keep increasing has been fixed. For information, bug reports, suggestions, requests for improvements, or simply greetings, do not hesitate to contact me by email. Sketch 'em all with Irudiko!
CHANGELOG as of April 24, 2006
Version 0.3 of Irudiko suffered from a heavy bottleneck in the tag removal phase. The current release implements a different, quicker way to clean up the document and performs over 50% better than the older releases. It is strongly recommended to use this release instead of Irudiko v0.3.