Irudiko is a C++ library for generating Locality Sensitive Hashing (LSH) sketches from text and web documents. It is designed mainly for HTML pages and also includes optimizations for English or Italian documents.


http://irudiko.sourceforge.net
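To make the idea of a sketch concrete, here is a minimal illustration of one common way to build an LSH sketch: min-hashing over word shingles. It is not taken from Irudiko's source. The function names (`seeded_hash`, `word_shingles`, `minhash_sketch`, `estimated_similarity`), the shingle width, and the sketch size are all hypothetical, and Irudiko's actual algorithm and API may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Illustrative seeded string hash: a djb2-style loop followed by a mixing
// step, so that each seed behaves like a different hash function.
std::uint64_t seeded_hash(const std::string& s, std::uint64_t seed) {
    std::uint64_t h = 5381u ^ seed;
    for (unsigned char c : s) h = h * 33u + c;
    h ^= h >> 33;
    h *= 0xff51afd7ed558ccdULL;
    h ^= h >> 33;
    return h;
}

// Split text on whitespace and return every run of w consecutive words.
std::vector<std::string> word_shingles(const std::string& text, std::size_t w) {
    std::istringstream in(text);
    std::vector<std::string> words;
    for (std::string tok; in >> tok;) words.push_back(tok);

    std::vector<std::string> out;
    if (w == 0 || words.size() < w) return out;
    for (std::size_t i = 0; i + w <= words.size(); ++i) {
        std::string s = words[i];
        for (std::size_t j = 1; j < w; ++j) s += ' ' + words[i + j];
        out.push_back(s);
    }
    return out;
}

// Min-hash sketch: slot i keeps the smallest value of hash function i
// over all shingles of the document.
std::vector<std::uint64_t> minhash_sketch(const std::string& text,
                                          std::size_t w, std::size_t k) {
    std::vector<std::uint64_t> sketch(k, std::numeric_limits<std::uint64_t>::max());
    for (const std::string& sh : word_shingles(text, w)) {
        for (std::size_t i = 0; i < k; ++i) {
            const std::uint64_t h = seeded_hash(sh, i);
            if (h < sketch[i]) sketch[i] = h;
        }
    }
    return sketch;
}

// Fraction of matching slots; this estimates the Jaccard similarity
// of the two documents' shingle sets.
double estimated_similarity(const std::vector<std::uint64_t>& a,
                            const std::vector<std::uint64_t>& b) {
    if (a.empty() || a.size() != b.size()) return 0.0;
    std::size_t same = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] == b[i]) ++same;
    return static_cast<double>(same) / static_cast<double>(a.size());
}
```

With this sketch, two documents can be compared by calling `minhash_sketch` on each with the same width and size, then passing both results to `estimated_similarity`. Near-duplicate pages should score close to 1, unrelated pages close to 0.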






Project Feed

  • irudiko 0.5 file released: irudiko-0.5.tar.gz

    CHANGELOG as of September 14, 2007. Compared with version 0.4.1, this release improves hash function quality: the DJB hash function replaces the old one. The major addition, however, is the ability to calculate a set of layout-related shingles in order to refine the cleanup step. Further improvements are expected in the near future. For information, bug reports, suggestions, requests for improvements, or simply greetings, do not hesitate to contact me by email.

    posted 787 days ago

  • File released: /irudiko/0.5/irudiko-0.5.tar.gz

    posted 787 days ago

  • irudiko irudiko-0.4.1 file released: irudiko-0.4.1.tar.gz

    CHANGELOG as of April 26, 2006. This version is a small improvement over version 0.4 in both performance and (especially) memory usage. It completes a sketching phase about 20% faster than release 0.4 (tests were run on the fileset included in the package). Both IrudikoGenericReader and IrudikoSketchGenerator can now be reconfigured after they have been initialized, with no need to create new class instances to reset them. In addition, a bug that caused the library's memory usage to keep increasing has been fixed. For information, bug reports, suggestions, requests for improvements, or simply greetings, do not hesitate to contact me by email. Sketch 'em all with Irudiko!

    posted 1293 days ago

  • File released: /irudiko/irudiko-0.4.1/irudiko-0.4.1.tar.gz

    posted 1293 days ago

  • irudiko irudiko-0.4 file released: irudiko-0.4.tar.gz

    CHANGELOG as of April 24, 2006. Version 0.3 of Irudiko suffered from a heavy bottleneck in the tag removal phase. The current release implements a different, quicker way to clean up the document and performs over 50% better than the older releases. Using Irudiko v0.3 is strongly discouraged; use this release instead.

    posted 1295 days ago

  • File released: /irudiko/irudiko-0.4/irudiko-0.4.tar.gz

    posted 1295 days ago

  • irudiko irudiko-0.3 file released: irudiko-0.3.tar.gz

    posted 1328 days ago

  • File released: /irudiko/irudiko-0.3/irudiko-0.3.tar.gz

    posted 1328 days ago

  • Code committed

    Anonymous committed patchset 1 of module CVSROOT to the Irudiko CVS repository, changing 11 files

    posted by nobody 1329 days ago

  • Forum thread added

    Anonymous created the Welcome to Developers forum thread

    posted by nobody 1329 days ago
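
The 0.5 changelog in the feed above mentions a switch to the DJB hash function. As a point of reference, here is a minimal sketch of the widely known djb2 string hash attributed to Daniel J. Bernstein. It assumes the multiply-by-33-and-add variant and a 32-bit result; the changelog does not say which variant or width Irudiko uses, and the function name `djb2` is illustrative.

```cpp
#include <cstdint>
#include <string>

// djb2 string hash: start from 5381, then for each byte compute
// h = h * 33 + byte (written as a shift and an add).
// Unsigned arithmetic wraps around modulo 2^32 by design.
std::uint32_t djb2(const std::string& s) {
    std::uint32_t h = 5381u;
    for (unsigned char c : s) {
        h = (h << 5) + h + c;
    }
    return h;
}
```

For example, `djb2("")` returns the initial value 5381, and `djb2("a")` returns 5381 * 33 + 97 = 177670.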
