JavaFX GUI based on Lire and Lucene.
First index images by contents, then search the index for similarities between JPEG images.
LIRE: http://www.semanticmetadata.net/lire/
Lucene: http://lucene.apache.org/core/
Console mode: output is designed to be easily parsed by another program (--quiet mode).
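The two-step workflow described here (index first, then search the index) can be illustrated with a minimal sketch. It is not the tool's code and does not use the LIRE or Lucene APIs: it assumes images are already decoded into lists of (r, g, b) tuples, uses a plain color histogram as the content descriptor, and keeps the index in an in-memory dictionary. The names color_histogram, build_index and search are hypothetical.

```python
from math import sqrt


def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized RGB histogram for an iterable of (r, g, b) tuples.

    Assumption: a simple color histogram stands in for the richer
    descriptors a real content-based image retrieval library provides.
    """
    step = 256 // bins_per_channel
    last = bins_per_channel - 1
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Clamp so that values near 255 never fall outside the last bin.
        ri, gi, bi = (min(c // step, last) for c in (r, g, b))
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
        count += 1
    if count:
        hist = [v / count for v in hist]
    return hist


def build_index(images):
    """Step 1: compute one descriptor per image and store it by name."""
    return {name: color_histogram(pixels) for name, pixels in images.items()}


def search(index, query_pixels, top_k=3):
    """Step 2: rank indexed images by Euclidean distance to the query."""
    query = color_histogram(query_pixels)
    scored = [
        (sqrt(sum((a - b) ** 2 for a, b in zip(query, hist))), name)
        for name, hist in index.items()
    ]
    return sorted(scored)[:top_k]
```

The descriptors are computed once at indexing time, so each search only compares the query descriptor against stored vectors instead of re-reading every image. A smaller distance means a more similar color distribution.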
DuMP3 is a duplicate and similar file finder. It finds exact duplicate binaries by hash, similar text files by substring content, images (JPG, BMP, GIF, PNG, etc.) by color, and audio files (MP3, WAV, OGG, etc.) by wave data. Planned: fonts and video.
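Finding exact duplicates by hash, as mentioned above, can be sketched as follows. This is an illustration of the general technique, not DuMP3's implementation: the choice of SHA-256 and the helper names file_digest and find_exact_duplicates are assumptions made for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Hash a file in chunks so large files are never loaded whole."""
    hasher = hashlib.sha256()  # assumption: the hash DuMP3 uses is not stated
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root):
    """Group all files under root by digest; return groups of two or more."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Files with identical bytes always share a digest, so every group returned holds byte-for-byte copies (barring a hash collision, which is not a practical concern with SHA-256). Comparing file sizes before hashing is a common optimization that this sketch leaves out.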