extract unique file sets from sets with duplicates
...There are lots of duplicates and I want to extract a unique set from the larger set. That is what dupless does.
Written in Java and using SQLite, dupless is a small program that solves the duplicate file problem.
The .jar file contains all of the code, both source and binary.
Currently it writes scripts for use on Linux or Windows.
See the Wiki or the README.txt in the .jar file for more information.
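The general idea of extracting a unique set from a set with duplicates can be sketched as follows. This is a minimal illustration, not dupless's actual code: the class name UniqueFiles and its methods are hypothetical, and it treats two files as duplicates when their SHA-256 digests match. It keeps its index in an in-memory map rather than in SQLite, and it prints the surviving paths instead of writing a script.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: not taken from the dupless source.
public class UniqueFiles {

    /** Returns the SHA-256 digest of a file's contents as a lowercase hex string. */
    static String sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // Reading the stream feeds the digest; nothing else to do here.
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /**
     * Walks the tree under root and keeps, for each distinct content digest,
     * the first path encountered in sorted order.
     */
    static List<Path> uniqueFiles(Path root) throws IOException, NoSuchAlgorithmException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        Map<String, Path> firstByDigest = new LinkedHashMap<>();
        for (Path file : files) {
            firstByDigest.putIfAbsent(sha256(file), file);
        }
        return new ArrayList<>(firstByDigest.values());
    }

    public static void main(String[] args) throws Exception {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path unique : uniqueFiles(root)) {
            System.out.println(unique);
        }
    }
}
```

Sorting the paths before hashing makes the choice of which copy survives deterministic. For very large sets, a persistent index such as a database table keyed by digest would avoid holding every digest in memory.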
[2014-10-31] This project is obsolete. For the latest version (6.1.3), see GitHub https://github.com/digital-preservation/droid (source) and http://www.nationalarchives.gov.uk/information-management/manage-information/preserving-digital-records/droid/ (binary).
DROID (Digital Record Object Identification) is an automatic file format identification tool. It is the first in a planned series of tools developed by The National Archives under the umbrella of its PRONOM technical registry service.
[2013-01-24] The binary download of the latest version of DROID has moved to The National Archives website: http://www.nationalarchives.gov.uk/information-management/projects-and-work/droid.htm
The source code for the latest version of DROID remains available via GitHub: http://digital-preservation.github.com/droid/
[2012-09-07] DROID 6.1 has been released. ...