Menu

Tree [r83] /
 History

HTTPS access


File Date Author Commit
 src 2011-08-24 liuag [r83] add feature: symlink support, but follow the link
 AUTHORS 2010-06-01 liuag [r1] init project
 COPYING 2010-06-01 liuag [r1] init project
 ChangeLog 2011-08-24 liuag [r83] add feature: symlink support, but follow the link
 INSTALL 2010-06-01 liuag [r1] init project
 Makefile.am 2010-06-01 liuag [r1] init project
 NEWS 2010-06-01 liuag [r1] init project
 README 2011-05-04 liuag [r61] fine tunning
 README_CN 2010-06-29 liuag [r38] update README_CN
 TODO 2011-08-09 liuag [r80] v1.4.1 release
 VERSION 2011-08-24 liuag [r83] add feature: symlink support, but follow the link
 autogen.sh 2011-07-07 liuag [r62] add bloom filter
 compile 2010-06-01 liuag [r1] init project
 config.status.lineno 2010-06-23 liuag [r25] 1.0 beta version release
 configure.in 2010-09-28 liuag [r55] add feature: block level compression
 depcomp 2010-06-01 liuag [r1] init project
 install-sh 2010-06-01 liuag [r1] init project
 missing 2010-06-01 liuag [r1] init project
 stamp-h1 2010-06-01 liuag [r1] init project

Read Me

dedup util
----------

dedup util is a lightweight open-source file packing tool. It is based on 
block-level data de-duplication technology, and can reduce the data capacity
to save storage space.

Internal data layout of dedup package as followed (in src/dedup.h):
deduplication file data layout
+---------------------------------------------------------------------+
|  header  |  logic block data  |  unique block data |  file metadata |
+---------------------------------------------------------------------+

file metedata entry layout
+---------------------------------------------------------------+
|  entry header  |  pathname  |  entry data  |  last block data |
+---------------------------------------------------------------+

entry data layout
+---------------------------------------+
| bid 1 | bid 2 | ... | bid n-1 | bid n |
+---------------------------------------+

Features
-------
1. append/remove files into/from package.

2. no MD5 collsion problem, but loss some performance.

3. support both variable-length and fixed-length segmentation. 


Source
------

project URL:    https://sourceforge.net/projects/deduputil
svn repository: https://deduputil.svn.sourceforge.net/svnroot/deduputil

Building
--------

1. checkout source code
  svn co https://deduputil.svn.sourceforge.net/svnroot/deduputil deduputil
 
2. install libz-dev
  apt-get install libz-dev
  if it does not work, please try other installations.

3. compile and install
  ./gen.sh (if need)
  ./configure
  make & make install

Command line
------------

Usage: dedup [OPTION...] [FILE]...
dedup tool packages files with deduplicaton technique.

Examples:
  dedup -c foobar.ded foo bar    # Create foobar.ded from files foo and bar.
  dedup -a foobar.ded foo1 bar1  # Append files foo1 and bar1 into foobar.ded.
  dedup -r foobar.ded foo1 bar1  # Remove files foo1 and bar1 from foobar.ded.
  dedup -t foobar.ded            # List all files in foobar.ded.
  dedup -x foobar.ded            # Extract all files from foobar.ded.
  dedup -s foobar.ded            # Show information about foobar.ded.

Options:
  -c, --creat      create a new archive
  -x, --extract    extrace files from an archive
  -a, --append     append files to an archive
  -r, --remove     remove files from an archive
  -t, --list       list files in an archive
  -s, --stat       show information about an archive
  -C, --chunk      chunk algorithms: FSP, CDC, SB, default is CDC
  -z, --compress   filter the archive through zlib compression
  -b, --block      block size for deduplication, default is 4096
  -H, --hashtable  hashtable backet number, default is 10240
  -d, --directory  change to directory, default is PWD
  -v, --verbose    print verbose messages
  -h, --help       give this help list

Platforms
---------

Linux only currently, maybe unix can work, and windows will be soon.

Author
------

Aigui Liu, focused on storage technologies, data mining and 
distributed computing.

Aigui Liu <aigui.liu@gmail.com> 2010.06.02
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.