From: Rob M. <cap...@xs...> - 2009-03-25 18:06:30
On Wed, March 25, 2009 08:25, Mark Ruijter wrote:
> Hi there,
>
> It might be interesting to know that I just released lessfs under the
> GPLv3 license.
>
> Lessfs is an userspace (fuse) inline data de-duplicating filesystem for
> Linux.
> http://sourceforge.net/projects/lessfs/
>
> Mark.

Hi Mark,

Very interesting indeed. Potentially very useful in computer forensics,
possibly combined with CarvFs. Do you have any idea how well or badly
Lessfs would scale to repositories in the size range between 10 TB and
20 TB?

T.I.A.

Rob
From: Mark R. <mru...@gm...> - 2009-03-25 18:38:04
Hi Rob,

To give you an impression, here are figures from a real server running
backups of real VMs. The backups are made at the block level
(e.g. dd if=/dev/backupthisdisk of=/pooldata/somedir bs=1M).

The lessfs configuration file:

[root@ccab05lfs01 /]# cat /etc/pool.cfg
BLOCKDATA_PATH=/data/current
BLOCKDATA_BS=104857600
#
BLOCKUSAGE_PATH=/meta/current
BLOCKUSAGE_BS=104857600
#
DIRENT_PATH=/meta/current
DIRENT_BS=10485760
#
FILEBLOCK_PATH=/blockdata/current
FILEBLOCK_BS=104857600
#
META_PATH=/meta/current
META_BS=10485760
LISTEN_IP=127.0.0.1
----------------------------
The sizes of the databases:

[root@ccab05lfs01 /]# cd /data/current/
[root@ccab05lfs01 current]# du . -s -h
693G .
[root@ccab05lfs01 current]# cd /meta/current/
[root@ccab05lfs01 current]# du . -s -h
1.7G .
[root@ccab05lfs01 current]# cd /blockdata/current/
[root@ccab05lfs01 current]# du . -s -h
23G .
--------------------------------------------
The amount of data stored on the lessfs filesystem:

[root@ccab05lfs01 current]# mount | grep lessfs
lessfs on /pooldata type fuse.lessfs (rw,nosuid,nodev,default_permissions,allow_other,max_read=131072)
[root@ccab05lfs01 current]# cd /pooldata/
[root@ccab05lfs01 pooldata]# du . -s -h
48T .

As you can see, in this case lessfs stores 48T of data on 718G of total
disk space: 693G to actually store the data, and the rest is overhead.

How scalable is lessfs?
The tokyocabinet databases may contain as much as 8EB of data.
Performance will degrade when the system has insufficient memory to
contain the buckets (see
http://tokyocabinet.sourceforge.net/spex-en.html#introduction).
You will need 2 MB of system memory for approx. 122GB in the database
using a lessfs 128k blocksize.

Mark.
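
For anyone who wants to redo the arithmetic in Mark's message for their own repository, here is a minimal sketch. It is not part of lessfs: the function names are made up for illustration, the du figures are read as binary units (GiB/TiB), and the memory estimate simply extrapolates Mark's "2 MB per approx. 122GB at a 128k blocksize" rule of thumb linearly, which is an assumption rather than something the thread confirms.

```python
GIB_PER_TIB = 1024  # du -h reports binary units


def dedup_summary(logical_tib, data_gib, meta_gib, fileblock_gib):
    """Summarise disk usage from the du figures of a lessfs setup.

    logical_tib   -- size reported by du on the mounted lessfs filesystem
    data_gib      -- size of the block data database (/data/current)
    meta_gib      -- size of the metadata databases (/meta/current)
    fileblock_gib -- size of the fileblock database (/blockdata/current)
    """
    physical_gib = data_gib + meta_gib + fileblock_gib
    return {
        "physical_gib": physical_gib,
        "ratio": logical_tib * GIB_PER_TIB / physical_gib,
        "overhead_fraction": (meta_gib + fileblock_gib) / physical_gib,
    }


def bucket_memory_mb(database_gib, mb_per_chunk=2.0, chunk_gib=122.0):
    """Hypothetical linear extrapolation of the 2 MB per 122GB rule of thumb.

    database_gib is the amount of data held in the database (after
    de-duplication), not the logical size of the filesystem.
    """
    return database_gib / chunk_gib * mb_per_chunk


if __name__ == "__main__":
    s = dedup_summary(logical_tib=48, data_gib=693, meta_gib=1.7, fileblock_gib=23)
    print(f"physical: {s['physical_gib']:.1f} GiB")
    print(f"dedup ratio: {s['ratio']:.1f}x")
    print(f"overhead: {s['overhead_fraction']:.1%} of disk used")
    for tib in (10, 20):
        mb = bucket_memory_mb(tib * GIB_PER_TIB)
        print(f"{tib} TiB in the database: about {mb:.0f} MB for buckets")
```

With the numbers from this server the ratio comes out at roughly 68 to 1. Under the linear assumption, 10 to 20 TB of data actually held in the database would need on the order of 170 to 340 MB of memory for the buckets; treat that as a rough guide only.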