Using TokyoCabinet for uniform length fs info

2009-08-08
2013-05-13
  • Andreas Lund
    2009-08-08

    Alongside testing lessfs, I have been toying with a Perl dedup filesystem to try out different approaches to the challenges involved.

    I make one assumption, and I don't know whether lessfs currently does the same: a block is never removed from the filesystem. Instead, blocks are allocated, put on the free list once nothing references them, and then reallocated. Given this, I needed two arrays with uniform-length values, namely blockid=>refcount and blockid=>hash. The first tracks when a block should be put on the free block list; the second records which hash to remove from the hash=>blockid database when a free block is overwritten. Again, this may differ from the inner workings of lessfs.
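
    The bookkeeping described above can be sketched roughly as follows. This is a toy in-memory model in Python rather than the Perl code from the post, and every name in it (BlockStore, write, release) is hypothetical; it only illustrates how the refcount array, the hash array and the free list could interact, not how lessfs works.

```python
class BlockStore:
    """Toy model: blocks are never removed, only freed and reallocated."""

    def __init__(self):
        self.refcount = []        # blockid -> reference count
        self.block_hash = []      # blockid -> hash of the block's current contents
        self.hash_to_block = {}   # hash -> blockid (the dedup index)
        self.free_list = []       # blockids whose refcount dropped to 0

    def write(self, digest):
        """Store a block identified by its hash and return its blockid."""
        bid = self.hash_to_block.get(digest)
        if bid is not None:
            # Dedup hit. A freed block whose contents are wanted again is revived.
            if self.refcount[bid] == 0:
                self.free_list.remove(bid)
            self.refcount[bid] += 1
            return bid
        if self.free_list:
            # Reuse a free block; blockid=>hash tells us which stale
            # hash=>blockid entry to drop now that it is overwritten.
            bid = self.free_list.pop()
            del self.hash_to_block[self.block_hash[bid]]
        else:
            # No free block available: grow both arrays by one slot.
            bid = len(self.refcount)
            self.refcount.append(0)
            self.block_hash.append(None)
        self.block_hash[bid] = digest
        self.hash_to_block[digest] = bid
        self.refcount[bid] = 1
        return bid

    def release(self, bid):
        """Drop one reference; the block goes on the free list at zero."""
        self.refcount[bid] -= 1
        if self.refcount[bid] == 0:
            self.free_list.append(bid)


store = BlockStore()
a = store.write(b"hash-A")
b = store.write(b"hash-B")
a_again = store.write(b"hash-A")   # duplicate content, same block
store.release(b)                   # block b is now free
c = store.write(b"hash-C")         # reuses block b's slot
```

    Both arrays are indexed by blockid and hold fixed-size values, which is what makes the flat-file layout possible.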

    Since these values are of uniform length (in my case 4 and 24 bytes), I replaced my .tcb files with flat files. The result? A performance increase of nearly 30% for sequential writes and around 20% for delete and random write operations. A C implementation would probably need its own buffering to keep the system from issuing 4-byte write operations; in Perl this comes with the package.
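
    A minimal sketch of such a flat file, assuming record number N simply lives at byte offset N * record_size. It is written in Python rather than Perl, the class and file names are made up for illustration, and the little-endian 32-bit encoding of the refcount is an assumption (the post only gives the sizes, 4 and 24 bytes). Python's default buffered file object plays the role that Perl's built-in buffering plays in the post.

```python
import os
import struct
import tempfile


class FixedRecordFile:
    """Flat file of fixed-size records, addressed by record index."""

    def __init__(self, path, record_size):
        self.record_size = record_size
        # Open read/write without truncating an existing file.
        mode = "r+b" if os.path.exists(path) else "w+b"
        self.f = open(path, mode)  # buffered by default, so small writes are batched

    def put(self, index, record):
        if len(record) != self.record_size:
            raise ValueError("record has the wrong length")
        self.f.seek(index * self.record_size)
        self.f.write(record)

    def get(self, index):
        self.f.seek(index * self.record_size)
        data = self.f.read(self.record_size)
        # A short read means the index lies beyond the end of the file.
        return data if len(data) == self.record_size else None

    def close(self):
        self.f.close()


with tempfile.TemporaryDirectory() as tmp:
    refcounts = FixedRecordFile(os.path.join(tmp, "refcount.dat"), 4)
    hashes = FixedRecordFile(os.path.join(tmp, "hash.dat"), 24)

    refcounts.put(7, struct.pack("<I", 3))   # blockid 7 has refcount 3
    hashes.put(7, b"\x11" * 24)              # placeholder 24-byte hash

    count = struct.unpack("<I", refcounts.get(7))[0]
    digest = hashes.get(7)
    missing = refcounts.get(100)             # never written, past end of file

    refcounts.close()
    hashes.close()
```

    A lookup is one seek and one read with no index traversal, which is presumably where the speedup over a B+ tree file comes from.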

    Just thought I'd share, dunno if it's useful or not :-)

    • Mark Ruijter
      2009-08-10

      I recently did some testing in which the block data is no longer written to tc but to a raw partition or to a file:

      # BLOCKDATA_IO = fileio or diskio
      BLOCKDATA_IO=fileio
      BLOCKDATA_PATH=/data/dta/blockdata.dta

      I tried this because earlier versions of tc lacked good defragmentation capabilities.

      When I write directly to a file (fileio), performance seems to increase by about 13%.

      saturn:/usr/src/lessfs-0.2.4-data_in_file # dd if=/dev/sda1 of=/fuse/boot.img bs=1M; time sync
      70+1 records in
      70+1 records out
      73995264 bytes (74 MB) copied, 0.895593 s, 82.6 MB/s

      real    0m4.149s
      user    0m0.000s
      sys     0m0.080s
      saturn:/usr/src/lessfs-0.2.4-data_in_file # cd ../lessfs-0.2.6/
      saturn:/usr/src/lessfs-0.2.6 # cp etc/lessfs.cfg /etc
      saturn:/usr/src/lessfs-0.2.6 # sync
      saturn:/usr/src/lessfs-0.2.6 # ./restart.sh
      sync
      Shutting down syslog services                                         done
      Starting syslog services                                              done
      saturn:/usr/src/lessfs-0.2.6 # sync
      saturn:/usr/src/lessfs-0.2.6 # dd if=/dev/sda1 of=/fuse/boot.img bs=1M; time sync
      70+1 records in
      70+1 records out
      73995264 bytes (74 MB) copied, 0.877396 s, 84.3 MB/s

      real    0m4.791s
      user    0m0.000s
      sys     0m0.272s
      saturn:/usr/src/lessfs-0.2.6
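
      One way to arrive at a figure close to 13% from the transcript above is to compare the `real` times of the two `time sync` calls. This is only a reading of the numbers, and it assumes the first run (in lessfs-0.2.4-data_in_file) is the fileio run and the second (0.2.6) writes through tc:

```python
# `real` times of `time sync` from the transcript, in seconds.
# Assumption: the first run uses fileio, the second writes block data through tc.
fileio_sync = 4.149
tc_sync = 4.791

# Fraction of the tc sync time saved by writing directly to a file.
saving = (tc_sync - fileio_sync) / tc_sync
print(f"{saving:.1%}")
```

      The dd throughput itself is nearly identical in both runs (82.6 vs 84.3 MB/s), so the difference shows up only once the data is flushed.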

      Recent versions of tc come with defragmentation built in, which is why I have stopped working on this for now.

      The 0.2.6 release has only one bug that needs to be fixed before I would call it stable. I think it's better to release lessfs-1.0.0 (stable) first and add the 'raw io / without tc' code in the unstable '1.1' branch, which will also include snapshot and replication support. The database layout will have to change to make that possible, which makes switching to fileio a small step.