compression ratio

Help
Anonymous
2012-06-04
2012-09-10
  • Anonymous - 2012-06-04

    Hello,

    What compression ratio can g4l reach with lzo, img, or other types? I found
    nothing useful on the internet, and I need authoritative data. Thanks~~

    Also, will g4l compress the unused or unformatted parts of the disk?

    Best wishes for g4l ^_^

     
  • Michael Setzer II

    The short answer is it depends on the contents of the data being compressed.

    If you compress a regular text file the ratio is very high.

    If you compress an already compressed img file you get almost nothing.

    But here is what I have found.

    Long ago, I did a test with a small boot partition.

    Using no compression it took 10 seconds.

    Using lzop compression it took 3 seconds.

    Using gzip compression it took 6 seconds.

    Using bzip2 compression it took 18 seconds.

    As for ratios, no compression is 0%, whereas gzip was about 10% better than
    lzop, and bzip2 was about 10% better than gzip.

    I had also noted that lzop put about a 30% load on the CPU, while gzip and
    bzip2 both had the CPU running at close to 100% for the compression thread.
    The load is even heavier on lower-powered machines, which is why lzop is the
    default. It is speed versus size: lzop gives about twice the speed of gzip
    for a 10% increase in size.
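
    A minimal sketch of the same speed-versus-size comparison, assuming only
    Python's standard library. lzop is not in the standard library, so zlib at
    level 1 is used here as a rough stand-in for a fast compressor; the function
    name compare_codecs and the sample data are made up for illustration, and
    the numbers will differ from those above.

```python
import bz2
import time
import zlib


def compare_codecs(data: bytes) -> dict:
    """Return {name: (compressed_size_in_bytes, seconds)} for a few codecs."""
    codecs = {
        "none": lambda d: d,
        "zlib-1": lambda d: zlib.compress(d, 1),  # fast, stand-in for lzop
        "zlib-6": lambda d: zlib.compress(d, 6),  # gzip's default level
        "bz2-9": lambda d: bz2.compress(d, 9),    # slowest, usually smallest
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        out = compress(data)
        results[name] = (len(out), time.perf_counter() - start)
    return results


if __name__ == "__main__":
    # Highly repetitive text; real disk contents compress far less well.
    sample = b"The quick brown fox jumps over the lazy dog.\n" * 20_000
    for name, (size, secs) in compare_codecs(sample).items():
        print(f"{name:8s} {size:>8d} bytes  {secs:.4f} s")
```

    Running it on a sample of your own data gives a better idea of the ratio
    than any published figure, since the result depends on the contents.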

    With a much larger partition, the same results were found: 50 minutes to
    back up using lzop, 1 hour 40 minutes using gzip. I tried bzip2, but after
    many hours just cancelled it.

    Generally, I find the overall compression is 50% of the space used, but that
    is with a 40GB partition with 26GB in use, making a 13GB image file. If the
    same data was on an 80GB or 200GB partition, the image would still come out to
    be about 13GB if the unused space has been cleared. Raw (dd) copy backs up all
    sectors used or unused, with whatever data they contain, so clearing these
    sectors greatly reduces final image size.

    Long ago, I did a clean install of (I believe) Fedora 3 on an 80GB disk.
    Did a disk image, and got a 12GB image file. Cleared the unused sectors and
    redid the image, and it was only 2.5GB. Cleared sectors reduce to almost
    nothing.
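
    A small sketch of why clearing helps, assuming Python's zlib as the
    compressor (g4l itself uses lzop, gzip or bzip2 on the dd stream). Random
    bytes stand in for stale leftover data in unused sectors; compressed_size
    is a hypothetical helper name.

```python
import os
import zlib


def compressed_size(block: bytes, level: int = 6) -> int:
    """Return the size in bytes of `block` after zlib compression."""
    return len(zlib.compress(block, level))


if __name__ == "__main__":
    size = 4 * 1024 * 1024
    leftover = os.urandom(size)  # stand-in for old, uncleared sector contents
    cleared = bytes(size)        # the same space after being zeroed

    print("leftover data:", compressed_size(leftover), "bytes")
    print("cleared data: ", compressed_size(cleared), "bytes")
```

    The zeroed block shrinks to a tiny fraction of its size, while the
    random block does not shrink at all.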

    Another observation. Doing a raw backup of an 80GB Windows partition took
    about 40 minutes. Doing an NTFSCLONE image of the same partition took only
    10 minutes, but the image files differed by only a couple hundred K.
    NTFSCLONE copied only the data on the disk, which was about 20% of the
    partition, whereas the raw backup also copied all the blank sectors, but
    compressed them to almost nothing.

    In the raw mode, g4l will back up whatever it sees on the disk or partition.
    Doing a raw image of a very large disk takes lots of time, since it has to
    process all that space. My fix for that is to make partitions, then do mbr
    backups plus images of the partitions with the used data, and image the
    partitions with lots of free space only as needed.

    As I started with, it all depends on the data on the disk/partition and how
    much is actually used.

    Hope that helps, but if I missed something, please don't hesitate to ask more
    specific questions.

     
  • Anonymous - 2012-06-05

    Dear msetzerii:

    Thanks for your help. I'm very grateful for your meticulous guidance. I got a
    lot of inspiration.

    G4l will back up all sectors, even those not used or not in a partition, and
    these sectors will be compressed into a smaller image file. Is that right?

    And I have another question. My hard disk capacity is 72G, of which 9G is
    unpartitioned, and I have used 8G of the rest. When I back up, g4l shows the
    total backup size as about 69G. How is this figure calculated?

    Normally, it shows the status of the network card, like 1000Mb/s. But it
    showed me 'unknown' yesterday. I would like to know the possible reasons,
    although it does not affect use.

    Recently, G4l improved my work efficiency. Thanks all.

     
  • Michael Setzer II

    If you do a disk image, it backs up all the sectors on the disk regardless of
    them being in a partition or not.

    If you do a partition image, it backs up all sectors on that partition
    regardless of them being currently used or not.

    The mbr backup does the first 512 bytes, which is the MBR and regular
    partition table.

    The mbr backup 2 backs up the first full track, since some boot loaders can
    use more than the first sector. As a matter of fact, the newer setups leave
    the first 1M of the disk open for an additional boot image (Grub2).
    Partitioning software is also using the 1M boundaries.
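
    To make the 512-byte layout concrete, here is a sketch in Python that
    parses a classic MBR: the partition table starts at byte 446 with four
    16-byte entries, and the sector ends with the 0x55 0xAA boot signature.
    The demo MBR is synthetic and the function names are made up for
    illustration; this is not how g4l itself does its mbr backup.

```python
import io
import struct

SECTOR = 512          # size of the MBR sector
TABLE_OFFSET = 446    # partition table starts here
ENTRY_SIZE = 16       # four entries of 16 bytes each


def read_mbr(stream) -> bytes:
    """Read the first sector and check the 0x55AA boot signature."""
    mbr = stream.read(SECTOR)
    if len(mbr) != SECTOR or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no valid MBR boot signature")
    return mbr


def partition_entries(mbr: bytes) -> list:
    """Return (number, type, start_lba, sector_count) for used entries."""
    entries = []
    for i in range(4):
        offset = TABLE_OFFSET + i * ENTRY_SIZE
        ptype = mbr[offset + 4]
        start_lba, sectors = struct.unpack_from("<II", mbr, offset + 8)
        if ptype != 0:  # type 0 marks an unused slot
            entries.append((i + 1, ptype, start_lba, sectors))
    return entries


def make_demo_mbr() -> bytes:
    """Build a synthetic MBR with one NTFS (0x07) partition at LBA 2048."""
    mbr = bytearray(SECTOR)
    entry = struct.pack("<B3sB3sII", 0x80, b"\0\0\0", 0x07, b"\0\0\0",
                        2048, 204800)
    mbr[TABLE_OFFSET:TABLE_OFFSET + ENTRY_SIZE] = entry
    mbr[510:512] = b"\x55\xaa"
    return bytes(mbr)


if __name__ == "__main__":
    mbr = read_mbr(io.BytesIO(make_demo_mbr()))
    for number, ptype, start, count in partition_entries(mbr):
        print(f"partition {number}: type 0x{ptype:02x}, "
              f"starts at byte {start * SECTOR}, {count} sectors")
```

    In the demo, the partition starts at sector 2048, which with 512-byte
    sectors is exactly the 1M boundary mentioned above.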

    Note: When it comes to size, G4L is using G to be 1024 M rather than 1000 M or
    1 Billion bytes, so some numbers may appear different. It uses what is
    reported from the disk for the partition.
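
    A quick sketch of the unit difference, in Python. It assumes "G" means
    1024 x 1024 x 1024 bytes on the g4l side and 1,000,000,000 bytes on the
    label side; bytes_to_g is a hypothetical helper, and the 80GB drive is
    just an example figure.

```python
def bytes_to_g(n_bytes: int, binary: bool = True) -> float:
    """Convert a byte count to G, using 1024-based or 1000-based units."""
    unit = 1024 ** 3 if binary else 1000 ** 3
    return n_bytes / unit


if __name__ == "__main__":
    labelled = 80 * 1000 ** 3  # a drive sold as "80GB"
    print(f"1000-based: {bytes_to_g(labelled, binary=False):.2f} G")
    print(f"1024-based: {bytes_to_g(labelled):.2f} G")  # prints 74.51 G
```

    So the same disk shows a smaller number of G when 1024-based units are
    used, even though the byte count is identical.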

    It displays the nic speed based on what ethtool returns to it. It might have
    still been negotiating the speed, and thus got an unknown state.

    Are your systems Linux, Windows or some other OS? With Windows partitions, the
    NTFSCLONE backup is the fastest, but it does require an MBRBACKUP for a
    restore to work to a new disk, since the partition must already exist.

    In my classroom, I make an ntfsclone image onto partition /dev/sda6 and have
    an option on the grub2 menu that can restore it in about 12 minutes with no
    network traffic. This makes it simple to get Windows back up very quickly.

     
  • Anonymous - 2012-06-06

    Your answer cleared up my doubts. I back up my Linux system with g4l, and I
    load g4l.iso over PXE. Yesterday, I won my teacher's praise. Ha~

    g4l is really a nice tool. I believe g4l will become even more powerful in
    the future.

    My English is poor, thank god you know what I'm saying. And thanks for your
    generosity. Best wishes to you, and to g4l.

     
  • Anonymous - 2012-06-06

    And a small question: why does g4l back up the unused sectors? That costs
    more time and more disk space. Please forgive my boldness~~

     
  • Michael Setzer II

    Glad you were able to get it to do what you needed.

     
  • Michael Setzer II

    There are a couple of reasons for backing up unused sectors.

    1. The raw disk/partition images are done using the dd command with a
      compression tool to make the images smaller.
      dd does not know anything about the filesystem or disk layout. It just copies
      the sectors from the raw device file. This does allow it to back up a whole
      disk or partition that it doesn't understand. Just had a user with a PowerPC
      Mac that could not run g4l (works on Intel Macs). Took the disk and hooked it
      to a PC, made a copy, and then reinstalled the copy in the Mac.

    2. To back up only used files, the program must fully understand the
      filesystem on the partition. NTFSCLONE is an option on g4l that works with
      ntfs type partitions, and only backs up used data. But it only works with
      ntfs partitions. Fsarchiver is another option, and it works with other
      filesystems, but it has some issues with special filesystem attributes that
      may not be copied. But its version 7 is still a beta.
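
    The raw approach in point 1 can be sketched in Python as a plain byte copy
    through a compressor, with no knowledge of what the bytes mean. This is an
    illustration only: g4l uses dd and external compression tools, gzip is
    just the codec chosen here, and raw_image / restore_image are made-up
    names. An in-memory buffer stands in for the device file.

```python
import gzip
import io
import os
import shutil

CHUNK = 1024 * 1024  # read the source 1 MiB at a time


def raw_image(src, dst, level: int = 6) -> int:
    """Copy every byte of src into a gzip stream on dst; return bytes read."""
    total = 0
    with gzip.GzipFile(fileobj=dst, mode="wb", compresslevel=level) as gz:
        while True:
            chunk = src.read(CHUNK)
            if not chunk:
                break
            gz.write(chunk)  # used and unused sectors are treated alike
            total += len(chunk)
    return total


def restore_image(src, dst) -> None:
    """Decompress the image from src and write the raw bytes to dst."""
    with gzip.GzipFile(fileobj=src, mode="rb") as gz:
        shutil.copyfileobj(gz, dst, CHUNK)


if __name__ == "__main__":
    # Fake "partition": 1 MiB of data followed by 7 MiB of cleared space.
    partition = os.urandom(CHUNK) + bytes(7 * CHUNK)
    image = io.BytesIO()
    read = raw_image(io.BytesIO(partition), image)
    print(f"read {read} bytes, image is {len(image.getvalue())} bytes")

    restored = io.BytesIO()
    restore_image(io.BytesIO(image.getvalue()), restored)
    print("restore matches:", restored.getvalue() == partition)
```

    Every byte still has to be read, which is where the extra time goes, but
    the cleared space adds very little to the image.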

    On the question of time, you are definitely right. Doing an image with an 80GB
    windows system took about 40 minutes with the raw backup, but only 10 minutes
    using ntfsclone. Note: Only about 16G of the partition was actually used at
    the time. But there are some advantages.

    If you do a raw backup of the disk, it can be restored to a new disk, and
    nothing needs to be done to setup the disk. Just restore the image, since it
    would contain the mbr and partition table and the entire partition.

    With an ntfsclone image, you would need to first create the ntfs partition on
    the disk, and install an MBR, and then restore the ntfsclone backup.

    Generally, I will do both. Do a disk level image, and then do an ntfsclone
    image. That way if there is a disk failure, I can restore the original disk
    backup to the disk, and then restore the latest ntfsclone image. I have now
    added options to back up the MBR and partition table, so that doesn't always
    need to be done.

    As far as size goes, if you clear all the unused sectors before doing a raw
    image, the image size is only a couple 100K bigger, since the cleared unused
    sectors compress to almost nothing.

    The other advantage is that if you have a disk with multiple partitions, they
    can all be backed up at once in a disk image.

    Hope that answers your questions, but if not, please ask more questions.

     
  • Anonymous - 2012-06-06

    "Do a disk level image, and then do an ntfsclone image." That's a good idea.
    And I'm doing it now, to make my testing environment safer. Before this, I
    cleared all the unused sectors, haha~

    Glad to talk with you. You helped me a lot. I'll be back here when I have
    any other questions. And I will continue to follow the development of g4l!

     
