
Clonezilla and LVM2

dreael
2024-06-10
2024-12-29
  • dreael

    dreael - 2024-06-10

    Hello Clonezilla developers

    Currently I am evaluating/testing a KVM-based virtualisation host solution (a small installation) where I intend to run Clonezilla on the host itself during a maintenance window. Since I read many hints recommending LVM volumes instead of regular hard disk image files for better performance and less SSD wear, I built an LVM-based host setup.

    The issue: First I got

    Shutting down the Logical Volume Manager
      Shutting Down logical volume: /dev/deb_kvm/home
      Shutting Down logical volume: /dev/deb_kvm/rootfs
      Shutting Down logical volume: /dev/deb_kvm/swap
      Shutting Down logical volume: /dev/deb_kvm/tmp
      Shutting Down logical volume: /dev/deb_kvm/var
    Finished Shutting down the Logical Volume Manager
    

    messages, and as a result only the /boot partition outside the LVM and the virtual hard drive inside the thin pool (I set up a Windows 10 guest) were backed up (Clonezilla used dd mode). All other volumes are missing.

    I collected all important details at

    https://beilagen.dreael.ch/Diverses/Clonezilla_LVM_Problem/

    It may be helpful to clarify how the current Clonezilla version handles LVM2 in general (supported and unsupported LVM setup scenarios), especially when regular volumes and a thin pool coexist in the same volume group.

    As a feature request, the following idea might make sense for a future version: treat the LVM as a container of sub hard drives. So an outer

    foreach lv in lvdisplay
    

    loop should run. On every LV, try to detect a valid file system first, then also check for a partition table (GPT or MBR). When neither can be found, use dd. In case of a valid partition table, a subdirectory should be created and partclone should be run on the contained partitions. In my example, Clonezilla would then detect the NTFS drives of my Windows 10 guest, which could then be backed up like a physical Windows 10 installation.
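
    A rough shell sketch of this detection loop (my own illustration, not existing Clonezilla code; the save_*/recurse_* helpers are only placeholders):

    # Sketch only: classify every LV and pick a backup method.
    for lv in $(lvs --noheadings -o lv_path); do
      fstype=$(blkid -p -o value -s TYPE "$lv" 2>/dev/null)
      pttype=$(blkid -p -o value -s PTTYPE "$lv" 2>/dev/null)
      if [ -n "$fstype" ]; then
        save_fs_with_partclone "$lv" "$fstype"   # placeholder: e.g. partclone.ntfs for a Windows guest disk
      elif [ -n "$pttype" ]; then
        recurse_into_partitions "$lv"            # placeholder: create a subdirectory, map the partitions, partclone each
      else
        save_lv_with_dd "$lv"                    # no file system and no partition table found
      fi
    done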

    The restore process should run in the reverse direction: create the PV/VG/LVs as saved in the backup set, then run the partclone restore operations on the LVs themselves.

     
  • Steven Shiau

    Steven Shiau - 2024-06-18

    Thanks for your feedback. We will read it and try to improve this in the future.

    Steven

     
  • dreael

    dreael - 2024-06-25

    In the meantime, I set up a new variant: all KVM host OS parts are on classic ext4 partitions, while the virtual hard disks use a VG containing only a single thin pool with all virtual hard drives attached (I set up two Windows guests inside Proxmox).

    First I did a backup with Clonezilla; this time at least all virtual hard drives were backed up, but the verification process showed some error messages. After that, I wiped the whole SSD RAID-1 array (all sectors blkdiscard-ed) and then ran a restore, simulating a disaster recovery situation, i.e. a bare-metal Clonezilla restore process.

    The classic partitions were not a problem; the host OS restarted fine as expected. But restoring the LVM2 volumes caused a lot of trouble, beginning with "LV Status NOT available" in the lvdisplay output. Trying to fix them shows messages like

    root@kvmhost2:~# vgchange -a y
      Thin pool pve-vhd-tpool (252:5) transaction_id is 0, while expected 6.
      Thin pool pve-vhd-tpool (252:5) transaction_id is 0, while expected 6.
      Thin pool pve-vhd-tpool (252:5) transaction_id is 0, while expected 6.
      3 logical volume(s) in volume group "pve" now active
    

    I also tried commands like lvconvert --repair and vgck, but without success.

    The only feasible way: deleting the whole VG and recreating it, i.e. beginning with lvremove pve/vm-100-disk-0 and ending with pvremove /dev/md126p10, then trimming all sectors on the SSDs using blkdiscard /dev/md126p10 in my case. After that, I recreated the whole LVM structure manually, i.e.

    pvcreate /dev/md126p10
    vgcreate pve /dev/md126p10
    lvcreate --type thin-pool -L 727G -n vhd pve
    lvcreate -V 64.2G -n vm-100-disk-0 --thinpool vhd pve
    lvcreate -V 20.07G -n vm-101-disk-0 --thinpool vhd pve
    

    Restore process itself:

    mount /dev/sdc1 /mnt
    unzstd </mnt/2024-06-20-15-img_KVM2_Proxmox_LVM_nurThinPool/pve-vm-100-disk-0.dd-ptcl-img.zst|dd of=/dev/mapper/pve-vm--100--disk--0 conv=sparse
    unzstd </mnt/2024-06-20-15-img_KVM2_Proxmox_LVM_nurThinPool/pve-vm-101-disk-0.dd-ptcl-img.zst|dd of=/dev/mapper/pve-vm--101--disk--0 conv=sparse
    

    Note the conv=sparse option, which makes dd skip writing blocks that are entirely zero, so the thin pool LVs stay really thin.

    In the end, both my Proxmox guests booted successfully.

    All details are collected at

    https://beilagen.dreael.ch/Diverses/KVM2_LVM_nurThinPool_Clonezilla/

    Thanks in advance for considering this for a future Clonezilla release.

     
  • Steven Shiau

    Steven Shiau - 2024-06-26

    Thanks for updating that. Would it be possible for you to share the VM with us? We'd like to reproduce this issue and try to improve it.

    Steven

     
  • dreael

    dreael - 2024-06-27

    I ran Clonezilla on a physical system which is set up as shown in

    https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_12_Bookworm

    in general, but with a customized partition/LVM layout as shown in the collected data. After setting it up, it is recommended to create at least two guest systems so there is some material in the LVM for testing the backup and restore process.

    The easier test case would be an LVM setup with a thin pool for the virtual hard drives only. The more advanced test case would be one where only /boot is a regular partition, while /, /home, /var and so on are all LVs formatted with XFS, which allows resizing without unmounting or rebooting.
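
    For example, growing such an LV online is a two-liner (sketch; the VG/LV name and mount point are only examples):

    lvextend -L +10G /dev/deb_kvm/var   # example VG/LV name
    xfs_growfs /var                     # XFS grows while mounted, no umount or reboot needed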

     
  • Steven Shiau

    Steven Shiau - 2024-06-28

    Mmm... Actually Clonezilla does not support thin-provisioned LVM, because some of the required tools lack the necessary functions.
    So your issue is about thin-provisioned LVM? If so, for the moment it cannot be done.

    Steven

     
  • dreael

    dreael - 2024-07-08

    In the meantime I have tested a new scenario for you:

    https://beilagen.dreael.ch/Diverses/KVM1_Clonezilla_LVM_konventionell_ohne_ThinPool/

    In short: conventional LVs, which means thick provisioning from the hypervisor's perspective.

    As you can see in the collected log files, this case works without errors. The only big issue: on a virtualisation host, you typically don't use the whole disk space inside the volume group, since the free space is needed for further VMs, snapshots and so on. The current implementation tries to resize all LVs proportionally, which is not desired on a virtualisation host. So restoring every LV to its original size would be much more useful.

    Keeping the LVs' original sizes should be the standard behaviour when using beginner mode for restoredisk. In expert mode, an option like

    [x] LVM2: resize LVs to use the whole disk space (unchecked=restore LVs to their original sizes)

    would be a sensible way to support this.

    Thin pool support should also be filed as a feature request for a future Clonezilla version.

    Note: snapshots on LVs with KVM are a separate topic (RAW instead of QCOW2), as I tested today; this does not affect Clonezilla.

     
  • dreael

    dreael - 2024-07-15

    In the meantime, a full backup / wipe disk / restore cycle has been done: it works when using conventional LVM, i.e. no thin pool.

    While watching the restore process, I saw some lvresize commands. So the improvement for the next release would simply be to leave out these commands, which I currently have to undo with lvreduce to get back to the original LV sizes.
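
    My current workaround, as a sketch (the VG name pve and the file path are only examples): record the exact LV sizes before the backup and shrink everything back after the restore:

    # Before savedisk: record the exact LV sizes in bytes.
    lvs --units b --noheadings -o lv_name,lv_size pve > /root/lv_sizes.txt

    # After restoredisk: shrink every LV back to its recorded size.
    # Safe here because the restored image only fills the original size anyway.
    while read -r lv size; do
      lvreduce -f -L "$size" "/dev/pve/$lv"
    done < /root/lv_sizes.txt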

     
  • Steven Shiau

    Steven Shiau - 2024-07-16

    "So the improvement for the next release would simply be to leave out these commands, which I currently have to undo with lvreduce to get back to the original LV sizes." -> Are you sure? I believe there is more.
    Actually, we tried to support thin-provisioned LVM in Clonezilla. However, we encountered this issue:
    https://github.com/jthornber/thin-provisioning-tools/issues/126

    Steven

     
  • dreael

    dreael - 2024-08-13

    In the meantime I have tested the latest 3.1.3-16 release with LVM. The test scenario does not use a thin pool, i.e. it should be compatible with the current Clonezilla release. Result: several red messages still appear. For details see

    https://beilagen.dreael.ch/Diverses/KVM2_LVM_ohneThinpool_Clonezilla_aktVers/

    First I thought that something might be corrupted in the test LVM, so I went the long way: backed up all LVs, deleted everything and recreated the whole LVM from scratch. Result:

    https://beilagen.dreael.ch/Diverses/KVM2_LVM_ohneThinpool_Clonezilla_LVM_neu/

    I still got the red messages, so I made some observations. Debian 12 (Bookworm) on my test system uses LVM 2.03.16(2) (2022-05-18) while the Clonezilla live system uses 2.03.22(2) (2023-08-02) (see the lvm_version*.txt files). Another hint for you: lvdisplay/vgdisplay run in the Clonezilla live USB cmd environment show warnings like

      WARNING: Not using device /dev/sda11 for PV 9ACvtN-9p5W-JOL6-f3vo-698H-OfU2-qDYyMw.
      WARNING: Not using device /dev/sdb11 for PV 9ACvtN-9p5W-JOL6-f3vo-698H-OfU2-qDYyMw.
      WARNING: PV 9ACvtN-9p5W-JOL6-f3vo-698H-OfU2-qDYyMw prefers device /dev/md126p11 because device is used by LV.
      WARNING: PV 9ACvtN-9p5W-JOL6-f3vo-698H-OfU2-qDYyMw prefers device /dev/md126p11 because device is used by LV.
    

    which do not appear when the same commands are run on the Debian 12 host system. So I suspect an LVM version issue here.

    The next test I will do: back up the LVs and delete the LVM completely again, but recreate it from the Clonezilla USB live environment, because I suspect a version issue somewhere.

     
  • dreael

    dreael - 2024-08-15

    In the meantime, the LV recreation from Clonezilla's live environment has been done and the backup repeated. Result: the same error messages. So the LVM version is not the root cause. For details see

    https://beilagen.dreael.ch/Diverses/KVM2_LVM_ohneThinpool_Clonezilla_LVM_LiveUmg/

    But I observed something very weird: when going to the shell (cmd) after the backup, lvdisplay shows warnings (lvdisplay_warnungen.txt) and one LV shows "NOT available" as its status.

    When I boot from the stick and go to the shell (cmd) directly instead of "Start_Clonezilla" and run lvdisplay, then all LVs show status "available" and no warning appears. See also the differences (lvdisplay_diff.txt).

    Note: on Clonezilla version 3.1.2-22, these red messages didn't appear, i.e. I was able to back up my test system without errors, so this seems to be a new bug.

     
  • Steven Shiau

    Steven Shiau - 2024-08-22

    Mm... If you are able to reproduce this issue in a virtual machine, please share the virtual machine files so that we can try to reproduce it here. Otherwise it's not easy to debug and fix.
    Thanks.

    Steven

     
  • dreael

    dreael - 2024-08-28

    In the meantime, the puzzle about the red messages seems to be solved: in the current Clonezilla version, LVs must contain some content (partitions), i.e. empty LVs (all sectors zeroed) caused the errors reported before (this happens with the current 3.1.3-16 as well as the 3.1.2-22 version).

    Remark: the vboxhd-vboxnetw65 LV was left over from a failed VirtualBox NetWare installation test (the CD boot aborted), so it was empty. Therefore I applied the following steps beforehand:

    fdisk /dev/mapper/vboxhd-vboxnetw65
    

    (creating a FAT16 partition)

    partprobe /dev/mapper/vboxhd-vboxnetw65
    mkfs.fat -n NETW_TEST /dev/mapper/vboxhd-vboxnetw65p1
    

    Result:

    https://beilagen.dreael.ch/Diverses/KVM2_LVM_ohneTP_alleLVs_part/

    i.e. the current 3.1.3-16 release was able to back up my test system successfully, with no red message at the end. Note lsblk.txt: all LVs contain partitions. Note also lvdisplay.txt: all LVs keep the status "available".

    The next step will be the disk wipe and restore test; I will report back on that in due time.

    You should be able to reproduce the bug by just creating a new LV and leaving it empty.
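
    A reproduction sketch (the VG name is only an example from my setup):

    # Create a fresh LV and make sure it stays completely empty.
    lvcreate -L 1G -n empty_test vboxhd
    wipefs -a /dev/vboxhd/empty_test   # no file system, no partition table -> the backup should end with the red messages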

     

    Last edit: dreael 2024-08-28
  • dreael

    dreael - 2024-08-30

    In the meantime, the restore test using 3.1.3-16 has also been done.

    In general, it worked fine. The only remaining issue: all LV sizes are increased to allocate the whole VG's space instead of keeping their original size, so I have to lvreduce every LV manually back to its original size first. I also checked the expert mode for new options. For details see

    https://beilagen.dreael.ch/Diverses/KVM2_LVM_ohneTP_Restore3_1_3-16/

    In short: the current Clonezilla is usable for my project when taking care of the known issues (no thin pool, all LVs must contain content such as partitions, and lvreduce must be applied after a full restore, so documenting the original LV sizes with lvdisplay before the backup is recommended).

     
  • Steven Shiau

    Steven Shiau - 2024-08-31

    Thanks for your feedback.
    "The only still open issue: Alle LV sizes are increased to allocate the whole VG's space instead remaining the original LV's size" ->
    You can enter expert mode, and deselect the option "-r":
    https://clonezilla.org//clonezilla-live/doc/02_Restore_disk_image/advanced/09-advanced-param.php

    Steven

     
  • dreael

    dreael - 2024-11-07

    Thanks for the hint about deactivating the -r option. I successfully tested it recently, i.e. all the LVs keep their size.

    Suggestion for the next version: why not make -r disabled the default option, i.e. the default behaviour should be to keep all sizes and partition positions at their original values? Another approach could be to compare the disk size (savedisk documents it in the partition table file): when it is equal (same hardware as used for the backup, or the same model found), -r should be disabled by default; otherwise the user can be asked.

     
  • Steven Shiau

    Steven Shiau - 2024-11-10

    "Suggestion for the next version: Why not set -r disabled as default option, i.e. default behaviour should be keep all sizes and partition positions to the original values. Another approach may be compare the disk size (savedisk documents it in the partition table file). When it is equal (same hardware as used for the backup or same model found) then -r should be disabled by default, otherwise the user can be asked." -> Thanks for your suggestion. We will think about how to make it better.

    Steven

     
  • Steven Shiau

    Steven Shiau - 2024-12-01

    Please give Clonezilla live >= 3.2.0-22 or 20241201-* a try.
    The behaviour you mentioned has been implemented, i.e., by default the "-r" option is not on.
    When the option "-k1" is used, the "-r" option will be added, too.
    Please let us know the results. Thanks.

    Steven

     

    Last edit: Steven Shiau 2024-12-01
  • Victor

    Victor - 2024-12-28

    Hi, I'm facing the same issue (TASK ERROR: activating LV 'pve/data' failed: Thin pool pve-data-tpool (252:4) transaction_id is 0, while expected 726.) after creating an image of a disk and trying to restore it on a different machine.

    I've tried restoring the backup with Clonezilla live version 3.2.0-29 (the image was created with version 3.2.0-5, but I can create it again with a different version if needed), but unfortunately I'm still getting the same error after restoring the backup.

    I used the -k0 option to keep the same partition sizes, and I double-checked that the "-r" option in the advanced options was not selected. Is there any other option I can try to see if it fixes the issue?

     

    Last edit: Victor 2024-12-28
