The problem started with stable version clonezilla-live-20211116-impish-amd64. The previous stable version, clonezilla-live-20210817-hirsute-amd64, worked fairly well considering the number of partitions with and without logical volumes that I have to back up.
The machine is a quite old but extremely reliable Hewlett-Packard Compaq 6200 Pro SFF PC with a 64-bit Intel Quad-Core i5-2400 and 8GB of RAM. It has two internal and three external hard disk drives, two of the latter in a dual enclosure connected through USB3, and the other in a single-drive enclosure connected through eSATA.
As a retired IT professional, I have the peculiar hobby of “collecting” Linux distributions, and the HP has fourteen of them installed in a three-partition layout each, all but one using logical volumes (lvm2). The system disks are as follows:
ata1: SATA link up 6.0 Gbps
  [sda] Seagate Constellation ES.3 (ST6000NM0044) 6TB 128MB Cache 7200RPM SATA 6.0Gb/s 3.5" Internal HD
ata2: SATA link up 3.0 Gbps
  [sdb] Seagate Barracuda 7200.12 (ST3500413AS) 500GB 7200RPM 16MB Cache SATA 6.0Gb/s 3.5" Internal HD
ata5: SATA link up 3.0 Gbps
  StarTech.com S3510BMU33ET USB 3.0/eSATA Trayless 3.5" SATA III HDD External Enclosure with UASP
  |__ [sde] Seagate Barracuda 7200.12 (ST31000528AS) 1TB 7200RPM SATA 3Gb/s 32MB Cache 3.5" Internal HDD
Bus 04. Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M ID 1d6b:0003 Linux Foundation 3.0 root hub
  |__ Port 3: Dev 2, If 0, Class=Mass Storage, Driver=uas, 5000M ID 152d:9561 JMicron Technology Corp. / JMicron USA Technology Corp.
      |__ StarTech.com S3520BU33ER USB 3.0/eSATA Trayless Dual 3.5" SATA III HDD External RAID Enclosure with UASP
          |__ [sdc] Seagate Barracuda XT (ST33000651AS) 3TB 7200RPM 64MB Cache SATA 6.0Gb/s 3.5" Internal HDD
          |__ [sdd] Seagate Barracuda 7200.12 (ST31000528AS) 1TB 7200RPM SATA 3Gb/s 32MB Cache 3.5" Internal HDD
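(For reference, a listing in this spirit can be regenerated on any machine with lsblk; the columns shown here are my choice, not what produced the listing above.)

```shell
# Summarise the drives: device name, model, size, transport (sata/usb),
# and whether the drive is rotational. "-d" limits output to whole disks.
lsblk -d -o NAME,MODEL,SIZE,TRAN,ROTA
```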
The disks are not especially fast but have proven to be dependable. Backing up all those Linux distributions and several independent filesystems is an all-day affair, particularly with maximum compression, but it was manageable.
The problem I have is that partition and disk detection has become extremely slow starting with stable version clonezilla-live-20211116-impish-amd64. I have also tested with clonezilla-live-20220103-impish-amd64, with the same results. At issue is the part where the partitions to be backed up have to be selected:
Finding all disks and partitions..
Excluding busy partition..
Excluding linux raid member partitions..
Finding partitions..
Partition number: 56
With version clonezilla-live-20210817-hirsute-amd64, this would take barely one minute; since version clonezilla-live-20211116-impish-amd64, it takes on average between seven and nine minutes:
2022-01-05 12:52:00.051918200 -0500 ocs-cache/disk_list.cache
2022-01-05 12:52:00.055918200 -0500 ocs-cache/pttable_for_disk.txt
2022-01-05 12:54:59.319922200 -0500 ocs-cache/dev_fs_size_type.cache
2022-01-05 12:55:15.015922500 -0500 ocs-cache/pttable_blkid_for_dev.txt
2022-01-05 12:58:51.935927300 -0500 ocs-cache/pttable_for_part.txt
2022-01-05 12:58:51.935927300 -0500 ocs-cache/part_list.cache
This means that the waiting time has gone from roughly fourteen minutes (14x1 min.) to an hour and thirty-eight minutes (14x7) to back up all fourteen Linux distributions, not counting additional partitions.
For now I am sticking with clonezilla-live-20210817-hirsute-amd64. I apologise for the slightly excessive amount of data I’ve provided, but I am hoping it will help in diagnosing this issue.
Last edit: Bernard Michaud 2022-01-05
So could you please give the testing Clonezilla live a try, e.g. 2.8.1-12 or 20220103-*: https://clonezilla.org/downloads.php
We have made some improvements related to this issue.
Steven
I should have mentioned that the last test was in fact performed with test version clonezilla-live-20220103-impish-amd64. This is why I wrote in my initial submission that the issue started with clonezilla-live-20211116-impish-amd64.
Not sure if it's related to the Linux kernel. If so, could you please give 2.8.1-12 or 20220103-jammy a try?
They come with a newer Linux kernel, so the results might be different.
Steven
I tried clonezilla-live-20220103-jammy-amd64 with the 5.13.0-19 kernel, and if anything it was even worse. First, the devices were out of sequence, as if udisks was not properly correlating the device entries to their corresponding AHCI port numbers. I had to restart Clonezilla three times until they came out in their proper sequence. I can’t explain why that was, but I seem to recall that this would sometimes happen on Ubuntu. Then, the backup drive selection took over ten minutes, worse than clonezilla-live-20220103-impish-amd64. Finally, I tried an image backup, and this was again much worse than before:
2022-01-05 23:00:07.504062200 -0500 ocs-cache-jammy/pttable_for_disk.txt
2022-01-05 23:00:07.504062200 -0500 ocs-cache-jammy/disk_list.cache
2022-01-05 23:12:23.720085800 -0500 ocs-cache-jammy/dev_fs_size_type.cache
2022-01-05 23:12:43.728086400 -0500 ocs-cache-jammy/pttable_blkid_for_dev.txt
2022-01-05 23:19:41.924099800 -0500 ocs-cache-jammy/part_list.cache
2022-01-05 23:19:41.928099800 -0500 ocs-cache-jammy/pttable_for_part.txt
The dev/partition/filesystem processing took over nineteen minutes. This seems utterly broken, and as such all recent versions of Clonezilla are essentially unusable on this machine. It runs fairly fast on my HP Spectre laptop, which has a 1TB NVMe SSD, but that is admittedly a much more recent machine, so this might be an issue with SATA/SCSI hard drives.
Last edit: Bernard Michaud 2022-01-06
Since you mentioned it's an old machine, could you please give Clonezilla live 2.8.1-12 (Debian-based) a try?
Steven
I just tried it. Under Debian, the disk devices were in the proper sequence and the backup disk selection took about the same time as under Ubuntu, but the disk and partition processing for an image backup ran even slower than for Ubuntu, from 2022-01-06 00:57:33 to 2022-01-06 02:06:41, an unbelievable hour and nine minutes.
I must reiterate that version clonezilla-live-20210817-hirsute-amd64 runs fast, devices, partitions and filesystems being processed in around a minute, while every version since 20211116 does not, running eight to ten times slower. It stands to reason, therefore, that this must be due to changes in the way Clonezilla now processes devices, partitions and filesystems. As others have mentioned, the disk access indicators show constant activity, but with seemingly little actual work being accomplished. It appears that there is excessive looping within the code.
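To check whether per-partition probing is where the time goes, a rough timing loop over the partitions might help. This is only a sketch of my own devising, not anything Clonezilla itself runs: blkid -p does a low-level probe that merely approximates the kind of per-device scanning involved.

```shell
#!/bin/bash
# Time a low-level filesystem probe on each partition to spot outliers.
# blkid -p bypasses the blkid cache and probes the device directly; run as root.
for part in $(lsblk -lnpo NAME,TYPE | awk '$2 == "part" {print $1}'); do
  start=$SECONDS
  blkid -p "$part" > /dev/null 2>&1
  echo "$part: $((SECONDS - start))s"
done
```

If one enclosure or disk dominates the total, that would point at a device or transport issue rather than the scanning code itself.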
Is it possible for me to access your machine via ssh after booting Clonezilla live?
If so, please email me at steven at clonezilla org.
With that, it would be easier for us to debug and find out why.
Thanks.
Steven
Last edit: Steven Shiau 2022-01-06
Or you can post the files you have saved in the image dir, including:
*.sf
lvm_*.*
I can try to use those files to create a virtual machine, then try to reproduce this issue.
Steven
Another way is for you to create a virtual machine in which you can reproduce this issue, and share that VM with me.
If the issue can be reproduced, then it can almost certainly be fixed.
Thanks.
Steven
Thank you, Steven, for taking the time to address this problem, as well as for offering several helpful proposals. Unfortunately, opening an Internet-facing SSH port is far too risky for my security infrastructure. Such ports are detected within minutes through constant aggressive scanning and are relentlessly subjected to brute-force attacks almost immediately. There have been enough vulnerabilities reported in SSH and SSL implementations in the last two years alone, not to mention weaknesses in the underlying physical infrastructure, to make this a hazardous undertaking.
As for the second proposal, I sincerely doubt that the issue I’m having with Clonezilla is data-related. The slowest part is concentrated in the disk and partition handling, not the actual data backup through partclone. In my opinion, installing the data elsewhere would probably not adequately reproduce the unique combination of the five SATA and USB-attached disks with their 56 partitions that is peculiar to this machine.
The thing to remember is that the 20210817-hirsute-amd64 release of Clonezilla works perfectly, and is still the one I am currently using. The problem started with the 20211116-impish-amd64 release, and continues with the recent 20220103-impish-amd64 release. Whatever changes were made to disk and partition handling between the 20210817 and 20211116 releases is responsible for the issue I’m having.
I also doubt the underlying operating systems are to blame, in this case Ubuntu, as I have the 21.10 release with the 5.13.0-23 kernel installed on that very machine and there have been no noticeable disk or partition issues with it. In fact, the machine successfully handles kernels ranging from 5.13 to the very latest 5.15.13 with nary a difficulty in sight, and several Red Hat versions to openSUSE Tumbleweed on the way to Arch Linux with no significant problems.
Believe me, I aim to assist in identifying the source of this problem and hopefully contribute to correcting it, but the machine is a critical part of my infrastructure and I have limited opportunities for extended downtime for testing purposes.
OK, we have pushed Clonezilla live 2.8.1-12 and 20220103-impish as the stable releases, but they do not address the issue you have. We will keep trying to reproduce this issue and fix it.
In the meantime, could you please boot Clonezilla live 20220103-impish on the machine, enter the command line prompt, then run:
sudo time ocs-prep-cache
to see how long it takes.
If you are familiar with screen or tmux, please enter it, and run it again with:
sudo bash -x ocs-prep-cache
Then copy & paste the output on the screen to a file, and post it.
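One way to capture that trace to a file without relying on copy & paste (a sketch; note that /tmp lives in RAM on the live system, so the log must be copied elsewhere before rebooting):

```shell
# Run the traced script once, showing the output while also saving it.
# bash -x writes its trace to stderr, hence the 2>&1 redirection.
sudo bash -x ocs-prep-cache 2>&1 | tee /tmp/ocs-prep-cache-trace.log
```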
Thanks.
Steven
My apologies for the tardy response, I had a hectic last few days, but I’ve freed some time from a busy schedule to try and move this forward.
I did notice that Clonezilla live 20220109-impish is available; would you rather I use that instead of 20220103-impish? Either is fine with me. Also, do I run “sudo time ocs-prep-cache” before or after the backup disk selection dialogue?
Finally, it’s been at least two decades since I last used the screen command; it was my favourite for running tasks on remote servers, protecting my processes from inadvertent session disconnections. I can’t quite remember whether it automatically saves the output from commands run within it; that would be practical in order to retain all output from ocs-prep-cache.
Thanks.
Yes, please use 20220109-impish to test that.
Please boot it and enter command line prompt, before doing anything, just run:
sudo time ocs-prep-cache
Actually, in this release we added a boot parameter to disable the device list info cache, i.e., you can use
use_dev_list_cache=no
to do that. For more info, please check the changelog.
Thank you for debugging this issue with us. Please let us know the results if you test both of them.
Steven
I’ll use Clonezilla live 20220109-impish then. As well, sorry for asking what may seem obvious, but how do I specify the boot parameter use_dev_list_cache=no? Do I add it to /tftpboot/nbi_img/pxelinux.cfg/default as mentioned in the FAQ?
Thanks again.
Oh, the file /tftpboot/nbi_img/pxelinux.cfg/default is for PXE netboot.
I believe you are booting from a USB flash drive? Or CD? If so, it depends. If the boot menu is syslinux or isolinux for legacy BIOS, press the "Tab" key when you see this kind of boot menu:
https://clonezilla.org/clonezilla-live/doc/01_Save_disk_image/images/ocs-01-bootmenu.png
Then append use_dev_list_cache=no at the end of the line (/live/vmlinuz... vmwgfx.enable.fbdev=1, actually all one line in this case).
If it's a grub boot menu, press "e" to enter editing mode, then append use_dev_list_cache=no at the end of the line ($linux_cmd... vmwgfx.enable.fbdev=1).
If you want to write it to the config file, check this doc:
https://clonezilla.org/fine-print-live-doc.php?path=./clonezilla-live/doc/99_Misc/00_live-boot-parameters.doc#00_live-boot-parameters.doc
Steven
I ended up modifying my usual Clonezilla entry in boot/grub/grub.cfg, adding the use_dev_list_cache=no entry at the end of the command line. Upon booting, a cursory check confirmed the parameter was enabled.
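The resulting entry looked something like this (an illustrative sketch; the kernel path and other parameters are typical Clonezilla live defaults, not copied from my actual file, and only use_dev_list_cache=no is the relevant addition):

```cfg
menuentry "Clonezilla live (no dev list cache)" {
  linux /live/vmlinuz boot=live union=overlay config components quiet username=user use_dev_list_cache=no
  initrd /live/initrd.img
}
```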
As well, the time command is apparently not installed by default, which is surprising. I had to download the time_1.9-0.1_amd64.deb package and install it with dpkg to get that to run.
Here’s the output from sudo time ocs-prep-cache:
use_dev_list_cache is set as no. Do not prepar device name cache files in /tmp/ocs-cache//...
0.08user 0.03system 0:00.11elapsed 102%CPU (0avgtext+0avgdata 15596maxresident)k
320inputs+0outputs (0major+7053minor)pagefaults 0swaps
It ran extremely quickly, which is not at all what I expected. This does not appear to be the same disk/partition/filesystem processing that takes several minutes for each partition. Should I still run sudo bash -x ocs-prep-cache? If so, what is the best way to save the output? Running as a live system, there is no local file storage available.
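In case it helps, one way to persist such output from a live session is to mount one of the machine's existing data partitions and write the log there (a sketch; /dev/sdb1 is only an example, and it should be a partition that is not itself being imaged):

```shell
# Mount a data partition, run the traced script, save the trace to it.
sudo mount /dev/sdb1 /mnt
sudo bash -x ocs-prep-cache 2>&1 | sudo tee /mnt/ocs-prep-cache-trace.log
sudo umount /mnt
```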
If you use use_dev_list_cache=no in the boot parameter, then no need to run "time ocs-prep-cache" since it will be skipped.
To test ocs-prep-cache, you have to remove "use_dev_list_cache=no" from the boot parameter.
What I suggest is:
1. Add "use_dev_list_cache=no" in the boot parameters, then just do a regular saving, to see if there is any difference between this version and the previous one.
2. Do not add "use_dev_list_cache=no", then run: "time ocs-prep-cache".
BTW, time is also a builtin in bash, so I should correct my description. Just switch to root with "sudo -i", then run:
time ocs-prep-cache
Steven
All right, this was very interesting. I had to run a test image backup three times, all for different reasons. All three were run using Clonezilla live 20220109-impish with boot parameter use_dev_list_cache=no.
First run, I had forgotten to comment out os-prober in /usr/share/drbl/sbin/ocs-functions. Usually, every time a new stable release comes out, I go to the trouble of un-squashing the filesystem.squashfs, editing ocs-functions to add a ‘#’ to the os-prober line, and re-squashing the filesystem. This is because os-prober takes forever to run on that machine. I also disable GRUB’s os-prober in all of my many Linuxes.
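For reference, that re-squashing workaround amounts to something like the following (paths are assumptions based on the standard Clonezilla live media layout, with the squashfs under live/):

```shell
# Unpack the live filesystem, comment out the os-prober call, re-pack.
unsquashfs -d squashfs-root live/filesystem.squashfs
sed -i 's|^\([[:space:]]*\)LC_ALL=C os-prober|\1#LC_ALL=C os-prober|' \
  squashfs-root/usr/share/drbl/sbin/ocs-functions
mksquashfs squashfs-root live/filesystem.squashfs -comp xz -noappend
```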
Second run, I followed the instructions in the split file dialogue and left the default of 0 for no splitting. The image backup appeared to run normally, but the image verification failed with a strange ‘invalid header’ error, and then froze when checking the lvm image files.
Third and last run, I entered ‘1000000’ in the split file dialogue, which was the recommendation for the older releases of Clonezilla, 20210817-hirsute included, if file splitting was not required. This one worked.
So, the conclusion of all this is that Clonezilla live 20220109-impish with boot parameter use_dev_list_cache=no works on this machine much the same as it did with 20210817-hirsute, which is good news. Would you still want me to run “time ocs-prep-cache” without the use_dev_list_cache=no parameter?
As well, as an enhancement request, it would be nice if os-prober could be disabled, probably through another parameter.
Last edit: Bernard Michaud 2022-01-13
"Would you still want me to run “time ocs-prep-cache” without the use_dev_list_cache=no parameter?" -> Yes.
"It would be nice if os-prober could be disabled, probably through another parameter." -> Which files are related to this? Please list them so that we know how to deal with it.
Steven
I ran “time ocs-prep-cache” from the command line without the use_dev_list_cache=no parameter, and it ran relatively fast.
However, when running Clonezilla live 20220109-impish in the usual way without the use_dev_list_cache=no parameter, the backup disk selection dialogue took as long as earlier attempts, disk activity never stopping. I halted it after ten minutes by rebooting. I can’t explain why running ocs-prep-cache manually from the command line runs faster than when running Clonezilla live as usual. It seems obvious that the lengthy device/partition/filesystem processing is more than just ocs-prep-cache.
As for disabling os-prober, a change must be made in squashfs-root/usr/share/drbl/sbin/ocs-functions from:
if type os-prober &>/dev/null; then
  echo "Saving OS info from the device..."
  echo "This OS-related info was saved from this machine with os-prober at $save_time:" > $img_dir/Info-OS-prober.txt
  LC_ALL=C os-prober >> $img_dir/Info-OS-prober.txt
fi
to:
if type os-prober &>/dev/null; then
  echo "Saving OS info from the device..."
  echo "This OS-related info was saved from this machine with os-prober at $save_time:" > $img_dir/Info-OS-prober.txt
  #LC_ALL=C os-prober >> $img_dir/Info-OS-prober.txt
fi
Commenting out os-prober prevents it from running.
Last edit: Bernard Michaud 2022-01-13
Well, it’s not all bad; at least I now know to specify the use_dev_list_cache=no parameter in Clonezilla releases 20220109-impish and beyond to get back the same processing speed I had before. I guess it beats standing still and continuing to use release 20210817-hirsute, which works fine on my machine but will no longer evolve.
In this release, a boot parameter was added to disable os-prober, as you have requested:
use_os_prober=no
For more info, please check changelog.
BTW, when you mentioned:
"Second run, I followed the instructions in the split file dialogue and left the default of 0 for no splitting. The image backup appeared to run normally, but the image verification failed with a strange ‘invalid header’ error, and then froze when checking the lvm image files." -> which parameters did you choose? Is that in expert mode and you choose to use "-z5p" (pixz) or "-z5" (xz)? If so, a bug related to this has been fixed in the same version of Clonezilla live.
Thank you for debugging this issue with us. Please let us know the results if you test both of them.
Steven
I gave the new test release 20220118-impish a try. Beforehand, I edited /boot/grub/grub.cfg and added parameters use_dev_list_cache=no and use_os_prober=no to my customary boot entries. I can report that performance is well within the realm of 20210817-hirsute, my reference release. So, all things considered, I am satisfied that I can continue to use Clonezilla on this machine by disabling the dev list cache. As well, I appreciate the parameter to disable os_prober; the workaround was not overly complicated, but it was tedious to remember to apply it.
As for the other issue, it wasn’t with the compression option but the split image file value. My old release 20210817-hirsute had instructions to enter a “big number” such as ‘1000000’ so as not to split the image file, and had 4096 as the default value. New releases since 20211116-impish instead instruct to use ‘0’ to not split the image, and this is also the default value. Allowing ‘0’ to be used as the split image value causes the image verification to fail. Running it again but using ‘1000000’ as the split image value works properly. I have no idea why this is so.
The problem started with stable version clonezilla-live-20211116-impish-amd64. The previous stable version, clonezilla-live-20210817-hirsute-amd64, worked fairly well considering the number of partitions with and without logical volumes that I have to back up.
The machine is a quite old but extremely reliable Hewlett-Packard Compaq 6200 Pro SFF PC with a 64-bit Core Intel Quad Core i5-2400 and 8GB of RAM. It has two internal and three external hard disk drives, two of the latter in a dual enclosure connected through USB3, and the other in a single-drive enclosure connected through eSATA.
As a retired IT professional, I have the peculiar hobby of “collecting” Linux distributions, and the HP has fourteen of them installed in a three partition layout each, all but one using logical volumes (lvm2). The system disks are as follows:
This is the backup disk, connected through USB3:
The disks are not especially fast but have proven to be dependable. Backing up all those Linux distributions and several independent filesystems are an all-day affair, particularly with maximum compression, but it was manageable.
The partitions and filesystems are as follows:
The problem I have is that partition and disk detection has become extremely slow starting with stable version clonezilla-live-20211116-impish-amd64. I have also tested with clonezilla-live-20220103-impish-amd64 with the same results. At issue is the part were the partitions to be backed up have to be selected:
Finding all disks and partitions..
Excluding busy partition..
Excluding linux raid member partitions..
Finding partitions..
Partition number: 56
With version clonezilla-live-20210817-hirsute-amd64, this would take barely one minutes; since version clonezilla-live-20211116-impish-amd64, this takes on average between seven and nine minutes:
2022-01-05 12:52:00.051918200 -0500 ocs-cache/disk_list.cache
2022-01-05 12:52:00.055918200 -0500 ocs-cache/pttable_for_disk.txt
2022-01-05 12:54:59.319922200 -0500 ocs-cache/dev_fs_size_type.cache
2022-01-05 12:55:15.015922500 -0500 ocs-cache/pttable_blkid_for_dev.txt
2022-01-05 12:58:51.935927300 -0500 ocs-cache/pttable_for_part.txt
2022-01-05 12:58:51.935927300 -0500 ocs-cache/part_list.cache
This means that the waiting time has gone from roughly fourteen minutes (14x1 min.) to an hour and thirty-eight minutes (14x7) to back up all fourteen Linux distributions, not counting additional partitions.
For now I am sticking with clonezilla-live-20210817-hirsute-amd64. I apologise for the slightly excessive amount of data I’ve provided, but I am hoping it will help in diagnosing this issue.
Last edit: Bernard Michaud 2022-01-05
So could you please give testing Clonezilla live a try, e.g. 2.8.1-12 or 20220103-*:
https://clonezilla.org/downloads.php
We have made some improvements about this issue.
Steven
Hello Steven,
I have should have mentioned that the last test was in fact performed with test version clonezilla-live-20220103-impish-amd64. This is why I wrote in my initial submission that the issue started with clonezilla-live-20211116-impish-amd64.
Not sure if it's related to Linux kernel. If so, please give 2.8.1-12 or 20220103-jammy a try?
They come with newer Linux kernel so the results might be different.
Steven
I tried clonezilla-live-20220103-jammy-amd64 with the 5.13.0-19 kernel, and if anything it was even worse. First, the devices were out of sequence, as if udisks was not properly correlating the device entries to their corresponding AHCI port numbers. I had to restart Clonezilla three times until they came out in their proper sequence. I can’t explain why that was, but I seem to recall that this would sometimes happen on Ubuntu. Then, the backup drive selection took over ten minutes, worse than clonezilla-live-20220103-impish-amd64. Finally, tried an image backup, and this was again much worse than before:
2022-01-05 23:00:07.504062200 -0500 ocs-cache-jammy/pttable_for_disk.txt
2022-01-05 23:00:07.504062200 -0500 ocs-cache-jammy/disk_list.cache
2022-01-05 23:12:23.720085800 -0500 ocs-cache-jammy/dev_fs_size_type.cache
2022-01-05 23:12:43.728086400 -0500 ocs-cache-jammy/pttable_blkid_for_dev.txt
2022-01-05 23:19:41.924099800 -0500 ocs-cache-jammy/part_list.cache
2022-01-05 23:19:41.928099800 -0500 ocs-cache-jammy/pttable_for_part.txt
The dev/partition/filesystem processing took over nineteen minutes. This seems utterly broken, and as such all recent versions of Clonezilla are essentially unusable on this machine. It runs fairly fast on my HP Spectre laptop which has a 1TB nvme SSD, which admittedly is a much more recent machine, so this might be an issue with SATA/SCSI hard drives.
Last edit: Bernard Michaud 2022-01-06
Since you mentioned it's a old machine, could you please give Clonezilla live 2.1.8-12 (Debian-based) a try?
Steven
I just tried it. Under Debian, the disk devices were in the proper sequence and the backup disk selection took about the same time as under Ubuntu, but the disk and partition processing for an image backup ran even slower than for Ubuntu, from 2022-01-06 00:57:33 to 2022-01-06 02:06:41, an unbelievable hour and nine minutes.
I must reiterate that version clonezilla-live-20210817-hirsute-amd64 runs fast, devices, partitions and filesystems being processed in around a minute, while every version since 2021116 does not, running eight to ten times slower. It stands to reason therefore that this must be due to changes in the way Clonezilla now processes devices, partitions and filesystems. As others have mentioned, the disk access indicators show constant activity, but with little actual work being seemingly accomplished. It appears that there is excessive looping within the code.
Is that possible I can access your machine via ssh after booting Clonezilla live?
If so, please email me at steven at clonezilla org.
With that, it's easier for us to debug and find why.
Thanks.
Steven
Last edit: Steven Shiau 2022-01-06
Or you can post the files which you have saved in the image dir, including the files:
*.sf
lvm_*.*
I can try to use those files to create a virtual machine, then try to reproduce this issue.
Steven
Another way is you to create a virtual machine which you can reproduce this issue, and share that VM with me.
If the issue can be reproduced, then it can almost be fixed.
Thanks.
Steven
Thank you Steven for taking the time to address this problem, as well as for offering several helpful proposals. Unfortunately, opening an Internet-facing ssh port is far too risky for my security infrastructure. Such ports are detected within minutes through constant aggressive scanning and are relentlessly subjected to brute-force attacks almost immediately. There have been sufficient vulnerabilities reported in ssh and ssl implementations in the last two years alone not to mention weaknesses in the underlying physical infrastructure to make this a hazardous undertaking.
As for the second proposal, I sincerely doubt that the issue I’m having with Clonezilla is data-related. The slowest part is concentrated in the disk and partition handling, not the actual data backup through partclone. In my opinion, installing the data elsewhere would probably not adequately reproduce the unique combination of the five SATA and USB-attached disks with their 56 partitions that is peculiar to this machine.
The thing to remember is that the 20210817-hirsute-amd64 release of Clonezilla works perfectly, and is still the one I am currently using. The problem started with the 20211116-impish-amd64 release, and continues with the recent 20220103-impish-amd64 release. Whatever changes were made to disk and partition handling between the 20210817 and 20211116 releases is responsible for the issue I’m having.
I also doubt the underlying operating systems are to blame, in this case Ubuntu, as I have the 21.10 release with the 5.13.0-23 kernel installed on that very machine and there have been no noticeable disk or partition issues with it. In fact, the machine successfully handles kernels ranging from 5.13 to the very latest 5.15.13 with nary a difficulty in sight, and several Red Hat versions to openSUSE Tumbleweed on the way to Arch Linux with no significant problems.
Believe me, I aim to assist in identifying the source of this problem and hopefully contribute to correcting it, but the machine is a critical part of my infrastructure and I have limited opportunities for extended downtime for testing purposes.
OK, we have pushed Clonezilla live 2.8.1-12 and 20220103-impish as the stable release, but did not address the issue you have. We will keep trying to reproduce this issue and fix it.
Therefore, could you please boot Clonezilla live 20220103-impish on the machine, enter command line prompt, then
run:
sudo time ocs-prep-cache
to see how long did it run.
If you are familiar with screen or tmux, please enter it, and run it again with:
sudo bash -x ocs-prep-cache
Then cop & paste the output on the screen to a file, then post it.
Thanks.
Steven
My apologies for the tardy response, I had a hectic last few days, but I’ve freed some time from a busy schedule to try and move this forward.
I did notice that Clonezilla live 20220109-impish is available; would you rather I use that instead of 20220103-impish? Either is fine with me. Also, do I run “sudo time ocs-prep-cache” before or after the backup disk selection dialogue?
Finally, it’s been at least two decades that I’ve had to use the screen command; it was my favourite for running tasks on remote servers to protect my processes from inadvertent session disconnections. I can’t quite remember if it automatically saves the output from commands being run within; this would be practical in order the retain all output from ocs-prep-cache.
Thanks.
Yes, please use 20220109-impish to test that.
Please boot it and enter command line prompt, before doing anything, just run:
sudo time ocs-prep-cache
Actually in this release, we add a boot parameter to disable the devices list info cache, i.e., you can use
use_dev_list_cache=no
to make that. For more info, please check changelog.
Thank you for debugging this issue with us. Please let us know the results if you test both of them.
Steven
I’ll use the Clonezilla live 20220109-impish then. As well, sorry for asking what may seen obvious, but how to I specify the boot parameter use_dev_list_cache=no — do I add it to /tftpboot/nbi_img/pxelinux.cfg/default as mentioned in the FAQ?
Thanks again.
Oh, the file /tftpboot/nbi_img/pxelinux.cfg/default is for PXE netboot,
I believe you are booting from USB flash drive? Or CD? If so, this depends. If the boot menu is syslinux or isolinux for legacy BIOS, you have to press "Tab" key when you see this kind of boot menu:
https://clonezilla.org/clonezilla-live/doc/01_Save_disk_image/images/ocs-01-bootmenu.png
Then append use_dev_list_cache=no in the end of line (/live/vmlinuz... vmwgfx.enable.fbdev=1, actually one one line for this case).
If it's grub boot menu, then you have to press "e" to enter editing mode, then append use_dev_list_cache=no in the end of line ($linux_cmd... vmwgfx.enable.fbdev=1)
If you want to write to the config file, check this doc:
https://clonezilla.org/fine-print-live-doc.php?path=./clonezilla-live/doc/99_Misc/00_live-boot-parameters.doc#00_live-boot-parameters.doc
Steven
I ended up modifying my usual Clonezilla entry in boot/grub/grub.cfg, adding the use_dev_list_cache=no entry at the end of the command line. Upon booting, a cursory check confirmed the parameter was enabled.
As well, the time command is apparently not installed by default, which is surprising. I had to download the time_1.9-0.1_amd64.deb package and install it with dpkg to get that to run.
Here’s the output from sudo time ocs-prep-cache:
It ran extremely quickly, which is not at all what I expected. This does not appear to be the same disk/partition/filesystem processing that takes several minutes for each partition. Should I still run sudo bash -x ocs-prep-cache? If so, what is the best way to save the output? Running as a live system, there is no local file storage available.
If you use use_dev_list_cache=no in the boot parameter, then no need to run "time ocs-prep-cache" since it will be skipped.
To test ocs-prep-cache, you have to remove "use_dev_list_cache=no" from the boot parameter.
What I suggest is:
BTW, time is also builtin in bash, so I should correct my description. Just switch to root by "sudo -i", then run:
time ocs-prep-cache
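To illustrate the distinction being corrected here (a minimal sketch; only ocs-prep-cache itself comes from this thread): bash resolves time as a reserved word, so the builtin is available in a root shell after "sudo -i" with no extra package, whereas "sudo time ..." bypasses the builtin and looks for the external GNU time binary, which is why the time package had to be installed for that form.

```shell
#!/bin/bash
# bash treats `time` as a reserved word (keyword), not an external command:
type -t time            # prints "keyword"

# The builtin therefore works with no extra package installed:
time sleep 0.1

# By contrast, `sudo time <cmd>` needs the external /usr/bin/time
# (GNU time, packaged separately), e.g.:
#   sudo /usr/bin/time ocs-prep-cache
```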
Steven
All right, this was very interesting. I had to run a test image backup three times, all for different reasons. All three were run using Clonezilla live 20220109-impish with boot parameter use_dev_list_cache=no.
So, the conclusion of all this is that Clonezilla live 20220109-impish with boot parameter use_dev_list_cache=no works on this machine much the same as it did with 20210817-hirsute, which is good news. Would you still want me to run “time ocs-prep-cache” without the use_dev_list_cache=no parameter?
As well, as an enhancement request, it would be nice if os-prober could be disabled, probably through another parameter.
Last edit: Bernard Michaud 2022-01-13
"Would you still want me to run “time ocs-prep-cache” without the use_dev_list_cache=no parameter?" -> Yes.
" it would be nice if os-prober could be disabled, probably through another parameter.: -> Which files are related to this? Please list them so that we know how to deal with.
Steven
I ran “time ocs-prep-cache” from the command line without the use_dev_list_cache=no parameter, and it ran relatively fast:
However, when running Clonezilla live 20220109-impish in the usual way without the use_dev_list_cache=no parameter, the backup disk selection dialogue took as long as earlier attempts, with disk activity never stopping. I halted it after ten minutes by rebooting. I can’t explain why running ocs-prep-cache manually from the command line runs faster than when running Clonezilla live as usual. It seems obvious that the lengthy device/partition/filesystem processing involves more than just ocs-prep-cache.
As for disabling os-prober, a change must be made in squashfs-root/usr/share/drbl/sbin/ocs-functions from:
to:
Commenting out os-prober prevents it from running.
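As a hedged illustration of that kind of edit (the sample line below is hypothetical; the actual contents of ocs-functions differ), commenting out every line that invokes os-prober can be done with a single sed pass:

```shell
#!/bin/bash
# Hypothetical stand-in for a line in ocs-functions that calls os-prober;
# the real file is different.
cat > /tmp/ocs-functions.sample <<'EOF'
os_prober_out="$(LC_ALL=C os-prober 2>/dev/null)"
EOF

# Prefix any line mentioning os-prober with "# " so it never runs.
sed -i 's/^\([[:space:]]*\)\(.*os-prober.*\)$/\1# \2/' /tmp/ocs-functions.sample

cat /tmp/ocs-functions.sample   # the call is now commented out
```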
Last edit: Bernard Michaud 2022-01-13
Well, it’s not all bad; at least I now know to specify the use_dev_list_cache=no parameter in Clonezilla releases 20220109-impish and beyond and get back the same processing speed I had before. I guess it beats standing still and continuing to use release 20210817-hirsute, which works fine on my machine but will no longer evolve.
OK, we have another release, 20220118-* or 2.8.2-5, which you can give a try:
https://clonezilla.org/downloads.php
In this release, a boot parameter was added to disable os-prober, as you have requested:
use_os_prober=no
For more info, please check changelog.
BTW, when you mentioned:
"Second run, I followed the instructions in the split file dialogue and left the default of 0 for no splitting. The image backup appeared to run normally, but the image verification failed with a strange ‘invalid header’ error, and then froze when checking the lvm image files." -> which parameters did you choose? Is that in expert mode and you choose to use "-z5p" (pixz) or "-z5" (xz)? If so, a bug related to this has been fixed in the same version of Clonezilla live.
Thank you for debugging this issue with us. Please let us know the results if you test both of them.
Steven
I gave the new test release 20220118-impish a try. Beforehand, I edited /boot/grub/grub.cfg and added parameters use_dev_list_cache=no and use_os_prober=no to my customary boot entries. I can report that performance is well within the realm of 20210817-hirsute, my reference release. So, all things considered, I am satisfied that I can continue to use Clonezilla on this machine by disabling the dev list cache. As well, I appreciate the parameter to disable os_prober; the workaround was not overly complicated, but it was tedious to remember to apply it.
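For reference, a grub.cfg entry of the kind described above might look like the following (the paths and most options are placeholders for a typical Clonezilla live layout; only use_dev_list_cache=no and use_os_prober=no come from this thread):

```
menuentry "Clonezilla live (no dev-list cache, no os-prober)" {
  search --set=root --file /live/vmlinuz
  linux /live/vmlinuz boot=live union=overlay components quiet noswap ocs_live_run="ocs-live-general" use_dev_list_cache=no use_os_prober=no
  initrd /live/initrd.img
}
```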
As for the other issue, it wasn’t with the compression option but with the split image file value. My old release 20210817-hirsute had instructions to enter a “big number” such as ‘1000000’ so as not to split the image file, and had 4096 as the default value. New releases since 20211116-impish instead instruct you to use ‘0’ to not split the image, and this is also the default value. Using ‘0’ as the split image value causes the image verification to fail. Running it again but using ‘1000000’ as the split image value works properly. I have no idea why this is so.
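For what it’s worth, the same split-size choice can also be made non-interactively: ocs-sr accepts the volume size in MB through its -i option, so the ‘1000000’ workaround corresponds to something like the following (the image name, disk name, and the other options here are placeholders, not the poster’s actual command):

```shell
# Save disk sda to image "my-image", splitting into 1000000 MB volumes
# (i.e. effectively no split) rather than passing 0:
#   sudo /usr/sbin/ocs-sr -q2 -j2 -z1p -i 1000000 -p true savedisk my-image sda
```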
OK, got it.
If the"split" issue is always reproducible when you use "0", please let us know.
Steven