I've found a big performance issue during image restore since version 3 of Clonezilla live. Version 2.8 is fine.
I tried all downloadable versions: stable, unstable, kinetic, lunar.
Motherboard is an MB-SI-H510V2 (an Asus H510 rebranded by an Italian PC manufacturer, Sicomputer). Destination disk is an M.2 NVMe Lexar 256 GB SSD.
Same USB 3 external drive containing the image, connected to the same port; same internal destination disk. Between tests I only changed the boot USB key (with different Clonezilla versions).
Test results:
2.8.1-12-amd64: 4 min
3.0.2-21-amd64: 1h 50m
3.0.2-22-amd64: the same
kinetic and lunar: a bit faster, 1h 40m
Kinetic and lunar have a less laggy interface (menu loading and navigation are more responsive).
I noticed that the preliminary check takes the same (good) time in every case: less than 4 minutes.
It seems that reading from the USB drive is fine; the issue is in writing to the internal drive.
I tried a "hdparm -tT" on internal drive (with working and not working version), and I find similar values (22000/416 with "lunar", 21500/438 with working 2.8-1-12).
Nothing that can account such an extra time.
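For reference, the read benchmark was along these lines (the NVMe device name here is an assumption):
# cached (-T) and buffered (-t) read timings on the internal disk
hdparm -tT /dev/nvme0n1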
A note: on the same motherboard, if the M.2 drive uses the SATA protocol instead of NVMe, the issue does not happen.
I just tried the latest stable 3.0.3-22; the issue persists.
This time it's a non-rebranded motherboard, a plain Asus Prime H510M-A with the latest BIOS version (same H510 chipset).
Here again, the latest working version is 2.8.1-12-amd64.
It looks like this issue is related to the Linux kernel and hardware support.
Have you tried testing Clonezilla live >=3.1.0-15 or 20230316-lunar?
https://clonezilla.org/downloads.php
Both of them come with a newer Linux kernel, so maybe this issue is gone.
Steven
This time I had one of those machines only for a short time, and I couldn't test those versions.
I'll do so as soon as possible (when I have a machine again).
I've now tried 3.1.0-18: the issue is still there. Now I'm testing 20230328-kinetic, and later I'll try Lunar.
Kinetic is failing just like 3.1.0-18; I can see it from the estimated time.
And I fear Lunar will fail too.
Now I have the machine for a few days: is there any data I can collect?
Which old version works for you?
If there is one, stick to it.
Steven
Lunar fails too. As in the previous tests, the Ubuntu-based versions have a more responsive UI and save 10 minutes (1h 50m -> 1h 40m).
As I said in my first message, 2.8.1-12-amd64 works (4-5 minutes).
Stick to that? That's what I've been doing since I first hit this issue. For now I see no disadvantages: I use it as the "stable" version on my USB keys, and it works well on all the machines I've come across.
But there's a risk that in the future it will fail on some machine. I've seen versions before 2.8 fail with Secure Boot enabled, or be unable to restore an image created with a newer version.
I want to install a standard Debian with vanilla kernels on this machine, to bisect between kernel versions and report a kernel bug.
But I don't know how to trigger the slowness: fio? partclone restoring a single partition?
hdparm -tT didn't show differences when I tested.
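If it's of any use, something like this fio run might approximate partclone's buffered sequential writes (all parameters are guesses, and it overwrites the target partition):
# buffered (non-direct) sequential write to the restore target -- destroys its contents
fio --name=seqwrite --filename=/dev/nvme0n1p4 --rw=write --bs=4M \
    --ioengine=psync --direct=0 --size=8G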
I built a test environment.
I installed Debian Linux 11 amd64 (it has partclone 0.3.13) on the machine.
I tried with three different kernels:
Original distribution kernel
1) 5.10.0-20-amd64
From kernel.org (compiled and installed without packaging: make, make modules, make install; see the sketch after this list)
2) 5.15.105
3) 6.1.22
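For reference, a vanilla build of that kind typically goes like this (the config and module-install steps are the usual ones and are my assumption here):
# start from the running distro kernel's config (any .config works)
cp /boot/config-"$(uname -r)" .config
make olddefconfig
# build and install without packaging
make -j"$(nproc)"
make modules_install
make install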
Some tests with fio gave about the same values, so I decided to try partclone with the different kernels.
I tried restoring only the big partition from the image (the C: Windows volume; it's the one that accounts for the 1h 50m result in new Clonezilla versions, while the other three ESP/MSR/recovery partitions account for less than a minute).
I tried both from another partition on the same drive and from the USB drive: no difference.
Command:
cat nvme0n1p3.ntfs-ptcl-img.gz | gunzip -c | partclone.ntfs -d -r -N -s - -o /dev/nvme0n1p4
Results are:
5.10.0-20-amd64: 07:25 min
5.15.105: 08:05 min
6.1.22: 08:40 min
Then I updated to bookworm (it has partclone 0.3.23) and tested on the same kernels:
5.10.0-20-amd64: 07:20 min
5.15.105: 08:04 min
6.1.22: 08:16 min
So it doesn't seem to be the kernel itself from a given version onward.
What could be the difference between this test environment and the Clonezilla live one?
Perhaps the kernel configuration?
Since I have no identical machine here, I cannot find the culprit.
I would suggest you keep using Clonezilla live, but try to switch its Linux kernel if you can.
There is a command called "ocs-live-swap-kernel", and you can run:
sudo ocs-live-swap-kernel --help
to get the usage.
Find a Debian or Ubuntu Linux machine, install the drbl and clonezilla packages, then download the Clonezilla live iso/zip and use that command to switch the Linux kernel.
By doing this, we can make the environments more similar.
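Roughly, the setup would look like this (a sketch only; check the --help output for the exact ocs-live-swap-kernel invocation):
# on an existing Debian or Ubuntu system
sudo apt-get install drbl clonezilla
# then, with a downloaded Clonezilla live iso/zip at hand, check the tool's usage
sudo ocs-live-swap-kernel --help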
BTW, if you do care about the speed, use "-z9p", i.e., parallel zstd, not the "-z1" or "-z1p".
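For example, a savedisk run along these lines (the image and disk names are placeholders):
# save the whole disk with parallel zstd compression (-z9p)
sudo /usr/sbin/ocs-sr -q2 -z9p savedisk my-image nvme0n1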
Steven
Now I have the machine again and I can test, but I don't understand what you are suggesting.
Should I load Clonezilla live from USB key and try to change kernel there?
How do I install a kernel in that environment?
Or should I install a Debian or Ubuntu, and then install the clonezilla package?
Last edit: Valerio Vanni 2023-05-11
Right, maybe it's too complicated. I should have explained more.
However, I believe it's easier for you to try the testing or alternative testing Clonezilla live, because we keep updating those; the Linux kernel is updated there.
You can give Clonezilla live 20230507-mantic a try, because it comes with the newer Linux kernel 6.2.0-21.21:
https://clonezilla.org/downloads.php
Steven
I always try the latest versions as soon as they are released.
But it's unlikely the issue will disappear by chance. Over the last few days I've tried the latest ones.
As I said, tests with a standard Debian install with v5 and v6 kernels are successful.
One difference could be the kernel config: can you give me the kernel config used in Clonezilla live?
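On a Debian-based system the build config usually ships next to the kernel image, so I'd expect something like this to show it (whether the live environment exposes it is an assumption):
# kernel build config as shipped by Debian-based distributions
less /boot/config-"$(uname -r)"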
Hello, we are facing the same kind of issue with Lenovo + Samsung NVMe.
Recompiling partclone with the O_DIRECT flag enabled brought back the expected performance.
(https://github.com/valeriop/partclone/commit/f74b54256b3611ac78e7fbb742064f6a7e0b388a)
We suspect there's a bad interaction between the NVMe firmware and the kernel I/O cache.
Opening an alternative shell and running 'top' shows the buff/cache size increasing, and the writing speed slows down sharply about a minute after partclone.ntfs starts.
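A simple way to watch it from a second console (assuming the live environment provides one):
# refresh memory/cache figures every second while partclone runs
watch -n 1 free -m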
Valerio
You say that a recompiled partclone works. But is the partclone on Clonezilla live different from the Debian one?
I tried with the standard partclone in Debian and it works.
I looked at "top" (with Clonezilla live), and it's true: the buff/cache size increases (up to near the RAM size), but the same happens with the old version 2.8.1-12, which shows normal speed.
Have you tried 2.8.1-2?
Last edit: Valerio Vanni 2023-05-17
"A difference could be kernel config, can you give me kernel config used in Clonezilla live?" -> We actually use the Linux kernel from Debian repository. Hence you can just find more info here:
https://tracker.debian.org/pkg/linux-signed-amd64
"is partclone on Clonezilla live different from Debian one?" -> they are different. In Clonezilla live, we use the one from our project repository, i.e., http://free.nchc.org.tw/drbl-core/
If the version is the same, they are basically very similar. The difference might only be the way it is compiled.
You can refer to https://tracker.debian.org/pkg/partclone
"Have you tried 2.8.1.2?" -> not really. Since we can not reproduce this issue using the latest Partclone (0.3.23) here, nothing we can compare.
Steven
The suggestion to try 2.8.1-2 was for Valerio Pulese, since he owns a machine that triggers the issue.
I just tried these latest versions:
stable - 3.1.0-22
testing - 3.1.1-18
20230903-lunar
20230903-mantic
And I found an improvement: restore time is now about 1h 10m.
There is no difference between these four. The Ubuntu-based versions have a more responsive interface, but that has no effect on restore time.
Better than 1h 50m, but still far from the 5 minutes of version 2.8.
Could you please show the log file /var/log/clonezilla.log obtained by using Clonezilla live 3.1.1-18 and 2.8.1-2, separately?
Steven
I thought I had 3.1.1-18 on my USB key today, but it was 3.1.0-22.
For now I'm posting these two logs; then I'll also try 3.1.1-18.
Edit: done the test with 3.1.1-18; I've added the log.
Last edit: Valerio Vanni 2023-09-15
Thanks.
OK, the log files show that the restoring processes are almost the same.
To isolate the issue, I suggest we do some I/O tests in different versions of Clonezilla live. I.e., please boot Clonezilla live 2.8.1-2 and 3.1.1-18 separately, and follow this doc to do that:
https://linuxreviews.org/HOWTO_Test_Disk_I/O_Performance#hdparm
Please let us know the results. Thanks.
Steven
I'm trying. In the first message I reported some tests with "hdparm -t -T". Now I've tried "hdparm --direct -t -T", as advised on that page.
Results are identical between Clonezilla live versions: in both cases, I get 1500-1600 MB/sec for cached read and 1000-1100 for uncached read.
But these are read tests; our issue happens while writing to the internal NVMe drive during restore.
Then I've tried this (I mounted the main NTFS partition on /home/partimage):
dd if=/dev/zero of=/home/partimage/test1.raw bs=512 count=1000000 oflag=dsync
With both 2.8.1-2 and 3.1.1-18 I get 1.6 MB/sec.
Using bigger blocks (bs=1024, bs=2048, etc.) the speed increases, but again at the same pace in both Clonezilla live versions.
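Given the earlier O_DIRECT report, the natural comparison would be the same write test with the page cache bypassed (a sketch; direct I/O may be refused on a FUSE ntfs-3g mount):
# O_DIRECT write test, for comparison with the buffered runs above
dd if=/dev/zero of=/home/partimage/test2.raw bs=4M count=2000 oflag=direct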
Last edit: Valerio Vanni 2023-09-18
OK, so the issue should not be in the OS itself.
Could you please try to save the image using "-z9p", i.e., saving it in the zstd compression format, and then restore that image? Please use Clonezilla live >= 3.1.1-18 in this test.
Steven
I tried with "-z9p"; the restore takes a little longer.
Image creation time is still fine, in the 5-7 minute range.
The created image is smaller: 28 GB vs 30 GB with "-z1p".
Thanks. This is really interesting... Supposedly zstdmt should have better performance than pigz...
Anyway, I believe we can focus on Partclone itself.
Please try another test:
1. Boot Clonezilla live 3.1.1-18 amd64 version
2. Enter command line prompt.
3. sudo -i
4. wget http://free.nchc.org.tw/drbl-core/old/deb/partclone/partclone_0.3.18-drbl1_amd64.deb
5. dpkg -i partclone_0.3.18-drbl1_amd64.deb
6. clonezilla
7. Then do a normal restore of your image.
Please show the log file so that we can compare that.
Thanks.
Steven
I've done the test: writing is still slow.
I've done another test: I tried applying the flag advised here: https://sourceforge.net/p/clonezilla/bugs/395/#c934.
So I did this:
- On another machine with Debian 12, I downloaded the source package for partclone. I got version 0.3.23+repack-1.
- I compiled, packaged and saved a .deb file (roughly the commands sketched below).
- I modified partclone.c, adding O_DIRECT to the open_target() flags.
- I compiled and packaged again, and saved another .deb file.
- I copied both .deb files onto the USB key I use with Clonezilla 3.1.1-18: I wanted to have two versions with only that difference.
- Then I tried both versions in Clonezilla live, the same way I tested 0.3.18: one time I installed the modified one, another time the original one. I didn't know if it mattered, but to be sure I rebooted between tests.
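For reference, the rebuild went roughly like this (a sketch: it assumes deb-src entries are enabled; the actual O_DIRECT change is the one-line edit in the commit linked earlier):
# fetch and build the unmodified Debian source package
sudo apt-get build-dep partclone
apt-get source partclone
cd partclone-0.3.23+repack
dpkg-buildpackage -us -uc -b
# then add O_DIRECT to the open() flags in open_target() in src/partclone.c and rebuild
dpkg-buildpackage -us -uc -b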
The version with the O_DIRECT flag restored the main Windows partition in 428 seconds, only 23 more than 2.8.1-2. The original one was slow, like the one we had tested.
Last edit: Valerio Vanni 2023-09-22