A few months ago, I started using Clonezilla DRBL 2.2.2-1 (AMD64) from CD and it was working great. I could clone 20 to 30 workstations at a time at approximately 6GB/min! It was amazing.
Today, I went to do it again and found that it was 4 times slower: 1.5GB/min. This was a bummer because I had been cloning clusters of PCs in 15 minutes and now it was taking an hour.
I thought I had 1 PC connecting at 100 Mbps (rather than GigE) and slowing down the UDPcast. But it wasn't that. Then I thought one of the remote PCs had a bad hard drive, but it wasn't that. Then I thought it was a bad source drive and tried using an external USB2 hard drive, and that kinda fixed the problem. The copy went back to 4GB/min, but no faster. So I still wasn't convinced it was the source drive.
Then I enabled the "no GUI" and the "verbose" (-v) switches and could see that the UDPcast was getting 7 to 9% re-xmits. Each timeout caused the whole system to pause. (When I was using the USB2 drive, I could also see that the drive was limited to about 30MB/s, so it wasn't able to UDPcast faster than 250 Mbps or so. As such, it was getting less than 1% re-xmits.)
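For anyone following the numbers, here is a minimal sketch of the unit conversions behind the figures above. The function names are made up for illustration, and it assumes decimal units (1 GB = 1000 MB) and ignores protocol overhead and image compression, so the GB/min figure converts to a raw rate that need not match the rate seen on the wire.

```python
def mbytes_per_s_to_mbps(mb_per_s: float) -> float:
    """Convert megabytes per second to megabits per second (8 bits per byte)."""
    return mb_per_s * 8


def gb_per_min_to_mbps(gb_per_min: float) -> float:
    """Convert gigabytes per minute to megabits per second (1 GB = 1000 MB assumed)."""
    return gb_per_min * 1000 * 8 / 60


# The USB2 drive topping out at about 30MB/s:
usb2_mbps = mbytes_per_s_to_mbps(30)

# The original 6GB/min and the degraded 1.5GB/min, as raw data rates:
fast_mbps = gb_per_min_to_mbps(6)
slow_mbps = gb_per_min_to_mbps(1.5)

print(usb2_mbps, fast_mbps, slow_mbps)
```

So 30MB/s works out to 240 Mbps, which lines up with the "250 Mbps or so" ceiling seen with the USB2 drive.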
Apparently, with the regular source drive, the Clonezilla server was trying to send data too fast and was causing so many re-xmits that the end result was an average of ~80 Mbps! Yikes! That's bad.
I found the /etc/drbl/drbl-ocs.conf file and changed:
udp_sender_extra_opt_default=""
to
udp_sender_extra_opt_default="--max-bitrate 400m"
Then I restarted the clone. The end result was that my re-xmits dropped to around 1.0%, my network rate went up to ~300 Mbps, and the total throughput went back to 5+ GB/min, which is good.
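If you want to script that edit rather than do it by hand, a rough sketch might look like the following. The helper name set_udp_sender_opts is hypothetical (it is not part of DRBL or Clonezilla), and it assumes the setting sits on a single line starting with udp_sender_extra_opt_default=, as in the file shown above. It works on the file contents as a string, so you can check the result before writing anything back.

```python
import re


def set_udp_sender_opts(conf_text: str, opts: str) -> str:
    """Return conf_text with the udp_sender_extra_opt_default line set to opts.

    Hypothetical helper: assumes the setting occupies exactly one line.
    """
    pattern = re.compile(r'^udp_sender_extra_opt_default=.*$', re.MULTILINE)
    new_line = f'udp_sender_extra_opt_default="{opts}"'
    # A function replacement avoids any backslash interpretation in new_line.
    updated, count = pattern.subn(lambda _match: new_line, conf_text)
    if count == 0:
        raise ValueError("udp_sender_extra_opt_default not found in config text")
    return updated


# Example on an in-memory copy of the relevant part of the file:
sample = 'some_other_option="yes"\nudp_sender_extra_opt_default=""\n'
print(set_udp_sender_opts(sample, "--max-bitrate 400m"))
```

Back up /etc/drbl/drbl-ocs.conf before writing a modified copy over it.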
Now, I'm curious why it ever slowed down in the first place. I tried doing a multicast session from one PC wired directly to another PC using a single Cat5e cable, and it had the same problem of streaming at 80 Mbps, so I don't think it is my network.
I'm also curious whether Clonezilla can be tweaked to have a feedback loop to adjust the max-bitrate on-the-fly so that it can automatically find the "sweet spot".
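To make the idea concrete, here is a purely illustrative sketch of what such a feedback rule could look like: back off multiplicatively when re-xmits exceed a target, otherwise creep upward. Everything in it (the function name, the 1% target, the step sizes, the limits) is an assumption for illustration. Nothing in this thread shows udp-sender or Clonezilla accepting a new bitrate mid-session, so this only shows the control logic, not a working integration.

```python
def next_bitrate(current_mbps: float,
                 rexmit_pct: float,
                 target_pct: float = 1.0,
                 step_mbps: float = 25.0,
                 backoff: float = 0.8,
                 floor_mbps: float = 50.0,
                 ceiling_mbps: float = 1000.0) -> float:
    """Suggest the next bitrate cap from the observed re-xmit percentage.

    Hypothetical rule: shrink the cap by `backoff` when re-xmits are above
    the target, otherwise raise it by a fixed step, then clamp to the limits.
    """
    if rexmit_pct > target_pct:
        proposed = current_mbps * backoff
    else:
        proposed = current_mbps + step_mbps
    return max(floor_mbps, min(ceiling_mbps, proposed))


# Simulated run with made-up re-xmit readings, starting from a 400 Mbps cap:
rate = 400.0
for observed_rexmit_pct in [8.0, 4.0, 0.9, 0.5, 2.0]:
    rate = next_bitrate(rate, observed_rexmit_pct)
    print(f"re-xmits {observed_rexmit_pct}% -> next cap {rate:.0f} Mbps")
```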
Last edit: Pretzel 2014-06-12
Reading up on this issue some more, I found that the UDPcast uses something called Forward Error Correction (FEC) and that I can use the "--fec" switch in the udp_sender_extra_opt_default parameter instead of the --max-bitrate option.
After experimenting with the three FEC parameters, I found that the best value (for my environment) was:
udp_sender_extra_opt_default="--fec 8x16/64"
With this configuration, I was able to get ~5.5GB/min (~350Mbps) with only 4% re-xmits.
I figure this is a better option than using --max-bitrate since it doesn't artificially limit the speed of the network, but allows it enough redundant data to keep pushing forward with minimal re-xmits.
Thoughts?
Also, what is the default FEC of udp-sender if the --fec switch is not specified?
Thanks for sharing that. This is great!
From the udpcast manual, I do not see the default value of FEC. Maybe you can ask on the mailing list of udpcast:
https://www.udpcast.linux.lu/cgi-bin/mailman/listinfo/udpcast
Steven.
I took a look at the source code for "udp-sender" found here: http://sourcecodebrowser.com/udpcast/20040531/udp-sender_8c.html
And it looks like the default FEC values are: 8x8/128 (which corresponds with the example given in the documentation. It's unfortunate that the documentation doesn't say, "oh yeah, this example is also the default value.")
Anyway, if I understand the FEC values correctly, this default configuration has only 6.25% redundant data in it (8 / 128 = 0.0625)
The configuration I came up with (8x16/64) is 25% redundant data, which means a much lower likelihood of re-transmits without taking up too much of the data stream.
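The arithmetic above can be sketched as a small calculator. It assumes the --fec value reads as interleave x redundancy / stripesize, which matches the interpretation used in this post (8 / 128 for the default). The names FecSpec and parse_fec are made up for illustration. It reports the redundancy both relative to the data packets (the figure used above) and as a share of everything sent, which is a slightly smaller number.

```python
import re
from typing import NamedTuple


class FecSpec(NamedTuple):
    interleave: int
    redundancy: int   # redundant packets per stripe
    stripesize: int   # data packets per stripe


def parse_fec(spec: str) -> FecSpec:
    """Parse a string like '8x16/64' (assumed: interleave x redundancy / stripesize)."""
    match = re.fullmatch(r'(\d+)x(\d+)/(\d+)', spec.strip())
    if match is None:
        raise ValueError(f"unrecognized FEC spec: {spec!r}")
    return FecSpec(*(int(group) for group in match.groups()))


def redundancy_vs_data(fec: FecSpec) -> float:
    """Redundant packets as a fraction of data packets (the ratio used in the post)."""
    return fec.redundancy / fec.stripesize


def redundancy_share_of_stream(fec: FecSpec) -> float:
    """Redundant packets as a fraction of all packets sent (data plus redundancy)."""
    return fec.redundancy / (fec.stripesize + fec.redundancy)


for spec in ("8x8/128", "8x16/64"):
    fec = parse_fec(spec)
    print(spec,
          f"{redundancy_vs_data(fec):.2%} of data,",
          f"{redundancy_share_of_stream(fec):.2%} of stream")
```

By this reckoning 8x16/64 adds 25% on top of the data, or 20% of the total stream.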
Nice hacking.
If you run more tests and they all show better performance, maybe we can make it the default value.
Thanks!
Steven.
Yeah, I have more tests to run. I was thinking of trying different configurations, like partitions with highly compressible data vs. low compressible data vs. a mix -- to see if it makes a difference. Honestly, though, the best way to test this would be to get some alpha testers to all try a few different configurations and see what everyone comes up with. I'm beginning to see how all of these "clonecasts" are affected by so many different variables.
Last edit: Pretzel 2014-06-14
Hi Steven and Pretzel
Do you know if anyone is working on this? I think it would be a good addition to Clonezilla Server, DRBL and DRBL Live. The possibility of a very large speed up is very real. I would be glad to be a tester if someone would build the software and share the download links.
Chuck,
It's not an addition to Clonezilla but a setting used by udp-sender. All you need to do is add the specific setting and test away. I have used this particular setting and saw a massive speed increase, but also a fairly large failure rate in comparison to not using it.
Maybe tweaking the percentage will eventually produce a happy medium, but we shall see. I don't think it's a setting for Clonezilla to add out of the box, IMO.
Not really. I have not reached any conclusion about this. It really depends on several parameters, so I do not have final results.
Steven.