Hello, I'm using Clonezilla lite server to restore a disk image over the network to several computers. The disk image contains 4 partitions. When I boot the clients via the network, they restore the first partitions fine, but when they reach the last partition some of the clients hang (the screen shows that the partition is about to be restored, but no progress bar appears).
However, some other clients (usually 2 of them, but not always the same PCs) restore everything correctly. When these 2 clients finish, the rest are still "hung", waiting for the restore of the last partition to start. I think my only choice is to stop the server and all the clients, launch another job on the server to restore only this last partition, and then boot the clients that hung before. The last partition is then restored without issues.
I don't know why this happens. I have several classrooms to restore (hundreds of clients connected to different switches), so my plan is usually to launch 1 restore job per switch, connecting the computer acting as lite server to that switch. With this issue, I have to run an extra restore job for several clients each time: after trying to restore the whole disk, I have to restore the last partition on the clients that failed.
Where can I look to diagnose the issue? If it is a timing/sync issue with the network, can I tune the server to wait longer for the clients before launching the restore of the next partition? (I set a 120-second wait at the beginning when I prepare the "massive-deployment" job, and it's usually enough, but I don't know whether those 120 seconds apply to each partition or only to the first time the clients connect to the server.)
Kind regards
I forgot to say that I'm restoring using the multicast option.
Did you try the broadcast mechanism?
Same issue?
Broadcast is worse: no client even starts restoring the first partition. I guess it's because my switches block it or start dropping packets very early.
With multicast it starts restoring well. The first 2 partitions are small, then the 3rd partition is about 400 GB and restores successfully, and then most of the clients (all except two) hang when the restore of the 4th partition should start.
I tried in the same classroom with a different set of computers (same switch) and exactly the same thing happened: 2 clients restored the full disk correctly, and the rest restored 3 partitions fine and hung at the last one (not even starting to restore it).
The multicast mechanism needs a good-quality network switch... I'm not sure whether your issue is in the switch or not.
However, maybe you could try the BitTorrent one?
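If you want to check whether the switch is dropping multicast traffic, independently of Clonezilla, a small probe like the one below might help. It is only a rough sketch using plain Python sockets and is not part of Clonezilla: the group address 239.1.1.1, the port 5007 and the packet count are arbitrary assumptions, and the function names are made up for illustration. One machine sends numbered UDP datagrams to the group, and each client reports which sequence numbers never arrived.

```python
import socket
import struct
import sys
import time

GROUP = "239.1.1.1"  # assumed test group, not the one Clonezilla uses
PORT = 5007          # assumed free UDP port
COUNT = 1000         # number of numbered test datagrams


def missing_sequences(received, count):
    """Return the sorted sequence numbers in 0..count-1 that never arrived."""
    return sorted(set(range(count)) - set(received))


def send(count=COUNT, group=GROUP, port=PORT):
    """Send `count` datagrams, each carrying a 4-byte sequence number."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # TTL 1 keeps the test traffic on the local network segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    try:
        for seq in range(count):
            sock.sendto(struct.pack("!I", seq), (group, port))
            time.sleep(0.001)  # light pacing so the sender itself is not the bottleneck
    finally:
        sock.close()


def receive(count=COUNT, group=GROUP, port=PORT, idle_timeout=10.0):
    """Join the group, collect sequence numbers, and return the missing ones."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Ask the kernel to join the group on the default interface.
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    sock.settimeout(idle_timeout)
    received = set()
    try:
        while len(received) < count:
            data, _ = sock.recvfrom(64)
            if len(data) >= 4:
                received.add(struct.unpack("!I", data[:4])[0])
    except socket.timeout:
        pass  # no traffic for idle_timeout seconds: stop and report
    finally:
        sock.close()
    return missing_sequences(received, count)


if __name__ == "__main__" and len(sys.argv) > 1:
    if sys.argv[1] == "send":
        send()
    elif sys.argv[1] == "recv":
        lost = receive()
        print(f"missing {len(lost)} of {COUNT} datagrams")
```

Assuming the file is saved as mcast_probe.py (a hypothetical name), you would start `python3 mcast_probe.py recv` on a few clients first and then run `python3 mcast_probe.py send` on the machine acting as server. If some clients report many missing datagrams while others report none, that would point towards the switch rather than Clonezilla.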
I will try with BitTorrent, thanks!