I've been working with the app for a few days now and reading the posts here to try and work through these issues. I was having major problems at first due to packet loss on our WAN, after adjusting the RTT, Buffer, and Announce params it seemed to help a lot, until now.
I have 2 issues that may or may not be related...
Last night I thought I had everything working well. I had 2 uftp servers (on the same host) sending to 1 client-daemon (remote host), both were happening fast and running well when I went to bed last night. When I got up this morning and checked it, the large file copy I had been waiting on (160 GB ~7 hours with -R 48000) had timed out right at the end. I increased the buffer and re-ran this copy on it's own, with no other uftp processes and '-R 140000' to speed it up. Same result. After about 2.5 hours, right at what should have been the end of the copy, it timed out. The file appeared to be complete in a directory listing, but there were still Sections/NAK's being transfered and ultimately timed out / aborted.
Second issue... Not only had the large copy timed out this morning, but the other uftp process I had copying smaller files in bulk had come to a complete crawl, almost no output from the process and only about 1-2mbps on the remote end, despite being in the middle of copying a 78 MB file. I killed the process and started it again, and was back to full speed. It is worth noting the Server side command passed to uftp for this copy included a list of probably 270 folders to copy (each with about 700 files within).
Here is an example from the end of the Client Daemon side log when the large 160 GB file timed out.
2018/05/16 17:23:06.247829: [92DC306E/00:0004]: Sent 33 NAKs for section 11545
2018/05/16 17:23:06.247829: [92DC306E/00:0004]: Sent 703 NAKs for section 11558
2018/05/16 17:23:07.529080: [92DC306E/00:0004]: Sent 2 NAKs for section 11584
2018/05/16 17:23:07.529080: [92DC306E/00:0004]: Sent 1034 NAKs for section 11640
2018/05/16 17:23:09.607211: [92DC306E/00:0004]: Got DONE message for section 11833
2018/05/16 17:23:10.232213: [92DC306E/00:0004]: Got DONE message for section 11833
2018/05/16 17:23:16.700974: [92DC306E/00:0004]: Sent 3 NAKs for section 669
2018/05/16 17:23:23.997865: [92DC306E/00:0004]: Sent 17 NAKs for section 1131
2018/05/16 17:23:25.935370: [92DC306E/00:0004]: Sent 31 NAKs for section 1444
2018/05/16 17:23:25.935370: [92DC306E/00:0004]: Sent 54 NAKs for section 1465
2018/05/16 17:23:27.654124: [92DC306E/00:0004]: Sent 225 NAKs for section 1829
2018/05/16 17:23:34.544763: [92DC306E/00:0004]: Sent 864 NAKs for section 2394
2018/05/16 17:23:39.857273: [92DC306E/00:0004]: Sent 210 NAKs for section 2888
2018/05/16 17:23:44.904158: [92DC306E/00:0004]: Sent 368 NAKs for section 3364
2018/05/16 17:23:48.622919: [92DC306E/00:0004]: Sent 167 NAKs for section 3724
2018/05/16 17:23:48.622919: [92DC306E/00:0004]: Sent 133 NAKs for section 3725
2018/05/16 17:23:49.263543: [92DC306E/00:0004]: Sent 191 NAKs for section 3770
2018/05/16 17:23:51.326046: [92DC306E/00:0004]: Sent 1 NAKs for section 3837
2018/05/16 17:23:57.201060: [92DC306E/00:0004]: Sent 377 NAKs for section 4404
2018/05/16 17:24:00.544816: [92DC306E/00:0004]: Sent 412 NAKs for section 4744
2018/05/16 17:24:01.544819: [92DC306E/00:0004]: Sent 113 NAKs for section 5001
2018/05/16 17:24:03.701072: [92DC306E/00:0004]: Sent 112 NAKs for section 5217
2018/05/16 17:24:04.591698: [92DC306E/00:0004]: Sent 401 NAKs for section 5239
2018/05/16 17:24:07.060454: [92DC306E/00:0004]: Sent 125 NAKs for section 5567
2018/05/16 17:24:13.466717: [92DC306E/00:0004]: Sent 232 NAKs for section 6111
2018/05/16 17:24:14.372971: [92DC306E/00:0004]: Sent 88 NAKs for section 6291
2018/05/16 17:24:15.451096: [92DC306E/00:0004]: Sent 24 NAKs for section 6324
2018/05/16 17:24:15.451096: [92DC306E/00:0004]: Sent 36 NAKs for section 6402
2018/05/16 17:24:15.451096: [92DC306E/00:0004]: Sent 129 NAKs for section 6406
2018/05/16 17:24:19.388605: [92DC306E/00:0004]: Sent 372 NAKs for section 6824
2018/05/16 17:24:19.982357: [92DC306E/00:0004]: Sent 139 NAKs for section 6870
2018/05/16 17:24:21.216733: [92DC306E/00:0004]: Sent 622 NAKs for section 6980
2018/05/16 17:24:21.216733: [92DC306E/00:0004]: Sent 680 NAKs for section 6985
2018/05/16 17:24:23.966740: [92DC306E/00:0004]: Sent 550 NAKs for section 7145
2018/05/16 17:24:24.591739: [92DC306E/00:0004]: Sent 977 NAKs for section 7280
2018/05/16 17:24:34.966762: [92DC306E/00:0004]: Sent 44 NAKs for section 8232
2018/05/16 17:24:35.904265: [92DC306E/00:0004]: Sent 180 NAKs for section 8394
2018/05/16 17:24:35.904265: [92DC306E/00:0004]: Sent 434 NAKs for section 8395
2018/05/16 17:24:39.732396: [92DC306E/00:0004]: Sent 162 NAKs for section 8674
2018/05/16 17:24:45.388659: [92DC306E/00:0004]: Sent 325 NAKs for section 9080
2018/05/16 17:24:54.091803: [92DC306E/00:0004]: Sent 58 NAKs for section 9894
2018/05/16 17:24:54.091803: [92DC306E/00:0004]: Sent 138 NAKs for section 9895
2018/05/16 17:24:54.091803: [92DC306E/00:0004]: Sent 1022 NAKs for section 9900
2018/05/16 17:24:54.091803: [92DC306E/00:0004]: Sent 207 NAKs for section 9901
2018/05/16 17:24:54.091803: [92DC306E/00:0004]: Sent 309 NAKs for section 9927
2018/05/16 17:24:54.716803: [92DC306E/00:0004]: Sent 1109 NAKs for section 10151
2018/05/16 17:24:59.998064: [92DC306E/00:0004]: Sent 77 NAKs for section 10569
2018/05/16 17:25:00.998066: [92DC306E/00:0004]: Sent 467 NAKs for section 10728
2018/05/16 17:25:04.576201: [92DC306E/00:0004]: Sent 97 NAKs for section 10942
2018/05/16 17:25:04.576201: [92DC306E/00:0004]: Sent 207 NAKs for section 10943
2018/05/16 17:25:04.576201: [92DC306E/00:0004]: Sent 136 NAKs for section 11064
2018/05/16 17:25:10.060585: [92DC306E/00:0004]: Sent 294 NAKs for section 11480
2018/05/16 17:25:12.826216: [92DC306E/00:0004]: Got DONE message for section 11833
2018/05/16 17:25:13.044967: [92DC306E/00:0004]: Sent 424 NAKs for section 11833
2018/05/16 17:25:13.419970: [92DC306E/00:0004]: Got DONE message for section 11833
2018/05/16 17:25:26.794994: [92DC306E/00:0004]: Sent 1 NAKs for section 1131
2018/05/16 17:25:47.373164: [92DC306E/00:0004]: Sent 157 NAKs for section 3769
2018/05/16 17:25:47.982538: [92DC306E/00:0004]: Sent 659 NAKs for section 3774
2018/05/16 17:26:07.998204: [92DC306E/00:0004]: Sent 43 NAKs for section 5996
2018/05/16 17:26:09.607582: [92DC306E/00:0004]: Sent 73 NAKs for section 6286
2018/05/16 17:26:14.388843: [92DC306E/00:0004]: Sent 38 NAKs for section 6820
2018/05/16 17:26:14.732593: [92DC306E/00:0004]: Sent 243 NAKs for section 6824
2018/05/16 17:26:27.185744: [92DC306E/00:0004]: Sent 740 NAKs for section 7572
2018/05/16 17:26:34.717011: [92DC306E/00:0004]: Sent 757 NAKs for section 9078
2018/05/16 17:26:44.701407: [92DC306E/00:0004]: Sent 445 NAKs for section 10148
2018/05/16 17:26:59.467061: [92DC306E/00:0004]: Got DONE message for section 11833
2018/05/16 17:26:59.998313: [92DC306E/00:0004]: Got DONE message for section 11833
2018/05/16 17:27:20.560855: [92DC306E/00:0004]: Transfer timed out
2018/05/16 17:27:20.623355: [92DC306E/00:0004]: starting file close
2018/05/16 17:27:20.638983: [92DC306E/00:0004]: done file close
I tried again over night, with max buffer and cache settings on both sides, almost limited to 50mbps... same problem :/ It gets through the whole file, then aborts at some point during replay of NAK's I believe
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
You may want to try using restart mode. You'll need to pass -f to the server to enable restart mode and also pass -T to the client to use a temp directory. Then if the session failed you can restart it by using the -F option and passing it the name of the server's restart file.
This means you can't use a absolute pathname for the server's -D option, however since you only have one destination directory on the client you can remove the leading path elements up to the name of the destionation. So instead of -D \\remoteplace\stuff\morespecificstuff you would just pass -D morespecificstuff.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
UFTP version 4.9.6 Copyright (C) 2001-2018 Dennis A. Bush
Starting at Thu May 24 14:37:46 2018
Transfer rate: 150000 Kbps (18750 KB/s)
Wait between packets: 70 us
Using private multicast address 1.2.2.1 Group ID: 9F7BC5A0
Initializing group
Sending ANNOUNCE 1.1
Received REGISTER from client 0x0ACA140A
Late registers:
Sending REG_CONF 2.1
----- FileName.bak -----
File ID: 0001 Name: Filename.bak
sending as: FolderName
Bytes: 160116334592 Blocks: 123166412 Sections: 11843
Sending FILEINFO 1.1
Received FILEINFO_ACK from client 0x0ACA140A
Sending file
Starting pass 1
Sending DONE 1.1
Got 30 NAKs for section 1320 from client 0x0ACA140A
Got 6 NAKs for section 1380 from client 0x0ACA140A
Got 3 NAKs for section 1375 from client 0x0ACA140A
Got 24 NAKs for section 1732 from client 0x0ACA140A
Got 177 NAKs for section 1753 from client 0x0ACA140A
Got 129 NAKs for section 1767 from client 0x0ACA140A
Got 38 NAKs for section 1925 from client 0x0ACA140A
Got 164 NAKs for section 2049 from client 0x0ACA140A
Got 62 NAKs for section 2098 from client 0x0ACA140A
<repeat similar="" messages="" for="" another="" ~800="" lines="">
Got 124 NAKs for section 11726 from client 0x0ACA140A
Got 35 NAKs for section 11747 from client 0x0ACA140A
Got 1 NAKs for section 11803 from client 0x0ACA140A
Got 14 NAKs for section 11809 from client 0x0ACA140A
Sending DONE 2.1
Starting pass 2
Transfer aborted by 0x0ACA140A: Transfer timed out
Transfer status:
Host: 0x0ACA140A Status: Aborted
Total elapsed time: 0.000 seconds
Overall throughput: 0.00 KB/s
uftp: Finishing at Thu May 24 14:37:58 2018
Client side:
<snip>
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 76 NAKs for section 11508
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 62 NAKs for section 11522
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 41 NAKs for section 11680
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 66 NAKs for section 11687
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 168 NAKs for section 11699
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 8 NAKs for section 11719
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 124 NAKs for section 11726
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 35 NAKs for section 11747
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 1 NAKs for section 11803
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 14 NAKs for section 11809
2018/05/24 14:36:23.777679: [9F7BC5A0/03:0001]: Got DONE message for section 11842
2018/05/24 14:36:25.230812: [9F7BC5A0/03:0001]: Transfer timed out
2018/05/24 14:36:25.230812: [9F7BC5A0/03:0001]: starting file close
2018/05/24 14:36:25.230812: [9F7BC5A0/03:0001]: done file close
2018/05/24 14:36:25.230812: [9F7BC5A0/03:0001]: Writing restart file
I am totally willing to admit that there's just way to much packet loss, but the initial copy runs at a solid 150mbps for 2-3 hours before failing, which gives me hope, then drops me :) lol
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Part of the issue with sending large files is that after the first pass the server needs to seek long distances across the file to retrieve the relavant parts of the file to send. This seeking can take a long time on files several GB in size resulting the clients getting starved for packets and timing out.
There are a few things you can do to address this. First on the client, you'll want to increase the cache size via the -c option from the default of 1048576 (1 MB). At speeds over 100 Mbps, this will reduce the number of NAKs by writing to disk fewer times but in larger chunks. You'll also want to use the -B option to increase the UDP receive buffer size from the default of 262144 (256K) to at least 2097152 (2 MB) or more.
On the server, I see you've already set the robustness factor to its maximum and increased the minimum GRTT to 0.2 seconds. You may want to increase the minumum GRTT further. This will give the server more time to perform large seeks.
One thing I might try doing is to seek and read in a separate thread and periodically send heartbeat messages if it's taking too long.
Regards,
Dennis
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I just released version 4.9.7 which should address the timout issue when sending large files. There was a delay in looping through the list of blocks to send which has been fixed.
With this version, it shouldn't be necessary to artifically increase the GRTT to prevent timeouts. You'll still want to tune the cache size and UDP buffer size on the client however.
Regards,
Dennis
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I've been working with the app for a few days now and reading the posts here to try and work through these issues. I was having major problems at first due to packet loss on our WAN, after adjusting the RTT, Buffer, and Announce params it seemed to help a lot, until now.
I have 2 issues that may or may not be related...
Last night I thought I had everything working well. I had 2 uftp servers (on the same host) sending to 1 client-daemon (remote host), both were happening fast and running well when I went to bed last night. When I got up this morning and checked it, the large file copy I had been waiting on (160 GB ~7 hours with -R 48000) had timed out right at the end. I increased the buffer and re-ran this copy on it's own, with no other uftp processes and '-R 140000' to speed it up. Same result. After about 2.5 hours, right at what should have been the end of the copy, it timed out. The file appeared to be complete in a directory listing, but there were still Sections/NAK's being transfered and ultimately timed out / aborted.
Second issue... Not only had the large copy timed out this morning, but the other uftp process I had copying smaller files in bulk had come to a complete crawl, almost no output from the process and only about 1-2mbps on the remote end, despite being in the middle of copying a 78 MB file. I killed the process and started it again, and was back to full speed. It is worth noting the Server side command passed to uftp for this copy included a list of probably 270 folders to copy (each with about 700 files within).
Here are my server side commands:
Client Daemon running this:
Here is an example from the end of the Client Daemon side log when the large 160 GB file timed out.
And of the Server side
Thank you!
Mark
Last edit: Markw 2018-05-16
I tried again over night, with max buffer and cache settings on both sides, almost limited to 50mbps... same problem :/ It gets through the whole file, then aborts at some point during replay of NAK's I believe
You may want to try using restart mode. You'll need to pass -f to the server to enable restart mode and also pass -T to the client to use a temp directory. Then if the session failed you can restart it by using the -F option and passing it the name of the server's restart file.
This means you can't use a absolute pathname for the server's -D option, however since you only have one destination directory on the client you can remove the leading path elements up to the name of the destionation. So instead of
-D \\remoteplace\stuff\morespecificstuff
you would just pass-D morespecificstuff
.Still not finishing in success. I've run the restart a few times, and each time the result is similar:
PS C:\uftp> .\uftp.exe -R 150000 -M Remoteclient -D FolderName -s 50 -r 1:0.2:100 -B 200971
UFTP version 4.9.6 Copyright (C) 2001-2018 Dennis A. Bush
Starting at Thu May 24 14:37:46 2018
Transfer rate: 150000 Kbps (18750 KB/s)
Wait between packets: 70 us
Using private multicast address 1.2.2.1 Group ID: 9F7BC5A0
Initializing group
Sending ANNOUNCE 1.1
Received REGISTER from client 0x0ACA140A
Late registers:
Sending REG_CONF 2.1
----- FileName.bak -----
File ID: 0001 Name: Filename.bak
sending as: FolderName
Bytes: 160116334592 Blocks: 123166412 Sections: 11843
Sending FILEINFO 1.1
Received FILEINFO_ACK from client 0x0ACA140A
Sending file
Starting pass 1
Sending DONE 1.1
Got 30 NAKs for section 1320 from client 0x0ACA140A
Got 6 NAKs for section 1380 from client 0x0ACA140A
Got 3 NAKs for section 1375 from client 0x0ACA140A
Got 24 NAKs for section 1732 from client 0x0ACA140A
Got 177 NAKs for section 1753 from client 0x0ACA140A
Got 129 NAKs for section 1767 from client 0x0ACA140A
Got 38 NAKs for section 1925 from client 0x0ACA140A
Got 164 NAKs for section 2049 from client 0x0ACA140A
Got 62 NAKs for section 2098 from client 0x0ACA140A
<repeat similar="" messages="" for="" another="" ~800="" lines="">
Got 124 NAKs for section 11726 from client 0x0ACA140A
Got 35 NAKs for section 11747 from client 0x0ACA140A
Got 1 NAKs for section 11803 from client 0x0ACA140A
Got 14 NAKs for section 11809 from client 0x0ACA140A
Sending DONE 2.1
Starting pass 2
Transfer aborted by 0x0ACA140A: Transfer timed out
Transfer status:
Host: 0x0ACA140A Status: Aborted
Total elapsed time: 0.000 seconds
Overall throughput: 0.00 KB/s
uftp: Finishing at Thu May 24 14:37:58 2018
Client side:
<snip>
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 76 NAKs for section 11508
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 62 NAKs for section 11522
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 41 NAKs for section 11680
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 66 NAKs for section 11687
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 168 NAKs for section 11699
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 8 NAKs for section 11719
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 124 NAKs for section 11726
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 35 NAKs for section 11747
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 1 NAKs for section 11803
2018/05/24 14:36:23.590178: [9F7BC5A0/03:0001]: Sent 14 NAKs for section 11809
2018/05/24 14:36:23.777679: [9F7BC5A0/03:0001]: Got DONE message for section 11842
2018/05/24 14:36:25.230812: [9F7BC5A0/03:0001]: Transfer timed out
2018/05/24 14:36:25.230812: [9F7BC5A0/03:0001]: starting file close
2018/05/24 14:36:25.230812: [9F7BC5A0/03:0001]: done file close
2018/05/24 14:36:25.230812: [9F7BC5A0/03:0001]: Writing restart file
I am totally willing to admit that there's just way to much packet loss, but the initial copy runs at a solid 150mbps for 2-3 hours before failing, which gives me hope, then drops me :) lol
Mark,
Part of the issue with sending large files is that after the first pass the server needs to seek long distances across the file to retrieve the relavant parts of the file to send. This seeking can take a long time on files several GB in size resulting the clients getting starved for packets and timing out.
There are a few things you can do to address this. First on the client, you'll want to increase the cache size via the -c option from the default of 1048576 (1 MB). At speeds over 100 Mbps, this will reduce the number of NAKs by writing to disk fewer times but in larger chunks. You'll also want to use the -B option to increase the UDP receive buffer size from the default of 262144 (256K) to at least 2097152 (2 MB) or more.
On the server, I see you've already set the robustness factor to its maximum and increased the minimum GRTT to 0.2 seconds. You may want to increase the minumum GRTT further. This will give the server more time to perform large seeks.
One thing I might try doing is to seek and read in a separate thread and periodically send heartbeat messages if it's taking too long.
Regards,
Dennis
Mark,
I just released version 4.9.7 which should address the timout issue when sending large files. There was a delay in looping through the list of blocks to send which has been fixed.
With this version, it shouldn't be necessary to artifically increase the GRTT to prevent timeouts. You'll still want to tune the cache size and UDP buffer size on the client however.
Regards,
Dennis
" on files several GB in size"
This one is a 156 GB file...
I will give the new version a try with the above advice and new version.
Thank you for your help with this! Better support than some commercial products I've used