From: Mantis B. T. <no...@bu...> - 2010-05-30 10:50:08
A NOTE has been added to this issue.
======================================================================
http://bugs.bacula.org/view.php?id=1493
======================================================================
Reported By:                robe
Assigned To:                kern
======================================================================
Project:                    bacula
Issue ID:                   1493
Category:                   File Daemon
Reproducibility:            always
Severity:                   tweak
Priority:                   normal
Status:                     feedback
======================================================================
Date Submitted:             2010-02-09 11:32 GMT
Last Modified:              2010-05-30 11:49 BST
======================================================================
Summary:                    Don't set the TCP send buffer size when it's not specified
Description:
I'm currently working on a setup where data is backed up from a data center in the USA to a Bacula Dir/SD in the Netherlands. The latency on the link is noticeable, which in combination with small TCP windows causes very low throughput and long-running backup jobs.

Since there's enough bandwidth available, this is easily fixed by increasing the TCP window size to something in the half-megabyte range [1], which isn't an issue, especially on modern servers with many GB of RAM. This can usually be done easily by adjusting the relevant parameter in the OS network stack; all newly created sockets will then use the new window size as their default.

The Bacula File Daemon, though, always resets the default send buffer size to 64KiB when no "Maximum Network Buffer Size" is specified. This is very unexpected for a Unix daemon, and it increases the work for backup infrastructure operators, who have to analyze the situation and then introduce the necessary changes in the infrastructure.

Please consider not changing the SNDBUF size in the FD when there's no value configured in its config file.
The documentation of "Maximum Network Buffer Size" [2] also needs a bit of reworking: you want a very large send buffer, especially on (high-latency) internet links, to achieve sufficient throughput. The only drawback is that Bacula FDs might be able to saturate the network connections for the first time when run with sufficiently large TCP windows...

best regards,
Michael Renner

[1] Mattias Wadenstein and others have written about this, see http://www.acc.umu.se/~maswan/linux-netperf.txt
[2] http://www.bacula.org/5.0.x-manuals/en/main/main/Storage_Daemon_Configuratio.html
======================================================================

----------------------------------------------------------------------
 (0005077) Dan Langille (manager) - 2010-02-09 11:45
 http://bugs.bacula.org/view.php?id=1493#c5077
----------------------------------------------------------------------
Reading this made me think of better backups over less reliable networks.

----------------------------------------------------------------------
 (0005093) kern (administrator) - 2010-02-10 21:14
 http://bugs.bacula.org/view.php?id=1493#c5093
----------------------------------------------------------------------
I would like to discuss this over the next week or so and come up with some reasonable solution -- probably some new directive or directives that allow you to control Bacula to accomplish what you want, but first, I need to re-read your report and the documentation at the link you provide.

This is, however, not a bug, but rather a feature request, and I think the best way to work on this is via email on the bacula-devel list.

There are two points within Bacula that come into play here:
1. The size of the buffer that Bacula uses to read the file. This buffer is then passed across the network.
2. When the FD connects to the SD, the two ends attempt to negotiate the largest possible network buffer, which is the TCP/IP size used by the OSes to communicate and not directly related to the packet size that Bacula writes to the network.

I need to research this a bit more to understand how these two points work with what you are seeing and the changes you want made.

Bottom line: I would like to close this bug report and continue the discussion on the bacula-devel list.

----------------------------------------------------------------------
 (0005426) kern (administrator) - 2010-05-30 11:38
 http://bugs.bacula.org/view.php?id=1493#c5426
----------------------------------------------------------------------
OK, I have looked at what you wrote a second time and think I understand it. I have done as you have asked: if the user does not specify a buffer size, Bacula does not attempt to change it with the OS.

However, there is one thing I don't understand: if you set your OS buffer size to 500K, but you do not change Bacula's buffer size, then Bacula is going to write 32K buffers. Won't that create problems with latency -- or does the OS combine multiple socket writes to form a big buffer?

----------------------------------------------------------------------
 (0005427) robe (reporter) - 2010-05-30 11:49
 http://bugs.bacula.org/view.php?id=1493#c5427
----------------------------------------------------------------------
All writes to TCP sockets are buffered by the OS and sent over the network by the OS without regard to how big the writes to the buffer were in the first place. There are ways to modify this behavior; see for example the TCP_NODELAY option in the Linux tcp(7) manpage and http://en.wikipedia.org/wiki/Nagle's_algorithm.

For Bacula it shouldn't matter how big the writes to the socket are [1] as long as the TCP buffer size is untouched.
[1] There's always the chance of noticeable syscall overhead, but that's probably something which only affects high-speed environments.

Issue History
Date Modified    Username       Field                    Change
======================================================================
2010-02-09 11:32 robe           New Issue
2010-02-09 11:45 Dan Langille   Note Added: 0005077
2010-02-10 21:14 kern           Note Added: 0005093
2010-02-10 21:14 kern           Assigned To               => kern
2010-02-10 21:14 kern           Status                   new => assigned
2010-03-10 15:06 mnalis         Issue Monitored: mnalis
2010-05-30 11:38 kern           Note Added: 0005426
2010-05-30 11:38 kern           Status                   assigned => feedback
2010-05-30 11:49 robe           Note Added: 0005427
======================================================================
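
Editorial illustration (not part of the original thread): the behavior requested in this issue, leaving the OS default send buffer alone unless a size is explicitly configured, can be sketched roughly as follows. This is a minimal Python sketch, not Bacula's actual code. The function names `apply_send_buffer` and `bandwidth_delay_product` are hypothetical, and the 40 Mbit/s and 100 ms figures in the note after the block are assumed example values, not measurements from this report.

```python
import socket
from typing import Optional


def bandwidth_delay_product(bandwidth_bps: int, rtt_ms: int) -> int:
    """Return the number of bytes that must be in flight to keep a link busy.

    Hypothetical helper: bandwidth is in bits per second, round-trip
    time in milliseconds.
    """
    # bits/s * ms / 1000 gives bits per round trip; / 8 converts to bytes.
    return bandwidth_bps * rtt_ms // 8000


def apply_send_buffer(sock, configured_size: Optional[int]) -> bool:
    """Set SO_SNDBUF only when a size was explicitly configured.

    Returns True if the socket option was changed, False if the OS
    default was left untouched. The name and signature are assumptions
    made for this sketch.
    """
    if configured_size is None:
        # No directive given: keep whatever the OS network stack chose.
        return False
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, configured_size)
    return True
```

With these assumed figures, `bandwidth_delay_product(40_000_000, 100)` evaluates to 500000 bytes, which is in the "half-megabyte range" mentioned in the description. Calling `apply_send_buffer(sock, None)` leaves the socket at the system-wide default, so a window size tuned in the OS network stack stays in effect.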