From: Erik A. H. <er...@he...> - 2002-04-30 13:53:14
On Fri, Apr 26, 2002 at 03:47:37PM -0400, Sean DIlda wrote:
> We found a bug in bpsh where if you send a single command to several
> (hundreds) of nodes at once, the output from some nodes would be lost.
>
> It turns out that in this case, some of the remote commands would finish
> running and exit before bpsh ever started handling I/O, therefore the
> sigchld handler would decrease outstanding_connections, then when bpsh
> started handling those connections (which were still pending) it
> decreased outstanding_connections again.  This led to a condition where
> outstanding_connections would be way below 0, and io_to_do and
> late_connections would both be zero, and thus bpsh would stop looping and
> exit, even though there might still be connections it hasn't handled
> yet.  The attached patch keeps outstanding_connections from being
> decremented twice for a node that finishes early, thus solving the
> problem.

The patch looks good to me.  I'll apply it.

- Erik
--
Erik Arjan Hendriks		Printed On 100 Percent Recycled Electrons
er...@he...		Contents may settle during shipment
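
For readers without the attachment, the double decrement described above can be illustrated with a small sketch. This is not the actual patch and not bpsh's real code: struct node_conn, struct io_state, account_done(), on_child_exit(), on_connection_handled() and simulate() are all hypothetical names, and only outstanding_connections comes from the report. The assumption is that a per-node flag is enough to make sure each node is subtracted from the counter once, whichever event arrives first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 1024

/* Hypothetical per-node state; not bpsh's real data structure. */
struct node_conn {
    bool exited;    /* remote command has already exited */
    bool accounted; /* node was already subtracted from the counter */
};

/* Hypothetical container for the counter named in the report. */
struct io_state {
    int outstanding_connections;
};

/* Subtract a node from the counter at most once. */
static void account_done(struct io_state *st, struct node_conn *c)
{
    if (c->accounted)
        return; /* second event for the same node: do nothing */
    c->accounted = true;
    st->outstanding_connections--;
    assert(st->outstanding_connections >= 0);
}

/* Path 1: the child exit is noticed (the SIGCHLD side in the report). */
static void on_child_exit(struct io_state *st, struct node_conn *c)
{
    c->exited = true;
    account_done(st, c);
}

/* Path 2: the main loop finally handles the still-pending connection. */
static void on_connection_handled(struct io_state *st, struct node_conn *c)
{
    account_done(st, c);
}

/*
 * Simulate n nodes of which the first `early` exit before I/O handling
 * starts. Returns the final counter value, or -1 on bad arguments.
 */
static int simulate(size_t n, size_t early)
{
    struct node_conn nodes[MAX_NODES] = {0};
    struct io_state st;
    size_t i;

    if (n > MAX_NODES || early > n)
        return -1;
    st.outstanding_connections = (int)n;

    for (i = 0; i < early; i++)
        on_child_exit(&st, &nodes[i]);         /* early finishers */
    for (i = 0; i < n; i++)
        on_connection_handled(&st, &nodes[i]); /* pending connections */

    return st.outstanding_connections;
}
```

With the guard in place, simulate(300, 120) ends at 0. Without the accounted check, the same run would end at -120, which matches the "way below 0" symptom in the report. A real signal handler would also have to restrict itself to async-signal-safe work, which this sketch does not model.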