From: Daniel P. <da...@po...> - 2012-04-24 15:47:52
On 24/04/12 16:51, Ramon Bastiaans wrote:
>
> On 23-4-2012 15:26, Daniel Pocock wrote:
>> Actually, apr can be a little bit more naughty than that: for Vladimir
>> and myself, attempting to query the buffer size from APR reports the
>> value 0. Querying the underlying socket directly reports another
>> value. I'm using apr-1.4.2 on Debian squeeze, which version do you have?
>
> Looking at APR's source it seems as if it only queries (on unix) if the
> option is set and not the actual value of the option:

Great, thanks for confirming the root cause of this issue.

>> However, because we know there are issues with getting/setting the value
>> through APR, your patch would also need to consider:
>>
>> - is there a minimum APR version required for the patch to work?
>
> Seems setting APR_SO_RCVBUF was added to APR in 2003 to version 0.9.4

I don't think we support 0.9.4 anyway. Ganglia refuses to compile with
it, so no extra effort is needed to document that.

>
>> - could you set the value, query the value, and if it hasn't accepted
>> the value, try setting the value on the native socket?
>> - or maybe just ignore the APR code completely and go directly to set
>> the value on the native socket?
> Think to be safe I will just skip all the APR weirdness and use the
> native socket. Unless there might be portability issues with that?

Exactly - we use APR to make Ganglia safer. So we should avoid building
in too much native code stuff.

If an apr upstream fix comes quickly, then I suggest ganglia should not
include the hack. It should use the proper apr call, and people who have
such heavily loaded gmonds that they need this functionality should be
told it is only supported on a recent Linux/apr version.

However, given that the problem is quite severe and likely to exist in
most current Linux distributions, maybe the current debug messages that
I added should also log a warning (or even error) message if (a) the
buffer size has been set manually and (b) a bad apr is detected (or
querying the value returns 0).

Maybe gmond should even refuse to start if the user has requested a
bigger buffer and it is not supported? Then they are forced to find out
what is going on and upgrade their apr.