From: <Ric...@ba...> - 2006-07-28 14:11:48
Martin,

thanks for the reply.

It would make sense that the code is a legacy from before collection_groups were implemented (aside: collection groups are smart and obviously necessary, but did I realise this before seeing Ganglia? Nope...)

Setting the intervals to 0 seems to work fine, e.g. from an strace -e trace=open:

14:41:52 open("/proc/stat", O_RDONLY) = 8
14:41:52 open("/proc/loadavg", O_RDONLY) = 8
14:41:52 open("/proc/meminfo", O_RDONLY) = 8
14:41:52 open("/proc/net/dev", O_RDONLY) = 8
14:41:57 open("/proc/stat", O_RDONLY) = 8
14:41:57 open("/proc/loadavg", O_RDONLY) = 8
14:41:57 open("/proc/meminfo", O_RDONLY) = 8
14:41:57 open("/proc/net/dev", O_RDONLY) = 8

But as the update_file code has this:

    if(now - tf->last_read > tf->thresh)

it means that a given /proc file won't be opened more than once in any second, even though I set the minimum delay to 0. It all works fine for me.

I was wrong about the poll delay and the above delays adding up. They don't.

As for wanting a 5 second polling rate. Sigh. It's what they want. Also, our job mix in this investment bank is a mixture of risk calculations that take a while, and pricing calculations that sometimes need to be delivered in seconds. They want to see the load spike for those calculations.

kind regards,
Richard

-----Original Message-----
From: Martin Knoblauch [mailto:kn...@kn...]
Sent: 28 July 2006 14:19
To: Grevis, Richard: IT (LDN); gan...@li...
Subject: Re: [Ganglia-developers] Linux/cygwin gmond metric poll rate.

Hi Richard,

--- Ric...@ba... wrote:

> Guys,
>
> the code below is in the cygwin and linux metric.c files.
>
> --------------------------------------------------------
> typedef struct {
>   uint32_t last_read;
>   uint32_t thresh;
>   char *name;
>   char buffer[BUFFSIZE];
> } timely_file;
>
> timely_file proc_stat    = { 0, 15, "/proc/stat" };
> timely_file proc_loadavg = { 0, 15, "/proc/loadavg" };
> timely_file proc_meminfo = { 0, 30, "/proc/meminfo" };
> timely_file proc_net_dev = { 0, 30, "/proc/net/dev" };
>
> char *update_file(timely_file *tf)
> {
>   int now,rval;
>   now = time(0);
>   if(now - tf->last_read > tf->thresh) {
>     rval = slurpfile(tf->name, tf->buffer, BUFFSIZE);
>     if(rval == SYNAPSE_FAILURE) {
>       err_msg("update_file() got an error from slurpfile() reading %s",
>               tf->name);
>       return (char *)SYNAPSE_FAILURE;
>     }
>     else tf->last_read = now;
>   }
>   return tf->buffer;
> }
> --------------------------------------------------------
>
> In my ganglia setup I have a small number of metrics polled often
> (5-10 seconds) on a large number of hosts (4,000!). I have already
> observed that values did not seem to change that fast, but have only
> now found the above code. I obviously can't poll fast with the above
> code.

 let me understand, you want to sample (e.g.) cpu_* at 5 second intervals?

> Given that metrics are measured only at the rate defined by their
> poll time, what is the point of the above code? Is it to ensure that
> when (say) cpu_user, cpu_sys, cpu_wio etc are measured, the /proc
> file is only opened once?

 You might be right. Before we had "collection groups" I guess this would [kind of] ensure that all the cpu_* stats would be taken from the same sample. Otherwise they would not add up to 100%.

 Also it rate limits the reading of the /proc files, which might be considered to be "expensive".

 What happens when you compile with smaller values?

> Also it seems the delay between /proc file reads is the delay above,
> plus the poll delay of the metric. Is this true? I measured this, but
> did not trace through the code source to confirm.

 Not sure about that. If true, it sounds wrong.

> So Matt (or a similar guru), can you let me know the intent of this
> code?
>
> kind regards,
> Richard G

------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
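
The point above about the comparison in update_file() can be sketched in isolation. What follows is a minimal illustration and not the gmond source: refresh_gate, should_refresh and count_refreshes are hypothetical names, the clock is passed in as a parameter instead of calling time(0), and no /proc file is actually read. Only the strict "greater than" comparison is taken from the quoted code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for timely_file, keeping only the two
 * fields that take part in the comparison. */
typedef struct {
    uint32_t last_read;  /* time of the last re-read, in seconds */
    uint32_t thresh;     /* threshold in seconds */
} refresh_gate;

/* Same test as in the quoted update_file(): re-read only when
 * strictly more than thresh seconds have passed. */
static bool should_refresh(const refresh_gate *g, uint32_t now)
{
    return now - g->last_read > g->thresh;
}

/* Walk a list of poll times (whole seconds) and count how many of
 * them would trigger a re-read, updating last_read as the original
 * code does after a successful read. */
static int count_refreshes(refresh_gate *g, const uint32_t *times, int n)
{
    int reads = 0;
    for (int i = 0; i < n; i++) {
        if (should_refresh(g, times[i])) {
            g->last_read = times[i];
            reads++;
        }
    }
    return reads;
}
```

With thresh set to 0, several polls inside the same second still produce a single re-read, because now - last_read is 0 and 0 > 0 is false. With thresh left at 15 and polls every 5 seconds, a re-read happens only on every fourth poll, since a difference of exactly 15 does not pass the strict comparison.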