FWIW, I find gmetad utterly unusable in my configuration unless I run it
out of tmpfs. I don't believe the problem is rrdtool specifically, since I
have a cacti instance polling several hundred devices for hundreds of
thousands of OIDs every five minutes (a near-continual RRD write/update
cycle), and that has so little impact on iowait or disk contention
that I still run it directly off a local array.

gmetad, on the other hand, will completely crush any system I run it on,
even against the _fastest_ local disk I can throw at it. This is with a
client base of (on average) 1000 nodes in 5-8 clusters. When run out of
tmpfs, I maintain a load of .01 and a network bandwidth utilization of
~250KB/sec. When run against fast local disk with an external write
journal, gmetad will flatten a quad-proc Opteron to the point where SSH
response is _really_ slow, without showing more than a few MB/sec of
reads/writes against the disk. It just seems like very inefficient I/O
patterns.
I haven't put nearly as much effort into looking at the problem as Gilad
has, since the tmpfs fix works fine... but IMO it probably has something
to do with the way rrdtool is being called, not just its use vs. that
Jason A. Smith wrote:
> Hi Gilad,
> I thought I remembered a sort of mini HOWTO or FAQ on the old
> ganglia web page which gave suggestions on how to set up ganglia, but I
> can't find it now.
> Anyway, I think ganglia's heavy IO requirements (mostly from rrdtool)
> are fairly well known to long time users, and each has probably come up
> with their own way around it. Here, we are using a diskless database
> directory for ganglia's rrd area, by using Linux's tmpfs:
> /etc/fstab contains this line:
> none /var/lib/ganglia/rrds tmpfs size=1024M,mode=755,uid=nobody,gid=nobody 0 0
> The uid & gid options should match your gmetad.conf's setuid setting.
> Then we back up the database directory using tar every night just to
> prevent complete data loss in case our ganglia server crashes.
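
As an illustration of the nightly tar backup described above, here is a minimal Python sketch, not the poster's actual script. The RRD path is taken from the fstab line; the backup directory, the archive naming, and the `keep` retention count are assumptions made for this example. It writes the tarball to real disk under a temporary name first, so a crash mid-backup never leaves a truncated archive in place of a good one.

```python
#!/usr/bin/env python3
"""Nightly snapshot of a tmpfs-backed RRD directory (illustrative sketch)."""
import os
import sys
import tarfile
import time

# RRD_DIR matches the fstab line above; BACKUP_DIR is a hypothetical
# location on persistent disk, chosen only for this example.
RRD_DIR = "/var/lib/ganglia/rrds"
BACKUP_DIR = "/var/backups/ganglia"


def backup_rrds(src_dir, dest_dir, keep=7):
    """Tar and gzip src_dir into dest_dir, keep the newest `keep` archives.

    Returns the path of the archive that was written.
    """
    os.makedirs(dest_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    final = os.path.join(dest_dir, "rrds-%s.tar.gz" % stamp)
    partial = final + ".part"

    # Write under a temporary name, then rename, so an interrupted run
    # never leaves a half-written file with a valid archive name.
    with tarfile.open(partial, "w:gz") as tar:
        tar.add(src_dir, arcname="rrds")
    os.replace(partial, final)

    # Timestamped names sort chronologically, so the oldest come first.
    archives = sorted(
        name for name in os.listdir(dest_dir)
        if name.startswith("rrds-") and name.endswith(".tar.gz")
    )
    if keep > 0:
        for old in archives[:-keep]:
            os.remove(os.path.join(dest_dir, old))
    return final


if __name__ == "__main__":
    src = sys.argv[1] if len(sys.argv) > 1 else RRD_DIR
    dest = sys.argv[2] if len(sys.argv) > 2 else BACKUP_DIR
    print(backup_rrds(src, dest))
```

Something like this would typically be run once a night from cron. Since tmpfs starts out empty after a reboot, the newest archive would also have to be unpacked back into the RRD directory before gmetad is started; that restore step is not shown here.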
> On Fri, 2006-04-14 at 13:06 -0700, Gilad Raphaelli wrote:
>> I'm actually seeing 100% disk busy under both rhel4 and freebsd 4.11
>> with just 98 nodes in 13 clusters. My goal is to get gmetad running
>> on freebsd; rhel4 was just for comparison's sake. A ktrace reveals
>> 100s of failed mkdirs during every writing period - traceable to
>> rrd_helpers.c. There don't seem to be any other significant events.
>> When the disk hits 100% iowait the system is unusable.
>> I was under the impression that a relatively low powered system could
>> handle something like this configuration - perhaps that is the issue?
>> The box is a PIII 800 with 1.5 GB mem - the rrds are stored on a
>> dedicated 70 GB ide disk.
>> Any insight would be appreciated. I'm hanging out in #ganglia on
>> freenode if anyone wants to chat.
>> Thank you,
>> ----- Original Message ----
>> From: Bernard Li <bli@...>
>> To: knobi@...; knobi@...; ganglia-
>> Sent: Thursday, April 13, 2006 11:19:50 PM
>> Subject: [Ganglia-developers] RE: [Ganglia-general] New (final?)
>> tarball for ganglia-3.0.3
>> Hi Martin:
>> Finally had the time to test it, here's the text in the webpage now:
>> Gmetad Web Frontend version 22.214.171.124604132304 Check for Updates.
>> Gmetad Web Backend (gmetad) version 126.96.36.199604102000 Check for
>> Looks like it's fixed.
>> BTW, I tested Ganglia on Fedora Core 5 x86 and it is working fine.
>> Did anybody else test 3.0.3? Somebody on IRC mentioned that he was
>> having issues with gmetad using up 99% CPU with a large number of
>> clients (50+).
>> From: Martin Knoblauch [mailto:knobi@...]
>> Sent: Tue 11/04/2006 11:38
>> To: Bernard Li; knobi@...; ganglia-
>> Subject: RE: [Ganglia-general] New (final?) tarball for ganglia-3.0.3
>> could you please test the following patch in "web" to solve this
>> really really big problem :-) You need to run "./configure" to
>> $diff -u -r1.9 ganglia.php
>> --- ganglia.php 25 Mar 2006 01:53:57 -0000 1.9
>> +++ ganglia.php 11 Apr 2006 18:34:31 -0000
>> @@ -33,7 +33,8 @@
>> $version = array();
>> # The web frontend version, from conf.php.
>> -$version["webfrontend"] =
>> +#$version["webfrontend"] =
>> +$version["webfrontend"] = "$ganglia_version";
>> # The name of our local grid.
>> $self = " ";
>> $diff -u -r1.1 version.php.in
>> --- version.php.in 10 Dec 2004 21:34:04 -0000 1.1
>> +++ version.php.in 11 Apr 2006 18:34:50 -0000
>> @@ -5,7 +5,7 @@
>> $minorversion = @GANGLIA_MINOR_VERSION@;
>> $microversion = @GANGLIA_MICRO_VERSION@;
>> -$ganglia_version =
>> +$ganglia_version = "@GANGLIA_VERSION@";
>> $ganglia_release_name = "@GANGLIA_RELEASE_NAME@";
>> --- Bernard Li <bli@...> wrote:
>>> Just tested building and running on Fedora Core 4 x86, everything
>>> checks out (minimal installation test) - did notice this minor issue
>>> Gmetad Web Frontend version 3.0.3 Check for Updates.
>>> Gmetad Web Backend (gmetad) version 188.8.131.52604102000 Check for
>>> Notice the versions are different between webfrontend and gmetad - I
>>> guess they use different sources for the version string?
>>> Chris, are you still planning to help us test with your hardware?
>>> P.S. If anybody wants the RPMs, please ping me.
>>> From: ganglia-general-admin@... on behalf of
>>> Sent: Sat 08/04/2006 00:31
>>> To: ganglia general; ganglia-developers@...
>>> Subject: [Ganglia-general] New (final?) tarball for ganglia-3.0.3
>>> as promised, I have created a new pre-3.0.3 tarball. It can be
>>> downloaded from:
>>> Due to the release plans for OSCAR5, this could be the last snapshot
>>> before a release next week.
>>> In particular, the following problems are supposed to be solved:
>>> - truncated XML
>>> - bogus "old protocol" messages in dead-host detection
>>> - gmetad will not stop updating RRDs after a previous failure
>>> - apr-0.9.7 is now officially in CVS
>>> - minor fixes to the webfrontend
>>> - more minor stuff -> See the ChangeLog
>>> Martin Knoblauch
>>> email: k n o b i AT knobisoft DOT de
>>> www: http://www.knobisoft.de
>> Martin Knoblauch
>> email: k n o b i AT knobisoft DOT de
>> www: http://www.knobisoft.de