From: Brad N. <BNI...@no...> - 2010-10-18 16:56:03
I built gmond with the sflow patch and got it up and running. Then I downloaded hsflowd from the sourceforge project as described in the gmond doc. hsflowd built and installed as expected, and everything seemed to be up and running. I also added the extra udp_recv_channel block to the gmond.conf file.

But now, after everything is up and running, I don't see anything different in Ganglia. The web front end is just showing the same monitored computers as it did before, with the same metrics. If I query gmond through telnet, I am not seeing any new metrics or spoofed nodes. What am I missing?

I also tried to trace the network traffic on one of the machines that is running hsflowd for anything from port 6343. I'm not seeing anything there either, even though the box says that hsflowd is running. I currently have hsflowd running on two different boxes. One is a SLED 10 box and the other is a SLES 10 box.

Brad

>>> On 10/11/2010 at 4:33 PM, in message <9DD...@in...>, Neil McKee <nei...@in...> wrote:
> As suggested, I moved the sFlow receiver into a new file "sflow.c" and eliminated any C99 assumptions. This time there is a "--disable-sflow" configure option too:
>
> http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=276
>
> In order for sflow.c to feed data directly into the repository, I had to expose 1 structure definition and 5 functions that were previously private to gmond.c. Hence the new .h file, "gmond_internal.h".
>
> Neil
>
>
> On Oct 7, 2010, at 2:56 PM, Peter Phaal wrote:
>
>> Brad,
>>
>> Thanks for the feedback. My comments are in-line.
>>
>> On Oct 7, 2010, at 12:27 PM, Brad Nicholes wrote:
>>
>>> Sorry to jump into this thread so late but I thought that I would throw my 2 cents in.
>>>
>>> I finally got a chance to take a look at the code. I was able to compile it but ran into some C99 issues with variable declarations. Once I got the code to compile, I was able to take a closer look at what it was doing. From what I could tell, it looks like the sflow integration is based around reading XDR packets from an sflow agent and turning them into gmond spoofing metrics. My first question after seeing this is why does this code have to be built into gmond.c? Why can't it just do the same thing in a module that would be plugged into gmond?
>>>
>>> The reason why I ask this is because we went to a lot of work to pull all of the metric gathering out of gmond and into modules (including all of the standard metrics). One of the main reasons for this is so that metric gathering could be pluggable without having to affect the gmond code itself. That way, if a bug was ever found and fixed for a specific metric, we wouldn't have to re-release all of Ganglia just for one metric fix. Also, modules give the user the ability to customize each gmond agent to conform to the specific needs of the node where gmond is running. Regarding sflow, it seems that in order to integrate the sflow metrics into the Ganglia monitoring system, only a single gmond node needs to be configured to gather the sflow metrics. All of the other gmond agents can continue to be configured and run as they were. Given that, it would make more sense to integrate sflow as a module that could be loaded under a single gmond agent rather than replacing all the gmond agents or even upgrading just a single agent. It would also seem to follow the way that other metric modules and spoofing modules have been implemented as well.
>>
>>
>> I am not very familiar with gmond modules, but it looks like they are designed around a polling model and used to retrieve metrics from the server that the particular instance of gmond is running on:
>> 1. a module is loaded in the modules section of the gmond.conf file and registers a set of metrics it can provide
>> 2. metrics are then included in collection_group sections and polled at the specified intervals
>>
>> With sFlow, the counters are being pushed by remote servers. There may be hundreds of sFlow agents sending XDR packets to the single gmond instance. Our code acts as a gateway, translating the metrics from the remote hosts and presenting them as if they had arrived in the form of Ganglia XDR datagrams from remote gmond instances. This function needs to be part of the main datagram processing loop. I don't see a way for a module to inject code into the packet processing loop(?)
>>
>> We do of course plan to limit the changes to gmond.c by moving most of the code to a separate sflow.c file, leaving just the minimal changes in gmond.c. We also plan to address the issues with the C99 variable declarations. Thanks for pointing that out.
>>
>>> Implementing the sflow integration as a module would also allow it to change whenever a newer version of sflow is released or whenever the sflow spec or transport changes. A user could simply upgrade his ganglia sflow module and be up to date with the latest spec without having to wait for the Ganglia project to re-release ganglia.
>>
>> The sFlow version 5 specification hasn't changed since July 2004. The sFlow version 5 protocol sends TLV data containing XDR encoded structures, making it extensible. However, once an sFlow structure is published by sFlow.org, it is immutable. The structures we are decoding are part of the recent sFlow Host Structures specification and will not change:
>> http://www.sflow.org/sflow_host.txt
>>
>> The gmond sFlow decoder skips over structures it is not interested in and won't be affected by future sFlow extensions. However, converting additional sFlow structures into Ganglia metrics would involve extra coding effort (decoding the structure, calculating counter deltas, defining metadata etc), but is relatively straightforward. The current code lays down a framework for supporting additional sFlow metrics in the future.
>>
>>>
>>> Anyway, the more I learn about sflow and what it does, especially in relation to Ganglia, the more this all seems like a really cool idea. I am looking forward to seeing this integration done, especially if it is through a pluggable module.
>>
>> We were very excited to see how easily the data propagated into the Ganglia UI. The sFlow standard leverages the work that the Ganglia community has done to define a core set of portable metrics, making sFlow and Ganglia a good fit.
>>
>> Peter
>> ------------------------------------------------------------------------------
>> Beautiful is writing same markup. Internet Explorer 9 supports
>> standards for HTML5, CSS3, SVG 1.1, ECMAScript5, and DOM L2 & L3.
>> Spend less time writing and rewriting code and more time creating great
>> experiences on the web. Be a part of the beta today.
>> http://p.sf.net/sfu/beautyoftheweb
>> _______________________________________________
>> Ganglia-developers mailing list
>> Gan...@li...
>> https://lists.sourceforge.net/lists/listinfo/ganglia-developers
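
For readers following the decoder discussion above (TLV records carrying XDR encoded structures, unknown structures skipped, counter deltas calculated for the known ones), here is a minimal sketch of that idea. It is written in Python for brevity and is not the code from sflow.c. The helper names (walk_tlv, make_record, counter_delta) and the tag numbers are hypothetical, and the counters are assumed to be 32 bits wide.

```python
import struct


def make_record(tag, payload):
    """Build one TLV record: 4-byte tag, 4-byte length, payload padded to 4 bytes.

    Hypothetical helper, used here only to construct sample input.
    """
    pad = (-len(payload)) % 4
    return struct.pack(">II", tag, len(payload)) + payload + b"\x00" * pad


def walk_tlv(buf, wanted):
    """Yield (tag, payload) for records whose tag is in `wanted`.

    Records with any other tag are skipped by their declared length, so
    structures added later do not disturb the decoder.
    """
    offset = 0
    while offset + 8 <= len(buf):
        tag, length = struct.unpack_from(">II", buf, offset)
        offset += 8
        padded = (length + 3) & ~3  # XDR pads opaque data to a 4-byte boundary
        if offset + padded > len(buf):
            raise ValueError("truncated record")
        if tag in wanted:
            yield tag, buf[offset:offset + length]
        offset += padded


def counter_delta(previous, current, bits=32):
    """Difference between two readings of a wrapping counter (assumed 32-bit)."""
    return (current - previous) & ((1 << bits) - 1)


if __name__ == "__main__":
    KNOWN_TAG, UNKNOWN_TAG = 1, 99  # hypothetical tag values
    datagram = (make_record(UNKNOWN_TAG, b"abcde")
                + make_record(KNOWN_TAG, struct.pack(">I", 7)))
    for tag, payload in walk_tlv(datagram, {KNOWN_TAG}):
        print(tag, struct.unpack(">I", payload)[0])
    print(counter_delta(0xFFFFFFF0, 0x10))
```

The skip-by-length step is what lets a decoder ignore structures it does not understand, and the masked subtraction keeps a delta correct across a counter wrap.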