From: Hui Li <wat...@gm...> - 2021-10-23 05:16:54
|
Hi, Could anyone tell me the development status of Ganglia? I checked the git repos and it seems they have not been updated for years. I also raised a PR 2 days ago, but the community seems pretty quiet. Ganglia is a great project; I started using it 12 years ago when I set up HPC clusters at my university, so I really hope this project can continue and stay active. Thanks. Hui |
From: Phong N. <p-n...@ra...> - 2018-12-04 20:53:52
|
Hello Ganglia Developers, My name is Phong Nguyen and I have the following question. I have a working Ganglia environment that I installed using the open-source (non-RPM) method. I would also like to see NFS shown beside CPU/MEMORY/DISK on the Ganglia website. I am looking at GitHub, and it sounds like I have to download the NFSSTATS.PY module. My question is: what steps should I take to install the NFSSTATS module into my working Ganglia installation? The page I am looking at is https://github.com/ganglia/monitor-core/tree/master/gmond/python_modules and there is a link called NFS toward the middle of the page. |
From: Jeffrey F. <fr...@ud...> - 2018-05-31 16:42:08
|
Background
==========

On a new cluster we are building right now I moved from Ganglia 3.6.1 to 3.7.2. 3.6.1 has been rock-solid on previous clusters. After 3.7.2 gmond has been up for a short period of time, it begins emitting the error message:

    Incorrect format for spoof argument. exiting.

Debugging
=========

If I enable debugging (e.g. -d 4) I'm shown the parsed contents of the spoof string -- and they are non-zero garbage strings. Doing some gdb tracing with breakpoints on that error message, the metric_id passed to the function has non-zero .spoof and the .host value is a garbage string. In one trace, the .host was an empty string (""); the code in Ganglia_host_get() assumes that if .spoof is non-zero, then .host is non-null and a string with length > 0. So the subsequent code:

    spoof_info_len = strlen(metric_id->host);
    buff = malloc(spoof_info_len+1);
    strncpy(buff, metric_id->host, spoof_info_len + 1);
    spoofIP = buff;
    if( !(spoofName = strchr(buff+1,':')) ){

can produce a buffer overrun for a zero-length string.

To isolate possible reasons for the botched spoofing hostname I compared the gmond/gmond.c source between 3.6.1 and 3.7.2. In Ganglia_collection_group_send() the following code

    name = cb->msg.Ganglia_value_msg_u.gstr.metric_id.name;
    if (override_hostname != NULL)
      {
        cb->msg.Ganglia_value_msg_u.gstr.metric_id.host = apr_pstrcat(gm_pool, (char *)( override_ip != NULL ? override_ip : override_hostname ), ":", (char *) override_hostname, NULL);
        cb->msg.Ganglia_value_msg_u.gstr.metric_id.spoof = TRUE;
      }

is allocating the callback's .host field from the temporary metrics APR pool; but the callback is external to this function and lives on beyond the destruction of that temporary APR pool. Eventually the memory behind cb->msg.Ganglia_value_msg_u.gstr.metric_id.host will be reused and overwritten, yielding the "garbage string" condition that's being observed. In 3.6.1, the .host field was allocated from global_context.

If I modified the code cited above to use global_context rather than gm_pool, gmond runs without throwing "Incorrect format for spoof argument" errors.

Also, in lib/libgmond.c the static global "myhost"

    static char myhost[APRMAXHOSTLEN+1];

is assumed by the rest of the code to have been initialized by the compiler to be a zero-length string:

    if (myhost[0] == '\0')
        apr_gethostname( (char*)myhost, APRMAXHOSTLEN+1, gm_pool);

Probably best to be explicit about the initial value of myhost and not assume an initial value?

    static char myhost[APRMAXHOSTLEN+1] = "";

Happy to contribute patch files, etc.

::::::::::::::::::::::::::::::::::::::::::::::::::::::
Jeffrey T. Frey, Ph.D.
Systems Programmer V / HPC Management
Network & Systems Services / College of Engineering
University of Delaware, Newark DE 19716
Office: (302) 831-6034 Mobile: (302) 419-4976
:::::::::::::::::::::::::::::::::::::::::::::::::::::: |
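The format check discussed in this message can be modeled outside of gmond. The sketch below is a Python stand-in, not the C code from libgmond: it rejects an empty string, or a string with no ':' after its first character, which is the condition the quoted strchr(buff+1, ':') call tests for. The function name parse_spoof is a hypothetical helper introduced only for illustration.

```python
def parse_spoof(spoof_info):
    """Split an 'ip:hostname' spoof string into its two parts.

    Hypothetical helper for illustration; it is not part of Ganglia.
    """
    # Guard the zero-length case explicitly. The C snippet quoted in the
    # message has no such guard and searches from buff+1, which for an
    # empty string is one byte past the terminating NUL.
    if not spoof_info:
        raise ValueError("Incorrect format for spoof argument")
    # Search for ':' starting at index 1, so the IP part cannot be empty.
    sep = spoof_info.find(":", 1)
    if sep == -1:
        raise ValueError("Incorrect format for spoof argument")
    return spoof_info[:sep], spoof_info[sep + 1:]
```

A well-formed value such as "10.0.0.5:node5" splits into its IP and host name, while "" and ":node5" are both rejected before any search runs past the end of the data.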
From: Dmitry A. <fre...@gm...> - 2018-03-16 15:32:10
|
Thanks Robin! That's exactly what I needed.

On Tue, Mar 13, 2018 at 3:12 AM Robin Humble <rjh...@ci...> wrote:

> Hi Dmitry,
>
> On Fri, Mar 09, 2018 at 08:11:08PM +0000, Dmitry Akselrod wrote:
> >2. As I am collecting the metrics for the remote hosts on my utility
> >hosts, the Ganglia website will show my utility host as the node name for
> >all the metrics. That all makes sense since gmetad is polling the gmond
> >on my utility host and the gmond on my utility host is storing the
> >metrics. Is there a way to override the hostname for the specific metric
> >I am collecting via SNMP? I would like the Ganglia cluster to have a node
> >for each of the appliance I am polling via my SNMP module with its metrics
> >assigned to it. It seems like it should be theoretically possible since
> >gmond can aggregate metrics from multiple hosts. I am just not sure how
> >to get to get to this programmatically.
>
> rather than use a gmond python module, you could probably accomplish
> what you want using external python program that gathers up all your
> SNMP data and then spoofs it into ganglia using gmetric.py.
> https://github.com/ganglia/ganglia_contrib/tree/master/gmetric-python
>
> the data will appear to be coming from another host even though you
> are inserting it all into ganglia from your utility host.
> https://github.com/ganglia/monitor-core/wiki/Gmetric-Spoofing
>
> eg.
>   g = gmetric.Gmetric( gmondHost, gmondPort, gmondProtocol )
>   spoofStr = ip + ':' + host
>   g.send( name, '%.2f' % d, 'float', unit, 'both', 60, 0, "", spoofStr )
>
> I do this for a bunch of 'out of band' data like node temps, fans,
> infiniband traffic, filesystem traffic, etc.
>
> the only quirk in doing it this way is that if a host is down then this
> spoof'd data will make it appear like it's still up. but for pure 'fake'
> hosts like it sounds like you have, then that's probably what you want.
>
> cheers,
> robin
> |
From: Robin H. <rjh...@ci...> - 2018-03-13 07:13:00
|
Hi Dmitry,

On Fri, Mar 09, 2018 at 08:11:08PM +0000, Dmitry Akselrod wrote:
>2. As I am collecting the metrics for the remote hosts on my utility
>hosts, the Ganglia website will show my utility host as the node name for
>all the metrics. That all makes sense since gmetad is polling the gmond
>on my utility host and the gmond on my utility host is storing the
>metrics. Is there a way to override the hostname for the specific metric
>I am collecting via SNMP? I would like the Ganglia cluster to have a node
>for each of the appliance I am polling via my SNMP module with its metrics
>assigned to it. It seems like it should be theoretically possible since
>gmond can aggregate metrics from multiple hosts. I am just not sure how
>to get to get to this programmatically.

rather than use a gmond python module, you could probably accomplish what you want using external python program that gathers up all your SNMP data and then spoofs it into ganglia using gmetric.py.
https://github.com/ganglia/ganglia_contrib/tree/master/gmetric-python

the data will appear to be coming from another host even though you are inserting it all into ganglia from your utility host.
https://github.com/ganglia/monitor-core/wiki/Gmetric-Spoofing

eg.
  g = gmetric.Gmetric( gmondHost, gmondPort, gmondProtocol )
  spoofStr = ip + ':' + host
  g.send( name, '%.2f' % d, 'float', unit, 'both', 60, 0, "", spoofStr )

I do this for a bunch of 'out of band' data like node temps, fans, infiniband traffic, filesystem traffic, etc.

the only quirk in doing it this way is that if a host is down then this spoof'd data will make it appear like it's still up. but for pure 'fake' hosts like it sounds like you have, then that's probably what you want.

cheers,
robin |
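The spoofing approach in this reply extends naturally to several appliances. The following is a minimal sketch, assuming readings have already been gathered (for example over SNMP) into plain dictionaries; it only prepares the argument tuples and does not import gmetric.py itself. The helper names spoof_string and build_send_args, and the shapes of the two input dictionaries, are assumptions made for this illustration.

```python
def spoof_string(ip, host):
    """Build the 'ip:host' spoof string used in the g.send() call above."""
    return ip + ":" + host


def build_send_args(appliances, readings, unit="C"):
    """Return one g.send() argument tuple per (appliance, metric) reading.

    `appliances` maps hostname -> IP and `readings` maps
    hostname -> {metric_name: float}; both shapes are assumptions
    made for this sketch.
    """
    calls = []
    for host, metrics in sorted(readings.items()):
        spoof = spoof_string(appliances[host], host)
        for name, value in sorted(metrics.items()):
            # Same positional order as the g.send() example above.
            calls.append(
                (name, "%.2f" % value, "float", unit, "both", 60, 0, "", spoof)
            )
    return calls
```

With the Gmetric object g from the example above, each tuple could then be passed on with g.send(*args), so that every appliance shows up under its own spoofed host name.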
From: Dmitry A. <fre...@gm...> - 2018-03-09 20:11:27
|
Hey all,

I am working on a python module for gmond. I am following examples here: https://github.com/ganglia/gmond_python_modules and the quick start guide here: https://github.com/ganglia/monitor-core/wiki/Ganglia-GMond-Python-Modules.

What I am trying to do specifically is SNMP poll a bunch of appliances for various metrics. These appliances can't run a gmond client or any other client. My thought process is to have a utility server that would run the gmond agent with my custom python module to SNMP poll the various appliances for the metrics that I want. I am struggling in the following two areas.

1. I'd like to be able to specify an array of dictionaries that describe the various remote hosts and SNMP OIDs I want to poll. I would prefer to do this in my module's pyconf file. I was hoping to do this using the param clauses in the pyconf, but they only seem to take a key/value pair. I would actually like to pass something like snmp_collection_set[], which would be an array of dictionaries defined by the user. This would let the module users just specify which hosts and OIDs they want to poll in the pyconf and make the module flexible. I can get around the key/value pair param limitation by passing a path to my own config file (like a yaml file) as a value and then having the actual module code deal with it. That seems a bit hacky. Is there a better / recommended way to handle this?

2. As I am collecting the metrics for the remote hosts on my utility hosts, the Ganglia website will show my utility host as the node name for all the metrics. That all makes sense since gmetad is polling the gmond on my utility host and the gmond on my utility host is storing the metrics. Is there a way to override the hostname for the specific metric I am collecting via SNMP? I would like the Ganglia cluster to have a node for each of the appliances I am polling via my SNMP module, with its metrics assigned to it. It seems like it should be theoretically possible since gmond can aggregate metrics from multiple hosts. I am just not sure how to get to this programmatically.

thanks in advance! |
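The workaround mentioned in point 1 (passing a file path as a param value) could look roughly like the sketch below. It uses JSON from the standard library rather than YAML to stay self-contained. The param name config_file, the JSON layout, and the poll_snmp placeholder are all assumptions for illustration; only the metric_init / metric_cleanup entry points and the descriptor keys follow the gmond Python module interface.

```python
import json

_targets = {}  # metric name -> {"host": ..., "oid": ...}


def poll_snmp(host, oid):
    """Placeholder for a real SNMP GET; returns a dummy value here."""
    return 0.0


def read_metric(name):
    """Callback invoked by gmond with the metric name."""
    target = _targets[name]
    return poll_snmp(target["host"], target["oid"])


def metric_init(params):
    """Build one descriptor per entry in the JSON file named by params."""
    # 'config_file' is a hypothetical param name set in the module's pyconf.
    with open(params["config_file"]) as handle:
        collection_set = json.load(handle)

    descriptors = []
    for entry in collection_set:
        name = "%s_%s" % (entry["host"], entry["metric"])
        _targets[name] = {"host": entry["host"], "oid": entry["oid"]}
        descriptors.append({
            "name": name,
            "call_back": read_metric,
            "time_max": 90,
            "value_type": "float",
            "units": entry.get("units", ""),
            "slope": "both",
            "format": "%f",
            "description": "SNMP %s on %s" % (entry["oid"], entry["host"]),
            "groups": "snmp",
        })
    return descriptors


def metric_cleanup():
    """Nothing to release in this sketch."""
    pass
```

This only covers point 1: metrics produced this way would still be reported under the utility host's name, which is the subject of point 2.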
From: Adrian C. <adr...@gm...> - 2017-12-07 14:09:52
|
Apologies for the format of the previous email. I just need to know when the Spring Boot framework is used in Ganglia 2.7.1 and what version is used in Ganglia 3.7.1, when it is used, and why. Thanks |
From: Manjunath K E <man...@ii...> - 2017-09-20 05:47:48
|
Hi All,

I have installed Ganglia (gmetad & gmond) on a RHEL Linux server. I have written Python modules to collect a few custom metrics and plot them. There are 183 custom metrics to be collected, and a separate script is written for each custom metric. But out of the 183 metrics, only 96 are getting plotted in the browser. I have set the following variables in the ".pyconf" file:

time_threshold = 180
collect_every = 600

Could anyone help me so that I can get graphs plotted for all 183 custom metrics?

Thanks & Regards, Manjunath K E +91 9620469651 |
From: Manjunath K E <man...@ii...> - 2017-09-20 05:30:38
|
Hello all, The default behavior of Ganglia when plotting a graph is to take the max and min values of the data and use them as the values for labeling the y-axis. But I would like to customize the y-axis labeling values, i.e. I always want to plot between 0 and 100. Any help will be appreciated. Thanks & Regards, Manjunath K E ke....@gm... |
From: <bre...@tx...> - 2017-09-13 16:51:48
|
I have run into an issue caused by the libtool configuration in the ganglia source build of 3.7.2 and am looking for some help.

My system has ganglia installed from the OS DVD in the default location. I also have a test build that I'm running from a separate location using the --prefix option passed to configure. In my test build, libtool is adding a -rpath option that prefers to find files in /usr/lib64 *before* looking in my install location, which is not what I need since I've changed libganglia. This is only a problem if you have a libganglia.so* in both locations, which unfortunately my system has.

Here is what a link looks like if you are curious:

/bin/sh ../libtool --tag=CC --mode=link gcc -std=gnu99 ... -lpthread
libtool: link: gcc -std=gnu99 ... -pthread -Wl,-rpath -Wl,/usr/lib64 -Wl,-rpath -Wl,/opt/ganglia_brent/lib64

I only saw one option about rpath in libtool, but modifying it didn't change things for the better. I attempted to dive into the code that actually constructed the rpath and quickly became confused.

I was able to work around the problem by setting LDFLAGS during the configure phase, but it led to the generation of the Makefiles taking multiple steps: once during the configure, and again during the first make. In case others are curious, here is what I did:

% env LDFLAGS=-Wl,-rpath,/opt/ganglia_brent/lib64 ./configure --with-gmetad --prefix=/opt/ganglia_brent
% make

With this I'm off and running with the correct libraries, but I'm hoping there is a better way to address this issue. Any thoughts?

thanks!
brent |
From: <sob...@la...> - 2017-05-18 12:13:51
|
Hello everybody,

I am a brand new guy in Ganglia. I have to monitor a cluster of 20 nodes plus a master node. This single cluster has three networks. More precisely, I have installed:

ganglia-gmond x86_64 3.7.2-2.el7
ganglia x86_64 3.7.2-2.el7
libconfuse x86_64 2.7-7.el7
ganglia-gmetad x86_64 3.7.2-2.el7
rrdtool x86_64 1.4.8-9.el7
ganglia-web-3.7.2

As I said, I have a physical cluster of 20 nodes plus a master node, with 1 Gb/s, 10 Gb/s and InfiniBand networks, so I have three switches in the same cluster. In gmetad on the master node I wrote the names of 4 nodes, corresponding to the IPs (class C) of the 1 Gb/s network, and the machine names for 10 Gb/s. I would like to have a different window in Gweb for each network, and/or a different color with a corresponding scale, in order to know whether the 1 Gb/s or the 10 Gb/s network is being used.

Thanks in advance. With best regards, Laurent LABATUT "Sobov34" |
From: Anders B. <an...@ec...> - 2017-03-30 21:36:42
|
Also, if the size of the XML payload is the biggest concern (rather than the sheer amount of XDR traffic) then gzip compression would be a good idea:

    gzip_output = yes

See https://www.quantcast.com/blog/quantcast-open-source-diaries-ganglia-gzip/ for some background.

Also might want to look into using rrdcached?
https://github.com/ganglia/monitor-core/wiki/Integrating-Ganglia-with-rrdcached

/Anders

On 2017-03-30 at 15:22, Vladimir Vuksan wrote:

> Clusters are logical grouping of like hosts. This can be e.g. per location (same data center), per app or per function (DB, web, etc.). It really depends how you are viewing your environment. There is no right or wrong way to group it.
>
> Vladimir
>
> On 03/30/2017 at 04:30 AM, Guo, Jason wrote:
>
> > Thanks Vladimir
> >
> > As you mentioned, FB had clusters with tens of thousands of nodes in a cluster.
> >
> > How they orchestrate these nodes? Here are some options in my mind
> > 1. All the nodes share a few centralized gmonds and all of them belong to a single cluster (the cluster concept in ganglia)
> > 2. All the nodes share a few centralized gmonds and each centralized gmond belong to different cluster, and there is a single gmetad which poll data from these centralized gmond
> > 3. There are multiple gmetad/grid and then orchestrate these grids with a centralized gmetad/grid
> >
> > Thanks & Best Regards,
> > Jason Guo
> >
> > From: Vladimir Vuksan <vl...@ve...>
> > Date: Wednesday, March 29, 2017 at 20:09
> > To: "Guo, Jason" <ju...@eb...>, "gan...@li..." <gan...@li...>
> > Subject: Re: [Ganglia-developers] Does Ganglia work well for a large-scale cluster
> >
> > > Hi Jason,
> > >
> > > it depends on the number of metrics and associated metadata in the cluster and how busy gmetad is overall. Also depends on your hardware. At one point FB had clusters with tens of thousands of nodes in a cluster.
> > >
> > > Try to keep your metrics lean ie. don't add any metric descriptions if you don't have to so to keep the XML payload small and it should be fine.
> > >
> > > Vladimir
> > >
> > > On 3/28/2017 at 10:19 PM, Guo, Jason wrote:
> > >
> > > > Hi,
> > > >
> > > > I’m writing this mail to discuss whether Ganglia works well for a large-scale cluster (more than 4000 nodes).
> > > >
> > > > As per Ganglia document, ganglia can scale to handle clusters with 2000 nodes. So many people have concern on using Ganglia for a 4000 nodes production cluster.
> > > > "It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes."
> > > >
> > > > If the cluster is large than 2000 nodes, say 4000 nodes, can Ganglia handle it properly?
> > > >
> > > > To verify this, I create a 5000 nodes ganglia cluster on top of Docker cluster (10 machine).
> > > > I put 500 nodes in a cluster, so there are 10 cluster. And these 10 clusters are in the same Grid.
> > > > For each gmond, I use a script to generate 30 customized metrics (with gmetric).
> > > >
> > > > Currently it works fine in the Docker based test environment.
> > > >
> > > > So, my question is whether Ganglia is suitable for 4000 nodes cluster? |
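On the gzip_output suggestion at the top of this message: XML dumps with many near-identical host and metric elements tend to compress well, which is why compression helps when payload size is the concern. The sketch below demonstrates that effect on a synthetic document. The element and attribute names are made up to look cluster-like and are not claimed to match gmetad's actual output, and the host and metric counts are arbitrary.

```python
import gzip


def synthetic_payload(hosts, metrics_per_host):
    """Build a repetitive XML document loosely shaped like a cluster dump."""
    parts = ['<CLUSTER NAME="demo">']
    for h in range(hosts):
        parts.append('<HOST NAME="node%04d">' % h)
        for m in range(metrics_per_host):
            parts.append(
                '<METRIC NAME="metric_%d" VAL="%d" TYPE="float" UNITS=""/>'
                % (m, h + m)
            )
        parts.append("</HOST>")
    parts.append("</CLUSTER>")
    return "\n".join(parts).encode("utf-8")


raw = synthetic_payload(hosts=500, metrics_per_host=30)
packed = gzip.compress(raw)

# Report sizes; the exact numbers depend on the synthetic data above.
print("raw bytes:  ", len(raw))
print("gzip bytes: ", len(packed))
print("ratio: %.1fx" % (len(raw) / len(packed)))
```

Real payloads carry more varied values and metadata, so the ratio seen in production would differ from whatever this toy prints.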
From: Anders B. <an...@ec...> - 2017-03-30 18:52:24
|
Hi!

Would there be any interest in trying to bring the "gexec" family back?

- gexec http://www.theether.org/gexec/
- authd http://www.theether.org/authd/
- pcp http://www.theether.org/pcp/
- libe http://www.theether.org/libe/

It used to be in the Subversion repo, but it has been out for "a while".
https://sourceforge.net/p/ganglia/code/HEAD/tree/trunk/gexec/

A while ago I made some minor updates, to import and to make it compile:
Git-SVN https://github.com/afbjorklund/gexec

It might need some updating, but n-ary TCP trees are still useful?
Currently it uses old 512 bit RSA keys and SHA-0 for the checksums.

Can we move this to the "ganglia" organization?
It's now licensed under the BSD license (not GPL)

/Anders |
From: Vladimir V. <vl...@ve...> - 2017-03-30 13:23:09
|
Clusters are logical grouping of like hosts. This can be e.g. per location (same data center), per app or per function (DB, web, etc.). It really depends how you are viewing your environment. There is no right or wrong way to group it.

Vladimir

On 03/30/2017 at 04:30 AM, Guo, Jason wrote:

> Thanks Vladimir
>
> As you mentioned, FB had clusters with tens of thousands of nodes in a cluster.
>
> How they orchestrate these nodes? Here are some options in my mind
> 1. All the nodes share a few centralized gmonds and all of them belong to a single cluster (the cluster concept in ganglia)
> 2. All the nodes share a few centralized gmonds and each centralized gmond belong to different cluster, and there is a single gmetad which poll data from these centralized gmond
> 3. There are multiple gmetad/grid and then orchestrate these grids with a centralized gmetad/grid
>
> Thanks & Best Regards,
> Jason Guo
>
> From: Vladimir Vuksan <vl...@ve...>
> Date: Wednesday, March 29, 2017 at 20:09
> To: "Guo, Jason" <ju...@eb...>, "gan...@li..." <gan...@li...>
> Subject: Re: [Ganglia-developers] Does Ganglia work well for a large-scale cluster
>
> > Hi Jason,
> >
> > it depends on the number of metrics and associated metadata in the cluster and how busy gmetad is overall. Also depends on your hardware. At one point FB had clusters with tens of thousands of nodes in a cluster.
> >
> > Try to keep your metrics lean ie. don't add any metric descriptions if you don't have to so to keep the XML payload small and it should be fine.
> >
> > Vladimir
> >
> > On 3/28/2017 at 10:19 PM, Guo, Jason wrote:
> >
> > > Hi,
> > >
> > > I’m writing this mail to discuss whether Ganglia works well for a large-scale cluster (more than 4000 nodes).
> > >
> > > As per Ganglia document, ganglia can scale to handle clusters with 2000 nodes. So many people have concern on using Ganglia for a 4000 nodes production cluster.
> > > "It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes."
> > >
> > > If the cluster is large than 2000 nodes, say 4000 nodes, can Ganglia handle it properly?
> > >
> > > To verify this, I create a 5000 nodes ganglia cluster on top of Docker cluster (10 machine).
> > > I put 500 nodes in a cluster, so there are 10 cluster. And these 10 clusters are in the same Grid.
> > > For each gmond, I use a script to generate 30 customized metrics (with gmetric).
> > >
> > > Currently it works fine in the Docker based test environment.
> > >
> > > So, my question is whether Ganglia is suitable for 4000 nodes cluster? |
From: Guo, J. <ju...@eb...> - 2017-03-30 08:30:46
|
Thanks, Vladimir.

As you mentioned, FB had clusters with tens of thousands of nodes in a cluster. How did they orchestrate these nodes? Here are some options I have in mind:

1. All the nodes share a few centralized gmonds, and all of them belong to a single cluster (the cluster concept in Ganglia).
2. All the nodes share a few centralized gmonds, each centralized gmond belongs to a different cluster, and a single gmetad polls data from these centralized gmonds.
3. There are multiple gmetads/grids, which are then orchestrated by a centralized gmetad/grid.

Thanks & Best Regards,
Jason Guo

From: Vladimir Vuksan <vl...@ve...>
Date: Wednesday, March 29, 2017 at 20:09
To: "Guo, Jason" <ju...@eb...>, "gan...@li..." <gan...@li...>
Subject: Re: [Ganglia-developers] Does Ganglia work well for a large-scale cluster

Hi Jason,

It depends on the number of metrics and associated metadata in the cluster and on how busy gmetad is overall. It also depends on your hardware. At one point FB had clusters with tens of thousands of nodes in a cluster. Try to keep your metrics lean, i.e. don't add any metric descriptions if you don't have to, so as to keep the XML payload small, and it should be fine.

Vladimir

_______________________________________________
Ganglia-developers mailing list
Gan...@li...
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
|
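Vladimir's advice about keeping the XML payload small can be checked with a quick measurement. The following is a minimal sketch, not part of Ganglia: `xml_stats` is a hypothetical helper that reads an XML dump on stdin and reports its approximate size and the number of HOST and METRIC elements. It assumes the dump comes from a gmond TCP accept channel (8649 is the usual default port, but host and port depend on your configuration) and that `nc` is available.

```shell
# Summarize a Ganglia XML dump read from stdin.
# Prints the approximate payload size in bytes and the number of
# <HOST ...> and <METRIC ...> elements it contains.
xml_stats() {
    LC_ALL=C awk '
        {
            bytes += length($0) + 1              # +1 for the newline
            hosts += gsub(/<HOST /, "&")         # gsub returns the match count
            metrics += gsub(/<METRIC /, "&")
        }
        END { printf "bytes=%d hosts=%d metrics=%d\n", bytes, hosts, metrics }
    '
}
```

A possible use is `nc localhost 8649 | xml_stats`, run before and after removing metric descriptions, to see how much the payload shrinks.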
From: Guo, J. <ju...@eb...> - 2017-03-29 02:54:23
|
Hi,

I'm writing this mail to discuss whether Ganglia works well for a large-scale cluster (more than 4000 nodes).

According to the Ganglia documentation, Ganglia can scale to handle clusters with 2000 nodes, so many people are concerned about using it for a 4000-node production cluster:

    It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.

If the cluster is larger than 2000 nodes, say 4000 nodes, can Ganglia handle it properly?

To verify this, I created a 5000-node Ganglia cluster on top of a Docker cluster (10 machines). I put 500 nodes in each cluster, so there are 10 clusters, and these 10 clusters are in the same grid. For each gmond, I use a script to generate 30 custom metrics (with gmetric).

Currently it works fine in the Docker-based test environment.

So, my question is whether Ganglia is suitable for a 4000-node cluster.

Thanks & Best Regards,
Jason Guo
|
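The metric-generating script itself is not shown in the message. A minimal sketch of what such a script could look like follows; the function name, metric names, values, and the default gmond.conf path are all assumptions chosen for illustration. With 5000 gmonds each publishing 30 metrics this way, the grid carries 150,000 custom metrics on top of the built-in ones.

```shell
# Publish N synthetic metrics through gmetric.
# GMETRIC is the command to run; set GMETRIC=echo for a dry run.
GMETRIC=${GMETRIC:-gmetric}

send_test_metrics() {
    count=${1:-30}                        # number of metrics to publish
    conf=${2:-/etc/ganglia/gmond.conf}    # assumed config location
    i=1
    while [ "$i" -le "$count" ]; do
        # Metric names and values are placeholders, not from the thread.
        "$GMETRIC" --conf="$conf" \
            --name="test_metric_$i" \
            --value="$((i * 10))" \
            --type=uint16 \
            --units=count \
            --tmax=60 --dmax=0
        i=$((i + 1))
    done
}
```

Calling `send_test_metrics 30` from cron or a loop inside each container would approximate the load described above.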
From: Dockendorf, T. <tdo...@os...> - 2017-03-21 16:45:20
|
A more detailed backtrace now that I have correct debug symbols installed:

[root@metrics ~]# gdb /usr/sbin/gmetad -c /var/spool/abrt/ccpp-2017-03-21-11\:17\:53-54229/coredump
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/gmetad...Reading symbols from /usr/lib/debug/usr/sbin/gmetad.debug...done.
done.
[New LWP 54250]
[New LWP 54229]
[New LWP 54238]
[New LWP 54235]
[New LWP 54234]
[New LWP 54236]
[New LWP 54230]
[New LWP 54231]
[New LWP 54232]
[New LWP 54237]
[New LWP 54233]
[New LWP 54240]
[New LWP 54241]
[New LWP 54239]
[New LWP 54242]
[New LWP 54249]
[New LWP 54246]
[New LWP 54248]
[New LWP 54247]
[New LWP 54243]
[New LWP 54245]
[New LWP 54244]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/gmetad -d 5'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f2c36eb2106 in hash_key (key=0x0, len=945, seed=0) at hash.c:182
182         seed ^= (uint64_t)*bp++;
Missing separate debuginfos, use: debuginfo-install apr-1.4.8-3.el7.x86_64 cairo-1.14.2-1.el7.x86_64 cyrus-sasl-lib-2.1.26-20.el7_2.x86_64 expat-2.1.0-10.el7_3.x86_64 fontconfig-2.10.95-10.el7.x86_64 freetype-2.4.11-12.el7.x86_64 glib2-2.46.2-4.el7.x86_64 glibc-2.17-157.el7_3.1.x86_64 graphite2-1.3.6-1.el7_2.x86_64 harfbuzz-0.9.36-1.el7.x86_64 libX11-1.6.3-3.el7.x86_64 libXau-1.0.8-2.1.el7.x86_64 libXdamage-1.1.4-4.1.el7.x86_64 libXext-1.3.3-3.el7.x86_64 libXfixes-5.0.1-2.1.el7.x86_64 libXrender-0.9.8-2.1.el7.x86_64 libXxf86vm-1.1.3-2.1.el7.x86_64 libconfuse-2.7-7.el7.x86_64 libdrm-2.4.67-3.el7.x86_64 libffi-3.0.13-18.el7.x86_64 libgcc-4.8.5-11.el7.x86_64 libmemcached-1.0.16-5.el7.x86_64 libpng-1.5.13-7.el7_2.x86_64 libselinux-2.5-6.el7.x86_64 libstdc++-4.8.5-11.el7.x86_64 libuuid-2.23.2-33.el7.x86_64 libxcb-1.11-4.el7.x86_64 libxml2-2.9.1-6.el7_2.3.x86_64 libxshmfence-1.2-1.el7.x86_64 mesa-libEGL-11.2.2-2.20160614.el7.x86_64 mesa-libGL-11.2.2-2.20160614.el7.x86_64 mesa-libgbm-11.2.2-2.20160614.el7.x86_64 mesa-libglapi-11.2.2-2.20160614.el7.x86_64 nss-softokn-freebl-3.16.2.3-14.4.el7.x86_64 pango-1.36.8-2.el7.x86_64 pcre-8.32-15.el7_2.1.x86_64 pixman-0.34.0-1.el7.x86_64 rrdtool-1.4.8-9.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  0x00007f2c36eb2106 in hash_key (key=0x0, len=945, seed=0) at hash.c:182
#1  0x00007f2c36eb2166 in hashval (key=0x7f2c080008b0, hash=0x7f2c09402890) at hash.c:195
#2  0x00007f2c36eb2725 in hash_delete (key=0x7f2c080008b0, hash=0x7f2c09402890) at hash.c:335
#3  0x00007f2c36eb2028 in hash_destroy (hash=0x7f2c09402890) at hash.c:145
#4  0x00007f2c37524062 in cleanup_source (key=0x7f2c0977a320, val=0x7f2c0977a360, arg=0x7f2c1fffebe0) at cleanup.c:170
#5  0x00007f2c36eb297d in hash_walkfrom (hash=0x7f2c38c74fe0, from=0, func=0x7f2c37523f02 <cleanup_source>, arg=0x7f2c1fffebe0) at hash.c:402
#6  0x00007f2c37524250 in cleanup_thread (arg=0x0) at cleanup.c:206
#7  0x00007f2c35436dc5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f2c34f4f73d in clone () from /lib64/libc.so.6

--
Trey Dockendorf
HPC Systems Engineer
Ohio Supercomputer Center
|
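Frame #0 in the backtrace above shows hash_key() being called with key=0x0 from the cleanup thread, and gmetad runs many threads, so backtraces from all threads and the local variables of the crashing frames may be useful additional information for a bug report. A minimal sketch of collecting them non-interactively follows; `core_backtrace` is a hypothetical helper name, and the choice of gdb commands is a suggestion rather than something requested in the thread.

```shell
# Dump backtraces from a core file without an interactive gdb session.
# GDB can be overridden (e.g. GDB=echo) to inspect the command line only.
GDB=${GDB:-gdb}

core_backtrace() {
    # $1 = path to the binary, $2 = path to the core file
    # "bt full" adds local variables for the crashing thread;
    # "thread apply all bt" prints a backtrace for every thread.
    "$GDB" -batch \
        -ex 'bt full' \
        -ex 'thread apply all bt' \
        "$1" "$2"
}
```

For example, `core_backtrace /usr/sbin/gmetad "$COREDUMP" > gmetad-bt.txt`, where `COREDUMP` is set to the core file saved by abrt.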
From: Dockendorf, T. <tdo...@os...> - 2017-03-21 16:25:11
|
I have been getting frequent segfaults, many times per day, with gmetad 3.7.2 on RHEL 7, installed from a rebuilt SRPM I pulled from Fedora. Below is the backtrace of a core generated by abrt-ccpp. Let me know what other information would be useful in getting this segfault fixed, or whether this is more appropriate for a GitHub issue.

(gdb) bt
#0  0x00007f2c36eb2106 in hash_key () from /lib64/libganglia.so.0
#1  0x00007f2c36eb2166 in hashval () from /lib64/libganglia.so.0
#2  0x00007f2c36eb2725 in hash_delete () from /lib64/libganglia.so.0
#3  0x00007f2c36eb2028 in hash_destroy () from /lib64/libganglia.so.0
#4  0x00007f2c37524062 in cleanup_source ()
#5  0x00007f2c36eb297d in hash_walkfrom () from /lib64/libganglia.so.0
#6  0x00007f2c37524250 in cleanup_thread ()
#7  0x00007f2c35436dc5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f2c34f4f73d in clone () from /lib64/libc.so.6

--
Trey Dockendorf
HPC Systems Engineer
Ohio Supercomputer Center
|
From: Lohit V. <lo...@gw...> - 2017-03-08 19:22:35
|
Hello all,

I am facing the following issues with gpfs.py. I have followed all the instructions in the README, as follows:

1. The nodes have modpython.so installed, and other Python modules work as expected.
2. The files gpfs.pyconf and gpfs.py are in their respective places.
3. I made sure a sudoers entry exists that gives the ganglia user NOPASSWD access to mmpmon.
4. gmond -m does show that it is able to see the gpfs metrics.
5. When I run gpfs.py like this: 'sudo -u ganglia python gpfs.py', I do get printed results and also the respective values every 15 minutes when I run I/O on the GPFS filesystems.

However, I see the following issues:

1. When I start GPFS, I see the following message:
   /usr/sbin/gmond[4190]: Unable to find any metric information for 'gpfs_(.+)'. Possible that a module has not been loaded.
2. The majority of the systems in the Ganglia web interface do not show that gpfs metrics exist, even though the nodes themselves list gpfs_* among their metrics in gmond -m.
3. Some nodes show that GPFS metrics exist but do not show any values. They are always -nan.

I have tried looking at the Python module to see if there is anything obviously wrong, but I don't see anything. I also ran tcpdump on the Ganglia port on a few of the client nodes to see whether they are sending any gpfs metrics. To my surprise, I don't see any gpfs-named metrics in the tcpdump output.

Could anyone help me out with this issue?

Thank you,
Lohit
|
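One way to narrow this down, besides tcpdump, is to ask gmond what it currently holds and filter for the gpfs_ prefix. The sketch below is a hypothetical helper, not part of Ganglia or the gpfs.py module. It assumes the XML dump has one element per line, which is how gmond normally emits it, and that the dump is fetched from a gmond TCP accept channel (commonly port 8649).

```shell
# List metrics whose name starts with a prefix, from a gmond XML dump on stdin.
# Output: one "host metric value" line per matching metric.
metrics_with_prefix() {
    awk -v prefix="$1" '
        /<HOST / {
            host = "?"
            if (match($0, /NAME="[^"]*"/))
                host = substr($0, RSTART + 6, RLENGTH - 7)
        }
        /<METRIC / {
            name = ""; val = ""
            if (match($0, /NAME="[^"]*"/))
                name = substr($0, RSTART + 6, RLENGTH - 7)
            if (match($0, / VAL="[^"]*"/))
                val = substr($0, RSTART + 6, RLENGTH - 7)
            if (index(name, prefix) == 1)     # name begins with the prefix
                print host, name, val
        }
    '
}
```

Running `nc localhost 8649 | metrics_with_prefix gpfs_` on a client node and again against the collecting gmond would show whether the metrics are missing at the source or lost on the way.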
From: Hilmi E. C. <hil...@gm...> - 2016-12-12 14:41:33
|
Hi guys,

I'm using Ganglia with the Hadoop plugin. Today I realised that the RRD files take up more than 300 GB just for logs, and that far too many files have been generated. I think this is related to the Hadoop metrics; before enabling them I didn't have any problem with disk space.

Has anybody else encountered this problem before?

Regards,
Egemen
|
From: Adrian S. <Adr...@ce...> - 2016-11-21 20:59:03
|
On 11/21/2016 12:39 AM, Hilmi Egemen Ciritoğlu wrote:
> Hi Adrian,

Hi!

> Thank you for your answer. I'm new to the Ganglia world, so I have one
> question that I would be happy if you could answer.
>
> I guess $UPS_OUT_LOAD1 is a variable. Is it predefined? You didn't
> mention anything about it.

Yes, that is something that I take from the UPS and PDU through SNMP (I have a Symmetra PX, but you can find the specific OIDs from any provider). See
https://github.com/adriansev/ISSMON/blob/master/get_ups_data
https://github.com/adriansev/ISSMON/blob/master/get_PDUtable
or, overall,
https://github.com/adriansev/ISSMON

> Also, I couldn't figure out what the variable $ups_name_low stands for.

I just changed the SNMP name of the device to all lowercase letters.

In principle the content of the repository is self-explanatory, so if you have questions, first read the documentation on SNMP (how to get information from devices and servers, what an OID is, etc.) and then on how to push metrics to Ganglia (gmetric).

On the Ganglia side, it may not be very clear, but I run separate gmonds for each type of service, which is not easy to do, as I had to rewrite the sysinit script to be more flexible w.r.t. the configuration location. (On my monitoring machine I run 5 aggregator gmonds.)

Adrian

--
----------------------------------------------
Adrian Sevcenco, Ph.D.                       |
Institute of Space Science - ISS, Romania    |
adrian.sevcenco at {cern.ch,spacescience.ro} |
----------------------------------------------
|
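Putting the two halves of this answer together (read a value over SNMP, then push it with gmetric), a minimal sketch could look like the following. It is not taken from the ISSMON repository: `push_ups_load` is a hypothetical function, the OID must be looked up for your device, and the SNMP community defaults to `public` only as an assumption. The gmetric options mirror the command quoted earlier in the thread.

```shell
# Read one value over SNMP and publish it with gmetric, spoofed as the UPS.
# SNMPGET and GMETRIC can be overridden with stubs for a dry run.
SNMPGET=${SNMPGET:-snmpget}
GMETRIC=${GMETRIC:-gmetric}

push_ups_load() {
    ups_ip=$1                 # address of the UPS
    ups_name=$2               # lowercase device name used for spoofing
    oid=$3                    # vendor-specific OID for the output load
    community=${4:-public}    # assumed SNMP v2c community

    # -Oqv makes snmpget print only the value, without OID or type.
    load=$("$SNMPGET" -v2c -c "$community" -Oqv "$ups_ip" "$oid") || return 1

    "$GMETRIC" -c /etc/ganglia_ups/gmond.conf \
        -S "$ups_ip:$ups_name" \
        -g "Output_PH1" -n "ups_ph1_load" -v "$load" \
        -t float -u VA -s both \
        --tmax=900 --dmax=0
}
```

A cron entry calling `push_ups_load "$IP" "$ups_name_low" "$UPS_LOAD_OID"` every few minutes would keep the metric fresh within the 900-second tmax; `UPS_LOAD_OID` here is a placeholder for the OID taken from the vendor MIB.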
From: Troy B. <tr...@os...> - 2016-11-21 02:30:26
|
On 11/20/2016 01:29 PM, Hilmi Egemen Ciritoğlu wrote:
> Do you have any idea how can I also collect power consumption ? Do you
> have any plugin for this situation ? Any help would be greatly
> appreciated :)

This depends largely on what sort of hardware you have. If you have hardware that can report power consumption metrics via SNMP or IPMI, there are likely plugins out there that can help you. For instance, my fork of the Ganglia Python plugins repo [1] includes a refactored IPMI plugin that can be used to measure power consumption (among other things) if your IPMI implementation supports it. We're currently using it to monitor power consumption, temperature, fan speeds, and some other metrics from IPMI on about 800 of our ~2,000 nodes.

	--Troy

[1] https://github.com/tabaer/gmond_python_modules
|
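For a quick check of whether a node's IPMI implementation reports power at all, before deploying a gmond plugin, a shell sketch along these lines may help. It is not the Python plugin from the repository above but a swapped-in alternative using ipmitool and gmetric. It assumes the BMC supports DCMI and that `ipmitool dcmi power reading` prints a line of the form "Instantaneous power reading: <N> Watts"; the function names and the metric name `node_power` are hypothetical.

```shell
# Extract the instantaneous power (in watts) from
# "ipmitool dcmi power reading" output supplied on stdin.
parse_dcmi_watts() {
    awk -F: '/Instantaneous power reading/ { split($2, f, " "); print f[1]; exit }'
}

# IPMITOOL and GMETRIC can be overridden with stubs for a dry run.
IPMITOOL=${IPMITOOL:-ipmitool}
GMETRIC=${GMETRIC:-gmetric}

push_node_power() {
    watts=$("$IPMITOOL" dcmi power reading | parse_dcmi_watts)
    [ -n "$watts" ] || return 1       # no reading: DCMI may be unsupported
    "$GMETRIC" -t uint16 -n node_power -v "$watts" -u Watts --tmax=60 --dmax=0
}
```

If `push_node_power` returns non-zero, the node probably does not expose a DCMI power reading, and SNMP or a vendor-specific sensor would be the next thing to try.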
From: Hilmi E. C. <hil...@gm...> - 2016-11-20 22:39:51
|
Hi Adrian,

Thank you for your answer. I'm new to the Ganglia world, so I have one question that I would be happy if you could answer.

I guess $UPS_OUT_LOAD1 is a variable. Is it predefined? You didn't mention anything about it.

Also, I couldn't figure out what the variable $ups_name_low stands for.

Regards,
Egemen
|
From: Adrian S. <Adr...@ce...> - 2016-11-20 20:45:48
|
On 11/20/2016 08:29 PM, Hilmi Egemen Ciritoğlu wrote:
> Hi all,

Hi!

> Do you have any idea how can I also collect power consumption ? Do you
> have any plugin for this situation ? Any help would be greatly
> appreciated :)

As far as I am aware, all power-related metrics are available only through SNMP (or other means). So you would need to use gmetric to push the metrics that you want into Ganglia, something like:

    $SEND -S $SPOOF -g "Output_PH1" -n "ups_ph1_load" -v $UPS_OUT_LOAD1 -t "float" -u "VA" -s "both" -D "Output Power - Phase1" -T "Output Power - PH1" $LIFETIME

where:

    CONF="/etc/ganglia_ups/gmond.conf"
    SEND="/usr/bin/gmetric -c $CONF"
    LIFETIME="--tmax=900 --dmax=0"
    SPOOF=$IP":"$ups_name_low

HTH,
Adrian
|