monami-users Mailing List for MonAMI - your friendly monitoring daemon (Page 2)
Status: Alpha
Brought to you by:
paulmillar
From: Paul M. <p.m...@ph...> - 2009-07-27 08:23:42
|
Hi Arnau,

On Thursday 23 July 2009 16:03:02 Arnau Bria wrote:
> first of all I'd like to thank Paul Millar and other contributors for
> your great job on this project. I've been looking for some maui/torque
> graphs and your project really fits our needs. Many thanks guys!

Happy to help!

[...]
> and i did something like:
> 1.-) Download files from ganglia
> http://monami.cvs.sourceforge.net/viewvc/monami/external/ganglia/ganglia/
> 2.-) patch php files
> 3.-) create host_extra_$fqdn.php file
> 4.-) modify rrd_version var

That all sounds OK.

> I did not download all files from ganglia directory, I only got the ones
> about torque/maui and those that seem generic (seems to me):
[...]
> So dpm/rgma/mysql are missing.

That should be fine: you shouldn't need them unless you wish to monitor a DPM, an RGMA or a MySQL instance.

> So now I see my pie charts but not the "accumulative ones". Let me
> explain with an example, if you look at
> http://monami.sourceforge.net/tutorial/ar01s06.html
>
> "torque plots" section, I can see pie ones but not the others.

This is a fairly common problem when developing these custom graphs. Here's the general process for discovering the cause.

I'm guessing that your web browser shows a broken image symbol for these graphs: when there is a problem generating the graph, PHP usually emits an error message. However, by this time, the MIME type for the content has already been sent, so the web browser will be expecting a PNG file (or SVG, or ...), so the error message is rendered as a broken image.

You need to switch on "debug" to see the command that is being run. Here's how:

 o Edit the file torque-graph.php and change the line:

       $rrdgraph->cmd_go( $cmd);

   to

       $rrdgraph->cmd_go( $cmd, 1);

 o Open the page with your browser and click on one of the torque graphs. This should show a "large" version of the graph without any HTML. This allows the browser to respect the MIME type (which is now text) so the command should be visible.

So, you should see a page containing the RRDTool command that would have been run. Sometimes the problem is obvious (if one knows how to drive RRDTool), but if not then try running this command (in a standard console) and see what error message RRDTool returns. Be careful to redirect the output somewhere (e.g., /tmp/rrdoutput) as RRDTool will emit graphical output (e.g., PNG) to stdout if everything is OK. (Most terminals don't take too kindly to having raw binary data on stdout.)

Since you also reported a "PHP Fatal error", it might be worth trying to fix that issue first before investigating the problem via this route.

> I see this error in http log:
>
> [Thu Jul 23 16:02:35 2009] [error] [client 192.168.10.38] PHP Fatal error:
> Call to undefined function clean_string() in
> /usr/share/ganglia/get_context.php on line 9, referer:
> http://ganglia-test.pic.es/ganglia/?m=load_one&r=hour&s=descending&c=Grid+Services&h=pbs02.pic.es&sh=1&hc=4
>
> Is it related?

It certainly doesn't look good. However, this sounds familiar: there's a problem with Ganglia's web front-end in that its dependencies are a little broken: the file get_context.php uses clean_string() without ensuring it has been defined.

The file multiple-graphs.php has a work-around for this, but looking at it again, this fix might be wrong. Near the beginning of multiple-graphs.php you should see a line:

    include_once "./functions.php";

This is currently after the line:

    include_once "./get_context.php";

Could you try moving the functions.php include line to somewhere before the get_context.php include line, reload the page and see if that stops PHP emitting the error message?

Cheers,

Paul. |
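As a rough illustration of the advice above about redirecting RRDTool's output, here is a minimal Python sketch that runs a command, sends its (possibly binary) stdout to a file and hands back whatever was written to stderr. The helper name run_capturing_stdout is made up for this example; the argument list would be the command shown on the debug page, split into words.

```python
import subprocess


def run_capturing_stdout(argv, output_path):
    """Run argv with stdout redirected to output_path.

    Returns (exit_code, stderr_text). Binary image data never reaches the
    terminal; only the error text does. (Hypothetical helper, not part of
    MonAMI or Ganglia.)
    """
    with open(output_path, "wb") as out:
        proc = subprocess.run(argv, stdout=out, stderr=subprocess.PIPE)
    return proc.returncode, proc.stderr.decode(errors="replace")
```

Called as run_capturing_stdout(cmd_words, "/tmp/rrdoutput"), a non-zero exit code together with the returned text should point at the option RRDTool is unhappy with, while a zero exit code means the image ended up in the output file.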
From: Arnau B. <arn...@pi...> - 2009-07-23 14:11:12
|
Hi all,

First of all I'd like to thank Paul Millar and other contributors for your great job on this project. I've been looking for some maui/torque graphs and your project really fits our needs. Many thanks guys!

So, I've followed "MonAMI by example" and I was trying to add torque/maui data into our ganglia server. All worked fine, so I have many graphs about the torque server now. But I wanted extra features, the ones from the "external" package that you mention in:

http://monami.sourceforge.net/tutorial/ar01s06.html

Section "Producing complex graphs".

From that section and this post:

http://www.mail-archive.com/gan...@li.../msg02977.html

I was able to download the install instructions

http://monami.cvs.sourceforge.net/viewvc/monami/external/ganglia/README.txt?revision=1.7&view=markup

and I did something like:

1.-) Download files from ganglia
     http://monami.cvs.sourceforge.net/viewvc/monami/external/ganglia/ganglia/
2.-) patch php files
3.-) create host_extra_$fqdn.php file
4.-) modify rrd_version var

I did not download all files from the ganglia directory, I only got the ones about torque/maui and those that seem generic (seems to me):

 4 -rw-r--r-- 1 root root  4058 Jul 23 15:29 maui-graph.php
 8 -rw-r--r-- 1 root root  6778 Jul 23 15:30 google.php
12 -rw-r--r-- 1 root root 10217 Jul 23 15:31 mg-frame-maui.php
16 -rw-r--r-- 1 root root 14780 Jul 23 15:31 mg-frame-torque.php
 4 -rw-r--r-- 1 root root  2824 Jul 23 15:32 mg-single-frame.php
 8 -rw-r--r-- 1 root root  5470 Jul 23 15:32 multiple-graphs.css
 8 -rw-r--r-- 1 root root  4980 Jul 23 15:33 torque-graph.php
24 -rw-r--r-- 1 root root 21798 Jul 23 15:34 multiple-graphs.php

So dpm/rgma/mysql are missing.

So now I see my pie charts but not the "accumulative ones". Let me explain with an example, if you look at

http://monami.sourceforge.net/tutorial/ar01s06.html

"torque plots" section, I can see pie ones but not the others. Pop-ups seem to work, ALL seems to work but those ones.

I see this error in http log:

[Thu Jul 23 16:02:35 2009] [error] [client 192.168.10.38] PHP Fatal error: Call to undefined function clean_string() in /usr/share/ganglia/get_context.php on line 9, referer: http://ganglia-test.pic.es/ganglia/?m=load_one&r=hour&s=descending&c=Grid+Services&h=pbs02.pic.es&sh=1&hc=4

Is it related? Any clue on what is happening in my site?

# rpm -qa|grep ganglia
ganglia-web-3.0.7-1.el5
ganglia-3.0.7-1.el5
ganglia-gmetad-3.0.7-1.el5
ganglia-pbs-1.3-2
ganglia-gmond-3.0.7-1.el5
ganglia-fs-1.16-2

# rpm -qa|grep rrd
rrdtool-1.2.27-3.el5
rrdtool-perl-1.2.27-3.el5

TIA,
Arnau |
From: Paul M. <p.m...@ph...> - 2009-06-02 07:35:19
|
Hi Stephen,

On Thursday 28 May 2009 18:09:26 Stephen Childs wrote:
> Stephen Childs wrote:
> > I think the static password I had is no longer valid. The latest plugin
> > from cvs detects the password OK but still doesn't seem to get data.
> Actually it all works now! (I had disabled ganglia while testing.)

Yes, the CVS maui plugin has support for scanning 64-bit ELF binaries, so you shouldn't need to hard-code the password: it should work out-of-the-box.

> Think we need to roll some new RPMs from CVS.

True; it's been far too long since the last release.

I think the remaining hurdle is to allow building the MonAMI RPMs on a system without the DPM libraries installed. On platforms that DPM doesn't support (a non-RHEL/SL distro) it should just skip building the DPM plugin (which the build system does) but unfortunately the spec file doesn't know this, so it fails due to the missing /usr/lib/monami/dpm.so file.

Hopefully this won't be too difficult to fix.

Cheers, |
From: Stephen C. <ch...@cs...> - 2009-05-28 16:09:33
|
Stephen Childs wrote: > I think the static password I had is no longer valid. The latest plugin > from cvs detects the password OK but still doesn't seem to get data. Actually it all works now! (I had disabled ganglia while testing.) Think we need to roll some new RPMs from CVS. Stephen -- Dr. Stephen Childs, Research Fellow, EGEE Project, phone: +353-1-8961797 Computer Architecture Group, email: Stephen.Childs @ cs.tcd.ie Trinity College Dublin, Ireland web: http://www.cs.tcd.ie/Stephen.Childs |
From: Stephen C. <ch...@cs...> - 2009-05-28 15:57:18
|
Stephen Childs wrote: > Looking further into my problems getting the maui plugin working with > recent maui, I see this in the maui logs: > > > > 05/28 15:33:38 INFO: message 'CK=a860d2ab83edd7b1 TS=1243521218 > AUTH=monami DT=CMD=diagnose AUTH=monami ARG=7 0 ALL [NONE] > ' read > 05/28 15:33:38 MSecGetChecksum(Buf,73,Checksum,DES,CSKey) > 05/28 15:33:38 ALERT: checksum does not match > (237d49f87075ea3e:a860d2ab83edd7b1) request 'TS=1243521218 AUTH=monami > DT=CMD=diagnose AUTH=monami ARG=7 0 ALL [NONE] I think the static password I had is no longer valid. The latest plugin from cvs detects the password OK but still doesn't seem to get data. Stephen -- Dr. Stephen Childs, Research Fellow, EGEE Project, phone: +353-1-8961797 Computer Architecture Group, email: Stephen.Childs @ cs.tcd.ie Trinity College Dublin, Ireland web: http://www.cs.tcd.ie/Stephen.Childs |
From: Stephen C. <ch...@cs...> - 2009-05-28 14:38:50
|
Looking further into my problems getting the maui plugin working with recent maui, I see this in the maui logs: 05/28 15:33:38 INFO: message 'CK=a860d2ab83edd7b1 TS=1243521218 AUTH=monami DT=CMD=diagnose AUTH=monami ARG=7 0 ALL [NONE] ' read 05/28 15:33:38 MSecGetChecksum(Buf,73,Checksum,DES,CSKey) 05/28 15:33:38 ALERT: checksum does not match (237d49f87075ea3e:a860d2ab83edd7b1) request 'TS=1243521218 AUTH=monami DT=CMD=diagnose AUTH=monami ARG=7 0 ALL [NONE] -- Dr. Stephen Childs, Research Fellow, EGEE Project, phone: +353-1-8961797 Computer Architecture Group, email: Stephen.Childs @ cs.tcd.ie Trinity College Dublin, Ireland web: http://www.cs.tcd.ie/Stephen.Childs |
From: Paul M. <p.m...@ph...> - 2009-05-25 23:41:01
|
Hi Stephen,

On Friday 22 May 2009 10:54:54 Stephen Childs wrote:
> 2009-05-22 09:27:16 dpm> Cannot add node "used" (under branch
> "/extern-0-0") as there is already a node with this name.

Yes, as you've explained it, this error message makes sense (at least, to me). The current tree structure assumes that nodes are (within the DPM instance) unique. When this assumption breaks down, it attempts to place metrics on top of existing metrics, which isn't allowed.

I think the underlying problem here is the datatree structure is broken.

> Would it be possible to prepend the server name to the FS so then
> server1:extern-0-0 and server2:extern-0-0 would be different?

Sure, this is possible. Although I was initially in favour of this, having thought about it some more, I would advocate a slightly different approach.

The current datatree structure is something like (if you forgive the ASCII art ;-)

  DPM
   |
   +-- "filesystems"
   |     |
   |     +-- filesystem1
   |     |     |
   |     |     (filesystem1 metrics; e.g., used)
   |     |
   |     +-- filesystem2
   |     |     |
   |     |     (filesystem2 metrics; e.g., used)
   |     |
   |     (the remaining filesystems)
   |
   (other DPM high-level concepts; e.g., "space")

As an alternative, I would propose restructuring the datatree so it's more like:

  DPM
   |
   +-- "servers"
   |     |
   |     +-- server1
   |     |     |
   |     |     +-- "filesystems"
   |     |     |     |
   |     |     |     +-- filesystem1 (of server1)
   |     |     |     |     |
   |     |     |     |     (filesystem1 metrics; used, etc...)
   |     |     |     |
   |     |     |     +-- filesystem2 (of server1)
   |     |     |     |     |
   |     |     |     |     (filesystem2 metrics; used, etc...)
   |     |     |     |
   |     |     |     (other filesystems on server1)
   |     |     |
   |     |     (other metrics for server1)
   |     |
   |     +-- server2
   |     |     |
   |     |     +-- "filesystems"
   |     |     |     |
   |     |     |     +-- filesystem1 (of server2)
   |     |     |     |     |
   |     |     |     |     (filesystem1 metrics; used, etc...)
   |     |     |     |
   |     |     |     +-- filesystem2 (of server2)
   |     |     |     |     |
   |     |     |     |     (filesystem2 metrics; used, etc...)
   |     |     |
   |     |     (other metrics for server2)
   |     |
   |     (metrics about other servers)
   |
   (other high-level concepts; e.g., "space")

This would avoid the need to create canonical names for filesystems based on the hosting computer. It also provides a natural place for other server-specific values.

Also, by expressing the metrics as above, one could select metrics for all filesystems hosted on a specific server (select=dpm.servers.server1.filesystems), which might be useful.

The user-interface, mg-frames in Ganglia specifically, would still need to display the information unambiguously. Unfortunately, in Ganglia, the tree structure is lost so one must "recreate it" (sort of). I think this would be simply passing the metric names through a suitable regular expression, something like:

    \.servers\.([^.]*)\.filesystems\.([^.]*)\.([^.]*)

and use the first two group matches to build the canonical name for the metric (third group). IIRC, there are examples of this already in the Torque.

Does this make sense?

Cheers,

Paul. |
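To make the regular-expression idea above concrete, here is a small Python sketch (the Ganglia front-end itself is PHP, so this only illustrates the matching logic). The function name canonical_name and the "server:filesystem" label format are assumptions for the example; the label format simply borrows the server1:extern-0-0 style suggested earlier in the thread.

```python
import re

# The expression proposed in the message above, unchanged.
METRIC_RE = re.compile(r"\.servers\.([^.]*)\.filesystems\.([^.]*)\.([^.]*)")


def canonical_name(metric_path):
    """Split a flattened metric path into (label, metric).

    Returns None when the path does not describe a per-server filesystem
    metric. The "server:filesystem" label is a hypothetical display format.
    """
    match = METRIC_RE.search(metric_path)
    if match is None:
        return None
    server, filesystem, metric = match.groups()
    return f"{server}:{filesystem}", metric


print(canonical_name("dpm.servers.server1.filesystems.extern-0-0.used"))
```

One caveat with this expression: because each group stops at the first dot, a server recorded by its fully qualified name (with dots in it) would not match, so the names stored in the tree would have to be dot-free or escaped somehow.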
From: Paul M. <p.m...@ph...> - 2009-05-25 23:13:49
|
Hi Stephen,

On Friday 22 May 2009 16:56:45 Stephen Childs wrote:
> I have been doing some work on the monami dpm plugin
> (http://monami.sourceforge.net/userguide/ch03s04.html#dpm).

Excellent!

> I have already fixed up the filesystems feature so that identically-named
> filesystems on different disk servers can be distinguished.
>
> and would like to add some new features:
> * list pools and display usage (absolute and proportion)
> * size of a given FS (would allow percentage used to be calculated)

These both sound useful additions.

> The monami dpm plugin currently does not use the DPM api, but rather
> queries the DB directly. It seems to me it would be better to start using
> the API:

I agree that using the DPM API is a better option.

(iirc) The plan was to implement the database queries as an initial implementation and, longer term, to implement the missing monitoring features in DPM API and migrate the plugin over.

Using the API is definitely better since it hides us from any schema change and, I believe, there's some state held in-memory (i.e., not maintained in the database) that would be useful to monitor.

Pretty much two years ago I started trying to move the DPM plugin over to using the DPM API. I found a few bugs in the process, the most serious being that if DPM happens not to be running then the DPM calls never return:

https://gus.fzk.de/ws/ticket_info.php?ticket=21731

Although the bug is marked "unsolved", Jean-Philippe said that Sophie had implemented a fix. I've not verified this, but I'm quietly optimistic that this is no longer a problem (with the right env. variables set).

> in particular, while usage can be calculated by summing file
> sizes, the capacity of a pool or FS seems hard to derive from the
> database, whereas dpm_getpoolfs etc. can report this.

True. IIRC, this is done by exec-ing df on the remote pool via some rfio exec call.

The really nice part is the process is extensible; for example, code up a 'cat /proc/loadavg' (parse the output) and you can monitor the load-average on all the pool nodes.

> Where can I find an up-to-date dpm-api.h? It doesn't seem to be installed
> on my DPM server -- is there a DPM-devel RPM or similar?

Later, Stephen Childs wrote:
> Doh, it's dpm_api.h and is at:
> /opt/lcg/include/dpm/dpm_api.h

True, but the nice way to do this is to "discover" the location using autoconf. That way, should someone install the library/header-files somewhere else, the build can be adjusted accordingly and can complete. Also, it would allow one to skip over building the DPM plugin if the headers aren't installed.

I don't know how familiar you are with autoconf. If not, then I should be able to help out here.

> And (for monami developers) is there any reason not to start using the DPM
> api in the monami plugin?

No, there's no reason not to start migrating over to using the API (eek, double negative!).

For the sake of completeness, here are the two things that are somewhat awkward about the move (I don't believe either are show-stoppers)

First, I'm not sure how thread-safe is the DPM library (or libraries). If it's not thread-safe then a MonAMI daemon will have to be limited to monitoring a single DPM instance. Not likely to be a major limitation, but one that I believe the current (db-querying) plugin doesn't suffer from.

Second, we'll need to make the spec file somehow conditional on whether the platform has DPM installed or not. Perhaps the nicest approach here would be to make the spec file (which is built as part of the configure process) include or exclude the packaged plugins, depending on whether the necessary libraries are found. I think this would be good for the other plugins, too.

So, best of luck with moving DPM over to using the DPM API. I'll do what I can to help.

Cheers,

Paul. |
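The "cat /proc/loadavg (parse the output)" idea mentioned above is simple to sketch. The Python below assumes the usual Linux layout of that file: three load averages, a running/total task pair and the most recent PID. How the text would actually be fetched from a pool node (the rfio exec call) is not shown, and the names LoadAvg and parse_loadavg are invented for the example.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one_min: float
    five_min: float
    fifteen_min: float
    running: int    # currently runnable scheduling entities
    total: int      # total scheduling entities
    last_pid: int   # PID of the most recently created process


def parse_loadavg(text):
    """Parse one line in /proc/loadavg format into a LoadAvg tuple."""
    fields = text.split()
    if len(fields) != 5:
        raise ValueError(f"unexpected /proc/loadavg content: {text!r}")
    running, total = fields[3].split("/")
    return LoadAvg(float(fields[0]), float(fields[1]), float(fields[2]),
                   int(running), int(total), int(fields[4]))
```

A plugin built this way would presumably publish the three averages as separate metrics per pool node, which fits the per-server branch of the datatree discussed elsewhere in this thread.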
From: Stephen C. <ch...@cs...> - 2009-05-22 14:57:49
|
Stephen Childs wrote: > Where can I find an up-to-date dpm-api.h? It doesn't seem to be > installed on my DPM server -- is there a DPM-devel RPM or similar? Doh, it's dpm_api.h and is at: /opt/lcg/include/dpm/dpm_api.h -- Dr. Stephen Childs, Research Fellow, EGEE Project, phone: +353-1-8961797 Computer Architecture Group, email: Stephen.Childs @ cs.tcd.ie Trinity College Dublin, Ireland web: http://www.cs.tcd.ie/Stephen.Childs |
From: Stephen C. <ch...@cs...> - 2009-05-22 14:56:54
|
I have been doing some work on the monami dpm plugin (http://monami.sourceforge.net/userguide/ch03s04.html#dpm).

I have already fixed up the filesystems feature so that identically-named filesystems on different disk servers can be distinguished, and would like to add some new features:

 * list pools and display usage (absolute and proportion)
 * size of a given FS (would allow percentage used to be calculated)

The monami dpm plugin currently does not use the DPM api, but rather queries the DB directly. It seems to me it would be better to start using the API: in particular, while usage can be calculated by summing file sizes, the capacity of a pool or FS seems hard to derive from the database, whereas dpm_getpoolfs etc. can report this.

Where can I find an up-to-date dpm-api.h? It doesn't seem to be installed on my DPM server -- is there a DPM-devel RPM or similar?

And (for monami developers) is there any reason not to start using the DPM api in the monami plugin?

Stephen

--
Dr. Stephen Childs,
Research Fellow, EGEE Project,    phone: +353-1-8961797
Computer Architecture Group,      email: Stephen.Childs @ cs.tcd.ie
Trinity College Dublin, Ireland   web: http://www.cs.tcd.ie/Stephen.Childs |
From: Stephen C. <ch...@cs...> - 2009-05-22 14:05:08
|
I have written a patch that implements this functionality (attached). Filesystems are now nicely distinguished as they are named as follows: server1./fsname server2./fsname I also added a little change to ganglia's pie.php which causes the legend to have more space if the overall pie size is increased. When combined with increasing the pie size in multiple-graphs.php (say to $pie_width = 450;) this allows long strings such as "atlas_Role_production" to be represented better in the pie chart. (see attached image) Paul, it would be great if you could have a look at these changes and check them in. Stephen -- Dr. Stephen Childs, Research Fellow, EGEE Project, phone: +353-1-8961797 Computer Architecture Group, email: Stephen.Childs @ cs.tcd.ie Trinity College Dublin, Ireland web: http://www.cs.tcd.ie/Stephen.Childs |
From: Stephen C. <ch...@cs...> - 2009-05-22 08:55:06
|
Hi, We have a consistent naming scheme for the FS on our different DPM disk servers, so each server has FS with the same name. The monami DPM plugin doesn't seem to incorporate the server name into the ID of the FS and so the FS names aren't unique resulting in errors like this in the log: 2009-05-22 09:27:16 dpm> Cannot add node "used" (under branch "/extern-0-0") as there is already a node with this name. Would it be possible to prepend the server name to the FS so then server1:extern-0-0 and server2:extern-0-0 would be different? I've started looking at the code, but any pointers as to where to fix this would be great! Stephen -- Dr. Stephen Childs, Research Fellow, EGEE Project, phone: +353-1-8961797 Computer Architecture Group, email: Stephen.Childs @ cs.tcd.ie Trinity College Dublin, Ireland web: http://www.cs.tcd.ie/Stephen.Childs |
From: Paul M. <p.m...@ph...> - 2009-03-09 16:41:39
|
On Monday 09 March 2009 13:27:47 Stephen Childs wrote:
> When I run "diagnose -f" I see output for USER, GROUP, ACCT, CLASS, so I'm
> not sure I have the same problem.

True. This sounds like the problem has a different cause.

> I don't get any Maui-related output at all in the ganglia stream.

Could you send me a sample copy of the output from "diagnose -f", plus the info-level output from MonAMI (e.g., running ./monamid -fv with the default configuration for logging)?

Also, (just to double-check) could you try setting up a dataflow that samples every minute (say) saving the output to a snapshot target, leave monamid running for a couple of minutes and check whether anything appears in the snapshot output file?

Cheers,

Paul. |
From: Steve T. <ste...@ce...> - 2009-03-09 12:38:04
|
On Mon, Mar 9, 2009 at 1:27 PM, Stephen Childs <ch...@cs...> wrote:
> Paul Millar wrote:
>>> I notice that even at Glasgow
>>> (http://svr031.gla.scotgrid.ac.uk/ganglia/?m=load_one&r=day&s=descending&c=Grid+Servers&h=svr016.gla.scotgrid.ac.uk&sh=1&hc=4)
>>> the torque monitoring is working but the maui monitoring isn't.
>>
>> True.
>>
>> The problem here is with a bug in Maui. The "diagnose -f" [*] will only
>> send a finite amount of information (1k's worth, if memory serves) and the
>> remaining is simply not sent, so truncating the output.
>
> When I run "diagnose -f" I see output for USER, GROUP, ACCT, CLASS, so I'm
> not sure I have the same problem. I don't get any Maui-related output at all
> in the ganglia stream.
>
>> The first lot of information returned is the USER block, detailing the
>> fairshares details for each torque user. With the number of VOs that
>> ScotGrid supports, this exceeds the 1k limit, so no diagnoses information
>> can be received.
>>
>> I believe Steve Traylen (wave!) has a fix for this. I don't believe the
>> Glasgow site-admins have deployed it, though.
>
> Steve, which version number should I be looking at to get this fix?

The latest one of course. There is one in certification, which I need to reject. I know no one is working on it at the moment.

http://eticssoft.web.cern.ch/eticssoft/repository/torquemaui/maui/3.2.6p21-3/slc4_ia32_gcc346/

Sorry, there is no 64-bit building, pending some nonsense in ETICS that needs fixing.

    rpmbuild --rebuild --with key 230774 maui*src.rpm

should create it for you though.

Steve

> Stephen
>
> --
> Dr. Stephen Childs,
> Research Fellow, EGEE Project,    phone: +353-1-8961797
> Computer Architecture Group,      email: Stephen.Childs @ cs.tcd.ie
> Trinity College Dublin, Ireland   web: http://www.cs.tcd.ie/Stephen.Childs
>

--
Steve Traylen |
From: Stephen C. <ch...@cs...> - 2009-03-09 12:28:06
|
Paul Millar wrote: >> I notice that even at Glasgow >> (http://svr031.gla.scotgrid.ac.uk/ganglia/?m=load_one&r=day&s=descending&c= >> Grid+Servers&h=svr016.gla.scotgrid.ac.uk&sh=1&hc=4) the torque monitoring is >> working but the maui monitoring isn't. > > True. > > The problem here is with a bug in Maui. The "diagnose -f" [*] will only send > a finite amount of information (1k's worth, if memory serves) and the > remaining is simply not sent, so truncating the output. When I run "diagnose -f" I see output for USER, GROUP, ACCT, CLASS, so I'm not sure I have the same problem. I don't get any Maui-related output at all in the ganglia stream. > The first lot of information returned is the USER block, detailing the > fairshares details for each torque user. With the number of VOs that > ScotGrid supports, this exceeds the 1k limit, so no diagnoses information can > be received. > > I believe Steve Traylen (wave!) has a fix for this. I don't believe the > Glasgow site-admins have deployed it, though. Steve, which version number should I be looking at to get this fix? Stephen -- Dr. Stephen Childs, Research Fellow, EGEE Project, phone: +353-1-8961797 Computer Architecture Group, email: Stephen.Childs @ cs.tcd.ie Trinity College Dublin, Ireland web: http://www.cs.tcd.ie/Stephen.Childs |
From: Andrew E. <and...@gm...> - 2009-03-06 21:12:41
|
> I believe Steve Traylen (wave!) has a fix for this. it's available from http://eticssoft.web.cern.ch/eticssoft/repository/torquemaui/ Andrew |
From: Paul M. <p.m...@ph...> - 2009-03-06 19:37:14
|
Hi Stephen,

On Friday 06 March 2009 12:01:25 Stephen Childs wrote:
> Has anyone got the maui monitoring working with recent releases as used in
> gLite (we have maui-3.2.6p20-snap.1212617145.12.slc4.x86_64).

I've fairly recently added support in the maui plugin to scan AMD-64 binaries for the magic maui password. The code is in CVS now, so will be part of the next release.

> I notice that even at Glasgow
> (http://svr031.gla.scotgrid.ac.uk/ganglia/?m=load_one&r=day&s=descending&c=Grid+Servers&h=svr016.gla.scotgrid.ac.uk&sh=1&hc=4)
> the torque monitoring is working but the maui monitoring isn't.

True.

The problem here is with a bug in Maui. The "diagnose -f" [*] will only send a finite amount of information (1k's worth, if memory serves) and the remaining is simply not sent, so truncating the output.

The first lot of information returned is the USER block, detailing the fairshares details for each torque user. With the number of VOs that ScotGrid supports, this exceeds the 1k limit, so no diagnoses information can be received.

I believe Steve Traylen (wave!) has a fix for this. I don't believe the Glasgow site-admins have deployed it, though.

[*] the maui plugin *doesn't* use the maui diagnose command, but it talks to the maui server in the same way, so suffers from the same server-side bugs.

> Steve, did you by any chance change the top-secret password in recent maui
> builds? (Although I can't see any errors in the monami logs -- it just
> doesn't work!)

:-(

My guess is you're suffering from the same bug: a 1k limit on what diagnose -f returns.

Cheers,

Paul. |
From: Stephen C. <ch...@cs...> - 2009-03-06 11:01:27
|
Has anyone got the maui monitoring working with recent releases as used in gLite (we have maui-3.2.6p20-snap.1212617145.12.slc4.x86_64). I notice that even at Glasgow (http://svr031.gla.scotgrid.ac.uk/ganglia/?m=load_one&r=day&s=descending&c=Grid+Servers&h=svr016.gla.scotgrid.ac.uk&sh=1&hc=4) the torque monitoring is working but the maui monitoring isn't. Steve, did you by any chance change the top-secret password in recent maui builds? (Although I can't see any errors in the monami logs -- it just doesn't work!) Stephen -- Dr. Stephen Childs, Research Fellow, EGEE Project, phone: +353-1-8961797 Computer Architecture Group, email: Stephen.Childs @ cs.tcd.ie Trinity College Dublin, Ireland web: http://www.cs.tcd.ie/Stephen.Childs |
From: Paul M. <p.m...@ph...> - 2008-12-18 13:52:33
|
Dear all, As I'm sure you're all aware, spam is a big problem for anyone dealing with email. The MonAMI mailing lists are currently attracting a modest amount spam; on average, each list (announce, devel and users) receives several spam emails per day. The previous configuration places all non-list-member posts on hold, allowing the list admin (i.e., me) to allow non-spam emails. Over the last year, I believe there have been no such emails and, over the same time, there's not been a day without at least one list requiring intervention. Now, all lists are receiving several spam emails per day. So, reluctantly, I've reconfigured the lists to silently drop all incoming emails from people not registered on the list. If you're registered (i.e., if you're reading this!) then you shouldn't be affected. However, If you have more than one email address and send an email from a different address then your post will be silently dropped. If anyone experiences any problems, please feel free to email the -owner address (e.g., mon...@li...) or email me directly. Cheers, Paul. |
From: Paul M. <p.m...@ph...> - 2008-12-17 09:15:44
|
Hi Ido,

On Sunday 14 December 2008 08:39:56 Ido Gan wrote:
> Great!! I installed TrueType fonts on my system and I referred the
> *$font_ttf_default* to one of them, as you suggested. Then, I changed
> this bit of code in multiple-graphs.php:
>
>     case "google":
>         $height = 50;
>         $width = $this->rrd_google_width;
>         $title_option = "";
>         *$fonts .= "--font DEFAULT:6: --font TITLE:8: ";*
>         break;
>
> to this:
>
>     case "google":
>         $height = 50;
>         $width = $this->rrd_google_width;
>         $title_option = "";
>         *$fonts = $font_ttf_default;*
>         break;

You shouldn't have to do this: the $fonts variable is set earlier in cmd_start(), when choosing the output format (SVG vs PNG vs...). By hard-coding the TTF font, you may have upset the font settings for SVG and PDF output.

That said, under normal circumstances "google"-sized output is only used with PNG output; so you'll probably not notice the difference.

> I was able to tweak the width of the google gadget as well,
>
>     /* Define certain sizes so HTML agrees with the RRDTool generated files */
>     var $rrd_google_width = 150;

Good.

> I appreciate your help very much, your application will be of great service
> to our cluster users at the Life Sciences department at Tel Aviv
> University!

Best of luck!

BTW, what do you use as a scheduler? If you use maui then there's similar support (MonAMI + mg-graph) for that, too.

Cheers,

Paul.

PS. The web front-end seems to be lacking the support for the "show all graphs" option (near the top of page). I suspect one of the patches against ganglia didn't get applied. |
From: Ido G. <id...@ta...> - 2008-12-14 07:40:09
|
Hi Paul,

Great!! I installed TrueType fonts on my system and I referred the *$font_ttf_default* to one of them, as you suggested. Then, I changed this bit of code in multiple-graphs.php:

    case "google":
        $height = 50;
        $width = $this->rrd_google_width;
        $title_option = "";
        *$fonts .= "--font DEFAULT:6: --font TITLE:8: ";*
        break;

to this:

    case "google":
        $height = 50;
        $width = $this->rrd_google_width;
        $title_option = "";
        *$fonts = $font_ttf_default;*
        break;

I was able to tweak the width of the google gadget as well,

    /* Define certain sizes so HTML agrees with the RRDTool generated files */
    var $rrd_google_width = 150;

I appreciate your help very much, your application will be of great service to our cluster users at the Life Sciences department at Tel Aviv University!

Ido

On Fri, Dec 12, 2008 at 12:50 AM, Paul Millar <p.m...@ph...> wrote:
> Hi Ido,
>
> On Wednesday 10 December 2008 03:22:01 Ido Gan wrote:
> > How can the size of the grid inside the widget be diminished? It's too
> > large and takes up the space of the x,y axes, and the rest of the info that
> > shows up on the original Monami generated graph. In other words: how can I
> > make the iGoogle widget look exactly like the original graph? I tried
> > tweaking with the height and width settings in multiple-graphs.php but it
> > didn't help.
> >
> > http://biocluster.tau.ac.il/ganglia/?r=week&g=2&c=biocluster&h=biocluster.tau.ac.il
>
> OK, at the risk of telling you stuff you already know, what's supposed to
> happen is the Gadget has a (deliberately) shrunk-down version of the image.
> This image is "google" size ("[...]&z=google&[...]" in the URI), which should
> be 252×116 (and not look too bad). The graphs on the host-view page
> are "medium" size.
>
> The following is an example graph in "google" size:
>
> http://svr031.gla.scotgrid.ac.uk/ganglia/torque-graph.php?z=google&g=3&c=Grid Servers&h=svr016.gla.scotgrid.ac.uk
>
> You seem to have two problems: first, the smaller, google-size images have no
> readable text. Choosing a graphs at random:
>
> http://biocluster.tau.ac.il/ganglia/torque-graph.php?r=week&z=google&g=2&c=biocluster&h=biocluster.tau.ac.il
>
> This might be because of the fonts rrdtool is using. I suggest having a look
> at what fonts you have installed; for example, are you using TrueType fonts?
> (see $font_ttf_default in multiple-graphs.php)
>
> The second problem is that you want bigger graphs! There's some support for
> this: the overall height of the gadget can be set using the "he" parameter of
> the google.php script (see comments within google.php and build_gadget_url()
> inside multiple-graphs.php).
>
> It's been a while since I worked on the gadget code, but I believe the width
> of the gadget is not configurable. Instead, it is resized automatically
> based on the browser's width. There's also a new "maximise" feature of
> iGoogle that should show a single gadget as large as possible. Neither of
> these will change the size of the graph. It either fits inside the gadget
> (centred) or it doesn't.
>
> HTH,
>
> Paul.
>

--
Ido Gan
IT Unit
Life Sciences Faculty
Sherman Bldg. room 03
Tel-Aviv University
Israel
web: http://www.tau.ac.il/lifesci/computer
PH: +972-3-6407841 ex. 2 |
From: Paul M. <p.m...@ph...> - 2008-12-11 22:56:38
|
Hi Stephen,

On Thursday 27 November 2008 10:21:02 Stephen Childs wrote:
> As for the MAUI password, I just got it from the person who built the MAUI
> RPMs!

Excellent.

The good news is I've added some extra code to CVS for parsing 32-bit and 64-bit binaries. This can extract the Maui password automatically from either type of binary. (The 64-bit binaries store the password in a different class of ELF section, so it took a bit longer to hunt it down.)

The next release will include this.

Cheers,

Paul.
|
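As background to the 32-bit versus 64-bit distinction mentioned above: an ELF file announces its word size in the fifth byte of its header (EI_CLASS: 1 for 32-bit, 2 for 64-bit). The following Python sketch only shows that check; it is not MonAMI's parsing code (which the message says lives in CVS), it does not locate any password, and the function names are made up for illustration.

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4                   # index of the class byte within e_ident
ELF_CLASSES = {1: 32, 2: 64}   # ELFCLASS32, ELFCLASS64


def elf_word_size(header: bytes) -> int:
    """Return 32 or 64 for the start of an ELF file, else raise ValueError."""
    if len(header) <= EI_CLASS or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    try:
        return ELF_CLASSES[header[EI_CLASS]]
    except KeyError:
        raise ValueError("unknown ELF class") from None


def elf_word_size_of_file(path: str) -> int:
    """Read the 16-byte e_ident block from a file and classify it."""
    with open(path, "rb") as f:
        return elf_word_size(f.read(16))
```

A parser would typically branch on this value first, because the section header layout differs between the two classes.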
From: Paul M. <p.m...@ph...> - 2008-12-11 22:52:23
|
Hi Ido,

On Wednesday 10 December 2008 03:22:01 Ido Gan wrote:
> How can the size of the grid inside the widget be diminished? It's too
> large and takes up the space of the x,y axes, and the rest of the info that
> shows up on the original Monami generated graph. In other words: how can I
> make the iGoogle widget look exactly like the original graph? I tried
> tweaking with the height and width settings in multiple-graphs.php but it
> didn't help.
>
> http://biocluster.tau.ac.il/ganglia/?r=week&g=2&c=biocluster&h=biocluster.tau.ac.il

OK, at the risk of telling you stuff you already know, what's supposed to happen is the Gadget has a (deliberately) shrunk-down version of the image. This image is "google" size ("[...]&z=google&[...]" in the URI), which should be 252×116 (and not look too bad). The graphs on the host-view page are "medium" size.

The following is an example graph in "google" size:

http://svr031.gla.scotgrid.ac.uk/ganglia/torque-graph.php?z=google&g=3&c=Grid Servers&h=svr016.gla.scotgrid.ac.uk

You seem to have two problems: first, the smaller, google-size images have no readable text. Choosing a graph at random:

http://biocluster.tau.ac.il/ganglia/torque-graph.php?r=week&z=google&g=2&c=biocluster&h=biocluster.tau.ac.il

This might be because of the fonts rrdtool is using. I suggest having a look at what fonts you have installed; for example, are you using TrueType fonts? (see $font_ttf_default in multiple-graphs.php)

The second problem is that you want bigger graphs! There's some support for this: the overall height of the gadget can be set using the "he" parameter of the google.php script (see comments within google.php and build_gadget_url() inside multiple-graphs.php).

It's been a while since I worked on the gadget code, but I believe the width of the gadget is not configurable. Instead, it is resized automatically based on the browser's width. There's also a new "maximise" feature of iGoogle that should show a single gadget as large as possible. Neither of these will change the size of the graph. It either fits inside the gadget (centred) or it doesn't.

HTH,

Paul.
|
From: Ido G. <id...@ta...> - 2008-12-10 02:22:13
|
How can the size of the grid inside the widget be diminished? It's too large and takes up the space of the x,y axes, and the rest of the info that shows up on the original Monami generated graph. In other words: how can I make the iGoogle widget look exactly like the original graph? I tried tweaking with the height and width settings in multiple-graphs.php but it didn't help.

http://biocluster.tau.ac.il/ganglia/?r=week&g=2&c=biocluster&h=biocluster.tau.ac.il

Thanks!

Ido
|
From: Stephen C. <ch...@cs...> - 2008-11-27 12:02:46
|
Actually it doesn't seem to be just the torque plugin at all. I have disabled torque in the configuration and I still get the problem. However, when I disable both torque and maui, the problem goes away. For example, I can publish info from the tcp plugin to ganglia without the high CPU load.

When I start monami using strace -f the bulk of the time is spent on stuff like this:

    [pid 5738] gettimeofday({1227786967, 716209}, NULL) = 0
    [pid 5738] gettimeofday({1227786967, 716339}, NULL) = 0
    [pid 5738] poll([{fd=7, events=POLLIN, revents=POLLIN}], 1, -1) = 1
    [pid 5738] read(7, "", 9) = 0

Stephen

--
Dr. Stephen Childs,
Research Fellow, EGEE Project,    phone: +353-1-8961797
Computer Architecture Group,      email: Stephen.Childs @ cs.tcd.ie
Trinity College Dublin, Ireland   web: http://www.cs.tcd.ie/Stephen.Childs
|
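One possible reading of the trace above, which this message does not confirm: poll() reports fd 7 as readable, yet read() returns 0 bytes, which on a stream descriptor normally means end-of-file. If a loop keeps polling such a descriptor without closing it, poll() returns immediately every time and the process spins. The Python sketch below reproduces that pattern with a local socket pair; it says nothing about what fd 7 actually is in MonAMI, and the function name is made up for illustration.

```python
import select
import socket


def count_empty_reads(max_iterations=5):
    """Count zero-byte reads from a socket whose peer has already closed."""
    reader, writer = socket.socketpair()
    writer.close()  # peer goes away: reader is now at end-of-file

    poller = select.poll()
    poller.register(reader, select.POLLIN)

    empty_reads = 0
    for _ in range(max_iterations):
        # A 1 s timeout keeps the sketch from hanging; the trace used -1 (forever).
        if not poller.poll(1000):
            break
        if reader.recv(9) == b"":  # zero-byte read, as in the trace
            empty_reads += 1
    reader.close()
    return empty_reads
```

Every iteration completes without waiting, so a real loop written this way would consume a full CPU until the descriptor is closed or unregistered after the first empty read.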