Home

Matt Massie

Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters, and it supports clusters of up to 2000 nodes.

Ganglia can scale to handle clusters with thousands of nodes



Discussion

  • char tao

    char tao - 2013-03-08

    cygwin 1.7 + ganglia 3.5

    gmond.c:160: error: parse error before '*' token
    gmond.c:160: warning: type defaults to `int' in declaration of `hosts_mutex'
    gmond.c:160: warning: data definition has no type or storage class
    gmond.c: In function `Ganglia_host_get':
    gmond.c:1029: warning: implicit declaration of function `apr_thread_mutex_create'
    gmond.c:1029: error: `APR_THREAD_MUTEX_DEFAULT' undeclared (first use in this function)
    gmond.c:1029: error: (Each undeclared identifier is reported only once
    gmond.c:1029: error: for each function it appears in.)
    gmond.c:1055: warning: implicit declaration of function `apr_thread_mutex_lock'
    gmond.c:1057: warning: implicit declaration of function `apr_thread_mutex_unlock'
    gmond.c: In function `tcp_listener':
    gmond.c:3056: warning: implicit declaration of function `apr_thread_exit'
    gmond.c: In function `main':
    gmond.c:3174: error: `APR_THREAD_MUTEX_DEFAULT' undeclared (first use in this function)

     
    Last edit: char tao 2013-03-08
  • Ayman

    Ayman - 2013-11-21

    I am using Ganglia to monitor a large infrastructure with more than 300 nodes, but the central machine, which collects data from those nodes through gmetad, has a very high CPU load due to heavy I/O operations. I tried putting the RRD files on a ramdisk; that helped, but the load is still about 9. If anyone has a solution for this, please help me, as it is causing me a lot of worry.

    Thanks

     
  • char tao

    char tao - 2013-11-26

    rrdcached is a daemon that receives updates to existing RRD files,
    accumulates them and, if enough have been received or a defined time has
    passed, writes the updates to the RRD file. A flush command may be used to
    force writing of values to disk, so that graphing facilities and similar
    can work with up-to-date data.
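
The batching idea described above can be illustrated with a toy sketch. This is not rrdcached's code or API: the class `UpdateBatcher`, its parameters, and the `write_fn` callback are all hypothetical names chosen for illustration, and the age check here only runs when a new update arrives, whereas a real daemon would also flush on a timer.

```python
import time


class UpdateBatcher:
    """Toy write-batching cache (hypothetical; not rrdcached's implementation).

    Updates for a file are held in memory and written out in one call once
    enough have accumulated or the oldest pending update is too old.
    """

    def __init__(self, write_fn, max_pending=10, max_age=300.0,
                 clock=time.monotonic):
        self._write_fn = write_fn        # called as write_fn(path, updates)
        self._max_pending = max_pending  # flush after this many updates
        self._max_age = max_age          # flush after this many seconds
        self._clock = clock
        self._pending = {}               # path -> (first_seen, [updates])

    def update(self, path, value):
        """Queue one update; flush the file if a threshold is reached."""
        first_seen, values = self._pending.setdefault(
            path, (self._clock(), []))
        values.append(value)
        too_many = len(values) >= self._max_pending
        too_old = self._clock() - first_seen >= self._max_age
        if too_many or too_old:
            self.flush(path)

    def flush(self, path):
        """Force pending updates for one file to be written now."""
        entry = self._pending.pop(path, None)
        if entry is not None:
            self._write_fn(path, entry[1])


if __name__ == "__main__":
    writes = []
    batcher = UpdateBatcher(lambda p, v: writes.append((p, list(v))),
                            max_pending=3)
    for value in (1, 2, 3, 4):
        batcher.update("load_one.rrd", value)
    batcher.flush("load_one.rrd")  # like a flush command before graphing
    print(writes)
```

The point of the sketch is that many small updates to the same file turn into a few larger writes, which is why this kind of caching can reduce the I/O load on a machine that updates many RRD files.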


     
  • Charlotte9

    Charlotte9 - 2015-08-05

    Great software, really couldn't ask more from it.

     
  • jigli

    jigli - 2016-04-01

    Hello:
    I am using ganglia-3.7.1 on AIX 7.1 on an IBM POWER6. When I run configure, I receive these error messages:
    Checking for confuse
    checking for cfg_parse in -lconfuse... no
    Trying harder including gettext
    checking for cfg_parse in -lconfuse... no
    Trying harder including iconv
    checking for cfg_parse in -lconfuse... no
    libconfuse not found

    But I have installed libconfuse-2.7-1 and libconfuse-devel-2.7-1
    Please help me!

     
    • char tao

      char tao - 2016-04-14

      Try passing --with-libconfuse=/usr/lib to configure, or setting LDFLAGS="-L /usr/lib".


       
  • Jagannath Nagare

    I have to monitor fds for particular processes. I made the changes below to the Ganglia configuration, but I am getting a blank graph. When I run the same script from the console, I get output.

    I added this file as /usr/lib64/ganglia/pythonmodules/procfds.py:

    import os

    OBSOLETE_POPEN = False
    try:
        import subprocess
    except ImportError:
        import popen2
        OBSOLETE_POPEN = True

    import threading
    import time

    _refresh_rate = 30  # Refresh rate of the metric data

    _conns = {'process_fds': 0}

    def TCP_Connections(name):
        global tempconns
        tempconns = []
        # Read the PID of the monitored process from its pid file.
        with open('/var/run/computenode.pid', 'rt') as f:
            pid = f.readline().strip()
        # Count the entries in /proc/<pid>/fd.
        process = subprocess.Popen("ls /proc/" + pid + "/fd | wc -l",
                                   stdout=subprocess.PIPE, shell=True)
        lines = process.communicate()[0].strip()
        _conns['process_fds'] = lines
        ret = int(_conns[name])
        return ret

    # Metric descriptions

    _descriptors = [{
        'name': 'process_fds',
        'call_back': TCP_Connections,
        'time_max': 20,
        'value_type': 'uint',
        'units': '',
        'slope': 'both',
        'format': '%u',
        'description': 'Total number of file descriptor ',
        'groups': 'procstat'
    }]

    def metric_init(params):
        '''Initialize the tcp connection status module and create the
        metric definition dictionary object for each metric.'''
        global _refresh_rate

        if 'RefreshRate' in params:
            _refresh_rate = int(params['RefreshRate'])
        _descriptors[:] = []
        # Return the metric descriptions to Gmond
        _descriptors.append({
            'name': 'process_fds',
            'call_back': TCP_Connections,
            'time_max': 20,
            'value_type': 'uint',
            'units': '',
            'slope': 'both',
            'format': '%u',
            'description': 'Total number of file descriptor ',
            'groups': 'procstat'
        })

        return _descriptors

    def metric_cleanup():
        '''Clean up the metric module.'''
        pass

    if __name__ == '__main__':
        # metric_init() looks for the 'RefreshRate' key.
        params = {'RefreshRate': '20'}
        metric_init(params)
        while True:
            try:
                for d in _descriptors:
                    v = d['call_back'](d['name'])
                    print('value for %s is %u' % (d['name'], v))
                time.sleep(5)
            except KeyboardInterrupt:
                os._exit(1)

    configuration file:

    [root@mtl-nes-qa3-cn1 conf.d]# cat proc_fds.pyconf
    modules {
      module {
        name = 'proc_fds'
        language = 'python'
      }
    }

    collection_group {
      collect_every = 30
      time_threshold = 30

      metric {
        name = "process_fds"
        value_threshold = "256.0"
        title = "process_fds"
      }
    }

    Please let me know why the graph is blank, and also why changing value_threshold has no effect: it stays in the 0.0/0.5 range.
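
For comparison, here is a minimal sketch of the same file descriptor count done without a shell pipeline. It is an illustration under assumptions, not a confirmed fix: the helper names `read_pid`, `count_fds` and `process_fds` are hypothetical, and the pid file path is the one from the post. Two things may be worth checking in a setup like this one: whether the user gmond runs as can read /proc/<pid>/fd for the monitored process (a script run as root from the console can, an unprivileged daemon usually cannot), and whether the module name in the .pyconf (proc_fds) matches the Python file name (procfds.py).

```python
import os


def read_pid(pidfile):
    """Return the PID stored on the first line of a pid file."""
    with open(pidfile) as f:
        return int(f.readline().strip())


def count_fds(pid, proc_root="/proc"):
    """Count the entries in <proc_root>/<pid>/fd (Linux procfs layout)."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def process_fds(name, pidfile="/var/run/computenode.pid"):
    """Metric callback sketch: open fds of the process, or 0 on failure.

    Returning 0 on a missing pid file or a permission error keeps the
    callback from raising; log the exception instead if you want to see
    why the value is missing.
    """
    try:
        return count_fds(read_pid(pidfile))
    except (OSError, ValueError):
        return 0
```

Because `process_fds` takes the metric name as its first argument, it could be used as the 'call_back' entry in a descriptor like the one above.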

     
