Thread: [Shinken-devel] HA config
From: Jörg S. <jor...@ln...> - 2011-09-22 13:02:45
Hi,

I am playing around with an HA configuration, using the example "shinken-specific-high-availability.cfg". I changed the node name, host_name, etc. When I try to check or restart Shinken, I get the following error:

Shinken 0.6.5
Copyright (c) 2009-2011 :
Gabes Jean (nap...@gm...)
Gerhard Lausser, Ger...@co...
Gregory Starck, g.s...@gm...
Hartmut Goebel, h.g...@go...
License: AGPL
Loading configuration
Opening configuration file /etc/shinken/nagios.cfg
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/hostgroups.cfg'
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/servicegroups.cfg'
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/extended_host_info.cfg'
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/hosts.cfg'
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/services.cfg'
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/extended_service_info.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/service_templates.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/contactgroups.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/contacts.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/timeperiods.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/misccommands.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/checkcommands.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/host_templates.cfg'
Processing object config file '/etc/shinken/resource.cfg'
Opening configuration file /etc/shinken/shinken-specific.cfg
Warning : I autogenerated some Arbiter modules, please look at your configuration
Warning : the module NamedPipe-Autogenerated is autogenerated
CRITICAL ERROR : I got an non recovarable error. I must exit
You can log a bug ticket at https://sourceforge.net/apps/trac/shinken/newticket for geting help
Back trace of it: Traceback (most recent call last):
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemons/arbiterdaemon.py", line 411, in main
    self.load_config_file()
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemons/arbiterdaemon.py", line 246, in load_config_file
    "Thanks.")
TypeError: exit expected at most 1 arguments, got 2

What I did was copy "shinken-specific-high-availability.cfg" to shinken-specific.cfg. Maybe that is wrong? I followed the "http://www.shinken-monitoring.org/wiki/setup_high_availability_shinken" scenario.

Running locally is no problem.

Cheers
/J

Jörg Schulz
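A note on the TypeError that ends the traceback above: this is the error Python raises when sys.exit is called with two positional arguments. Line 246 of arbiterdaemon.py is not shown in this thread, so the call below is only an assumed reconstruction (a stray comma between two string literals that were meant to be concatenated), with a hypothetical message text and hypothetical function names:

```python
import sys


def bail_out_buggy():
    # Assumed shape of the bug: the comma turns one message into two
    # positional arguments, which sys.exit does not accept.
    sys.exit("Error: cannot find my own Arbiter object. ",
             "Thanks.")


def bail_out_fixed():
    # Adjacent string literals are concatenated into a single argument.
    sys.exit("Error: cannot find my own Arbiter object. "
             "Thanks.")


def describe(func):
    """Run func and report which exception it ends with."""
    try:
        func()
    except TypeError as exc:
        return "TypeError: %s" % exc
    except SystemExit as exc:
        return "SystemExit: %s" % exc.code
    return "no exception"


if __name__ == "__main__":
    print(describe(bail_out_buggy))
    print(describe(bail_out_fixed))
```

The sketch is written for Python 3; the exact wording of the TypeError differs slightly between Python versions.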
From: nap <nap...@gm...> - 2011-09-22 13:25:03
On Thu, Sep 22, 2011 at 3:02 PM, Jörg Schulz <jor...@ln...> wrote:

> Hi
> [...]
> You can log a bug ticket at
> https://sourceforge.net/apps/trac/shinken/newticket for geting help
> Back trace of it: Traceback (most recent call last):
>   File "/usr/local/lib64/python2.6/site-packages/shinken/daemons/arbiterdaemon.py", line 411, in main
>     self.load_config_file()
>   File "/usr/local/lib64/python2.6/site-packages/shinken/daemons/arbiterdaemon.py", line 246, in load_config_file
>     "Thanks.")
> TypeError: exit expected at most 1 arguments, got 2
>
> What I did was to copy the " shinken-specific-high-availability.cfg" >
> shinken-specific.cfg , maybe wrong ?
> I followed the "
> http://www.shinken-monitoring.org/wiki/setup_high_availability_shinken"
> scenario

Hi,

It's a bug. The message it is trying to raise is:

Error: I cannot find my own Arbiter object, I bail out. "
"To solve it, please change the host_name parameter in "
"the object Arbiter in the file shinken-specific.cfg. "
"With the value BLABLA Thanks"

(with BLABLA being the hostname value)

Thanks for reporting it, I'm fixing this :)

For your installation, it means that the host_name parameters in your shinken-specific.cfg file are not configured correctly. In each arbiter object, set host_name to the value returned by the hostname command on that machine, so that the arbiter has a way to find which arbiter object it is :)

Regards,

Jean

> Local is no problem
>
> Cheers
> /J
>
> Jörg Schulz
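The matching rule Jean describes can be sketched in Python as follows. This is only an illustration and not Shinken's actual code: the functions parse_arbiters and find_my_arbiter are hypothetical, the parser is deliberately simplified, and the host names node-a and node-b are placeholders.

```python
import re
import socket

# Placeholder configuration; host names are made up for this example.
SAMPLE_CFG = """
define arbiter{
    arbiter_name  Arbiter-master
    host_name     node-a
    spare         0
}
define arbiter{
    arbiter_name  Arbiter-slave
    host_name     node-b
    spare         1
}
"""


def parse_arbiters(text):
    """Return one dict per 'define arbiter{...}' block (simplified parser)."""
    arbiters = []
    for body in re.findall(r"define\s+arbiter\s*\{(.*?)\}", text, re.S):
        props = {}
        for line in body.splitlines():
            line = line.split("#", 1)[0].strip()  # drop comments
            if not line:
                continue
            parts = line.split(None, 1)
            props[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        arbiters.append(props)
    return arbiters


def find_my_arbiter(arbiters, hostname=None):
    """Pick the arbiter whose host_name equals this machine's hostname."""
    if hostname is None:
        hostname = socket.gethostname()
    for arb in arbiters:
        if arb.get("host_name") == hostname:
            return arb
    return None


if __name__ == "__main__":
    me = find_my_arbiter(parse_arbiters(SAMPLE_CFG))
    if me is None:
        print("No arbiter object has host_name %r" % socket.gethostname())
    else:
        print("I am %s" % me["arbiter_name"])
```

If no arbiter object carries the local hostname, the lookup returns nothing, which corresponds to the "I cannot find my own Arbiter object" situation described above.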
From: Jörg S. <jor...@ln...> - 2011-09-23 06:45:50
Hello,

OK, I got it: I could start Shinken on Site A. Then I copied the same config to Site B and restarted Shinken ☹. The bad_start_arbiter log shows:

shinken.pyro_wrapper.PortNotFree: Sorry, the port 7770 is not free: Couldn't start Pyro daemon: [Errno 99] Cannot assign requested address

But it did tell me that it is the slave arbiter! And port 7770 is free; it does not show up in netstat.

Loading configuration
Opening configuration file /etc/shinken/nagios.cfg
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/extended_host_info.cfg'
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/hosts.cfg'
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/hostgroups.cfg'
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/servicegroups.cfg'
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/services.cfg'
Processing object config file '/usr/local/nagios/etc/objects/Default_collector/extended_service_info.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/timeperiods.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/service_templates.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/contacts.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/host_templates.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/checkcommands.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/misccommands.cfg'
Processing object config file '/usr/local/nagios/etc/objects/global/contactgroups.cfg'
Processing object config file '/etc/shinken/resource.cfg'
Opening configuration file /etc/shinken/shinken-specific.cfg
Warning : I autogenerated some Arbiter modules, please look at your configuration
Warning : the module NamedPipe-Autogenerated is autogenerated
I am the spare Arbiter : Arbiter-slave
My own modules : NamedPipe-Autogenerated
Warning in importing module : No module named redis
Warning in importing module : No module named memcache
Get a Named pipe module for plugin NamedPipe-Autogenerated
I correctly loaded the modules : [NamedPipe-Autogenerated]
All : (in/potential) (schedulers:2) (pollers:1/2) (reactionners:1/2) (brokers:1/2) (receivers:0/0)
Running pre-flight check on configuration data...
Checking global parameters...
Checking hosts...
Checked 308 hosts
Checking hostgroups...
Checked 27 hostgroups
Checking contacts...
Checked 1 contacts
Checking contactgroups...
Checked 31 contactgroups
Checking notificationways...
Checked 1 notificationways
Checking escalations...
Checked 0 escalations
Checking services...
Checked 102 services
Checking servicegroups...
Checked 0 servicegroups
Checking timeperiods...
Checked 7 timeperiods
Checking commands...
Checked 48 commands
Checking servicedependencies...
Checked 0 servicedependencies
Checking hostdependencies...
Checked 0 hostdependencies
Checking arbiterlinks...
Checked 2 arbiterlinks
Checking schedulerlinks...
Checked 2 schedulerlinks
Checking reactionners...
Checked 2 reactionners
Checking pollers...
Checked 2 pollers
Checking brokers...
Checked 2 brokers
Checking receivers...
Checked 0 receivers
Checking resultmodulations...
Checked 1 resultmodulations
Checking discoveryrules...
Checked 0 discoveryrules
Checking discoveryruns...
Checked 0 discoveryruns
Checking criticitymodulations...
Checked 0 criticitymodulations
Cutting the hosts and services into parts
Creating packs for realms
Number of hosts in the realm All : 308
Things look okay - No serious problems were detected during the pre-flight check
Configuration Loaded

Successfully changed to workdir: /var/lib/shinken
opening pid file: /var/lib/shinken/arbiterd.pid
/var/lib/shinken/arbiterd.pid
stale pidfile exists (no or invalid or unreadable content). reusing it.
CRITICAL ERROR : I got an non recovarable error. I must exit
You can log a bug ticket at https://sourceforge.net/apps/trac/shinken/newticket for geting help
Back trace of it: Traceback (most recent call last):
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemons/arbiterdaemon.py", line 413, in main
    self.do_daemon_init_and_start()
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemon.py", line 429, in do_daemon_init_and_start
    self.setup_pyro_daemon()
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemon.py", line 470, in setup_pyro_daemon
    self.pyro_daemon = pyro.ShinkenPyroDaemon(self.host, self.port, ssl_conf.use_ssl)
  File "/usr/local/lib64/python2.6/site-packages/shinken/pyro_wrapper.py", line 77, in __init__
    raise PortNotFree(msg)
PortNotFree: Sorry, the port 7770 is not free: Couldn't start Pyro daemon: [Errno 99] Cannot assign requested address

Traceback (most recent call last):
  File "/usr/local/bin/shinken-arbiter", line 100, in <module>
    daemon.main()
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemons/arbiterdaemon.py", line 413, in main
    self.do_daemon_init_and_start()
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemon.py", line 429, in do_daemon_init_and_start
    self.setup_pyro_daemon()
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemon.py", line 470, in setup_pyro_daemon
    self.pyro_daemon = pyro.ShinkenPyroDaemon(self.host, self.port, ssl_conf.use_ssl)
  File "/usr/local/lib64/python2.6/site-packages/shinken/pyro_wrapper.py", line 77, in __init__
    raise PortNotFree(msg)
shinken.pyro_wrapper.PortNotFree: Sorry, the port 7770 is not free: Couldn't start Pyro daemon: [Errno 99] Cannot assign requested address

Jörg Schulz
From: Denis G. <dt....@gm...> - 2011-09-23 11:09:46
Hi,

Are you really sure that nothing is listening on this port? When using restart, I sometimes have some issues with shinken processes not being killed as they should. That's why most of the time I use stop/start rather than restart, and I check for remaining processes before doing the start.

Sorry if you already tried that (you speak of netstat), but you could try to search for shinken processes with *ps -fu shinken* and for listening processes on Pyro ports with *lsof -i ":7770"*?

Regards,

Denis GERMAIN

2011/9/23 Jörg Schulz <jor...@ln...>

> Hello
>
> OK , i got it , i could start shinken on SITE A , the I copied the same
> config to Site B, restart shinken ☹ the bad_start_arbiter log shows
> shinken.pyro_wrapper.PortNotFree: Sorry, the port 7770 is not free:
> Couldn't start Pyro daemon: [Errno 99] Cannot assign requested address
> But told me that he is the slave arbiter !
> but 7770 is free and not showing up in netstat
> [...]
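As a complement to the ps and lsof checks Denis suggests, here is a minimal Python sketch that tests whether anything accepts TCP connections on a given port. The function name is_listening is made up for this example, and the host and port passed to it are assumptions to adapt to your own setup.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) is accepted."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()


if __name__ == "__main__":
    print("port 7770 in use: %s" % is_listening("127.0.0.1", 7770))
```

A False result only says that no process accepted a connection on that address and port; it says nothing about whether a daemon will be able to bind there.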
From: nap <nap...@gm...> - 2011-09-23 11:54:21
On Fri, Sep 23, 2011 at 1:09 PM, Denis GERMAIN <dt....@gm...> wrote:

> Hi,
>
> Are you really sure that nothing is listening on this port? When using
> restart, I sometimes have some issues with shinken processes not being
> killed as they should. That's why most of the time I use stop/start rather
> than restart, and I check for remaining processes before doing the start.
>
> Sorry if you already tried that (you speak of netstat), but you could try
> to search for shinken processes with *ps -fu shinken* and for listening
> processes on Pyro ports with *lsof -i ":7770"*?
>
> Regards,
>
> Denis GERMAIN

Hi,

You can also look at the address parameter of your arbiter object. It should be a valid name, so that the network stack will know which interface to open. By default it's localhost, but in an HA installation that is of course not a good idea, because the other daemon will not know the real address.

If unsure, check whether you have an entry for your own address in /etc/hosts, and that it is the real LAN entry, not 127.0.0.1.

Jean
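To see why the address matters, here is a minimal sketch, independent of Shinken and Pyro, that tries to bind a TCP socket to a given address. On Linux, errno 99 is EADDRNOTAVAIL ("Cannot assign requested address"), which the kernel returns when the address is not assigned to any local interface; a busy port would give EADDRINUSE instead. The helper name try_bind and the addresses are assumptions for illustration (192.0.2.1 belongs to a range reserved for documentation and should not be configured on a real host).

```python
import errno
import socket


def try_bind(address, port):
    """Try to bind a TCP socket; return (ok, errno_name_or_None)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, port))
        return True, None
    except socket.error as exc:
        code = exc.errno
        # Translate the numeric errno into its symbolic name when known.
        return False, errno.errorcode.get(code, str(code))
    finally:
        sock.close()


if __name__ == "__main__":
    # Port 0 lets the kernel pick a free port, so only the address is tested.
    for addr in ("127.0.0.1", "192.0.2.1"):
        print("%s -> %s" % (addr, try_bind(addr, 0)))
```

Running the same check with the value of the address parameter on the machine that fails to start shows quickly whether that address is actually usable there.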
From: Jörg S. <jor...@ln...> - 2011-09-23 11:55:51
Hi,

I ran lsof -i ":7770" and nothing is running on this port. I disabled the firewall and rebooted the machine: same result.

FAILED: shinken.pyro_wrapper.PortNotFree: Sorry, the port 7770 is not free: Couldn't start Pyro daemon: [Errno 99] Cannot assign requested address (full output is in /tmp/bad_start_for_arbiter)

I don't get it!

My config:

...
define arbiter{
       arbiter_name    Arbiter-master
       host_name       nagi
       address         <SITE A IP>
       port            7770
       spare           0
       #modules        No module for now
       }

#the slave, waiting patiently for its master to die
define arbiter{
       arbiter_name    Arbiter-slave
       host_name       nilban
       address         <SITE B IP>
       port            7770
       spare           1
       #modules        No module for now
       }

Site A is working fine, Site B is not :(
The Linux machines at sites A and B are 100% identical.

Greetings
Jörg
From: nap <nap...@gm...> - 2011-09-23 12:04:24
On Fri, Sep 23, 2011 at 1:55 PM, Jörg Schulz <jor...@ln...> wrote:

> Hi,
>
> I did lsof -i ":7770" and nothing is running on this port , disabled
> firewall , reboot machine, same
>
> FAILED: shinken.pyro_wrapper.PortNotFree: Sorry, the port 7770 is not free:
> Couldn't start Pyro daemon: [Errno 99] Cannot assign requested address (full
> output is in /tmp/bad_start_for_arbiter)
>
> I don't get it !
>
> My config
>
> ...
> define arbiter{
>        arbiter_name    Arbiter-master
>        host_name       nagi
>        address         <SITE A IP>
>        port            7770
>        spare           0
>        #modules        No module for now
>        }
>
> #the slave, waiting patiently for its master to die
> define arbiter{
>        arbiter_name    Arbiter-slave
>        host_name       nilban
>        address         <SITE B IP>
>        port            7770
>        spare           1
>        #modules        No module for now
>        }
>
> Site A is working fine , Site B not :(
> The linux machines site a/b are 100% identical

Can you apply this patch:

diff --git a/shinken/pyro_wrapper.py b/shinken/pyro_wrapper.py
index b3c8ef7..4f6ae2a 100644
--- a/shinken/pyro_wrapper.py
+++ b/shinken/pyro_wrapper.py
@@ -67,6 +67,7 @@ try:
             else:
                 prtcol = 'PYRO'
 
+            print "Initializing Pyro connection with host:%s port:%s ssl:%s" % (host, port, use_ssl)
             # Now the real start
             try:
                 Pyro.core.Daemon.__init__(self, host=host, port=port, prtcol=prtcol, norange=True)
@@ -151,6 +152,7 @@ except AttributeError, exp:
             # so we allow to retry during 35 sec (30 sec is the default
             # timewait for close sockets)
             while nb_try <= 35:
+                print "Initializing Pyro connection with host:%s port:%s ssl:%s" % (host, port, use_ssl)
                 # And port already use now raise an exception
                 try:
                     Pyro.core.Daemon.__init__(self, host=host, port=port)

And see the output of the arbiter?

And which Pyro version are you using? (shinken-arbiter --version)

Thanks,

Jean

> Greetings
> Jörg
From: Jörg S. <jor...@ln...> - 2011-09-23 12:29:54
|
Hi
Applied the patch , not working , get the git
shinken-arbiter : 0.6.5+ with pyro : 3.15
bad_ass_arbiter

Creating packs for realms
Number of hosts in the realm All : 308
Things look okay - No serious problems were detected during the pre-flight check
Configuration Loaded

Successfully changed to workdir: /var/lib/shinken
opening pid file: /var/lib/shinken/arbiterd.pid
/var/lib/shinken/arbiterd.pid stale pidfile exists (no or invalid or unreadable content). reusing it.
Initializing Pyro connection with host:194.47.210.110 port:7770 ssl:False
!THIS IS SITE B!
CRITICAL ERROR : I got an non recoverable error. I must exit
You can log a bug ticket at https://sourceforge.net/apps/trac/shinken/newticket for geting help
Back trace of it:
Traceback (most recent call last):
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemons/arbiterdaemon.py", line 437, in main
    self.do_daemon_init_and_start()
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemon.py", line 430, in do_daemon_init_and_start
    self.setup_pyro_daemon()
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemon.py", line 475, in setup_pyro_daemon
    self.pyro_daemon = pyro.ShinkenPyroDaemon(self.host, self.port, ssl_conf.use_ssl)
  File "/usr/local/lib64/python2.6/site-packages/shinken/pyro_wrapper.py", line 79, in __init__
    raise PortNotFree(msg)
PortNotFree: Sorry, the port 7770 is not free: Couldn't start Pyro daemon: [Errno 99] Cannot assign requested address
Traceback (most recent call last):
  File "/usr/local/bin/shinken-arbiter", line 101, in <module>
    daemon.main()
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemons/arbiterdaemon.py", line 437, in main
    self.do_daemon_init_and_start()
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemon.py", line 430, in do_daemon_init_and_start
    self.setup_pyro_daemon()
  File "/usr/local/lib64/python2.6/site-packages/shinken/daemon.py", line 475, in setup_pyro_daemon
    self.pyro_daemon = pyro.ShinkenPyroDaemon(self.host, self.port, ssl_conf.use_ssl)
  File "/usr/local/lib64/python2.6/site-packages/shinken/pyro_wrapper.py", line 79, in __init__
    raise PortNotFree(msg)
shinken.pyro_wrapper.PortNotFree: Sorry, the port 7770 is not free: Couldn't start Pyro daemon: [Errno 99] Cannot assign requested address

Kind regards
______________________
Jörg Schulz
Systems technician
IT Section
Linnéuniversitetet
391 82 Kalmar / 351 95 Växjö
0480-44 62 44 Direct
0705-946244 Mobile
jor...@ln...
|
From: nap <nap...@gm...> - 2011-09-23 12:38:33
|
On Fri, Sep 23, 2011 at 2:29 PM, Jörg Schulz <jor...@ln...> wrote:
> Hi
> Applied the patch , not working , get the git
> shinken-arbiter : 0.6.5+ with pyro : 3.15
> bad_ass_arbiter
>
> Creating packs for realms
> Number of hosts in the realm All : 308
> Things look okay - No serious problems were detected during the pre-flight
> check Configuration Loaded
>
> Successfully changed to workdir: /var/lib/shinken opening pid file:
> /var/lib/shinken/arbiterd.pid /var/lib/shinken/arbiterd.pid stale pidfile
> exists (no or invalid or unreadable content). reusing it.
> Initializing Pyro connection with host:194.47.210.110 port:7770 ssl:False
> !THIS IS SITE B!
> CRITICAL ERROR : I got an non recoverable error. I must exit You can log a
> bug ticket at https://sourceforge.net/apps/trac/shinken/newticket for
> geting help Back trace of it: Traceback (most recent call last):
> File
> "/usr/local/lib64/python2.6/site-packages/shinken/daemons/arbiterdaemon.py",
> line 437, in main
> self.do_daemon_init_and_start()
> File "/usr/local/lib64/python2.6/site-packages/shinken/daemon.py", line
> 430, in do_daemon_init_and_start
> self.setup_pyro_daemon()
> File "/usr/local/lib64/python2.6/site-packages/shinken/daemon.py", line
> 475, in setup_pyro_daemon
> self.pyro_daemon = pyro.ShinkenPyroDaemon(self.host, self.port,
> ssl_conf.use_ssl)
> File "/usr/local/lib64/python2.6/site-packages/shinken/pyro_wrapper.py",
> line 79, in __init__
> raise PortNotFree(msg)
> PortNotFree: Sorry, the port 7770 is not free: Couldn't start Pyro daemon:
> [Errno 99] Cannot assign requested address
>
So 194.47.210.110 is the site B address. Is IPv6 enabled? Can you give us the ifconfig output and the /etc/hosts file?

I tried with my own arbiter using its LAN address, and it starts without error.

Can you try to launch :
nc 194.47.210.110 -l 7770

Thanks,
Jean
|
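The nc test above can also be approximated in Python. On Linux, errno 99 is EADDRNOTAVAIL, which normally means the requested address is not assigned to any local interface, whereas a port that is really busy gives EADDRINUSE; the "port 7770 is not free" wording in the traceback does not distinguish the two. The following is a minimal sketch, assuming IPv4 and TCP only; `probe_bind` is a hypothetical helper, not part of Shinken or Pyro.

```python
import errno
import socket


def probe_bind(host, port):
    """Try to bind a TCP socket to (host, port); return (ok, explanation)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRNOTAVAIL:
            # Errno 99 on Linux: the address is not on a local interface.
            return False, "address %s is not configured on any local interface" % host
        if exc.errno == errno.EADDRINUSE:
            return False, "port %d is already in use on %s" % (port, host)
        return False, "bind failed: %s" % exc
    finally:
        # Always release the socket, whether or not the bind worked.
        sock.close()
    return True, "bind succeeded"
```

Run on the site B machine as `print(probe_bind("194.47.210.110", 7770))`; if it reports that the address is not configured locally, the address value in the arbiter definition and the output of ifconfig are the first things to compare.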