Thread: [Shinken-devel] Some bugs or possible new features
From: Markus E. <mar...@un...> - 2011-05-25 14:38:49
|
Hi @all,

we have been testing our configuration with Shinken over the last few days and found some bugs, or at least some nice-to-have features. ;)

A very short description of our configuration:

- a Shinken master
- some satellites with scheduler and poller


1
When I add the module Status-Dat to my broker (on the master, with a single broker) and restart Shinken, status.dat is created but not filled with data (it should be updated every 15 seconds). It seems that the module Simple-log has problems archiving old logs (directory 'archives' not found). After I created the directory 'archives', deleted status.dat and objects.cache, and restarted Shinken, everything worked.
I think it would be nice if Simple-log didn't block the other modules from running.

This is the error the broker wrote to its log:

2011-05-25 00:00:11,325 [1306274411] [broker-all] We are archiving the old log file

2011-05-25 00:00:11,326 [1306274411] [broker-all] Moving the old log file from nagios.log to archives/nagios-05-24-2011-00.log

2011-05-25 00:00:11,326 [1306274411] [broker-all] Error : the instance Simple-log raised an exception [Errno 2] No such file or directory: u'archives/nagios-05-24-2011-00.log', I remove it!

2011-05-25 00:00:11,327 [1306274411] [broker-all] Back trace of this remove : Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/shinken/modulesmanager.py", line 124, in try_instance_init
    inst.init()
  File "/usr/lib/python2.5/site-packages/shinken/modules/simplelog_broker.py", line 143, in init
    moved = self.check_and_do_archive(first_pass=True)
  File "/usr/lib/python2.5/site-packages/shinken/modules/simplelog_broker.py", line 115, in check_and_do_archive
    shutil.move(self.path, file_archive_path)
  File "/usr/lib/python2.5/shutil.py", line 199, in move
    copy2(src,dst)
  File "/usr/lib/python2.5/shutil.py", line 91, in copy2
    copyfile(src, dst)
  File "/usr/lib/python2.5/shutil.py", line 47, in copyfile
    fdst = open(dst, 'wb')
IOError: [Errno 2] No such file or directory: u'archives/nagios-05-24-2011-00.log'


2
I have some hosts with 'parents' which are not in the same realm. During the configcheck the following error was thrown:

Error : the realm configuration of your hosts is not good because there a more than one realm in one pack (host relations).

But at the end the configcheck says that everything looks good (Things look okay - No serious problems were detected during the pre-flight check). Yet Shinken doesn't check and process the host.
It would be nice if such an error were mentioned at the end of the configcheck. As it is, I have to search the whole checkconfig output, because there may be a mistake that is not mentioned in the summary at the end.


3
When I start Shinken, the arbiter splits the configuration and sends it to the schedulers. They begin to work (I watched this in the logs). But in Thruk the number of hosts and services changes repeatedly during the first minutes. The number goes up and down by the host/service count of random realms. After a few minutes the numbers are correct.

For example:
I have 4 realms with 10 hosts each. Then Thruk may say there are 20 hosts, then 10, then 20, then 30, and so on.

I think the Arbiter/Broker should know the total number of hosts and services before the Arbiter splits them (that is, from the beginning and not some minutes later). Then the correct number could be sent out via Livestatus immediately when Shinken starts. As it is, I sat in front of Thruk and waited until the correct number of hosts showed up. This is important because, together with my bug report number 2, I have to make sure that Shinken is checking all hosts and services.


4
When I create a realm and assign a scheduler and poller to it, but don't make it a member of the default realm, I get no error message. The scheduler and poller in the new realm get their configuration from the arbiter, but no broker will output data, because none is assigned to the new realm.

Wouldn't it be better to have a warning/error message during configcheck?


5
The configcheck incorrectly counts the number of hosts in a realm. Well... I think it's better to show you an example instead of describing it ;)

Running pre-flight check on configuration data...
Checking global parameters...
Checking hosts...
Checked 574 hosts
Checking hostgroups...
Checked 173 hostgroups
Checking contacts...
Checked 14 contacts
Checking contactgroups...
Checked 4 contactgroups
Checking notificationways...
Checked 14 notificationways
Checking escalations...
Checked 0 escalations
Checking services...
Checked 7579 services
Checking servicegroups...
Checked 24 servicegroups
Checking timeperiods...
Checked 5 timeperiods
Checking commands...
Checked 207 commands
Checking servicedependencies...
Checked 0 servicedependencies
Checking hostdependencies...
Checked 0 hostdependencies
Checking arbiterlinks...
Checked 1 arbiterlinks
Checking schedulerlinks...
Checked 11 schedulerlinks
Checking reactionners...
Checked 1 reactionners
Checking pollers...
Checked 11 pollers
Checking brokers...
Checked 1 brokers
Checking receivers...
Checked 1 receivers
Checking resultmodulations...
Checked 1 resultmodulations
Checking discoveryrules...
Checked 0 discoveryrules
Checking discoveryruns...
Checked 0 discoveryruns
Cutting the hosts and services into parts
Creating packs for realms
Number of hosts in the realm All : 0
Number of hosts in the realm test1 : 0
Number of hosts in the realm test2 : 0
Number of hosts in the realm test3 : 0
Number of hosts in the realm test4 : 1
Number of hosts in the realm test5 : 1
Number of hosts in the realm test6 : 2
Number of hosts in the realm test7 : 0
Number of hosts in the realm test8 : 58
Number of hosts in the realm test9 : 1
Number of hosts in the realm test10 : 1
Things look okay - No serious problems were detected during the pre-flight check


You see I've got 574 hosts, which are in realms. But where are they after 'Creating packs for realms'? The realm test8 with its 58 hosts is the realm of the server where the Shinken master runs, and from where I run the configcheck (only for the master's realm did checkconfig count correctly).
As far as I can see, the hosts are put into their realms correctly, because when I run Shinken everything works fine. But it would be nice if this summary were accurate, too ;)


So, that's it for now. ;)


Markus Elger
Linux System Administrator

Unister GmbH
Barfußgässchen 11 | 04109 Leipzig

Phone: +49 (0)341 49288 4537
mar...@un...
www.unister.de

Authorized Managing Director: Thomas Wagner
Amtsgericht Leipzig, HRB: 19056
|
From: nap <nap...@gm...> - 2011-05-25 15:03:11
|
On Wed, May 25, 2011 at 4:38 PM, Markus Elger <mar...@un...> wrote:

> Hi @all,
>
Hi :)

> we've tested our configuration with shinken the last days and we found
> some bugs or at least nice to have features. ;)
>
OK, let's see this :)

> A very very short description of our configuration:
>
> - a shinken master
> - some satellites with scheduler and poller
>
>
> 1
> When I add the module Status-Dat to my broker (on the master, with a
> single broker) and restart shinken, the status.dat is created, but not
> filled with data (should be updated every 15 seconds). It seems that the
> module simple-log has problems with archiving old logs (directory
> 'archives' not found). After I created the directory archives, deleted
> status.dat and objects.cache I restarted shinken and everything worked.
> I think it would be nice if the simple-log wouldn't block the other
> modules from running?
> [...]
> IOError: [Errno 2] No such file or directory:
> u'archives/nagios-05-24-2011-00.log'
>
The internal modules do not depend on each other, so I don't understand this problem. Can you try to reproduce it without the log module? I think it's purely a status.dat bug, because the broker gives each brok to all modules, and if one dies, it is just put in another queue while the broker still gives the brok to the other modules (or at least it should).

> 2
> I have some hosts with 'parents' which are not in the same realm. During
> the configcheck the following error was thrown --> Error : the realm
> configuration of your hosts is not good because there a more than one
> realm in one pack (host relations).
> But at the end the configcheck says that everything looks good (Things
> look okay - No serious problems were detected during the pre-flight
> check). But Shinken doesn't check and process the Host.
> It would be nice, if such an Error could be mentioned at the end of the
> configcheck. So I have to search the whole checkconfig output, because
> there 'maybe' is a mistake which is not mentioned in the summary at the
> end.
>
This is a real problem, for sure. I'll have a look at it. (ticket https://sourceforge.net/apps/trac/shinken/ticket/263)

> 3
> When I start shinken, the arbiter splits the configuration and send it
> to the schedulers. They begin to work (I watched this in the logs). But
> in Thruk the number of hosts and services is changing repeatedly in the
> [...] showed up. This is important, because together with my bug report number
> 2 I have to make sure that shinken is checking all hosts and services.
>
No, in fact it's a problem in the LiveStatus module. We made some fixes in this part a few days ago for 0.6.4; is your version up to date? I think there are still some problems in the "object creation pass" that give us lists of the wrong length. We will hunt this down :)

> 4
> When I create a realm and assign a scheduler and poller to it, but
> don't make him a member of the default realm I get no error message.
> The scheduler and poller in the created realm get their configuration
> from the arbiter, but no broker will output data, because none is
> assigned to the new realm?
>
> Isn't it better if there is any warning/error message during
> configcheck?
>
It does not have to be a member of the "default" realm. But it's true that having no broker can be a reason to raise a warning :) Another ticket :) (https://sourceforge.net/apps/trac/shinken/ticket/264)

> 5
> The configcheck incorrectly counts the number of hosts in a realm.
> Well...I think it's better when I show you an example, instead of
> describing it ;)
> [...]
>
> You see i've got 574 hosts which are in realms. But where are they after
> 'Creating packs for realms'? The realm test8 with it's 58 Hosts is the
> server with the realm, where the shinken master runs, and from where I
> run the configcheck (only on the master checkconfig counted correct).
> As I see the hosts are put to their realms correctly, because when I run
> shinken everything works fine. But it would be nice if this summary will
> be accurate, too ;)
>
Oh, yet another ticket :) It should be the easiest of all :) (https://sourceforge.net/apps/trac/shinken/ticket/265)

> So, that's it for now. ;)
>
OK, thanks for all these bug reports :) The first one and the Thruk count aside, the others should be fixed quickly :)

Regards,

Jean

> Markus Elger
> Linux-Systemadministrator
>
> Unister GmbH
> Barfußgässchen 11 | 04109 Leipzig
>
> Telefon: +49 (0)341 49288 4537
> mar...@un...
> www.unister.de
>
> Vertretungsberechtigter Geschäftsführer: Thomas Wagner
> Amtsgericht Leipzig, HRB: 19056
|
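The isolation described in this reply (every module receives each brok, and a module that fails is set aside without starving the others) can be sketched in a few lines of Python. This is only an illustration, not Shinken's broker code: `dispatch_brok`, `RecordingModule`, and `BrokenModule` are hypothetical names invented for the example, and the brok is a plain dictionary.

```python
class RecordingModule:
    """Hypothetical module that just remembers the broks it was given."""

    def __init__(self, name):
        self.name = name
        self.seen = []

    def manage_brok(self, brok):
        self.seen.append(brok)


class BrokenModule(RecordingModule):
    """Hypothetical module that always fails, e.g. on a missing directory."""

    def manage_brok(self, brok):
        raise IOError("simulated archive failure")


def dispatch_brok(brok, modules):
    """Give brok to every module; return (healthy modules, failures)."""
    healthy, failed = [], []
    for module in modules:
        try:
            module.manage_brok(brok)
        except Exception as exc:
            # One failing module must not prevent delivery to the others.
            failed.append((module.name, str(exc)))
        else:
            healthy.append(module)
    return healthy, failed


status_dat = RecordingModule("Status-Dat")
modules = [BrokenModule("Simple-log"), status_dat]
healthy, failed = dispatch_brok({"type": "log"}, modules)
print([module.name for module in healthy])
print([name for name, _ in failed])
```

With a loop of this shape, the failure of the first module is recorded and the second module still receives the brok, which is the behaviour the reply says the broker should have.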
From: Grégory S. <g.s...@gm...> - 2011-05-25 15:36:04
|
2011/5/25 nap <nap...@gm...>

> On Wed, May 25, 2011 at 4:38 PM, Markus Elger <mar...@un...> wrote:
>
>> Hi @all,
>>
> Hi :)

Hi too :)

>> we've tested our configuration with shinken the last days and we found
>> some bugs or at least nice to have features. ;)
>>
> Ok, lets see this :)
>
>> A very very short description of our configuration:
>>
>> - a shinken master
>> - some satellites with scheduler and poller
>>
>>
>> 1
>> When I add the module Status-Dat to my broker (on the master, with a
>> single broker) and restart shinken, the status.dat is created, but not
>> filled with data (should be updated every 15 seconds). It seems that the
>> module simple-log has problems with archiving old logs (directory
>> 'archives' not found). After I created the directory archives, deleted
>> status.dat and objects.cache I restarted shinken and everything worked.
>> I think it would be nice if the simple-log wouldn't block the other
>> modules from running?
>> [...]
>>
>> IOError: [Errno 2] No such file or directory:
>> u'archives/nagios-05-24-2011-00.log'
>>

Do you actually have an "archives" subdirectory in your normal "var" broker directory?
If not, then the fix is going to be trivial. ;)

Regards,

greg.
|
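A fix of the "trivial" kind hinted at here would be to create the archive directory before moving the log file. The following is a minimal sketch under that assumption, not the change that was actually made to the Simple-log module; `archive_log` is a hypothetical helper, and `exist_ok=True` requires Python 3.2 or later.

```python
import os
import shutil
import tempfile


def archive_log(log_path, archive_dir, archive_name):
    """Move log_path to archive_dir/archive_name, creating archive_dir if missing."""
    os.makedirs(archive_dir, exist_ok=True)  # no error if it already exists
    destination = os.path.join(archive_dir, archive_name)
    shutil.move(log_path, destination)
    return destination


# Demonstration in a throwaway directory, so no real log file is touched.
workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "nagios.log")
with open(log_path, "w") as handle:
    handle.write("example log line\n")

archived = archive_log(log_path, os.path.join(workdir, "archives"),
                       "nagios-05-24-2011-00.log")
print(os.path.exists(archived))
```

Without the `os.makedirs` call, `shutil.move` fails in the same way as in the traceback above when the `archives` directory does not exist.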
From: Olivier H. <oli...@gm...> - 2011-05-25 17:22:02
|
Hi,

Ronny said that you are on version 0.6.3 of Shinken.
You should definitely try to "pull" the latest version.
Jean has fixed some LiveStatus bugs this week.
I used to get some inconsistent results with Livestatus (duplicated numbers of hosts/services, for example).
Now, with the latest version, everything seems fine.

Regards

Olivier

On 25/05/2011 at 16:38, Markus Elger wrote:
> Hi @all,
>
> we've tested our configuration with shinken the last days and we found
> some bugs or at least nice to have features. ;)
>
> [...]
|
From: nap <nap...@gm...> - 2011-05-26 07:37:11
|
On Wed, May 25, 2011 at 7:21 PM, Olivier Hanesse <oli...@gm...> wrote:

> Hi,
>
> Ronny said that you are in 0.6.3 version of Shinken.
> You should definitely try to "pull" the latest version.
> Jean has fixed some bugs about LiveStatus this week.
> I used to get some irrelevant configuration with Livestatus (duplicate
> number of hosts/services for example).
> Now with the lastest version, everything seems fine.
>
> Regards
>
> Olivier
>

Hi,

Bug no. 2 is fixed. Now any bad realm configuration, or a failure in the cutting pass, is raised as an error, with better error messages :)


Jean
|
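The idea behind bug no. 2 (a realm problem found during the check must also show up in the final summary) can be illustrated as follows. This is a hedged sketch, not Shinken's configuration checker: the host dictionaries, `find_cross_realm_parents`, and `summary` are hypothetical, and the error wording is made up for the example.

```python
def find_cross_realm_parents(hosts):
    """Return one error string per host whose parent lives in another realm."""
    errors = []
    for name in sorted(hosts):
        host = hosts[name]
        for parent in host.get("parents", []):
            parent_realm = hosts[parent]["realm"]
            if parent_realm != host["realm"]:
                errors.append(
                    "host %s (realm %s) has parent %s in realm %s"
                    % (name, host["realm"], parent, parent_realm)
                )
    return errors


def summary(errors):
    """Build the final line so that collected errors are never hidden."""
    if errors:
        return "%d error(s) detected during the pre-flight check" % len(errors)
    return ("Things look okay - No serious problems were detected "
            "during the pre-flight check")


hosts = {
    "router": {"realm": "test1", "parents": []},
    "web1": {"realm": "test2", "parents": ["router"]},
    "db1": {"realm": "test1", "parents": ["router"]},
}
errors = find_cross_realm_parents(hosts)
for error in errors:
    print("Error :", error)
print(summary(errors))
```

The point of the design is that the summary is derived from the same error list that the individual checks append to, so it cannot report success while an error was printed earlier.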
From: nap <nap...@gm...> - 2011-05-26 07:57:42
|
> Hi,
>
> The bug N°2 is fixed. Now all realm bad configuration or cutting pass is
> raised as errors with better error messages :)
>

No. 5 is fixed too :)


Jean
|
From: Ronny L. <ron...@un...> - 2011-05-26 11:19:23
|
Hi,

we tried the current git version (master) and I can confirm that no. 5 is fixed - but now the hosts of only 1-2 realms are shown in Thruk. Approximately once a minute there is no backend available, and then some other realms are shown.

The brokerd.log shows these messages:
==============================
2011-05-26 13:08:09,190 [1306408089] [broker-all] Error : the external module Livestatus goes down unexpectly!

2011-05-26 13:08:09,190 [1306408089] [broker-all] Setting the module Livestatus to restart

2011-05-26 13:08:09,205 [1306408089] [broker-all] I'm stopping process pid:8848

2011-05-26 13:08:09,206 [1306408089] [broker-all] Starting external process for instance Livestatus

2011-05-26 13:08:09,212 [1306408089] [broker-all] Livestatus is now started ; pid=12889

2011-05-26 13:08:09,688 [1306408089] [broker-all] A module is asking me to get all initial data from the scheduler 2
==============================

Bye, Ronny

--
Ronny Lindner
Linux System Administrator

Unister GmbH
Barfußgässchen 11 | 04109 Leipzig

Phone: +49 (0)341 49288 4537
ron...@un...
www.unister.de

Authorized Managing Director: Thomas Wagner
Amtsgericht Leipzig, HRB: 19056
|
From: nap <nap...@gm...> - 2011-05-26 13:13:52
|
On Thu, May 26, 2011 at 1:19 PM, Ronny Lindner <ron...@un...> wrote:

> Hi,
>
> we tried the current git version (master) and I can confirm that Nr 5
> is fixed - but now the hosts of only 1-2 realms are shown in Thruk.
> Approximately once a minute there is no backend available and then
> some other realms are shown.
>
> The brokerd.log shows these messages:
> ==============================
> 2011-05-26 13:08:09,190 [1306408089] [broker-all] Error : the external
> module Livestatus goes down unexpectly!
>
> 2011-05-26 13:08:09,190 [1306408089] [broker-all] Setting the module
> Livestatus to restart
>
> 2011-05-26 13:08:09,205 [1306408089] [broker-all] I'm stopping process
> pid:8848
>
> 2011-05-26 13:08:09,206 [1306408089] [broker-all] Starting external
> process for instance Livestatus
>
> 2011-05-26 13:08:09,212 [1306408089] [broker-all] Livestatus is now
> started ; pid=12889
>
> 2011-05-26 13:08:09,688 [1306408089] [broker-all] A module is asking me
> to get all initial data from the scheduler 2
> ==============================
>
Hi,

It seems that the LiveStatus broker module is crashing. Can you start the broker in debug mode to get the backtrace? Thanks.


Jean

> Bye, Ronny
>
> --
|
From: Ronny L. <ron...@un...> - 2011-05-26 14:21:06
Attachments:
brokerdebug.log
|
Hi Jean,

I attached a part of the debug log with 2 exceptions. We tested it with version 95292d9121b731e1963ed5bcc0126f540c0afec2.

Bye, Ronny


On Thu, 26 May 2011 15:13:46 +0200, nap <nap...@gm...> wrote:

> On Thu, May 26, 2011 at 1:19 PM, Ronny Lindner
> <ron...@un...> wrote:
>
> > Hi,
> >
> > we tried the current git version (master) and I can confirm that Nr
> > 5 is fixed - but now the hosts of only 1-2 realms are shown in
> > Thruk. Approximately once a minute there is no backend available
> > and then some other realms are shown.
> >
> > The brokerd.log shows these messages:
> > [...]
> >
> Hi,
>
> It seems that the LiveSTatus broker crash. Can you start the broker
> in debug module to get the backtrace? Thanks.
>
>
> Jean
>
> > Bye, Ronny

--
Ronny Lindner
Linux System Administrator

Unister GmbH
Barfußgässchen 11 | 04109 Leipzig

Phone: +49 (0)341 49288 4537
ron...@un...
www.unister.de

Authorized Managing Director: Thomas Wagner
Amtsgericht Leipzig, HRB: 19056
|
From: nap <nap...@gm...> - 2011-05-26 14:55:56
|
On Thu, May 26, 2011 at 4:20 PM, Ronny Lindner <ron...@un...> wrote:

> Hi Jean,
>
> I attached a part of the debug log with 2 exceptions. We tested it with
> version 95292d9121b731e1963ed5bcc0126f540c0afec2 .
>
> Bye, Ronny
>
The Status.dat crash caused by the UTF-8 character is fixed :)


Jean
|
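The thread does not show the Status.dat traceback, so the following is only a guess at the class of failure involved. The installation in this thread runs on Python 2.5, where converting a unicode string with non-ASCII characters to a byte string implicitly used the ASCII codec and raised `UnicodeEncodeError`. The Python 3 sketch below makes that step explicit; `encode_for_file` and the sample plugin output are hypothetical.

```python
def encode_for_file(text, encoding="utf-8"):
    """Encode text explicitly instead of relying on an implicit ASCII default."""
    return text.encode(encoding)


# Made-up plugin output containing a non-ASCII character.
plugin_output = "Festplatte läuft voll"

try:
    # Roughly what an implicit ASCII conversion attempts on such text.
    plugin_output.encode("ascii")
    ascii_worked = True
except UnicodeEncodeError:
    ascii_worked = False

encoded = encode_for_file(plugin_output)
print(ascii_worked)
print(encoded.decode("utf-8") == plugin_output)
```

Choosing the encoding explicitly at the point where text is written out avoids depending on whichever default codec the interpreter happens to use.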
From: nap <nap...@gm...> - 2011-05-26 14:28:06
|
On Thu, May 26, 2011 at 4:20 PM, Ronny Lindner <ron...@un...> wrote:

> Hi Jean,
>
> I attached a part of the debug log with 2 exceptions. We tested it with
> version 95292d9121b731e1963ed5bcc0126f540c0afec2 .
>
Thanks. It seems that a log message is not formed as it should be. Can you apply this patch, re-run it, and grep for Warning messages?
Thanks,


diff --git a/shinken/modules/livestatus_broker/livestatus_broker.py b/shinken/modules/livestatus_broker/livestatus_broker.py
index b687259..b17ebc6 100644
--- a/shinken/modules/livestatus_broker/livestatus_broker.py
+++ b/shinken/modules/livestatus_broker/livestatus_broker.py
@@ -567,7 +567,11 @@ class Livestatus_broker(BaseModule):
         if type == 'CURRENT SERVICE STATE':
             logobject = LOGOBJECT_SERVICE
             logclass = LOGCLASS_STATE
-            host_name, service_description, state, state_type, attempt, plugin_output = options.split(';')
+            try:
+                host_name, service_description, state, state_type, attempt, plugin_output = options.split(';')
+            except ValueError:
+                print "WARNING : bad CURRENT SERVICE STATE log : %s" % options
+                return
         elif type == 'INITIAL SERVICE STATE':
             logobject = LOGOBJECT_SERVICE
             logclass = LOGCLASS_STATE


There is another exception in the Status.dat module, but it is probably a UTF-8 character that makes the str() function unhappy. I'll try to reproduce it :)


Jean

> Bye, Ronny
>
|
From: nap <nap...@gm...> - 2011-05-26 14:29:16
|
On Thu, May 26, 2011 at 4:28 PM, nap <nap...@gm...> wrote:

> On Thu, May 26, 2011 at 4:20 PM, Ronny Lindner <ron...@un...> wrote:
>
>> Hi Jean,
>>
>> I attached a part of the debug log with 2 exceptions. We tested it with
>> version 95292d9121b731e1963ed5bcc0126f540c0afec2 .
>>
By the way, bug no. 4 is fixed :) Now a scheduler with no broker produces a warning during the configuration check.


Jean
|
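A check of the kind described for bug no. 4 could look like the sketch below. It is an assumption-laden illustration rather than Shinken's implementation: satellites are plain dictionaries, `realms_without_broker` is a hypothetical function, and any rule by which a broker might also serve other realms is deliberately ignored.

```python
def realms_without_broker(schedulers, brokers):
    """Return a sorted list of realms that have a scheduler but no broker."""
    scheduler_realms = set(scheduler["realm"] for scheduler in schedulers)
    broker_realms = set(broker["realm"] for broker in brokers)
    return sorted(scheduler_realms - broker_realms)


schedulers = [
    {"name": "scheduler-1", "realm": "All"},
    {"name": "scheduler-2", "realm": "test1"},
]
brokers = [{"name": "broker-1", "realm": "All"}]

for realm in realms_without_broker(schedulers, brokers):
    print("Warning : realm %s has a scheduler but no broker" % realm)
```

Reporting this as a warning rather than an error matches the reply earlier in the thread: such a configuration is allowed, but its data will not reach any broker.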
From: Ronny L. <ron...@un...> - 2011-05-26 15:15:31
|
After sending the last email I noticed that the host count in Thruk is now correct (with your livestatus "patch" and without the statusdat patch). So it seems to be a problem with the ValueError you are now catching.

Bye, Ronny

--
Ronny Lindner
Linux System Administrator

Unister GmbH
Barfußgässchen 11 | 04109 Leipzig

Phone: +49 (0)341 49288 4537
ron...@un...
www.unister.de

Authorized Managing Director: Thomas Wagner
Amtsgericht Leipzig, HRB: 19056
|
From: Ronny L. <ron...@un...> - 2011-05-26 15:12:42
|
Hi Jean,

here are some of the messages:
====================
WARNING : bad CURRENT SERVICE STATE log : aaaaaaaaaa;Disk Space;WARNING;HARD;4;DISK WARNING - free space: / 1436 MB (8% inode=82%);

WARNING : bad CURRENT SERVICE STATE log : xxxxxxxxxxxxxx;Memcache Status;OK;HARD;1;MEMCACHE OK - 6 plugins checked, 0 critical, 0 warning, 0 unknown, 6 ok [memcache_aaaaaaaaaaaa perfdata discarded for general error in '0.0.0.0:11111 time=0.000314s;', memcache_bbbbbbbbbbbbbbbbbbb perfdata discarded for general error in '0.0.0.0:22222 time=0.000323s;', memcache_cccccccccccccc perfdata discarded for general error in '0.0.0.0:33333 time=0.000342s;', memcache_ddddddddddd perfdata discarded for general error in '0.0.0.0:44444 time=0.000339s;', memcache_eeeeeeeeeeee perfdata discarded for general error in '0.0.0.0:55555 time=0.000385s;', memcache_fffffffffffffff perfdata discarded for general error in '0.0.0.0:11111 time=0.000345s;']
====================

Bye, Ronny


On Thu, 26 May 2011 16:28:00 +0200, nap <nap...@gm...> wrote:

> On Thu, May 26, 2011 at 4:20 PM, Ronny Lindner
> <ron...@un...> wrote:
>
> > Hi Jean,
> >
> > I attached a part of the debug log with 2 exceptions. We tested it
> > with version 95292d9121b731e1963ed5bcc0126f540c0afec2 .
> >
> Thanks. It seems that a log message is not formed as it should. Can
> you apply this patch, re-run it and grep for Warning messages?
> Thanks,
>
> [...]
>
> There is another exception in the status.dat, but it should be an utf8
> character that make str() function not so happy, I'll try to
> reproduce it :)
>
>
> Jean
>
> > Bye, Ronny

--
Ronny Lindner
Linux System Administrator

Unister GmbH
Barfußgässchen 11 | 04109 Leipzig

Phone: +49 (0)341 49288 4537
ron...@un...
www.unister.de

Authorized Managing Director: Thomas Wagner
Amtsgericht Leipzig, HRB: 19056
|
From: nap <nap...@gm...> - 2011-05-27 12:57:59
|
On Thu, May 26, 2011 at 5:12 PM, Ronny Lindner <ron...@un...> wrote:

> Hi Jean,
>
> here are some of the messages:
> ====================
> [...]
> ====================
>
Hi,

Thanks for the trace. The split returned too many values because the log contains ';'. So the fix was to tell split to produce only the expected number of elements :)


Jean

> Bye, Ronny
>
|
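The effect described here can be reproduced with the first warning line quoted earlier in the thread. The sketch below limits the number of splits with the `maxsplit` argument of `str.split`, so any ';' inside the plugin output stays in the last field. It illustrates the technique only; the code actually committed to the Livestatus module is not shown in this thread, and `parse_service_state` and `FIELD_COUNT` are hypothetical names.

```python
FIELD_COUNT = 6  # host, service, state, state type, attempt, plugin output


def parse_service_state(options):
    """Split a CURRENT SERVICE STATE payload into exactly six fields."""
    # At most FIELD_COUNT - 1 splits: the remainder is the plugin output.
    fields = options.split(";", FIELD_COUNT - 1)
    if len(fields) != FIELD_COUNT:
        raise ValueError("bad CURRENT SERVICE STATE log : %s" % options)
    return fields


line = ("aaaaaaaaaa;Disk Space;WARNING;HARD;4;"
        "DISK WARNING - free space: / 1436 MB (8% inode=82%);")

print(len(line.split(";")))  # unlimited split: one field too many
fields = parse_service_state(line)
print(fields[-1])
```

An unlimited split of this line yields seven values, which cannot be unpacked into six names and raises the `ValueError` that the earlier patch caught; with the split limited to five separators, the trailing ';' remains part of the plugin output.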