Menu

failed timers render server unusable

Fraxinas
2012-10-08
2013-03-03
  • Fraxinas

    Fraxinas - 2012-10-08

    hi jeff, i've got a timer here which went nuts at the last month change:

            <rule id="niederschlagsmenge_monatswechsel">
    <condition type="timer" trigger="true">
    <at day="01" hour="00" min="00"/>
    </condition>
                    <actionlist>
    <action type="script">
    delta = obj("sensorik:wetter:regenmesser") - obj("sensorik:wetter:regenmesser_monatlich");
    print("niederschlagsmenge_monatswechsel delta=", delta);
    value = 0.29 * delta;
    set("sensorik:wetter:niederschlagsmenge_monatlich", value);
    print("niederschlagsmenge_monatswechsel value=", value);
    </action>
    <action type="copy-value" from="sensorik:wetter:regenmesser" to="sensorik:wetter:regenmesser_monatlich"/>
                    </actionlist>
            </rule>

    it resulted in the logfile being spammed with lines like this
    WARN TimerManager : TimerTask skipped due to clock skew or heavy load. 1301184000
    INFO PeriodicTask : Rescheduled at 2011-3-27 1:0:0 (1301184000)
    until the 24 GB vserver hdd was completely full and the server stalled

    it worked fine during the previous months though. i'd be happy if you could tell me how to make the rule definition more robust. on the other hand, i think it should be programmatically prevented that timers going nuts can so easily break the whole server

    greetings
    fraxinas

     
  • jef2000

    jef2000 - 2012-10-17

    Hi,

    To be able to understand this problem, I would like to know if the date/time "2011-3-27 1:0:0" was the current date/time of the system when the problem occured or if the system clock was set correctly. Another interesting thing to know, is if the "Rescheduled at 2011-3-27 1:0:0" message was repeating endlessly with that same date/time or if the date/time is evolving, and how it's evolving.

    Thanks,

    Jean-François

     
  • Damago

    Damago - 2012-12-17

    This is probably the same problem as I reported: the rules go completely nuts when there is a time change from summer to winter time. I had server run out of disk on 2012-10-28 & the HUGE log file consiting mainly of:

    :::logfile
    20121028 01:56:00,028 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,028 INFO Condition : TimerCondition evaluated as '1'
    20121028 01:56:00,028 INFO Rule : Rule pozna_noc evaluated as 1, prev value was 0
    20121028 01:56:00,029 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,029 INFO Condition : TimerCondition evaluated as '0'
    20121028 01:56:00,029 INFO Rule : Rule pozna_noc evaluated as 0, prev value was 1
    20121028 01:56:00,029 INFO PeriodicTask : Rescheduled at 2012-10-28 1:56:0 (1351382160)
    20121028 01:56:00,029 INFO TimerManager : TimerTask execution. 1351382160
    20121028 01:56:00,029 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,029 INFO Condition : TimerCondition evaluated as '1'
    20121028 01:56:00,030 INFO Rule : Rule pozna_noc evaluated as 1, prev value was 0
    20121028 01:56:00,030 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,030 INFO Condition : TimerCondition evaluated as '0'
    20121028 01:56:00,030 INFO Rule : Rule pozna_noc evaluated as 0, prev value was 1
    20121028 01:56:00,030 INFO PeriodicTask : Rescheduled at 2012-10-28 1:56:0 (1351382160)
    20121028 01:56:00,030 INFO TimerManager : TimerTask execution. 1351382160
    20121028 01:56:00,031 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,031 INFO Condition : TimerCondition evaluated as '1'
    20121028 01:56:00,031 INFO Rule : Rule pozna_noc evaluated as 1, prev value was 0
    20121028 01:56:00,031 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,031 INFO Condition : TimerCondition evaluated as '0'
    20121028 01:56:00,031 INFO Rule : Rule pozna_noc evaluated as 0, prev value was 1
    20121028 01:56:00,032 INFO PeriodicTask : Rescheduled at 2012-10-28 1:56:0 (1351382160)
    20121028 01:56:00,032 INFO TimerManager : TimerTask execution. 1351382160
    20121028 01:56:00,032 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,032 INFO Condition : TimerCondition evaluated as '1'
    20121028 01:56:00,032 INFO Rule : Rule pozna_noc evaluated as 1, prev value was 0
    20121028 01:56:00,032 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,033 INFO Condition : TimerCondition evaluated as '0'
    20121028 01:56:00,033 INFO Rule : Rule pozna_noc evaluated as 0, prev value was 1
    20121028 01:56:00,033 INFO PeriodicTask : Rescheduled at 2012-10-28 1:56:0 (1351382160)
    20121028 01:56:00,033 INFO TimerManager : TimerTask execution. 1351382160
    20121028 01:56:00,033 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,033 INFO Condition : TimerCondition evaluated as '1'
    20121028 01:56:00,033 INFO Rule : Rule pozna_noc evaluated as 1, prev value was 0
    20121028 01:56:00,034 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,034 INFO Condition : TimerCondition evaluated as '0'
    20121028 01:56:00,034 INFO Rule : Rule pozna_noc evaluated as 0, prev value was 1
    20121028 01:56:00,034 INFO PeriodicTask : Rescheduled at 2012-10-28 1:56:0 (1351382160)
    20121028 01:56:00,035 INFO TimerManager : TimerTask execution. 1351382160
    20121028 01:56:00,035 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,035 INFO Condition : TimerCondition evaluated as '1'
    20121028 01:56:00,035 INFO Rule : Rule pozna_noc evaluated as 1, prev value was 0
    20121028 01:56:00,035 INFO Rule : Evaluate rule pozna_noc
    20121028 01:56:00,036 INFO Condition : TimerCondition evaluated as '0'
    20121028 01:56:00,036 INFO Rule : Rule pozna_noc evaluated as 0, prev value was 1
    20121028 01:56:00,036 INFO PeriodicTask : Rescheduled at 2012-10-28 1:56:0 (1351382160)
    20121028 01:56:00,036 INFO TimerManager : TimerTask execution. 1351382160
    ...and so on until disk full

    This happens every year. Or every half year depending on the rules.

    Yours
    Damago2

     
  • Damago

    Damago - 2012-12-17

    ... and the rule in question is:

    :::xml
    <rule id="pozna_noc">
    <condition type="timer" trigger="true">
    <at hour="0" min="56"/>
    </condition>
    <actionlist>
    <action type="set-value" id="elewacja_ganek_sluzbowy" value="off"/>
    and so on.

     
  • jef2000

    jef2000 - 2013-02-21

    Hi,

    I think I fixed this issue on CVS. It is caused by timers set during the first (or last) hour of the day. In this case, daylight saving time switch adjustments add/substract 1 hour and make 00:56 become 23:56 the day before. My code was already dealing with DST by setting back the hour to 00:56, but in your case, also the day switched back from 29 to 28, and my code was not dealing with that.
    In any case, I also added a protection to avoid the timer to re-schedule itself endlessly at the same time (adding an error message in the log and cancelling that timer, but keeping the rest of the system running)
    Hope it'll fix your issue.

    Jean-François

     
    • Damago

      Damago - 2013-02-22

      Thanks, super! That was the major issue when installing the system for administrator-less envirionment. Now I think the system can work for years not months.

      Marcin Gosiewski

      From: jef2000 [mailto:jef2000@users.sf.net]
      Sent: Friday, February 22, 2013 12:53 AM
      To: [linknx:discussion]
      Subject: [linknx:discussion] failed timers render server unusable

      Hi,

      I think I fixed this issue on CVS. It is caused by timers set during the first (or last) hour of the day. In this case, daylight saving time switch adjustments add/substract 1 hour and make 00:56 become 23:56 the day before. My code was already dealing with DST by setting back the hour to 00:56, but in your case, also the day switched back from 29 to 28, and my code was not dealing with that.
      In any case, I also added a protection to avoid the timer to re-schedule itself endlessly at the same time (adding an error message in the log and cancelling that timer, but keeping the rest of the system running)
      Hope it'll fix your issue.

      Jean-François

      failed timers render server unusable https://sourceforge.net/p/linknx/discussion/743669/thread/c061ef9d/?limit=25#e30d


      Sent from sourceforge.net because you indicated interest in https://sourceforge.net/p/linknx/discussion/743669/

      To unsubscribe from further messages, please visit https://sourceforge.net/auth/prefs/

       
    • Damago

      Damago - 2013-02-22

      I am not 100% sure if it will help. Please note, that the official time change is not from 1:00 to 0:00 but from 2:00am to 1:00am. See the log files attached in this thread above:

      20121028 01:56:00,028 INFO Rule : Evaluate rule pozna_noc
      20121028 01:56:00,028 INFO Condition : TimerCondition evaluated as '1'
      20121028 01:56:00,028 INFO Rule : Rule pozna_noc evaluated as 1, prev value was 0
      20121028 01:56:00,029 INFO Rule : Evaluate rule pozna_noc
      20121028 01:56:00,029 INFO Condition : TimerCondition evaluated as '0'
      20121028 01:56:00,029 INFO Rule : Rule pozna_noc evaluated as 0, prev value was 1
      20121028 01:56:00,029 INFO PeriodicTask : Rescheduled at 2012-10-28 1:56:0

      As you can see they all originate from 1:56. So maybe the reason was as you did mention or something else. We will see during nearest datetime change.

      If you give me the clue which source file and more or less which line or procedure contains the fixes maybe I will place some additional debugging in my compile and will be able to narrow down the problem if it occurs during nearest DST change.

      Yours

      Damago

      From: jef2000 [mailto:jef2000@users.sf.net]
      Sent: Friday, February 22, 2013 12:53 AM
      To: [linknx:discussion]
      Subject: [linknx:discussion] failed timers render server unusable

      Hi,

      I think I fixed this issue on CVS. It is caused by timers set during the first (or last) hour of the day. In this case, daylight saving time switch adjustments add/substract 1 hour and make 00:56 become 23:56 the day before. My code was already dealing with DST by setting back the hour to 00:56, but in your case, also the day switched back from 29 to 28, and my code was not dealing with that.
      In any case, I also added a protection to avoid the timer to re-schedule itself endlessly at the same time (adding an error message in the log and cancelling that timer, but keeping the rest of the system running)
      Hope it'll fix your issue.

      Jean-François

      failed timers render server unusable https://sourceforge.net/p/linknx/discussion/743669/thread/c061ef9d/?limit=25#e30d


      Sent from sourceforge.net because you indicated interest in https://sourceforge.net/p/linknx/discussion/743669/

      To unsubscribe from further messages, please visit https://sourceforge.net/auth/prefs/

       
  • jef2000

    jef2000 - 2013-02-23

    Hi,

    I made additional tests based on your suggestions and found additional cases where it can fail.
    I just added new bugfixes to CVS. Now I hope it's really fixed for all cases.

    Jean-François

     
  • Fraxinas

    Fraxinas - 2013-03-03

    Salut Jean-François,

    thanks for the fixes! i've recompiled cvs linknx and reactivated my month change rule. i'll report if any more issues arise.
    btw off-topic: did you get my e-mail from feb 8th about the additional eis types?
    regards
    Andy

     

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.