Menu

Slave falling back to master state.22:47:12.267292 (ptpd info) (slv) HandledelayReq : disreguard delayreq because of wrong SeqNo

Help
varks123
2014-02-27
2014-02-28
  • varks123

    varks123 - 2014-02-27

    22:47:11.017284 (ptpd debug1) (slv) kernel NANO recv time stamp 1393454824s 315894603ns
    22:47:11.017294 (ptpd debug1) (slv) __UTC_offset: 0 0
    22:47:11.017301 (ptpd debug1) (slv) Message header :
    22:47:11.267276 (ptpd debug1) (slv)
    22:47:11.267285 (ptpd debug1) (slv) transportSpecific : 8
    22:47:11.267292 (ptpd debug1) (slv) messageType : 1
    22:47:11.267298 (ptpd debug1) (slv) versionPTP : 2
    22:47:11.267304 (ptpd debug1) (slv) messageLength : 44
    22:47:11.517279 (ptpd debug1) (slv) domainNumber : 0
    22:47:11.517288 (ptpd debug1) (slv) FlagField 02:00
    22:47:11.517296 (ptpd debug1) (slv) Integer 64 :
    22:47:11.517302 (ptpd debug1) (slv) LSB : 0
    22:47:11.767278 (ptpd debug1) (slv) MSB : 0
    22:47:11.767287 (ptpd debug1) (slv) ClockIdentity : 00:30:48:ff:fe:fd:45:88
    22:47:11.767296 (ptpd debug1) (slv) port number : 1
    22:47:11.767302 (ptpd debug1) (slv) sequenceId : 0
    22:47:12.017279 (ptpd debug1) (slv) controlField : 1
    22:47:12.017287 (ptpd debug1) (slv) logMessageInterval : 127
    22:47:12.017294 (ptpd debug1) (slv)
    22:47:12.017301 (ptpd debug1) (slv) ==> DelayReq received
    22:47:12.267276 (ptpd debug1) (slv) delayReq message received :
    22:47:12.267285 (ptpd debug1) (slv) ==> Handle DelayReq (0)
    22:47:12.267292 (ptpd info) (slv) HandledelayReq : disreguard delayreq because of wrong SeqNo
    22:47:12.517277 (ptpd debug2) (slv) TimerUpdate: Timer 1 has now expired. (Re-armed again with interval 14, left 14)
    22:47:12.517287 (ptpd debug2) (slv) TimerUpdate: Timer 3 has now expired. (Re-armed again with interval 192, left 192)
    22:47:12.829275 (ptpd debug2) (slv) timerExpired: Timer 3 expired, taking actions. current interval: 192; current left: 192
    22:47:12.829286 (ptpd debug1) (slv) event ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
    22:47:13.079276 (ptpd debug2) (slv) timerStop: Stopping timer 3. (New interval: 192; New left: 0)
    22:47:13.079286 (ptpd debug2) (slv) timerStop: Stopping timer 1. (New interval: 14; New left: 0)
    22:47:13.329276 (ptpd debug1) (slv) initClock
    22:47:13.329288 (ptpd debug2) (slv) adj is 0; t freq is 0 (float: 0.000000 Integer32: 0)
    22:47:13.329301 (ptpd notice) (slv) state PTP_MASTER
    22:47:13.329307 (ptpd info) (slv) now in state PTP_MASTER
    22:47:13.579278 (ptpd debug2) (slv) timerStart: Set timer 2 to 1.000000. New interval: 16; new left: 16
    22:47:13.579290 (ptpd debug1) (slv) SYNC INTERVAL TIMER : 1.000000
    22:47:13.579298 (ptpd debug2) (slv) timerStart: Set timer 4 to 2.000000. New interval: 32; new left: 32
    22:47:13.829281 (ptpd debug2) (slv) timerStart: Set timer 0 to 2.000000. New interval: 32; new left: 32
    22:47:13.829374 (ptpd debug1) (mst) ns_set_state: Set state 5

    mst22:47:14.079279 (ptpd debug2) (mst) TimerUpdate: Timer 2 has now expired. (Re-armed again with interval 16, left 16)
    22:47:14.079289 (ptpd debug2) (mst) timerExpired: Timer 1 expired, taking actions. current interval: 14; current left: 0
    22:47:14.329277 (ptpd debug2) (mst) event DELAYREQ_INTERVAL_TIMEOUT_EXPIRES
    22:47:14.329286 (ptpd debug1) (mst) ==> Issue DelayReq (3)

    Can some one help me understand this behavior of why the slave node is going back to master ?

     
  • Jan Breuer

    Jan Breuer - 2014-02-27

    What is the version of PTPd you are using?

    If you don't want PTPd to became master anytime, you can run it as slave only.

    What is the master clock in your network, what is its announce interval and what configuration are you using with PTPd?

    Is 00:30:48:fd:45:88 your slave's clock MAC address?

     

    Last edit: Jan Breuer 2014-02-27
    • varks123

      varks123 - 2014-02-27

      Version PTPD 2.2.0 .I am using PTPD on ordinary clocks. I start them as masters with -W switch and want bmc algo to identify master/slave accordingly. While BMC is choosing a slave/master appropriately - the slave node is again going to master node citing wrong seqno

       
      • Jan Breuer

        Jan Breuer - 2014-02-27

        May be it is a bug and it is already solved in newer version.

        Do you have two real PCs and not virtual ones?

         
        • varks123

          varks123 - 2014-02-27

          Yes I have 2 freebsd physical boxes. I also tried to run one node as exclusively slave as you suggested.Still I hit the same issue of wrong sequence number. This time it is transitioning to lstn_reset. If it is a bug , if you can point me to a changelist, i can port those changes.

          01:40:51.011617 (ptpd debug1) (slv) port number : 1
          01:40:51.011624 (ptpd debug1) (slv) sequenceId : 0
          01:40:51.261600 (ptpd debug1) (slv) controlField : 1
          01:40:51.261610 (ptpd debug1) (slv) logMessageInterval : 127
          01:40:51.261617 (ptpd debug1) (slv)
          01:40:51.261624 (ptpd debug1) (slv) ==> DelayReq received
          01:40:51.511599 (ptpd debug1) (slv) delayReq message received :
          01:40:51.511609 (ptpd debug1) (slv) ==> Handle DelayReq (0)
          01:40:51.511616 (ptpd info) (slv) HandledelayReq : disreguard delayreq because of wrong SeqNo
          01:40:51.761599 (ptpd debug2) (slv) TimerUpdate: Timer 1 has now expired. (Re-armed again with interval 14, left 14)
          01:40:51.761609 (ptpd debug2) (slv) TimerUpdate: Timer 3 has now expired. (Re-armed again with interval 192, left 192)
          01:40:52.011599 (ptpd debug2) (slv) timerExpired: Timer 3 expired, taking actions. current interval: 192; current left: 192
          01:40:52.011610 (ptpd debug1) (slv) event ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
          01:40:52.261599 (ptpd debug2) (slv) timerStop: Stopping timer 3. (New interval: 192; New left: 0)
          01:40:52.261610 (ptpd debug2) (slv) timerStop: Stopping timer 1. (New interval: 14; New left: 0)
          01:40:52.261617 (ptpd debug1) (slv) initClock
          01:40:52.574600 (ptpd debug2) (slv) adj is 0; t freq is 0 (float: 0.000000 Integer32: 0)
          01:40:52.574615 (ptpd debug2) (slv) timerStop: Stopping timer 2. (New interval: 0; New left: 0)
          01:40:52.824599 (ptpd debug2) (slv) timerStop: Stopping timer 4. (New interval: 0; New left: 0)
          01:40:52.824609 (ptpd debug2) (slv) timerStop: Stopping timer 0. (New interval: 0; New left: 0)
          01:40:52.824616 (ptpd debug2) (slv) timerStop: Stopping timer 1. (New interval: 14; New left: 0)
          01:40:53.074599 (ptpd debug1) (slv) netRefreshIGMP
          01:40:53.136608 (ptpd info) (slv) refreshed IGMP multicast memberships
          01:40:53.136618 (ptpd debug1) (slv) state PTP_LISTENING
          01:40:53.324601 (ptpd info) (slv) now in state PTP_LISTENING
          01:40:53.324611 (ptpd debug2) (slv) timerStart: Set timer 3 to 12.000000. New interval: 192; new left: 192

          lst01:40:53.386597 (ptpd debug2) (lstn_reset) timerExpired: Timer 1 expired, taking actions. current interval: 14; current left: 0
          01:40:53.574602 (ptpd debug2) (lstn_reset) event DELAYREQ_INTERVAL_TIMEOUT_EXPIRES

          Thanks.

           
  • varks123

    varks123 - 2014-02-27

    I am using PTPD 2.2.0 in E2E mode. The two nodes are connected back to back.

     
  • Jan Breuer

    Jan Breuer - 2014-02-27

    Please try PTPd 2.3.0 on both nodes and tell me if problem persists.

     
    • varks123

      varks123 - 2014-02-27

      Hi Jan ,
      I recently ported my codebase from PTP version 1 to PTP version 2.2.0. I cannot move to 2.3.0 easily. Could you please explain me what could have caused such behavior. Is PTP 2.2.0 not stable ?

       
      • Jan Breuer

        Jan Breuer - 2014-02-27

        You are using modified PTPd? We can't support modified versions because we don't know what you change.

        PTP 2.2.0 was stable, but "stable" != "bug free". Now it is suppressed by 2.3.

        You can try to upgrade at least from 2.2.0 to 2.2.2.

         
  • Wojciech Owczarek

    This is just a guess, but regarding the switch from slave to master - are you running PTPd in slave-only mode? If not, then it will go from slave to master if it can't see announce coming in from its current master. That is the default state machine of the full IEEE 1588 ordinary clock implementation. In the slave-only version, the slave will go into listening instead.

     
    • varks123

      varks123 - 2014-02-28

      Hi Wojciech,

      This is just a guess, but regarding the switch from slave to master - are you running PTPd in slave-only mode?
      No I am running both the nodes with -W switch. After that, BMC chooses one asmaster and slave. After a while slave comes back to master state.
      22:47:12.017301 (ptpd debug1) (slv) ==> DelayReq received
      22:47:12.267276 (ptpd debug1) (slv) delayReq message received :
      22:47:12.267285 (ptpd debug1) (slv) ==> Handle DelayReq (0)
      22:47:12.267292 (ptpd info) (slv) HandledelayReq : disreguard delayreq because of wrong SeqNo
      22:47:12.517277 (ptpd debug2) (slv) TimerUpdate: Timer 1 has now expired. (Re-armed again with interval 14, left 14)
      22:47:12.517287 (ptpd debug2) (slv) TimerUpdate: Timer 3 has now expired. (Re-armed again with interval 192, left 192)
      22:47:12.829275 (ptpd debug2) (slv) timerExpired: Timer 3 expired, taking actions. current interval: 192; current left: 192
      22:47:12.829286 (ptpd debug1) (slv) event ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
      22:47:13.079276 (ptpd debug2) (slv) timerStop: Stopping timer 3. (New interval: 192; New left: 0)
      22:47:13.079286 (ptpd debug2) (slv) timerStop: Stopping timer 1. (New interval: 14; New left: 0)
      22:47:13.329276 (ptpd debug1) (slv) initClock
      22:47:13.329288 (ptpd debug2) (slv) adj is 0; t freq is 0 (float: 0.000000 Integer32: 0)
      22:47:13.329301 (ptpd notice) (slv) state PTP_MASTER
      22:47:13.329307 (ptpd info) (slv) now in state PTP_MASTER

      Every time , I see this "22:47:12.267292 (ptpd info) (slv) HandledelayReq : disreguard delayreq because of wrong SeqNo" there is a transition immediately after ANNOUNCE_INTERVAL_TIMEOUT. I understand that announce message has not been recieved and hence transition , but am not sure why this is happening. It will be helpful if you can throw some light on this.

       
      • Jan Breuer

        Jan Breuer - 2014-02-28

        In 2.2.0 is bug solved before 2 years by Wojciech in revisions r256, r255, r228 caused by sentDelayReqSequenceId rollover. This probably caused "disreguard delayreq because of wrong SeqNo" after 32768 delay requests.

        You should modify protocol.c and exhange all test of sentDelayReqSequenceId from

        ((ptpClock->sentDelayReqSequenceId - 1) != header->sequenceId)
        

        to

        ((UInteger16)(header->sequenceId + 1)) 
          != ptpClock->sentDelayReqSequenceId)
        

        Maybe there is similar rollover bug of ANNOUNCE_INTERVAL_TIMER and ANNOUNCE_RECEIPT_TIMEOUT but I don't know if it was already reported and solved.

        So, please use NEWEST STABLE (2.3.0) version of PTPd and try it without any custom modifications. If problem persists, please fill a bug report with detailed information how to reproduce this bug (OS, configuration of both nodes, how long to wait, ...).

         
        • varks123

          varks123 - 2014-02-28

          Hi Jan,
          It would be difficult to port to 2.3.0 at this stage.Is 2.2.2 fine ? I am not interested ntp engines and other stuff which are part of 2.3.0 . Does upgrading to 2.2.2 be helpful? My purpose will be served if ptp could sync time appropriately.

           
          • Jan Breuer

            Jan Breuer - 2014-02-28

            sentDelayReqSequenceId rollover is not solved in 2.2.2.

            You told that you have two FreeBSD machines so no need to port anything. You can just compile and run current version without modifications and then try to trigger this bug. If you can trigger it, you can send us complete description of how to trigger it and we can solve it in newest wersion. You can backport this solution to 2.2.0 after that.

             
          • Wojciech Owczarek

            Like Jan says, no need to port anything, some of our core developers (including me now) work on FreeBSD and this has always been a fully supported platform.

            As to upgrading to 2.3.0: 2.3.0 is not only a feature upgrade. 2.3.0 fixes a huge amount of bugs found in previous versions. The NTP stuff has no external dependencies and doesn't need any libraries. The only thing extra you need to run 2.3.0 are libpcap headers which you have as long as you have libpcap installed.

             
  • Wojciech Owczarek

    I see that there is a ANNOUNCE_RECEIPT_TIMEOUT event just before the switch, so I think this is not a bug, rather a case of setting it to slave only mode. I think 2.2.0 supports it.

     

Log in to post a comment.