ptpd on ARM: network problems

Help
Gertjan
2007-03-29
2012-11-23
  • Gertjan

    Gertjan - 2007-03-29

    Hi,

    We are exploring using an AT91 ARM processor for embedded systems. We require time synchronization for data time stamping, and I was pleased to find the Open Ptpd project. I have run it successfully on a small cluster of i386 Linux machines.

    I have cross-compiled ptpd (for ARM Linux 2.6.19 from TimeSys) without any problems. It starts well (see the log output below) until it hangs at the SYNC_RECEIPT_TIMEOUT_EXPIRES event.

    The problem probably isn't with ptpd at all, but I am hoping someone can offer hints to debug it. When I run tcpdump on a different (i386) node, that node sees the multi-cast messages from the ARM. When I run ptpd on i386 and tcpdump on the ARM, the ARM's eth0 never sees any of the port 319/320 packets. At first I thought something was wrong with receiving multi-cast messages on the ARM, so I looked at the kernel config options etc. Then I tested with a Tcl script that does UDP broadcast from the i386, and the ARM sees those packets fine. So what's the problem? I have a feeling it has something to do with

    #define DEFAULT_PTP_DOMAIN_ADDRESS  "224.0.1.129"

    I have not touched this line in constants_dep.h (I haven't needed to before to get it all running on my 192.168.0.X network). I am not entirely sure how the multi-cast grouping works. Am I on the right track? Are there any other tests that I could run?

    Much appreciated

    Gertjan

    [root@localhost src]# ./ptpd -p
    (debug) allocated 1072 bytes for protocol engine data
    (debug) allocated 600 bytes for foreign master data
    (debug) event POWERUP
    (debug) state PTP_INITIALIZING
    (debug) manufacturerIdentity: Kendall;1b3
    (debug) netInit
    (debug) initData
    (debug) initTimer
    (debug) initClock
    (debug) sync message interval: 2
    (debug) clock identifier: DFLT
    (debug) 256*log2(clock variance): -4000
    (debug) clock stratum: 4
    (debug) clock preferred?: yes
    (debug) bound interface name: eth0
    (debug) communication technology: 1
    (debug) uuid: 02:03:04:05:06:07
    (debug) PTP subdomain name: _DFLT
    (debug) subdomain address: e0.0.1.81
    (debug) event port address: 27 5
    (debug) general port address: 28 5
    (debug) state PTP_LISTENING
    (debug) event SYNC_RECEIPT_TIMEOUT_EXPIRES
    (debug) state PTP_MASTER
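
A side note on reading the log above: the `subdomain address` line prints each byte of the address in hexadecimal, so `e0.0.1.81` is the same address as the `224.0.1.129` in DEFAULT_PTP_DOMAIN_ADDRESS. A small Python sketch that decodes it (the helper name `decode_hex_dotted` is made up for illustration and is not part of ptpd):

```python
import ipaddress

def decode_hex_dotted(addr: str) -> str:
    """Convert a hex-per-byte dotted address (e.g. 'e0.0.1.81') to dotted decimal."""
    return ".".join(str(int(part, 16)) for part in addr.split("."))

decoded = decode_hex_dotted("e0.0.1.81")
print(decoded)                                     # 224.0.1.129
print(ipaddress.ip_address(decoded).is_multicast)  # True
```

So the ARM build is at least configured with the expected multi-cast group.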

     
    • kendall

      kendall - 2007-03-29

      Is the above output from the x86 or the ARM machine? Either way, what's the result when the x86 is run as preferred and the ARM not preferred, and vice versa? If both machines become masters, then one must not be seeing the other. To find out if multi-casting is the problem, you can run both ends uni-cast with the '-u' option.
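
To separate "multi-cast reception is broken on this host" from "ptpd is misbehaving", one option is a tiny stand-alone receiver that joins the PTP group and waits for a single packet. The following is a minimal sketch in Python, assuming Python is available on the node under test; the function names are hypothetical, and the group and port are the values mentioned in this thread (224.0.1.129, port 320).

```python
import socket

PTP_GROUP = "224.0.1.129"   # DEFAULT_PTP_DOMAIN_ADDRESS from constants_dep.h
PTP_GENERAL_PORT = 320      # privileged port: needs root, and ptpd must not be running locally

def membership_request(group: str, local_ip: str = "0.0.0.0") -> bytes:
    """Pack a struct ip_mreq: the group address followed by the local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(local_ip)

def listen_once(group=PTP_GROUP, port=PTP_GENERAL_PORT, local_ip="0.0.0.0", timeout=10.0):
    """Join the group, wait for one datagram, and return (sender_ip, length) or None on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        # Ask the kernel to join the multi-cast group on the chosen interface.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(group, local_ip))
        sock.settimeout(timeout)
        try:
            data, sender = sock.recvfrom(2048)
        except socket.timeout:
            return None
        return sender[0], len(data)
    finally:
        sock.close()
```

With ptpd running as master on another machine, calling `listen_once()` on the suspect node (optionally with `local_ip` set to that node's eth0 address) should return the master's address. If it returns None on the ARM but works on an i386 node, the problem is below ptpd.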

       
    • Gertjan

      Gertjan - 2007-03-29

      Kendall,

      Thanks for your prompt response. The above output is from the ARM. When the x86 is run as preferred and the ARM as slave, or vice versa, I get the same result: both try to become master. The -u option is a great idea.

      However, I am confused. I first tried both methods again between two (working) i386 machines:

      Running between two i386 machines in multi-cast: master started with ptpd -p, slave (192.168.0.54) started with ptpd -g -d.

      Slave output:

      (debug) state PTP_PTP_SLAVE
      (debug) initClock
      (debug) Q = 0, R = 6
      (debug) handleFollowUp: unwanted
      (debug) offset from master:               0 (s)     1781000 (ns)
      (debug) observed drift:       1781 (ns)
      (debug) offset from master:               0 (s)     3415500 (ns)  etc etc.

      As expected. Looks good, I think.

      But running the master with ptpd -p -u 192.168.0.54 gives this output on the slave:
      (debug) updateForeign: new record (0,1) 1 1 00:10:b5:48:18:7d
      (debug) state PTP_PTP_SLAVE
      (debug) initClock
      (debug) Q = 0, R = 6
      (debug) Q = 0, R = 16
      (debug) handleDelayReq: self
      (debug) Q = 0, R = 9
      (debug) handleDelayReq: self
      (debug) Q = 0, R = 14

      It doesn't look like the clock is updating. Yet all that should be different is that I am only sending to the slave host, correct?

      The ARM gives the same response as the slave above. So now I have two puzzles:

      1. Why is the uni-cast output different on the slave?
      2. Somehow my ARM isn't responding to multi-cast. But I get no errors from your setsockopt calls and CONFIG_IP_MULTICAST is on.

      Another thing I do not understand is that a ping to 224.0.0.1 on my subnet (from any node) does not return anything. As I understand it, all multi-cast capable hosts should respond. Yet ptpd works fine (except on the ARM node, of course).

      Afterthought: I suspect the slave should run with ./ptpd -g -u 192.168.0.51 (i.e. use the address of the master host). If I do that, the slave output (whether ARM or i386) is:

      (debug) general port address: 40 1
      (debug) state PTP_LISTENING
      ptpd!: short or truncated cmsghdr!
      : Resource temporarily unavailable
      ptpd!: short or truncated cmsghdr!
      : Resource temporarily unavailable
      ptpd!: short or truncated cmsghdr!
      : Resource temporarily unavailable

      Can you explain what this means?

      Thanks again

      Gertjan
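
On the second line of each error pair: "Resource temporarily unavailable" is the standard Linux message text for EAGAIN, the error a non-blocking receive reports when no data is waiting. Whether ptpd reaches its message through exactly this path is an assumption here; the log alone does not confirm it. A minimal sketch of how that errno arises on an idle non-blocking UDP socket (the function name is hypothetical):

```python
import errno
import os
import socket

def idle_nonblocking_recv_errno() -> int:
    """Return the errno raised by recv() on a non-blocking UDP socket with nothing queued."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind(("127.0.0.1", 0))   # any free local port; nothing is ever sent to it
        sock.setblocking(False)
        try:
            sock.recv(1024)
        except BlockingIOError as exc:
            return exc.errno
        return 0                      # only reached if a datagram unexpectedly arrived
    finally:
        sock.close()

code = idle_nonblocking_recv_errno()
print(errno.errorcode[code], "-", os.strerror(code))
```

Read this way, the second line would only say that no packet was available at that moment, which leaves the "short or truncated cmsghdr" line as the part that needs explaining.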

       
    • Gertjan

      Gertjan - 2007-03-29

      FYI: I am using v1b5, 3 December 2006. Thanks. GH

       
    • kendall

      kendall - 2007-04-01

      I think uni-cast operation is broken, sorry. See this thread:

      http://sourceforge.net/forum/forum.php?thread_id=1690851&forum_id=469208

       
    • Gertjan

      Gertjan - 2007-04-03

      Thanks. I downloaded b4, which seems to run. It then tries to synchronize but fails miserably (never converges). I have a feeling that this is due to the gettimeofday function and has nothing to do with ptpd. Of course, I haven't solved my multi-cast problem either.

      Thanks

      Gertjan
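
The gettimeofday suspicion can be checked by measuring how finely the wall clock actually advances on the target. The sketch below is only a rough illustration: it uses Python's `time.time()` as a stand-in for gettimeofday (an assumption; on the real board a C equivalent calling gettimeofday directly would be the faithful test), and the function name is made up.

```python
import time

def smallest_clock_step(samples: int = 200) -> float:
    """Return the smallest positive increment seen between wall-clock reads, in seconds."""
    smallest = float("inf")
    for _ in range(samples):
        start = time.time()
        now = time.time()
        while now == start:           # spin until the clock value visibly changes
            now = time.time()
        if now > start:               # ignore backward steps (e.g. the clock being adjusted)
            smallest = min(smallest, now - start)
    return smallest

step = smallest_clock_step()
print("smallest observed clock step: %.9f s" % step)
```

A step in the millisecond range rather than the microsecond range would be consistent with the poor convergence described above, though it would not prove that the clock source is the cause.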

       
    • Gertjan

      Gertjan - 2007-04-04

      This might be a dumb question, but there is no way to use broadcast rather than multi-cast, is that right?

      Thanks

      Gertjan

       
    • kendall

      kendall - 2007-04-05

      The 1588 (version 1) spec defines the protocol only for udp/ip over ethernet, and only on a specific multi-cast domain. I added uni-cast to help with development and testing, but it's not technically spec compliant. You might be able to modify it for broadcast as well, but I haven't looked into it.

      If multi-cast is not working, it's probably a bug in the ethernet driver. What ethernet controller does your platform use, and on what kernel?
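
Before blaming the driver, it may help to check whether the kernel registered the group join at all. On Linux with IP multi-cast enabled, /proc/net/igmp lists the groups joined per interface, with each group printed as eight hex digits. The sketch below parses that text; it assumes a little-endian host (true for i386, and assumed here for the ARM board), the function name is hypothetical, and SAMPLE is a made-up illustration of the usual layout, not output captured from the machines in this thread.

```python
import socket
import struct

def joined_groups(igmp_text: str) -> dict:
    """Map interface name -> list of joined IPv4 groups, given the text of /proc/net/igmp."""
    groups = {}
    device = None
    for line in igmp_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if ":" in tokens and tokens[0].isdigit():
            # Interface line, e.g. "2  eth0  :  2  V3"
            device = tokens[1]
            groups[device] = []
        elif device is not None and len(tokens[0]) == 8:
            try:
                raw = int(tokens[0], 16)
            except ValueError:
                continue
            # Assumption: little-endian host, so the hex digits appear byte-reversed.
            groups[device].append(socket.inet_ntoa(struct.pack("<I", raw)))
    return groups

# Hypothetical sample in the usual /proc/net/igmp layout.
SAMPLE = """Idx\tDevice    : Count Querier\tGroup    Users Timer\tReporter
2\teth0      :     2      V3
\t\t\t\t810100E0     1 0:00000000\t\t0
\t\t\t\t010000E0     1 0:00000000\t\t0
"""
print(joined_groups(SAMPLE))   # {'eth0': ['224.0.1.129', '224.0.0.1']}
```

On the target, with ptpd running, `joined_groups(open("/proc/net/igmp").read())` should show 224.0.1.129 under eth0. If the group is listed but tcpdump still sees no port 319/320 traffic, that would point more toward the controller's multi-cast filtering or the driver than toward ptpd.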

       
