Menu

ptpd 2.3.1rc2 100% CPU usage on CentOS 6.5

Help
2014-10-02
2015-06-22
1 2 > >> (Page 1 of 2)
  • Matt Garman

    Matt Garman - 2014-10-02

    Hi, I recently noticed that several of my CentOS 6.5 machines had 100% CPU usage due to ptpd2. (This is completely separate from the "High CPU usage by ptpd and events kernel thread in CentOS 6.5" thread I started several months ago.) This was on the 2.3.0 CentOS 6.4+ RPM downloaded here from SF.

    So I downloaded the 2.3.1rc2 source and built and RPM. I deployed it on several servers last night. Most of them are OK, but a couple already are in 100% CPU usage mode.

    Doing an strace on the ptpd2 process results in this repeating constantly:

    recvmsg(8, 0x7fff03fd9320, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
    select(10, [8 9], NULL, NULL, NULL) = 1 (in [8])
    recvmsg(8, 0x7fff03fd9320, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
    select(10, [8 9], NULL, NULL, NULL) = 1 (in [8])
    recvmsg(8, 0x7fff03fd9320, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
    select(10, [8 9], NULL, NULL, NULL) = 1 (in [8])
    recvmsg(8, 0x7fff03fd9320, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
    select(10, [8 9], NULL, NULL, NULL) = 1 (in [8])

    Any ideas here?

    It seems like I can kill and restart the daemon, and it's OK for a little while. I haven't yet been able to determine any consistent pattern to when the daemon changes from OK to getting stuck in the above loop. A very imprecise observation is that it seems to be most likely to occur on our systems that have the highest network load.

    Does anyone know of a reasonable workaround?

    Thanks!

     
  • Wojciech Owczarek

    Try using libpcap (ptpengine:use_libpcap=y), then try the current Subversion tree and see if it helps.

    I have seen this issue before but it is really hard to track down.

    cheers
    Wojciech

     
  • Matt Garman

    Matt Garman - 2014-10-03

    A couple other observations:

    (1) At the risk of jinxing myself, so far using libpcap seems to help. Thought it's been less than 24 hours, and this issue seems to be non-deterministic as far as I can tell. So it's hard to separate true fix from bug-just-hasn't-hit-yet.

    (2) It looks like in some cases the ptp daemon is able to recover from the "100% cpu" state. I took a survey of all my servers that had ptpd at 100% cpu, but only bounced some of them. I checked again the next day and some that I did not bounce were back to negligible CPU use.

     
  • Wojciech Owczarek

    Matt,

    Interesting - and hopefully this confirms my theory. 2.3.1rc3 is about to be released - once it is, would you be able to help us by running it as previously, that is from source / or RPMs built on your system, and without libpcap? That would be a great help.

    Regards,
    Wojciech

     
    • Matt Garman

      Matt Garman - 2014-10-20

      Hi Wojciech,

      I just deployed 2.3.1rc3 on one machine without libpcap. This machine was one of the worst in terms of getting into the 100% CPU state. I'll monitor it for a few days, and if it's good, I'll keep rolling it out to more systems.

       
  • Wojciech Owczarek

    As a side note, depending on the type of applications you are running, it usually makes sense to use cpusets or a similar mechanism around CPU affinity so that your critical apps run on dedicated cores or sets of cores and "the rest" such as system services say run on core 0. Ptpd itself supports binding to a single CPU core (global:cpuaffinity_cpucore=X). This is usually beneficial for timing performance.

     
  • Matt Garman

    Matt Garman - 2014-10-22

    Monday evening I upgraded to ptpd 2.3.1rc3 on one of the servers that seems to trigger this CPU usage issue the most. It ran overnight without any issues, so I upgraded a couple more yesterday morning.

    Just checking on them now, the originally-upgraded one has never hit 100% CPU. However, one of the other two I upgraded did get into the CPU-hogging state. So for that server I re-enabled libpcap in the ptpd2 config.

     
  • Steve Lunsford

    Steve Lunsford - 2015-05-13

    I've just upgraded to ptpd 2.3.1rc3 and am seeing the same behavior on some of my machines where ptpd2 goes to 100% CPU. I'm seeing this on a RHEL 6.6 server. Has anyone determined a work-around?

     
  • Wojciech Owczarek

    Hi Steve,

    Some questions for you - there is an imminent release coming and it would be great to have this resolved.

    • Can you confirm if switching to libpcap changes this?
    • Is this fully reproducible?
    • How long after startup does this happen?
    • What are the NICs you're using - can I get an "ethtool -i" and "ethtool -T" output for the interfaces?
    • Can you build ptpd with debug (configure --enable-runtime-debug --enable-debug-level=all) and run it with -DDD until it reaches 100% and send us the log file?

    Regards,
    Wojciech

     
  • Steve Lunsford

    Steve Lunsford - 2015-05-14

    I'm running with "--ptpengine:use_libpcap=y" and it still happens. Yes it is reproducible as I'm seeing it consistently. It doesn't happen immediately so I don't know how long it takes to happen. The only thing I can tell you is that it must somewhere < 12 hours.

    Here is the ethtool output:

    eth0:
    driver: e1000e
    version: 2.3.2-k
    firmware-version: 0.13-3
    bus-info: 0000:00:19.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: no

    Time stamping parameters for eth0:
    Capabilities:
    hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
    software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
    hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
    software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
    software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
    hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
    PTP Hardware Clock: 0
    Hardware Transmit Timestamp Modes:
    off (HWTSTAMP_TX_OFF)
    on (HWTSTAMP_TX_ON)
    Hardware Receive Filter Modes:
    none (HWTSTAMP_FILTER_NONE)
    all (HWTSTAMP_FILTER_ALL)
    ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
    ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
    ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
    ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
    ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
    ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
    ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT)
    ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC)
    ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)

    eth1:
    driver: e1000e
    version: 2.3.2-k
    firmware-version: 15.255-15
    bus-info: 0000:03:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: no

    Time stamping parameters for eth1:
    Capabilities:
    hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
    software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
    hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
    software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
    software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
    hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
    PTP Hardware Clock: 1
    Hardware Transmit Timestamp Modes:
    off (HWTSTAMP_TX_OFF)
    on (HWTSTAMP_TX_ON)
    Hardware Receive Filter Modes:
    none (HWTSTAMP_FILTER_NONE)
    all (HWTSTAMP_FILTER_ALL)
    ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
    ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
    ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
    ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
    ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
    ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
    ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT)
    ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC)
    ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)

    I will build the debug version and send you the log file.

     
  • Steve Lunsford

    Steve Lunsford - 2015-05-14

    And I'm using eth0.

     
  • Wojciech Owczarek

    Thanks - this partially confirms my guess. PTPd is trying to use the software transmit timestamps and it fails somewhere. With the logs we should at least be able to see what happened.

    Unfortunately I don't have any of the new Intel PTP NICs to hand - I only have a Solarflare NIC available that does not do software TX timestamping. Worst case we can remove the sw TX timestamp functionality if it proves problematic.

    Can you also do another test for me:

    1. edit src/ptpd.h and add to the end of the file (before the final #endif) :
    #undef SO_TIMESTAMPING
    
    1. edit src/dep/net.c and add the same line before the first function definition (netShutdownMulticastIPv4)
    #undef SO_TIMESTAMPING
    

    If I'm right, you shouldn't be able to reproduce the issue after this change.

    Regards,
    Wojciech

     

    Last edit: Wojciech Owczarek 2015-05-14
  • Kevin

    Kevin - 2015-05-14

    I just wanted to chime in here and say that I'm seeing the same behavior on 2.3.1-rc3, but all my hosts are Ubuntu running either 3.11 or 3.13 kernels, and NOT using an Intel PTP NIC. These have all been using e1000 NICs. Never saw this issue with ptpd 2.1, though I'm aware that 2.1 is much, much older code.

    $ ethtool -i eth0
    driver: e1000
    version: 7.3.21-k8-NAPI
    firmware-version:
    bus-info: 0000:02:01.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: no

    When CPU locks up at 100% on the ptpd process, I see the following output from an strace:

    recvfrom(11, "\t\2\0006\n\0\2\4\0\0\0\0\0\0\0\0\0\0\0\0\0\16\376\377\376\1\0058\0\1\232\21"..., 300, MSG_DONTWAIT, {sa_family=AF_INET, sin_port=htons(320), sin_addr=inet_addr("10.2.111.10")}, [16]) = 54
    select(12, [10 11], NULL, NULL, NULL) = 1 (in [10])
    recvmsg(10, 0x7fff5a855bb0, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
    select(12, [10 11], NULL, NULL, NULL) = 1 (in [10])
    recvmsg(10, 0x7fff5a855bb0, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
    select(12, [10 11], NULL, NULL, NULL) = 1 (in [10])
    recvmsg(10, 0x7fff5a855bb0, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
    select(12, [10 11], NULL, NULL, NULL) = 1 (in [10])
    recvmsg(10, 0x7fff5a855bb0, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
    select(12, [10 11], NULL, NULL, NULL) = 1 (in [10])
    recvmsg(10, 0x7fff5a855bb0, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)

    (repeats on and on)

    Would you suggest the same code change for testing? Anything else I can do to help pin down this bug?

     

    Last edit: Kevin 2015-05-14
    • Wojciech Owczarek

      Kevin,

      Yes, try the same code change. How long after startup does this happen?

      This is less about HW timestamping and more about SW timestamping, what does ethtool -T give you for that NIC?

      So select() is saying that there is data to read in the socket, but recvmsg thinks there isn't. I'm wondering if this could be a driver issue (combined with SO_TIMESTAMPING - which PTPd 2.2 was not using). So in this case select() never waits.

      PTPd's whole event loop relies on select(). Select() exits either when there is data in one of the sockets or when it's interrupted by an event timer tick, otherwise it waits. Recvmsg() is set to non-blocking, so it exits immediately if there is no data.

      Are you able to test with any other NIC?

       
      • Kevin

        Kevin - 2015-05-18

        As for how long after startup: it varies, but usually on the order of a week or several days after startup. Definitely not right away. That's part of what's making this very difficult to pin down -- I have to just sit and wait several days for the situation to occur again.

        Here is the output of ethtool -T on the NIC:

        root@host1:~# ethtool -T eth0
        Time stamping parameters for eth0:
        Capabilities:
        software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
        software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
        software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
        PTP Hardware Clock: none
        Hardware Transmit Timestamp Modes: none
        Hardware Receive Filter Modes: none

        As for a different NIC, these are actually VMWare VMs, so I can try the vmxnet3 driver on another host running original 2.3.1-rc3 code. I'm also curious to see if that makes a difference. Also, for what it's worth, I'm aware that PTP isn't perfect on VMs but these are our test systems for production machines running PTP that are physicals. The high CPU/infinite loop situation didn't happen on VMs running ptp-2.1.0 either.

         
        • Wojciech Owczarek

          Kevin,

          Thanks, this is very helpful info.

          2.1.0 did not use SO_TIMESTAMPING. I implemented it just so those NICs that can support SO_TIMESTAMPING_TX_SOFTWARE can benefit from slightly improved accuracy.

          So you are using virtual e1000e. I checked my main devbox and it turns out it does support PTP on the onboard NICs - they are igb driver.

          I'll put a VM together and start testing. If I can catch the issue I should be able to at least provide a workaround. I suspect that the problem is not strictly with PTPd, it's an issue with software transmit timestamping.

           
          • Kevin

            Kevin - 2015-05-19

            Ok, thanks for looking into this. I haven't seen the issue on my test box since commenting out the SO_TIMESTAMPING. It's been about a week since I made the change so I'm getting more and more confident that it fixed or sidestepped the issue.

             
  • Wojciech Owczarek

    One more comment all - does this happen under any load or just on its own?

     
  • Steve Lunsford

    Steve Lunsford - 2015-05-15

    Hi Wojciech,

    I implemented the changes to the source to turn off the SO_TIMESTAMPING flag and ran overnight. The problem has not shown up yet so looking good so far. I have not been able to get a debug log file since the problem didn't show up while I was collecting it. I ran yesterday before turning off SO_TIMESTAMPING and collected log info but it did not occur within approximately 6 hours.

    Steve

     
  • Steve Lunsford

    Steve Lunsford - 2015-05-18

    I ran all all weekend with SO_TIMESTAMPING disabled and had no issues.

    Steve

     
    • Wojciech Owczarek

      Thank you Steve. We have it pinned down by the looks of it.

       
  • Wojciech Owczarek

    I should be providing a potential fix in the next few days (this fix is not disabling SO_TIMESTAMPING). My theory is that a TX timestamp is late and getTxTimestamp() exits, but the timestamp soon arrives, and yet the only place where we empty MSG_ERRQUEUE is in getTxTimestamp().

     
  • Kevin

    Kevin - 2015-05-28

    Awesome, thank you Wojciech. I look forward to testing the new patch!

     
  • Wojciech Owczarek

    Hi All,

    Can you please try this patch on 2.3.1rc3 and report back?

    What this patch does is it tries to catch a late transmit timestamp and continue on. This should prevent the situation where TX timestamping is not being used anymore but that late timestamp is stuck in the socket error queue and it never gets emptied, causing a loop.

    Thanks,
    Wojciech

     
  • Wojciech Owczarek

    With this patch, ptpd waits 30 milliseconds for the late timestamp to arrive. This is quite crude, probably 1 millisecond would be enough.

    Depending if this works or not, could you also try modifying the usleep() call in gettxtimestamp() in net.c to 1 millisecond (1000) and also test? Well, ideally I would like to try multiple values so that we can at least find out how much delay there is and how often this happens.

    The DBG in this:

    • if(length > 0) {
    • DBG("SO_TIMESTAMPING - delayed TX timestamp caught\n");
    • return TRUE;

    Could also be changed to INFO o we can see how often this is happening.

    Cheers,
    Wojciech

     
1 2 > >> (Page 1 of 2)

Log in to post a comment.