linuxptp-users Mailing List for linuxptp (Page 160)
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: Stephan G. <st...@ga...> - 2012-05-30 16:35:47
|
Hello Richard,

I'm a colleague of Mario, and we found the reason for the strange data in the skb.

> Wow, that looks really wrong. I don't see a MAC address anywhere.

The problem lies in the file linux/drivers/net/ethernet/freescale/fec_mpc52xx.c. In the receive interrupt function mpc52xx_fec_rx_interrupt, the call to skb_defer_rx_timestamp is made with a freshly allocated skb (called skb) and not with the received skb (called rskb). Calling skb_defer_rx_timestamp with the correct parameter engages the whole engine.

We would be happy to make a patch for this fix. Do you think the powerpc mailing list is more suitable than linux-netdev?

Regards,
Stephan
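The effect described in this message can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct mock_skb` and `mock_classify()` are hypothetical stand-ins for `struct sk_buff` and `classify()`, and the model only looks at the EtherType of an L2 PTP frame (0x88F7), whereas the real classifier runs a socket filter and also recognizes PTP over UDP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN_MODEL 14      /* DST MAC (6) + SRC MAC (6) + EtherType (2) */
#define ETH_P_1588_MODEL 0x88F7 /* EtherType used by PTP over Ethernet */

/* Hypothetical stand-in for struct sk_buff: just a data pointer and a length. */
struct mock_skb {
    const uint8_t *data; /* must point at the DST MAC address */
    size_t len;
};

/* Simplified stand-in for classify(): 1 for an L2 PTP frame, 0 otherwise. */
static int mock_classify(const struct mock_skb *skb)
{
    uint16_t ethertype;

    assert(skb != NULL && skb->data != NULL);
    if (skb->len < ETH_HLEN_MODEL)
        return 0;
    ethertype = (uint16_t)((skb->data[12] << 8) | skb->data[13]);
    return ethertype == ETH_P_1588_MODEL;
}
```

In this model, a buffer holding the received frame (the role of rskb) is classified as PTP, while a freshly allocated buffer that does not yet hold a frame (the role of skb) is not, so no time stamp would ever be requested for it.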
From: Richard C. <ric...@gm...> - 2012-05-30 05:00:22
|
On Tue, May 29, 2012 at 08:54:36PM +0200, Mario Molitor wrote:
>
> I have modified my instrumentation which you can find as attached
> message (messages.zip)
> log (syslog).
> In case of type (in the function "skb_defer_rx_timestamp()") with a value
> 0 it makes printk hex output of skb->data.
> I hope this help you. Please let me let me know if you need more
> information.

Wow, that looks really wrong. I don't see a MAC address anywhere.

It is essential that the function, classify(), in skb_defer_rx_timestamp is passed a skb with skb->data pointing to the DST MAC address.

Can you show the patch you are using?

Thanks,
Richard
From: Mario M. <Mar...@we...> - 2012-05-29 18:54:42
|
Hi Richard,

>> >(check the conditions within classify() and the returned type)
>>
>> It seems that classify (better the function sk_run_filter) don't detected
>> any PTP event message. Tomorrow I will deeper TRACE this.

> It would help to see a hex dump of skb->data from a PTP message packet
> when sk_run_filter fails to detect it.

I have modified my instrumentation. You can find the results attached: message (messages.zip), log (syslog). When the type (in the function "skb_defer_rx_timestamp()") has the value 0, it prints a hex dump of skb->data with printk.

I hope this helps you. Please let me know if you need more information.

Thanks,
Mario
From: Richard C. <ric...@gm...> - 2012-05-28 14:01:17
|
On Mon, May 28, 2012 at 01:33:54PM +0200, Mario Molitor wrote:
>
> I have another question to use always layer 2 for status frame.
> Is this always necessary?
> Because I think UDP/IPv4 status frames should also possible.
>
> I could change this for our tests. What do you think about this?

I am not sure what you mean. The PHY status frames are L2, but this has nothing to do with the actual PTP messages, which can be L2 or L4 in any case.

HTH,
Richard
From: Richard C. <ric...@gm...> - 2012-05-28 13:57:58
|
On Mon, May 28, 2012 at 01:26:12PM +0200, Mario Molitor wrote:
> >(check the conditions within classify() and the returned type)
>
> It seems that classify (better the function sk_run_filter) don't detected
> any PTP event message. Tomorrow I will deeper TRACE this.

It would help to see a hex dump of skb->data from a PTP message packet when sk_run_filter fails to detect it.

Thanks,
Richard
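For readers who want to produce such a dump, the formatting step can be sketched in user space as follows. `hex_line()` is a hypothetical helper written for this illustration; inside the kernel one would more likely use printk together with an existing helper such as print_hex_dump(), and the number of bytes to dump is an assumption (the first 14 bytes are enough to see the two MAC addresses and the EtherType).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Formats up to `len` bytes of `data` as space-separated hex into `out`.
 * Stops early if `out` is too small. Returns the number of bytes formatted.
 */
static size_t hex_line(const uint8_t *data, size_t len, char *out, size_t out_size)
{
    size_t i, pos = 0;

    assert(data != NULL && out != NULL && out_size > 0);
    memset(out, 0, out_size);
    for (i = 0; i < len; i++) {
        /* Two hex digits, a separating space after the first byte, and a NUL. */
        size_t need = (i ? 3 : 2) + 1;

        if (pos + need > out_size)
            break;
        pos += (size_t)snprintf(out + pos, out_size - pos,
                                i ? " %02x" : "%02x", data[i]);
    }
    return i;
}
```

If skb->data points where classify() expects it, the first six bytes of the dump should be the DST MAC address of the frame.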
From: Mario M. <Mar...@we...> - 2012-05-28 11:34:00
|
Hi Richard,

>> >I have never seen the combination MPC5200/DP83640. Is this a
>> >commercial board or a custom design?
>>
>> It is custom design. We design a module for our data acquisition systems.
>
> Here are two ideas to think about.
>
> 1. Perhaps the PHY device driver instance is not properly associated with
> the FEC driver instance. Have you checked your DTS?

We have checked this and it looks OK, but we can't rule out a problem. We will verify this again tomorrow.

> 2. The PHY sends the time stamps to the MAC using special "status"
> frames. Perhaps your MAC is dropping these frames? They are sent
> with a DST MAC address like this:
>
> static u8 status_frame_dst[6] = { 0x01, 0x1B, 0x19, 0x00, 0x00, 0x00 };

I could verify this with tcpdump (on the module) and with our probe (http://www.beckhoff.de/default.asp?ethercat/et2000.htm) over Wireshark. What do you think about this?

I have another question about always using layer 2 for the status frames. Is this always necessary? I think UDP/IPv4 status frames should also be possible.

I could change this for our tests. What do you think about this?

Thanks,
Mario
From: Mario M. <Mar...@we...> - 2012-05-28 11:26:33
|
Hi Richard,

>> Why the software time stamping work and only the hardware time stamping
>> has
>> this problem? This point has me irritated.
>
> SW and HW time stamping are completely different code paths and are
> not really connected to each other.

Ok.

>> The other point we don't have connected the PTP server directly with our
>> PTP
>> slave module and it is some Ethernet switches involved in Ethernet
>> connection. Could this make a problem?
>
> I doubt it.

Ok fine.

>> >Did you enable CONFIG_DP83640_PHY too?
>>
>> Yes we have also used this option.
>
> Good. Can you post your dmesg?

Yes, I have attached the dmesg log to this email.

>> I have adapt the code of PHY driver (net/phy/dp83640.c) with printk and I
>> have seen the probe function is executed. I can say the kernel driver is
>> compiled and loaded.
>
> If you want to trace what is happening, you can add printks along the
> following path.
>
> *** drivers/net/ethernet/freescale/fec_mpc52xx.c:
> mpc52xx_fec_rx_interrupt()
>
>         if (!skb_defer_rx_timestamp(skb))
>                 netif_rx(rskb);
>
> *** net/core/timestamping.c: skb_defer_rx_timestamp[104]
>
>         type = classify(skb);
>
> (check the conditions within classify() and the returned type)

It seems that classify() (or rather the function sk_run_filter) does not detect any PTP event message. Tomorrow I will trace this more deeply.

>         ...
>         phydev = skb->dev->phydev;
>         if (likely(phydev->drv->rxtstamp))
>                 return phydev->drv->rxtstamp(phydev, skb, type);
>
> *** drivers/net/phy/dp83640.c: dp83640_rxtstamp[1172]

Best regards,
Mario
From: Mario M. <Mar...@we...> - 2012-05-28 11:25:24
|
Hi Richard,

>> I have a question to phy driver dp83640.c only for interest. I have seen
>> in
>> probe function that recalibrate(clock) function not always executed
>> directly
>> after registration. What is the reason for this?
>
> The driver supports using multiple PHYs together as one PHC
> clock. Calibration is needed to synchronize multiple PHYs, but only
> when you have two or more PHYs on the same MDIO bus.
>
> Do you have more than one PHY in your design?

No, we use only one PHY in our design.

>
> If so, their GPIOs need to be wired together correctly in order to
> make this work.
>
> If not, then the recalibrate() function should never be called.

OK, this is also my observation.

Thanks for the info,
Mario
From: Richard C. <ric...@gm...> - 2012-05-23 15:45:43
|
On Wed, May 23, 2012 at 04:54:42PM +0200, Mario Molitor wrote:
> >I have never seen the combination MPC5200/DP83640. Is this a
> >commercial board or a custom design?
>
> It is custom design. We design a module for our data acquisition systems.

Here are two ideas to think about.

1. Perhaps the PHY device driver instance is not properly associated with the FEC driver instance. Have you checked your DTS?

2. The PHY sends the time stamps to the MAC using special "status" frames. Perhaps your MAC is dropping these frames? They are sent with a DST MAC address like this:

   static u8 status_frame_dst[6] = { 0x01, 0x1B, 0x19, 0x00, 0x00, 0x00 };

HTH,
Richard
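To check whether frames with this destination address reach the MAC at all, one could compare the first six bytes of each captured frame against the address quoted above. The following is a minimal user-space sketch; `has_status_frame_dst()` is a hypothetical helper, and the real driver may well check more than the destination address. Note that 01:1B:19:00:00:00 is also the multicast address used by ordinary PTP messages over Ethernet, so a match alone does not prove that a frame is a status frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* DST MAC address quoted in the message above. */
static const uint8_t status_frame_dst[6] = { 0x01, 0x1B, 0x19, 0x00, 0x00, 0x00 };

/* Returns 1 if the frame starts with the status frame DST MAC, 0 otherwise. */
static int has_status_frame_dst(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < sizeof(status_frame_dst))
        return 0; /* too short to hold a destination address */
    return memcmp(frame, status_frame_dst, sizeof(status_frame_dst)) == 0;
}
```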
From: Richard C. <ric...@gm...> - 2012-05-23 15:29:58
|
On Wed, May 23, 2012 at 04:54:42PM +0200, Mario Molitor wrote:
>
> Why the software time stamping work and only the hardware time stamping has
> this problem? This point has me irritated.

SW and HW time stamping are completely different code paths and are not really connected to each other.

> The other point we don't have connected the PTP server directly with our PTP
> slave module and it is some Ethernet switches involved in Ethernet
> connection. Could this make a problem?

I doubt it.

> >Did you enable CONFIG_DP83640_PHY too?
>
> Yes we have also used this option.

Good. Can you post your dmesg?

> I have adapt the code of PHY driver (net/phy/dp83640.c) with printk and I
> have seen the probe function is executed. I can say the kernel driver is
> compiled and loaded.

If you want to trace what is happening, you can add printks along the following path.

*** drivers/net/ethernet/freescale/fec_mpc52xx.c: mpc52xx_fec_rx_interrupt()

        if (!skb_defer_rx_timestamp(skb))
                netif_rx(rskb);

*** net/core/timestamping.c: skb_defer_rx_timestamp[104]

        type = classify(skb);

        (check the conditions within classify() and the returned type)

        ...
        phydev = skb->dev->phydev;
        if (likely(phydev->drv->rxtstamp))
                return phydev->drv->rxtstamp(phydev, skb, type);

*** drivers/net/phy/dp83640.c: dp83640_rxtstamp[1172]

(If you can trace a sync message this far, then that will already help narrow down the problem.)

Thanks,
Richard
From: Richard C. <ric...@gm...> - 2012-05-23 15:15:19
|
On Wed, May 23, 2012 at 04:54:42PM +0200, Mario Molitor wrote:
> I have a question to phy driver dp83640.c only for interest. I have seen in
> probe function that recalibrate(clock) function not always executed directly
> after registration. What is the reason for this?

The driver supports using multiple PHYs together as one PHC clock. Calibration is needed to synchronize multiple PHYs, but only when you have two or more PHYs on the same MDIO bus.

Do you have more than one PHY in your design?

If so, their GPIOs need to be wired together correctly in order to make this work.

If not, then the recalibrate() function should never be called.

HTH,
Richard
From: Mario M. <Mar...@we...> - 2012-05-23 14:54:52
|
Hello Richard,

many thanks for your quick answer.

>(I think you mean MPC)

Yes, it was a typo.

> I have never seen the combination MPC5200/DP83640. Is this a
> commercial board or a custom design?

It is a custom design. We are designing a module for our data acquisition systems.

>Looks like no packets are being time stamped at all.

Why does software time stamping work while only hardware time stamping has this problem? This point puzzles me.

Another point: our PTP slave module is not connected directly to the PTP server; some Ethernet switches are involved in the connection. Could this cause a problem?

>(I think you mean PTP_1588_CLOCK)

Yes, it was a typo.

>Did you enable CONFIG_DP83640_PHY too?

Yes, we have also enabled this option.

>Don't worry, we will figure this out...

Thanks. I appreciate your help. :)

>The ptp4l should work with every PHC Linux driver.

OK, fine. I am happy to hear that.

>This might be a driver issue with the MPC5200, since I never tested it
>in combination with the DP83640. But first, make sure you have the PHY
>driver compiled in your kernel.

I have instrumented the PHY driver (net/phy/dp83640.c) with printk and have seen that the probe function is executed. So I can say the kernel driver is compiled and loaded.

I have a question about the PHY driver dp83640.c, purely out of interest. I have seen that the probe function does not always call recalibrate(clock) directly after registration. What is the reason for this?

I mean the following code sequence:
=========================================
        if (choose_this_phy(clock, phydev)) {
                clock->chosen = dp83640;
                clock->ptp_clock = ptp_clock_register(&clock->caps);
                if (IS_ERR(clock->ptp_clock)) {
                        err = PTR_ERR(clock->ptp_clock);
                        goto no_register;
                }
        } else
                list_add_tail(&dp83640->list, &clock->phylist);

        if (clock->chosen && !list_empty(&clock->phylist))
                recalibrate(clock);
========================================

Kind regards and thanks again for your help,
Mario

----- Original Message -----
From: "Richard Cochran" <ric...@gm...>
To: "Mario Molitor" <Mar...@we...>
Cc: <lin...@li...>
Sent: Tuesday, May 22, 2012 8:17 PM
Subject: Re: [Linuxptp-users] Problems with hardware time stamping with PHY DP83640 on MCP5200 powerpc platform

> On Tue, May 22, 2012 at 07:34:55PM +0200, Mario Molitor wrote:
>> Hallo linuxptp members,
>>
>> I have some problems to use our PTP device (PHY DP83640 on MCP5200
>> powerpc
> ^^^
> (I think you mean MPC)
>
>> platform).
>
> I have never seen the combination MPC5200/DP83640. Is this a
> commercial board or a custom design?
>
>> 2.) hardware time stamping
>> ===================
>> ptp4l[183]: selected /dev/ptp0 as PTP clock
>>
>> ptp4l[183]: port 1: INITIALIZING to LISTENING on INITIALIZE
>> ptp4l[183]: received SYNC without timestamp
>> ptp4l[183]: port 1: bad message
>> ptp4l[183]: port 1: new foreign master 0050c2.fffe.c2dfc3-1
>> ptp4l[183]: received SYNC without timestamp
>> ptp4l[183]: port 1: bad message
>> ptp4l[183]: received SYNC without timestamp
>> ptp4l[183]: port 1: bad message
>
> Looks like no packets are being time stamped at all.
>
>> I used following linux version without any source changes for test of
>> PTP:
>>
>> Linux version 3.4.0-rc7-next-20120514
>> with following options
>> - CONFIG_EXPERIMENTAL
>> - CONFIG_PPS
>> - CONFIG_NETWORK_PHY_TIMESTAMPING
>> - PTP_1588_CLOCKPTP
> ^^^
> (I think you mean PTP_1588_CLOCK)
>
> Did you enable CONFIG_DP83640_PHY too?
>
>> I have at the moment no any idea what is wrong and I need help to figure
>> out
>> where the problem is.
>
> Don't worry, we will figure this out...
>
>> The other point is, I am not sure that the ptp4l support the DP83640
>> PHY.
>
> The ptp4l should work with every PHC Linux driver.
>
>> Please let me know how I can debug this problem e.g. with some printks in
>> the kernel driver or adapt the ptp4l application.
>
> This might be a driver issue with the MPC5200, since I never tested it
> in combination with the DP83640. But first, make sure you have the PHY
> driver compiled in your kernel.
>
> Thanks,
> Richard
From: Richard C. <ric...@gm...> - 2012-05-22 18:18:17
|
On Tue, May 22, 2012 at 07:34:55PM +0200, Mario Molitor wrote:
> Hallo linuxptp members,
>
> I have some problems to use our PTP device (PHY DP83640 on MCP5200 powerpc
                                                             ^^^
(I think you mean MPC)

> platform).

I have never seen the combination MPC5200/DP83640. Is this a commercial board or a custom design?

> 2.) hardware time stamping
> ===================
> ptp4l[183]: selected /dev/ptp0 as PTP clock
>
> ptp4l[183]: port 1: INITIALIZING to LISTENING on INITIALIZE
> ptp4l[183]: received SYNC without timestamp
> ptp4l[183]: port 1: bad message
> ptp4l[183]: port 1: new foreign master 0050c2.fffe.c2dfc3-1
> ptp4l[183]: received SYNC without timestamp
> ptp4l[183]: port 1: bad message
> ptp4l[183]: received SYNC without timestamp
> ptp4l[183]: port 1: bad message

Looks like no packets are being time stamped at all.

> I used following linux version without any source changes for test of PTP:
>
> Linux version 3.4.0-rc7-next-20120514
> with following options
> - CONFIG_EXPERIMENTAL
> - CONFIG_PPS
> - CONFIG_NETWORK_PHY_TIMESTAMPING
> - PTP_1588_CLOCKPTP
                  ^^^
(I think you mean PTP_1588_CLOCK)

Did you enable CONFIG_DP83640_PHY too?

> I have at the moment no any idea what is wrong and I need help to figure out
> where the problem is.

Don't worry, we will figure this out...

> The other point is, I am not sure that the ptp4l support the DP83640 PHY.

The ptp4l should work with every PHC Linux driver.

> Please let me know how I can debug this problem e.g. with some printks in
> the kernel driver or adapt the ptp4l application.

This might be a driver issue with the MPC5200, since I never tested it in combination with the DP83640. But first, make sure you have the PHY driver compiled in your kernel.

Thanks,
Richard
From: Mario M. <Mar...@we...> - 2012-05-22 17:35:02
|
Hello linuxptp members,

I have some problems using our PTP device (PHY DP83640 on MCP5200 powerpc platform). Software time stamping looks OK at first glance, but hardware time stamping has some problems.

1.) software time stamping:
====================
ptp4l[180]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[180]: port 1: new foreign master 0050c2.fffe.c2dfc3-1
ptp4l[180]: selected best master clock 0050c2.fffe.c2dfc3
ptp4l[180]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[180]: port 1: minimum delay request interval 2^3
ptp4l[180]: master offset 536296 s0 adj +0 path delay 116376
ptp4l[180]: master offset 546610 s0 adj +0 path delay 116376
ptp4l[180]: master offset 552398 s0 adj +0 path delay 116376
ptp4l[180]: master offset 597686 s1 adj +0 path delay 116376
ptp4l[180]: master offset 30287 s2 adj +3059 path delay 116376
ptp4l[180]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[180]: master offset 41825 s2 adj +4255 path delay 116376
ptp4l[180]: master offset 104402 s2 adj +10617 path delay 116376
ptp4l[180]: master offset 108479 s2 adj +11133 path delay 116376
ptp4l[180]: master offset 190504 s2 adj +19526 path delay 95604
ptp4l[180]: master offset 180213 s2 adj +18677 path delay 95604
ptp4l[180]: master offset 212836 s2 adj +22152 path delay 87573
ptp4l[180]: master offset 202140 s2 adj +21285 path delay 87573
ptp4l[180]: master offset 245722 s2 adj +25889 path delay 87573
ptp4l[180]: master offset 212369 s2 adj +22766 path delay 95802
ptp4l[180]: master offset 241451 s2 adj +25915 path delay 95802
ptp4l[180]: master offset 302218 s2 adj +32294 path delay 95802
ptp4l[180]: master offset 164231 s2 adj +18660 path delay 95802
ptp4l[180]: master offset 150769 s2 adj +17464 path delay 95724

It adjusts only the system clock, not the PHY clock.

2.) hardware time stamping
===================
ptp4l[183]: selected /dev/ptp0 as PTP clock
ptp4l[183]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: port 1: new foreign master 0050c2.fffe.c2dfc3-1
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: selected best master clock 0050c2.fffe.c2dfc3
ptp4l[183]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[183]: recvmsg tx timestamp failed: Resource temporarily unavailable
ptp4l[183]: port 1: send delay request failed
ptp4l[183]: port 1: UNCALIBRATED to FAULTY on FAULT_DETECTED
ptp4l[183]: port 1: FAULTY to LISTENING on FAULT_CLEARED
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: port 1: new foreign master 0050c2.fffe.c2dfc3-1
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: selected best master clock 0050c2.fffe.c2dfc3
ptp4l[183]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[183]: recvmsg tx timestamp failed: Resource temporarily unavailable
ptp4l[183]: port 1: send delay request failed
ptp4l[183]: port 1: UNCALIBRATED to FAULTY on FAULT_DETECTED
ptp4l[183]: port 1: FAULTY to LISTENING on FAULT_CLEARED
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: port 1: new foreign master 0050c2.fffe.c2dfc3-1
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: selected best master clock 0050c2.fffe.c2dfc3
ptp4l[183]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[183]: recvmsg tx timestamp failed: Resource temporarily unavailable
ptp4l[183]: port 1: send delay request failed
ptp4l[183]: port 1: UNCALIBRATED to FAULTY on FAULT_DETECTED
ptp4l[183]: port 1: FAULTY to LISTENING on FAULT_CLEARED
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: port 1: new foreign master 0050c2.fffe.c2dfc3-1
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: received SYNC without timestamp
ptp4l[183]: port 1: bad message
ptp4l[183]: selected best master clock 0050c2.fffe.c2dfc3
ptp4l[183]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[183]: recvmsg tx timestamp failed: Resource temporarily unavailable
ptp4l[183]: port 1: send delay request failed
ptp4l[183]: port 1: UNCALIBRATED to FAULTY on FAULT_DETECTED

I used the following Linux version, without any source changes, to test PTP:

Linux version 3.4.0-rc7-next-20120514
with the following options
- CONFIG_EXPERIMENTAL
- CONFIG_PPS
- CONFIG_NETWORK_PHY_TIMESTAMPING
- PTP_1588_CLOCKPTP

At the moment I have no idea what is wrong, and I need help to figure out where the problem is.

The other point is that I am not sure whether ptp4l supports the DP83640 PHY.

Please let me know how I can debug this problem, e.g. with some printks in the kernel driver or by adapting the ptp4l application.

Best regards and many thanks in advance for your help,
Mario Molitor
From: Keller, J. E <jac...@in...> - 2012-05-07 19:29:09
|
> -----Original Message-----
> From: Richard Cochran [mailto:ric...@gm...]
> Sent: Monday, May 07, 2012 12:17 PM
> To: Keller, Jacob E
> Cc: lin...@li...
> Subject: Re: [Linuxptp-users] Possible missing fault
>
> On Mon, May 07, 2012 at 06:34:51PM +0000, Keller, Jacob E wrote:
>> This is using layer2 ethernet, and the peer delay mechanism. The
>> following output With a large chunk of similar output snipped out of
>> the middle.
>>
>> ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246
>> ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246
>> ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246
>> ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246
>
> BTW - Wow, those number look great.
>
> ...snip...
>
>> ptp4l[9130]: master offset 0 s2 adj -1 path delay 2246
>> ptp4l[9130]: recvmsg tx timestamp failed: Resource temporarily
>> unavailable
>> ---
>> Here, recvmsg doesn't throw a fault? It just keeps going, but
>> clearly Something is wrong.
>> ---
>> ptp4l[9130]: port 1: send peer delay response failed
>
> There is a check missing in port.c for this case. This should fix it.
>
> diff --git a/port.c b/port.c
> index 4fe32d0..5edc772 100644
> --- a/port.c
> +++ b/port.c
> @@ -1297,7 +1297,8 @@ enum fsm_event port_event(struct port *p, int fd_index)
>                  process_delay_req(p, msg);
>                  break;
>          case PDELAY_REQ:
> -                process_pdelay_req(p, msg);
> +                if (process_pdelay_req(p, msg))
> +                        event = EV_FAULT_DETECTED;
>                  break;
>          case PDELAY_RESP:
>                  if (process_pdelay_resp(p, msg))
>
>> ptp4l[9130]: master offset 63794 s2 adj +34723 path delay 2314
>> ptp4l[9130]: master offset 29063 s2 adj +19130 path delay 2314
>> ptp4l[9130]: master offset 9993 s2 adj +8779 path delay 2250
>> ptp4l[9130]: master offset 1210 s2 adj +2994 path delay 2250
>> ptp4l[9130]: master offset -1782 s2 adj +365 path delay 2247
>> ptp4l[9130]: master offset -2148 s2 adj -535 path delay 2246
>> ptp4l[9130]: master offset -1612 s2 adj -644 path delay 2246
>> ptp4l[9130]: master offset -971 s2 adj -486 path delay 2246
>> ptp4l[9130]: master offset -483 s2 adj -290 path delay 2245
>
> Here it has started to recover, obviously.
>
>> Much later on a different error throws a fault and suddenly
>> everything is better. Is that behavior in the Middle possibly caused
>> by some faulty state that wasn't cleared? I am not sure, but those
>> path delays And
> values seem incredibly wrong.
>
> The peer path delay is a moving average of ten values, so even if you
> get one or a few very wrong values, the bad effect should soon
> disappear, but in your log the error persists much longer.
>
>> I think it's because somehow one of the sequence numbers got Out of
>> order and messed up. I am going to try and take a look at that code
>> and see if I can find the issue, But I am wondering if you've seen
>> behavior
> like this.
>
> I haven't seen that, and it does look like either the messages are
> wrong (unlikely, but check with wireshark) or that the HW time stamps
> are getting associated with the wrong messages.
>
> Good luck,
> Richard

I am not sure exactly what the error is considering this occurred after 3 days. I think adding the port fault check should be good though, as after the fault the major screwup appears to go away.

- Jake

PS: I sent a patch that adds those fault checks :)
From: Richard C. <ric...@gm...> - 2012-05-07 19:17:23
|
On Mon, May 07, 2012 at 06:34:51PM +0000, Keller, Jacob E wrote:
> This is using layer2 ethernet, and the peer delay mechanism. The following output
> With a large chunk of similar output snipped out of the middle.
>
> ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246
> ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246
> ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246
> ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246

BTW - Wow, those numbers look great.

...snip...

> ptp4l[9130]: master offset 0 s2 adj -1 path delay 2246
> ptp4l[9130]: recvmsg tx timestamp failed: Resource temporarily unavailable
> ---
> Here, recvmsg doesn't throw a fault? It just keeps going, but clearly
> Something is wrong.
> ---
> ptp4l[9130]: port 1: send peer delay response failed

There is a check missing in port.c for this case. This should fix it.

diff --git a/port.c b/port.c
index 4fe32d0..5edc772 100644
--- a/port.c
+++ b/port.c
@@ -1297,7 +1297,8 @@ enum fsm_event port_event(struct port *p, int fd_index)
 		process_delay_req(p, msg);
 		break;
 	case PDELAY_REQ:
-		process_pdelay_req(p, msg);
+		if (process_pdelay_req(p, msg))
+			event = EV_FAULT_DETECTED;
 		break;
 	case PDELAY_RESP:
 		if (process_pdelay_resp(p, msg))

> ptp4l[9130]: master offset 63794 s2 adj +34723 path delay 2314
> ptp4l[9130]: master offset 29063 s2 adj +19130 path delay 2314
> ptp4l[9130]: master offset 9993 s2 adj +8779 path delay 2250
> ptp4l[9130]: master offset 1210 s2 adj +2994 path delay 2250
> ptp4l[9130]: master offset -1782 s2 adj +365 path delay 2247
> ptp4l[9130]: master offset -2148 s2 adj -535 path delay 2246
> ptp4l[9130]: master offset -1612 s2 adj -644 path delay 2246
> ptp4l[9130]: master offset -971 s2 adj -486 path delay 2246
> ptp4l[9130]: master offset -483 s2 adj -290 path delay 2245

Here it has started to recover, obviously.

> Much later on a different error throws a fault and suddenly everything is better. Is that behavior in the
> Middle possibly caused by some faulty state that wasn't cleared? I am not sure, but those path delays
> And values seem incredibly wrong.

The peer path delay is a moving average of ten values, so even if you get one or a few very wrong values, the bad effect should soon disappear, but in your log the error persists much longer.

> I think it's because somehow one of the sequence numbers got
> Out of order and messed up. I am going to try and take a look at that code and see if I can find the issue,
> But I am wondering if you've seen behavior like this.

I haven't seen that, and it does look like either the messages are wrong (unlikely, but check with wireshark) or that the HW time stamps are getting associated with the wrong messages.

Good luck,
Richard |
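Editor's note: the point about the ten-value moving average can be illustrated with a small sketch. This is not linuxptp's filter code; the names `struct mave`, `mave_reset` and `mave_accumulate` are made up for illustration, and only the window length of ten comes from the message above.

```c
#include <stdint.h>
#include <string.h>

#define MAVE_LEN 10 /* window length, from "a moving average of ten values" */

/* Hypothetical ring buffer holding the most recent path delay samples. */
struct mave {
	int64_t val[MAVE_LEN];
	int64_t sum; /* running sum of the samples currently held */
	int cnt;     /* number of valid samples, at most MAVE_LEN */
	int idx;     /* slot that the next sample overwrites */
};

void mave_reset(struct mave *m)
{
	memset(m, 0, sizeof(*m));
}

/* Add one sample and return the average over the samples held so far. */
int64_t mave_accumulate(struct mave *m, int64_t sample)
{
	if (m->cnt == MAVE_LEN)
		m->sum -= m->val[m->idx]; /* drop the oldest sample */
	else
		m->cnt++;
	m->val[m->idx] = sample;
	m->sum += sample;
	m->idx = (m->idx + 1) % MAVE_LEN;
	return m->sum / m->cnt;
}
```

With a filter of this shape, a single bad sample is flushed out after ten good ones. That is why a path delay that stays in the hundreds of milliseconds for a long stretch, as in the log, suggests the bad samples keep arriving rather than one outlier lingering.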
From: Richard C. <ric...@gm...> - 2012-05-07 18:57:20
|
On Mon, May 07, 2012 at 06:18:35PM +0000, Keller, Jacob E wrote:
>
> We were seeing faults. It is possible the default gateway goes down, but it should
> Come back after the link returns, right?

Usually, yes.

> There was no slave only mode. So the reset
> Process does close and open the socket again?

Yes.

> I will see if I can narrow this down.

Jake, I am not sure what you are trying to say here...

If the port link goes down, then the fault is normal and expected. At most fifteen seconds after the link comes back, you should see the fault cleared, just like in the log output you sent today.

ptp4l[9130]: port 1: SLAVE to FAULTY on FAULT_DETECTED
ptp4l[9130]: port 1: FAULTY to LISTENING on FAULT_CLEARED

HTH,
Richard |
From: Keller, J. E <jac...@in...> - 2012-05-07 18:35:01
|
This is using layer2 ethernet, and the peer delay mechanism. The following output With a large chunk of similar output snipped out of the middle. ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -0 path delay 2246 ptp4l[9130]: master offset -1 s2 adj -1 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -1 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -1 path delay 2246 ptp4l[9130]: master offset -1 s2 adj -2 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -1 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -1 path delay 2246 ptp4l[9130]: master offset -1 s2 adj -2 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -1 path delay 2246 ptp4l[9130]: master offset 1 s2 adj -0 path delay 2246 ptp4l[9130]: master offset 1 s2 adj +0 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -1 path delay 2246 ptp4l[9130]: master offset 1 s2 adj +0 path delay 2246 ptp4l[9130]: master offset 1 s2 adj +1 path delay 2246 ptp4l[9130]: master offset -1 s2 adj -1 path delay 2246 ptp4l[9130]: master offset 1 s2 adj +1 path delay 2246 ptp4l[9130]: master offset 0 s2 adj +0 path delay 2246 ptp4l[9130]: master offset 0 s2 adj +0 path delay 2246 ptp4l[9130]: master offset 1 s2 adj +1 path delay 2246 ptp4l[9130]: master offset 0 s2 adj +0 path delay 2246 ptp4l[9130]: master offset -1 s2 adj -1 path delay 2246 ptp4l[9130]: master offset -1 s2 adj -1 path delay 2246 ptp4l[9130]: master offset -1 s2 adj -1 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -1 path delay 2246 ptp4l[9130]: master offset 0 s2 adj -1 path delay 2246 ptp4l[9130]: recvmsg tx timestamp failed: Resource temporarily unavailable --- Here, recvmsg doesn't throw a fault? It just keeps going, but clearly Something is wrong. 
--- ptp4l[9130]: port 1: send peer delay response failed ptp4l[9130]: master offset -1 s2 adj -2 path delay 2246 ptp4l[9130]: master offset -43938482 s2 adj -43938483 path delay 43940728 ptp4l[9130]: master offset -1171142 s2 adj -14352687 path delay 45114645 ptp4l[9130]: master offset -59895424 s2 adj -73428312 path delay 118195998 ptp4l[9130]: master offset 13533141 s2 adj -17968374 path delay 118195998 ptp4l[9130]: master offset -959421 s2 adj -28400994 path delay 150667108 ptp4l[9130]: master offset 19473679 s2 adj -8255720 path delay 158637541 ptp4l[9130]: master offset -13510091 s2 adj -35397387 path delay 199879994 ptp4l[9130]: master offset 21890754 s2 adj -4049569 path delay 199879994 ptp4l[9130]: master offset -17957633 s2 adj -37330730 path delay 243781453 ptp4l[9130]: master offset -44313497 s2 adj -69073884 path delay 307469087 ptp4l[9130]: master offset 41103457 s2 adj +3049021 path delay 291131068 ptp4l[9130]: master offset -20709563 s2 adj -46432962 path delay 349905094 ptp4l[9130]: master offset 25725135 s2 adj -6211132 path delay 349905094 ptp4l[9130]: master offset 101876700 s2 adj +77657973 path delay 279970952 ptp4l[9130]: master offset 36729373 s2 adj +43073656 path delay 267460041 ptp4l[9130]: master offset -12349994 s2 adj +5013101 path delay 273456473 ptp4l[9130]: master offset -38076301 s2 adj -24418204 path delay 294165576 ptp4l[9130]: master offset -13657682 s2 adj -11422476 path delay 294165576 ptp4l[9130]: master offset 4826744 s2 adj +2964646 path delay 287105953 ptp4l[9130]: master offset 75401760 s2 adj +74987685 path delay 213567350 ptp4l[9130]: master offset 11323148 s2 adj +33529601 path delay 202654824 ptp4l[9130]: master offset 8383536 s2 adj +33986933 path delay 172056349 ptp4l[9130]: master offset -63923663 s2 adj -35805205 path delay 210372287 ptp4l[9130]: master offset -36342287 s2 adj -27400928 path delay 218595737 ptp4l[9130]: master offset -25415962 s2 adj -27377289 path delay 235073893 ptp4l[9130]: master offset 
54935774 s2 adj +45349659 path delay 182101818 ptp4l[9130]: master offset -1483435 s2 adj +5411182 path delay 193172785 ptp4l[9130]: master offset -21109976 s2 adj -14660390 path delay 207383663 ptp4l[9130]: master offset -6449498 s2 adj -6332904 path delay 207383663 <snip> ptp4l[9130]: master offset -29956183 s2 adj -45254075 path delay 257139999 ptp4l[9130]: master offset -6172599 s2 adj -30457345 path delay 278614170 ptp4l[9130]: master offset 24289681 s2 adj -1846845 path delay 278614170 ptp4l[9130]: master offset 41848792 s2 adj +22999170 path delay 262904929 ptp4l[9130]: master offset -30372859 s2 adj -36667843 path delay 312127569 ptp4l[9130]: master offset -3607673 s2 adj -19014515 path delay 322030202 ptp4l[9130]: master offset -30156558 s2 adj -46645702 path delay 367598211 ptp4l[9130]: master offset 16491069 s2 adj -9045042 path delay 367598211 ptp4l[9130]: master offset 52317772 s2 adj +31728981 path delay 340821191 ptp4l[9130]: master offset 45628905 s2 adj +40735446 path delay 315780591 ptp4l[9130]: master offset -1315375 s2 adj +7479838 path delay 321986499 ptp4l[9130]: master offset 22069277 s2 adj +30469877 path delay 291118076 ptp4l[9130]: master offset -8404691 s2 adj +6616692 path delay 291118076 ptp4l[9130]: master offset 42513447 s2 adj +55013423 path delay 233579702 ptp4l[9130]: master offset -12508577 s2 adj +12745433 path delay 233579702 ptp4l[9130]: master offset -50524014 s2 adj -29022577 path delay 258844549 ptp4l[9130]: master offset -15524387 s2 adj -9180154 path delay 252865750 ptp4l[9130]: master offset 9203193 s2 adj +10890110 path delay 237321729 ptp4l[9130]: master offset -1686792 s2 adj +2761082 path delay 237321729 ptp4l[9130]: master offset -1209646 s2 adj +2732191 path delay 234082385 ptp4l[9130]: recvmsg failed: No message of desired type ptp4l[9130]: port 1: recv message failed ptp4l[9130]: port 1: SLAVE to FAULTY on FAULT_DETECTED ptp4l[9130]: port 1: FAULTY to LISTENING on FAULT_CLEARED ptp4l[9130]: port 1: new foreign 
master a0369f.fffe.01b740-1 ptp4l[9130]: selected best master clock a0369f.fffe.01b740 ptp4l[9130]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[9130]: master offset -55861189 s2 adj -52282246 path delay 234082385 ptp4l[9130]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED ptp4l[9130]: master offset 73319566 s2 adj +60140152 path delay 157180112 ptp4l[9130]: master offset 25716180 s2 adj +34532636 path delay 144653042 ptp4l[9130]: master offset -5234805 s2 adj +11296505 path delay 141059857 ptp4l[9130]: master offset 67798032 s2 adj +82758901 path delay 56721755 ptp4l[9130]: master offset -14960156 s2 adj +20340122 path delay 56721755 ptp4l[9130]: master offset -34374705 s2 adj -3562474 path delay 55781427 ptp4l[9130]: master offset -18140580 s2 adj +2359240 path delay 43106767 ptp4l[9130]: master offset -4683127 s2 adj +10374519 path delay 27290731 ptp4l[9130]: master offset 12231747 s2 adj +25884455 path delay 1604 ptp4l[9130]: master offset -13652187 s2 adj +3670045 path delay 1115 ptp4l[9130]: master offset -17326157 s2 adj -4099581 path delay 1460 ptp4l[9130]: master offset -13227144 s2 adj -5198415 path delay 1460 ptp4l[9130]: master offset -8028704 s2 adj -3968118 path delay 1808 ptp4l[9130]: master offset -4060291 s2 adj -2408317 path delay 1808 ptp4l[9130]: master offset -1651755 s2 adj -1217868 path delay 2030 ptp4l[9130]: master offset -433545 s2 adj -495184 path delay 2030 ptp4l[9130]: master offset 61834 s2 adj -129869 path delay 2011 ptp4l[9130]: master offset 191631 s2 adj +18478 path delay 2155 ptp4l[9130]: master offset 172983 s2 adj +57320 path delay 2345 ptp4l[9130]: master offset 115659 s2 adj +51890 path delay 2345 ptp4l[9130]: master offset 63794 s2 adj +34723 path delay 2314 ptp4l[9130]: master offset 29063 s2 adj +19130 path delay 2314 ptp4l[9130]: master offset 9993 s2 adj +8779 path delay 2250 ptp4l[9130]: master offset 1210 s2 adj +2994 path delay 2250 ptp4l[9130]: master offset -1782 s2 adj +365 path delay 2247 
ptp4l[9130]: master offset -2148 s2 adj -535 path delay 2246 ptp4l[9130]: master offset -1612 s2 adj -644 path delay 2246 ptp4l[9130]: master offset -971 s2 adj -486 path delay 2246 ptp4l[9130]: master offset -483 s2 adj -290 path delay 2245 Much later on a different error throws a fault and suddenly everything is better. Is that behavior in the Middle possibly caused by some faulty state that wasn't cleared? I am not sure, but those path delays And values seem incredibly wrong. I think it's because somehow one of the sequence numbers got Out of order and messed up. I am going to try and take a look at that code and see if I can find the issue, But I am wondering if you've seen behavior like this. - Jake |
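Editor's note: when wading through a multi-day log like the one above, it can help to pull the numbers out of each "master offset" line and flag path delays far from the steady-state value. The sketch below assumes the line layout is exactly as printed in this message; the function names and the tolerance are illustrative and are not part of ptp4l. The field after the offset (`s2` in the log) is parsed but otherwise ignored.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse one line of the form
 *   ptp4l[PID]: master offset N s2 adj N path delay N
 * Returns 1 if the line matched and all three values were filled in,
 * 0 for any other line (state changes, error messages, and so on).
 */
int parse_offset_line(const char *line, long long *offset,
		      long long *adj, long long *delay)
{
	int pid, state;

	return sscanf(line,
		      "ptp4l[%d]: master offset %lld s%d adj %lld path delay %lld",
		      &pid, offset, &state, adj, delay) == 5;
}

/* Flag a path delay that strays from a known-good baseline (nanoseconds). */
int delay_is_suspect(long long delay, long long baseline, long long tolerance)
{
	return llabs(delay - baseline) > tolerance;
}
```

Using 2246 as the baseline with a tolerance of, say, 1000 would flag every line from the first 43940728 entry onward, up to the point where the delay drops back to the 1604 to 2345 range after the fault is cleared.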
From: Keller, J. E <jac...@in...> - 2012-05-07 18:18:42
|
> -----Original Message----- > From: Richard Cochran [mailto:ric...@gm...] > Sent: Friday, May 04, 2012 11:29 PM > To: Keller, Jacob E > Cc: lin...@li... > Subject: Re: [Linuxptp-users] Issue when partner machine reboots > > On Fri, May 04, 2012 at 08:39:36PM +0000, Keller, Jacob E wrote: > > > > Ok. For now I had validation try using the l2 ethernet packets. Reset > > doesn't make these invalid. I believe the reason why UDP sockets go > > down is because the IP address becomes invalid when the socket loses > > link. (Even if the same IP address came back!) I like the idea of a > > netlink socket that could trigger a proper port reset. Right now the > > default port reset doesn't appear to reset the socket. (It does get a > > fault because it is unable to transmit,but the socket isn't > > ^^^^^^^^^^^^^^^^^^^^ > > reset when it clears the fault.) > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > / > This surprises me. > > After a port fault, the clock waits 15 seconds and then clears the fault. This > causes the port to open the transport (ie socket) again. > So if the program faults after a failed transmission, then it should recover. > Could it be that your network setup loses the default gateway after linkdown? > (Without a default route, the UDP multicast does not > work.) > > In any case, if your UDP socket does recover after a failed transmission, then > there is a bug somewhere. I expect that ptp4l should reset (close, and then > open) the UDP socket for that port every fifteen seconds, until the link > recovers. If this isn't working, then I want to know why. > > There is _one_ case where I know that the port gets stuck, and that is when > you have slave only mode. If the link does down when the port is in the > listening state, the port never sends and thus never sees and error. For this > case the netlink socket is needed. > > Thanks, > Richard We were seeing faults. It is possible the default gateway goes down, but it should Come back after the link returns, right? 
There was no slave only mode. So the reset Process does close and open the socket again? I will see if I can narrow this down. - Jake |
From: Richard C. <ric...@gm...> - 2012-05-05 06:29:26
|
On Fri, May 04, 2012 at 08:39:36PM +0000, Keller, Jacob E wrote:
>
> Ok. For now I had validation try using the l2 ethernet packets. Reset doesn't
> make these invalid. I believe the reason why UDP sockets go down is because the
> IP address becomes invalid when the socket loses link. (Even if the same IP
> address came back!) I like the idea of a netlink socket that could trigger a
> proper port reset. Right now the default port reset doesn't appear to reset the
> socket. (It does get a fault because it is unable to transmit,but the socket isn't
                                                                ^^^^^^^^^^^^^^^^^^^^
> reset when it clears the fault.)
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 /
This surprises me.

After a port fault, the clock waits 15 seconds and then clears the fault. This causes the port to open the transport (i.e., the socket) again. So if the program faults after a failed transmission, then it should recover. Could it be that your network setup loses the default gateway after linkdown? (Without a default route, the UDP multicast does not work.)

In any case, if your UDP socket does not recover after a failed transmission, then there is a bug somewhere. I expect that ptp4l should reset (close, and then open) the UDP socket for that port every fifteen seconds, until the link recovers. If this isn't working, then I want to know why.

There is _one_ case where I know that the port gets stuck, and that is when you have slave-only mode. If the link goes down when the port is in the listening state, the port never sends and thus never sees an error. For this case the netlink socket is needed.

Thanks,
Richard |
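Editor's note: the recovery cycle described in this message (fault, wait fifteen seconds, clear the fault, reopen the transport) can be modeled as a tiny state machine. This is a toy model under stated assumptions, not ptp4l's port code: `struct toy_port`, `toy_port_fault` and `toy_port_tick` are hypothetical names, and the actual close and reopen of the socket is reduced to a counter.

```c
#include <stdbool.h>

#define FAULT_RESET_SECONDS 15 /* wait time quoted in the message above */

enum toy_state { TOY_LISTENING, TOY_FAULTY };

/* Hypothetical stand-in for a port; a real port also owns a socket. */
struct toy_port {
	enum toy_state state;
	long fault_time;  /* time the fault was detected, in seconds */
	int reopen_count; /* how many times the transport was reopened */
};

/* Record a fault, for example after a failed transmission. */
void toy_port_fault(struct toy_port *p, long now)
{
	p->state = TOY_FAULTY;
	p->fault_time = now;
}

/*
 * Called periodically with the current time in seconds.
 * Returns true if the fault was cleared during this call.
 */
bool toy_port_tick(struct toy_port *p, long now)
{
	if (p->state != TOY_FAULTY)
		return false;
	if (now - p->fault_time < FAULT_RESET_SECONDS)
		return false;
	/* A real daemon would close and reopen the socket at this point. */
	p->reopen_count++;
	p->state = TOY_LISTENING;
	return true;
}
```

If the reopened socket still cannot transmit because the link is down, the next send fails, the port faults again, and the cycle repeats every fifteen seconds until the link returns. The slave-only case mentioned above falls outside this loop because a port that never sends never records the fault in the first place.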
From: Keller, J. E <jac...@in...> - 2012-05-04 20:39:43
|
> -----Original Message----- > From: Richard Cochran [mailto:ric...@gm...] > Sent: Friday, May 04, 2012 6:06 AM > To: Keller, Jacob E > Cc: lin...@li... > Subject: Re: [Linuxptp-users] Issue when partner machine reboots > > On Thu, May 03, 2012 at 10:37:30PM +0000, Keller, Jacob E wrote: > > I am having an issue when two machines are hooked up for ptp. If one > > side reboots, the other side will never recover. This happens even > > after the port timeout for reseting the fault. I think the reason this > > occurs is because the socket for UDP goes invalid when the interface > > goes down because of the loss of link. Is there a way to solve this so > > that ptp4l would properly re-initialize when coming back? (attempt to > > re-get the entire sockets etc?) > > > > Or is this just an invalid usecase and I should expect to have to > > reset the daemon if its partners go down? > > I am aware of the problem with a link down. Once the link does down, the UDP > socket is no longer valid. I don't really understand why that should be the > case, but probably we will just have to live with it. > > I think the solution will be to open an additional netlink socket and poll for > the link events. Link down should trigger a port fault, and then the port > reset will then re-open the socket. > > But that will take some time. > > Thanks, > Richard Ok. For now I had validation try using the l2 ethernet packets. Reset doesn't make these invalid. I believe the reason why UDP sockets go down is because the IP address becomes invalid when the socket loses link. (Even if the same IP address came back!) I like the idea of a netlink socket that could trigger a proper port reset. Right now the default port reset doesn't appear to reset the socket. (It does get a fault because it is unable to transmit,but the socket isn't reset when it clears the fault.) |
From: Richard C. <ric...@gm...> - 2012-05-04 13:06:37
|
On Thu, May 03, 2012 at 10:37:30PM +0000, Keller, Jacob E wrote:
> I am having an issue when two machines are hooked up for ptp. If one
> side reboots, the other side will never recover. This happens even
> after the port timeout for reseting the fault. I think the reason
> this occurs is because the socket for UDP goes invalid when the
> interface goes down because of the loss of link. Is there a way to
> solve this so that ptp4l would properly re-initialize when coming
> back? (attempt to re-get the entire sockets etc?)
>
> Or is this just an invalid usecase and I should expect to have to
> reset the daemon if its partners go down?

I am aware of the problem with a link down. Once the link goes down, the UDP socket is no longer valid. I don't really understand why that should be the case, but probably we will just have to live with it.

I think the solution will be to open an additional netlink socket and poll for the link events. Link down should trigger a port fault, and the port reset will then re-open the socket.

But that will take some time.

Thanks,
Richard |
From: Keller, J. E <jac...@in...> - 2012-05-03 22:37:40
|
I am having an issue when two machines are hooked up for PTP. If one side reboots, the other side never recovers. This happens even after the port timeout for resetting the fault. I think this occurs because the UDP socket becomes invalid when the interface goes down on loss of link. Is there a way to solve this so that ptp4l properly re-initializes when the link comes back (for example, by attempting to re-open all of its sockets)?

Or is this just an invalid use case, and should I expect to have to restart the daemon if its partner goes down?

- Jake |
From: Richard C. <ric...@gm...> - 2012-03-22 06:36:44
|
On Thu, Mar 22, 2012 at 03:40:50PM +1300, Eliot Blennerhassett wrote:
>
> Seeing the commit of raw ethernet transport, I wonder if there are any
> plans to support 802.1AS (aka gPTP)?

I was not thinking about AS directly, but, yes, I think it can be supported.

> Section 7.5 of the standard[1] covers differences between gPTP and PTP
>
> Very basic ways that gPTP is different from PTP
> * layer 2 only, PTPv2 only

Just added, probably needs more testing. The L2 code also needs VLAN support, and I already have an idea how to add that.

> * only uses the peer to peer multicast address

I definitely want to add the P2P mechanism.

> * 4-bit transportSpecific field in PTP header is 0x1
> * most other differences are ways in which gPTP is a subset of PTP

So it looks like adding gPTP is doable.

> (Aside: I followed the links from http://linuxptp.sourceforge.net/ to
> linuxptp-devel archives, but they appear to be empty - is that mailing
> list active, or is this list being used instead?)

It is not dead. It just never lived yet. The -devel list is meant to be a place to post patches. So far, I am the only code author, but I would welcome code contributions. People are only just starting to notice this project.

> [1] http://standards.ieee.org/getieee802/download/802.1AS-2011.pdf

Okay, I will take a look at this.

Thanks,
Richard |
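Editor's note: of the gPTP differences listed above, the transportSpecific value is the easiest to show concretely. In the PTPv2 common header, the first octet carries transportSpecific in the upper four bits and messageType in the lower four. The helper names below are illustrative and are not taken from the linuxptp sources.

```c
#include <stdint.h>

#define TS_IEEE_1588   0x0 /* plain PTP */
#define TS_IEEE_8021AS 0x1 /* gPTP, per the list quoted above */

#define MT_PDELAY_REQ  0x2 /* messageType of a Pdelay_Req message */

/* Build the first header octet from its two 4-bit fields. */
uint8_t ptp_tsmt(uint8_t transport_specific, uint8_t message_type)
{
	return (uint8_t) (((transport_specific & 0x0f) << 4) |
			  (message_type & 0x0f));
}

/* Extract the transportSpecific nibble from the first header octet. */
uint8_t ptp_transport_specific(uint8_t tsmt)
{
	return tsmt >> 4;
}

/* Extract the messageType nibble from the first header octet. */
uint8_t ptp_message_type(uint8_t tsmt)
{
	return tsmt & 0x0f;
}
```

For example, a gPTP Pdelay_Req would start with the octet 0x12, while the same message in plain PTP would start with 0x02. A receiver can use that upper nibble to tell the two apart.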