linuxptp-users Mailing List for linuxptp (Page 41)
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: <Bri...@L3...> - 2021-10-18 21:52:25
Hi Vladimir,

> -----Original Message-----
> From: Vladimir Oltean <ol...@gm...>
> Sent: Monday, October 18, 2021 5:27 PM
> To: Hutchinson, Brian (US) - PSPC <Bri...@L3...>
> Cc: ce...@ar...; lin...@li...
> Subject: [EXTERNAL] Re: [Linuxptp-users] Using G.8275.2 profile and getting tx timestamp timeout, but changing logSyncInterval etc. changes how often this happens
>
> On Mon, Oct 18, 2021 at 09:12:07PM +0000, Bri...@L3... wrote:
> > On the console I saw more info about the kernel oops. Posting output I saw below:
> >
> > [...]
>
> Ouch, my bad, those pesky data structures...
> Can you please apply this extra patch on top (a fixup of my previous one).
> Provided same content as attachment as well as plain text.

Sure, no problem. Thanks.

To be clear, I'm applying these to my 5.10.32 linux-fslc kernel built from Yocto/meta-freescale Hardknott, in case any of that matters.

Full disclosure: I was able to do a git clone of your repo with --depth 1, but the --single-branch thing didn't work for me ... yet. I keep getting "remote hung up" etc. Will try again.

> -----------------------------[ cut here ]-----------------------------
> From ccfe702efa0c4d19d631fc58ed83a765077e4a62 Mon Sep 17 00:00:00 2001
> From: Vladimir Oltean <vla...@nx...>
> Date: Tue, 19 Oct 2021 00:21:34 +0300
> Subject: [PATCH] net: dsa: ksz9477: fix ksz_port dereference from
>  ksz9477_port_deferred_xmit
>
> [...]
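A note for readers following the oops in this thread: the `Code:` line marks the faulting instruction in parentheses, `39400c43`, and the register dump shows `x2 : 00000026fffe0000`. The sketch below decodes that word as an A64 `LDRB Wt, [Xn, #imm]` (unsigned-offset form) and adds the immediate to the base register. The helper names `decode_ldrb_uimm` and `ldrb_fault_addr` are made up for this illustration and are not part of any kernel or linuxptp API.

```c
#include <assert.h>
#include <stdint.h>

/* Decode an A64 "LDRB Wt, [Xn, #imm]" (unsigned-offset form).
 * Returns 1 and fills the outputs on a match, 0 otherwise.
 * Hypothetical helper, for illustration only.
 */
static int decode_ldrb_uimm(uint32_t insn, unsigned *rt, unsigned *rn,
			    unsigned *imm)
{
	/* Bits 31..22 identify this instruction form. */
	if ((insn & 0xffc00000u) != 0x39400000u)
		return 0;

	*rt = insn & 0x1f;		/* destination register Wt */
	*rn = (insn >> 5) & 0x1f;	/* base register Xn */
	*imm = (insn >> 10) & 0xfff;	/* byte offset (imm12) */
	return 1;
}

/* Address touched by the load, given the base register value from the dump. */
static uint64_t ldrb_fault_addr(uint64_t base, unsigned imm)
{
	return base + imm;
}
```

Fed with `0x39400c43`, the decoder yields Wt = w3, Xn = x2 and an offset of 3, so the load touches x2 + 3 = 00000026fffe0003, which is the address in the "Unable to handle kernel paging request" line. That is consistent with a byte being read through a pointer that was itself loaded from the wrong structure.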
From: Vladimir O. <ol...@gm...> - 2021-10-18 21:27:05
On Mon, Oct 18, 2021 at 09:12:07PM +0000, Bri...@L3... wrote:
> On the console I saw more info about the kernel oops. Posting output I saw below:
>
> Console output:
>
> [ 1108.463268] Mem abort info:
> [ 1108.466171] ESR = 0x96000004
> [ 1108.469247] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 1108.474572] SET = 0, FnV = 0
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.455330] Unable to handle kernel paging request at virtual address 00000026fffe0003
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.463268] Mem abort info:
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.466171] ESR = 0x96000004
> [ 1108.499442] EA = 0, S1PTW = 0
> [ 1108.499445] Data abort info:
> [ 1108.499447] ISV = 0, ISS = 0x00000004
> [ 1108.499450] CM = 0, WnR = 0
> [ 1108.499455] user pgtable: 4k pages, 48-bit VAs, pgdp=000000004493f000
> [ 1108.499458] [00000026fffe0003] pgd=0000000000000000, p4d=0000000000000000
> [ 1108.499470] Internal error: Oops: 96000004 [#2] PREEMPT SMP
> [ 1108.499474] Modules linked in: crct10dif_ce(+) fsl_imx8_ddr_perf(+) error(+) clk_bd718x7(+) snvs_pwrkey(+) rtc_snvs(+) imx8mm_thermal(+) snd_soc_fsl_sai(+) snd_soc_simple_card_utils(+) imx_cpufreq_dt(+)
> [ 1108.549100] CPU: 1 PID: 171 Comm: ksz_xmit Tainted: G D 5.10.32 #1
> [ 1108.549102] Hardware name: FSL i.MX8MM EVK board (DT)
> [ 1108.549108] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--)
> [ 1108.549119] pc : ksz9477_port_deferred_xmit+0x70/0xe8
> [ 1108.549126] lr : ksz9477_port_deferred_xmit+0x54/0xe8
> [ 1108.577742] sp : ffff800012e7bdb0
> [ 1108.577745] x29: ffff800012e7bdb0 x28: 0000000000000000
> [ 1108.586372] x27: ffff8000128a3838 x26: ffff00000498f448
> [ 1108.586380] x25: 0000000000000001 x24: ffff0000041a8188
> [ 1108.597001] x23: ffff0000034c1080 x22: ffff000007666000
> [ 1108.602317] x21: ffff0000034c12e8 x20: ffff00000000005c
> [ 1108.607635] x19: ffff00000346e580 x18: 0000000000000000
> [ 1108.607641] x17: 0000000000000000 x16: 0000000000000000
> [ 1108.607646] x15: 0000000000000000 x14: 0d3631207369206c
> [ 1108.607652] x13: 0000000000000007 x12: 0000000000000000
> [ 1108.628887] x11: ffff000003453b08 x10: ffff00000539b540
> [ 1108.628894] x9 : ffff800010010664 x8 : 00000000000003e8
> [ 1108.639516] x7 : ffff00000a844000 x6 : 00000000025454c7
> [ 1108.639525] x5 : 00ffffffffffffff x4 : 0000000000000016
> [ 1108.650141] x3 : 00000000ffff0000 x2 : 00000026fffe0000
> [ 1108.655456] x1 : 0000000000000064 x0 : ffff0000034c1878
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.469247] EC = 0x25: DABT (current EL), IL = 32 bits
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.474572] SET = 0, FnV = 0
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499442] EA = 0, S1PTW = 0
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499445] Data abort info:
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499447] ISV = 0, ISS = 0x00000004
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499450] CM = 0, WnR = 0
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499455] user pgtable: 4k pages, 48-bit VAs, pgdp=000000004493f000
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499458] [00000026fffe0003] pgd=0000000000000000, p4d=0000000000000000
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499470] Internal error: Oops: 96000004 [#2] PREEMPT SMP
> [ 1108.724841] Call trace:
> [ 1108.724853] ksz9477_port_deferred_xmit+0x70/0xe8
> [ 1108.724861] kthread_worker_fn+0xa0/0x170
> [ 1108.724866] kthread+0x148/0x168
> [ 1108.724872] ret_from_fork+0x10/0x34
> [ 1108.724884] Code: d2800c81 f9406282 b940ba83 8b030042 (39400c43)
> [ 1108.748918] ---[ end trace 0eee13d84a999751 ]---
> 2021 Sep 30 18:00:36 imx8mmevk [ 1108.724884] Code: d2800c81 f9406282 b940ba83 8b030042 (39400c43)
> 2021 Sep 30 18:00:36 imx8mmevk Unable to handle kernel paging request at virtual address 00000026fffe0003
> 2021 Sep 30 18:00:36 imx8mmevk Mem abort info:
> 2021 Sep 30 18:00:36 imx8mmevk ESR = 0x96000004
> 2021 Sep 30 18:00:36 imx8mmevk EC = 0x25: DABT (current EL), IL = 32 bits
> 2021 Sep 30 18:00:36 imx8mmevk SET = 0, FnV = 0
> 2021 Sep 30 18:00:36 imx8mmevk EA = 0, S1PTW = 0
> 2021 Sep 30 18:00:36 imx8mmevk Data abort info:
> 2021 Sep 30 18:00:36 imx8mmevk ISV = 0, ISS = 0x00000004
> 2021 Sep 30 18:00:36 imx8mmevk CM = 0, WnR = 0
> 2021 Sep 30 18:00:36 imx8mmevk user pgtable: 4k pages, 48-bit VAs, pgdp=000000004493f000
> 2021 Sep 30 18:00:36 imx8mmevk [00000026fffe0003] pgd=0000000000000000, p4d=0000000000000000
> 2021 Sep 30 18:00:36 imx8mmevk Internal error: Oops: 96000004 [#2] PREEMPT SMP
> 2021 Sep 30 18:00:36 imx8mmevk Code: d2800c81 f9406282 b940ba83 8b030042 (39400c43)

Ouch, my bad, those pesky data structures...
Can you please apply this extra patch on top (a fixup of my previous one).
Provided same content as attachment as well as plain text.

-----------------------------[ cut here ]-----------------------------
From ccfe702efa0c4d19d631fc58ed83a765077e4a62 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vla...@nx...>
Date: Tue, 19 Oct 2021 00:21:34 +0300
Subject: [PATCH] net: dsa: ksz9477: fix ksz_port dereference from
 ksz9477_port_deferred_xmit

The previous patch left an incorrect dereference of struct ksz_port.
That's not how we get it, dp->priv points to a different structure.

Signed-off-by: Vladimir Oltean <vla...@nx...>
---
 drivers/net/dsa/microchip/ksz9477_ptp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/microchip/ksz9477_ptp.c b/drivers/net/dsa/microchip/ksz9477_ptp.c
index 0f05aafbdd3d..fb4f89efd9cd 100644
--- a/drivers/net/dsa/microchip/ksz9477_ptp.c
+++ b/drivers/net/dsa/microchip/ksz9477_ptp.c
@@ -762,7 +762,9 @@ static void ksz9477_port_deferred_xmit(struct kthread_work *work)
 	struct sk_buff *skb = xmit_work->skb;
 	struct dsa_port *dp = xmit_work->dp;
 	struct ksz_device *dev = ds->priv;
-	struct ksz_port *prt = dp->priv;
+	struct ksz_port *prt;
+
+	prt = &dev->ports[dp->index];
 
 	reinit_completion(&prt->tstamp_completion);
 
-----------------------------[ cut here ]-----------------------------
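The two lookups that the deferred-xmit worker performs after this fixup can be sketched in user space as follows. This is only an illustration under stated assumptions: the structs are cut-down stand-ins for the kernel ones (the real `ksz_deferred_xmit_work` also carries the skb), `container_of` is redefined locally, and `xmit_work_to_port` is a hypothetical helper name that does not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified, hypothetical versions of the kernel structures. */
struct kthread_work { int pending; };
struct ksz_device_ptp_shared { unsigned long state; };
struct ksz_port { int tstamp_completion; };

struct ksz_device {
	struct ksz_device_ptp_shared ptp_shared;
	struct ksz_port ports[4];
};

struct dsa_port {
	unsigned int index;
	void *priv;	/* after the first patch: &dev->ptp_shared */
};

struct ksz_deferred_xmit_work {
	struct dsa_port *dp;
	struct kthread_work work;
};

/* Recover the per-packet work item from its embedded kthread_work. */
static struct ksz_deferred_xmit_work *work_to_xmit_work(struct kthread_work *w)
{
	return container_of(w, struct ksz_deferred_xmit_work, work);
}

/* Per-port state comes from the device's port array, indexed by the
 * DSA port number, rather than from dp->priv.
 */
static struct ksz_port *xmit_work_to_port(struct ksz_device *dev,
					  struct ksz_deferred_xmit_work *xw)
{
	return &dev->ports[xw->dp->index];
}
```

In this model `dp->priv` holds the address of the device-wide `ptp_shared` block, so treating it as a `struct ksz_port *` would read unrelated memory, whereas indexing `dev->ports[]` with `dp->index` always lands on the matching per-port entry.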
From: <Bri...@L3...> - 2021-10-18 21:12:19
Hi all,

I applied the patches Vladimir wanted me to try and am now reporting the findings below.

> -----Original Message-----
> From: Vladimir Oltean <ol...@gm...>
> Sent: Monday, October 18, 2021 9:46 AM
> To: Christian Eggers <ce...@ar...>
> Cc: Hutchinson, Brian (US) - PSPC <Bri...@L3...>; lin...@li...
> Subject: [EXTERNAL] Re: [Linuxptp-users] Using G.8275.2 profile and getting tx timestamp timeout, but changing logSyncInterval etc. changes how often this happens
>
> Hi Christian,
>
> On Mon, Oct 18, 2021 at 02:14:12PM +0200, Christian Eggers wrote:
> > Hi Vladimir,
> >
> > On Monday, 18 October 2021, 13:39:42 CEST, Vladimir Oltean wrote:
> > > There's something that just doesn't compute for me.
> > > In those patches, Christian wrote:
> > >
> > > 	/* Currently, only P2P delay measurement is supported. Setting ocmode
> > > 	 * to slave will work independently of actually being master or slave.
> > > 	 * For E2E delay measurement, switching between master and slave would
> > > 	 * be required, as the KSZ devices filters out PTP messages depending on
> > > 	 * the ocmode setting:
> > > 	 * - in slave mode, DelayReq messages are filtered out
> > > 	 * - in master mode, Sync messages are filtered out
> > > 	 * Currently (and probably also in future) there is no interface in the
> > > 	 * kernel which allows switching between master and slave mode. For
> > > 	 * this reason, E2E cannot be supported. See patchwork for full
> > > 	 * discussion:
> > > 	 * https://patchwork.ozlabs.org/project/netdev/patch/20201019172435.4416-8-c...@ar.../
> > > 	 */
> > > 	ksz9477_ptp_tcmode_set(dev, KSZ9477_PTP_TCMODE_P2P);
> > > 	ksz9477_ptp_ocmode_set(dev, KSZ9477_PTP_OCMODE_SLAVE);
> > >
> > > Did you modify the driver's OCMODE? I am super confused as to which
> > > packets ptp4l is actually waiting for a TX timestamp for. Because if
> > > you're using E2E and not P2P, then the entire
> > > ksz9477_port_deferred_xmit() is just dead code, is it not?
> >
> > I attached the patch series which I originally provided to Brian. This
> > series is for linux-5.10.x. The backports folder contains patches
> > which are already present in 5.11 and later kernels (some of them are
> > even in the latest 5.10-stable). For recent kernels, the ksz9563_ptp
> > folder is sufficient.
> >
> > Compared with the latest series I sent to netdev, I added
> > 0010-net-dsa-microchip-ksz9477-add-E2E-support.patch for E2E support.
> > This was rejected by the ptp4l maintainer as it would require ptp4l to
> > dynamically switch the KSZ hardware between master and slave mode
> > (there are packet filters in hardware which cannot entirely be disabled).
> >
> > Currently there is not much demand for PTP in our current product
> > development. But if the KSZ work can be finished / mainlined, I am
> > highly interested. My latest status was (IIRC):
> >
> > 1. There is something wrong with the time stamping offsets. As a
> > result, 1PPS works nearly perfectly with two KSZ devices, but shows a
> > constant offset when using a Meinberg clock as master.
> >
> > 2. Currently the user is responsible for providing a start time (for
> > PPS) that is in the future. In case the time point has already
> > elapsed, the driver will report an error.
> >
> > 3. Occasional timeouts when waiting for TX timestamps. I think that I
> > already implemented the driver changes you requested, but probably the
> > problem still persists. It is even possible that the (Brian's) KSZ9567
> > suffers from hardware bugs here which are not present on (my) KSZ9563
> > (I think the guy from Microchip mentioned this).
> >
> > 4. Sometimes the 1PPS becomes completely out of sync (but recovers
> > later). This is surprising for me, as ptp4l uses filters/regulators
> > and should not be affected by single packet / timestamp failures.
> >
> > Is there anything I can help with?
>
> Thanks for the clarification, it makes more sense now since the picture is
> more complete.
>
> I've no idea about the hardware specifics since I don't own the hardware. But
> I noticed a logical issue which might be relevant to the TX timestamp timeout
> events that Brian is seeing.
>
> Brian, can you try out the patch below? Compile-tested only, as mentioned I
> can't really do much more.
>
> -----------------------------[ cut here ]-----------------------------
> From 7f4771bc19db9b48577f3f45ba907e5c13aea808 Mon Sep 17 00:00:00 2001
> From: Vladimir Oltean <vla...@nx...>
> Date: Mon, 18 Oct 2021 16:35:18 +0300
> Subject: [PATCH] net: dsa: ksz9477: use a kthread work item per deferred skb
>
> There might be a race in tag_ksz.c between these two lines:
>
> 	skb_queue_tail(&ptp_shared->xmit_queue, skb_get(skb));
> 	kthread_queue_work(ptp_shared->xmit_worker, &ptp_shared->xmit_work);
>
> and the skb dequeue logic in ksz9477_port_deferred_xmit(). For example,
> the xmit_work might be already queued, however the work item has just
> finished walking through the skb queue. Because we don't check the return
> code from kthread_queue_work, we don't do anything if the work item is
> already queued.
>
> However, nobody will take that skb and send it, at least until the next
> timestampable skb is sent.
>
> With the ksz9477 driver, two-step TX timestamping is a rare process, and in
> certain configs it may happen even as rarely as once per second.
>
> So if the race condition described above happens, we might experience huge
> delays.
>
> To close that race, let's not keep a single work item per port, and a skb
> timestamping queue, but rather dynamically allocate a work item per packet.
>
> It is also unnecessary to have more than one kthread that does the work.
> So delete the per-port kthread allocation and replace them with a single
> kthread which is global to the switch.
>
> Signed-off-by: Vladimir Oltean <vla...@nx...>
> ---
>  drivers/net/dsa/microchip/ksz9477_ptp.c | 88 ++++++++++++++-----------
>  drivers/net/dsa/microchip/ksz_common.h  |  1 -
>  include/linux/dsa/ksz_common.h          | 11 ++--
>  net/dsa/tag_ksz.c                       | 31 +++++----
>  4 files changed, 73 insertions(+), 58 deletions(-)
>
> diff --git a/drivers/net/dsa/microchip/ksz9477_ptp.c b/drivers/net/dsa/microchip/ksz9477_ptp.c
> index c646689cb71e..0f05aafbdd3d 100644
> --- a/drivers/net/dsa/microchip/ksz9477_ptp.c
> +++ b/drivers/net/dsa/microchip/ksz9477_ptp.c
> @@ -749,42 +749,62 @@ static void ksz9477_ptp_txtstamp_skb(struct ksz_device *dev,
>  	skb_complete_tx_timestamp(skb, &hwtstamps);
>  }
>
> -#define work_to_port(work) \
> -		container_of((work), struct ksz_port_ptp_shared, xmit_work)
> -#define ptp_shared_to_ksz_port(t) \
> -		container_of((t), struct ksz_port, ptp_shared)
> -#define ptp_shared_to_ksz_device(t) \
> -		container_of((t), struct ksz_device, ptp_shared)
> +#define work_to_xmit_work(w) \
> +		container_of((w), struct ksz_deferred_xmit_work, work)
>
>  /* Deferred work is necessary for time stamped PDelay_Req messages. This cannot
>   * be done from atomic context as we have to wait for the hardware interrupt.
>   */
>  static void ksz9477_port_deferred_xmit(struct kthread_work *work)
>  {
> -	struct ksz_port_ptp_shared *prt_ptp_shared = work_to_port(work);
> -	struct ksz_port *prt = ptp_shared_to_ksz_port(prt_ptp_shared);
> -	struct ksz_device_ptp_shared *ptp_shared = prt_ptp_shared->dev;
> -	struct ksz_device *dev = ptp_shared_to_ksz_device(ptp_shared);
> -	int port = prt - dev->ports;
> -	struct sk_buff *skb;
> +	struct ksz_deferred_xmit_work *xmit_work = work_to_xmit_work(work);
> +	struct dsa_switch *ds = xmit_work->dp->ds;
> +	struct sk_buff *skb = xmit_work->skb;
> +	struct dsa_port *dp = xmit_work->dp;
> +	struct ksz_device *dev = ds->priv;
> +	struct ksz_port *prt = dp->priv;
> +
> +	reinit_completion(&prt->tstamp_completion);
>
> -	while ((skb = skb_dequeue(&prt_ptp_shared->xmit_queue)) != NULL) {
> -		struct sk_buff *clone = DSA_SKB_CB(skb)->clone;
> +	/* Transfer skb to the host port. */
> +	dsa_enqueue_skb(skb, dp->slave);
>
> -		reinit_completion(&prt->tstamp_completion);
> +	ksz9477_ptp_txtstamp_skb(dev, prt, DSA_SKB_CB(skb)->clone);
> +	kfree(xmit_work);
> +}
>
> -		/* Transfer skb to the host port. */
> -		dsa_enqueue_skb(skb, dsa_to_port(dev->ds, port)->slave);
> +static int ksz9477_ptp_shared_init(struct ksz_device *dev)
> +{
> +	struct ksz_device_ptp_shared *ptp_shared = &dev->ptp_shared;
> +	int ret;
>
> -		ksz9477_ptp_txtstamp_skb(dev, prt, clone);
> +	/* PDelay_Req messages require deferred transmit as the time
> +	 * stamp unit provides no sequenceId or similar. So we must
> +	 * wait for the time stamp interrupt.
> +	 */
> +	ptp_shared->xmit_work_fn = ksz9477_port_deferred_xmit;
> +	ptp_shared->xmit_worker = kthread_create_worker(0, "ksz_xmit");
> +	if (IS_ERR(ptp_shared->xmit_worker)) {
> +		ret = PTR_ERR(ptp_shared->xmit_worker);
> +		dev_err(dev->dev,
> +			"failed to create deferred xmit thread: %d\n", ret);
> +		return ret;
>  	}
> +
> +	return 0;
> +}
> +
> +static void ksz9477_ptp_shared_deinit(struct ksz_device *dev)
> +{
> +	struct ksz_device_ptp_shared *ptp_shared = &dev->ptp_shared;
> +
> +	kthread_destroy_worker(ptp_shared->xmit_worker);
>  }
>
>  static int ksz9477_ptp_port_init(struct ksz_device *dev, int port)
>  {
> -	struct ksz_port *prt = &dev->ports[port];
> -	struct ksz_port_ptp_shared *ptp_shared = &prt->ptp_shared;
>  	struct dsa_port *dp = dsa_to_port(dev->ds, port);
> +	struct ksz_port *prt = &dev->ports[port];
>  	int ret;
>
>  	if (port == dev->cpu_port)
> @@ -809,31 +829,15 @@ static int ksz9477_ptp_port_init(struct ksz_device *dev, int port)
>  	if (ret)
>  		goto error_disable_port_ptp_interrupts;
>
> -	/* ksz_port::ptp_shared is used in tagging driver */
> -	ptp_shared->dev = &dev->ptp_shared;
> -	dp->priv = ptp_shared;
> -
>  	/* PDelay_Req messages require deferred transmit as the time
>  	 * stamp unit provides no sequenceId or similar. So we must
>  	 * wait for the time stamp interrupt.
>  	 */
> +	dp->priv = &dev->ptp_shared;
>  	init_completion(&prt->tstamp_completion);
> -	kthread_init_work(&ptp_shared->xmit_work,
> -			  ksz9477_port_deferred_xmit);
> -	ptp_shared->xmit_worker = kthread_create_worker(0, "%s_xmit",
> -							dp->slave->name);
> -	if (IS_ERR(ptp_shared->xmit_worker)) {
> -		ret = PTR_ERR(ptp_shared->xmit_worker);
> -		dev_err(dev->dev,
> -			"failed to create deferred xmit thread: %d\n", ret);
> -		goto error_disable_port_egress_interrupts;
> -	}
> -	skb_queue_head_init(&ptp_shared->xmit_queue);
>
>  	return 0;
>
> -error_disable_port_egress_interrupts:
> -	ksz9477_ptp_enable_port_egress_interrupts(dev, port, false);
>  error_disable_port_ptp_interrupts:
>  	ksz9477_ptp_enable_port_ptp_interrupts(dev, port, false);
>  	return ret;
> @@ -841,12 +845,12 @@ static int ksz9477_ptp_port_init(struct ksz_device *dev, int port)
>
>  static void ksz9477_ptp_port_deinit(struct ksz_device *dev, int port)
>  {
> -	struct ksz_port_ptp_shared *ptp_shared = &dev->ports[port].ptp_shared;
> +	struct dsa_port *dp = dsa_to_port(dev->ds, port);
>
>  	if (port == dev->cpu_port)
>  		return;
>
> -	kthread_destroy_worker(ptp_shared->xmit_worker);
> +	dp->priv = NULL;
>  	ksz9477_ptp_enable_port_egress_interrupts(dev, port, false);
>  	ksz9477_ptp_enable_port_ptp_interrupts(dev, port, false);
>  }
> @@ -856,6 +860,10 @@ static int ksz9477_ptp_ports_init(struct ksz_device *dev)
>  	int port;
>  	int ret;
>
> +	ret = ksz9477_ptp_shared_init(dev);
> +	if (ret)
> +		return ret;
> +
>  	for (port = 0; port < dev->port_cnt; port++) {
>  		ret = ksz9477_ptp_port_init(dev, port);
>  		if (ret)
> @@ -867,6 +875,7 @@ static int ksz9477_ptp_ports_init(struct ksz_device *dev)
>  error_deinit:
>  	while (port-- > 0)
>  		ksz9477_ptp_port_deinit(dev, port);
> +	ksz9477_ptp_shared_deinit(dev);
>  	return ret;
>  }
>
> @@ -876,6 +885,7 @@ static void ksz9477_ptp_ports_deinit(struct ksz_device *dev)
>
>  	for (port = 0; port < dev->port_cnt; port++)
>  		ksz9477_ptp_port_deinit(dev, port);
> +	ksz9477_ptp_shared_deinit(dev);
>  }
>
>  /* device attributes */
> diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
> index c9495c92a32d..abcbcbb3fcef 100644
> --- a/drivers/net/dsa/microchip/ksz_common.h
> +++ b/drivers/net/dsa/microchip/ksz_common.h
> @@ -45,7 +45,6 @@ struct ksz_port {
>  	struct ksz_port_mib mib;
>  	phy_interface_t interface;
>  #if IS_ENABLED(CONFIG_NET_DSA_MICROCHIP_KSZ9477_PTP)
> -	struct ksz_port_ptp_shared ptp_shared;
>  	ktime_t tstamp_xdelay;
>  	struct completion tstamp_completion;
>  	bool hwts_tx_en;
> diff --git a/include/linux/dsa/ksz_common.h b/include/linux/dsa/ksz_common.h
> index a9b4720cc842..c75bc27e3e7a 100644
> --- a/include/linux/dsa/ksz_common.h
> +++ b/include/linux/dsa/ksz_common.h
> @@ -35,13 +35,14 @@ struct ksz_device_ptp_shared {
>  	/* approximated current time, read once per second from hardware */
>  	struct timespec64 ptp_clock_time;
>  	unsigned long state;
> +	void (*xmit_work_fn)(struct kthread_work *work);
> +	struct kthread_worker *xmit_worker;
>  };
>
> -struct ksz_port_ptp_shared {
> -	struct ksz_device_ptp_shared *dev;
> -	struct kthread_worker *xmit_worker;
> -	struct kthread_work xmit_work;
> -	struct sk_buff_head xmit_queue;
> +struct ksz_deferred_xmit_work {
> +	struct dsa_port *dp;
> +	struct sk_buff *skb;
> +	struct kthread_work work;
>  };
>
>  /* net/dsa/tag_ksz.c */
> diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c
> index 415a26044565..548f66888b0a 100644
> --- a/net/dsa/tag_ksz.c
> +++ b/net/dsa/tag_ksz.c
> @@ -175,11 +175,12 @@ static void ksz9477_xmit_timestamp(struct sk_buff *skb)
>  static struct sk_buff *ksz9477_defer_xmit(struct dsa_port *dp,
>  					  struct sk_buff *skb)
>  {
> -	struct ksz_port_ptp_shared *ptp_shared = dp->priv;
> +	struct ksz_device_ptp_shared *ptp_shared = dp->priv;
>  	struct sk_buff *clone = DSA_SKB_CB(skb)->clone;
> +	struct ksz_deferred_xmit_work *xmit_work;
>  	u8 ptp_msg_type;
>
> -	if (!clone)
> +	if (!clone || !ptp_shared)
>  		return skb;  /* no deferred xmit for this packet */
>
>  	/* Use cached PTP msg type from ksz9477_ptp_port_txtstamp(). */
> @@ -188,11 +189,18 @@ static struct sk_buff *ksz9477_defer_xmit(struct dsa_port *dp,
>  	    ptp_msg_type != PTP_MSGTYPE_PDELAY_REQ)
>  		goto out_free_clone;  /* only PDelay_Req is deferred */
>
> +	xmit_work = kzalloc(sizeof(*xmit_work), GFP_ATOMIC);
> +	if (!xmit_work)
> +		return NULL;
> +
> +	kthread_init_work(&xmit_work->work, ptp_shared->xmit_work_fn);
>  	/* Increase refcount so the kfree_skb in dsa_slave_xmit
>  	 * won't really free the packet.
>  	 */
> -	skb_queue_tail(&ptp_shared->xmit_queue, skb_get(skb));
> -	kthread_queue_work(ptp_shared->xmit_worker, &ptp_shared->xmit_work);
> +	xmit_work->dp = dp;
> +	xmit_work->skb = skb_get(skb);
> +
> +	kthread_queue_work(ptp_shared->xmit_worker, &xmit_work->work);
>
>  	return NULL;
>
> @@ -232,7 +240,7 @@ static void ksz9477_rcv_timestamp(struct sk_buff *skb, u8 *tag,
>  {
>  	struct skb_shared_hwtstamps *hwtstamps = skb_hwtstamps(skb);
>  	struct dsa_switch *ds = dev->dsa_ptr->ds;
> -	struct ksz_port_ptp_shared *port_ptp_shared;
> +	struct ksz_device_ptp_shared *ptp_shared;
>  	u8 *tstamp_raw = tag - KSZ9477_PTP_TAG_LEN;
>  	struct ptp_header *ptp_hdr;
>  	unsigned int ptp_type;
> @@ -240,15 +248,14 @@ static void ksz9477_rcv_timestamp(struct sk_buff *skb, u8 *tag,
>  	ktime_t tstamp;
>  	s64 correction;
>
> -	port_ptp_shared = dsa_to_port(ds, port)->priv;
> -	if (!port_ptp_shared)
> +	ptp_shared = dsa_to_port(ds, port)->priv;
> +	if (!ptp_shared)
>  		return;
>
>  	/* convert time stamp and write to skb */
>  	tstamp = ksz9477_decode_tstamp(get_unaligned_be32(tstamp_raw));
>  	memset(hwtstamps, 0, sizeof(*hwtstamps));
> -	hwtstamps->hwtstamp = ksz9477_tstamp_reconstruct(port_ptp_shared->dev,
> -							 tstamp);
> +	hwtstamps->hwtstamp = ksz9477_tstamp_reconstruct(ptp_shared, tstamp);
>
>  	/* For PDelay_Req messages, user space (ptp4l) expects that the hardware
>  	 * subtracts the ingress time stamp from the correction field. The
> @@ -289,8 +296,7 @@ static struct sk_buff *ksz9477_xmit(struct sk_buff *skb,
>  				    struct net_device *dev)
>  {
>  	struct dsa_port *dp = dsa_slave_to_port(dev);
> -	struct ksz_port_ptp_shared *port_ptp_shared = dp->priv;
> -	struct ksz_device_ptp_shared *ptp_shared = port_ptp_shared->dev;
> +	struct ksz_device_ptp_shared *ptp_shared = dp->priv;
>  	__be16 *tag;
>  	u8 *addr;
>  	u16 val;
> @@ -347,8 +353,7 @@ static struct sk_buff *ksz9893_xmit(struct sk_buff *skb,
>  				    struct net_device *dev)
>  {
>  	struct dsa_port *dp = dsa_slave_to_port(dev);
> -	struct ksz_port_ptp_shared *port_ptp_shared = dp->priv;
> -	struct ksz_device_ptp_shared *ptp_shared = port_ptp_shared->dev;
> +	struct ksz_device_ptp_shared *ptp_shared = dp->priv;
>  	u8 *addr;
>  	u8 *tag;
>
> -----------------------------[ cut here ]-----------------------------

I applied the patches (with some help, since the patch from the Windows Outlook email didn't apply so well). I used just plain ole 1588, E2E, IPv4 transport on the GM. I turned on uber logging in ptp4l so you can see the config settings I used.

LinuxPTP wouldn't sync with the GM. I saw a tx timeout followed by what looks like a kernel oops. LinuxPTP tries to go on, but it's done. For good measure I did a clean build, repeated the tests, and got the same results.

On the console I saw more info about the kernel oops.
Posting output I saw below: root@imx8mmevk:/etc/linuxptp# ptp4l -f ./ptp4l.conf_e2e_one_step -i lan1 -m -q -l 7 ptp4l[1104.214]: config item (null).assume_two_step is 0 ptp4l[1104.214]: config item (null).check_fup_sync is 0 ptp4l[1104.214]: config item (null).tx_timestamp_timeout is 1000 ptp4l[1104.214]: config item (null).hwts_filter is 0 ptp4l[1104.214]: config item (null).clock_servo is 0 ptp4l[1104.214]: config item (null).clock_type is 32768 ptp4l[1104.214]: config item (null).clock_servo is 0 ptp4l[1104.214]: config item (null).clockClass is 6 ptp4l[1104.214]: config item (null).clockAccuracy is 254 ptp4l[1104.214]: config item (null).offsetScaledLogVariance is 65535 ptp4l[1104.214]: config item (null).productDescription is ';;' ptp4l[1104.214]: config item (null).revisionData is ';;' ptp4l[1104.214]: config item (null).userDescription is ';' ptp4l[1104.214]: config item (null).manufacturerIdentity is '00:00:00' ptp4l[1104.214]: config item (null).domainNumber is 0 ptp4l[1104.214]: config item (null).slaveOnly is 1 ptp4l[1104.214]: config item (null).gmCapable is 1 ptp4l[1104.214]: config item (null).gmCapable is 1 ptp4l[1104.214]: config item (null).G.8275.defaultDS.localPriority is 128 ptp4l[1104.214]: config item (null).maxStepsRemoved is 255 ptp4l[1104.214]: config item (null).time_stamping is 4 ptp4l[1104.214]: config item (null).twoStepFlag is 0 ptp4l[1104.214]: config item (null).twoStepFlag is 0 ptp4l[1104.214]: config item (null).time_stamping is 4 ptp4l[1104.214]: config item (null).priority1 is 128 ptp4l[1104.214]: config item (null).priority2 is 128 ptp4l[1104.215]: interface index 3 is up ptp4l[1104.215]: config item (null).free_running is 0 ptp4l[1104.215]: selected /dev/ptp1 as PTP clock ptp4l[1104.215]: config item (null).clockIdentity is '000000.0000.000000' ptp4l[1104.215]: config item (null).uds_address is '/var/run/ptp4l' ptp4l[1104.215]: section item /var/run/ptp4l.announceReceiptTimeout now 0 ptp4l[1104.215]: section item 
/var/run/ptp4l.delay_mechanism now 0 ptp4l[1104.215]: section item /var/run/ptp4l.network_transport now 0 ptp4l[1104.215]: section item /var/run/ptp4l.delay_filter_length now 1 ptp4l[1104.215]: config item (null).free_running is 0 ptp4l[1104.215]: config item (null).freq_est_interval is 1 ptp4l[1104.215]: config item (null).write_phase_mode is 0 ptp4l[1104.215]: config item (null).gmCapable is 1 ptp4l[1104.215]: config item (null).kernel_leap is 1 ptp4l[1104.215]: config item (null).utc_offset is 37 ptp4l[1104.215]: config item (null).timeSource is 160 ptp4l[1104.218]: config item (null).pi_proportional_const is 0.000000 ptp4l[1104.218]: config item (null).pi_integral_const is 0.000000 ptp4l[1104.218]: config item (null).pi_proportional_scale is 0.000000 ptp4l[1104.218]: config item (null).pi_proportional_exponent is -0.300000 ptp4l[1104.218]: config item (null).pi_proportional_norm_max is 0.700000 ptp4l[1104.218]: config item (null).pi_integral_scale is 0.000000 ptp4l[1104.218]: config item (null).pi_integral_exponent is 0.400000 ptp4l[1104.218]: config item (null).pi_integral_norm_max is 0.300000 ptp4l[1104.218]: config item (null).step_threshold is 0.000000 ptp4l[1104.218]: config item (null).first_step_threshold is 0.000020 ptp4l[1104.218]: config item (null).max_frequency is 900000000 ptp4l[1104.218]: config item (null).servo_offset_threshold is 0 ptp4l[1104.218]: config item (null).servo_num_offset_values is 10 ptp4l[1104.218]: config item (null).dataset_comparison is 0 ptp4l[1104.218]: config item (null).tsproc_mode is 3 ptp4l[1104.218]: config item (null).delay_filter is 1 ptp4l[1104.218]: config item (null).delay_filter_length is 10 ptp4l[1104.218]: config item (null).initial_delay is 0 ptp4l[1104.218]: config item (null).summary_interval is 4 ptp4l[1104.218]: config item (null).sanity_freq_limit is 200000000 ptp4l[1104.218]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000 ptp4l[1104.218]: config item /var/run/ptp4l.boundary_clock_jbod is 0 
ptp4l[1104.218]: config item /var/run/ptp4l.network_transport is 0 ptp4l[1104.218]: config item /var/run/ptp4l.masterOnly is 0 ptp4l[1104.218]: config item /var/run/ptp4l.BMCA is 0 ptp4l[1104.218]: config item /var/run/ptp4l.delayAsymmetry is 0 ptp4l[1104.218]: config item /var/run/ptp4l.follow_up_info is 0 ptp4l[1104.218]: config item /var/run/ptp4l.freq_est_interval is 1 ptp4l[1104.218]: config item /var/run/ptp4l.msg_interval_request is 0 ptp4l[1104.218]: config item /var/run/ptp4l.net_sync_monitor is 0 ptp4l[1104.219]: config item /var/run/ptp4l.path_trace_enabled is 0 ptp4l[1104.219]: config item /var/run/ptp4l.tc_spanning_tree is 0 ptp4l[1104.219]: config item /var/run/ptp4l.ingressLatency is 0 ptp4l[1104.219]: config item /var/run/ptp4l.egressLatency is 0 ptp4l[1104.219]: config item /var/run/ptp4l.delay_mechanism is 0 ptp4l[1104.219]: config item /var/run/ptp4l.hybrid_e2e is 0 ptp4l[1104.219]: config item /var/run/ptp4l.fault_badpeernet_interval is 16 ptp4l[1104.219]: config item /var/run/ptp4l.fault_reset_interval is -128 ptp4l[1104.219]: config item /var/run/ptp4l.tsproc_mode is 3 ptp4l[1104.219]: config item /var/run/ptp4l.delay_filter is 1 ptp4l[1104.219]: config item /var/run/ptp4l.delay_filter_length is 1 ptp4l[1104.219]: config item (null).slave_event_monitor is '' ptp4l[1104.219]: config item lan1.boundary_clock_jbod is 0 ptp4l[1104.219]: config item lan1.network_transport is 1 ptp4l[1104.219]: config item lan1.masterOnly is 0 ptp4l[1104.219]: config item lan1.BMCA is 0 ptp4l[1104.219]: config item lan1.delayAsymmetry is 0 ptp4l[1104.219]: config item lan1.follow_up_info is 0 ptp4l[1104.219]: config item lan1.freq_est_interval is 1 ptp4l[1104.219]: config item lan1.msg_interval_request is 0 ptp4l[1104.219]: config item lan1.net_sync_monitor is 0 ptp4l[1104.219]: config item lan1.path_trace_enabled is 0 ptp4l[1104.219]: config item lan1.tc_spanning_tree is 0 ptp4l[1104.219]: config item lan1.ingressLatency is 0 ptp4l[1104.219]: config item 
lan1.egressLatency is 0 ptp4l[1104.219]: config item lan1.delay_mechanism is 1 ptp4l[1104.219]: config item lan1.unicast_master_table is 0 ptp4l[1104.219]: config item lan1.unicast_listen is 1 ptp4l[1104.219]: section item lan1.hybrid_e2e now 1 ptp4l[1104.219]: config item lan1.inhibit_multicast_service is 0 ptp4l[1104.219]: config item lan1.hybrid_e2e is 1 ptp4l[1104.219]: config item lan1.fault_badpeernet_interval is 16 ptp4l[1104.219]: config item lan1.fault_reset_interval is -128 ptp4l[1104.219]: config item lan1.tsproc_mode is 3 ptp4l[1104.219]: config item lan1.delay_filter is 1 ptp4l[1104.219]: config item lan1.delay_filter_length is 10 ptp4l[1104.219]: config item lan1.logMinDelayReqInterval is 0 ptp4l[1104.219]: config item lan1.logAnnounceInterval is 1 ptp4l[1104.219]: config item lan1.inhibit_announce is 0 ptp4l[1104.219]: config item lan1.ignore_source_id is 0 ptp4l[1104.219]: config item lan1.announceReceiptTimeout is 3 ptp4l[1104.219]: config item lan1.syncReceiptTimeout is 0 ptp4l[1104.219]: config item lan1.transportSpecific is 0 ptp4l[1104.219]: config item lan1.ignore_transport_specific is 0 ptp4l[1104.219]: config item lan1.G.8275.portDS.localPriority is 128 ptp4l[1104.219]: config item lan1.logSyncInterval is 0 ptp4l[1104.219]: config item lan1.operLogSyncInterval is 0 ptp4l[1104.219]: config item lan1.logMinPdelayReqInterval is 0 ptp4l[1104.219]: config item lan1.operLogPdelayReqInterval is 0 ptp4l[1104.219]: config item lan1.neighborPropDelayThresh is 20000000 ptp4l[1104.219]: config item lan1.min_neighbor_prop_delay is -20000000 ptp4l[1104.219]: config item lan1.asCapable is 1 ptp4l[1104.219]: config item lan1.inhibit_delay_req is 0 ptp4l[1104.219]: config item lan1.udp_ttl is 1 ptp4l[1104.220]: config item (null).dscp_event is 0 ptp4l[1104.220]: config item (null).dscp_general is 0 ptp4l[1104.220]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[1104.220]: config item /var/run/ptp4l.logMinDelayReqInterval is 0 ptp4l[1104.220]: 
config item /var/run/ptp4l.logAnnounceInterval is 1 ptp4l[1104.220]: config item /var/run/ptp4l.inhibit_announce is 0 ptp4l[1104.220]: config item /var/run/ptp4l.ignore_source_id is 0 ptp4l[1104.220]: config item /var/run/ptp4l.announceReceiptTimeout is 0 ptp4l[1104.220]: config item /var/run/ptp4l.syncReceiptTimeout is 0 ptp4l[1104.220]: config item /var/run/ptp4l.transportSpecific is 0 ptp4l[1104.220]: config item /var/run/ptp4l.ignore_transport_specific is 0 ptp4l[1104.220]: config item /var/run/ptp4l.G.8275.portDS.localPriority is 128 ptp4l[1104.220]: config item /var/run/ptp4l.logSyncInterval is 0 ptp4l[1104.220]: config item /var/run/ptp4l.operLogSyncInterval is 0 ptp4l[1104.220]: config item /var/run/ptp4l.logMinPdelayReqInterval is 0 ptp4l[1104.220]: config item /var/run/ptp4l.operLogPdelayReqInterval is 0 ptp4l[1104.220]: config item /var/run/ptp4l.neighborPropDelayThresh is 20000000 ptp4l[1104.220]: config item /var/run/ptp4l.min_neighbor_prop_delay is -20000000 ptp4l[1104.220]: config item /var/run/ptp4l.asCapable is 1 ptp4l[1104.220]: config item /var/run/ptp4l.inhibit_delay_req is 0 ptp4l[1104.220]: config item (null).uds_address is '/var/run/ptp4l' ptp4l[1104.221]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[1104.221]: port 1: received link status notification ptp4l[1104.221]: interface index 3 is up ptp4l[1104.636]: port 1: setting asCapable ptp4l[1104.636]: port 1: new foreign master 001747.fffe.700d6b-1 ptp4l[1106.637]: selected best master clock 001747.fffe.700d6b ptp4l[1106.637]: updating UTC offset to 37 ptp4l[1106.637]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[1108.311]: port 1: delay timeout 2021 Sep 30 18:00:36 imx8mmevk [ 1108.455330] Unable to handle kernel paging request at virtual address 00000026fffe0003 2021 Sep 30 18:00:36 imx8mmevk [ 1108.463268] Mem abort info: 2021 Sep 30 18:00:36 imx8mmevk [ 1108.466171] ESR = 0x96000004 2021 Sep 30 18:00:36 imx8mmevk [ 1108.469247] EC = 0x25: DABT (current EL), IL = 32 bits 
2021 Sep 30 18:00:36 imx8mmevk [ 1108.474572] SET = 0, FnV = 0 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499442] EA = 0, S1PTW = 0 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499445] Data abort info: 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499447] ISV = 0, ISS = 0x00000004 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499450] CM = 0, WnR = 0 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499455] user pgtable: 4k pages, 48-bit VAs, pgdp=000000004493f000 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499458] [00000026fffe0003] pgd=0000000000000000, p4d=0000000000000000 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499470] Internal error: Oops: 96000004 [#2] PREEMPT SMP 2021 Sep 30 18:00:36 imx8mmevk [ 1108.724884] Code: d2800c81 f9406282 b940ba83 8b030042 (39400c43) 2021 Sep 30 18:00:36 imx8mmevk Unable to handle kernel paging request at virtual address 00000026fffe0003 2021 Sep 30 18:00:36 imx8mmevk Mem abort info: 2021 Sep 30 18:00:36 imx8mmevk ESR = 0x96000004 2021 Sep 30 18:00:36 imx8mmevk EC = 0x25: DABT (current EL), IL = 32 bits 2021 Sep 30 18:00:36 imx8mmevk SET = 0, FnV = 0 2021 Sep 30 18:00:36 imx8mmevk EA = 0, S1PTW = 0 2021 Sep 30 18:00:36 imx8mmevk Data abort info: 2021 Sep 30 18:00:36 imx8mmevk ISV = 0, ISS = 0x00000004 2021 Sep 30 18:00:36 imx8mmevk CM = 0, WnR = 0 2021 Sep 30 18:00:36 imx8mmevk user pgtable: 4k pages, 48-bit VAs, pgdp=000000004493f000 2021 Sep 30 18:00:36 imx8mmevk [00000026fffe0003] pgd=0000000000000000, p4d=0000000000000000 2021 Sep 30 18:00:36 imx8mmevk Internal error: Oops: 96000004 [#2] PREEMPT SMP 2021 Sep 30 18:00:36 imx8mmevk Code: d2800c81 f9406282 b940ba83 8b030042 (39400c43) ptp4l[1109.312]: timed out while polling for tx timestamp ptp4l[1109.312]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug ptp4l[1109.312]: port 1: send delay request failed ptp4l[1109.312]: port 1: clearing fault immediately ptp4l[1109.312]: config item lan1.logMinDelayReqInterval is 0 ptp4l[1109.312]: config item lan1.logAnnounceInterval is 1 
ptp4l[1109.312]: config item lan1.inhibit_announce is 0 ptp4l[1109.312]: config item lan1.ignore_source_id is 0 ptp4l[1109.312]: config item lan1.announceReceiptTimeout is 3 ptp4l[1109.312]: config item lan1.syncReceiptTimeout is 0 ptp4l[1109.312]: config item lan1.transportSpecific is 0 ptp4l[1109.312]: config item lan1.ignore_transport_specific is 0 ptp4l[1109.312]: config item lan1.G.8275.portDS.localPriority is 128 ptp4l[1109.312]: config item lan1.logSyncInterval is 0 ptp4l[1109.312]: config item lan1.operLogSyncInterval is 0 ptp4l[1109.312]: config item lan1.logMinPdelayReqInterval is 0 ptp4l[1109.312]: config item lan1.operLogPdelayReqInterval is 0 ptp4l[1109.312]: config item lan1.neighborPropDelayThresh is 20000000 ptp4l[1109.312]: config item lan1.min_neighbor_prop_delay is -20000000 ptp4l[1109.312]: config item lan1.asCapable is 1 ptp4l[1109.312]: config item lan1.inhibit_delay_req is 0 ptp4l[1109.312]: config item lan1.udp_ttl is 1 ptp4l[1109.313]: config item (null).dscp_event is 0 ptp4l[1109.313]: config item (null).dscp_general is 0 ptp4l[1109.313]: port 1: UNCALIBRATED to LISTENING on INIT_COMPLETE ptp4l[1109.313]: port 1: received link status notification ptp4l[1109.313]: interface index 3 is up ptp4l[1109.638]: port 1: setting asCapable ptp4l[1109.638]: port 1: new foreign master 001747.fffe.700d6b-1 ptp4l[1111.639]: selected best master clock 001747.fffe.700d6b ptp4l[1111.639]: updating UTC offset to 37 ptp4l[1111.639]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[1112.047]: port 1: delay timeout ptp4l[1113.048]: timed out while polling for tx timestamp ptp4l[1113.048]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug ptp4l[1113.048]: port 1: send delay request failed ptp4l[1113.048]: port 1: clearing fault immediately ptp4l[1113.048]: config item lan1.logMinDelayReqInterval is 0 ptp4l[1113.048]: config item lan1.logAnnounceInterval is 1 ptp4l[1113.048]: config item lan1.inhibit_announce is 0 
ptp4l[1113.048]: config item lan1.ignore_source_id is 0 ptp4l[1113.048]: config item lan1.announceReceiptTimeout is 3 ptp4l[1113.048]: config item lan1.syncReceiptTimeout is 0 ptp4l[1113.048]: config item lan1.transportSpecific is 0 ptp4l[1113.048]: config item lan1.ignore_transport_specific is 0 ptp4l[1113.048]: config item lan1.G.8275.portDS.localPriority is 128 ptp4l[1113.048]: config item lan1.logSyncInterval is 0 ptp4l[1113.048]: config item lan1.operLogSyncInterval is 0 ptp4l[1113.048]: config item lan1.logMinPdelayReqInterval is 0 ptp4l[1113.048]: config item lan1.operLogPdelayReqInterval is 0 ptp4l[1113.048]: config item lan1.neighborPropDelayThresh is 20000000 ptp4l[1113.048]: config item lan1.min_neighbor_prop_delay is -20000000 ptp4l[1113.048]: config item lan1.asCapable is 1 ptp4l[1113.048]: config item lan1.inhibit_delay_req is 0 ptp4l[1113.048]: config item lan1.udp_ttl is 1 ptp4l[1113.049]: config item (null).dscp_event is 0 ptp4l[1113.049]: config item (null).dscp_general is 0 ptp4l[1113.049]: port 1: UNCALIBRATED to LISTENING on INIT_COMPLETE ptp4l[1113.049]: port 1: received link status notification ptp4l[1113.049]: interface index 3 is up ptp4l[1113.640]: port 1: setting asCapable ptp4l[1113.640]: port 1: new foreign master 001747.fffe.700d6b-1 ptp4l[1115.640]: selected best master clock 001747.fffe.700d6b ptp4l[1115.640]: updating UTC offset to 37 ptp4l[1115.640]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[1117.135]: port 1: delay timeout ptp4l[1118.136]: timed out while polling for tx timestamp ptp4l[1118.136]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug ptp4l[1118.136]: port 1: send delay request failed ptp4l[1118.136]: port 1: clearing fault immediately ptp4l[1118.136]: config item lan1.logMinDelayReqInterval is 0 ptp4l[1118.136]: config item lan1.logAnnounceInterval is 1 ptp4l[1118.136]: config item lan1.inhibit_announce is 0 ptp4l[1118.136]: config item lan1.ignore_source_id is 0 
ptp4l[1118.136]: config item lan1.announceReceiptTimeout is 3 ptp4l[1118.136]: config item lan1.syncReceiptTimeout is 0 ptp4l[1118.136]: config item lan1.transportSpecific is 0 ptp4l[1118.136]: config item lan1.ignore_transport_specific is 0 ptp4l[1118.136]: config item lan1.G.8275.portDS.localPriority is 128 ptp4l[1118.136]: config item lan1.logSyncInterval is 0 ptp4l[1118.136]: config item lan1.operLogSyncInterval is 0 ptp4l[1118.136]: config item lan1.logMinPdelayReqInterval is 0 ptp4l[1118.136]: config item lan1.operLogPdelayReqInterval is 0 ptp4l[1118.136]: config item lan1.neighborPropDelayThresh is 20000000 ptp4l[1118.136]: config item lan1.min_neighbor_prop_delay is -20000000 ptp4l[1118.136]: config item lan1.asCapable is 1 ptp4l[1118.136]: config item lan1.inhibit_delay_req is 0 ptp4l[1118.136]: config item lan1.udp_ttl is 1 ptp4l[1118.137]: config item (null).dscp_event is 0 ptp4l[1118.137]: config item (null).dscp_general is 0 ptp4l[1118.137]: port 1: UNCALIBRATED to LISTENING on INIT_COMPLETE ptp4l[1118.137]: port 1: received link status notification ptp4l[1118.137]: interface index 3 is up ptp4l[1118.642]: port 1: setting asCapable ptp4l[1118.642]: port 1: new foreign master 001747.fffe.700d6b-1 ptp4l[1120.643]: selected best master clock 001747.fffe.700d6b ptp4l[1120.643]: updating UTC offset to 37 ptp4l[1120.643]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[1120.787]: port 1: delay timeout ptp4l[1121.788]: timed out while polling for tx timestamp ptp4l[1121.788]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug ptp4l[1121.788]: port 1: send delay request failed ptp4l[1121.788]: port 1: clearing fault immediately ptp4l[1121.788]: config item lan1.logMinDelayReqInterval is 0 ptp4l[1121.788]: config item lan1.logAnnounceInterval is 1 ptp4l[1121.788]: config item lan1.inhibit_announce is 0 ptp4l[1121.788]: config item lan1.ignore_source_id is 0 ptp4l[1121.788]: config item lan1.announceReceiptTimeout 
is 3 ptp4l[1121.788]: config item lan1.syncReceiptTimeout is 0 ptp4l[1121.788]: config item lan1.transportSpecific is 0 ptp4l[1121.788]: config item lan1.ignore_transport_specific is 0 ptp4l[1121.788]: config item lan1.G.8275.portDS.localPriority is 128 ptp4l[1121.788]: config item lan1.logSyncInterval is 0 ptp4l[1121.788]: config item lan1.operLogSyncInterval is 0 ptp4l[1121.788]: config item lan1.logMinPdelayReqInterval is 0 ptp4l[1121.788]: config item lan1.operLogPdelayReqInterval is 0 ptp4l[1121.788]: config item lan1.neighborPropDelayThresh is 20000000 ptp4l[1121.788]: config item lan1.min_neighbor_prop_delay is -20000000 ptp4l[1121.788]: config item lan1.asCapable is 1 ptp4l[1121.788]: config item lan1.inhibit_delay_req is 0 ptp4l[1121.788]: config item lan1.udp_ttl is 1 ptp4l[1121.789]: config item (null).dscp_event is 0 ptp4l[1121.789]: config item (null).dscp_general is 0 ptp4l[1121.789]: port 1: UNCALIBRATED to LISTENING on INIT_COMPLETE ptp4l[1121.789]: port 1: received link status notification ptp4l[1121.789]: interface index 3 is up ptp4l[1121.818]: port 1: setting asCapable ptp4l[1122.643]: port 1: new foreign master 001747.fffe.700d6b-1 ^Croot@imx8mmevk:/etc/linuxptp# Console output: [ 1108.463268] Mem abort info: [ 1108.466171] ESR = 0x96000004 [ 1108.469247] EC = 0x25: DABT (current EL), IL = 32 bits [ 1108.474572] SET = 0, FnV = 0 2021 Sep 30 18:00:36 imx8mmevk [ 1108.455330] Unable to handle kernel paging request at virtual address 00000026fffe0003 2021 Sep 30 18:00:36 imx8mmevk [ 1108.463268] Mem abort info: 2021 Sep 30 18:00:36 imx8mmevk [ 1108.466171] ESR = 0x96000004 [ 1108.499442] EA = 0, S1PTW = 0 [ 1108.499445] Data abort info: [ 1108.499447] ISV = 0, ISS = 0x00000004 [ 1108.499450] CM = 0, WnR = 0 [ 1108.499455] user pgtable: 4k pages, 48-bit VAs, pgdp=000000004493f000 [ 1108.499458] [00000026fffe0003] pgd=0000000000000000, p4d=0000000000000000 [ 1108.499470] Internal error: Oops: 96000004 [#2] PREEMPT SMP [ 1108.499474] Modules linked 
in: crct10dif_ce(+) fsl_imx8_ddr_perf(+) error(+) clk_bd718x7(+) snvs_pwrkey(+) rtc_snvs(+) imx8mm_thermal(+) snd_soc_fsl_sai(+) snd_soc_simple_card_utils(+) imx_cpufreq_dt(+) [ 1108.549100] CPU: 1 PID: 171 Comm: ksz_xmit Tainted: G D 5.10.32 #1 [ 1108.549102] Hardware name: FSL i.MX8MM EVK board (DT) [ 1108.549108] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--) [ 1108.549119] pc : ksz9477_port_deferred_xmit+0x70/0xe8 [ 1108.549126] lr : ksz9477_port_deferred_xmit+0x54/0xe8 [ 1108.577742] sp : ffff800012e7bdb0 [ 1108.577745] x29: ffff800012e7bdb0 x28: 0000000000000000 [ 1108.586372] x27: ffff8000128a3838 x26: ffff00000498f448 [ 1108.586380] x25: 0000000000000001 x24: ffff0000041a8188 [ 1108.597001] x23: ffff0000034c1080 x22: ffff000007666000 [ 1108.602317] x21: ffff0000034c12e8 x20: ffff00000000005c [ 1108.607635] x19: ffff00000346e580 x18: 0000000000000000 [ 1108.607641] x17: 0000000000000000 x16: 0000000000000000 [ 1108.607646] x15: 0000000000000000 x14: 0d3631207369206c [ 1108.607652] x13: 0000000000000007 x12: 0000000000000000 [ 1108.628887] x11: ffff000003453b08 x10: ffff00000539b540 [ 1108.628894] x9 : ffff800010010664 x8 : 00000000000003e8 [ 1108.639516] x7 : ffff00000a844000 x6 : 00000000025454c7 [ 1108.639525] x5 : 00ffffffffffffff x4 : 0000000000000016 [ 1108.650141] x3 : 00000000ffff0000 x2 : 00000026fffe0000 [ 1108.655456] x1 : 0000000000000064 x0 : ffff0000034c1878 2021 Sep 30 18:00:36 imx8mmevk [ 1108.469247] EC = 0x25: DABT (current EL), IL = 32 bits 2021 Sep 30 18:00:36 imx8mmevk [ 1108.474572] SET = 0, FnV = 0 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499442] EA = 0, S1PTW = 0 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499445] Data abort info: 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499447] ISV = 0, ISS = 0x00000004 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499450] CM = 0, WnR = 0 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499455] user pgtable: 4k pages, 48-bit VAs, pgdp=000000004493f000 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499458] [00000026fffe0003] 
pgd=0000000000000000, p4d=0000000000000000 2021 Sep 30 18:00:36 imx8mmevk [ 1108.499470] Internal error: Oops: 96000004 [#2] PREEMPT SMP [ 1108.724841] Call trace: [ 1108.724853] ksz9477_port_deferred_xmit+0x70/0xe8 [ 1108.724861] kthread_worker_fn+0xa0/0x170 [ 1108.724866] kthread+0x148/0x168 [ 1108.724872] ret_from_fork+0x10/0x34 [ 1108.724884] Code: d2800c81 f9406282 b940ba83 8b030042 (39400c43) [ 1108.748918] ---[ end trace 0eee13d84a999751 ]--- 2021 Sep 30 18:00:36 imx8mmevk [ 1108.724884] Code: d2800c81 f9406282 b940ba83 8b030042 (39400c43) 2021 Sep 30 18:00:36 imx8mmevk Unable to handle kernel paging request at virtual address 00000026fffe0003 2021 Sep 30 18:00:36 imx8mmevk Mem abort info: 2021 Sep 30 18:00:36 imx8mmevk ESR = 0x96000004 2021 Sep 30 18:00:36 imx8mmevk EC = 0x25: DABT (current EL), IL = 32 bits 2021 Sep 30 18:00:36 imx8mmevk SET = 0, FnV = 0 2021 Sep 30 18:00:36 imx8mmevk EA = 0, S1PTW = 0 2021 Sep 30 18:00:36 imx8mmevk Data abort info: 2021 Sep 30 18:00:36 imx8mmevk ISV = 0, ISS = 0x00000004 2021 Sep 30 18:00:36 imx8mmevk CM = 0, WnR = 0 2021 Sep 30 18:00:36 imx8mmevk user pgtable: 4k pages, 48-bit VAs, pgdp=000000004493f000 2021 Sep 30 18:00:36 imx8mmevk [00000026fffe0003] pgd=0000000000000000, p4d=0000000000000000 2021 Sep 30 18:00:36 imx8mmevk Internal error: Oops: 96000004 [#2] PREEMPT SMP 2021 Sep 30 18:00:36 imx8mmevk Code: d2800c81 f9406282 b940ba83 8b030042 (39400c43) CONFIDENTIALITY NOTICE: This email and any attachments are for the sole use of the intended recipient and may contain material that is proprietary, confidential, privileged or otherwise legally protected or restricted under applicable government laws. Any review, disclosure, distributing or other use without expressed permission of the sender is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies without reading, printing, or saving. |
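The `ESR = 0x96000004` value and the bracketed word on the `Code:` line in the oops above can be unpacked by hand. What follows is a minimal sketch in plain C, not kernel code. The names `esr_fields`, `esr_decode` and `ldrb_uimm_decode` are made up for illustration. It assumes the standard AArch64 ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0) and the LDRB (immediate, unsigned offset) instruction encoding.

```c
#include <stdint.h>

/* Fields of an AArch64 ESR_ELx value, as printed under "Mem abort info".
 * Hypothetical helper type, for illustration only. */
struct esr_fields {
	unsigned int ec;   /* exception class, bits [31:26] */
	unsigned int il;   /* instruction length bit, bit [25] */
	uint32_t iss;      /* instruction specific syndrome, bits [24:0] */
	unsigned int wnr;  /* write-not-read, ISS bit [6] (data aborts) */
	unsigned int dfsc; /* data fault status code, ISS bits [5:0] */
};

/* Split a 32-bit ESR value into the fields above. */
static struct esr_fields esr_decode(uint32_t esr)
{
	struct esr_fields f;

	f.ec = (esr >> 26) & 0x3f;
	f.il = (esr >> 25) & 0x1;
	f.iss = esr & 0x01ffffffu;
	f.wnr = (f.iss >> 6) & 0x1;
	f.dfsc = f.iss & 0x3f;
	return f;
}

/* Decode an LDRB (immediate, unsigned offset) instruction word.
 * On success stores the base register number and the byte offset
 * and returns 0; returns -1 for any other instruction. */
static int ldrb_uimm_decode(uint32_t insn, unsigned int *rn, unsigned int *imm)
{
	if ((insn & 0xffc00000u) != 0x39400000u)
		return -1;
	*imm = (insn >> 10) & 0xfff; /* imm12, not scaled for byte loads */
	*rn = (insn >> 5) & 0x1f;    /* base register Xn */
	return 0;
}
```

For the values logged above, `esr_decode(0x96000004)` gives EC 0x25, IL 1, ISS 0x4 and WnR 0, which matches the "Mem abort info" and "Data abort info" lines. The bracketed instruction word 0x39400c43 decodes as a byte load from x2 + 3, and with x2 = 00000026fffe0000 in the register dump that is the faulting address 00000026fffe0003.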
From: <Bri...@L3...> - 2021-10-18 14:06:48
|
Hi Vladimir,

Thanks for digging into this ... answers below.

> -----Original Message-----
> From: Vladimir Oltean <ol...@gm...>
> Sent: Monday, October 18, 2021 7:40 AM
> To: Hutchinson, Brian (US) - PSPC <Bri...@L3...>
> Cc: lin...@li...; ce...@ar...
> Subject: [EXTERNAL] Re: [Linuxptp-users] Using G.8275.2 profile and getting
> tx timestamp timeout, but changing logSyncInterval etc. changes how often
> this happens
>
> On Fri, Oct 15, 2021 at 12:01:24AM +0000, Bri...@L3... wrote:
> > > > > If this is a "stack" issue, what can I do to reduce the "message rate"
> > > > > or "grant duration" if these are related to whatever a "stack"
> > > > > issue is?
> > > >
> > > > I'd be willing to put my money on a driver bug. But for that you'd
> > > > need to confirm that the issue reproduces with the default.cfg and
> > > > not just with the G.8275.2 profile. Don't try to run before you can walk.
> >
> > So I ran tests using a plain 1588 profile and E2E and yes the problem still
> > happens. Here is that config:
>
> There's something that just doesn't compute for me.
> In those patches, Christian wrote:
>
>     /* Currently, only P2P delay measurement is supported. Setting ocmode
>      * to slave will work independently of actually being master or slave.
>      * For E2E delay measurement, switching between master and slave would
>      * be required, as the KSZ devices filters out PTP messages depending on
>      * the ocmode setting:
>      * - in slave mode, DelayReq messages are filtered out
>      * - in master mode, Sync messages are filtered out
>      * Currently (and probably also in future) there is no interface in the
>      * kernel which allows switching between master and slave mode. For
>      * this reason, E2E cannot be supported. See patchwork for full
>      * discussion:
>      * https://patchwork.ozlabs.org/project/netdev/patch/20201019172435.4416-8-c...@ar.../
>      */
>     ksz9477_ptp_tcmode_set(dev, KSZ9477_PTP_TCMODE_P2P);
>     ksz9477_ptp_ocmode_set(dev, KSZ9477_PTP_OCMODE_SLAVE);
>
> Did you modify the driver's OCMODE? I am super confused as to which

Yes. You echo -n E2E > /sys/class/ptp/ptp1/device/tcmode ... and echo -n slave > /sys/class/ptp/ptp1/device/ocmode ... but for me they default to E2E and slave, so I just verify that they are correct before running.

I'm using the imx8mm FEC MAC driver as a fixed-link. Before we realized we needed G.8275.2 and bonding for redundancy we just used the fec_ptp, which shows up as /dev/ptp0, and the ksz9567 shows up as /dev/ptp1.

> packets ptp4l is actually waiting for a TX timestamp for. Because if you're
> using E2E and not P2P, then the entire ksz9477_port_deferred_xmit() is just
> dead code, is it not?

It doesn't look like dead code to me ...

> > [global]
> > #
> > # Default Data Set
>
> (summary of your changes)
>
> twoStepFlag: 1 to 0
> slaveOnly: 0 to 1
> clockClass: 248 to 6
> fault_reset_interval: 4 to -128
> tx_timestamp_timeout: 10 to 1000
> unicast_listen: 0 to 1
> unicast_req_duration: 3600 to 300
> summary_interval: 0 to 4
> time_stamping: hardware to p2p1step
> tsproc_mode: filter to raw_weight
>
> Can you just print the packet in ptp4l? You're using the default.cfg settings
> otherwise, so the UDPv4 network_transport, so:
>
> static int udp_send(struct transport *t, struct fdarray *fda,
>                     enum transport_event event, int peer, void *buf, int len,
>                     struct address *addr, struct hw_timestamp *hwts)
> ...
>
>         cnt = sendto(fd, buf, len, 0, &addr->sa, sizeof(addr->sin));
>         if (cnt < 1) {
>                 pr_err("sendto failed: %m");
>                 return -errno;
>         }
>         /*
>          * Get the time stamp right away.
>          */
>         return event == TRANS_EVENT ? sk_receive(fd, junk, len, NULL,
>                                                  hwts, MSG_ERRQUEUE) : cnt;
>                                       ^
>                                       you can print the buf here if
>                                       sk_receive returns negative

Ok, I'll look at it.

>
> The only place I find where this makes sense to be called from is:
> port_delay_request:
>         if (port_prepare_and_send(p, msg, TRANS_EVENT)) {
>
> But that further suggests that you've modified the driver, because:
>
> /* Defer transmit if waiting for egress time stamp is required. */
> static struct sk_buff *ksz9477_defer_xmit(struct dsa_port *dp,
>                                           struct sk_buff *skb)
> {
>         /* Use cached PTP msg type from ksz9477_ptp_port_txtstamp(). */
>         ptp_msg_type = KSZ9477_SKB_CB(clone)->ptp_msg_type;
>         if (ptp_msg_type != PTP_MSGTYPE_PDELAY_REQ)
>                 goto out_free_clone;  /* only PDelay_Req is deferred */
>
> So could you share the exact list of changes you've made to the patches from
> the form that they were posted in?

I haven't really changed anything in Christian's code, so it may be best to check out the .tar attached to his recent email. I thought his patches were all posted, but maybe not.

> >
> > And I did find a bug in the DSA driver but it didn't appear to change
> > anything.
> >
> > In ksz9477_ptp_txtstamp_skb function the "ret" that is being assigned
> > by "wait_for_completion_timeout" returning is declared as an "int"
> > instead of an "unsigned long" so I fixed that.
>
> Doesn't really make a difference on a 64-bit machine.
> Nonetheless, is that the sticking point? Do you see this error message in
> dmesg when user space loses the TX timestamp?
>
>         dev_err(dev->dev, "timeout waiting for time stamp\n");

Yes, that's what I'm seeing.

>
> > ... still looking for other stuff but again, I'm probably not
> > experienced enough (yet) with DSA and LinuxPTP to do much good.
Regards,
Brian
|
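The suggestion in the message above, printing the packet when the TX timestamp read fails, can be sketched as follows. This is a standalone illustration in C, not a patch against ptp4l. The helper names `hex_dump` and `ptp_msg_type_name` are hypothetical, and the message type values assume the IEEE 1588 common header, where the low nibble of the first byte is the messageType. Inside ptp4l the resulting string would presumably be handed to a logging call such as the `pr_err()` seen in the quoted `udp_send()`.

```c
#include <stddef.h>
#include <stdio.h>

/* Render up to len bytes of buf as space-separated hex into out.
 * Stops early if out is too small. Returns the string length. */
static size_t hex_dump(const unsigned char *buf, size_t len,
		       char *out, size_t outsz)
{
	size_t i, pos = 0;

	if (outsz == 0)
		return 0;
	out[0] = '\0';
	/* Each byte needs 3 characters plus the terminating NUL. */
	for (i = 0; i < len && pos + 3 < outsz; i++)
		pos += (size_t)snprintf(out + pos, outsz - pos, "%02x ", buf[i]);
	if (pos > 0)
		out[--pos] = '\0'; /* drop the trailing space */
	return pos;
}

/* Name the PTP event message type from the first header byte.
 * Assumes the IEEE 1588 header: messageType is the low nibble. */
static const char *ptp_msg_type_name(const unsigned char *buf, size_t len)
{
	if (len < 1)
		return "too short";
	switch (buf[0] & 0x0f) {
	case 0x0: return "Sync";
	case 0x1: return "Delay_Req";
	case 0x2: return "Pdelay_Req";
	case 0x3: return "Pdelay_Resp";
	default:  return "other";
	}
}
```

Knowing whether the stuck packet is a Delay_Req or a Pdelay_Req would answer the question raised in the thread about which message ptp4l is waiting on a TX timestamp for.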
From: <Bri...@L3...> - 2021-10-18 13:58:27
|
Hi Vladimir,

> -----Original Message-----
> From: Vladimir Oltean <ol...@gm...>
> Sent: Monday, October 18, 2021 9:46 AM
> To: Christian Eggers <ce...@ar...>
> Cc: Hutchinson, Brian (US) - PSPC <Bri...@L3...>;
> lin...@li...
> Subject: [EXTERNAL] Re: [Linuxptp-users] Using G.8275.2 profile and getting
> tx timestamp timeout, but changing logSyncInterval etc. changes how often
> this happens
>
> Hi Christian,
>
> On Mon, Oct 18, 2021 at 02:14:12PM +0200, Christian Eggers wrote:
> > Hi Vladimir,
> >
> > On Monday, 18 October 2021, 13:39:42 CEST, Vladimir Oltean wrote:
> > > There's something that just doesn't compute for me.
> > > In those patches, Christian wrote:
> > >
> > > 	/* Currently, only P2P delay measurement is supported. Setting ocmode
> > > 	 * to slave will work independently of actually being master or slave.
> > > 	 * For E2E delay measurement, switching between master and slave would
> > > 	 * be required, as the KSZ devices filters out PTP messages depending on
> > > 	 * the ocmode setting:
> > > 	 * - in slave mode, DelayReq messages are filtered out
> > > 	 * - in master mode, Sync messages are filtered out
> > > 	 * Currently (and probably also in future) there is no interface in the
> > > 	 * kernel which allows switching between master and slave mode. For
> > > 	 * this reason, E2E cannot be supported. See patchwork for full
> > > 	 * discussion:
> > > 	 * https://patchwork.ozlabs.org/project/netdev/patch/20201019172435.4416-8-c...@ar.../
> > > 	 */
> > > 	ksz9477_ptp_tcmode_set(dev, KSZ9477_PTP_TCMODE_P2P);
> > > 	ksz9477_ptp_ocmode_set(dev, KSZ9477_PTP_OCMODE_SLAVE);
> > >
> > > Did you modify the driver's OCMODE? I am super confused as to which
> > > packets ptp4l is actually waiting for a TX timestamp for. Because if
> > > you're using E2E and not P2P, then the entire
> > > ksz9477_port_deferred_xmit() is just dead code, is it not?
> >
> > I attached the patch series which I originally provided to Brian. This
> > series is for linux-5.10.x. The backports folder contains patches
> > which are already present in 5.11 and later kernels (some of them are
> > even in the latest 5.10-stable). For recent kernels, the ksz9563_ptp
> > folder is sufficient.
> >
> > Compared with the latest series I sent to netdev, I added
> > 0010-net-dsa-microchip-ksz9477-add-E2E-support.patch for E2E support.
> > This was rejected by the ptp4l maintainer as it would require ptp4l to
> > dynamically switch the KSZ hardware between master and slave mode
> > (there are packet filters in hardware which cannot entirely be disabled).
> >
> > Currently there is not much demand for PTP in our current product
> > development. But if the KSZ work can be finished / mainlined, I am
> > highly interested. My latest status was (IIRC):
> >
> > 1. There is something wrong with the time stamping offsets. As a
> > result, 1PPS works nearly perfectly with two KSZ devices, but shows a
> > constant offset when using a Meinberg clock as master.
> >
> > 2. Currently the user is responsible for providing a start time (for
> > PPS) that is in the future. In case the time point has already
> > elapsed, the driver will report an error.
> >
> > 3. Occasional timeouts when waiting for TX timestamps. I think that I
> > already implemented the driver changes you requested, but probably the
> > problem still persists. It is even possible that the (Brian's) KSZ9567
> > suffers from hardware bugs here which are not present on (my) KSZ9563
> > (I think the guy from Microchip mentioned this).
> >
> > 4. Sometimes the 1PPS becomes completely out of sync (but then recovers
> > later). This is surprising for me, as ptp4l uses filters/regulators and
> > should not be affected by single packet / timestamp failures.
> >
> > Is there anything I can help with?
>
> Thanks for the clarification, it makes more sense now since the picture is
> more complete.
>
> I've no idea about the hardware specifics since I don't own the hardware. But
> I noticed a logical issue which might be relevant to the TX timestamp timeout
> events that Brian is seeing.
>
> Brian, can you try out the patch below? Compile-tested only, as mentioned I
> can't really do much more.
>
> -----------------------------[ cut here ]-----------------------------
> From 7f4771bc19db9b48577f3f45ba907e5c13aea808 Mon Sep 17 00:00:00 2001
> From: Vladimir Oltean <vla...@nx...>
> Date: Mon, 18 Oct 2021 16:35:18 +0300
> Subject: [PATCH] net: dsa: ksz9477: use a kthread work item per deferred skb
>
> There might be a race in tag_ksz.c between these two lines:
>
> 	skb_queue_tail(&ptp_shared->xmit_queue, skb_get(skb));
> 	kthread_queue_work(ptp_shared->xmit_worker, &ptp_shared->xmit_work);

Yes sir, happy to try that and report back. I suspect a race condition somewhere too and have been looking along those lines.

>
> and the skb dequeue logic in ksz9477_port_deferred_xmit(). For example,
> the xmit_work might be already queued, however the work item has just
> finished walking through the skb queue. Because we don't check the return
> code from kthread_queue_work, we don't do anything if the work item is
> already queued.
>
> However, nobody will take that skb and send it, at least until the next
> timestampable skb is sent.
>
> With the ksz9477 driver, two-step TX timestamping is a rare process, and in
> certain configs it may happen even as rarely as once per second.

I have both LinuxPTP and my Trimble GM200 GM configured for 1 Step. I think there are issues with this part doing Two Step.

>
> So if the race condition described above happens, we might experience huge
> delays.
>
> To close that race, let's not keep a single work item per port, and a skb
> timestamping queue, but rather dynamically allocate a work item per packet.
>
> It is also unnecessary to have more than one kthread that does the work.
> So delete the per-port kthread allocation and replace them with a single
> kthread which is global to the switch.
>
> Signed-off-by: Vladimir Oltean <vla...@nx...>
> ---
>  drivers/net/dsa/microchip/ksz9477_ptp.c | 88 ++++++++++++++-----------
>  drivers/net/dsa/microchip/ksz_common.h  |  1 -
>  include/linux/dsa/ksz_common.h          | 11 ++--
>  net/dsa/tag_ksz.c                       | 31 +++++----
>  4 files changed, 73 insertions(+), 58 deletions(-)
>
> diff --git a/drivers/net/dsa/microchip/ksz9477_ptp.c b/drivers/net/dsa/microchip/ksz9477_ptp.c
> index c646689cb71e..0f05aafbdd3d 100644
> --- a/drivers/net/dsa/microchip/ksz9477_ptp.c
> +++ b/drivers/net/dsa/microchip/ksz9477_ptp.c
> @@ -749,42 +749,62 @@ static void ksz9477_ptp_txtstamp_skb(struct ksz_device *dev,
>  	skb_complete_tx_timestamp(skb, &hwtstamps);
>  }
>
> -#define work_to_port(work) \
> -		container_of((work), struct ksz_port_ptp_shared, xmit_work)
> -#define ptp_shared_to_ksz_port(t) \
> -		container_of((t), struct ksz_port, ptp_shared)
> -#define ptp_shared_to_ksz_device(t) \
> -		container_of((t), struct ksz_device, ptp_shared)
> +#define work_to_xmit_work(w) \
> +		container_of((w), struct ksz_deferred_xmit_work, work)
>
>  /* Deferred work is necessary for time stamped PDelay_Req messages. This cannot
>   * be done from atomic context as we have to wait for the hardware interrupt.
>   */
>  static void ksz9477_port_deferred_xmit(struct kthread_work *work)
>  {
> -	struct ksz_port_ptp_shared *prt_ptp_shared = work_to_port(work);
> -	struct ksz_port *prt = ptp_shared_to_ksz_port(prt_ptp_shared);
> -	struct ksz_device_ptp_shared *ptp_shared = prt_ptp_shared->dev;
> -	struct ksz_device *dev = ptp_shared_to_ksz_device(ptp_shared);
> -	int port = prt - dev->ports;
> -	struct sk_buff *skb;
> +	struct ksz_deferred_xmit_work *xmit_work = work_to_xmit_work(work);
> +	struct dsa_switch *ds = xmit_work->dp->ds;
> +	struct sk_buff *skb = xmit_work->skb;
> +	struct dsa_port *dp = xmit_work->dp;
> +	struct ksz_device *dev = ds->priv;
> +	struct ksz_port *prt = dp->priv;
> +
> +	reinit_completion(&prt->tstamp_completion);
>
> -	while ((skb = skb_dequeue(&prt_ptp_shared->xmit_queue)) != NULL) {
> -		struct sk_buff *clone = DSA_SKB_CB(skb)->clone;
> +	/* Transfer skb to the host port. */
> +	dsa_enqueue_skb(skb, dp->slave);
>
> -		reinit_completion(&prt->tstamp_completion);
> +	ksz9477_ptp_txtstamp_skb(dev, prt, DSA_SKB_CB(skb)->clone);
> +	kfree(xmit_work);
> +}
>
> -		/* Transfer skb to the host port. */
> -		dsa_enqueue_skb(skb, dsa_to_port(dev->ds, port)->slave);
> +static int ksz9477_ptp_shared_init(struct ksz_device *dev)
> +{
> +	struct ksz_device_ptp_shared *ptp_shared = &dev->ptp_shared;
> +	int ret;
>
> -		ksz9477_ptp_txtstamp_skb(dev, prt, clone);
> +	/* PDelay_Req messages require deferred transmit as the time
> +	 * stamp unit provides no sequenceId or similar. So we must
> +	 * wait for the time stamp interrupt.
> +	 */
> +	ptp_shared->xmit_work_fn = ksz9477_port_deferred_xmit;
> +	ptp_shared->xmit_worker = kthread_create_worker(0, "ksz_xmit");
> +	if (IS_ERR(ptp_shared->xmit_worker)) {
> +		ret = PTR_ERR(ptp_shared->xmit_worker);
> +		dev_err(dev->dev,
> +			"failed to create deferred xmit thread: %d\n", ret);
> +		return ret;
>  	}
> +
> +	return 0;
> +}
> +
> +static void ksz9477_ptp_shared_deinit(struct ksz_device *dev)
> +{
> +	struct ksz_device_ptp_shared *ptp_shared = &dev->ptp_shared;
> +
> +	kthread_destroy_worker(ptp_shared->xmit_worker);
>  }
>
>  static int ksz9477_ptp_port_init(struct ksz_device *dev, int port)
>  {
> -	struct ksz_port *prt = &dev->ports[port];
> -	struct ksz_port_ptp_shared *ptp_shared = &prt->ptp_shared;
>  	struct dsa_port *dp = dsa_to_port(dev->ds, port);
> +	struct ksz_port *prt = &dev->ports[port];
>  	int ret;
>
>  	if (port == dev->cpu_port)
> @@ -809,31 +829,15 @@ static int ksz9477_ptp_port_init(struct ksz_device *dev, int port)
>  	if (ret)
>  		goto error_disable_port_ptp_interrupts;
>
> -	/* ksz_port::ptp_shared is used in tagging driver */
> -	ptp_shared->dev = &dev->ptp_shared;
> -	dp->priv = ptp_shared;
> -
>  	/* PDelay_Req messages require deferred transmit as the time
>  	 * stamp unit provides no sequenceId or similar. So we must
>  	 * wait for the time stamp interrupt.
>  	 */
> +	dp->priv = &dev->ptp_shared;
>  	init_completion(&prt->tstamp_completion);
> -	kthread_init_work(&ptp_shared->xmit_work,
> -			  ksz9477_port_deferred_xmit);
> -	ptp_shared->xmit_worker = kthread_create_worker(0, "%s_xmit",
> -							dp->slave->name);
> -	if (IS_ERR(ptp_shared->xmit_worker)) {
> -		ret = PTR_ERR(ptp_shared->xmit_worker);
> -		dev_err(dev->dev,
> -			"failed to create deferred xmit thread: %d\n", ret);
> -		goto error_disable_port_egress_interrupts;
> -	}
> -	skb_queue_head_init(&ptp_shared->xmit_queue);
>
>  	return 0;
>
> -error_disable_port_egress_interrupts:
> -	ksz9477_ptp_enable_port_egress_interrupts(dev, port, false);
>  error_disable_port_ptp_interrupts:
>  	ksz9477_ptp_enable_port_ptp_interrupts(dev, port, false);
>  	return ret;
> @@ -841,12 +845,12 @@ static int ksz9477_ptp_port_init(struct ksz_device *dev, int port)
>
>  static void ksz9477_ptp_port_deinit(struct ksz_device *dev, int port)
>  {
> -	struct ksz_port_ptp_shared *ptp_shared = &dev->ports[port].ptp_shared;
> +	struct dsa_port *dp = dsa_to_port(dev->ds, port);
>
>  	if (port == dev->cpu_port)
>  		return;
>
> -	kthread_destroy_worker(ptp_shared->xmit_worker);
> +	dp->priv = NULL;
>  	ksz9477_ptp_enable_port_egress_interrupts(dev, port, false);
>  	ksz9477_ptp_enable_port_ptp_interrupts(dev, port, false);
>  }
> @@ -856,6 +860,10 @@ static int ksz9477_ptp_ports_init(struct ksz_device *dev)
>  	int port;
>  	int ret;
>
> +	ret = ksz9477_ptp_shared_init(dev);
> +	if (ret)
> +		return ret;
> +
>  	for (port = 0; port < dev->port_cnt; port++) {
>  		ret = ksz9477_ptp_port_init(dev, port);
>  		if (ret)
> @@ -867,6 +875,7 @@ static int ksz9477_ptp_ports_init(struct ksz_device *dev)
>  error_deinit:
>  	while (port-- > 0)
>  		ksz9477_ptp_port_deinit(dev, port);
> +	ksz9477_ptp_shared_deinit(dev);
>  	return ret;
>  }
>
> @@ -876,6 +885,7 @@ static void ksz9477_ptp_ports_deinit(struct ksz_device *dev)
>
>  	for (port = 0; port < dev->port_cnt; port++)
>  		ksz9477_ptp_port_deinit(dev, port);
> +	ksz9477_ptp_shared_deinit(dev);
>  }
>
>  /* device attributes */
> diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
> index c9495c92a32d..abcbcbb3fcef 100644
> --- a/drivers/net/dsa/microchip/ksz_common.h
> +++ b/drivers/net/dsa/microchip/ksz_common.h
> @@ -45,7 +45,6 @@ struct ksz_port {
>  	struct ksz_port_mib mib;
>  	phy_interface_t interface;
>  #if IS_ENABLED(CONFIG_NET_DSA_MICROCHIP_KSZ9477_PTP)
> -	struct ksz_port_ptp_shared ptp_shared;
>  	ktime_t tstamp_xdelay;
>  	struct completion tstamp_completion;
>  	bool hwts_tx_en;
> diff --git a/include/linux/dsa/ksz_common.h b/include/linux/dsa/ksz_common.h
> index a9b4720cc842..c75bc27e3e7a 100644
> --- a/include/linux/dsa/ksz_common.h
> +++ b/include/linux/dsa/ksz_common.h
> @@ -35,13 +35,14 @@ struct ksz_device_ptp_shared {
>  	/* approximated current time, read once per second from hardware */
>  	struct timespec64 ptp_clock_time;
>  	unsigned long state;
> +	void (*xmit_work_fn)(struct kthread_work *work);
> +	struct kthread_worker *xmit_worker;
>  };
>
> -struct ksz_port_ptp_shared {
> -	struct ksz_device_ptp_shared *dev;
> -	struct kthread_worker *xmit_worker;
> -	struct kthread_work xmit_work;
> -	struct sk_buff_head xmit_queue;
> +struct ksz_deferred_xmit_work {
> +	struct dsa_port *dp;
> +	struct sk_buff *skb;
> +	struct kthread_work work;
>  };
>
>  /* net/dsa/tag_ksz.c */
> diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c
> index 415a26044565..548f66888b0a 100644
> --- a/net/dsa/tag_ksz.c
> +++ b/net/dsa/tag_ksz.c
> @@ -175,11 +175,12 @@ static void ksz9477_xmit_timestamp(struct sk_buff *skb)
>  static struct sk_buff *ksz9477_defer_xmit(struct dsa_port *dp,
>  					  struct sk_buff *skb)
>  {
> -	struct ksz_port_ptp_shared *ptp_shared = dp->priv;
> +	struct ksz_device_ptp_shared *ptp_shared = dp->priv;
>  	struct sk_buff *clone = DSA_SKB_CB(skb)->clone;
> +	struct ksz_deferred_xmit_work *xmit_work;
>  	u8 ptp_msg_type;
>
> -	if (!clone)
> +	if (!clone || !ptp_shared)
>  		return skb;  /* no deferred xmit for this packet */
>
>  	/* Use cached PTP msg type from ksz9477_ptp_port_txtstamp().  */
> @@ -188,11 +189,18 @@ static struct sk_buff *ksz9477_defer_xmit(struct dsa_port *dp,
>  	    ptp_msg_type != PTP_MSGTYPE_PDELAY_REQ)
>  		goto out_free_clone;  /* only PDelay_Req is deferred */
>
> +	xmit_work = kzalloc(sizeof(*xmit_work), GFP_ATOMIC);
> +	if (!xmit_work)
> +		return NULL;
> +
> +	kthread_init_work(&xmit_work->work, ptp_shared->xmit_work_fn);
>  	/* Increase refcount so the kfree_skb in dsa_slave_xmit
>  	 * won't really free the packet.
>  	 */
> -	skb_queue_tail(&ptp_shared->xmit_queue, skb_get(skb));
> -	kthread_queue_work(ptp_shared->xmit_worker, &ptp_shared->xmit_work);
> +	xmit_work->dp = dp;
> +	xmit_work->skb = skb_get(skb);
> +
> +	kthread_queue_work(ptp_shared->xmit_worker, &xmit_work->work);
>
>  	return NULL;
>
> @@ -232,7 +240,7 @@ static void ksz9477_rcv_timestamp(struct sk_buff *skb, u8 *tag,
>  {
>  	struct skb_shared_hwtstamps *hwtstamps = skb_hwtstamps(skb);
>  	struct dsa_switch *ds = dev->dsa_ptr->ds;
> -	struct ksz_port_ptp_shared *port_ptp_shared;
> +	struct ksz_device_ptp_shared *ptp_shared;
>  	u8 *tstamp_raw = tag - KSZ9477_PTP_TAG_LEN;
>  	struct ptp_header *ptp_hdr;
>  	unsigned int ptp_type;
> @@ -240,15 +248,14 @@ static void ksz9477_rcv_timestamp(struct sk_buff *skb, u8 *tag,
>  	ktime_t tstamp;
>  	s64 correction;
>
> -	port_ptp_shared = dsa_to_port(ds, port)->priv;
> -	if (!port_ptp_shared)
> +	ptp_shared = dsa_to_port(ds, port)->priv;
> +	if (!ptp_shared)
>  		return;
>
>  	/* convert time stamp and write to skb */
>  	tstamp = ksz9477_decode_tstamp(get_unaligned_be32(tstamp_raw));
>  	memset(hwtstamps, 0, sizeof(*hwtstamps));
> -	hwtstamps->hwtstamp = ksz9477_tstamp_reconstruct(port_ptp_shared->dev,
> -							 tstamp);
> +	hwtstamps->hwtstamp = ksz9477_tstamp_reconstruct(ptp_shared, tstamp);
>
>  	/* For PDelay_Req messages, user space (ptp4l) expects that the hardware
>  	 * subtracts the ingress time stamp from the correction field.  The
> @@ -289,8 +296,7 @@ static struct sk_buff *ksz9477_xmit(struct sk_buff *skb,
>  				    struct net_device *dev)
>  {
>  	struct dsa_port *dp = dsa_slave_to_port(dev);
> -	struct ksz_port_ptp_shared *port_ptp_shared = dp->priv;
> -	struct ksz_device_ptp_shared *ptp_shared = port_ptp_shared->dev;
> +	struct ksz_device_ptp_shared *ptp_shared = dp->priv;
>  	__be16 *tag;
>  	u8 *addr;
>  	u16 val;
> @@ -347,8 +353,7 @@ static struct sk_buff *ksz9893_xmit(struct sk_buff *skb,
>  				    struct net_device *dev)
>  {
>  	struct dsa_port *dp = dsa_slave_to_port(dev);
> -	struct ksz_port_ptp_shared *port_ptp_shared = dp->priv;
> -	struct ksz_device_ptp_shared *ptp_shared = port_ptp_shared->dev;
> +	struct ksz_device_ptp_shared *ptp_shared = dp->priv;
>  	u8 *addr;
>  	u8 *tag;
>
> -----------------------------[ cut here ]-----------------------------
|
From: Vladimir O. <ol...@gm...> - 2021-10-18 13:48:13
|
Hi Christian,

On Mon, Oct 18, 2021 at 02:14:12PM +0200, Christian Eggers wrote:
> Hi Vladimir,
>
> On Monday, 18 October 2021, 13:39:42 CEST, Vladimir Oltean wrote:
> > There's something that just doesn't compute for me.
> > In those patches, Christian wrote:
> >
> > 	/* Currently, only P2P delay measurement is supported. Setting ocmode
> > 	 * to slave will work independently of actually being master or slave.
> > 	 * For E2E delay measurement, switching between master and slave would
> > 	 * be required, as the KSZ devices filters out PTP messages depending on
> > 	 * the ocmode setting:
> > 	 * - in slave mode, DelayReq messages are filtered out
> > 	 * - in master mode, Sync messages are filtered out
> > 	 * Currently (and probably also in future) there is no interface in the
> > 	 * kernel which allows switching between master and slave mode. For
> > 	 * this reason, E2E cannot be supported. See patchwork for full
> > 	 * discussion:
> > 	 * https://patchwork.ozlabs.org/project/netdev/patch/202...@ar.../
> > 	 */
> > 	ksz9477_ptp_tcmode_set(dev, KSZ9477_PTP_TCMODE_P2P);
> > 	ksz9477_ptp_ocmode_set(dev, KSZ9477_PTP_OCMODE_SLAVE);
> >
> > Did you modify the driver's OCMODE? I am super confused as to which
> > packets ptp4l is actually waiting for a TX timestamp for. Because if
> > you're using E2E and not P2P, then the entire ksz9477_port_deferred_xmit()
> > is just dead code, is it not?
>
> I attached the patch series which I originally provided to Brian. This series
> is for linux-5.10.x. The backports folder contains patches which are already
> present in 5.11 and later kernels (some of them are even in the latest
> 5.10-stable). For recent kernels, the ksz9563_ptp folder is sufficient.
>
> Compared with the latest series I sent to netdev, I added
> 0010-net-dsa-microchip-ksz9477-add-E2E-support.patch for E2E support. This was
> rejected by the ptp4l maintainer as it would require ptp4l to dynamically
> switch the KSZ hardware between master and slave mode (there are packet filters
> in hardware which cannot entirely be disabled).
>
> Currently there is not much demand for PTP in our current product development.
> But if the KSZ work can be finished / mainlined, I am highly interested. My
> latest status was (IIRC):
>
> 1. There is something wrong with the time stamping offsets. As a result, 1PPS
> works nearly perfectly with two KSZ devices, but shows a constant offset when
> using a Meinberg clock as master.
>
> 2. Currently the user is responsible for providing a start time (for PPS) that
> is in the future. In case the time point has already elapsed, the driver will
> report an error.
>
> 3. Occasional timeouts when waiting for TX timestamps. I think that I already
> implemented the driver changes you requested, but probably the problem still
> persists. It is even possible that the (Brian's) KSZ9567 suffers from hardware
> bugs here which are not present on (my) KSZ9563 (I think the guy from Microchip
> mentioned this).
>
> 4. Sometimes the 1PPS becomes completely out of sync (but then recovers later).
> This is surprising for me, as ptp4l uses filters/regulators and should not be
> affected by single packet / timestamp failures.
>
> Is there anything I can help with?

Thanks for the clarification, it makes more sense now since the picture
is more complete.

I've no idea about the hardware specifics since I don't own the hardware.
But I noticed a logical issue which might be relevant to the TX timestamp
timeout events that Brian is seeing.

Brian, can you try out the patch below? Compile-tested only, as
mentioned I can't really do much more.

-----------------------------[ cut here ]-----------------------------
From 7f4771bc19db9b48577f3f45ba907e5c13aea808 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vla...@nx...>
Date: Mon, 18 Oct 2021 16:35:18 +0300
Subject: [PATCH] net: dsa: ksz9477: use a kthread work item per deferred skb

There might be a race in tag_ksz.c between these two lines:

	skb_queue_tail(&ptp_shared->xmit_queue, skb_get(skb));
	kthread_queue_work(ptp_shared->xmit_worker, &ptp_shared->xmit_work);

and the skb dequeue logic in ksz9477_port_deferred_xmit(). For example,
the xmit_work might be already queued, however the work item has just
finished walking through the skb queue. Because we don't check the
return code from kthread_queue_work, we don't do anything if the work
item is already queued.

However, nobody will take that skb and send it, at least until the next
timestampable skb is sent.

With the ksz9477 driver, two-step TX timestamping is a rare process, and
in certain configs it may happen even as rarely as once per second.

So if the race condition described above happens, we might experience
huge delays.

To close that race, let's not keep a single work item per port, and a
skb timestamping queue, but rather dynamically allocate a work item per
packet.

It is also unnecessary to have more than one kthread that does the work.
So delete the per-port kthread allocation and replace them with a single
kthread which is global to the switch.

Signed-off-by: Vladimir Oltean <vla...@nx...>
---
 drivers/net/dsa/microchip/ksz9477_ptp.c | 88 ++++++++++++++-----------
 drivers/net/dsa/microchip/ksz_common.h  |  1 -
 include/linux/dsa/ksz_common.h          | 11 ++--
 net/dsa/tag_ksz.c                       | 31 +++++----
 4 files changed, 73 insertions(+), 58 deletions(-)

diff --git a/drivers/net/dsa/microchip/ksz9477_ptp.c b/drivers/net/dsa/microchip/ksz9477_ptp.c
index c646689cb71e..0f05aafbdd3d 100644
--- a/drivers/net/dsa/microchip/ksz9477_ptp.c
+++ b/drivers/net/dsa/microchip/ksz9477_ptp.c
@@ -749,42 +749,62 @@ static void ksz9477_ptp_txtstamp_skb(struct ksz_device *dev,
 	skb_complete_tx_timestamp(skb, &hwtstamps);
 }

-#define work_to_port(work) \
-		container_of((work), struct ksz_port_ptp_shared, xmit_work)
-#define ptp_shared_to_ksz_port(t) \
-		container_of((t), struct ksz_port, ptp_shared)
-#define ptp_shared_to_ksz_device(t) \
-		container_of((t), struct ksz_device, ptp_shared)
+#define work_to_xmit_work(w) \
+		container_of((w), struct ksz_deferred_xmit_work, work)

 /* Deferred work is necessary for time stamped PDelay_Req messages. This cannot
  * be done from atomic context as we have to wait for the hardware interrupt.
  */
 static void ksz9477_port_deferred_xmit(struct kthread_work *work)
 {
-	struct ksz_port_ptp_shared *prt_ptp_shared = work_to_port(work);
-	struct ksz_port *prt = ptp_shared_to_ksz_port(prt_ptp_shared);
-	struct ksz_device_ptp_shared *ptp_shared = prt_ptp_shared->dev;
-	struct ksz_device *dev = ptp_shared_to_ksz_device(ptp_shared);
-	int port = prt - dev->ports;
-	struct sk_buff *skb;
+	struct ksz_deferred_xmit_work *xmit_work = work_to_xmit_work(work);
+	struct dsa_switch *ds = xmit_work->dp->ds;
+	struct sk_buff *skb = xmit_work->skb;
+	struct dsa_port *dp = xmit_work->dp;
+	struct ksz_device *dev = ds->priv;
+	struct ksz_port *prt = dp->priv;
+
+	reinit_completion(&prt->tstamp_completion);

-	while ((skb = skb_dequeue(&prt_ptp_shared->xmit_queue)) != NULL) {
-		struct sk_buff *clone = DSA_SKB_CB(skb)->clone;
+	/* Transfer skb to the host port. */
+	dsa_enqueue_skb(skb, dp->slave);

-		reinit_completion(&prt->tstamp_completion);
+	ksz9477_ptp_txtstamp_skb(dev, prt, DSA_SKB_CB(skb)->clone);
+	kfree(xmit_work);
+}

-		/* Transfer skb to the host port. */
-		dsa_enqueue_skb(skb, dsa_to_port(dev->ds, port)->slave);
+static int ksz9477_ptp_shared_init(struct ksz_device *dev)
+{
+	struct ksz_device_ptp_shared *ptp_shared = &dev->ptp_shared;
+	int ret;

-		ksz9477_ptp_txtstamp_skb(dev, prt, clone);
+	/* PDelay_Req messages require deferred transmit as the time
+	 * stamp unit provides no sequenceId or similar. So we must
+	 * wait for the time stamp interrupt.
+	 */
+	ptp_shared->xmit_work_fn = ksz9477_port_deferred_xmit;
+	ptp_shared->xmit_worker = kthread_create_worker(0, "ksz_xmit");
+	if (IS_ERR(ptp_shared->xmit_worker)) {
+		ret = PTR_ERR(ptp_shared->xmit_worker);
+		dev_err(dev->dev,
+			"failed to create deferred xmit thread: %d\n", ret);
+		return ret;
 	}
+
+	return 0;
+}
+
+static void ksz9477_ptp_shared_deinit(struct ksz_device *dev)
+{
+	struct ksz_device_ptp_shared *ptp_shared = &dev->ptp_shared;
+
+	kthread_destroy_worker(ptp_shared->xmit_worker);
 }

 static int ksz9477_ptp_port_init(struct ksz_device *dev, int port)
 {
-	struct ksz_port *prt = &dev->ports[port];
-	struct ksz_port_ptp_shared *ptp_shared = &prt->ptp_shared;
 	struct dsa_port *dp = dsa_to_port(dev->ds, port);
+	struct ksz_port *prt = &dev->ports[port];
 	int ret;

 	if (port == dev->cpu_port)
@@ -809,31 +829,15 @@ static int ksz9477_ptp_port_init(struct ksz_device *dev, int port)
 	if (ret)
 		goto error_disable_port_ptp_interrupts;

-	/* ksz_port::ptp_shared is used in tagging driver */
-	ptp_shared->dev = &dev->ptp_shared;
-	dp->priv = ptp_shared;
-
 	/* PDelay_Req messages require deferred transmit as the time
 	 * stamp unit provides no sequenceId or similar. So we must
 	 * wait for the time stamp interrupt.
 	 */
+	dp->priv = &dev->ptp_shared;
 	init_completion(&prt->tstamp_completion);
-	kthread_init_work(&ptp_shared->xmit_work,
-			  ksz9477_port_deferred_xmit);
-	ptp_shared->xmit_worker = kthread_create_worker(0, "%s_xmit",
-							dp->slave->name);
-	if (IS_ERR(ptp_shared->xmit_worker)) {
-		ret = PTR_ERR(ptp_shared->xmit_worker);
-		dev_err(dev->dev,
-			"failed to create deferred xmit thread: %d\n", ret);
-		goto error_disable_port_egress_interrupts;
-	}
-	skb_queue_head_init(&ptp_shared->xmit_queue);

 	return 0;

-error_disable_port_egress_interrupts:
-	ksz9477_ptp_enable_port_egress_interrupts(dev, port, false);
 error_disable_port_ptp_interrupts:
 	ksz9477_ptp_enable_port_ptp_interrupts(dev, port, false);
 	return ret;
@@ -841,12 +845,12 @@ static int ksz9477_ptp_port_init(struct ksz_device *dev, int port)

 static void ksz9477_ptp_port_deinit(struct ksz_device *dev, int port)
 {
-	struct ksz_port_ptp_shared *ptp_shared = &dev->ports[port].ptp_shared;
+	struct dsa_port *dp = dsa_to_port(dev->ds, port);

 	if (port == dev->cpu_port)
 		return;

-	kthread_destroy_worker(ptp_shared->xmit_worker);
+	dp->priv = NULL;
 	ksz9477_ptp_enable_port_egress_interrupts(dev, port, false);
 	ksz9477_ptp_enable_port_ptp_interrupts(dev, port, false);
 }
@@ -856,6 +860,10 @@ static int ksz9477_ptp_ports_init(struct ksz_device *dev)
 	int port;
 	int ret;

+	ret = ksz9477_ptp_shared_init(dev);
+	if (ret)
+		return ret;
+
 	for (port = 0; port < dev->port_cnt; port++) {
 		ret = ksz9477_ptp_port_init(dev, port);
 		if (ret)
@@ -867,6 +875,7 @@ static int ksz9477_ptp_ports_init(struct ksz_device *dev)
 error_deinit:
 	while (port-- > 0)
 		ksz9477_ptp_port_deinit(dev, port);
+	ksz9477_ptp_shared_deinit(dev);
 	return ret;
 }

@@ -876,6 +885,7 @@ static void ksz9477_ptp_ports_deinit(struct ksz_device *dev)

 	for (port = 0; port < dev->port_cnt; port++)
 		ksz9477_ptp_port_deinit(dev, port);
+	ksz9477_ptp_shared_deinit(dev);
 }

 /* device attributes */
diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
index c9495c92a32d..abcbcbb3fcef 100644
--- a/drivers/net/dsa/microchip/ksz_common.h
+++ b/drivers/net/dsa/microchip/ksz_common.h
@@ -45,7 +45,6 @@ struct ksz_port {
 	struct ksz_port_mib mib;
 	phy_interface_t interface;
 #if IS_ENABLED(CONFIG_NET_DSA_MICROCHIP_KSZ9477_PTP)
-	struct ksz_port_ptp_shared ptp_shared;
 	ktime_t tstamp_xdelay;
 	struct completion tstamp_completion;
 	bool hwts_tx_en;
diff --git a/include/linux/dsa/ksz_common.h b/include/linux/dsa/ksz_common.h
index a9b4720cc842..c75bc27e3e7a 100644
--- a/include/linux/dsa/ksz_common.h
+++ b/include/linux/dsa/ksz_common.h
@@ -35,13 +35,14 @@ struct ksz_device_ptp_shared {
 	/* approximated current time, read once per second from hardware */
 	struct timespec64 ptp_clock_time;
 	unsigned long state;
+	void (*xmit_work_fn)(struct kthread_work *work);
+	struct kthread_worker *xmit_worker;
 };

-struct ksz_port_ptp_shared {
-	struct ksz_device_ptp_shared *dev;
-	struct kthread_worker *xmit_worker;
-	struct kthread_work xmit_work;
-	struct sk_buff_head xmit_queue;
+struct ksz_deferred_xmit_work {
+	struct dsa_port *dp;
+	struct sk_buff *skb;
+	struct kthread_work work;
 };

 /* net/dsa/tag_ksz.c */
diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c
index 415a26044565..548f66888b0a 100644
--- a/net/dsa/tag_ksz.c
+++ b/net/dsa/tag_ksz.c
@@ -175,11 +175,12 @@ static void ksz9477_xmit_timestamp(struct sk_buff *skb)
 static struct sk_buff *ksz9477_defer_xmit(struct dsa_port *dp,
 					  struct sk_buff *skb)
 {
-	struct ksz_port_ptp_shared *ptp_shared = dp->priv;
+	struct ksz_device_ptp_shared *ptp_shared = dp->priv;
 	struct sk_buff *clone = DSA_SKB_CB(skb)->clone;
+	struct ksz_deferred_xmit_work *xmit_work;
 	u8 ptp_msg_type;

-	if (!clone)
+	if (!clone || !ptp_shared)
 		return skb;  /* no deferred xmit for this packet */

 	/* Use cached PTP msg type from ksz9477_ptp_port_txtstamp().  */
@@ -188,11 +189,18 @@ static struct sk_buff *ksz9477_defer_xmit(struct dsa_port *dp,
 	    ptp_msg_type != PTP_MSGTYPE_PDELAY_REQ)
 		goto out_free_clone;  /* only PDelay_Req is deferred */

+	xmit_work = kzalloc(sizeof(*xmit_work), GFP_ATOMIC);
+	if (!xmit_work)
+		return NULL;
+
+	kthread_init_work(&xmit_work->work, ptp_shared->xmit_work_fn);
 	/* Increase refcount so the kfree_skb in dsa_slave_xmit
 	 * won't really free the packet.
 	 */
-	skb_queue_tail(&ptp_shared->xmit_queue, skb_get(skb));
-	kthread_queue_work(ptp_shared->xmit_worker, &ptp_shared->xmit_work);
+	xmit_work->dp = dp;
+	xmit_work->skb = skb_get(skb);
+
+	kthread_queue_work(ptp_shared->xmit_worker, &xmit_work->work);

 	return NULL;

@@ -232,7 +240,7 @@ static void ksz9477_rcv_timestamp(struct sk_buff *skb, u8 *tag,
 {
 	struct skb_shared_hwtstamps *hwtstamps = skb_hwtstamps(skb);
 	struct dsa_switch *ds = dev->dsa_ptr->ds;
-	struct ksz_port_ptp_shared *port_ptp_shared;
+	struct ksz_device_ptp_shared *ptp_shared;
 	u8 *tstamp_raw = tag - KSZ9477_PTP_TAG_LEN;
 	struct ptp_header *ptp_hdr;
 	unsigned int ptp_type;
@@ -240,15 +248,14 @@ static void ksz9477_rcv_timestamp(struct sk_buff *skb, u8 *tag,
 	ktime_t tstamp;
 	s64 correction;

-	port_ptp_shared = dsa_to_port(ds, port)->priv;
-	if (!port_ptp_shared)
+	ptp_shared = dsa_to_port(ds, port)->priv;
+	if (!ptp_shared)
 		return;

 	/* convert time stamp and write to skb */
 	tstamp = ksz9477_decode_tstamp(get_unaligned_be32(tstamp_raw));
 	memset(hwtstamps, 0, sizeof(*hwtstamps));
-	hwtstamps->hwtstamp = ksz9477_tstamp_reconstruct(port_ptp_shared->dev,
-							 tstamp);
+	hwtstamps->hwtstamp = ksz9477_tstamp_reconstruct(ptp_shared, tstamp);

 	/* For PDelay_Req messages, user space (ptp4l) expects that the hardware
 	 * subtracts the ingress time stamp from the correction field.  The
@@ -289,8 +296,7 @@ static struct sk_buff *ksz9477_xmit(struct sk_buff *skb,
 				    struct net_device *dev)
 {
 	struct dsa_port *dp = dsa_slave_to_port(dev);
-	struct ksz_port_ptp_shared *port_ptp_shared = dp->priv;
-	struct ksz_device_ptp_shared *ptp_shared = port_ptp_shared->dev;
+	struct ksz_device_ptp_shared *ptp_shared = dp->priv;
 	__be16 *tag;
 	u8 *addr;
 	u16 val;
@@ -347,8 +353,7 @@ static struct sk_buff *ksz9893_xmit(struct sk_buff *skb,
 				    struct net_device *dev)
 {
 	struct dsa_port *dp = dsa_slave_to_port(dev);
-	struct ksz_port_ptp_shared *port_ptp_shared = dp->priv;
-	struct ksz_device_ptp_shared *ptp_shared = port_ptp_shared->dev;
+	struct ksz_device_ptp_shared *ptp_shared = dp->priv;
 	u8 *addr;
 	u8 *tag;

-----------------------------[ cut here ]-----------------------------
|
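For readers following the patch above, the "one work item per deferred packet" idea can be sketched in plain userspace C. This is only a model under stated assumptions: `struct work`, `struct worker`, `queue_work()`, `run_worker()`, `defer_packet()` and `struct sent_log` are hypothetical stand-ins invented for illustration, not the kernel's kthread_worker API, and an integer `packet_id` stands in for the skb. It shows the ownership pattern the patch relies on: each queued item carries its own packet, the handler recovers the item with `container_of()`, and the handler frees it when done.

```c
#include <stddef.h>
#include <stdlib.h>

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define MAX_SENT 8

/* Hypothetical stand-ins for a work item and a single worker's FIFO. */
struct work {
	void (*fn)(struct work *w);
	struct work *next;
};

struct worker {
	struct work *head;
	struct work *tail;
};

/* Records which packets the handler "transmitted", in order. */
struct sent_log {
	int ids[MAX_SENT];
	int count;
};

/* One heap-allocated item per deferred packet (packet_id models the skb). */
struct deferred_xmit {
	int packet_id;
	struct sent_log *log;
	struct work work;
};

/* Append a work item to the worker's FIFO. */
static void queue_work(struct worker *wk, struct work *w)
{
	w->next = NULL;
	if (wk->tail)
		wk->tail->next = w;
	else
		wk->head = w;
	wk->tail = w;
}

/* Handler: find the per-packet item, "send" its packet, free the item. */
static void deferred_xmit_fn(struct work *w)
{
	struct deferred_xmit *x = container_of(w, struct deferred_xmit, work);

	if (x->log->count < MAX_SENT)
		x->log->ids[x->log->count++] = x->packet_id;
	free(x);
}

/* Allocate and queue a dedicated work item for one packet. */
static int defer_packet(struct worker *wk, struct sent_log *log, int packet_id)
{
	struct deferred_xmit *x = calloc(1, sizeof(*x));

	if (!x)
		return -1;
	x->packet_id = packet_id;
	x->log = log;
	x->work.fn = deferred_xmit_fn;
	queue_work(wk, &x->work);
	return 0;
}

/* Drain the FIFO; each item is unlinked before its handler frees it. */
static void run_worker(struct worker *wk)
{
	while (wk->head) {
		struct work *w = wk->head;

		wk->head = w->next;
		if (!wk->head)
			wk->tail = NULL;
		w->fn(w);
	}
}
```

Because every packet has its own queued item in this model, there is no shared "already queued" state that a second packet could collide with; the cost is one small allocation per deferred packet.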
From: Vladimir O. <ol...@gm...> - 2021-10-18 11:39:50
|
On Fri, Oct 15, 2021 at 12:01:24AM +0000, Bri...@L3... wrote:
> > > > If this is a "stack" issue, what can I do to reduce the "message rate"
> > > > or "grant duration" if these are related to whatever a "stack" issue
> > > > is?
> > >
> > > I'd be willing to put my money on a driver bug. But for that you'd
> > > need to confirm that the issue reproduces with the default.cfg and not
> > > just with the G.8275.2 profile. Don't try to run before you can walk.
>
> So I ran tests using a plain 1588 profile and E2E and yes the problem still happens. Here is that config:

There's something that just doesn't compute for me.
In those patches, Christian wrote:

	/* Currently, only P2P delay measurement is supported. Setting ocmode
	 * to slave will work independently of actually being master or slave.
	 * For E2E delay measurement, switching between master and slave would
	 * be required, as the KSZ devices filters out PTP messages depending on
	 * the ocmode setting:
	 * - in slave mode, DelayReq messages are filtered out
	 * - in master mode, Sync messages are filtered out
	 * Currently (and probably also in future) there is no interface in the
	 * kernel which allows switching between master and slave mode. For
	 * this reason, E2E cannot be supported. See patchwork for full
	 * discussion:
	 * https://patchwork.ozlabs.org/project/netdev/patch/202...@ar.../
	 */
	ksz9477_ptp_tcmode_set(dev, KSZ9477_PTP_TCMODE_P2P);
	ksz9477_ptp_ocmode_set(dev, KSZ9477_PTP_OCMODE_SLAVE);

Did you modify the driver's OCMODE? I am super confused as to which
packets ptp4l is actually waiting for a TX timestamp for. Because if
you're using E2E and not P2P, then the entire ksz9477_port_deferred_xmit()
is just dead code, is it not?

> [global]
> #
> # Default Data Set

(summary of your changes)
twoStepFlag: 1 to 0
slaveOnly: 0 to 1
clockClass: 248 to 6
fault_reset_interval: 4 to -128
tx_timestamp_timeout: 10 to 1000
unicast_listen: 0 to 1
unicast_req_duration: 3600 to 300
summary_interval: 0 to 4
time_stamping: hardware to p2p1step
tsproc_mode: filter to raw_weight

Can you just print the packet in ptp4l? You're using the default.cfg
settings otherwise, so the UDPv4 network_transport, so:

static int udp_send(struct transport *t, struct fdarray *fda,
		    enum transport_event event, int peer, void *buf, int len,
		    struct address *addr, struct hw_timestamp *hwts)
...
	cnt = sendto(fd, buf, len, 0, &addr->sa, sizeof(addr->sin));
	if (cnt < 1) {
		pr_err("sendto failed: %m");
		return -errno;
	}
	/*
	 * Get the time stamp right away.
	 */
	return event == TRANS_EVENT ? sk_receive(fd, junk, len, NULL, hwts, MSG_ERRQUEUE) : cnt;
	                              ^
	                              you can print the buf here if sk_receive returns negative

The only place I find where this makes sense to be called from is:

port_delay_request:
	if (port_prepare_and_send(p, msg, TRANS_EVENT)) {

But that further suggests that you've modified the driver, because:

/* Defer transmit if waiting for egress time stamp is required. */
static struct sk_buff *ksz9477_defer_xmit(struct dsa_port *dp,
					  struct sk_buff *skb)
{
	/* Use cached PTP msg type from ksz9477_ptp_port_txtstamp(). */
	ptp_msg_type = KSZ9477_SKB_CB(clone)->ptp_msg_type;
	if (ptp_msg_type != PTP_MSGTYPE_PDELAY_REQ)
		goto out_free_clone; /* only PDelay_Req is deferred */

So could you share the exact list of changes you've made to the patches
from the form that they were posted in?

> And I did find a bug in the DSA driver but it didn't appear to change anything.
>
> In ksz9477_ptp_txtstamp_skb function the "ret" that is being assigned
> by "wait_for_completion_timeout" returning is declared as an "int"
> instead of an "unsigned long" so I fixed that.

Doesn't really make a difference on a 64-bit machine. Nonetheless, is
that the sticking point? Do you see this error message in dmesg when
user space loses the TX timestamp?

	dev_err(dev->dev, "timeout waiting for time stamp\n");

> ... still looking for other stuff but again, I'm probably not
> experienced enough (yet) with DSA and LinuxPTP to do much good.
|
From: <Bri...@L3...> - 2021-10-15 00:01:34
|
Hi again, Latest update below ... > -----Original Message----- > From: Hutchinson, Brian (US) - PSPC > Sent: Wednesday, October 13, 2021 10:32 AM > To: Vladimir Oltean <ol...@gm...> > Cc: lin...@li...; Christian Eggers <ce...@ar...> > Subject: RE: [EXTERNAL] Re: [Linuxptp-users] Using G.8275.2 profile and > getting tx timestamp timeout, but changing logSyncInterval etc. changes how > often this happens > > Hi Vladimir, > > > > -----Original Message----- > > From: Vladimir Oltean <ol...@gm...> > > Sent: Tuesday, October 12, 2021 7:11 PM > > To: Hutchinson, Brian (US) - PSPC <Bri...@L3...> > > Cc: lin...@li...; Christian Eggers > > <ce...@ar...> > > Subject: [EXTERNAL] Re: [Linuxptp-users] Using G.8275.2 profile and > > getting tx timestamp timeout, but changing logSyncInterval etc. > > changes how often this happens > > > > On Fri, Oct 08, 2021 at 03:22:10PM +0000, Brian.Hutchinson--- via > > Linuxptp- users wrote: > > > Hi, > > > > > > I'm using Christian's DSA patches > > > https://lkml.org/lkml/2020/10/19/633) on a NXP iMX8MM with a > > > Microchip > > > ksz9567 with ptp4l.conf setup for E2E G.8275.2 profile. I'm running > > > a 1G RGMII interface and my GM and unit under test is connected via > > > a 1G Netgear dumb switch. > > > > > > Using 5.10.32 kernel with CONFIG_HZ_1000 and nohz=off on cmdline. > > > > > > I've been getting the "timed out while polling for tx timestamp" > > > error which causes linuxptp to restart. When linuxptp restarts my > > > 1PPS (generated from Microchip switch) walks all over the place on > > > my O Scope until linuxptp gets a good sync again and pulls 1PPS back > > > into sync with the GM sync out reference I'm also watching on the scope. > > > > > > Of course increasing tx_timestamp_timeout doesn't appear to help in > > > this case. I've tried values all the way up to 8000. > > > > > > But I can significantly reduce the frequency of the problem if I > > > make changes to some ptp4l.conf settings. 
> > > > > > With ptp4l.conf settings: > > > > > > logAnnounceInterval 1 > > > logSyncInterval 0 > > > logMinDelayReqInterval 0 > > > logMinPdelayReqInterval 0 > > > announceReceiptTimeout 2 > > > > > > I'll see the tx timestamp timeout probably 15 or so times running a > > > test overnight. > > > > > > If I set : > > > > > > logAnnounceInterval 1 > > > logSyncInterval 2 > > > logMinDelayReqInterval 2 > > > logMinPdelayReqInterval 2 > > > announceReceiptTimeout 2 > > > > > > ... then I might see tx timestamp only once or twice on an overnight run. > > > > > > I read a comment from Douglas Arnold from Meinberg that if basically > > > anything goes wrong with fulfilling a grant, message rate or grant > > > duration, or both, should be reduced. > > > > > > I've searched the archives and read all of the responses and a few > > > caught my attention. Most say it's a driver bug but some said it > > > could be a stack issue. So I'm wondering since I can significantly > > > decrease the occurrence of the tx timeout by modifying above > > > settings, what other settings would affect or tune this particular > > > telco profile? > > > > > > I'm still fairly new to all this and I understand the telco profiles > > > are a bit unique so I'm trying to understand what ptp4l.conf > > > settings I need to focus on for this particular profile. > > > > > > If this is a "stack" issue, what can I do to reduce the "message rate" > > > or "grant duration" if these are related to whatever a "stack" issue > > > is? > > > > I'd be willing to put my money on a driver bug. But for that you'd > > need to confirm that the issue reproduces with the default.cfg and not > > just with the > > G.8275.2 profile. Don't try to run before you can walk. So I ran tests using a plain 1588 profile and E2E and yes the problem still happens. 
Here is that config: [global] # # Default Data Set # twoStepFlag 0 slaveOnly 1 priority1 128 priority2 128 domainNumber 0 #utc_offset 37 clockClass 6 #clockClass 255 #step_window 48 clockAccuracy 0xFE offsetScaledLogVariance 0xFFFF free_running 0 freq_est_interval 1 dscp_event 0 dscp_general 0 dataset_comparison ieee1588 #for G.8275.1 #dataset_comparison G.8275.x G.8275.defaultDS.localPriority 128 # # Port Data Set # logAnnounceInterval 1 logSyncInterval 0 logMinDelayReqInterval 0 logMinPdelayReqInterval 0 announceReceiptTimeout 3 syncReceiptTimeout 0 delayAsymmetry 0 fault_reset_interval -128 #fault_reset_interval 4 neighborPropDelayThresh 20000000 masterOnly 0 G.8275.portDS.localPriority 128 # # Run time options # assume_two_step 0 logging_level 6 path_trace_enabled 0 follow_up_info 0 hybrid_e2e 0 inhibit_multicast_service 0 net_sync_monitor 0 tc_spanning_tree 0 #tx_timestamp_timeout 300 tx_timestamp_timeout 1000 unicast_listen 1 unicast_req_duration 300 #unicast_master_table 1 use_syslog 1 verbose 0 summary_interval 4 kernel_leap 1 check_fup_sync 0 # # Servo Options # #servo_offset_threshold 100 #servo_num_offset_values 64 pi_proportional_const 0.0 pi_integral_const 0.0 pi_proportional_scale 0.0 pi_proportional_exponent -0.3 pi_proportional_norm_max 0.7 pi_integral_scale 0.0 pi_integral_exponent 0.4 pi_integral_norm_max 0.3 step_threshold 0.0 #step_threshold 0.00002 first_step_threshold 0.00002 max_frequency 900000000 clock_servo pi sanity_freq_limit 200000000 ntpshm_segment 0 # # Transport options # transportSpecific 0x0 ptp_dst_mac 01:1B:19:00:00:00 p2p_dst_mac 01:80:C2:00:00:0E udp_ttl 1 #udp6_scope 0x0E uds_address /var/run/ptp4l # # Default interface options # clock_type OC network_transport UDPv4 #delay_mechanism P2P delay_mechanism E2E time_stamping p2p1step #time_stamping onestep #time_stamping hardware #tsproc_mode filter #tsproc_mode raw tsproc_mode raw_weight delay_filter moving_median delay_filter_length 10 egressLatency 0 ingressLatency 0 
boundary_clock_jbod 0 # # Clock description # productDescription ;; revisionData ;; manufacturerIdentity 00:00:00 userDescription ; timeSource 0xA0 #maxStepsRemoved 255 # #[unicast_master_table] #table_id 1 #logQueryInterval 2 #UDPv4 192.168.0.250 #UDPv4 192.168.1.250 # #[lan1] #unicast_master_table 1 And I did find a bug in the DSA driver but it didn't appear to change anything. In ksz9477_ptp_txtstamp_skb function the "ret" that is being assigned by "wait_for_completion_timeout" returning is declared as an "int" instead of an "unsigned long" so I fixed that. ... still looking for other stuff but again, I'm probably not experienced enough (yet) with DSA and LinuxPTP to do much good. Regards, Brian CONFIDENTIALITY NOTICE: This email and any attachments are for the sole use of the intended recipient and may contain material that is proprietary, confidential, privileged or otherwise legally protected or restricted under applicable government laws. Any review, disclosure, distributing or other use without expressed permission of the sender is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies without reading, printing, or saving. |
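For anyone repeating the default.cfg comparison discussed in this thread, a small helper can list which options differ between two ptp4l-style config files. This is only a minimal Python sketch and not part of linuxptp: the names parse_ptp4l_config and diff_configs are made up for illustration, and it assumes the plain "option value" line format with # comments and [section] headers seen in the config above. Repeated keys inside one section (for example several UDPv4 entries in a unicast_master_table) overwrite each other in this sketch.

```python
def parse_ptp4l_config(text):
    """Return {section: {option: value}} for a ptp4l-style config text.

    Hypothetical helper for illustration; not linuxptp's own parser.
    """
    sections = {}
    current = "global"  # assumption: options before any header count as global
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
            continue
        parts = line.split(None, 1)  # option name, then the rest as value
        value = parts[1].strip() if len(parts) > 1 else ""
        sections.setdefault(current, {})[parts[0]] = value
    return sections


def diff_configs(default_text, custom_text):
    """List (section, option, default_value, custom_value) for each difference."""
    default = parse_ptp4l_config(default_text)
    custom = parse_ptp4l_config(custom_text)
    changes = []
    for section in sorted(set(default) | set(custom)):
        d = default.get(section, {})
        c = custom.get(section, {})
        for option in sorted(set(d) | set(c)):
            if d.get(option) != c.get(option):
                changes.append((section, option, d.get(option), c.get(option)))
    return changes


if __name__ == "__main__":
    base = "[global]\ntwoStepFlag 1\ntx_timestamp_timeout 10\n"
    mine = "[global]\ntwoStepFlag 0\ntx_timestamp_timeout 1000\n"
    for section, option, old, new in diff_configs(base, mine):
        print(f"[{section}] {option}: {old} to {new}")
```

An option that exists in only one of the two files shows up with None on the other side, which makes commented-out settings easy to spot.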
From: Peter B. <pe...@be...> - 2021-10-14 09:04:36
|
Hi again,

> phc2sys has an automatic mode. See the man page for details.

I have a follow-up question, as I still cannot get this to work correctly. I hope some experienced linuxptp users can help out.

I cannot get the sync between CLOCK_REALTIME and the PHC that I expect. I have two devices connected through an AVB switch, running a gPTP config like the example from the repo. I only changed the priority to 246 in order to become master instead of the AVB switch. Both devices are running linuxptp version 3.1.1.

Here is the config file used for ptp4l:

#
# 802.1AS example configuration containing those attributes which
# differ from the defaults. See the file, default.cfg, for the
# complete list of available options.
#
[global]
gmCapable 1
priority1 246
priority2 246
logAnnounceInterval 0
logSyncInterval -3
syncReceiptTimeout 3
neighborPropDelayThresh 800
min_neighbor_prop_delay -20000000
assume_two_step 1
path_trace_enabled 1
follow_up_info 1
transportSpecific 0x1
ptp_dst_mac 01:80:C2:00:00:0E
network_transport L2
delay_mechanism P2P

On the first device I start and configure the linuxptp applications like this:

$ ptp4l -i eth1 -f gPTP.cfg
$ pmc -u -b 0 -t 1 "SET GRANDMASTER_SETTINGS_NP clockClass 246 \
    clockAccuracy 0xfe offsetScaledLogVariance 0xffff \
    currentUtcOffset 37 leap61 0 leap59 0 currentUtcOffsetValid 1 \
    ptpTimescale 1 timeTraceable 1 frequencyTraceable 0 \
    timeSource 0xa0"
$ phc2sys -a -r -r --transportSpecific=1 -E ntpshm

With this configuration this PTP device becomes master on the network. phc2sys starts to sync PHC from CLOCK_REALTIME. What I now expect is that CLOCK_REALTIME (in UTC) will be exactly 37 seconds behind PHC (in TAI), but I cannot see this happen.
$ date;phc_ctl eth1 get;phc_ctl eth1 cmp
Thu Oct 14 08:56:37 UTC 2021
phc_ctl[573.296]: clock time is 1634201797.174712670 or Thu Oct 14 08:56:37 2021
phc_ctl[573.303]: offset from CLOCK_REALTIME is 1472ns

The phc2sys log output looks like this:

phc2sys[524]: [599.261] eth1 sys offset -37000001462 s0 freq +0 delay 875
phc2sys[524]: [600.262] eth1 sys offset -37000001467 s0 freq +0 delay 875

As far as I can tell from the code, phc_ctl does not compensate for UTC-TAI, so I expected to see the 37 s difference in the output from that utility, and something close to 0 offset in the log file.

I think I have some bad configuration in my setup. Can someone help me understand how to set this up properly in order to get a good sync between the clocks on the master side? And I assume that I can use the same settings on the slave device to get it correct on that end too?

I also think it is strange that I cannot get better precision than around 1.4 µs between the clocks.

Thanks,
/Peter
|
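The offsets in the phc2sys log above can be split into a whole-second part and a small residual. The following is a minimal Python sketch of that arithmetic, assuming the 37 s component corresponds to the UTC offset configured in the pmc command (currentUtcOffset 37). The function name is hypothetical, and the sketch says nothing about which clock is actually wrong; it only shows that the logged value is 37 s plus roughly 1.46 µs.

```python
NS_PER_S = 1_000_000_000


def residual_after_utc_offset(offset_ns, utc_offset_s=37):
    """Remove an assumed whole-second UTC offset from a logged offset.

    offset_ns:    offset as printed by phc2sys, in nanoseconds
    utc_offset_s: assumed TAI-UTC offset in seconds (37 in this thread)
    Returns the residual in nanoseconds.
    """
    return offset_ns + utc_offset_s * NS_PER_S


if __name__ == "__main__":
    for logged in (-37000001462, -37000001467):
        residual = residual_after_utc_offset(logged)
        print(f"{logged} ns = -37 s and {residual} ns ({residual / 1000:.3f} us)")
```

The residual is of the same order as the 1472 ns reported by phc_ctl cmp in the same message.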
From: Peter B. <pe...@be...> - 2021-10-14 08:07:18
|
Hi Vinicius,

On 2021-10-14 01:44, Vinicius Costa Gomes wrote:
> One thing that could be missing is for phc2sys to use the same config
> file as ptp4l, this should be more convenient, something like this:
>
> $ ./phc2sys -a -r -r -q -m -f /etc/linuxptp/gPTP.cfg

Sure, it can be more convenient, but I think I would then have to check all settings and make sure they work for both applications. The man page for phc2sys has a quite clear warning about this, probably for a reason:

WARNING
    Be cautious when the same configuration file is used for both ptp4l
    and phc2sys. Keep in mind, that values specified in the configuration
    file take precedence over their default values. If a certain option,
    which is common to ptp4l and phc2sys, is specified to a non-default
    value in the configuration file (p.e., for ptp4l), then this
    non-default value applies also for phc2sys. This might be not what is
    expected.

    It is recommended to use separate configuration files for ptp4l and
    phc2sys in order to avoid any unexpected behavior.

But further down the road, when I have full control of the applications, this can probably be a good solution.

Thanks,
/Peter
|
From: Vinicius C. G. <vin...@in...> - 2021-10-13 23:44:43
|
Hi,

Peter Bergin <pe...@be...> writes:

> Hi,
>
> On 2021-10-13 00:19, Richard Cochran wrote:
>> On Tue, Oct 12, 2021 at 09:47:21PM +0200, Peter Bergin wrote:
>>> Hi,
>>>
>>> I'm currently working on a network product using gPTP configuration for PTP
>>> sync. The device shall be a AVB endpoint handling audio and according to
>>> Avnu specifications all endpoint shall be capable of being master on the
>>> network. Some plugins for AVB (such as gstreamer and alsa-plugins) require
>>> that system time (CLOCK_REALTIME) is in sync with PHC to get correct timing.
>>> I see different cases depending on if the device is master or slave. If the
>>> device is slave I would like to sync PHC to CLOCK_REALTIME. The other way
>>> around, if device is master I would like to sync CLOCK_REALTIME to PHC.
>>>
>>> To the question; is there a way to use phc2sys and handle this
>>> automatically?
>> phc2sys has an automatic mode. See the man page for details.
>>
> Thanks! My bad, should have read man pages better. And I did try '-a'
> but had issues with it. To summarize for other users:
>
> $ ./phc2sys -a -r -r -q -m
> phc2sys[412816.955]: Waiting for ptp4l...
> phc2sys[412817.956]: Waiting for ptp4l...

One thing that could be missing is for phc2sys to use the same config
file as ptp4l, this should be more convenient, something like this:

$ ./phc2sys -a -r -r -q -m -f /etc/linuxptp/gPTP.cfg

Cheers,
--
Vinicius
|
From: Keller, J. E <jac...@in...> - 2021-10-13 17:41:30
|
On 10/13/2021 12:50 AM, Peter Bergin wrote:
> Hi,
>
> On 2021-10-13 00:19, Richard Cochran wrote:
>> On Tue, Oct 12, 2021 at 09:47:21PM +0200, Peter Bergin wrote:
>>> Hi,
>>>
>>> I'm currently working on a network product using gPTP configuration for PTP
>>> sync. The device shall be a AVB endpoint handling audio and according to
>>> Avnu specifications all endpoint shall be capable of being master on the
>>> network. Some plugins for AVB (such as gstreamer and alsa-plugins) require
>>> that system time (CLOCK_REALTIME) is in sync with PHC to get correct timing.
>>> I see different cases depending on if the device is master or slave. If the
>>> device is slave I would like to sync PHC to CLOCK_REALTIME. The other way
>>> around, if device is master I would like to sync CLOCK_REALTIME to PHC.
>>>
>>> To the question; is there a way to use phc2sys and handle this
>>> automatically?
>> phc2sys has an automatic mode. See the man page for details.
>>
> Thanks! My bad, should have read man pages better. And I did try '-a'
> but had issues with it. To summarize for other users:
>
> $ ./phc2sys -a -r -r -q -m
> phc2sys[412816.955]: Waiting for ptp4l...
> phc2sys[412817.956]: Waiting for ptp4l...
>
> I had trouble communicating over UDS between ptp4l and phc2sys. The
> problem was that I'm working with gPTP (IEEE 802.1AS) and the setting
> transportSpecific didn't match between ptp4l and phc2sys. So the
> solution was, as stated in the man page, to add '--transportSpecific=1'
> when starting phc2sys and the issue was solved.
>
> ptp4l just silently drops messages if transportSpecific don't match (and
> ignore_transport_specific=0). I tried to debug this issue with help of
> debug messages (-l 7) but couldn't find it that way. Would it be a good
> thing to add debug prints when that happens to improve visibility? Or
> would that flood the log in some cases?
>
> diff --git a/port.c b/port.c
> index fa49663..1c04fc5 100644
> --- a/port.c
> +++ b/port.c
> @@ -699,6 +699,8 @@ static int port_ignore(struct port *p, struct
> ptp_message *m)
>          }
>          if (p->match_transport_specific &&
>              msg_transport_specific(m) != p->transportSpecific) {
> +               pr_debug("port %hu: transport_specific did not match,
> will drop message",
> +                       portnum(p));
>                  return 1;
>          }
>          if (pid_eq(&m->header.sourcePortIdentity, &p->portIdentity)) {
> diff --git a/tc.c b/tc.c
> index 0346ba9..705f54c 100644
> --- a/tc.c
> +++ b/tc.c
> @@ -478,6 +478,8 @@ int tc_ignore(struct port *p, struct ptp_message *m)
>
>          if (p->match_transport_specific &&
>              msg_transport_specific(m) != p->transportSpecific) {
> +               pr_debug("port %hu: transport_specific did not match,
> will drop message",
> +                       portnum(p));
>                  return 1;
>          }
>          if (pid_eq(&m->header.sourcePortIdentity, &p->portIdentity)) {
>
>
> Thanks,
> /Peter
>

I'm ok with the debug prints. These would only display at the highest
log level, so it would only clutter logs for those who are already
debugging.

Thanks,
Jake
|
From: <Bri...@L3...> - 2021-10-13 14:31:47
|
Hi Vladimir, > -----Original Message----- > From: Vladimir Oltean <ol...@gm...> > Sent: Tuesday, October 12, 2021 7:11 PM > To: Hutchinson, Brian (US) - PSPC <Bri...@L3...> > Cc: lin...@li...; Christian Eggers <ce...@ar...> > Subject: [EXTERNAL] Re: [Linuxptp-users] Using G.8275.2 profile and getting > tx timestamp timeout, but changing logSyncInterval etc. changes how often > this happens > > On Fri, Oct 08, 2021 at 03:22:10PM +0000, Brian.Hutchinson--- via Linuxptp- > users wrote: > > Hi, > > > > I'm using Christian's DSA patches > > https://lkml.org/lkml/2020/10/19/633) on a NXP iMX8MM with a Microchip > > ksz9567 with ptp4l.conf setup for E2E G.8275.2 profile. I'm running a > > 1G RGMII interface and my GM and unit under test is connected via a 1G > > Netgear dumb switch. > > > > Using 5.10.32 kernel with CONFIG_HZ_1000 and nohz=off on cmdline. > > > > I've been getting the "timed out while polling for tx timestamp" error > > which causes linuxptp to restart. When linuxptp restarts my 1PPS > > (generated from Microchip switch) walks all over the place on my O > > Scope until linuxptp gets a good sync again and pulls 1PPS back into > > sync with the GM sync out reference I'm also watching on the scope. > > > > Of course increasing tx_timestamp_timeout doesn't appear to help in > > this case. I've tried values all the way up to 8000. > > > > But I can significantly reduce the frequency of the problem if I make > > changes to some ptp4l.conf settings. > > > > With ptp4l.conf settings: > > > > logAnnounceInterval 1 > > logSyncInterval 0 > > logMinDelayReqInterval 0 > > logMinPdelayReqInterval 0 > > announceReceiptTimeout 2 > > > > I'll see the tx timestamp timeout probably 15 or so times running a > > test overnight. > > > > If I set : > > > > logAnnounceInterval 1 > > logSyncInterval 2 > > logMinDelayReqInterval 2 > > logMinPdelayReqInterval 2 > > announceReceiptTimeout 2 > > > > ... then I might see tx timestamp only once or twice on an overnight run. 
> > > > I read a comment from Douglas Arnold from Meinberg that if basically > > anything goes wrong with fulfilling a grant, message rate or grant > > duration, or both, should be reduced. > > > > I've searched the archives and read all of the responses and a few > > caught my attention. Most say it's a driver bug but some said it > > could be a stack issue. So I'm wondering since I can significantly > > decrease the occurrence of the tx timeout by modifying above settings, > > what other settings would affect or tune this particular telco > > profile? > > > > I'm still fairly new to all this and I understand the telco profiles > > are a bit unique so I'm trying to understand what ptp4l.conf settings > > I need to focus on for this particular profile. > > > > If this is a "stack" issue, what can I do to reduce the "message rate" > > or "grant duration" if these are related to whatever a "stack" issue > > is? > > I'd be willing to put my money on a driver bug. But for that you'd need to > confirm that the issue reproduces with the default.cfg and not just with the > G.8275.2 profile. Don't try to run before you can walk. Ah, you are using my military saying of "craw, walk, run ... stumble, fall down" against me! I had this working with normal 1588 profile with P2P and don't recall having any linuxptp restarts due to tx timeouts. I think the problems I've noticed are only with E2E but I could be mistaken. > > Make no mistake, there was a reason why the patches you've pointed to > were not applied to the mainline kernel in their given form at the time. > > But regardless, which specific version of the patches have you applied? > Your link points to the RFC (aka "barely works"), whereas the latest version, > before being abandoned, was v5. > https://patchwork.kernel.org/project/netdevbpf/cover/20201203102117.899 > 5-1...@ar.../ Oh I'm aware. I've been dealing with this Microchip long before I met Christian. 
I've been working with Christian since we are in the same boat using Microchip switches. I'm using his very latest patch set. I've read the discussions you guys had about the patches. But I have no choice other than to use Christian's patches and do whatever I can to prove them out and make improvements if possible, as Microchip has nothing for us at this time. They have things in the works (they are busy on other things too) that may help one day but it doesn't help us now with our immediate need.

I tried to get their proprietary patch set working on our platform with help from Microchip and could never get anything to work. That's probably partly my fault but this is a direction I never really wanted to go in anyway. I only attempted it as I thought it was the shortest path to get what we needed working the quickest. But we will be changing kernels frequently (cyber guys have to make a living too you know) so I didn't want to continually have to figure out how to apply Microchip's proprietary patches to a continually moving target ... especially when their stuff will never be mainlined.

I like to believe that going the DSA route and attempting to get something mainlined for the ksz is time better spent. Which may be naïve considering the previous comments regarding all this but I still think it's the right thing to do.

> I specifically had a comment that TX timestamps would potentially get lost if
> user space would attempt timestamping of one frame while another was still
> in progress, and this only got fixed in v5 by the addition of a
> ksz9477_defer_xmit() function that waits until the in-flight skb has been
> timestamped. There might be other issues too.

... the version I have has ksz9477_defer_xmit. I noticed in the "sja1105_port_deferred_xmit" they protected theirs with a mutex and also do a check on a "clone" variable that looks to be associated with "dsa_skb_tx_timestamp" ...
but the ksz dsa doesn't have that so don't know if I'm on the right trail with this or not. Unfortunately I've just recently got into all this so I don't have the knowledge and background you guys do so I probably only know enough to be dangerous. In reading the archives, I do enjoy reading your posts. Glad you chimed in and hope to learn something. > > The logAnnounceInterval should not be making a difference, because the > driver performs one-step timestamping for Sync messages, so their rate > shouldn't matter, as the TX timestamp isn't reported to user space. > Just the two-step TX timestamp of the Pdelay_Req frame is, and therefore, > modulating the logMinPdelayReqInterval value is the only thing that should > be able to modulate the behavior of your observed issue. > > [ also, don't be shy to also provide negative values to > logMinPdelayReqInterval, > for example -3 means 2^-3 seconds == 125 ms. We should see something > really quickly with a setting like that ] Oh I've tried negative values. The example Renesas G.8275.2 profile I found and followed had negative values so I used those at first. It makes things happen a lot faster and also makes linuxptp reset with this tx timestamp issue much more frequently. Which is why I dialed it back but results in less accurate 1PPS (aka more jitter). > > Once you have a simple reproducer with the v5, maybe Christian would be > able to tell you where to put some trace points in the kernel for a better > understanding of what goes wrong with the Pdelay_Req messages. Christian is quite busy with other things now so you're stuck with me 😉 > > > Regards, > > > > Brian > > > > My complete ptp4l.conf settings. These settings will run with less " > > timed out while polling for tx timestamp" occurrences but increases my > > 1PPS jitter observed on O Scope by +/- 600ish ns. When I run with > > first set of logXxx settings above the jitter is much better at +/- > > 200ish ns. 
> > > > [global] > > # > > # Default Data Set > > # > > twoStepFlag 0 > > slaveOnly 1 > > priority1 128 > > priority2 255 > > domainNumber 44 > > utc_offset 37 > > #clockClass 248 > > clockClass 255 > > #step_window 3 > > clockAccuracy 0xFE > > offsetScaledLogVariance 0xFFFF > > free_running 0 > > freq_est_interval 1 > > dscp_event 0 > > dscp_general 0 > > #dataset_comparison ieee1588 > > #for G.8275.1 > > dataset_comparison G.8275.x > > G.8275.defaultDS.localPriority 128 > > # > > # Port Data Set > > # > > logAnnounceInterval 1 > > logSyncInterval 2 > > logMinDelayReqInterval 2 > > logMinPdelayReqInterval 2 > > announceReceiptTimeout 2 > > syncReceiptTimeout 0 > > delayAsymmetry 0 > > fault_reset_interval -128 > > #fault_reset_interval 4 > > neighborPropDelayThresh 20000000 > > masterOnly 0 > > G.8275.portDS.localPriority 128 > > # > > # Run time options > > # > > assume_two_step 0 > > logging_level 6 > > path_trace_enabled 0 > > follow_up_info 0 > > hybrid_e2e 1 > > inhibit_multicast_service 1 > > net_sync_monitor 0 > > tc_spanning_tree 0 > > #tx_timestamp_timeout 300 > > tx_timestamp_timeout 8000 > > unicast_listen 1 > > unicast_req_duration 300 > > unicast_master_table 1 > > use_syslog 1 > > verbose 0 > > summary_interval 4 > > kernel_leap 1 > > #check_fup_sync 0 > > check_fup_sync 1 > > # > > # Servo Options > > # > > #write_phase_mode 1 > > servo_offset_threshold 100 > > servo_num_offset_values 64 > > pi_proportional_const 0.0 > > #pi_proportional_const 0.7 > > pi_integral_const 0.0 > > #pi_integral_const 0.3 > > pi_proportional_scale 0.0 > > pi_proportional_exponent -0.3 > > pi_proportional_norm_max 0.7 > > pi_integral_scale 0.0 > > pi_integral_exponent 0.4 > > pi_integral_norm_max 0.3 > > step_threshold 0.0 > > #step_threshold 0.00002 > > first_step_threshold 0.00002 > > max_frequency 900000000 > > clock_servo pi > > sanity_freq_limit 200000000 > > ntpshm_segment 0 > > # > > # Transport options > > # > > transportSpecific 0x0 > > ptp_dst_mac 
01:1B:19:00:00:00 > > p2p_dst_mac 01:80:C2:00:00:0E > > udp_ttl 1 > > #udp6_scope 0x0E > > uds_address /var/run/ptp4l > > # > > # Default interface options > > # > > clock_type OC > > network_transport UDPv4 > > #delay_mechanism P2P > > delay_mechanism E2E > > time_stamping p2p1step > > #time_stamping onestep > > #time_stamping hardware > > #tsproc_mode filter > > tsproc_mode filter_weight > > delay_filter moving_median > > #delay_filter_length 10 > > delay_filter_length 100 > > egressLatency 0 > > ingressLatency 0 > > boundary_clock_jbod 0 > > # > > # Clock description > > # > > productDescription ;; > > revisionData ;; > > manufacturerIdentity 00:00:00 > > userDescription ; > > timeSource 0xA0 > > maxStepsRemoved 255 > > # > > [unicast_master_table] > > table_id 1 > > logQueryInterval 2 > > UDPv4 192.168.0.250 > > #UDPv4 192.168.1.250 > > # > > [lan1] > > unicast_master_table > > > > > > > > CONFIDENTIALITY NOTICE: This email and any attachments are for the > > sole use of the intended recipient and may contain material that is > > proprietary, confidential, privileged or otherwise legally protected > > or restricted under applicable government laws. Any review, > > disclosure, distributing or other use without expressed permission of > > the sender is strictly prohibited. If you are not the intended > > recipient, please contact the sender and delete all copies without > > reading, printing, or saving. > > Am I an intended recipient? Let me know so I can delete the email if needed. > What about the sourceforge mail archive? Ha, ha. Yeah, I usually use my gmail account as my work is a very large bureaucracy that is mostly defense contractor related so our IT puts once size fits all solutions on us even though we do private land mobile radio and public safety (police, fire, dispatch consoles etc.). So it is what it is. My work is all Open Source so forgive me as it's out of my control and just try to ignore it. 
I'm like most in Open Source and simply trying to push things along for the common good. The proprietary Gestapo isn't going to come after anyone 😉

Regards,
Brian
|
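The log*Interval options discussed in this exchange (logSyncInterval, logMinDelayReqInterval, logMinPdelayReqInterval) are base-2 logarithms of the interval in seconds, which is why -3 corresponds to 2^-3 s = 125 ms. A minimal Python sketch of the conversion follows; the function names are made up for illustration and are not part of linuxptp.

```python
def log_interval_to_seconds(log_interval):
    """Convert a PTP log2 interval value to an interval in seconds."""
    return 2.0 ** log_interval


def messages_per_second(log_interval):
    """Nominal message rate implied by a log2 interval value."""
    return 1.0 / log_interval_to_seconds(log_interval)


if __name__ == "__main__":
    # The values mentioned in this thread: 2, 0 and -3.
    for value in (2, 0, -3):
        seconds = log_interval_to_seconds(value)
        print(f"log interval {value}: {seconds} s, {messages_per_second(value)} msg/s")
```

Going from 0 to 2 cuts the nominal rate of the affected messages by a factor of four, and going from 0 to -3 raises it by a factor of eight, in line with the observation that negative values make the timeout show up more often.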
From: Peter B. <pe...@be...> - 2021-10-13 07:51:13
|
Hi,

On 2021-10-13 00:19, Richard Cochran wrote:
> On Tue, Oct 12, 2021 at 09:47:21PM +0200, Peter Bergin wrote:
>> Hi,
>>
>> I'm currently working on a network product using gPTP configuration for PTP
>> sync. The device shall be a AVB endpoint handling audio and according to
>> Avnu specifications all endpoint shall be capable of being master on the
>> network. Some plugins for AVB (such as gstreamer and alsa-plugins) require
>> that system time (CLOCK_REALTIME) is in sync with PHC to get correct timing.
>> I see different cases depending on if the device is master or slave. If the
>> device is slave I would like to sync PHC to CLOCK_REALTIME. The other way
>> around, if device is master I would like to sync CLOCK_REALTIME to PHC.
>>
>> To the question; is there a way to use phc2sys and handle this
>> automatically?
> phc2sys has an automatic mode. See the man page for details.
>

Thanks! My bad, should have read man pages better. And I did try '-a' but had issues with it. To summarize for other users:

$ ./phc2sys -a -r -r -q -m
phc2sys[412816.955]: Waiting for ptp4l...
phc2sys[412817.956]: Waiting for ptp4l...

I had trouble communicating over UDS between ptp4l and phc2sys. The problem was that I'm working with gPTP (IEEE 802.1AS) and the setting transportSpecific didn't match between ptp4l and phc2sys. So the solution was, as stated in the man page, to add '--transportSpecific=1' when starting phc2sys and the issue was solved.

ptp4l just silently drops messages if transportSpecific doesn't match (and ignore_transport_specific=0). I tried to debug this issue with the help of debug messages (-l 7) but couldn't find it that way. Would it be a good thing to add debug prints when that happens to improve visibility? Or would that flood the log in some cases?

diff --git a/port.c b/port.c
index fa49663..1c04fc5 100644
--- a/port.c
+++ b/port.c
@@ -699,6 +699,8 @@ static int port_ignore(struct port *p, struct ptp_message *m)
 	}
 	if (p->match_transport_specific &&
 	    msg_transport_specific(m) != p->transportSpecific) {
+		pr_debug("port %hu: transport_specific did not match, will drop message",
+			 portnum(p));
 		return 1;
 	}
 	if (pid_eq(&m->header.sourcePortIdentity, &p->portIdentity)) {
diff --git a/tc.c b/tc.c
index 0346ba9..705f54c 100644
--- a/tc.c
+++ b/tc.c
@@ -478,6 +478,8 @@ int tc_ignore(struct port *p, struct ptp_message *m)
 
 	if (p->match_transport_specific &&
 	    msg_transport_specific(m) != p->transportSpecific) {
+		pr_debug("port %hu: transport_specific did not match, will drop message",
+			 portnum(p));
 		return 1;
 	}
 	if (pid_eq(&m->header.sourcePortIdentity, &p->portIdentity)) {

Thanks,
/Peter
|
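The check that the patch above instruments compares the transportSpecific field of an incoming message with the port's configured value. In the PTP common header that field is the upper four bits of the first octet and the message type is the lower four bits. The following is a minimal Python sketch of the idea, not linuxptp's code: the function names are hypothetical, the port value is taken here as the plain config number (1 for gPTP), and the match flag stands in for ignore_transport_specific=0.

```python
MANAGEMENT = 0x0D  # PTP messageType value for Management messages


def transport_specific(first_octet):
    """Upper nibble of the first PTP header octet."""
    return (first_octet >> 4) & 0x0F


def message_type(first_octet):
    """Lower nibble of the first PTP header octet."""
    return first_octet & 0x0F


def would_be_ignored(first_octet, port_transport_specific, match=True):
    """True if a port that enforces matching would drop this message."""
    return match and transport_specific(first_octet) != port_transport_specific


if __name__ == "__main__":
    # A management request sent with transportSpecific 0 to a gPTP port (1).
    request = (0 << 4) | MANAGEMENT
    print("dropped:", would_be_ignored(request, port_transport_specific=1))
    # The same request sent with transportSpecific 1.
    request = (1 << 4) | MANAGEMENT
    print("dropped:", would_be_ignored(request, port_transport_specific=1))
```

This mirrors the situation described above, where phc2sys started without --transportSpecific=1 kept waiting for a ptp4l instance configured for gPTP.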
From: Vladimir O. <ol...@gm...> - 2021-10-12 23:10:42
|
On Fri, Oct 08, 2021 at 03:22:10PM +0000, Brian.Hutchinson--- via Linuxptp-users wrote:
> Hi,
>
> I'm using Christian's DSA patches
> https://lkml.org/lkml/2020/10/19/633) on a NXP iMX8MM with a Microchip
> ksz9567 with ptp4l.conf setup for E2E G.8275.2 profile. I'm running a
> 1G RGMII interface and my GM and unit under test is connected via a 1G
> Netgear dumb switch.
>
> Using 5.10.32 kernel with CONFIG_HZ_1000 and nohz=off on cmdline.
>
> I've been getting the "timed out while polling for tx timestamp" error
> which causes linuxptp to restart. When linuxptp restarts my 1PPS
> (generated from Microchip switch) walks all over the place on my O
> Scope until linuxptp gets a good sync again and pulls 1PPS back into
> sync with the GM sync out reference I'm also watching on the scope.
>
> Of course increasing tx_timestamp_timeout doesn't appear to help in
> this case. I've tried values all the way up to 8000.
>
> But I can significantly reduce the frequency of the problem if I make
> changes to some ptp4l.conf settings.
>
> With ptp4l.conf settings:
>
> logAnnounceInterval 1
> logSyncInterval 0
> logMinDelayReqInterval 0
> logMinPdelayReqInterval 0
> announceReceiptTimeout 2
>
> I'll see the tx timestamp timeout probably 15 or so times running a
> test overnight.
>
> If I set :
>
> logAnnounceInterval 1
> logSyncInterval 2
> logMinDelayReqInterval 2
> logMinPdelayReqInterval 2
> announceReceiptTimeout 2
>
> ... then I might see tx timestamp only once or twice on an overnight run.
>
> I read a comment from Douglas Arnold from Meinberg that if basically
> anything goes wrong with fulfilling a grant, message rate or grant
> duration, or both, should be reduced.
>
> I've searched the archives and read all of the responses and a few
> caught my attention. Most say it's a driver bug but some said it
> could be a stack issue. So I'm wondering since I can significantly
> decrease the occurrence of the tx timeout by modifying above settings,
> what other settings would affect or tune this particular telco
> profile?
>
> I'm still fairly new to all this and I understand the telco profiles
> are a bit unique so I'm trying to understand what ptp4l.conf settings
> I need to focus on for this particular profile.
>
> If this is a "stack" issue, what can I do to reduce the "message rate"
> or "grant duration" if these are related to whatever a "stack" issue
> is?

I'd be willing to put my money on a driver bug. But for that you'd need
to confirm that the issue reproduces with the default.cfg and not just
with the G.8275.2 profile. Don't try to run before you can walk.

Make no mistake, there was a reason why the patches you've pointed to
were not applied to the mainline kernel in their given form at the time.
But regardless, which specific version of the patches have you applied?
Your link points to the RFC (aka "barely works"), whereas the latest
version, before being abandoned, was v5.
https://patchwork.kernel.org/project/netdevbpf/cover/202...@ar.../

I specifically had a comment that TX timestamps would potentially get
lost if user space would attempt timestamping of one frame while another
was still in progress, and this only got fixed in v5 by the addition of
a ksz9477_defer_xmit() function that waits until the in-flight skb has
been timestamped. There might be other issues too.

The logAnnounceInterval should not be making a difference, because the
driver performs one-step timestamping for Sync messages, so their rate
shouldn't matter, as the TX timestamp isn't reported to user space. Just
the two-step TX timestamp of the Pdelay_Req frame is, and therefore,
modulating the logMinPdelayReqInterval value is the only thing that
should be able to modulate the behavior of your observed issue.
[ also, don't be shy to also provide negative values to
logMinPdelayReqInterval, for example -3 means 2^-3 seconds == 125 ms.
We should see something really quickly with a setting like that ]

Once you have a simple reproducer with the v5, maybe Christian would be
able to tell you where to put some trace points in the kernel for a
better understanding of what goes wrong with the Pdelay_Req messages.

> Regards,
>
> Brian
>
> My complete ptp4l.conf settings. These settings will run with less "
> timed out while polling for tx timestamp" occurrences but increases my
> 1PPS jitter observed on O Scope by +/- 600ish ns. When I run with
> first set of logXxx settings above the jitter is much better at +/-
> 200ish ns.
>
> [global]
> #
> # Default Data Set
> #
> twoStepFlag 0
> slaveOnly 1
> priority1 128
> priority2 255
> domainNumber 44
> utc_offset 37
> #clockClass 248
> clockClass 255
> #step_window 3
> clockAccuracy 0xFE
> offsetScaledLogVariance 0xFFFF
> free_running 0
> freq_est_interval 1
> dscp_event 0
> dscp_general 0
> #dataset_comparison ieee1588
> #for G.8275.1
> dataset_comparison G.8275.x
> G.8275.defaultDS.localPriority 128
> #
> # Port Data Set
> #
> logAnnounceInterval 1
> logSyncInterval 2
> logMinDelayReqInterval 2
> logMinPdelayReqInterval 2
> announceReceiptTimeout 2
> syncReceiptTimeout 0
> delayAsymmetry 0
> fault_reset_interval -128
> #fault_reset_interval 4
> neighborPropDelayThresh 20000000
> masterOnly 0
> G.8275.portDS.localPriority 128
> #
> # Run time options
> #
> assume_two_step 0
> logging_level 6
> path_trace_enabled 0
> follow_up_info 0
> hybrid_e2e 1
> inhibit_multicast_service 1
> net_sync_monitor 0
> tc_spanning_tree 0
> #tx_timestamp_timeout 300
> tx_timestamp_timeout 8000
> unicast_listen 1
> unicast_req_duration 300
> unicast_master_table 1
> use_syslog 1
> verbose 0
> summary_interval 4
> kernel_leap 1
> #check_fup_sync 0
> check_fup_sync 1
> #
> # Servo Options
> #
> #write_phase_mode 1
> servo_offset_threshold 100
> servo_num_offset_values 64
> pi_proportional_const 0.0
> #pi_proportional_const 0.7
> pi_integral_const 0.0
> #pi_integral_const 0.3
> pi_proportional_scale 0.0
> pi_proportional_exponent -0.3
> pi_proportional_norm_max 0.7
> pi_integral_scale 0.0
> pi_integral_exponent 0.4
> pi_integral_norm_max 0.3
> step_threshold 0.0
> #step_threshold 0.00002
> first_step_threshold 0.00002
> max_frequency 900000000
> clock_servo pi
> sanity_freq_limit 200000000
> ntpshm_segment 0
> #
> # Transport options
> #
> transportSpecific 0x0
> ptp_dst_mac 01:1B:19:00:00:00
> p2p_dst_mac 01:80:C2:00:00:0E
> udp_ttl 1
> #udp6_scope 0x0E
> uds_address /var/run/ptp4l
> #
> # Default interface options
> #
> clock_type OC
> network_transport UDPv4
> #delay_mechanism P2P
> delay_mechanism E2E
> time_stamping p2p1step
> #time_stamping onestep
> #time_stamping hardware
> #tsproc_mode filter
> tsproc_mode filter_weight
> delay_filter moving_median
> #delay_filter_length 10
> delay_filter_length 100
> egressLatency 0
> ingressLatency 0
> boundary_clock_jbod 0
> #
> # Clock description
> #
> productDescription ;;
> revisionData ;;
> manufacturerIdentity 00:00:00
> userDescription ;
> timeSource 0xA0
> maxStepsRemoved 255
> #
> [unicast_master_table]
> table_id 1
> logQueryInterval 2
> UDPv4 192.168.0.250
> #UDPv4 192.168.1.250
> #
> [lan1]
> unicast_master_table
>
>
>
> CONFIDENTIALITY NOTICE: This email and any attachments are for the
> sole use of the intended recipient and may contain material that is
> proprietary, confidential, privileged or otherwise legally protected
> or restricted under applicable government laws. Any review,
> disclosure, distributing or other use without expressed permission of
> the sender is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies without
> reading, printing, or saving.

Am I an intended recipient? Let me know so I can delete the email if needed.
What about the sourceforge mail archive?
|
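A note on the interval arithmetic in the reply above: the logXxxInterval options hold the base-2 logarithm of the interval in seconds, which is why -3 corresponds to 125 ms. The following is a minimal shell sketch of that conversion, assuming awk is available. The helper name log_interval_ms is made up for illustration and is not part of linuxptp.

```shell
# Hypothetical helper (not a linuxptp tool): convert a PTP log2 interval
# value, as used by logSyncInterval or logMinPdelayReqInterval, into
# milliseconds. The interval is 2^N seconds.
log_interval_ms() {
    awk -v n="$1" 'BEGIN { printf "%g\n", (2 ^ n) * 1000 }'
}

log_interval_ms -3   # prints 125
log_interval_ms 0    # prints 1000
log_interval_ms 2    # prints 4000
```

Read this way, the two configurations quoted in the thread differ by a factor of four in message interval: a value of 0 means one message per second, a value of 2 means one message every four seconds.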
From: Richard C. <ric...@gm...> - 2021-10-12 22:20:06
|
On Tue, Oct 12, 2021 at 09:47:21PM +0200, Peter Bergin wrote:
> Hi,
>
> I'm currently working on a network product using gPTP configuration for PTP
> sync. The device shall be a AVB endpoint handling audio and according to
> Avnu specifications all endpoint shall be capable of being master on the
> network. Some plugins for AVB (such as gstreamer and alsa-plugins) require
> that system time (CLOCK_REALTIME) is in sync with PHC to get correct timing.
> I see different cases depending on if the device is master or slave. If the
> device is slave I would like to sync PHC to CLOCK_REALTIME. The other way
> around, if device is master I would like to sync CLOCK_REALTIME to PHC.
>
> To the question; is there a way to use phc2sys and handle this
> automatically?

phc2sys has an automatic mode. See the man page for details.

Thanks,
Richard
|
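For readers who want a starting point, here is a sketch of what the automatic mode looks like on the command line, going by the phc2sys man page: -a makes phc2sys follow the port states of the running ptp4l and choose the synchronization direction itself, -r adds CLOCK_REALTIME as a clock to be synchronized, and giving -r twice also makes CLOCK_REALTIME eligible as a time source. The interface name, config path, --step_threshold and --transportSpecific values are copied from the manual commands in the original question and are assumptions about that setup, not requirements.

```shell
# Start ptp4l as in the original question (interface and config path
# are taken from that post).
ptp4l -i eth1 -f /etc/linuxptp/gPTP.cfg &

# Automatic mode: no -c/-s pair. phc2sys asks ptp4l for the port state
# and switches direction when the device changes between master and slave.
# The second -r lets the system clock act as the source when the port is master.
phc2sys -a -r -r --step_threshold=1 --transportSpecific=1
```

With this form both programs can be started as services and left running. Whether the result matches the manual -c/-s behaviour on a given board should be checked against the phc2sys log output.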
From: Peter B. <pe...@be...> - 2021-10-12 20:03:56
|
Hi,

I'm currently working on a network product using the gPTP configuration for PTP sync. The device shall be an AVB endpoint handling audio, and according to the Avnu specifications all endpoints shall be capable of being master on the network. Some plugins for AVB (such as gstreamer and alsa-plugins) require that system time (CLOCK_REALTIME) is in sync with the PHC to get correct timing. I see two cases, depending on whether the device is master or slave. If the device is slave, CLOCK_REALTIME should follow the PHC. The other way around, if the device is master, the PHC should follow CLOCK_REALTIME.

To the question: is there a way to use phc2sys and handle this automatically?

What I currently do manually is (linuxptp version 3.1):

$ ptp4l -i eth1 -f /etc/linuxptp/gPTP.cfg

On the device becoming master:

$ phc2sys -c eth1 -s CLOCK_REALTIME --step_threshold=1 --transportSpecific=1 -w

On the device becoming slave:

$ phc2sys -c CLOCK_REALTIME -s eth1 --step_threshold=1 --transportSpecific=1 -w

It would be good if the above could be dynamic: start everything as services and let the system sync itself.

/Peter
|
From: <Bri...@L3...> - 2021-10-08 15:37:37
|
Hi,

I'm using Christian's DSA patches (https://lkml.org/lkml/2020/10/19/633) on an NXP iMX8MM with a Microchip ksz9567, with ptp4l.conf set up for the E2E G.8275.2 profile. I'm running a 1G RGMII interface, and my GM and unit under test are connected via a 1G Netgear dumb switch.

Using 5.10.32 kernel with CONFIG_HZ_1000 and nohz=off on cmdline.

I've been getting the "timed out while polling for tx timestamp" error, which causes linuxptp to restart. When linuxptp restarts, my 1PPS (generated from the Microchip switch) walks all over the place on my O Scope until linuxptp gets a good sync again and pulls 1PPS back into sync with the GM sync out reference I'm also watching on the scope.

Of course increasing tx_timestamp_timeout doesn't appear to help in this case. I've tried values all the way up to 8000.

But I can significantly reduce the frequency of the problem if I make changes to some ptp4l.conf settings.

With ptp4l.conf settings:

logAnnounceInterval 1
logSyncInterval 0
logMinDelayReqInterval 0
logMinPdelayReqInterval 0
announceReceiptTimeout 2

I'll see the tx timestamp timeout probably 15 or so times running a test overnight.

If I set:

logAnnounceInterval 1
logSyncInterval 2
logMinDelayReqInterval 2
logMinPdelayReqInterval 2
announceReceiptTimeout 2

... then I might see the tx timestamp timeout only once or twice on an overnight run.

I read a comment from Douglas Arnold from Meinberg that if basically anything goes wrong with fulfilling a grant, message rate or grant duration, or both, should be reduced.

I've searched the archives and read all of the responses and a few caught my attention. Most say it's a driver bug but some said it could be a stack issue. So I'm wondering, since I can significantly decrease the occurrence of the tx timeout by modifying the above settings, what other settings would affect or tune this particular telco profile?

I'm still fairly new to all this and I understand the telco profiles are a bit unique, so I'm trying to understand what ptp4l.conf settings I need to focus on for this particular profile.

If this is a "stack" issue, what can I do to reduce the "message rate" or "grant duration" if these are related to whatever a "stack" issue is?

Regards,

Brian

My complete ptp4l.conf settings. These settings will run with fewer "timed out while polling for tx timestamp" occurrences but increase my 1PPS jitter observed on the O Scope by +/- 600ish ns. When I run with the first set of logXxx settings above the jitter is much better at +/- 200ish ns.

[global]
#
# Default Data Set
#
twoStepFlag 0
slaveOnly 1
priority1 128
priority2 255
domainNumber 44
utc_offset 37
#clockClass 248
clockClass 255
#step_window 3
clockAccuracy 0xFE
offsetScaledLogVariance 0xFFFF
free_running 0
freq_est_interval 1
dscp_event 0
dscp_general 0
#dataset_comparison ieee1588
#for G.8275.1
dataset_comparison G.8275.x
G.8275.defaultDS.localPriority 128
#
# Port Data Set
#
logAnnounceInterval 1
logSyncInterval 2
logMinDelayReqInterval 2
logMinPdelayReqInterval 2
announceReceiptTimeout 2
syncReceiptTimeout 0
delayAsymmetry 0
fault_reset_interval -128
#fault_reset_interval 4
neighborPropDelayThresh 20000000
masterOnly 0
G.8275.portDS.localPriority 128
#
# Run time options
#
assume_two_step 0
logging_level 6
path_trace_enabled 0
follow_up_info 0
hybrid_e2e 1
inhibit_multicast_service 1
net_sync_monitor 0
tc_spanning_tree 0
#tx_timestamp_timeout 300
tx_timestamp_timeout 8000
unicast_listen 1
unicast_req_duration 300
unicast_master_table 1
use_syslog 1
verbose 0
summary_interval 4
kernel_leap 1
#check_fup_sync 0
check_fup_sync 1
#
# Servo Options
#
#write_phase_mode 1
servo_offset_threshold 100
servo_num_offset_values 64
pi_proportional_const 0.0
#pi_proportional_const 0.7
pi_integral_const 0.0
#pi_integral_const 0.3
pi_proportional_scale 0.0
pi_proportional_exponent -0.3
pi_proportional_norm_max 0.7
pi_integral_scale 0.0
pi_integral_exponent 0.4
pi_integral_norm_max 0.3
step_threshold 0.0
#step_threshold 0.00002
first_step_threshold 0.00002
max_frequency 900000000
clock_servo pi
sanity_freq_limit 200000000
ntpshm_segment 0
#
# Transport options
#
transportSpecific 0x0
ptp_dst_mac 01:1B:19:00:00:00
p2p_dst_mac 01:80:C2:00:00:0E
udp_ttl 1
#udp6_scope 0x0E
uds_address /var/run/ptp4l
#
# Default interface options
#
clock_type OC
network_transport UDPv4
#delay_mechanism P2P
delay_mechanism E2E
time_stamping p2p1step
#time_stamping onestep
#time_stamping hardware
#tsproc_mode filter
tsproc_mode filter_weight
delay_filter moving_median
#delay_filter_length 10
delay_filter_length 100
egressLatency 0
ingressLatency 0
boundary_clock_jbod 0
#
# Clock description
#
productDescription ;;
revisionData ;;
manufacturerIdentity 00:00:00
userDescription ;
timeSource 0xA0
maxStepsRemoved 255
#
[unicast_master_table]
table_id 1
logQueryInterval 2
UDPv4 192.168.0.250
#UDPv4 192.168.1.250
#
[lan1]
unicast_master_table

CONFIDENTIALITY NOTICE: This email and any attachments are for the sole use of the intended recipient and may contain material that is proprietary, confidential, privileged or otherwise legally protected or restricted under applicable government laws. Any review, disclosure, distributing or other use without expressed permission of the sender is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies without reading, printing, or saving.
|
From: Richard C. <ric...@gm...> - 2021-10-05 17:56:12
|
On Mon, Oct 04, 2021 at 03:12:57PM -0300, Lucas Gonçalves Martins via Linuxptp-users wrote:
> Ok, the problem is with the master branch version. Checking out to v3.1.1
> makes it work again.

That is very strange. Can you do "git bisect" between v3.1.1 and
master to identify the bad commit?

Thanks,
Richard
|
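For anyone following along, the bisect requested above can be sketched as follows. This is a generic git walkthrough rather than a script to run unattended; it assumes a clone of the linuxptp repository that contains the v3.1.1 tag, and the test step (running ptp4l on the affected NIC and watching for "without timestamp" messages) has to be done by hand at each iteration.

```shell
# Run inside a clone of the linuxptp git repository.
git bisect start
git bisect bad master      # master shows the problem
git bisect good v3.1.1     # v3.1.1 is reported to work

# git now checks out a commit between the two. Build it and test it:
make clean && make
#   ... run the freshly built ./ptp4l on the affected interface ...

# Report the result of that test, then rebuild and retest the next
# commit git selects. Repeat until git prints the first bad commit:
#   git bisect good      (if this build works)
#   git bisect bad       (if this build shows the problem)

# When finished, return to the branch you started from:
git bisect reset
```

The commit that git reports at the end is the one worth posting back to the list.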
From: Lucas G. M. <lu...@kr...> - 2021-10-04 18:23:56
|
Hello,

I have a system with onboard and offboard ethernet adapters (all of them are Intel; the offboard one is an Intel xxv710-da2t). All interfaces have support for PTP with hardware timestamping. I'm running Ubuntu 20.04.3, using the current kernel version (i.e: linux-headers-5.4.0-88-generic). Both master and slave devices have the same hardware and OS configuration. The offboard NIC driver is up-to-date and working.

The situation is the following:

When I run the linuxptp package available in the Ubuntu APT repository, version 1.92, I can use all interfaces (onboard and offboard) to run PTP with hardware timestamping.

When I run the most recent linuxptp package (version 3.1.1), available in the git repository, only the onboard interfaces work with hardware timestamping. If I run it on the offboard NIC, I get the following messages in the slave:

ptp4l[1680.877]: selected /dev/ptp3 as PTP clock
ptp4l[1680.877]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[1680.877]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[1681.264]: port 1: received DELAY_REQ without timestamp
ptp4l[1681.779]: port 1: received DELAY_REQ without timestamp
ptp4l[1683.671]: port 1: received DELAY_REQ without timestamp
ptp4l[1684.184]: port 1: new foreign master b49691.fffe.a71b10-1
ptp4l[1685.183]: port 1: received SYNC without timestamp
ptp4l[1686.183]: port 1: received SYNC without timestamp
ptp4l[1687.183]: port 1: received SYNC without timestamp
ptp4l[1687.216]: port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[1687.216]: selected local clock b49691.fffe.a62da1 as best master
ptp4l[1687.216]: assuming the grand master role
ptp4l[1688.183]: port 1: received SYNC without timestamp
ptp4l[1688.184]: selected best master clock b49691.fffe.a71b10
ptp4l[1688.184]: assuming the grand master role
ptp4l[1689.183]: port 1: received SYNC without timestamp
ptp4l[1691.198]: port 1: received DELAY_REQ without timestamp
ptp4l[1692.500]: port 1: received DELAY_REQ without timestamp
[...]

And the following in the master:

ptp4l[944.408]: selected /dev/ptp1 as PTP clock
ptp4l[944.409]: port 1 (ens1f0): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[944.409]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[944.409]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[945.913]: port 1 (ens1f0): new foreign master b49691.fffe.a62da1-1
ptp4l[949.913]: selected best master clock b49691.fffe.a62da1
ptp4l[949.913]: port 1 (ens1f0): LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[994.205]: port 1 (ens1f0): UNCALIBRATED to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[994.205]: selected local clock b49691.fffe.a71b10 as best master
ptp4l[994.205]: port 1 (ens1f0): assuming the grand master role
ptp4l[999.239]: selected best master clock b49691.fffe.a62da1
ptp4l[999.239]: port 1 (ens1f0): MASTER to UNCALIBRATED on RS_SLAVE
ptp4l[1027.618]: port 1 (ens1f0): UNCALIBRATED to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[1027.618]: selected local clock b49691.fffe.a71b10 as best master
ptp4l[1027.619]: port 1 (ens1f0): assuming the grand master role

I need version 3.1.1 to use the UNICAST mode, which is not available in the 1.92 version. Do you have any idea how I can make it work? Is there any special flag to build the code? Currently I'm just running the make command, with no flags.

Below you can find more information about the NICs.

Offboard NIC (ethtool -T)

Time stamping parameters for ens1f0:
Capabilities:
        hardware-transmit     (SOF_TIMESTAMPING_TX_HARDWARE)
        software-transmit     (SOF_TIMESTAMPING_TX_SOFTWARE)
        hardware-receive      (SOF_TIMESTAMPING_RX_HARDWARE)
        software-receive      (SOF_TIMESTAMPING_RX_SOFTWARE)
        software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
        hardware-raw-clock    (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 1
Hardware Transmit Timestamp Modes:
        off                   (HWTSTAMP_TX_OFF)
        on                    (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
        none                  (HWTSTAMP_FILTER_NONE)
        ptpv1-l4-sync         (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
        ptpv1-l4-delay-req    (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
        ptpv2-l4-event        (HWTSTAMP_FILTER_PTP_V2_L4_EVENT)
        ptpv2-l4-sync         (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
        ptpv2-l4-delay-req    (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
        ptpv2-l2-event        (HWTSTAMP_FILTER_PTP_V2_L2_EVENT)
        ptpv2-l2-sync         (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
        ptpv2-l2-delay-req    (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
        ptpv2-event           (HWTSTAMP_FILTER_PTP_V2_EVENT)
        ptpv2-sync            (HWTSTAMP_FILTER_PTP_V2_SYNC)
        ptpv2-delay-req       (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)

Onboard NIC (ethtool -T)

Time stamping parameters for eno1:
Capabilities:
        hardware-transmit     (SOF_TIMESTAMPING_TX_HARDWARE)
        software-transmit     (SOF_TIMESTAMPING_TX_SOFTWARE)
        hardware-receive      (SOF_TIMESTAMPING_RX_HARDWARE)
        software-receive      (SOF_TIMESTAMPING_RX_SOFTWARE)
        software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
        hardware-raw-clock    (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
        off                   (HWTSTAMP_TX_OFF)
        on                    (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
        none                  (HWTSTAMP_FILTER_NONE)
        ptpv1-l4-event        (HWTSTAMP_FILTER_PTP_V1_L4_EVENT)
        ptpv2-l4-event        (HWTSTAMP_FILTER_PTP_V2_L4_EVENT)
        ptpv2-l2-event        (HWTSTAMP_FILTER_PTP_V2_L2_EVENT)

Offboard NIC (modinfo i40e)

filename:       /lib/modules/5.4.0-88-generic/updates/drivers/net/ethernet/intel/i40e/i40e.ko
version:        2.16.11
license:        GPL
description:    Intel(R) 40-10 Gigabit Ethernet Connection Network Driver
author:         Intel Corporation, <e10...@li...>
srcversion:     A478D718E644C30E2434DE7
alias:          pci:v00008086d0000158Bsv*sd*bc*sc*i*
alias:          pci:v00008086d0000158Asv*sd*bc*sc*i*
alias:          pci:v00008086d000037D3sv*sd*bc*sc*i*
alias:          pci:v00008086d000037D2sv*sd*bc*sc*i*
alias:          pci:v00008086d000037D1sv*sd*bc*sc*i*
alias:          pci:v00008086d000037D0sv*sd*bc*sc*i*
alias:          pci:v00008086d000037CFsv*sd*bc*sc*i*
alias:          pci:v00008086d000037CEsv*sd*bc*sc*i*
alias:          pci:v00008086d00000D58sv*sd*bc*sc*i*
alias:          pci:v00008086d00000CF8sv*sd*bc*sc*i*
alias:          pci:v00008086d00001588sv*sd*bc*sc*i*
alias:          pci:v00008086d00001587sv*sd*bc*sc*i*
alias:          pci:v00008086d0000104Fsv*sd*bc*sc*i*
alias:          pci:v00008086d0000104Esv*sd*bc*sc*i*
alias:          pci:v00008086d000015FFsv*sd*bc*sc*i*
alias:          pci:v00008086d00001589sv*sd*bc*sc*i*
alias:          pci:v00008086d00001586sv*sd*bc*sc*i*
alias:          pci:v00008086d0000101Fsv*sd*bc*sc*i*
alias:          pci:v00008086d00001585sv*sd*bc*sc*i*
alias:          pci:v00008086d00001584sv*sd*bc*sc*i*
alias:          pci:v00008086d00001583sv*sd*bc*sc*i*
alias:          pci:v00008086d00001581sv*sd*bc*sc*i*
alias:          pci:v00008086d00001580sv*sd*bc*sc*i*
alias:          pci:v00008086d00001574sv*sd*bc*sc*i*
alias:          pci:v00008086d00001572sv*sd*bc*sc*i*
depends:
retpoline:      Y
name:           i40e
vermagic:       5.4.0-84-generic SMP mod_unload modversions
parm:           debug:Debug level (0=none,...,16=all) (int)
parm:           l4mode:L4 cloud filter mode: 0=UDP,1=TCP,2=Both,-1=Disabled(default) (int)

--
Lucas Martins
CTO
lu...@kr...
+55 (19) 3112-5000
www.kryptus.com <http://www.kryptus.com/>

*This e-mail and any attachments may contain information that is confidential, proprietary, privileged or otherwise protected by law. The information contained herein is solely intended for the named addressee (or a person responsible for delivering it to the addressee). If you are not the intended recipient of this message, you are not authorized to read, print, retain, copy or disseminate this message or any part of it. If you have received this e-mail in error, please notify the sender immediately by return e-mail and delete it from your computer.*
|
From: Lucas G. M. <lu...@kr...> - 2021-10-04 18:13:17
|
Ok, the problem is with the master branch version. Checking out to v3.1.1 makes it work again. I thought the master would have the latest release. Best regards, On Mon, Oct 4, 2021 at 2:50 PM Lucas Gonçalves Martins <lu...@kr...> wrote: > Hello, > > I have a system with onboard and offboard ethernet adapters (all of them > are Intel, being the offboard one an Intel xxv710-da2t). All interfaces > have support for PTP with hardware timestamping. I'm running Ubuntu > 20.04.3, using the current kernel version (i.e: > linux-headers-5.4.0-88-generic). Both master and slave devices have the > same hardware and OS configuration. The offboard NIC driver is up-to-date, > and working. > > The situation is the following: > > When I run the linuxptp package available in the Ubuntu APT repository, > version 1.92, I can use all interfaces (onboard and offboard) to run the > PTP with hardware timestamping. > > When I run the most recent linuxptp package (version 3.1.1), available in > the git repository, only the onboard interfaces work with hardware > timestamping. 
If I run it on the offboard NIC, I get the following > message in the slave: > > ptp4l[1680.877]: selected /dev/ptp3 as PTP clock > ptp4l[1680.877]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE > ptp4l[1680.877]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE > ptp4l[1681.264]: port 1: received DELAY_REQ without timestamp > ptp4l[1681.779]: port 1: received DELAY_REQ without timestamp > ptp4l[1683.671]: port 1: received DELAY_REQ without timestamp > ptp4l[1684.184]: port 1: new foreign master b49691.fffe.a71b10-1 > ptp4l[1685.183]: port 1: received SYNC without timestamp > ptp4l[1686.183]: port 1: received SYNC without timestamp > ptp4l[1687.183]: port 1: received SYNC without timestamp > ptp4l[1687.216]: port 1: LISTENING to MASTER on > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > ptp4l[1687.216]: selected local clock b49691.fffe.a62da1 as best master > ptp4l[1687.216]: assuming the grand master role > ptp4l[1688.183]: port 1: received SYNC without timestamp > ptp4l[1688.184]: selected best master clock b49691.fffe.a71b10 > ptp4l[1688.184]: assuming the grand master role > ptp4l[1689.183]: port 1: received SYNC without timestamp > ptp4l[1691.198]: port 1: received DELAY_REQ without timestamp > ptp4l[1692.500]: port 1: received DELAY_REQ without timestamp > [...] 
> > > And the following in the master: > > ptp4l[944.408]: selected /dev/ptp1 as PTP clock > ptp4l[944.409]: port 1 (ens1f0): INITIALIZING to LISTENING on INIT_COMPLETE > ptp4l[944.409]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on > INIT_COMPLETE > ptp4l[944.409]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on > INIT_COMPLETE > ptp4l[945.913]: port 1 (ens1f0): new foreign master b49691.fffe.a62da1-1 > ptp4l[949.913]: selected best master clock b49691.fffe.a62da1 > ptp4l[949.913]: port 1 (ens1f0): LISTENING to UNCALIBRATED on RS_SLAVE > ptp4l[994.205]: port 1 (ens1f0): UNCALIBRATED to MASTER on > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > ptp4l[994.205]: selected local clock b49691.fffe.a71b10 as best master > ptp4l[994.205]: port 1 (ens1f0): assuming the grand master role > ptp4l[999.239]: selected best master clock b49691.fffe.a62da1 > ptp4l[999.239]: port 1 (ens1f0): MASTER to UNCALIBRATED on RS_SLAVE > ptp4l[1027.618]: port 1 (ens1f0): UNCALIBRATED to MASTER on > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > ptp4l[1027.618]: selected local clock b49691.fffe.a71b10 as best master > ptp4l[1027.619]: port 1 (ens1f0): assuming the grand master role > > > I need the version 3.1.1 to use the UNICAST mode,which is not available in > the 1.92 version. Do you have any idea how I can make it work? Is there any > special flag to build the code? Currently I'm just running the make > command, with no flags. > > Below you can find more information about the NICs. 
> > Offboard NIC (ethtool -T) > > Time stamping parameters for ens1f0: > Capabilities: > hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE) > software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE) > hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE) > software-receive (SOF_TIMESTAMPING_RX_SOFTWARE) > software-system-clock (SOF_TIMESTAMPING_SOFTWARE) > hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE) > PTP Hardware Clock: 1 > Hardware Transmit Timestamp Modes: > off (HWTSTAMP_TX_OFF) > on (HWTSTAMP_TX_ON) > Hardware Receive Filter Modes: > none (HWTSTAMP_FILTER_NONE) > ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC) > ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ) > ptpv2-l4-event (HWTSTAMP_FILTER_PTP_V2_L4_EVENT) > ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC) > ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ) > ptpv2-l2-event (HWTSTAMP_FILTER_PTP_V2_L2_EVENT) > ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC) > ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ) > ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT) > ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC) > ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ) > > > Onboard NIC (ethtool -T) > > Time stamping parameters for eno1: > Capabilities: > hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE) > software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE) > hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE) > software-receive (SOF_TIMESTAMPING_RX_SOFTWARE) > software-system-clock (SOF_TIMESTAMPING_SOFTWARE) > hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE) > PTP Hardware Clock: 0 > Hardware Transmit Timestamp Modes: > off (HWTSTAMP_TX_OFF) > on (HWTSTAMP_TX_ON) > Hardware Receive Filter Modes: > none (HWTSTAMP_FILTER_NONE) > ptpv1-l4-event (HWTSTAMP_FILTER_PTP_V1_L4_EVENT) > ptpv2-l4-event (HWTSTAMP_FILTER_PTP_V2_L4_EVENT) > ptpv2-l2-event (HWTSTAMP_FILTER_PTP_V2_L2_EVENT) > > > Offboard NIC (modinfo i40e) > > filename: > /lib/modules/5.4.0-88-generic/updates/drivers/net/ethernet/intel/i40e/i40e.ko > version: 2.16.11 > 
license: GPL > description: Intel(R) 40-10 Gigabit Ethernet Connection Network Driver > author: Intel Corporation, <e10...@li...> > srcversion: A478D718E644C30E2434DE7 > alias: pci:v00008086d0000158Bsv*sd*bc*sc*i* > alias: pci:v00008086d0000158Asv*sd*bc*sc*i* > alias: pci:v00008086d000037D3sv*sd*bc*sc*i* > alias: pci:v00008086d000037D2sv*sd*bc*sc*i* > alias: pci:v00008086d000037D1sv*sd*bc*sc*i* > alias: pci:v00008086d000037D0sv*sd*bc*sc*i* > alias: pci:v00008086d000037CFsv*sd*bc*sc*i* > alias: pci:v00008086d000037CEsv*sd*bc*sc*i* > alias: pci:v00008086d00000D58sv*sd*bc*sc*i* > alias: pci:v00008086d00000CF8sv*sd*bc*sc*i* > alias: pci:v00008086d00001588sv*sd*bc*sc*i* > alias: pci:v00008086d00001587sv*sd*bc*sc*i* > alias: pci:v00008086d0000104Fsv*sd*bc*sc*i* > alias: pci:v00008086d0000104Esv*sd*bc*sc*i* > alias: pci:v00008086d000015FFsv*sd*bc*sc*i* > alias: pci:v00008086d00001589sv*sd*bc*sc*i* > alias: pci:v00008086d00001586sv*sd*bc*sc*i* > alias: pci:v00008086d0000101Fsv*sd*bc*sc*i* > alias: pci:v00008086d00001585sv*sd*bc*sc*i* > alias: pci:v00008086d00001584sv*sd*bc*sc*i* > alias: pci:v00008086d00001583sv*sd*bc*sc*i* > alias: pci:v00008086d00001581sv*sd*bc*sc*i* > alias: pci:v00008086d00001580sv*sd*bc*sc*i* > alias: pci:v00008086d00001574sv*sd*bc*sc*i* > alias: pci:v00008086d00001572sv*sd*bc*sc*i* > depends: > retpoline: Y > name: i40e > vermagic: 5.4.0-84-generic SMP mod_unload modversions > parm: debug:Debug level (0=none,...,16=all) (int) > parm: l4mode:L4 cloud filter mode: > 0=UDP,1=TCP,2=Both,-1=Disabled(default) (int) > > > -- > Lucas Martins > CTO > > lu...@kr... > +55 (19) 3112-5000 > > www.kryptus.com > > <http://www.kryptus.com/> > > > > *Este e-mail e quaisquer anexos podem conter informação confidencial, > proprietária, privilegiada, classificada ou protegida por Lei. A informação > aqui contida é destinada exclusivamente para os destinatários nominados (ou > para a pessoa responsável por entregar a informação para o destinatário). 
> Se você não é o destinatário pretendido desta mensagem então você não está > autorizado a ler, imprimir, reter, copiar ou disseminar esta mensagem na > íntegra ou mesmo parcialmente. Se você recebeu este e-mail erroneamente, > por favor notifique o remetente e remova a mesma de sua caixa postal e > dispositivos.This e-mail and any attachments may contain information that > is confidential, proprietary, privileged or otherwise protected by law. The > information contained herein is solely intended for the named addressee (or > a person responsible for delivering it to the addressee).If you are not the > intended recipient of this message, you are not authorized to read, print, > retain, copy or disseminate this message or any part of it. If you have > received this e-mail in error, please notify the sender immediately by > return e-mail and delete it from your computer.* > -- Lucas Martins CTO lu...@kr... +55 (19) 3112-5000 www.kryptus.com <http://www.kryptus.com/> *Este e-mail e quaisquer anexos podem conter informação confidencial, proprietária, privilegiada, classificada ou protegida por Lei. A informação aqui contida é destinada exclusivamente para os destinatários nominados (ou para a pessoa responsável por entregar a informação para o destinatário). Se você não é o destinatário pretendido desta mensagem então você não está autorizado a ler, imprimir, reter, copiar ou disseminar esta mensagem na íntegra ou mesmo parcialmente. Se você recebeu este e-mail erroneamente, por favor notifique o remetente e remova a mesma de sua caixa postal e dispositivos.This e-mail and any attachments may contain information that is confidential, proprietary, privileged or otherwise protected by law. 
The information contained herein is solely intended for the named addressee (or a person responsible for delivering it to the addressee).If you are not the intended recipient of this message, you are not authorized to read, print, retain, copy or disseminate this message or any part of it. If you have received this e-mail in error, please notify the sender immediately by return e-mail and delete it from your computer.* |
From: 廖書華 <sim...@gm...> - 2021-10-04 12:38:58
|
Dear all,

Has anyone used the following three NICs while running ptp4l with HW timestamping?

Intel Corporation Ethernet Controller X710 for 10GbE SFP+
Intel Corporation Ethernet Connection X722 for 10GBASE-T
Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+

On all of them we get "received SYNC/PDELAY_REQ/PDELAY_RESP without timestamp", even though the ethtool output suggests that they support HW timestamping.

Thanks in advance for any help you are able to provide!

Best Regards,
Shu-hua, Liao
|
From: Richard C. <ric...@gm...> - 2021-10-02 13:33:08
|
On Thu, Aug 19, 2021 at 07:03:36AM +0000, ramesh t via Linuxptp-devel wrote:
> hi,
>
> We are planning to upgrade ptp4l to latest version (3.1.1), had few questions
>
> 1) From which release onwards of ptp4l IEEE 1588-2019 is supported?

all published versions from v1 onward

> 2) Is IEEE 1588-2019 backward compatible with switches and NIC supporting IEEE 1588-2008 version?

yes

> 3) After moving to latest version 3.1.1, was there any compatibility issues that was found?

no |
From: Keller, J. E <jac...@in...> - 2021-10-01 21:49:24
|
> -----Original Message-----
> From: Wong, Vee Khee <vee...@in...>
> Sent: Friday, October 01, 2021 4:15 AM
> To: 王佳磊 <lin...@gm...>; lin...@li...
> Subject: Re: [Linuxptp-users] HW timestampt: SYNC without timestamp
>
> On Friday, October 1, 2021 18:38, 王佳磊 wrote:
>
> > Dear all,
> > When I am running ptp4l by HW timestamp, I also face the issue without timestamp.
> >
> > Dell-server
> > And my Nic is: Intel(R) Ethernet 10G 4P X710 SFP+
> > Driver and firmware:
> >
> > So I don;t know how to solve this problem.
>

You might try checking ethtool stats and seeing if anything with 'tstamp' appears to have incrementing values. I believe the i40e driver reports an error statistic related to this...

> > Can you give some suggestion?
>
> Are you running with one-step PTP, else it's weird that ptp4l expects
> Timestamp on a SYNC msg.
>
> Maybe try to run with '-f configs/default.cfg' ?
>
> >
> > Best Regards,
> > Jia-Lei,Wang
>
> _______________________________________________
> Linuxptp-users mailing list
> Lin...@li...
> https://lists.sourceforge.net/lists/listinfo/linuxptp-users |
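The suggestion above (watch the 'tstamp' counters in the ethtool statistics) can be sketched roughly as follows. This assumes two snapshots of `ethtool -S <interface>` text taken some seconds apart, in the usual `name: value` layout. The helper names and the counter names in the sample are illustrative assumptions; the exact statistics a given driver exposes should be checked on the actual system.

```python
def parse_stats(text):
    """Parse 'name: value' lines from `ethtool -S` text into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        # Skip headers such as "NIC statistics:" and any non-numeric lines.
        if sep and value.strip().isdigit():
            stats[name.strip()] = int(value)
    return stats


def incrementing_tstamp_counters(before, after):
    """Return {counter: increase} for counters containing 'tstamp' that grew."""
    old, new = parse_stats(before), parse_stats(after)
    return {name: new[name] - old.get(name, 0)
            for name in new
            if "tstamp" in name and new[name] > old.get(name, 0)}


# Illustrative snapshots only; counter names and values are made up.
BEFORE = """NIC statistics:
     rx_packets: 1000
     tx_hwtstamp_timeouts: 0
     rx_hwtstamp_cleared: 3
"""
AFTER = """NIC statistics:
     rx_packets: 1500
     tx_hwtstamp_timeouts: 0
     rx_hwtstamp_cleared: 9
"""

if __name__ == "__main__":
    print(incrementing_tstamp_counters(BEFORE, AFTER))
```

A counter that keeps growing while ptp4l reports missing timestamps would point at the driver or hardware dropping timestamps rather than at the ptp4l configuration, though that interpretation is only a starting point for debugging.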
From: Keller, J. E <jac...@in...> - 2021-10-01 21:47:52
|
> -----Original Message-----
> From: Wong, Vee Khee <vee...@in...>
> Sent: Friday, October 01, 2021 4:15 AM
> To: 王佳磊 <lin...@gm...>; lin...@li...
> Subject: Re: [Linuxptp-users] HW timestampt: SYNC without timestamp
>
> On Friday, October 1, 2021 18:38, 王佳磊 wrote:
>
> > Dear all,
> > When I am running ptp4l by HW timestamp, I also face the issue without timestamp.
> >
> > Dell-server
> > And my Nic is: Intel(R) Ethernet 10G 4P X710 SFP+
> > Driver and firmware:
> >
> > So I don;t know how to solve this problem.
> >
> > Can you give some suggestion?
>
> Are you running with one-step PTP, else it's weird that ptp4l expects
> Timestamp on a SYNC msg.

PTP protocol gets a receive timestamp for sync messages, I believe? The transmit timestamp is submitted as part of follow-up, but this error message is about receive timestamps.

Thanks,
Jake

>
> Maybe try to run with '-f configs/default.cfg' ?
>
> >
> > Best Regards,
> > Jia-Lei,Wang
>
> _______________________________________________
> Linuxptp-users mailing list
> Lin...@li...
> https://lists.sourceforge.net/lists/listinfo/linuxptp-users |
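To illustrate why the receive timestamp matters here, the following is a simplified sketch of the standard delay request-response arithmetic, not ptp4l's actual code. In two-step operation t1 (the master's transmit time) arrives in the Follow_Up, while t2 is the local receive timestamp of the Sync itself; t3 and t4 are the Delay_Req transmit and receive times. Correction fields are ignored and the path is assumed symmetric; the function names are made up for this example.

```python
def mean_path_delay(t1, t2, t3, t4):
    """Delay request-response estimate, assuming a symmetric path."""
    return ((t2 - t1) + (t4 - t3)) / 2


def offset_from_master(t1, t2, delay):
    """Offset of the local clock from the master for one Sync sample."""
    if t2 is None:
        # Without a receive timestamp on the Sync there is nothing to
        # compare against t1, so the sample cannot be used.
        raise ValueError("Sync has no receive timestamp; sample unusable")
    return (t2 - t1) - delay


if __name__ == "__main__":
    # Made-up times in nanoseconds: local clock 400 ns ahead, 100 ns delay.
    t1, t2, t3, t4 = 1000, 1500, 2000, 1700
    delay = mean_path_delay(t1, t2, t3, t4)
    print("delay:", delay, "offset:", offset_from_master(t1, t2, delay))
```

With these made-up numbers, t2 - t1 contains the offset plus the delay and t4 - t3 contains the delay minus the offset, so averaging the two isolates the delay. A Sync that arrives without t2 leaves the offset undetermined even when the Follow_Up delivers t1 correctly.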