linuxptp-users Mailing List for linuxptp (Page 158)
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: Mario M. <Mar...@we...> - 2012-11-25 05:38:08

>> For IPv6 tests I have changed PTP_PRIMARY_MCAST_IP6ADDR to a link-local
>> address.
>> (#define PTP_PRIMARY_MCAST_IP6ADDR "FF02:0:0:0:0:0:0:181")
>
> Why do you think this should be a link local address?

I used the link-local scope in order to reach only the modules in a closed network segment.

> (I chose the global scope in order to maximise the chance of reaching
> a master when running slave, and vice versa. I confess that I don't
> really know what scope is correct to use here.

That is correct and makes sense. In some cases the user may want to change this, but that is not the point at the moment.

>> After this change I saw that the communication, and specifically
>> udp6_send, did not work.
>>
>> I reviewed the code of udp6_send and saw that the sin6_scope_id field
>> of struct sockaddr_in6 addr is not initialized correctly.
>
> Are you running E2E mode?

Yes, I used E2E. But the problem should also appear with P2P.

> It looks like we will have to set sin6_scope_id for the Pdelay
> messages in P2P mode in any case. I will do some testing on this to
> find out.

Ok. Thanks.

>> +        ip6_scope_id = if_nametoindex(name);
>
> This won't work if you have more than one port. The particular index
> must be remembered in a private field derived from the struct
> transport.

Ok. Thanks for the hint. Yes, it is better to use the struct transport.

>> +        memset(&addr, 0, sizeof(struct sockaddr_in6)); /* Init */
>
> This makes sense in any case. Better to use variable name, and the
> comment isn't helpful.
>
>         memset(&addr, 0, sizeof(addr));

That is correct; the comment is not necessary and not helpful. Thanks for the hint.

>> +
>   ^
> That is a stray tab.

Ok.

Thanks,
Mario
From: Richard C. <ric...@gm...> - 2012-11-24 20:07:10

On Fri, Nov 23, 2012 at 03:33:03PM +0100, Mario Molitor wrote:
> Hello Richard,
> I have observed that the ptp4l daemon seems to have a problem with a link-local address for IPv6.
> For IPv6 tests I have changed PTP_PRIMARY_MCAST_IP6ADDR to a link-local address.
> (#define PTP_PRIMARY_MCAST_IP6ADDR "FF02:0:0:0:0:0:0:181")

Why do you think this should be a link local address?

(I chose the global scope in order to maximise the chance of reaching a master when running slave, and vice versa. I confess that I don't really know what scope is correct to use here.)

> After this change I saw that the communication, and specifically udp6_send, did not work.
>
> I reviewed the code of udp6_send and saw that the
> sin6_scope_id field of struct sockaddr_in6 addr is not initialized correctly.

Are you running E2E mode?

It looks like we will have to set sin6_scope_id for the Pdelay messages in P2P mode in any case. I will do some testing on this to find out.

> I have corrected this and the communication problem disappears.
>
> --- a/udp6.c
> +++ b/udp6.c
> @@ -135,6 +135,8 @@ enum { MC_PRIMARY, MC_PDELAY };
>
>  static struct in6_addr mc6_addr[2];
>
> +static unsigned ip6_scope_id;
> +
>  static int udp6_open(struct transport *t, char *name, struct fdarray *fda,
>                       enum timestamp_type ts_type)
>  {
> @@ -156,7 +158,8 @@ static int udp6_open(struct transport *t, char *name, struct fdarray *fda,
>
>          if (sk_timestamping_init(efd, name, ts_type, TRANS_UDP_IPV6))
>                  goto no_timestamping;
> -
> +
> +        ip6_scope_id = if_nametoindex(name);

This won't work if you have more than one port. The particular index must be remembered in a private field derived from the struct transport.

>          fda->fd[FD_EVENT] = efd;
>          fda->fd[FD_GENERAL] = gfd;
>          return 0;
> @@ -181,11 +184,14 @@ static int udp6_send(struct transport *t, struct fdarray *fda, int event, int pe
>          ssize_t cnt;
>          int fd = event ? fda->fd[FD_EVENT] : fda->fd[FD_GENERAL];
>          struct sockaddr_in6 addr;
> +        memset(&addr, 0, sizeof(struct sockaddr_in6)); /* Init */

This makes sense in any case. Better to use variable name, and the comment isn't helpful.

        memset(&addr, 0, sizeof(addr));

> +
   ^
That is a stray tab.

Thanks,
Richard
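
The two review points above (keep the interface index per port instead of in a file-scope variable, and zero the address with `sizeof(addr)`) can be sketched roughly as follows. This is only an illustration, not the change that was eventually merged: `struct udp6_port`, `udp6_port_init` and `udp6_port_dest` are hypothetical names, and the real code would keep the index in a private structure associated with its `struct transport`.

```c
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical per-port state (assumption): one instance per port, so
 * two ports on different interfaces never share a scope id. */
struct udp6_port {
    unsigned int if_index;  /* interface index, used as sin6_scope_id */
};

/* Remember the interface index for one port.
 * Returns 0 on success, -1 if the interface name is unknown. */
static int udp6_port_init(struct udp6_port *p, const char *ifname)
{
    p->if_index = if_nametoindex(ifname);
    return p->if_index ? 0 : -1;
}

/* Build a fully initialized destination address for this port.
 * Returns 0 on success, -1 if 'group' is not a valid IPv6 address. */
static int udp6_port_dest(const struct udp6_port *p, const char *group,
                          unsigned short port, struct sockaddr_in6 *addr)
{
    memset(addr, 0, sizeof(*addr));     /* size taken from the object */
    addr->sin6_family = AF_INET6;
    addr->sin6_port = htons(port);
    addr->sin6_scope_id = p->if_index;  /* needed for link-local scope */
    return inet_pton(AF_INET6, group, &addr->sin6_addr) == 1 ? 0 : -1;
}
```

A caller would run `udp6_port_init()` once when the port is opened and `udp6_port_dest()` before each `sendto()`, so the scope id always belongs to the port that is sending.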
From: Mario M. <mar...@we...> - 2012-11-23 14:33:13

Hello Richard,

I have observed that the ptp4l daemon seems to have a problem with a link-local address for IPv6.

For IPv6 tests I have changed PTP_PRIMARY_MCAST_IP6ADDR to a link-local address.
(#define PTP_PRIMARY_MCAST_IP6ADDR "FF02:0:0:0:0:0:0:181")

After this change I saw that the communication, and specifically udp6_send, did not work.

I reviewed the code of udp6_send and saw that the sin6_scope_id field of struct sockaddr_in6 addr is not initialized correctly.

I have corrected this and the communication problem disappears.

--- a/udp6.c
+++ b/udp6.c
@@ -135,6 +135,8 @@ enum { MC_PRIMARY, MC_PDELAY };

 static struct in6_addr mc6_addr[2];

+static unsigned ip6_scope_id;
+
 static int udp6_open(struct transport *t, char *name, struct fdarray *fda,
                      enum timestamp_type ts_type)
 {
@@ -156,7 +158,8 @@ static int udp6_open(struct transport *t, char *name, struct fdarray *fda,

         if (sk_timestamping_init(efd, name, ts_type, TRANS_UDP_IPV6))
                 goto no_timestamping;
-
+
+        ip6_scope_id = if_nametoindex(name);
         fda->fd[FD_EVENT] = efd;
         fda->fd[FD_GENERAL] = gfd;
         return 0;
@@ -181,11 +184,14 @@ static int udp6_send(struct transport *t, struct fdarray *fda, int event, int pe
         ssize_t cnt;
         int fd = event ? fda->fd[FD_EVENT] : fda->fd[FD_GENERAL];
         struct sockaddr_in6 addr;
+        memset(&addr, 0, sizeof(struct sockaddr_in6)); /* Init */
+
         unsigned char junk[1600];

         addr.sin6_family = AF_INET6;
         addr.sin6_addr = peer ? mc6_addr[MC_PDELAY] : mc6_addr[MC_PRIMARY];
         addr.sin6_port = htons(event ? EVENT_PORT : GENERAL_PORT);
+        addr.sin6_scope_id = ip6_scope_id;

         len += 2; /* Extend the payload by two, for UDP checksum corrections. */
--

Could you please check and review my changes? I would appreciate your opinion and ideas.
You can find the changes as a patch file attached to this email.

Best regards and thanks,
Mario
From: Keller, J. E <jac...@in...> - 2012-11-05 17:44:41

I tested igb and ixgbe using a printout that told me how many retries each time stamp took. Most of them took 2-5 retries, with a few large anomalies. I want to change the default to maybe 20 or 25, as this should reduce false positives without taking significantly more time in the longer cases.

- Jake

> -----Original Message-----
> From: Stephan Gatzka [mailto:ste...@gm...]
> Sent: Saturday, November 03, 2012 2:45 AM
> To: Richard Cochran
> Cc: Keller, Jacob E; lin...@li...
> Subject: Re: [Linuxptp-users] Fw: “Resource temporarily unavailable”
> errors during flood ping test
>
> >
> > I just retested the PHYTER under a ping flood, and there were no
> > hiccups. I will test the CPTS again when I get a chance.
> >
>
> We also have this National PHYTER, but connected to a 400 MHz MPC5200b.
> We see these hiccups with the default value.
>
> Regards,
>
> Stephan
From: Stephan G. <ste...@gm...> - 2012-11-03 09:45:26

>
> I just retested the PHYTER under a ping flood, and there were no
> hiccups. I will test the CPTS again when I get a chance.
>

We also have this National PHYTER, but connected to a 400 MHz MPC5200b.
We see these hiccups with the default value.

Regards,

Stephan
From: Delio B. <dbr...@au...> - 2012-11-01 17:32:39

Hello Richard,

On Nov 1, 2012, at 9:53 AM, Richard Cochran <ric...@gm...> wrote:

> On Tue, Oct 30, 2012 at 11:02:21PM +0000, Keller, Jacob E wrote:
>>
>> It would take an insane amount of work to move to a model that
>> allows receive handling inside sk.c, and I believe it isn't worth
>> the effort. I would however like to increase the default
>> tx_timestamp_retries value, as 2 tries rarely works for anything
>> I've tested, as it generates false positive errors a lot.
>
> Okay, what value is a safe default, in your experience?
>
>> What hardware are you using that doesn't have issues at 2 retries?
>
> The PHYTER almost never loses a Tx time stamp, and the TI CPTS seems
> to be working perfectly, too. I don't have the Freescale eTSEC (gianfar)

Unfortunately I can trigger this issue (Tx time stamp loss) by transferring a large file via ssh on my DM8148 based board (CPSW+CPTS) using the default value for sk_tx_retries. I can enable extra debug flags and investigate whether the time stamp is lost at the CPTS level, if that is of any help.

Regards
--
Delio Brignoli
Audioscience Inc
From: Jacob K. <jac...@in...> - 2012-11-01 16:54:28

On 11/01/2012 01:53 AM, Richard Cochran wrote:
> On Tue, Oct 30, 2012 at 11:02:21PM +0000, Keller, Jacob E wrote:
>>
>> It would take an insane amount of work to move to a model that
>> allows receive handling inside sk.c, and I believe it isn't worth
>> the effort. I would however like to increase the default
>> tx_timestamp_retries value, as 2 tries rarely works for anything
>> I've tested, as it generates false positive errors a lot.
>
> Okay, what value is a safe default, in your experience?
>
>> What hardware are you using that doesn't have issues at 2 retries?
>
> The PHYTER almost never loses a Tx time stamp, and the TI CPTS seems
> to be working perfectly, too. I don't have the Freescale eTSEC (gianfar)
> for testing, but I remember that it always worked, since the Tx time
> stamp is delivered into packet buffer's padding.
>
> With the IGB, sometimes it seemed that 2 retries is okay, but
> sometimes I needed to ramp this up.
>
>> And have you attempted testing this under moderate stress?
>
> I just retested the PHYTER under a ping flood, and there were no
> hiccups. I will test the CPTS again when I get a chance.
>
> Thanks,
> Richard
>

On a 10GbE link, I've only needed to go up to about 20 or 25 to remove most of the contention. Our validation team increased this to 200, as they really didn't want to see false positives and it seemed better to wait longer. For the 1Gb parts I don't have a figure personally, but I will ask Matthew.

I am OK with needing to ramp up the value sometimes, but I would much prefer a default that means fewer users have to change something, as the value seemed cryptic to the people I had to explain it to.

An interesting thought I just had: how difficult would it be for the software to automatically increase the retry limit if it misses a bunch of Tx time stamps in a row? It would increase a counter as it missed Tx time stamps, so that it would start low but grow to the value the hardware needs in order to respond in time. I'm thinking it would only trigger if you missed a few in a row, as a one-time miss isn't really a big deal.

Thanks

- Jake
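
The self-tuning idea in the last paragraph could look something like the sketch below. It is only one possible reading of the proposal, not code from linuxptp: `struct tx_retry_state`, its fields, and the doubling policy are all assumptions made for illustration.

```c
/* Hypothetical state for adapting the retry limit at run time. */
struct tx_retry_state {
    int retries;      /* current retry limit */
    int max_retries;  /* upper bound the limit may grow to */
    int misses;       /* consecutive missed Tx time stamps so far */
    int threshold;    /* misses in a row before the limit grows */
};

/* Record the outcome of one Tx time stamp attempt.
 * A success clears the miss streak. Only a streak of 'threshold'
 * misses doubles the limit, so a one-off miss changes nothing. */
static void tx_retry_update(struct tx_retry_state *s, int got_timestamp)
{
    if (got_timestamp) {
        s->misses = 0;
        return;
    }
    if (++s->misses < s->threshold)
        return;
    s->misses = 0;
    s->retries *= 2;
    if (s->retries > s->max_retries)
        s->retries = s->max_retries;
}
```

Starting from the default of 2 with a cap of 200 and a threshold of 3, the limit would step through 4, 8, 16 and so on only while the hardware keeps missing, and would stay put once time stamps arrive reliably.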
From: Mario M. <mar...@em...> - 2012-11-01 15:01:15

Yes, increasing tx_timestamp_retries is the correct way. Thanks for the hint.

My observation is that increasing this variable to 3 seems to help. In most of the cases that need a retry, a single retry seems to be enough (my test platform is MPC5200/DP83640). Under a moderate stress test, no retries seem to be necessary.

Thanks,
Mario

>
> On Tue, Oct 30, 2012 at 11:02:21PM +0000, Keller, Jacob E wrote:
> >
> > It would take an insane amount of work to move to a model that
> > allows receive handling inside sk.c, and I believe it isn't worth
> > the effort. I would however like to increase the default
> > tx_timestamp_retries value, as 2 tries rarely works for anything
> > I've tested, as it generates false positive errors a lot.
>
> Okay, what value is a safe default, in your experience?
>
> > What hardware are you using that doesn't have issues at 2 retries?
>
> The PHYTER almost never loses a Tx time stamp, and the TI CPTS seems
> to be working perfectly, too. I don't have the Freescale eTSEC (gianfar)
> for testing, but I remember that it always worked, since the Tx time
> stamp is delivered into packet buffer's padding.
>
> With the IGB, sometimes it seemed that 2 retries is okay, but
> sometimes I needed to ramp this up.
>
> > And have you attempted testing this under moderate stress?
>
> I just retested the PHYTER under a ping flood, and there were no
> hiccups. I will test the CPTS again when I get a chance.
>
> Thanks,
> Richard
>
From: Mario M. <mar...@we...> - 2012-11-01 15:00:54

Yes, increasing tx_timestamp_retries is the correct way. Thanks for the hint.

My observation is that increasing this variable to 3 seems to help. In most of the cases that need a retry, a single retry seems to be enough (my test platform is MPC5200/DP83640). Under a moderate stress test, no retries seem to be necessary.

Thanks,
Mario

> Sent: Thursday, 01 November 2012 at 09:53
> From: "Richard Cochran" <ric...@gm...>
> To: "Keller, Jacob E" <jac...@in...>
> Cc: "lin...@li..." <lin...@li...>
> Subject: Re: [Linuxptp-users] Fw: “Resource temporarily unavailable” errors during flood ping test
>
> On Tue, Oct 30, 2012 at 11:02:21PM +0000, Keller, Jacob E wrote:
> >
> > It would take an insane amount of work to move to a model that
> > allows receive handling inside sk.c, and I believe it isn't worth
> > the effort. I would however like to increase the default
> > tx_timestamp_retries value, as 2 tries rarely works for anything
> > I've tested, as it generates false positive errors a lot.
>
> Okay, what value is a safe default, in your experience?
>
> > What hardware are you using that doesn't have issues at 2 retries?
>
> The PHYTER almost never loses a Tx time stamp, and the TI CPTS seems
> to be working perfectly, too. I don't have the Freescale eTSEC (gianfar)
> for testing, but I remember that it always worked, since the Tx time
> stamp is delivered into packet buffer's padding.
>
> With the IGB, sometimes it seemed that 2 retries is okay, but
> sometimes I needed to ramp this up.
>
> > And have you attempted testing this under moderate stress?
>
> I just retested the PHYTER under a ping flood, and there were no
> hiccups. I will test the CPTS again when I get a chance.
>
> Thanks,
> Richard
From: Richard C. <ric...@gm...> - 2012-11-01 08:53:18

On Tue, Oct 30, 2012 at 11:02:21PM +0000, Keller, Jacob E wrote:
>
> It would take an insane amount of work to move to a model that
> allows receive handling inside sk.c, and I believe it isn't worth
> the effort. I would however like to increase the default
> tx_timestamp_retries value, as 2 tries rarely works for anything
> I've tested, as it generates false positive errors a lot.

Okay, what value is a safe default, in your experience?

> What hardware are you using that doesn't have issues at 2 retries?

The PHYTER almost never loses a Tx time stamp, and the TI CPTS seems to be working perfectly, too. I don't have the Freescale eTSEC (gianfar) for testing, but I remember that it always worked, since the Tx time stamp is delivered into packet buffer's padding.

With the IGB, sometimes it seemed that 2 retries is okay, but sometimes I needed to ramp this up.

> And have you attempted testing this under moderate stress?

I just retested the PHYTER under a ping flood, and there were no hiccups. I will test the CPTS again when I get a chance.

Thanks,
Richard
From: Keller, J. E <jac...@in...> - 2012-10-31 19:14:04
|
> -----Original Message----- > From: Jonatan Walck [mailto:jw...@ne...] > Sent: Wednesday, October 31, 2012 12:30 AM > To: lin...@li... > Subject: Re: [Linuxptp-users] Fw: “Resource temporarily unavailable” > errors during flood ping test > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 10/30/2012 10:23 PM, Keller, Jacob E wrote: > > > > > >> -----Original Message----- From: Richard Cochran > >> [mailto:ric...@gm...] Sent: Tuesday, October 30, 2012 > >> 1:48 PM To: Keller, Jacob E Cc: > >> lin...@li... Subject: Re: > >> [Linuxptp-users] Fw: “Resource temporarily unavailable” errors > >> during flood ping test > >> > >> On Tue, Oct 30, 2012 at 06:23:53PM +0100, Richard Cochran wrote: > >>> On Tue, Oct 30, 2012 at 09:54:30AM -0700, Jacob Keller wrote: > >>>> > >>>> I believe the true correct answer is to completely > >>>> re-architect the tx_hwtstamp to be asynchronous, so that it > >>>> just waits until it > >> receives > >>>> the timestamp for a complete sequence of events. That design > >>>> is significantly more difficult to write though. > >>> > >>> But even if we did that way, it would not really be a better > >>> solution. Think about your own Intel cards. They would end up > >>> missing Tx time stamps and possibly mixing them up due to the > >>> hardware limitation of having a Tx time stamp FIFO of depth > >>> one. > >> > >> This may be the wrong list, but this reminds me of an issue with > >> the Intel hardware that I have been meaning to ask you about. > >> The igb driver has always had the following comment regarding > >> transmit time stamps: > >> > >> * If we were asked to do hardware stamping and such a time stamp > >> is * available, then it must have been for this skb here because > >> we only * allow only one such packet into the queue. > >> > >> This statement wasn't actually true up until recently, when > >> Matthew Vick added some code that enforced the one packet limit. 
> >> > >> If I am not mistaken, the ixgb also would need some kind of > >> guard against the case when a user program sends two or more > >> event packets in a row, would it not? > > > > > > Short answer: that limit is enforced by the hardware (it disables > > time stamping as long as the RXTSTMP register is locked), except in > > the mode that puts time stamp directly into packet buffer. > > > > > > Long answer: > > > > That comment actually refers to hardware design for the 82576 > > device. Basically, a packet is time stamped and the register stores > > RXHWTSTMP and sets the bit in the descriptor plus the bit in > > TSYNCRXCTL. > > > > No more than one packet will have the bit set in the descriptor, > > because time stamping is disabled when there is a valid stamp in > > the RXHWTSTAMP registers, so that packet must match the timestamp > > in the registers. > > > > There was some queuing code but this actually turns out to be bogus > > and did nothing of value, and I've petitioned to have it removed. > > > > for 10Gbe, I added the ptp_match function to prevent the case where > > a time stamped packet is dropped. > > > > The one-per-queue basically occurs because hardware design > > timestamps the packet, puts timestamp in registers, and indicates > > which packet got time stamped. There's no need for more correlation > > because the descriptor indicates which packet got time stamped, and > > as long as you don't read the RXTSTMP registers they remain locked > > and hardware won't timestamp another packet until you unlock the > > RXTSTMP registers. The ptp_match is necessary in the very rare case > > that a time stamped ptp packet never reaches the driver. 
> > (it will find the next ptp packet that should have been time stamped
> > according to the timestamp mode, and then clear timestamps so that
> > the error case causing timestamps to stop forever is avoided)
> >
> > For the 82580 part timestamps are stored in the packet buffer
> > avoiding the issue entirely.
> >
> > So, the only guard necessary is the ptp_match function to prevent
> > that condition. If there is a timestamp in the registers, hardware
> > doesn't timestamp again until the user reads the timestamp value
> > out. Two rapid event packets in succession will cause the first
> > arrived to be time stamped and the second to not be time stamped.
> >
> >>
> >> Thanks, Richard
> >
>
> Now that we are on the subject, have I understood correctly that
> 82599(ES) suffers from the same hardware design as the 82576? That is,
> timestamps in a separate register rather than the possibility to be
> strapped on to each packet in the queue?
>
> Are there any 10GE cards with packet buffer timestamping?

At this time, there are no 10Gb cards with per-packet timestamping in the buffer.

- Jake
From: Jonatan W. <jw...@ne...> - 2012-10-31 07:49:26
|
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/30/2012 10:23 PM, Keller, Jacob E wrote: > > >> -----Original Message----- From: Richard Cochran >> [mailto:ric...@gm...] Sent: Tuesday, October 30, 2012 >> 1:48 PM To: Keller, Jacob E Cc: >> lin...@li... Subject: Re: >> [Linuxptp-users] Fw: “Resource temporarily unavailable” errors >> during flood ping test >> >> On Tue, Oct 30, 2012 at 06:23:53PM +0100, Richard Cochran wrote: >>> On Tue, Oct 30, 2012 at 09:54:30AM -0700, Jacob Keller wrote: >>>> >>>> I believe the true correct answer is to completely >>>> re-architect the tx_hwtstamp to be asynchronous, so that it >>>> just waits until it >> receives >>>> the timestamp for a complete sequence of events. That design >>>> is significantly more difficult to write though. >>> >>> But even if we did that way, it would not really be a better >>> solution. Think about your own Intel cards. They would end up >>> missing Tx time stamps and possibly mixing them up due to the >>> hardware limitation of having a Tx time stamp FIFO of depth >>> one. >> >> This may be the wrong list, but this reminds me of an issue with >> the Intel hardware that I have been meaning to ask you about. >> The igb driver has always had the following comment regarding >> transmit time stamps: >> >> * If we were asked to do hardware stamping and such a time stamp >> is * available, then it must have been for this skb here because >> we only * allow only one such packet into the queue. >> >> This statement wasn't actually true up until recently, when >> Matthew Vick added some code that enforced the one packet limit. >> >> If I am not mistaken, the ixgb also would need some kind of >> guard against the case when a user program sends two or more >> event packets in a row, would it not? > > > Short answer: that limit is enforced by the hardware (it disables > time stamping as long as the RXTSTMP register is locked), except in > the mode that puts time stamp directly into packet buffer. 
> > > Long answer: > > That comment actually refers to hardware design for the 82576 > device. Basically, a packet is time stamped and the register stores > RXHWTSTMP and sets the bit in the descriptor plus the bit in > TSYNCRXCTL. > > No more than one packet will have the bit set in the descriptor, > because time stamping is disabled when there is a valid stamp in > the RXHWTSTAMP registers, so that packet must match the timestamp > in the registers. > > There was some queuing code but this actually turns out to be bogus > and did nothing of value, and I've petitioned to have it removed. > > for 10Gbe, I added the ptp_match function to prevent the case where > a time stamped packet is dropped. > > The one-per-queue basically occurs because hardware design > timestamps the packet, puts timestamp in registers, and indicates > which packet got time stamped. There's no need for more correlation > because the descriptor indicates which packet got time stamped, and > as long as you don't read the RXTSTMP registers they remain locked > and hardware won't timestamp another packet until you unlock the > RXTSTMP registers. The ptp_match is necessary in the very rare case > that a time stamped ptp packet never reaches the driver. (it will > find the next ptp packet that should have been time stamped > according to the timestamp mode, and then clear timestamps so that > the error case causing timestamps to stop forever is avoided) > > For the 82580 part timestamps are stored in the packet buffer > avoiding the issue entirely. > > So, the only guard necessary is the ptp_match function to prevent > that condition. If there is a timestamp in the registers, hardware > doesn't timestamp again until the user reads the timestamp value > out. Two rapid event packets in succession will cause the first > arrived to be time stamped and the second to not be time stamped. 
>
>>
>> Thanks, Richard
>

Now that we are on the subject, have I understood correctly that 82599(ES) suffers from the same hardware design as the 82576? That is, timestamps in a separate register rather than the possibility to be strapped on to each packet in the queue?

Are there any 10GE cards with packet buffer timestamping?

// jwalck
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iQIcBAEBAgAGBQJQkNNcAAoJEFwg9i9GDX+nKFwP/2T9dGhz+kZXYwX6PRVU5cKn
ZvD8wwmBi4i8xtEF36Ulc/HNfzY7JuSQtobfheYIu60FpLy1DF1nemWM62Sm6c2v
VdZ5Dx2JgPLRaYBZKfwR2/MNzKHfm7Sw0OSDTqvNe59ZCYFIAPmYsk0+6TLUSeqY
BhKe1TH8yRgCgFkBvsQ2Fsh9jcTwprjENXoFPhIP2ww3+Iq3t9IV4ZtbXoQMVHb/
ppKOkYmZ45OgYSrbNgGJeEYf8KvIAKy92Fd26635PImhxjMm5hIfPwBs95xjlCqU
8JA/RuW9Vtit+n5dAv/+OHVLoge+RS8MxDuJ2c79+nHFoqya45TbmuTnVAqSZXs3
0+Ou2YqSoHRDfOg0b6gSsQOczQjY9i/9k75/2VM+fuZr6TIwcHupe3QUy9TGDRSS
53uKt0AGYKkq/xQILKIdEGkFmAB4C1mL2UoJSngZDFnHRv5k13gK/Oq5IBOjejAV
OJvQdy8twsnBmH8pxd+jIB2j/T72lG7+kEu8nWjolbX7QX7QokK9XklmUCoxtfbm
6umZUwEZ8tAh3LWeRFhvddjFM53wh5TAsy2qxqc9zGAucOYrjDYPeFzGRS9xjEkQ
VLAJR9OPAnrszEkav237qJ+HdZ+9LFDFeoeh3OYwrtRrwEO6bU/N5EtXKM7EPt+S
lxKKnp1oLnQ6n72XjMe1
=lIKO
-----END PGP SIGNATURE-----
From: Keller, J. E <jac...@in...> - 2012-10-31 00:02:37

> -----Original Message-----
> From: Miroslav Lichvar [mailto:mli...@re...]
> Sent: Tuesday, October 30, 2012 10:17 AM
> To: lin...@li...
> Subject: Re: [Linuxptp-users] Fw: “Resource temporarily unavailable”
> errors during flood ping test
>
> On Tue, Oct 30, 2012 at 06:05:38PM +0100, Richard Cochran wrote:
> > This does not accomplish anything since:
> >
> > --- /usr/include/asm-generic/errno.h ---
> >
> > #define EWOULDBLOCK EAGAIN /* Operation would block */
> >
> >
> > > usleep(1);
> > > + try_again++;
> >
> > This is the wrong solution. The right way is to set the
> > tx_timestamp_retries configuration variable to a higher number, like
> > 200 or 2000 instead of the default of 2.
>
> Would it make sense to specify a timeout instead of number of retries
> and use select()?
>
> --
> Miroslav Lichvar

I did some digging in the kernel to figure out why the exceptfds parameter of select() is not woken up when a message appears on the error queue. It turns out that POLLEX_SET in fs/select.c only includes POLLPRI and not POLLERR. I tried a modified kernel which defined POLLEX_SET as (POLLPRI | POLLERR), and it enabled the use case we want.

Sadly this would add a further dependency, and it probably can't really be changed. I don't know why exceptfds, which is documented as "wake a socket on error", doesn't check the POLLERR flag. It seems really silly.

- Jake
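
For comparison with the select() behaviour described above, here is a minimal sketch of a timeout-based wait built on poll() instead. POSIX specifies that POLLERR is reported in `revents` whether or not it was requested in `events`, so no exception set is involved. Whether a pending Tx time stamp on the socket error queue raises POLLERR on a particular kernel is an assumption that would need to be checked; `wait_for_pollerr` is a hypothetical helper, not a linuxptp function.

```c
#include <poll.h>

/* Wait up to timeout_ms for an error condition on fd.
 * Returns 1 if POLLERR was reported, 0 on timeout or if only other
 * events fired, and -1 if poll() failed or fd is not open. */
static int wait_for_pollerr(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = 0 };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0 || (pfd.revents & POLLNVAL))
        return -1;
    return (pfd.revents & POLLERR) ? 1 : 0;
}
```

With a helper like this, the retry count becomes a time budget: the caller waits once with a bounded timeout and reads the error queue only after the helper returns 1.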
From: Keller, J. E <jac...@in...> - 2012-10-30 23:03:12

> -----Original Message-----
> From: Richard Cochran [mailto:ric...@gm...]
> Sent: Tuesday, October 30, 2012 10:45 AM
> To: Keller, Jacob E
> Cc: lin...@li...
> Subject: Re: [Linuxptp-users] Fw: “Resource temporarily unavailable”
> errors during flood ping test
>
> > I like the idea of architecting it to use select and a delay value,
> > so I'll try to see how difficult that would be.
>
> Be my guest ;)
>
> I tried once to fixup ptpd like this, and I didn't get very far.
>

You're right. I looked at implementing it, and to make it work we really need select to check only the error queue, which it doesn't seem to be able to do.

It would take an insane amount of work to move to a model that allows receive handling inside sk.c, and I believe it isn't worth the effort. I would however like to increase the default tx_timestamp_retries value, as 2 tries rarely works for anything I've tested and generates a lot of false positive errors.

What hardware are you using that doesn't have issues at 2 retries? And have you attempted testing this under moderate stress?

I also like the idea of using CLOCK_MONOTONIC for a timeout. A timeout makes more sense to the user than tx_timestamp_retries, whose meaning might be confusing.

Thanks

- Jake

> Thanks,
> Richard
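
The CLOCK_MONOTONIC timeout mentioned above could be sketched as a deadline loop like the one below. This is only an illustration of the idea, not linuxptp code: `retry_until`, `try_fn` and the stand-in `fake_try` callback are hypothetical, and the 1 microsecond pause merely mirrors the `usleep(1)` seen in the quoted patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical callback type: returns 1 once the time stamp has
 * arrived, 0 if the caller should try again. */
typedef int (*try_fn)(void *ctx);

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static long long monotonic_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Call try_once until it succeeds or timeout_ns has elapsed.
 * Returns 1 on success, 0 on timeout. Always tries at least once. */
static int retry_until(try_fn try_once, void *ctx, long long timeout_ns)
{
    const struct timespec pause = { 0, 1000 };  /* 1 us between tries */
    long long deadline = monotonic_ns() + timeout_ns;

    for (;;) {
        if (try_once(ctx))
            return 1;
        if (monotonic_ns() >= deadline)
            return 0;
        nanosleep(&pause, NULL);
    }
}

/* Stand-in for "try to read the Tx time stamp": succeeds once the
 * counter that ctx points to has run down to zero. */
static int fake_try(void *ctx)
{
    int *remaining = ctx;

    if (*remaining == 0)
        return 1;
    (*remaining)--;
    return 0;
}
```

The user-visible setting then becomes a duration such as "wait at most 1 ms", which does not depend on how long a single attempt takes on a given machine.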
From: Keller, J. E <jac...@in...> - 2012-10-30 21:23:28

> -----Original Message-----
> From: Richard Cochran [mailto:ric...@gm...]
> Sent: Tuesday, October 30, 2012 1:48 PM
> To: Keller, Jacob E
> Cc: lin...@li...
> Subject: Re: [Linuxptp-users] Fw: “Resource temporarily unavailable”
> errors during flood ping test
>
> On Tue, Oct 30, 2012 at 06:23:53PM +0100, Richard Cochran wrote:
> > On Tue, Oct 30, 2012 at 09:54:30AM -0700, Jacob Keller wrote:
> > >
> > > I believe the true correct answer is to completely re-architect the
> > > tx_hwtstamp to be asynchronous, so that it just waits until it receives
> > > the timestamp for a complete sequence of events. That design is
> > > significantly more difficult to write though.
> >
> > But even if we did that way, it would not really be a better
> > solution. Think about your own Intel cards. They would end up missing
> > Tx time stamps and possibly mixing them up due to the hardware
> > limitation of having a Tx time stamp FIFO of depth one.
>
> This may be the wrong list, but this reminds me of an issue with the
> Intel hardware that I have been meaning to ask you about. The igb
> driver has always had the following comment regarding transmit time
> stamps:
>
>  * If we were asked to do hardware stamping and such a time stamp is
>  * available, then it must have been for this skb here because we only
>  * allow only one such packet into the queue.
>
> This statement wasn't actually true up until recently, when Matthew
> Vick added some code that enforced the one packet limit.
>
> If I am not mistaken, the ixgb also would need some kind of guard
> against the case when a user program sends two or more event packets
> in a row, would it not?

Short answer: that limit is enforced by the hardware (it disables time stamping as long as the RXTSTMP register is locked), except in the mode that puts time stamp directly into packet buffer.

Long answer:

That comment actually refers to hardware design for the 82576 device. Basically, a packet is time stamped and the register stores RXHWTSTMP and sets the bit in the descriptor plus the bit in TSYNCRXCTL.

No more than one packet will have the bit set in the descriptor, because time stamping is disabled when there is a valid stamp in the RXHWTSTAMP registers, so that packet must match the timestamp in the registers.

There was some queuing code but this actually turns out to be bogus and did nothing of value, and I've petitioned to have it removed.

For 10GbE, I added the ptp_match function to prevent the case where a time stamped packet is dropped.

The one-per-queue basically occurs because hardware design timestamps the packet, puts timestamp in registers, and indicates which packet got time stamped. There's no need for more correlation because the descriptor indicates which packet got time stamped, and as long as you don't read the RXTSTMP registers they remain locked and hardware won't timestamp another packet until you unlock the RXTSTMP registers. The ptp_match is necessary in the very rare case that a time stamped ptp packet never reaches the driver. (It will find the next ptp packet that should have been time stamped according to the timestamp mode, and then clear timestamps so that the error case causing timestamps to stop forever is avoided.)

For the 82580 part timestamps are stored in the packet buffer, avoiding the issue entirely.

So, the only guard necessary is the ptp_match function to prevent that condition. If there is a timestamp in the registers, hardware doesn't timestamp again until the user reads the timestamp value out. Two rapid event packets in succession will cause the first arrived to be time stamped and the second to not be time stamped.

>
> Thanks,
> Richard
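
The latch-and-lock behaviour described in the long answer can be pictured with a toy model. This is purely a software illustration of "one stamp is held until it is read"; it is not driver code and does not model any real register layout, and `struct ts_latch` and both functions are hypothetical.

```c
#include <stdint.h>

/* Toy model of a depth-one time stamp latch (assumption: one stamp
 * is held, and further stamping is blocked until it is read). */
struct ts_latch {
    int locked;      /* 1 while a captured stamp waits to be read */
    uint64_t stamp;  /* the captured value */
};

/* An event packet passes the stamping point at time 'now'.
 * Returns 1 if the packet was stamped, 0 if the latch was busy. */
static int latch_capture(struct ts_latch *l, uint64_t now)
{
    if (l->locked)
        return 0;    /* earlier stamp not read yet: no stamp taken */
    l->stamp = now;
    l->locked = 1;
    return 1;
}

/* Software reads the stamp, which unlocks the latch.
 * Returns 1 and stores the stamp in *out if one was pending. */
static int latch_read(struct ts_latch *l, uint64_t *out)
{
    if (!l->locked)
        return 0;
    *out = l->stamp;
    l->locked = 0;
    return 1;
}
```

In this model, two captures in a row without a read in between stamp only the first packet, which matches the closing sentence above. It also shows why a stamp that is never read would block all later ones, the situation the ptp_match guard is described as handling.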
From: Richard C. <ric...@gm...> - 2012-10-30 20:48:41
|
On Tue, Oct 30, 2012 at 06:23:53PM +0100, Richard Cochran wrote:
> On Tue, Oct 30, 2012 at 09:54:30AM -0700, Jacob Keller wrote:
> >
> > I believe the true correct answer is to completely re-architect the
> > tx_hwtstamp to be asynchronous, so that it just waits until it receives
> > the timestamp for a complete sequence of events. That design is
> > significantly more difficult to write though.
>
> But even if we did that way, it would not really be a better
> solution. Think about your own Intel cards. They would end up missing
> Tx time stamps and possibly mixing them up due to the hardware
> limitation of having a Tx time stamp FIFO of depth one.

This may be the wrong list, but this reminds me of an issue with the Intel hardware that I have been meaning to ask you about. The igb driver has always had the following comment regarding transmit time stamps:

 * If we were asked to do hardware stamping and such a time stamp is
 * available, then it must have been for this skb here because we only
 * allow only one such packet into the queue.

This statement wasn't actually true up until recently, when Matthew Vick added some code that enforced the one packet limit.

If I am not mistaken, the ixgb also would need some kind of guard against the case when a user program sends two or more event packets in a row, would it not?

Thanks,
Richard |
From: Keller, J. E <jac...@in...> - 2012-10-30 20:00:44
|
> -----Original Message-----
> From: Stephan Gatzka [mailto:ste...@gm...]
> Sent: Tuesday, October 30, 2012 12:35 PM
> To: Richard Cochran
> Cc: lin...@li...
> Subject: Re: [Linuxptp-users] Fw: “Resource temporarily unavailable”
> errors during flood ping test
>
>
> >> --- a/sk.c
> >> +++ b/sk.c
> >>
> >> }
> >> if (errno == EINTR) {
> >> try_again++;
> >> - } else if (errno == EAGAIN) {
> >> + } else if ((errno == EAGAIN ) || (errno == EWOULDBLOCK)) {
> >
> > This does not accomplish anything since:
> >
> > --- /usr/include/asm-generic/errno.h ---
>
> The man page for recvmsg suggest to check both just for portability. But
> maybe the whole stuff is so much Linux dependent that it probably makes
> no sense to distinguish both.
>

It is completely, 100% dependent on Linux (and on very recent kernels!). It makes no sense to attempt to be more portable, because the interfaces for PTP and hardware time stamping are completely non-portable.

- Jake

> Regards,
>
> Stephan |
From: Stephan G. <ste...@gm...> - 2012-10-30 19:54:46
|
>> --- a/sk.c
>> +++ b/sk.c
>>
>> }
>> if (errno == EINTR) {
>> try_again++;
>> - } else if (errno == EAGAIN) {
>> + } else if ((errno == EAGAIN ) || (errno == EWOULDBLOCK)) {
>
> This does not accomplish anything since:
>
> --- /usr/include/asm-generic/errno.h ---

The man page for recvmsg suggests checking both, just for portability. But maybe the whole thing is so Linux dependent that it makes no sense to distinguish the two.

Regards,

Stephan |
From: Richard C. <ric...@gm...> - 2012-10-30 17:57:01
|
On Tue, Oct 30, 2012 at 05:52:51PM +0000, Keller, Jacob E wrote:
> > > while (nothing_in_errorqueue or timeleft) {
> >
> > /* handle new, incoming packets here? */
>
> Basically that's the idea.

But don't forget about the other ports!

Thanks,
Richard

> > > timeleft -= select(timeleft)
> > >
> > > } |
From: Keller, J. E <jac...@in...> - 2012-10-30 17:53:03
|
> -----Original Message-----
> From: Richard Cochran [mailto:ric...@gm...]
> Sent: Tuesday, October 30, 2012 10:46 AM
> To: Keller, Jacob E
> Cc: Miroslav Lichvar; lin...@li...
> Subject: Re: [Linuxptp-users] Fw: “Resource temporarily unavailable”
> errors during flood ping test
>
> On Tue, Oct 30, 2012 at 05:33:45PM +0000, Keller, Jacob E wrote:
> > > That doesn't work because you can't restrict the select to the error
> > > queue only.
> > >
> >
> > But select returns the time it waited, and I believe we can do something
> > like:
> >
> > while (nothing_in_errorqueue or timeleft) {
>
> /* handle new, incoming packets here? */
>

Basically that's the idea.

- Jake

> > timeleft -= select(timeleft)
> >
> > } |
From: Richard C. <ric...@gm...> - 2012-10-30 17:46:40
|
On Tue, Oct 30, 2012 at 05:33:45PM +0000, Keller, Jacob E wrote:
> > That doesn't work because you can't restrict the select to the error
> > queue only.
> >
>
> But select returns the time it waited, and I believe we can do something like:
>
> while (nothing_in_errorqueue or timeleft) {

	/* handle new, incoming packets here? */

> timeleft -= select(timeleft)
>
> } |
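The loop proposed in this exchange can be sketched in C roughly as follows. Two caveats apply. First, on Linux select() does not return the time it waited: it returns the number of ready descriptors and updates its timeout argument with the time remaining, so the sketch below recomputes the remaining budget from CLOCK_MONOTONIC instead, and it keeps waiting only while nothing is ready *and* time is left. Second, `wait_ready()` and `now_ms()` are hypothetical helper names, poll() is swapped in for select(), and the sketch does not address the objection that the wait cannot be restricted to the error queue.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h> /* pipe() and write(), used when exercising the helper */

/* Current CLOCK_MONOTONIC reading in milliseconds. */
static int64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Wait until fd reports one of `events` or until timeout_ms has elapsed.
 * Returns 1 when the descriptor is ready, 0 on timeout, -1 on error.
 */
static int wait_ready(int fd, short events, int timeout_ms)
{
	int64_t deadline = now_ms() + timeout_ms;
	struct pollfd pfd = { .fd = fd, .events = events };

	for (;;) {
		int64_t left = deadline - now_ms();

		if (left <= 0)
			return 0;

		int n = poll(&pfd, 1, (int)left);

		if (n > 0)
			return 1;
		if (n < 0 && errno != EINTR)
			return -1;
		/* Timed out early or interrupted: recompute the budget and retry. */
	}
}
```

A caller would still have to read the descriptor after `wait_ready()` returns 1 and decide whether what arrived was the awaited time stamp or an unrelated incoming packet, which is the "handle new, incoming packets here?" step in the pseudocode.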
From: Richard C. <ric...@gm...> - 2012-10-30 17:45:22
|
On Tue, Oct 30, 2012 at 05:28:46PM +0000, Keller, Jacob E wrote:
> I think we can get PTP4l to work right even under those scenarios,
> but right now it is horribly annoying that practically everyone has
> to change the value if they ever have stress when ptp is running.

[ I don't have this problem on the hardware that I use most often. We can increase the default if you want. ]

We can get ptp4l to transmit asynchronously, but we can't fix the hardware. For this reason I remain convinced that we must block and wait for the time stamp.

If you accept this premise, then you also must accept the idea of a timeout or retry count, since time stamps can get lost. There is a trade off between wanting to wait long enough (hardware specific) and wanting to give up ASAP when a time stamp goes missing.

We could change the counter to a time value to be compared against CLOCK_MONOTONIC, but that is only a cosmetic change.

> I like the idea of architecting it to use select and a delay value,
> so I'll try to see how difficult that would be.

Be my guest ;)

I tried once to fixup ptpd like this, and I didn't get very far.

Thanks,
Richard |
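The idea of replacing the retry counter with a time value compared against CLOCK_MONOTONIC might look like the sketch below. It is an illustration only, not ptp4l code: `fetch_with_deadline()`, `try_fetch_fn` and `ready_after()` are hypothetical names, and the callback stands in for whatever non-blocking read of the time stamp the caller uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC reading in nanoseconds. */
static int64_t monotonic_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical callback: returns true once the time stamp was fetched. */
typedef bool (*try_fetch_fn)(void *ctx);

/*
 * Call try_fetch() until it succeeds or timeout_ns has elapsed.
 * Returns 0 on success, -1 if the deadline passed (time stamp lost).
 */
static int fetch_with_deadline(try_fetch_fn try_fetch, void *ctx,
			       int64_t timeout_ns)
{
	int64_t deadline = monotonic_ns() + timeout_ns;

	do {
		if (try_fetch(ctx))
			return 0;
		/* Short pause between attempts, comparable to usleep(1). */
		nanosleep(&(struct timespec){ .tv_nsec = 1000 }, NULL);
	} while (monotonic_ns() < deadline);

	return -1;
}

/* Example callback: succeeds once the counter in ctx reaches zero. */
static bool ready_after(void *ctx)
{
	int *calls_left = ctx;

	return --(*calls_left) <= 0;
}
```

The behaviour is the same block-and-wait scheme as the retry count; only the give-up condition is expressed in time rather than in attempts, so the trade off between waiting long enough and giving up quickly remains.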
From: Keller, J. E <jac...@in...> - 2012-10-30 17:34:23
|
> -----Original Message-----
> From: Richard Cochran [mailto:ric...@gm...]
> Sent: Tuesday, October 30, 2012 10:26 AM
> To: Miroslav Lichvar
> Cc: lin...@li...
> Subject: Re: [Linuxptp-users] Fw: “Resource temporarily unavailable”
> errors during flood ping test
>
> > >
> > > This is the wrong solution. The right way is to set the
> > > tx_timestamp_retries configuration variable to a higher number, like
> > > 200 or 2000 instead of the default of 2.
> >
> > Would it make sense to specify a timeout instead of number of retries
> > and use select()?
>
> That doesn't work because you can't restrict the select to the error
> queue only.
>

But select returns the time it waited, and I believe we can do something like:

while (nothing_in_errorqueue or timeleft) {
	timeleft -= select(timeleft)
}

> Thanks,
> Richard |
From: Keller, J. E <jac...@in...> - 2012-10-30 17:29:00
|
> -----Original Message-----
> From: Richard Cochran [mailto:ric...@gm...]
> Sent: Tuesday, October 30, 2012 10:24 AM
> To: Keller, Jacob E
> Cc: lin...@li...
> Subject: Re: [Linuxptp-users] Fw: “Resource temporarily unavailable”
> errors during flood ping test
>
> On Tue, Oct 30, 2012 at 09:54:30AM -0700, Jacob Keller wrote:
> >
> > I believe the true correct answer is to completely re-architect the
> > tx_hwtstamp to be asynchronous, so that it just waits until it receives
> > the timestamp for a complete sequence of events. That design is
> > significantly more difficult to write though.
>
> But even if we did that way, it would not really be a better
> solution. Think about your own Intel cards. They would end up missing
> Tx time stamps and possibly mixing them up due to the hardware
> limitation of having a Tx time stamp FIFO of depth one.
>
> And it is not just Intel cards that have this issue. I think the
> majority of the current hardware offerings all have this same
> limitation. So we really must wait for the Tx time stamp after
> sending an event message before going on with the protocol, simply
> to function on most of the hardware out there.
>
> Thanks,
> Richard

I think we can get PTP4l to work right even under those scenarios, but right now it is horribly annoying that practically everyone has to change the value if they ever have stress when ptp is running.

I like the idea of architecting it to use select and a delay value, so I'll try to see how difficult that would be.

- Jake |
From: Richard C. <ric...@gm...> - 2012-10-30 17:26:02
|
On Tue, Oct 30, 2012 at 06:17:18PM +0100, Miroslav Lichvar wrote:
> On Tue, Oct 30, 2012 at 06:05:38PM +0100, Richard Cochran wrote:
> > This does not accomplish anything since:
> >
> > --- /usr/include/asm-generic/errno.h ---
> >
> > #define EWOULDBLOCK EAGAIN /* Operation would block */
> >
> >
> > > usleep(1);
> > > + try_again++;
> >
> > This is the wrong solution. The right way is to set the
> > tx_timestamp_retries configuration variable to a higher number, like
> > 200 or 2000 instead of the default of 2.
>
> Would it make sense to specify a timeout instead of number of retries
> and use select()?

That doesn't work because you can't restrict the select to the error queue only.

Thanks,
Richard |
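The point about the quoted `#define` can be checked at compile time. The sketch below assumes a Linux target, where EWOULDBLOCK is defined as EAGAIN; `would_block()` is a hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * On Linux both names expand to the same value, so adding a second
 * comparison against EWOULDBLOCK cannot catch any new case. The build
 * fails here on a platform where the two values differ.
 */
_Static_assert(EAGAIN == EWOULDBLOCK,
	       "EAGAIN and EWOULDBLOCK differ on this platform");

/* True when a non-blocking receive simply found nothing yet. */
static bool would_block(int err)
{
	return err == EAGAIN;
}
```

With that equality in place, `would_block(EWOULDBLOCK)` and `would_block(EAGAIN)` give the same answer, which is why the extra comparison in the patch changes nothing on Linux.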