Re: [Linuxptp-devel] [PATCH v2 2/3] clock: dump unexpected packets received on the error queues of
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: Jacob K. <jac...@in...> - 2020-06-18 19:44:50
|
On 6/15/2020 8:23 AM, Vladimir Oltean wrote: > In the current design of the SO_SELECT_ERR_QUEUE socket option (which is > enabled by sk_timestamping_init on the event fd), it is a bug to only > check revents & POLLIN, but not also POLLERR. > > Normally the error queue events that the application expects (i.e. TX > timestamps) are received, within a given timeout, in-band with the > transmission of the timestampable message itself (for example in > raw_send). > > For messages that the application does not / no longer expects, such as > TX timestamps delivered late, duplicate TX timestamps, general > exceptional messages enqueued by the kernel in the socket error queue > etc, ptp4l will be taken by surprise in clock_poll() by these, and will > think that there is data, since POLLIN is set (but in fact in the socket > error queue etc, ptp4l will be taken by surprise in clock_poll() by > these, and will think that there is data, since POLLIN is set (but in > fact POLLERR is also set, and this has an entirely different meaning). > > A very, very simple reproducer is to take a DSA switch and run: > > tcpdump -i eth0 -j adapter_unsynced > > on its DSA master net device. The above command will enable timestamping > on that net device, and if both the DSA switch and the master support > PTP, this will make the kernel send duplicate TX timestamps for every > sent event packet, which will completely kill ptp4l until a reboot, with > no indication whatsoever of what's going on. > > Since the messages on the error queue are unexpected, we have no need > for them. And they can be in theory anything, so simply hex dumping > their content and moving along sounds like a good idea. > > Printing them to the user is optional (and helpful), but reading them is > not. With this patch, even with extraneous data delivered by a buggy > kernel (which the application now loudly complains about), the > synchronization keeps chugging along. Otherwise the application starts > reordering packets in recvmsg() due to misinterpreting which socket > queue has data available. > > Signed-off-by: Vladimir Oltean <ol...@gm...> > --- > Changes in v2: > Make msg_raw_dump print to stderr instead of stdout, matching the > "Unexpected data on socket err queue:" text. > > clock.c | 11 +++++++++++ > msg.c | 12 ++++++++++++ > msg.h | 8 ++++++++ > 3 files changed, 31 insertions(+) > > diff --git a/clock.c b/clock.c > index f43cc2afd49e..d61881c5db7a 100644 > --- a/clock.c > +++ b/clock.c > @@ -1559,6 +1559,17 @@ int clock_poll(struct clock *c) > LIST_FOREACH(p, &c->ports, list) { > /* Let the ports handle their events. */ > for (i = 0; i < N_POLLFD; i++) { > + if (cur[i].revents & POLLERR) { > + unsigned char pkt[1600]; > + > + cnt = recv(cur[i].fd, pkt, sizeof(pkt), > + MSG_ERRQUEUE); > + pr_err("Unexpected data on socket err queue:"); > + msg_raw_dump(pkt, cnt, stderr); > + > + continue; > + } > + > if (cur[i].revents & (POLLIN|POLLPRI)) { > event = port_event(p, i); > if (EV_STATE_DECISION_EVENT == event) { > diff --git a/msg.c b/msg.c > index d1619d4973f1..59bb55d6cd89 100644 > --- a/msg.c > +++ b/msg.c > @@ -601,3 +601,15 @@ int msg_sots_missing(struct ptp_message *m) > } > return msg_sots_valid(m) ? 0 : 1; > } > + > +void msg_raw_dump(unsigned char *pkt, int cnt, FILE *fp) > +{ > + int k; > + > + for (k = 0; k < cnt; k++) { > + if (k % 16 == 0) > + fprintf(fp, "\n%04x ", k); > + fprintf(fp, "%02x ", pkt[k]); > + } > + fprintf(fp, "\n"); > +} Could this use the pr facility, so that it honors the printing options to log to syslog and manages the level? I'm not sure what I would mark these as: debug because they no longer impact the syncing process, or true error because they are unexpected? Thanks, Jake > diff --git a/msg.h b/msg.h > index e71d3ceec4b9..f2165084395a 100644 > --- a/msg.h > +++ b/msg.h > @@ -395,6 +395,14 @@ void msg_put(struct ptp_message *m); > */ > int msg_sots_missing(struct ptp_message *m); > > +/** > + * Print a wireshark-compatible hex dump of a message buffer. > + * @param pkt Message buffer to print > + * @param cnt Length of message buffer > + * @param fp An open file pointer. > + */ > +void msg_raw_dump(unsigned char *pkt, int cnt, FILE *fp); > + > /** > * Test whether a message has a valid SO_TIMESTAMPING time stamp. > * @param m Message to test. > |