Dmitry Butskoy - 2024-10-12

Sebastian Moeller wrote:

FreeBSD's traceroute for IPv4 (a different binary from traceroute6)
has a very interesting diagnostic/debug mode that allows one to record
and then analyse modifications to the IP header along a path:

Unfortunately, we don't have access to the raw IP headers at all, due to
implementation limitations.

All information is obtained only through the recvmsg(2) system call with the
MSG_ERRQUEUE flag set. This approach allows unprivileged users to run "udp"
and/or "icmp/echo" traces without having to set the "setuid" bit on the
traceroute binary (or escalate privileges via setcap(8)). This was the
original reason why this traceroute implementation appeared in the first
place (the ability to run without setuid bits).
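To illustrate the mechanism, here is a minimal Python sketch of the same idea on Linux. It is not the traceroute source code: probe() and parse_extended_err() are hypothetical names, the numeric fallbacks are the Linux values of IP_RECVERR and MSG_ERRQUEUE (used in case the Python build does not export them), and the layout of struct sock_extended_err is taken from ip(7).

```python
import select
import socket
import struct

# Linux values, used as fallbacks if this Python build lacks the constants.
IP_RECVERR = getattr(socket, "IP_RECVERR", 11)
MSG_ERRQUEUE = getattr(socket, "MSG_ERRQUEUE", 0x2000)
SO_EE_ORIGIN_ICMP = 2


def parse_extended_err(cdata):
    """Decode struct sock_extended_err and the offender sockaddr_in after it."""
    ee_errno, origin, icmp_type, icmp_code, _pad, _info, _data = \
        struct.unpack_from("=IBBBBII", cdata, 0)
    offender = None
    if len(cdata) >= 24:
        # sockaddr_in: family (2 bytes), port (2 bytes), address (4 bytes)
        (family,) = struct.unpack_from("=H", cdata, 16)
        if family == socket.AF_INET:
            offender = socket.inet_ntoa(cdata[20:24])
    return {
        "errno": ee_errno,
        "from_icmp": origin == SO_EE_ORIGIN_ICMP,
        "icmp_type": icmp_type,
        "icmp_code": icmp_code,
        "offender": offender,
    }


def probe(dest, ttl, port=33434, timeout=1.0):
    """Send one UDP probe; return the decoded ICMP error or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        s.setsockopt(socket.IPPROTO_IP, IP_RECVERR, 1)
        s.sendto(b"probe", (dest, port))
        # A pending entry in the error queue is signalled as POLLERR.
        poller = select.poll()
        poller.register(s, select.POLLERR)
        if not poller.poll(int(timeout * 1000)):
            return None
        _data, ancdata, _flags, _addr = s.recvmsg(512, 512, MSG_ERRQUEUE)
        for level, ctype, cdata in ancdata:
            if level == socket.IPPROTO_IP and ctype == IP_RECVERR:
                return parse_extended_err(cdata)
    return None
```

Calling probe("192.0.2.1", 3) as a normal user would then be expected to report the third hop as "offender" (if it answers), since a plain SOCK_DGRAM socket needs no special privileges. Note that nothing in the returned data exposes the IP header of the ICMP reply or of the quoted packet.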

All we can do is check the returned protocol headers (as has been done for
TCP MSS clamping detection since 2.1.4), but NOT the returned IP header.
So it is impossible to obtain the returned TOS value, since the Linux
kernel completely strips the IP header before returning anything to us in
response to recvmsg(2). The only TOS value we can obtain (via IP_RECVTOS)
is the TOS of the "icmp time exceeded" packet itself, not that of the
encapsulated original packet.
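As a rough Python sketch of what IP_RECVTOS gives us (the helper names enable_recvtos() and outer_tos() are made up for illustration, and 13 is the Linux value of IP_RECVTOS, used only as a fallback): the kernel attaches a one-byte IP_TOS control message to what recvmsg(2) returns, and that byte describes the received packet, not the probe quoted inside it.

```python
import socket

# Linux value, used as a fallback if this Python build lacks the constant.
IP_RECVTOS = getattr(socket, "IP_RECVTOS", 13)


def enable_recvtos(sock):
    """Ask the kernel to attach the TOS byte of received packets as a cmsg."""
    sock.setsockopt(socket.IPPROTO_IP, IP_RECVTOS, 1)


def outer_tos(ancdata):
    """Return the TOS byte found in recvmsg() ancillary data, or None."""
    for level, ctype, cdata in ancdata:
        if level == socket.IPPROTO_IP and ctype == socket.IP_TOS and cdata:
            return cdata[0]
    return None
```

The ancdata argument is the second element of the tuple returned by socket.recvmsg(). Whatever value comes back was chosen by the router that generated the ICMP message, so it says nothing about how our own TOS was rewritten along the path.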

This makes traceroute quite helpful as a tool to record IP header
modifications along a path; in this case the original decimal TOS
value of 180 (0xb4: 10110100) got changed to 0x34: 00110100 by hop 10.
By only printing the values that differ from the original outgoing
header it becomes trivial to see where things differ.
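For reference, the arithmetic behind that example can be checked with a few lines of Python (split_tos() and changed_bits() are illustrative names only): the upper six bits of the TOS byte are the DSCP and the lower two are the ECN field.

```python
def split_tos(tos):
    """Split a TOS/tclass byte into (DSCP, ECN)."""
    return tos >> 2, tos & 0x3


def changed_bits(sent, seen):
    """Return a bit mask of the TOS bits that differ between two values."""
    return sent ^ seen


sent, seen = 180, 0x34
print(f"sent {sent:#04x} ({sent:08b}): dscp/ecn = {split_tos(sent)}")
print(f"seen {seen:#04x} ({seen:08b}): dscp/ecn = {split_tos(seen)}")
print(f"changed bits: {changed_bits(sent, seen):08b}")
```

So in the quoted trace the DSCP went from 45 to 13 while the ECN bits stayed at 0; only the most significant bit of the byte was cleared.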

Yes, this would be a good feature, but it would require kernel support
to access the encapsulated IP header. :(

ability for normal end users to explore TOS/tclass (so DSCP and ECN
bitfields) along network paths as there is little data and lots of
opinions around what happens to TOS/DSCP values over the internet.

Maybe playing with TCP tracing will help here? At least for "ecn" cases?
(See "traceroute -T -O help" for more info, or traceroute(8)).