Iperf 2 / Tickets / #126 UDP Server Listen Failure on Bound TAP Interface

Bob ONeil - 2021-07-24

The source changes that provide binding to an interface are attached; they are guarded by the conditional compilation flag BIND_LISTENER_TO_INTERFACE.

PerfSocket.cpp

Bob ONeil - 2021-07-24

A Wireshark capture on this interface is attached.

Last edit: Bob ONeil 2021-07-24

zeth1.traffic.pcapng

Bob ONeil - 2021-07-24

Other small source code mods: print output before the listen socket is bound. Added content is preceded by the initials RWO.

Last edit: Bob ONeil 2021-07-24

Listener.cpp

Robert McMahon - 2021-07-24

status: open --> accepted

assigned_to: Robert McMahon

Robert McMahon - 2021-07-24

let me take a look and get back to you

Bob ONeil - 2021-07-24

Thanks, let me know if there is anything else I can try. My next attempt will be to use
the interface's actual destination MAC address.

Should we get this to work, I think it opens up some doors for using Iperf in a network emulation mode, where multiple instances of Iperf run on the same workstation, each one listening on a particular virtual TAP interface. When one TAP interface serves as a common bus between a network of running applications, such as when a radio emulator simulates an over-the-air RF channel, having not only the interface but also some other discriminator, perhaps the destination IP, would be fantastic. It would allow multiple instances of Iperf to listen on a common virtual interface while keeping up separate conversations.

Last edit: Bob ONeil 2021-07-24

Bob ONeil - 2021-07-24

Next I am going to try using the actual destination MAC address of the interface rather
than the broadcast destination MAC. May not make any difference.

Bob ONeil - 2021-07-24

Attached are the Linux script files I use to create the 2 virtual TAP interfaces. My application
accepts input on the 192.0.1.1 IP on zeth0 and bridges it, modifying the IP header, to the
138.0.0.0/24 network on zeth1. To set up these two interfaces under Linux, you run "net-setup.sh start".

Last edit: Bob ONeil 2021-07-24

net-setup.zip

Bob ONeil - 2021-07-24

Attached is my config.h for the build, for reference. The build platform is Ubuntu 20.

config.h

Bob ONeil - 2021-07-24

Using the TAP interface's destination MAC rather than the broadcast MAC did not solve anything.

- Robert McMahon - 2021-07-24
  
  I'm not a TAP user. Can you provide me more information or some links?
  

Robert McMahon - 2021-07-24

I filed a ticket for the listener bind to device. I think this is separate from the TAP issue, so I broke it out into its own ticket. The code change is in master and 2-1-4-rc.
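
For readers unfamiliar with binding a listener to a device on Linux, a minimal sketch follows. It assumes the `SO_BINDTODEVICE` socket option is the mechanism; the thread does not show the actual change, and `bind_socket_to_device` is a hypothetical helper name, not a function from the iperf source.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <net/if.h>       // IFNAMSIZ
#include <sys/socket.h>   // setsockopt, SOL_SOCKET, SO_BINDTODEVICE (Linux)

// Hypothetical helper (not iperf code): restrict a socket to one interface.
// Returns 0 on success, or a positive errno value on failure.
int bind_socket_to_device(int fd, const std::string& ifname) {
    // An interface name must fit in IFNAMSIZ bytes including the trailing NUL.
    if (ifname.empty() || ifname.size() >= IFNAMSIZ) {
        return EINVAL;
    }
    // The option value is the interface name itself, NUL included.
    if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, ifname.c_str(),
                   static_cast<socklen_t>(ifname.size() + 1)) < 0) {
        return errno;  // e.g. ENODEV for an unknown name, EPERM if not permitted
    }
    return 0;
}
```

A caller would typically create the UDP socket, call `bind_socket_to_device(fd, "zeth1")`, and only then `bind()` to the port. Depending on the kernel version, the option may require root or CAP_NET_RAW.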

Bob ONeil - 2021-07-24

A TAP interface is a virtual Layer 2 Ethernet interface created as an entry point for a network stack. In this case, it is created on the Linux host via the attached script. You run "net-setup.sh start", then use ifconfig to verify the existence of the virtual Ethernet interfaces zeth0 and zeth1, as well as look at the routing established to the 2 interfaces.

A TAP interface is a Layer 2 interface that includes a MAC address and behaves similarly to a real Ethernet interface such as eth0 or eth1. A closely related cousin is a TUN interface, which operates at Layer 3 and carries no Layer 2 (Ethernet) header.

https://www.kernel.org/doc/Documentation/networking/tuntap.txt
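
As a rough illustration of the user-space side described in the kernel document linked above, the sketch below attaches to a TUN/TAP device through `/dev/net/tun`. The helper names `make_tuntap_request` and `open_tuntap` are hypothetical, and this is not taken from the attached scripts or from my application; it only shows how the TAP versus TUN choice is expressed.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>
#include <fcntl.h>          // open
#include <unistd.h>         // close
#include <sys/ioctl.h>      // ioctl
#include <net/if.h>         // struct ifreq, IFNAMSIZ
#include <linux/if_tun.h>   // IFF_TAP, IFF_TUN, IFF_NO_PI, TUNSETIFF

// Build the request that selects TAP (Ethernet frames) or TUN (IP packets).
// IFF_NO_PI asks the kernel not to prepend its packet-information header.
ifreq make_tuntap_request(const std::string& name, bool tap) {
    ifreq ifr;
    std::memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = static_cast<short>((tap ? IFF_TAP : IFF_TUN) | IFF_NO_PI);
    std::strncpy(ifr.ifr_name, name.c_str(), IFNAMSIZ - 1);  // stays NUL terminated
    return ifr;
}

// Attach to the named device; returns a file descriptor, or -1 with errno set.
// Normally needs root or CAP_NET_ADMIN unless the device was pre-created for the user.
int open_tuntap(const std::string& name, bool tap) {
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0) {
        return -1;
    }
    ifreq ifr = make_tuntap_request(name, tap);
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Each `read()` on the returned descriptor yields one frame that the kernel sent out of the interface, and each `write()` injects one frame as if it had arrived on the interface.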

- Robert McMahon - 2021-07-25
  
  Ok, I don't see how this is related to iperf. Iperf is a socket level, layer 4 thing. Iperf utilizes the TCP/IP stack. There was an effort for raw sockets which might work over TUN/TAP interfaces but it wasn't incorporated into the code base.
  
  Setting up user space virtual interfaces to send/receive L2 packets over tunnels is beyond the scope of iperf 2. Iperf is designed to test network i/o across real networks, though one can test in machine via localhost. The amount of code to support this seems non trivial unless I'm misunderstanding the technology and the use case.
  
  This link seems to help
  
  What's the use of tun/tap devices?
  
  From the process described above, it can be seen that the tun/tap device is used to forward part of the data packets in the protocol stack to the user-space application program, giving the user-space program a chance to process the data packets. The most common scenarios for tun/tap devices are VPNs, including tunnels and application-level IPSec. One of the better-known projects is VTun; if you are interested, you can get to know it.
  
  The difference between tun and tap
  
  User-level programs can only read and write IP data packets through tun devices, while tap devices can read and write link-level data packets. Similar to the difference between ordinary sockets and raw sockets, the format of processing data packets is different.
  
  Last edit: Robert McMahon 2021-07-25
  

Bob ONeil - 2021-07-26

No argument to this point, there should not be significant additional coding added to support this effort. But I believe a L2 raw socket is not required, and the TAP interface should be accessible via a standard UDP socket. I will create a test application on Monday as a standard UDP socket to verify this, bound to the TAP interface.

Notionally, the TAP interface should act like a standard Ethernet interface, much like eth0 and eth1, so presumably Iperf should work. I will know more tomorrow.

Thanks for the effort and discussion.

If Iperf can work with TAP or TUN interfaces, it opens up the doors significantly to use for network emulation, which is a real big plus and feature.

- Robert McMahon - 2021-07-26
  
  There is some code that reads L2 packets and does the UDP checksum and bypasses the UDP stack. This uses an AF_PACKET socket and an AF_INET socket, then uses a BPF to drop the AF_INET packets. There is no code that produces UDP or IP packets; those are handled by the network stack in the operating system. Iperf 2 sends its messages in the layer 4 payload.
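
The "use a BPF to drop the AF_INET packets" step can be sketched as below. This is a simplified stand-in and not the filter iperf installs: it attaches a one-instruction classic BPF program that discards everything on the socket. `attach_drop_all_filter` is a hypothetical name.

```cpp
#include <cassert>
#include <cerrno>
#include <linux/filter.h>   // sock_filter, sock_fprog, BPF_STMT, BPF_RET, BPF_K
#include <sys/socket.h>     // setsockopt, SOL_SOCKET, SO_ATTACH_FILTER

// Attach a classic BPF program consisting of a single "return 0".
// A socket filter's return value is the number of bytes to keep, so 0
// makes the kernel discard every packet before it is queued on this socket.
// Returns 0 on success, or a positive errno value on failure.
int attach_drop_all_filter(int fd) {
    sock_filter code[] = {
        BPF_STMT(BPF_RET | BPF_K, 0),
    };
    sock_fprog prog;
    prog.len = sizeof(code) / sizeof(code[0]);
    prog.filter = code;
    if (setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER, &prog, sizeof(prog)) < 0) {
        return errno;
    }
    return 0;
}
```

With a filter like this on the AF_INET socket, the socket stays open, so the stack keeps its state, but the traffic is only delivered through the AF_PACKET socket.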
  

Thanks Robert, my initial tests with simply a UDP socket on the TAP interface do not seem to work, so I will test with a raw socket instead. I think I know where in the code the listen socket is created and bound to an interface. I do not know where the UDP socket data (without the IP header) gets processed, that is, where some read() API is called on the listen socket.

I will try out the raw socket approach and remove the IP header to call the same API to allow IPerf to process the UDP payload itself. It may require that the application be run as root for the raw socket creation.

BUT - if I can get this to work, it opens up a big door for Iperf to both real and virtual Ethernet interfaces (TUN/TAP), which is increasingly important in the world of real-time network emulation.
I intend to use Iperf with multiple instances of my radio simulation application, with a bunch of virtual Ethernet interfaces representing the ingress and egress of a radio.

If I can get it to work, I will send you my revised source code so you can consider whether or not you want to make it part of the official code base for use in network emulation, with some command line argument indicating whether the socket binding is to a real physical interface or a virtual one (TUN/TAP).

There is a large community of folks that need network emulation tools such as Iperf for emulated Ethernet interfaces.

So on which line of code is the read performed from the listen socket, and where does the Layer 4 payload get processed?

Maybe you can give me a few places to look for to implement this functionality.

The L2 setup code is in src/Listener.cpp and the UDP checksum in src/Server.cpp

bool Listener::L2_setup (thread_Settings *server, int sockfd) {
#if defined(HAVE_LINUX_FILTER_H) && defined(HAVE_AF_PACKET)
    //
    //  Supporting parallel L2 UDP threads is a bit tricky.  Below are some notes as to why and the approach used.
    //
    //  The primary issues for UDP are:
    //
    //  1) We want the listener thread to hand off the flow to a server thread and not be burdened by that flow
    //  2) For -P support, the listener thread needs to detect new flows which will share the same UDP port
    //     and UDP is stateless
    //
    //  The listener thread needs to detect new traffic flows and hand them to a new server thread, and then
    //  rehang a listen/accept.  For standard iperf the "flow routing" is done using connect() per the ip quintuple.
    //  The OS will then route established connected flows to the socket descriptor handled by a server thread and won't
    //  burden the listener thread with these packets.
    //
    //  For L2 verification, we have to create two sockets that will exist for the life of the flow.  A
    //  new packet socket (AF_PACKET) will receive L2 frames and bypasses
    //  the OS network stack.  The original AF_INET socket will still send up packets
    //  to the network stack.
    //
    //  When using packet sockets there is inherent packet duplication, the hand off to a server
    //  thread is not so straight forward as packets will continue being sent up to the listener thread
    //  (technical problem is that packet sockets do not support connect() which binds the IP quintuple as the
    //  forwarding key) Since the Listener uses recvfrom(), there is no OS mechanism to detect new flows nor
    //  to drop packets.  The listener can't listen on quintuple based connected flows because the client's source
    //  port is unknown.  Therefore the Listener thread will continue to receive packets from all established
    //  flows sharing the same dst port which will impact CPU utilization and hence performance.
    //
    //  The technique used to address this is to open an AF_PACKET socket and leave the AF_INET socket open.
    //  (This also aligns with BSD based systems)  The original AF_INET socket will remain in the (connected)
    //  state so the network stack has its connected state.  A cBPF is then used to cause the kernel to fast drop
    //  those packets.  A cBPF is set up to drop such packets.  The test traffic will then only come over the
    //  packet (raw) socket and not the AF_INET socket. If we were to try to close the original AF_INET socket
    //  (vs leave it open w/the fast drop cBPF) then the existing traffic will be sent up by the network stack
    //  to the Listener thread, flooding it with packets, again something we want to avoid.
    //
    //  On the packet (raw) socket itself, we do two more things to better handle performance

void Server::L2_processing () {
#if defined(HAVE_LINUX_FILTER_H) && defined(HAVE_AF_PACKET)
    eth_hdr = reinterpret_cast<struct ether_header *>(mBuf);
    ip_hdr = reinterpret_cast<struct iphdr *>(mBuf + sizeof(struct ether_header));
    // L4 offset is set by the listener and depends upon IPv4 or IPv6
    udp_hdr = reinterpret_cast<struct udphdr *>(mBuf + mSettings->l4offset);
    // Read the packet to get the UDP length
    int udplen = ntohs(udp_hdr->len);
    //
    // in the event of an L2 error, double check the packet before passing it to the reporter,
    // i.e. no reason to run iperf accounting on a packet that has no reasonable L3 or L4 headers
    //
    reportstruct->packetLen = udplen - sizeof(struct udphdr);
    reportstruct->expected_l2len = reportstruct->packetLen + mSettings->l4offset + sizeof(struct udphdr);
    if (reportstruct->l2len != reportstruct->expected_l2len) {
        reportstruct->l2errors |= L2LENERR;
        if (L2_quintuple_filter() != 0) {
            reportstruct->l2errors |= L2UNKNOWN;
            reportstruct->l2errors |= L2CSUMERR;
            reportstruct->emptyreport = 1;
        }
    }
    if (!(reportstruct->l2errors & L2UNKNOWN)) {
        // perform UDP checksum test, returns zero on success
        int rc;
        rc = udpchecksum((void *)ip_hdr, (void *)udp_hdr, udplen, (isIPV6(mSettings) ? 1 : 0));
        if (rc) {
            reportstruct->l2errors |= L2CSUMERR;
            if ((!(reportstruct->l2errors & L2LENERR)) && (L2_quintuple_filter() != 0)) {
                reportstruct->emptyreport = 1;
                reportstruct->l2errors |= L2UNKNOWN;
            }
        }
    }
#endif // HAVE_AF_PACKET
}
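
The excerpt calls `udpchecksum()` without showing its body. As a hedged illustration of what a UDP-over-IPv4 checksum test involves, here is a standalone sketch. It is not iperf's implementation, and `udp_checksum_ipv4` with its helpers are hypothetical names. It only mirrors the "returns zero on success" convention noted in the comment above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Add a byte range to a running sum as big-endian 16-bit words.
static uint32_t sum16(const uint8_t* p, size_t len, uint32_t acc) {
    for (size_t i = 0; i + 1 < len; i += 2) {
        acc += (static_cast<uint32_t>(p[i]) << 8) | p[i + 1];
    }
    if (len & 1) {
        acc += static_cast<uint32_t>(p[len - 1]) << 8;  // odd byte, zero padded
    }
    return acc;
}

// Fold the carries back in (ones' complement addition), then complement.
static uint16_t fold_and_complement(uint32_t acc) {
    while (acc >> 16) {
        acc = (acc & 0xFFFF) + (acc >> 16);
    }
    return static_cast<uint16_t>(~acc & 0xFFFF);
}

// Checksum over the IPv4 pseudo-header plus the UDP header and payload.
// `ip` points at the IPv4 header, `udp` at the UDP header, and `udplen`
// is the value of the UDP length field. With the received checksum left
// in place, the result is 0 when the datagram verifies. (A checksum field
// of 0 in IPv4 means "no checksum"; that case is not handled here.)
uint16_t udp_checksum_ipv4(const uint8_t* ip, const uint8_t* udp, size_t udplen) {
    uint32_t acc = 0;
    acc = sum16(ip + 12, 8, acc);           // source and destination addresses
    acc += 17;                              // zero byte followed by protocol 17 (UDP)
    acc += static_cast<uint32_t>(udplen);   // UDP length
    acc = sum16(udp, udplen, acc);          // UDP header and payload
    return fold_and_complement(acc);
}
```

To generate a checksum instead of verifying one, the same function is run with the checksum field set to zero and the result is stored in that field.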

Bob ONeil - 2021-07-26

Thanks Robert, is my understanding correct that this L2 code is NOT currently used?

I think there are separate listen threads?

Where does the UDP data get read from and then passed on for processing (UDP data alone?).

One of my requirements, where Iperf2 works out well, is that it does not need an initial TCP connection from client to server to work. The network app I am writing is UDP only.

Does the Iperf Listener need to send back any info to the client to operate?

I am OK with no acknowledgements at the end of the test run.

What I am getting at: if I put an L2 raw socket in place of the listener socket, simply chop off the L2 header, and hand the UDP payload to the existing entry point that would normally get it from a UDP socket, would that potentially work, or am I missing something in the design?

I only need IPv4 UDP client to server support.
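
The "chop off the headers" idea can be sketched as a small parser. This assumes untagged Ethernet II frames carrying unfragmented IPv4/UDP, and the names `UdpPayload` and `find_udp_payload` are made up for illustration; nothing here comes from the iperf source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct UdpPayload {
    size_t offset;      // byte offset of the payload within the frame
    size_t length;      // payload length in bytes
    uint16_t dst_port;  // destination UDP port, host byte order
};

// Locate the UDP payload inside an Ethernet II frame carrying IPv4.
// Returns false for anything else (VLAN tags, IPv6, fragments, truncation).
bool find_udp_payload(const uint8_t* f, size_t len, UdpPayload& out) {
    const size_t kEth = 14, kUdp = 8;
    if (len < kEth + 20 + kUdp) return false;
    if (f[12] != 0x08 || f[13] != 0x00) return false;       // EtherType IPv4
    const uint8_t* ip = f + kEth;
    if ((ip[0] >> 4) != 4) return false;                    // IP version
    const size_t ihl = static_cast<size_t>(ip[0] & 0x0F) * 4;
    if (ihl < 20 || len < kEth + ihl + kUdp) return false;  // header length
    if ((ip[6] & 0x3F) != 0 || ip[7] != 0) return false;    // MF set or offset != 0
    if (ip[9] != 17) return false;                          // protocol UDP
    const uint8_t* udp = ip + ihl;
    const size_t udplen = (static_cast<size_t>(udp[4]) << 8) | udp[5];
    if (udplen < kUdp || kEth + ihl + udplen > len) return false;
    out.offset = kEth + ihl + kUdp;
    out.length = udplen - kUdp;
    out.dst_port = static_cast<uint16_t>((udp[2] << 8) | udp[3]);
    return true;
}
```

The destination port gives the receiver a way to ignore frames that are not part of the test, since a packet socket sees all traffic on the interface rather than one flow.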

- Robert McMahon - 2021-07-26
  
  L2 mode works. It's the --l2checks option. Requires root on the server for AF_PACKET sockets.
  
  Server code is in src/Server.cpp. UDP rx is RunUDP()
  
  Iperf 2 UDP is stateless and does not require any handshake between client and server. That's a design goal (and difference from iperf3)
  
  The man page describes the --no-udp-fin option
  
  --no-udp-fin Don't perform the UDP final server to client exchange which means there won't be a final server report displayed on the client. All packets per the test will be from the client to the server and no packets should be sent in the other direction. It's highly suggested that -t be set on the server if this option is being used. This is because there will be only one trigger ending packet sent from client to server and if it's lost then the server will continue to run. (Requires ver 2.0.14 or better)
  
  I think the approach used for --l2checks isn't ideal for TUN/TAP and related app testing.
  
  There is no L2 raw socket per se; the raw sockets are IP. The use of AF_PACKET was a bit of a hack that works for a unit test around NIC drivers.
  
  The attempt at adding support for SOCK_RAW was non trivial. The code didn't get into the mainline because of this and I didn't have time to rewrite it, nor was there a strong enough use case to justify the work. The TUN interface may work with SOCK_RAW, not sure. TAP for sure won't. Iperf 2 doesn't support either at this stage.
  
  I don't see a huge advantage to using iperf 2 vs writing a new tool to test TAP/TUN and the connected apps. The app protocol(s) inside the payloads seems brand new. The reporting seems brand new too. I guess there could be some leveraging of the threading design somewhat documented in the DESIGN_NOTES. (Note: The traffic and listener threads really shouldn't have user i/o as an example. It takes a lot of work and testing to get this right.)
  
  There is one reporter thread. The goal of the reporter thread is to decouple user
  i/o from traffic as well as perform CPU related logic. This structure allows the
  tool to measure traffic performance unrelated to user i/o and traffic accounting.
  

Bob ONeil - 2021-07-26

Thanks Robert for all the advice. I have confirmed that a raw socket approach will work, run as root. Attached is the source code for my test application for reference. This code when run without the decoding of the packets can keep up with Iperf2 as a burst.
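
The attached main.cpp is not reproduced here. As a generic illustration only, and assuming an AF_PACKET socket is the kind of raw socket meant, opening one on a named interface might look like the following. `make_packet_addr` and `open_packet_socket` are hypothetical names.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>
#include <arpa/inet.h>         // htons
#include <net/if.h>            // if_nametoindex
#include <linux/if_ether.h>    // ETH_P_IP
#include <linux/if_packet.h>   // sockaddr_ll
#include <sys/socket.h>        // socket, bind
#include <unistd.h>            // close

// Link-layer address that selects IPv4 frames on one interface index.
sockaddr_ll make_packet_addr(int ifindex) {
    sockaddr_ll sll;
    std::memset(&sll, 0, sizeof(sll));
    sll.sll_family = AF_PACKET;
    sll.sll_protocol = htons(ETH_P_IP);
    sll.sll_ifindex = ifindex;
    return sll;
}

// Open a packet socket bound to `ifname`; returns an fd, or -1 with errno set.
// Creating the socket fails unless the process is root or has CAP_NET_RAW.
int open_packet_socket(const std::string& ifname) {
    unsigned idx = if_nametoindex(ifname.c_str());
    if (idx == 0) {
        return -1;  // unknown interface
    }
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_IP));
    if (fd < 0) {
        return -1;
    }
    sockaddr_ll sll = make_packet_addr(static_cast<int>(idx));
    if (bind(fd, reinterpret_cast<sockaddr*>(&sll), sizeof(sll)) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

With `SOCK_RAW`, each `recv()` on the socket returns a whole frame starting at the Ethernet header, so the receiver has to skip the L2 and IP headers itself.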

If I were going to try and implement this logic within Iperf2, would I fundamentally be making a mistake, meaning it will just not work? You may have already answered this in your previous responses, which I will study. If this is a doomed path, could you point me to a starting point for making a custom client/server test application?

Unrelated question: IPerf is very bursty for different data rates, with packets sometimes separated by a few hundred microseconds, which makes it sometimes difficult to have a network emulation keep up with these bursts. It also seems like the packet duty cycle changes according to the data rate, but the packets remain tightly packed independent of data rate.

Is there any way of controlling the packet separation so that packets are more uniform over time, say separated by a millisecond or so?

Last edit: Bob ONeil 2021-07-26

main.cpp

Robert McMahon - 2021-07-26

You probably want to try the --isochronous option found in the man page.

If it's a receive only thing, then that's not a big change as the existing --l2checks demonstrates. If iperf 2 needs to generate L2 packets, sending over TAP/TUN and supporting the multiple traffic profiles, it's a bit more. I haven't assessed it.

The question is what exactly needs to be tested?

There really isn't much impetus for this other than your requests here. I tend to avoid one off features as they require continuous testing for an extremely small subset of users.

--ipg n set the inter-packet gap to n (units of seconds) for packets or within a frame/burst when --isochronous is set

--isochronous[=fps:mean,stdev] send isochronous traffic with frequency frames per second and load defined by mean and standard deviation using a log normal distribution, defaults to 60:20m,0. (Note: Here the suffixes indicate bytes/sec or bits/sec per use of uppercase or lowercase, respectively. Also the p suffix is supported to set the burst size in packets, e.g. isochronous=2:25p will send two 25 packet bursts every second, or one 25 packet burst every 0.5 seconds.)
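
To get a feel for the numbers involved, the arithmetic behind a uniform packet spacing and a burst period can be sketched as below. This is only back-of-the-envelope math and not how iperf schedules its writes; it counts payload bits only, and the function names are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Gap between packet starts that yields `bits_per_sec` when every packet
// carries `payload_bytes` of payload (header overhead is ignored).
std::chrono::nanoseconds uniform_gap(uint64_t bits_per_sec, uint32_t payload_bytes) {
    assert(bits_per_sec > 0);
    const uint64_t bits_per_packet = static_cast<uint64_t>(payload_bytes) * 8;
    return std::chrono::nanoseconds(
        static_cast<int64_t>(bits_per_packet * 1000000000ULL / bits_per_sec));
}

// Time between bursts for a given number of frames (bursts) per second.
std::chrono::nanoseconds frame_period(unsigned fps) {
    assert(fps > 0);
    return std::chrono::nanoseconds(1000000000LL / fps);
}
```

For example, 8 Mbit/s of 1000-byte payloads works out to one packet per millisecond, and the `2:25p` example from the man page text corresponds to a burst every 0.5 seconds.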

Robert McMahon - 2021-07-26

From a receive perspective, I think it comes down to checking that the device is IFF_TUN, then doing the decode and UDP checksum, shifting the pointers, and sending the result to the normal iperf recv. This seems trivial.

Sending is the tricky part as you'd have to implement your own UDP stack in user space. I think that's a lot of work to get right.

Bob ONeil - 2021-07-27

The only thing that I need is UDP IPv4 testing per my sample commands, client to server, one way communications. Data I need is just drop percentages and latencies.

Notionally, I was thinking it was simply a matter of using the L2 Raw socket in the listen thread instead of the UDP socket, toss away the IP header upon reception, drive the Layer 4 UDP data to the receiver similar to the way that the current UDP socket works.

That is about it: allow the server application to bind to a specified TAP interface and port, then strip and process the UDP content. The client side seems to work as is; that is, I do not need the client to generate L2 packets. It drives my application, which then creates L2 packets that get transmitted on one of the TAP interfaces. So it is a server receive issue only.

But there are things that I do not understand about Iperf operation, and perhaps this oversimplification does not take into account things I do not know.

I did not think it was necessary to do much packet manipulation other than throwing away the L2 header.

Last edit: Bob ONeil 2021-07-27

Bob ONeil - 2021-07-27

I understand not wanting to make changes for one-offs, but if I can get the TAP interface working on the receiver through a raw socket, it really does open up Iperf to being the go-to tool for folks doing network emulation on a single workstation, that is, running multiple instances of a test application alongside one another, with Ethernet interfaces virtualized as TAP interfaces on a Linux server platform.

A TAP interface is the interface used by the popular EMANE network emulation framework, and the entry point for the LTE User Equipment Linux stack just to mention 2 use cases.

Maybe I am not that far from making it work simply swapping out the UDP socket in the listen thread with a raw socket, running Iperf on the server side as root, and then just chopping off the IPv4 header before passing the UDP payload data?

If this were true, that would be a most excellent result and would make Iperf the most suitable tool for this type of testing, especially for ones where the actual system does not permit end to end host connectivity to begin traffic flows.

Thanks for all the help .. I hope with your guidance and direction, I can close the loop on the receive end and get it working.

Last edit: Bob ONeil 2021-07-27
