Per Brian Tierney:
I did some testing with the neper tool from Google, which supports SO_ZEROCOPY/MSG_ZEROCOPY via the -Z flag, and found it made a big difference, so long as the receiver is not CPU limited. See: https://github.com/google/neper
The trick is to use the --skip-rx-copy flag on the server to make sure you are not limited by the receiver.
Here are my test commands and results. (I'm using numactl to pin to the same core every time, as different cores often give different throughput.)
server: numactl -C 5 ./tcp_stream --skip-rx-copy
client: numactl -C 5 ./tcp_stream -c -H 10.10.2.62
result: 37Gbps
client with MSG_ZEROCOPY: numactl -C 5 ./tcp_stream -c -H 10.10.2.62 -Z
result: 47Gbps
Based on this, I think it would be great if both iperf2 and iperf3 supported MSG_ZEROCOPY (and --skip-rx-copy, which sets the MSG_TRUNC flag in recv()).
Looking at the code for this in neper, it should be a pretty easy enhancement for both iperf2/iperf3.
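For anyone following along, the receive-side idea (adding MSG_TRUNC to recv) can be sketched roughly as below. This is only an illustration, not the neper or iperf code: discard_rx and loopback_pair are hypothetical names made up for this example, and it assumes Linux TCP sockets, where tcp(7) documents that MSG_TRUNC discards the received bytes instead of copying them into the caller's buffer.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: consume up to len bytes from a connected TCP socket.
 * On Linux, MSG_TRUNC on a TCP recv() makes the kernel drop the bytes
 * rather than copy them into buf, so buf is left untouched.
 * Returns the number of bytes discarded, 0 on EOF, -1 on error. */
static ssize_t discard_rx(int fd, void *buf, size_t len)
{
    assert(fd >= 0);
    return recv(fd, buf, len, MSG_TRUNC);
}

/* Hypothetical helper for trying this out: build a connected TCP pair
 * over loopback. Returns 0 on success, -1 on failure. */
static int loopback_pair(int *client_fd, int *server_fd)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    int lfd = socket(AF_INET, SOCK_STREAM, 0);

    if (lfd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */

    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0) {
        close(lfd);
        return -1;
    }
    *client_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (*client_fd < 0) {
        close(lfd);
        return -1;
    }
    if (connect(*client_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(*client_fd);
        close(lfd);
        return -1;
    }
    *server_fd = accept(lfd, NULL, NULL);
    close(lfd); /* the listener is no longer needed */
    if (*server_fd < 0) {
        close(*client_fd);
        return -1;
    }
    return 0;
}
```

In a real receiver loop the only change is the extra flag on the existing recv() call. The payload is never visible to the application, so this is only useful for throughput tests where the data content does not matter.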
PS: I'm also testing BIG TCP now too. I'll let y'all know what I find. But BIG TCP needs to be enabled at the system level, so it's not something you can set in iperf.
The --skip-rx-copy commit is here. It seems to work: iperf CPU utilization drops from 53% to 29% with a single TCP stream over 10G.
On:
Off:
Hello, @rjmcmahon
Thank you for adding this feature.
As I understand it, this currently skips only the copy on the RX side.
On the TX side, iperf2 uses write() to send TCP data in the client, and write() does not support zero-copy.
Do you have any plans to support zero-copy on TX (SO_ZEROCOPY/MSG_ZEROCOPY)?
Thanks