Multiple RX threads for Linux
Status: Pre-Alpha
Brought to you by: shihchun
File | Date | Author | Commit |
---|---|---|---|
Makefile.in | 2009-09-27 | shihchun | [r23] 1. make clean before make. |
README | 2009-09-28 | shihchun | [r24] add module description and more details in readme |
build.sh | 2009-09-27 | shihchun | [r23] 1. make clean before make. |
mrxt.c | 2009-11-11 | shihchun | [r25] proper cache alignment |
mrxt.h | 2009-09-26 | shihchun | [r20] 1. Add build.sh to build MRxT and driver. |
------------
Introduction

Multiple Rx Threads (MRxT) is a Linux kernel module designed to help systems
with multiple CPUs achieve better performance with the NAPI framework.

NAPI (New API) was created to reduce interrupt overhead when Gigabit Ethernet
NICs emerged. The basic idea behind NAPI is that a NIC does not need to
interrupt the OS for every incoming packet. Once the kernel has received a
packet in response to an interrupt, it immediately polls the NIC to see whether
any more packets are pending. If there are, the kernel keeps reading packets
until none remain or its limit is exceeded. The advantages of this approach are
that, under high load, interrupts are eliminated and the likelihood of packet
re-ordering is reduced. This turned out to be a huge performance gain compared
to the previous approach.

However, NAPI does not take advantage of multiple cores, because it disables
the interrupt and polls until there are no more packets in the NIC (or the
limit is exceeded). Under high load, one CPU is fully used while the others sit
idle.

----------------
MRxT Design Idea

To make use of the other cores, MRxT creates multiple receive (Rx) threads that
process received packets. Each Rx thread is bound to a particular CPU.

With plain NAPI, a received packet has to be processed immediately, with the
interrupt disabled. For a firewall that does connection tracking and NAT, this
takes a substantial amount of time. With MRxT, the driver only queues the
received packet and quickly polls for the next one. MRxT exports a public
function that the NIC driver calls to store each received packet in an
appropriate receive queue, from which an Rx thread later processes it.

Note that MRxT improves performance in terms of packet rate rather than
bandwidth.

------------------
Module Compilation

To use MRxT, the NIC driver must be patched. build.sh searches for the driver
source files and patches them.

    1. tar zxvf mrxt-0.5.tar.gz
    2. cd mrxt-0.5
    3. ./build.sh <path-to-driver-source> [kernel version]

For example, if you are using the e1000e driver, you can compile MRxT and
e1000e with the following command.

    $ ./build.sh ../e1000e-1.0.2.5/src/

If the kernel version is omitted, the running kernel version is used. The
driver module is placed in tmp_build/

------------
Loading MRxT

MRxT takes two arguments.

    1. cpu_mask - The mask of active Rx threads
    2. irq_mask - The mask of IRQs from which packets are coming.

For example, on an 8-core system, if you want MRxT to run on the first four
CPUs (CPU0 to CPU3) and want CPU4 and CPU5 to receive packets from the NIC, you
can load the module with the following command.

    insmod mrxt.ko cpu_mask=0x0000000f irq_mask=0x00000030

If a packet is received from an IRQ that is not in irq_mask, the packet is
processed immediately instead of being queued for the Rx threads.

--------------
Loading Driver

    insmod tmp_build/<driver>.ko [Driver Arguments]

------------
IRQ affinity

Note that it is important to set the affinity of each NIC IRQ to a particular
CPU. You can first find the IRQ numbers used by your NIC with the command
below.

    grep eth /proc/interrupts

Once you have the IRQ numbers, you can set the affinity with

    echo 00000010 > /proc/irq/18/smp_affinity    (e.g. eth0)
    echo 00000020 > /proc/irq/19/smp_affinity    (e.g. eth1)

----------------
Process affinity

You may want to set the affinity of other processes to CPUs that are not used
by MRxT. This avoids resource contention between user applications and MRxT.
See the man page of taskset for details.

-------------------
Packet re-ordering

One disadvantage of MRxT is possible packet re-ordering due to multiple queues.

------------------
Author and License

MRxT is developed by Shih-Chun Chang <niick.chang@gmail.com> and is licensed
under GPLv2.
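Worked example for "Loading MRxT": the insmod example suggests that bit N of
each mask stands for CPU N (0x0000000f covers CPU0 to CPU3, 0x00000030 covers
CPU4 and CPU5). Under that assumption, the masks can be derived from CPU
numbers with a small shell helper. The name cpus_to_mask is made up for this
sketch and is not part of MRxT.

```shell
#!/bin/sh
# cpus_to_mask: hypothetical helper, not part of MRxT.
# Prints a 32-bit hex mask with one bit set for each CPU number given,
# assuming bit N of the mask corresponds to CPU N.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '0x%08x\n' "$mask"
}

cpus_to_mask 0 1 2 3    # Rx threads on CPU0 to CPU3
cpus_to_mask 4 5        # NIC IRQs handled on CPU4 and CPU5
```

With the CPU numbers above, this prints 0x0000000f and 0x00000030, the values
used in the insmod example.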
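Worked example for "IRQ affinity": the two manual steps (look up the NIC IRQ
numbers, then write a one-CPU mask to smp_affinity) could be scripted roughly
as below. This is only a sketch. The helper names nic_irqs and
affinity_for_cpu are made up, and assigning consecutive CPUs starting at CPU4
is an assumption that mirrors the eth0/eth1 example. The script prints the
echo commands instead of running them, so it changes nothing.

```shell
#!/bin/sh
# nic_irqs: hypothetical helper, not part of MRxT.
# Reads /proc/interrupts-style text on stdin and prints the IRQ number
# of every line that mentions the given interface name.
nic_irqs() {
    awk -v ifname="$1" 'index($0, ifname) { sub(/:$/, "", $1); print $1 }'
}

# affinity_for_cpu: hypothetical helper.
# Prints the smp_affinity value that selects exactly one CPU.
affinity_for_cpu() {
    printf '%08x\n' $(( 1 << $1 ))
}

# Dry run: print one command per NIC IRQ, starting at CPU4 (assumption).
if [ -r /proc/interrupts ]; then
    cpu=4
    for irq in $(nic_irqs eth < /proc/interrupts); do
        echo "echo $(affinity_for_cpu "$cpu") > /proc/irq/$irq/smp_affinity"
        cpu=$(( cpu + 1 ))
    done
fi
```

Review the printed commands before running them as root. The CPUs chosen here
should be the ones covered by irq_mask.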
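Worked example for "Process affinity": to keep user applications away from
MRxT, one option is to compute the mask of CPUs that appear in neither
cpu_mask nor irq_mask and pass it to taskset. The helper name free_cpu_mask
and the program name ./my_app are made up for this sketch. Treating the IRQ
CPUs as "used by MRxT" is an assumption.

```shell
#!/bin/sh
# free_cpu_mask: hypothetical helper, not part of MRxT.
# Usage: free_cpu_mask <ncpus> <cpu_mask> <irq_mask>
# Prints the mask of CPUs that are set in neither of the two MRxT masks.
free_cpu_mask() {
    all=$(( (1 << $1) - 1 ))      # one bit per CPU in the system
    used=$(( $2 | $3 ))           # CPUs taken by Rx threads or NIC IRQs
    printf '0x%08x\n' $(( all & ~used ))
}

# 8-core system with the masks from the insmod example.
free_cpu_mask 8 0x0000000f 0x00000030

# The result can then be given to taskset, for example:
#   taskset 0x000000c0 ./my_app       start a program on CPU6 and CPU7
#   taskset -p 0x000000c0 <pid>       move an already running process
```

For the 8-core example this prints 0x000000c0, that is CPU6 and CPU7.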