I would like to report a strange behaviour observed when a virtual bridge is used to interconnect an external VLAN trunk to virtual machines, along with a workaround that avoids the problem.
Scenario
--------
Let's consider the scenario shown in the attached figure. It is composed of two virtual machines, connected in bridge mode to a trunk port on an external switch (layer 2 interconnection). This means that the traffic exchanged between the virtual machines and the external switch is VLAN tagged. Therefore, the eth1 interface in each virtual machine is a trunk interface, and a sub-interface is configured for each VLAN to be used (as an example, we consider two VLANs: 200 and 201).
The VNUML spec is below.
<vnuml>
<global>
<version>1.7</version>
<simulation_name>test</simulation_name>
<ssh_version>2</ssh_version>
<automac/>
<netconfig stp="on" />
<vm_mgmt type="net" network="1.0.0.0" mask="24" offset="0">
<mgmt_net sock="/var/local/run/vnuml/Mgmt_net.ctl" hostip="1.0.0.1"
/>
</vm_mgmt>
<vm_defaults>
<filesystem
type="cow">/usr/local/share/vnuml/filesystems/root_fs_tutorial</filesystem>
<kernel>/usr/local/share/vnuml/kernels/linux</kernel>
</vm_defaults>
</global>
<net name="corevl" mode="virtual_bridge" external="eth1" />
<vm name="vm0">
<mem>128M</mem>
<if id="1" net="corevl" />
<exec type="verbatim" seq="up">modprobe 8021q</exec>
<exec type="verbatim" seq="up">vconfig add eth1 200</exec>
<exec type="verbatim" seq="up">vconfig add eth1 201</exec>
<exec type="verbatim" seq="up">ifconfig eth1.200 10.1.200.10 netmask 255.255.255.0</exec>
<exec type="verbatim" seq="up">ifconfig eth1.201 10.1.201.10 netmask 255.255.255.0</exec>
<exec type="verbatim" seq="up">route add default gw 10.1.200.1</exec>
</vm>
<vm name="vm1">
<mem>128M</mem>
<if id="1" net="corevl" />
<exec type="verbatim" seq="up">modprobe 8021q</exec>
<exec type="verbatim" seq="up">vconfig add eth1 200</exec>
<exec type="verbatim" seq="up">vconfig add eth1 201</exec>
<exec type="verbatim" seq="up">ifconfig eth1.200 10.1.200.11 netmask 255.255.255.0</exec>
<exec type="verbatim" seq="up">ifconfig eth1.201 10.1.201.11 netmask 255.255.255.0</exec>
<exec type="verbatim" seq="up">route add default gw 10.1.200.1</exec>
</vm>
</vnuml>
An internal management network (1.0.0.0/24) is being used, based on a uml_switch established with the following commands (run on the host, before building the scenario):
tunctl -u vnuml -t tap0
ifconfig tap0 1.0.0.1 netmask 255.255.255.0 up
su -pc 'uml_switch -tap tap0 -unix /var/local/run/vnuml/Mgmt_net.ctl < /dev/null > /dev/null &' vnuml
chmod g+rw /var/local/run/vnuml/Mgmt_net.ctl
The scenario is built with (-u root is optional, actually):
vnumlparser.pl -t test.xml -u root -v
vnumlparser.pl -x up@test.xml -v
Problem
-------
The problem arises when sending large streams of traffic to the virtual machines from external equipment. For example, an scp copy of a 1MB file from 10.1.200.15 (an external PC) to 10.1.200.10 (vm0). The first bytes of the stream seem to arrive at vm0, but after a while the transmission stops (for example, in the scp case you can see that 192KB have been transmitted, but the progress bar stalls after that).
In fact, the problem is that maximum-size packets (that is, 1500 bytes, the MTU of eth1) never arrive at the virtual machine. Note that the problem is quite subtle: a ping to 10.1.200.10 from 10.1.200.15 works fine, simply because ping echo requests and replies are much shorter than the 1500-byte maximum. Interactive ssh sessions seem to work for the same reason.
Surprisingly, in the opposite direction (scp copy of the 1MB file from the virtual machine to the external PC) everything seems to work fine.
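To make the sizes concrete, here is a small sketch of the arithmetic. It assumes IPv4 without options and the default ping payload of 56 bytes; the constant and function names are mine, chosen only for illustration. Frame lengths are counted without the trailing FCS.

```python
# Frame-size arithmetic for the scenario (lengths exclude the 4-byte FCS).
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
VLAN_TAG = 4      # 802.1Q tag inserted after the source MAC
IP_HEADER = 20    # IPv4 header without options (assumption)
ICMP_HEADER = 8   # ICMP echo header

PING_PACKET = IP_HEADER + ICMP_HEADER + 56  # default ping payload (assumption)
FULL_PACKET = 1500                          # IP packet that fills the eth1 MTU


def untagged_frame_len(ip_packet_len):
    """Length on the wire of a conventional Ethernet frame."""
    return ETH_HEADER + ip_packet_len


def tagged_frame_len(ip_packet_len):
    """Length on the wire of the same frame carrying one VLAN tag."""
    return ETH_HEADER + VLAN_TAG + ip_packet_len


if __name__ == "__main__":
    for name, size in (("ping", PING_PACKET), ("full-size", FULL_PACKET)):
        print(name, "IP packet:", size,
              "untagged frame:", untagged_frame_len(size),
              "tagged frame:", tagged_frame_len(size))
```

Under these assumptions a default ping stays far below any limit, while a full-size packet is the only kind that goes over the size of a conventional frame once it is tagged.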
Solution
--------
The problem seems to be due to the fact that tagged frames are 1518 bytes long, instead of the 1514 bytes of conventional Ethernet frames, that is, 4 bytes more (these bytes are the VLAN tag itself). The solution is to increase the MTU of eth1 in the virtual machines by 4, that is, to add the following <exec> just after the modprobe one:
<exec type="verbatim" seq="up">modprobe 8021q</exec>
+<exec type="verbatim" seq="up">ifconfig eth1 mtu 1504</exec>
<exec type="verbatim" seq="up">vconfig add eth1 200</exec>
I think this is a very surprising behaviour, and I don't know exactly what the cause is. However, I suspect an issue in the UML software, in the low-level network processing. It seems that it isn't "VLAN aware", so when it receives the frame (1518 bytes) on eth1 it makes a calculation based on the total length of the frame minus the header length (14 bytes). The result is 1504, which is bigger than 1500 bytes, so the frame is discarded. The right ("VLAN aware") behaviour would be to consider the header length *and* the VLAN tag length when a tag is present in the frame (some decoding function is needed to detect that the frame has a VLAN tag).
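The hypothesis can be sketched as follows. This is only an illustration of the check I suspect, not the actual UML code, which I have not inspected; the function names are hypothetical. The only external fact used is that an 802.1Q tag is announced by the value 0x8100 in the EtherType position.

```python
# Hypothetical receive-side length checks (NOT taken from the UML sources).
ETH_HEADER = 14        # destination MAC + source MAC + EtherType
VLAN_TAG = 4           # 802.1Q tag length
TPID_8021Q = 0x8100    # value found at bytes 12-13 of a tagged frame


def naive_accepts(frame, mtu):
    """Suspected behaviour: always subtract a fixed 14-byte header."""
    return len(frame) - ETH_HEADER <= mtu


def vlan_aware_accepts(frame, mtu):
    """Expected behaviour: also subtract the tag when the frame has one."""
    header = ETH_HEADER
    if int.from_bytes(frame[12:14], "big") == TPID_8021Q:
        header += VLAN_TAG
    return len(frame) - header <= mtu


def make_tagged_frame(ip_packet_len, vlan_id=200):
    """Build a dummy tagged frame (zeroed addresses and payload)."""
    macs = bytes(12)
    tag = TPID_8021Q.to_bytes(2, "big") + vlan_id.to_bytes(2, "big")
    ethertype = (0x0800).to_bytes(2, "big")  # IPv4
    return macs + tag + ethertype + bytes(ip_packet_len)


if __name__ == "__main__":
    frame = make_tagged_frame(1500)            # 1518 bytes in total
    print(naive_accepts(frame, 1500))          # suspected check, default MTU
    print(naive_accepts(frame, 1504))          # suspected check, workaround MTU
    print(vlan_aware_accepts(frame, 1500))     # VLAN-aware check, default MTU
```

If the hypothesis is right, the first check rejects the full-size tagged frame with the default MTU and accepts it once the MTU is raised to 1504, which matches what I observe with the workaround.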
Has anybody observed a similar behaviour, or can anybody reproduce it? Any idea of what the actual cause is (I mean, whether my hypothesis is right or wrong)?