From: Lars E. <Lar...@li...> - 2004-10-03 02:36:14
|
in the upcoming new heartbeat design of linux-ha, we use relatively large UDP packets (passing large XML blobs back and forth between the nodes). now, cluster development and testing is convenient on UML.

but: UML seems to silently and systematically lose all fragmented UDP packets, that is, packets larger than the MTU, and a 1480-byte max for XML blobs just does not work out too well...

I investigated a little bit (using two simple perl snippets to generate/[not] receive the UDP packets), and it turns out that, sending large (up to 64K; fragmented) UDP packets using tuntap:

  HOST -> UML   works.
  UML  -> HOST  nope :(
  UML <-> UML   nope :(  [neither with mcast nor other transports]

looking into /proc/net/snmp on the UMLs and the HOST shows, on the non-receiving side, an increase of InHdrErrors! (the packet is never reassembled into a proper UDP datagram)

this is easy to reproduce (because it just happens all the time). tested with 2.6.6 and 2.6.8 plus the respective UML patches.

it seems to me that UML corrupts the IP header of fragmented UDP packets somehow at sending time.

I wonder: if someone used NFS over UDP on UML, this should be a long-known issue and turn up loads of hits in a search. I did not find a single reference to that problem, though.

the way it is now, we probably need to reanimate some of our old boxes to form a real test cluster. and believe me, that is no fun :( if some kind soul would be able to fix that... it would make cluster testing as we do it so much more convenient :-)

thanks,

Lars Ellenberg

please CC me, I'm not subscribed to this list. |
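The reproduction described above — send one UDP datagram larger than the MTU, watch whether it ever arrives — can be sketched in C as a stand-in for the Perl snippets (the function name and the 4000-byte size are illustrative, not from the original scripts; loopback is used here, so real fragmentation only happens if the MTU is lowered accordingly):

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Send one large UDP datagram to a local receiver and return how many
 * bytes arrive (-1 on error or timeout).  On a path that silently drops
 * fragments -- as the UML -> host path did -- the receive would time
 * out instead of delivering the reassembled datagram. */
int send_large_udp(int payload_len)
{
    static char buf[65507];            /* max UDP payload over IPv4 */
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    struct timeval tv = { 2, 0 };      /* don't hang if fragments die */

    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx < 0 || tx < 0)
        return -1;
    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                 /* let the kernel pick a port */
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    if (getsockname(rx, (struct sockaddr *)&addr, &alen) < 0)
        return -1;

    memset(buf, 'x', sizeof(buf));
    if (sendto(tx, buf, payload_len, 0,
               (struct sockaddr *)&addr, sizeof(addr)) != payload_len)
        return -1;

    int n = recvfrom(rx, buf, sizeof(buf), 0, NULL, NULL);
    close(tx);
    close(rx);
    return n;
}
```

To reproduce the cross-host case, the receive half would run on one node and the send half on the other, while watching InHdrErrors in /proc/net/snmp on the receiving side.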
From: BlaisorBlade <bla...@ya...> - 2004-10-03 15:17:19
|
On Sunday 03 October 2004 04:36, Lars Ellenberg wrote:
> in the upcoming new heartbeat design of linux-ha, we use relatively
> large udp packets (passing large xml blobs back and forth between the
> nodes). now, cluster development and testing is convenient on UML.
>
> but: uml seems to silently and systematically lose all fragmented
> udp packets, that is packets larger than the mtu [...]
>
> sending large (up to 64K; fragmented) udp packets using tuntap:
> HOST -> UML works.
> UML -> HOST nope :(
> UML <-> UML nope :( [neither with mcast nor other]
>
> looking into /proc/net/snmp on the UMLs and the HOST
> shows on the non-receiving side an increase of InHdrErrors!
> (it never is reassembled into a proper udp packet)
>
> this is easy to reproduce (because it just happens all the time)
> tested with 2.6.6 and 2.6.8 plus respective UML patches.

Have you tried using different UML transports (ethertap/uml_switch), tcpdump'ing the traffic, or other things? Changing the host version? Increasing the MTU somewhere (but that seems not to work)?

What is strange is that the UML code does not even parse the IP header when using tuntap - it only works at the Ethernet layer. IIRC, the fragmentation happens at the IP layer...

Also, can you post the scripts you use?

> it seems to me that UML corrupts the ip header of
> fragmented udp packets somehow at sending time.
>
> I wonder, if someone uses NFS over UDP on uml, this should
> be a long known issue and turn up loads of hits in a search.
> I did not find a single reference to that problem, though.

Does NFS use large UDP packets?

> thanks,
> Lars Ellenberg
> please CC me, I'm not subscribed on this list.

--
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729 |
From: Lars E. <Lar...@li...> - 2004-10-03 19:29:01
Attachments:
UdpRecv.pl
UdpSend.pl
|
/ 2004-10-03 17:11:26 +0200 \ BlaisorBlade:
> On Sunday 03 October 2004 04:36, Lars Ellenberg wrote:
> [...]
> Have you tried to use different UML transports (ethertap/uml_switch)-

tuntap with uml_switch, uml_switch -hub, multicast

> tcpdump'ing the traffic - other things?

host tcpdump sees packets, and they appear to be correct.
tcpdump in the UML segfaults on the second fragment of a two-fragment UDP packet :-)

> Changing the host version?

host linux kernels: 2.4.21-suse something, 2.4.26, 2.6.6, 2.6.8

> Increasing the MTU somewhere (but seems not to work)?

right. does not work.

> What is strange is that the UML code does not even parse the IP header
> when using tuntap - it only works at the Ethernet layer.
> IIRC, the fragmentation happens at the IP layer...

hm... I only describe what I see.

> Also, can you post the scripts you use?

attached.

on one side:                                        perl UdpRecv.pl &
on the other UML, the host, or on the same system:  perl UdpSend.pl [<target ip>]

btw, UML to itself via the UML lo does work...

> > it seems to me that UML corrupts the ip header of
> > fragmented udp packets somehow at sending time.
> >
> > I wonder, if someone uses NFS over UDP on uml, this should
> > be a long known issue and turn up loads of hits in a search.
> > I did not find a single reference to that problem, though.
>
> Does NFS use large UDP packets?

sometimes. and yes, I just exported from UML, NFS-mounted on the host, did an ls in a directory with MANY files, and it never came back (the UDP never reached the host; the host increases InHdrErrors). the block size of the exported file system was 1024 and the MTU is 1500, so one would assume it should just work... but that is still another problem.

thanks, lge |
From: Gerd K. <kr...@by...> - 2004-10-04 11:20:18
|
BlaisorBlade <bla...@ya...> writes:
> > I wonder, if someone uses NFS over UDP on uml, this should
> > be a long known issue and turn up loads of hits in a search.
> > I did not find a single reference to that problem, though.
>
> Does NFS use large UDP packets?

Looks like it does. I can ack that issue for the NFS case. NFS over TCP does fine (which seems to be the default, so I didn't notice until now). NFS over UDP works, but is very slow, and I get plenty of "nfs server not responding" + "nfs server ok" messages in the syslog. Looks like it doesn't lose all packets, but enough to slow it down drastically and trigger timeouts on the client side.

That is a (kernel) NFS server on the host machine, with UML connected via tuntap networking and mounting /home using NFS.

Gerd

--
return -ENOSIG; |
From: Henrik N. <um...@hn...> - 2004-10-04 11:57:13
|
On Mon, 4 Oct 2004, Gerd Knorr wrote:
>> Does NFS use large UDP packets?
>
> Looks like it does.

The NFS message size is normally 4096 bytes + message headers + protocol layers. The rsize/wsize mount parameters have a part in this (they set the NFS data payload size within the NFS RPC message over UDP).

Regards
Henrik |
From: BlaisorBlade <bla...@ya...> - 2004-10-06 18:02:50
|
On Sunday 03 October 2004 04:36, Lars Ellenberg wrote:
> uml seems to silently and systematically lose all fragmented
> udp packets, that is packets larger than the mtu [...]
>
> this is easy to reproduce (because it just happens all the time)
> tested with 2.6.6 and 2.6.8 plus respective UML patches.
>
> it seems to me that UML corrupts the ip header of
> fragmented udp packets somehow at sending time.

I've traced this with Ethereal (v0.10.5) running on tap0, and it complains that the IP header checksum is always incorrect when the packet is fragmented. This does not happen when running both programs on the host; I set an MTU of 1500 on "lo" for this test.

However, it seems that Ethereal always shows the UDP checksum (which is a different thing) as incorrect for unfragmented packets when they are sent over the "lo" link (on my 2.6.7 host kernel); by comparison, when sending them over the local network it never complains. The Ethereal docs say that when capturing on an interface that supports TCP checksum offloading (i.e. hardware checksumming), this is normal for TCP checksums, so I guess this can happen for UDP checksums, too.

But why should the loopback driver mark itself as capable of doing "hardware checksumming"? It seems that this is actually the situation: in the source code, the loopback driver is marked as "needing no checksum at all because it's safe" (see NETIF_F_NO_CSUM in include/linux/skbuff.h).

Also, it seems that the UML code never specifies what checksum support it has. And this could help us. include/linux/skbuff.h describes the checksum flags, and UML does not use them: these two commands return no output.

  find arch/um/ -name '*.[ch]'|xargs grep NETIF
  find arch/um/ -name '*.[ch]'|xargs grep CHECKSUM

Actually, I've never done any work *at all* on the networking code, so this is just a wild guess.

> the way it is now we probably need to reanimate some of our old
> boxes to form a real test cluster. and believe me, that is no fun :(

I've tried UML 2.4, and it does not seem to have this bug: it does not increase the host error count in /proc/net/snmp, UdpRecv receives all packet sizes (I stopped the test at 49100 bytes), and even Ethereal shows correct data. The tests were run sending the packets from UML to the host, as you describe. So this could help you for now, while we try to find a clue about this.

Quite frankly, I must say that I'm not seeing any network kernel hacker here (correct me if I'm wrong), so it will take some time to debug it. Maybe Gerd Knorr is an exception, actually.

> if some kind soul would be able to fix that...
> would make cluster testing as we do it
> so much more convenient :-)

--
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729 |
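The check the receiver is failing on is the one's-complement sum over the IP header that the kernel's ip_fast_csum() computes in assembly; a portable C sketch (helper names here are illustrative, not kernel API) shows the property the InHdrErrors counter guards: summing a header together with its stored checksum must yield zero, and any corruption of the header bytes breaks that.

```c
#include <assert.h>
#include <stdint.h>

/* One's-complement 16-bit sum over an IPv4 header of `ihl` 32-bit
 * words; a portable stand-in for the kernel's ip_fast_csum(). */
uint16_t ip_header_csum(const uint8_t *hdr, int ihl)
{
    uint32_t sum = 0;
    for (int i = 0; i < ihl * 2; i++) {          /* ihl*2 16-bit words */
        sum += (uint32_t)(hdr[2 * i] << 8 | hdr[2 * i + 1]);
        sum = (sum & 0xffff) + (sum >> 16);      /* fold the carry */
    }
    return (uint16_t)~sum;
}

/* Fill in the checksum field (bytes 10-11) of a header whose checksum
 * field is currently zero; afterwards, re-checksumming the whole
 * header yields 0, which is exactly what the receiver verifies. */
void ip_header_set_csum(uint8_t *hdr, int ihl)
{
    uint16_t c = ip_header_csum(hdr, ihl);
    hdr[10] = c >> 8;
    hdr[11] = c & 0xff;
}
```

A sender-side bug that emits a header whose bytes no longer match the checksum it filled in — which is what the Ethereal trace suggests UML was doing for fragments — makes the receiver take the inhdr_error path and silently drop the fragment.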
From: Lars E. <Lar...@li...> - 2004-10-06 18:48:20
|
> So this could help you for now, while we try to find a clue about this. Quite
> frankly, I must say that I'm not seeing any network kernel hacker here
> (correct me if I'm wrong), so it will take some time to debug it. Maybe Gerd
> Knorr is an exception, actually.

Well, then I take Andi Kleen and Lars Marowsky-Brée into CC for now. Lars, because I expect him to be interested in having UML available as a full-featured cluster simulation tool, and Andi because I hope he knows the network code much better than I do...

FYI, the full thread can be found for example at
http://thread.gmane.org/gmane.linux.uml.devel/4607

Thanks,
Lars Ellenberg

/ 2004-10-06 19:50:46 +0200 \ BlaisorBlade:
> On Sunday 03 October 2004 04:36, Lars Ellenberg wrote:
> [...]
> I've traced this with Ethereal (v0.10.5) running on tap0 and it complains that
> the IP header checksum is always incorrect when the packet is fragmented.
> [...]
> Also, it seems that the UML code happily ignores specifying what checksum
> support. And this could help us.
>
> include/linux/skbuff.h describes the Checksum flags, and UML does not use
> them: these two commands return no output.
>
>   find arch/um/ -name '*.[ch]'|xargs grep NETIF
>   find arch/um/ -name '*.[ch]'|xargs grep CHECKSUM
>
> Actually I've never done any work *at all* on the networking code, so this is
> just a wild guess.
> [...]
> I've tried UML 2.4, and it does not seem to experience this bug [...]
>
> > if some kind soul would be able to fix that...
> > would make cluster testing as we do it
> > so much more convenient :-)
>
> --
> Paolo Giarrusso, aka Blaisorblade
> Linux registered user n. 292729 |
From: Andi K. <ak...@su...> - 2004-10-06 20:38:49
|
On Wed, Oct 06, 2004 at 08:48:23PM +0200, Lars Ellenberg wrote:
> > So this could help you for now, while we try to find a clue about this. Quite
> > frankly, I must say that I'm not seeing any network kernel hacker here
> > (correct me if I'm wrong), so it will take some time to debug it. Maybe Gerd
> > Knorr is an exception, actually.
>
> Well, then I take Andi Kleen and Lars Marowsky-Brée into CC for now.
> Lars, because I expect him to be interested in having UML as full
> featured cluster simulation tool available, and Andi because I hope he
> might know the network code much better than me...
>
> FYI, full thread can be found for example at
> http://thread.gmane.org/gmane.linux.uml.devel/4607

Paolo's analysis is basically correct. Loopback sets this flag for better performance. Actually, in 2.6 it probably doesn't help very much anymore, because TCP can do checksum-copy on RX now, and that would get the checksum basically for free. But it's still there and may still make things slightly faster.

If UML taps the packets from lo, it will see incorrect checksums.

Using a tun or ethertap device would avoid this. In the worst case you could also just delete the flag from the loopback interface; it's only an optimization.

-Andi |
From: Lars E. <Lar...@li...> - 2004-10-06 21:49:33
|
/ 2004-10-06 22:35:39 +0200 \ Andi Kleen:
> On Wed, Oct 06, 2004 at 08:48:23PM +0200, Lars Ellenberg wrote:
> [...]
> Paolo's analysis is basically correct. loopback sets this flag
> for better performance. Actually in 2.6 it probably doesn't help
> very much anymore because TCP can do checksum copy RX now, and that
> would get the checksum basically for free. But it's still there
> and may still make things slightly faster.
>
> If UML taps the packets from lo it will see incorrect checksums.
>
> Using a tun or ethertap device would avoid this. In the worst
> case you could also just delete the flag from the loopback
> interface, it's only an optimization.
>
> -Andi

unfortunately the ethertap transport does not work either, at least if UML is 2.6.6 and the host kernel is 2.4.21-suse-whatever... I did not try other combinations yet, but I doubt that changes a thing.

you suggest that we remove NETIF_F_NO_CSUM from lo on the host? ok, I'll try recompiling my host then, and follow up if that helps.

lge |
From: BlaisorBlade <bla...@ya...> - 2004-10-07 18:41:20
|
On Wednesday 06 October 2004 23:48, Lars Ellenberg wrote:
> / 2004-10-06 22:35:39 +0200 \ Andi Kleen:
> [...]
> > If UML taps the packets from lo it will see incorrect checksums.

It does not, so that solution is not the right one.

> > Using a tun or ethertap device would avoid this. In the worst
> > case you could also just delete the flag from the loopback
> > interface, it's only an optimization.
> >
> > -Andi
>
> unfortunately ethertap transport does not work either,
> at least if UML is 2.6.6 and host kernel is 2.4.21-suse-whatever...
> I did not try other combinations yet, but I doubt that changes a thing.
>
> you suggest that we remove NETIF_F_NO_CSUM from lo in the host?
> ok, I'll try recompile my host then, and followup if that helps.

No, I think he was speaking about the guest; also, he misunderstood the problem a bit, since the packets do not go through the "lo" interface inside UML.

--
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729 |
From: BlaisorBlade <bla...@ya...> - 2004-10-07 18:41:56
|
On Wednesday 06 October 2004 22:35, Andi Kleen wrote:
> On Wed, Oct 06, 2004 at 08:48:23PM +0200, Lars Ellenberg wrote:
> [...]
> Paolo's analysis is basically correct. loopback sets this flag
> for better performance. Actually in 2.6 it probably doesn't help
> very much anymore because TCP can do checksum copy RX now, and that
> would get the checksum basically for free. But it's still there
> and may still make things slightly faster.

First of all: thanks a lot for your quick answer.

My discussion about "lo" was slightly unrelated to the exact problem, and a bit confusing... I was at first surprised by Ethereal complaining about the host kernel, so I thought I could have a buggy Ethereal, and then went and checked that it is indeed a Linux optimization.

> If UML taps the packets from lo it will see incorrect checksums.
> Using a tun or ethertap device would avoid this.
> In the worst
> case you could also just delete the flag from the loopback
> interface, it's only an optimization.

No, inside the UML kernel the packets go through a virtual "ethN" interface, which uses special code. That driver, in turn, will use either ethertap, or TAP (it sends Ethernet frames), or even another mechanism. You can find it (in 2.6.9-rc2 at least) in arch/um/drivers/net_*.c and arch/um/os-Linux/drivers/*tap*.c. The code in the *_kern.c files links against the kernel API and includes; the *_user.c code links against the host userspace includes.

And the problem is, probably, that the UML network drivers never declare their checksumming status, as I said in the previous mail:

[quote]
include/linux/skbuff.h describes the checksum flags, and UML does not use them: these two commands return no (relevant) output.

  find arch/um/ -name '*.[ch]'|xargs grep NETIF
  find arch/um/ -name '*.[ch]'|xargs grep CHECKSUM
[/quote]

Also, it's possible that there are still other bugs...

--
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729 |
From: Lars E. <Lar...@li...> - 2004-10-07 20:25:47
|
/ 2004-10-07 20:41:51 +0200 \ BlaisorBlade:
> On Wednesday 06 October 2004 22:35, Andi Kleen wrote:
> [...]
> And the problem is, probably, that the UML network drivers never declare their
> checksumming status, as I said in the previous mail:
>
> [quote]
> include/linux/skbuff.h describes the checksum flags, and UML does not use
> them: these two commands return no (relevant) output.
>
>   find arch/um/ -name '*.[ch]'|xargs grep NETIF
>   find arch/um/ -name '*.[ch]'|xargs grep CHECKSUM
> [/quote]
>
> Also, it's possible that there are even other bugs...

now, what I found:

arch/um/drivers/net_kern.c:

	struct sk_buff *ether_adjust_skb(struct sk_buff *skb, int extra)
	{
		if((skb != NULL) && (skb_tailroom(skb) < extra)){
			struct sk_buff *skb2;

			skb2 = skb_copy_expand(skb, 0, extra, GFP_ATOMIC);
			dev_kfree_skb(skb);
			skb = skb2;
		}
		if(skb != NULL) skb_put(skb, extra);
		return(skb);
	}

net/core/skbuff.c:

	/* BUG ALERT: ip_summed is not copied. Why does this work? Is it used
	 * only by netfilter in the cases when checksum is recalculated? --ANK
	 */
	struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
					int newheadroom, int newtailroom, int gfp_mask)
	{

does that trigger something in someone's brain, maybe? someone "sees" it? otherwise I'll keep poking around...

lge |
From: Lars E. <Lar...@li...> - 2004-10-11 17:54:24
|
/ 2004-10-07 22:24:54 +0200 \ Lars Ellenberg:
> now, what I found:
> arch/um/drivers/net_kern.c:
> [...]
> does that trigger something in someone's brain, maybe?
> someone "sees" it?
>
> otherwise I'll keep poking around...

since UML => Host does not work, but Host => UML does, this suggests that the bug is somewhere on the sending side. I was not able to track it down.

but I just patched out the "verify checksum" part on the receiving side, and now get 16000 bytes through, sometimes more (this seems to be a buffer issue). so for now, I just don't care about the IP checksum on my UMLs, and I have to live with no fragmented UDP from UML => Host. but between my UMLs I now have up to 16k UDP, and that should be enough for the time being.

brute but effective: don't care about checksums. at first I had additional printks in the code wherever it said "goto inhdr_error", to find where exactly it breaks. only the now #if 0'ed one triggered.

Lars Ellenberg

--- linux-2.6.6/net/ipv4/ip_input.c.orig	2004-10-11 19:35:57.000000000 +0200
+++ linux-2.6.6/net/ipv4/ip_input.c	2004-10-11 19:53:58.000000000 +0200
@@ -403,8 +403,10 @@

 	iph = skb->nh.iph;

+#if 0
 	if (ip_fast_csum((u8 *)iph, iph->ihl) != 0)
 		goto inhdr_error;
+#endif

 	{
 		__u32 len = ntohs(iph->tot_len); |
From: Lars E. <Lar...@li...> - 2004-10-12 00:02:34
|
/ 2004-10-11 19:55:12 +0200 \ Lars Ellenberg:
> since UML => Host does not work, but
> Host => UML does, this suggests that
> the bug is somewhere on the sending side.
>
> I was not able to track it down.

just so you know, finally:

=======================
--- linux-2.6.6/arch/um/include/sysdep-i386/checksum.h.orig	2004-10-12 01:50:49.000000000 +0200
+++ linux-2.6.6/arch/um/include/sysdep-i386/checksum.h	2004-10-12 01:50:58.000000000 +0200
@@ -102,8 +102,7 @@
 	   are modified, we must also specify them as outputs, or gcc
 	   will assume they contain their original values. */
 	: "=r" (sum), "=r" (iph), "=r" (ihl)
-	: "1" (iph), "2" (ihl)
-	: "memory");
+	: "1" (iph), "2" (ihl));
 	return(sum);
 }
=======================

that's all, folks. only a missing memory clobber.

WTF :-/

the same patch applies to 2.6.8.1-uml and probably all other UMLs. since that is a one-to-one copy anyway, maybe UML should better use the original (in include/asm-i386/checksum.h) right away?? there may be similar bugs hiding in various areas of UML...

Thanks for now, keep it going...

btw, anyone want to give me a hint on how to tune it best? maybe how to raise the MTU of the UML "nics"?

Lars Ellenberg

"NLRge your UML-UDP" ... |
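The root cause here is a missing "memory" clobber on the checksum asm: without it, GCC assumes the asm touches only its listed operands, so it may keep header bytes cached in registers or sink stores past the asm, and the checksum gets computed over stale data. The effect of the clobber can be sketched with GCC's canonical compiler barrier (this is an illustration of the mechanism, not the kernel's actual code):

```c
#include <assert.h>
#include <stdint.h>

/* GCC's canonical compiler barrier: an asm that emits no instructions
 * but declares, via the "memory" clobber, that it may read or write
 * any memory.  The compiler must therefore flush pending stores before
 * it and reload memory after it -- the guarantee the UML checksum asm
 * lacked once newer gccs started optimizing more aggressively. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* A toy byte sum standing in for a checksum routine; deliberately free
 * of volatile so the compiler may cache values in registers. */
uint32_t sum_bytes(const uint8_t *p, int n)
{
    uint32_t s = 0;
    for (int i = 0; i < n; i++)
        s += p[i];
    return s;
}

/* Store into the header, then checksum it.  The barrier documents (and
 * enforces) that the store must be visible in memory before anything
 * that reads the header through a pointer the compiler cannot track. */
uint32_t checksum_after_update(uint8_t *hdr, int n)
{
    hdr[0] = 0x45;   /* e.g. version/IHL byte of an IPv4 header */
    barrier();
    return sum_bytes(hdr, n);
}
```

Note that the patch as posted above is reversed (it shows the clobber being removed); the actual fix adds `: "memory"` to the clobber list, as the follow-ups point out.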
From: Andi K. <ak...@su...> - 2004-10-12 00:18:40
|
> will assume they contain their original values. */
> 	: "=r" (sum), "=r" (iph), "=r" (ihl)
> -	: "1" (iph), "2" (ihl)
> -	: "memory");
> +	: "1" (iph), "2" (ihl));
> 	return(sum);
> }
>
> =======================

That's reverted, right?

> that's all, folks. only a missing memory barrier.
>
> WTF :-/

This was fixed in mainline some time ago (several months, probably more). The problem only started with newer gccs, which optimize more aggressively.

> maybe UML should better use the
> original (in include/asm-i386/checksum.h) right away??
> there may be similar bugs hiding in various areas of uml...

Sounds like a good idea.

> Thanks for now,
> keep it going...
>
> btw,
> anyone wants to give me a hint how to tune it best?
> maybe how to up the mtu of the UML "nics"?

Don't go over 4K, because the VM doesn't like >order-0 allocations very much. But in general, bigger is better.

-Andi |
From: BlaisorBlade <bla...@ya...> - 2004-10-12 01:10:55
|
On Tuesday 12 October 2004 02:11, Andi Kleen wrote:
> [...]
> That's reverted, right?
>
> This was fixed in mainline some time ago (several months probably more)
> The problem only started with newer gccs that optimize more aggressively.
>
> > original (in include/asm-i386/checksum.h) right away??
> > there may be similar bugs hiding in various areas of uml...
>
> Sounds like a good idea.

Agreed, but I have to check that the include does not have any conflicts. And I don't have the time until after 2.6.9, because I must address more urgent issues. Obviously the one-liner itself is being sent to Andrew Morton.

> > btw,
> > anyone wants to give me a hint how to tune it best?
> > maybe how to up the mtu of the UML "nics"?
>
> Don't go over 4K because the VM doesn't like >order 0 allocations
> very much. But in general bigger is better.

Sadly there is a problem: since we use TAP and emulate whole Ethernet frames, the code does not allow increasing the MTU beyond 1500 bytes. I think this cannot be fixed currently, but if you think this is wrong, please let us know.

Obviously we could add another interface emulation with a bigger MTU, but I have no ideas about which one to emulate.

--
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729 |
From: Lars E. <Lar...@li...> - 2004-10-15 15:25:32
|
/ 2004-10-12 03:10:59 +0200
\ BlaisorBlade:
> > > btw,
> > > anyone wants to give me a hint how to tune it best?
> > > maybe how to up the mtu of the UML "nics"?
> >
> > Don't go over 4K because the VM doesn't like >order 0 allocations
> > very much. But in general bigger is better.
>
> Sadly there is a problem: since we use TAP and emulate whole Ethernet frames,
> the code does not allow increasing the MTU beyond 1500 bytes. I think this
> cannot be fixed currently, but if you think this is wrong, please let us know.
>
> Obviously we could add another interface emulation with a bigger MTU, but
> I have no idea which one to emulate.

now, if I config eth0=daemon,FE:FD:00:00:00:01,,/tmp/uml0.ctl and then
ifconfig eth0 mtu 8192, uml does not complain (but does not work, either).

and if I then patch uml_switch from uml_utilities_20040114 (which is what
I have here, did not check whether there is anything newer out there)

===================
diff -ru tools.orig/uml_router/port.c tools/uml_router/port.c
--- tools.orig/uml_router/port.c  2003-03-12 16:19:03.000000000 +0100
+++ tools/uml_router/port.c       2004-10-15 16:59:07.000000000 +0200
@@ -14,7 +14,7 @@
     unsigned char src[ETH_ALEN];
     unsigned char proto[2];
   } header;
-  unsigned char data[1500];
+  unsigned char data[9000];
 };

 struct port {
===================

it even works! I guess with mcast or other non-daemon transports (not yet
tried), it will just work, too (maybe you need to adjust the mtu on the
host). for the first time I really get 65507 byte udp packets through
uml <-> uml. (not that it makes any sense to use udp with messages that
large ...) anything larger won't work anyway...

	Lars Ellenberg
|
From: BlaisorBlade <bla...@ya...> - 2004-10-12 00:26:46
|
On Tuesday 12 October 2004 02:03, Lars Ellenberg wrote:
> / 2004-10-11 19:55:12 +0200
> \ Lars Ellenberg:
> > since UML => Host does not work, but
> > Host => UML does, this suggests that
> > the bug is somewhere on the sending side.
> >
> > I was not able to track it down.
>
> just so you know, finally:
> =======================
> --- linux-2.6.6/arch/um/include/sysdep-i386/checksum.h.orig  2004-10-12 01:50:49.000000000 +0200
> +++ linux-2.6.6/arch/um/include/sysdep-i386/checksum.h       2004-10-12 01:50:58.000000000 +0200
> @@ -102,8 +102,7 @@
>     are modified, we must also specify them as outputs, or gcc
>     will assume they contain their original values. */
>    : "=r" (sum), "=r" (iph), "=r" (ihl)
> -  : "1" (iph), "2" (ihl)
> -  : "memory");
> +  : "1" (iph), "2" (ihl));
>    return(sum);
> }

The patch you posted REMOVES a memory barrier - you reversed it. I actually
checked that the barrier is missing in the source code; but the strange
thing is that you modified checksum.h.orig and not checksum.h! Are you sure
that you compiled the corrected header?

However, the patch is correct, and I assume you posted it the right way.
I'm going to send it to Andrew Morton for 2.6.9 (hoping he wants to accept
all these patches - they are rushing for 2.6.9).

> that's all, folks. only a missing memory barrier.

Thanks a lot, folks! Yes, the patch is right.

> WTF :-/
>
> same patch applies to 2.6.8.1 - uml and probably all other umls.
> since that is a one-to-one copy anyways, maybe UML should better use the
> original (in include/asm-i386/checksum.h) right away??

I could do this later - too many checks of the other header code for a
one-minute patch.

> there may be similar bugs hiding in various areas of uml...

I think there are even worse ones.

> btw,
> anyone wants to give me a hint how to tune it best?
> maybe how to up the mtu of the UML "nics"?

Impossible with the current code - it emulates an Ethernet card, so I don't
think you can do anything for this. However, I hope it does not cause too
many problems. Search the network howtos: someone explained how to set the
interface packet scheduler (maybe even Jeff Dike on the main UML site).

Bye
--
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
|
From: Lars E. <Lar...@li...> - 2004-10-12 14:03:29
|
/ 2004-10-12 02:27:00 +0200
\ BlaisorBlade:
> On Tuesday 12 October 2004 02:03, Lars Ellenberg wrote:
> > / 2004-10-11 19:55:12 +0200
> > \ Lars Ellenberg:
> > > since UML => Host does not work, but
> > > Host => UML does, this suggests that
> > > the bug is somewhere on the sending side.
> > >
> > > I was not able to track it down.
> >
> > just so you know, finally:
> > =======================
> > --- linux-2.6.6/arch/um/include/sysdep-i386/checksum.h.orig  2004-10-12 01:50:49.000000000 +0200
> > +++ linux-2.6.6/arch/um/include/sysdep-i386/checksum.h       2004-10-12 01:50:58.000000000 +0200
> > @@ -102,8 +102,7 @@
> >     are modified, we must also specify them as outputs, or gcc
> >     will assume they contain their original values. */
> >    : "=r" (sum), "=r" (iph), "=r" (ihl)
> > -  : "1" (iph), "2" (ihl)
> > -  : "memory");
> > +  : "1" (iph), "2" (ihl));
> >    return(sum);
> > }
>
> The patch you posted REMOVES a memory barrier - you reversed it. I actually
> checked that the barrier is missing in the source code; but the strange
> thing is that you modified checksum.h.orig and not checksum.h!

nope. I just did
"diff -u linux-2.6.6/arch/um/include/sysdep-i386/checksum.h{,.orig}"
instead of
"diff -u linux-2.6.6/arch/um/include/sysdep-i386/checksum.h{.orig,}"
and did not notice. Hey, it was in the middle of the night ;-)

	thanks,
	lge
|