Re: [mpls-linux-general] iptables problem?
From: James R. L. <jl...@mi...> - 2009-04-21 16:09:57
Hello Tom,

I'm going to top-post since my response is more generalized:

I think you are observing the effects of the Linux route cache. All
traffic that goes through the IPv4 stack has two 'stages' to its
forwarding. The first couple of packets go through the 'slow' path, in
which a full iptables/route lookup is done. The results of that may be
summarized into a route cache entry. From that point on, all traffic
that matches the src and dst of the route cache entry uses the 'fast'
path and is forwarded according to the contents of the route cache. I
believe the route cache entries time out after ~300 seconds.

You can look at the contents of the route cache by doing:

    ip route show cache

And you can flush it via:

    ip route flush cache

Does this explain what you are seeing?

On Tue, Apr 21, 2009 at 04:21:24PM +0200, Tom Kleiberg wrote:
> Hello all,
>
> I have a question regarding iptables and mpls. I use four nodes (as in
> the MPLS for Linux example on the SF.net website), mpls version 1.962,
> FC8. The mpls packages are installed from the repository (so no custom
> patching etc). I try to filter the packets using iptables and mangle
> them so that they are directed onto an LSP, e.g.
>
> sudo /sbin/mpls nhlfe add key 0 instructions push gen 1001 nexthop eth1
> ipv4 10.0.0.2
> (returns 0x2)
> sudo /sbin/iptables -t mangle -A POSTROUTING -s 10.0.0.1 -d 10.0.0.10 -p
> tcp --source-port 4000 --destination-port 4001 -j mpls --nhlfe 0x2
>
>
> At the downstream nodes, the label is swapped and finally removed at the
> last hop before the destination.
> Now when I generate traffic (using d-itg), I see the following tcpdump
> at the first downstream node (10.0.0.2):
>
> 16:01:25.713116 MPLS (label 1001, exp 0, [S], ttl 64)
> IP (tos 0x0, ttl 64, id 22294, offset 0, flags [DF], proto TCP (6),
> length 564) 10.0.0.1.4000 > 10.0.0.10.4001: P 958977:959489(512) ack 1
> win 46 <nop,nop,timestamp 3001565 3070458>
> 16:01:25.713342 IP (tos 0x0, ttl 62, id 4498, offset 0, flags [DF],
> proto TCP (6), length 52) 10.0.0.10.4001 > 10.0.0.1.4000: ., cksum
> 0xa1e4 (correct), ack 959489 win 5024 <nop,nop,timestamp 3070459
> 3001565>
> 16:01:25.714115 MPLS (label 1001, exp 0, [S], ttl 64)
> IP (tos 0x0, ttl 64, id 22295, offset 0, flags [DF], proto TCP (6),
> length 564) 10.0.0.1.4000 > 10.0.0.10.4001: P 959489:960001(512) ack 1
> win 46 <nop,nop,timestamp 3001566 3070459>
> 16:01:25.714337 IP (tos 0x0, ttl 62, id 4499, offset 0, flags [DF],
> proto TCP (6), length 52) 10.0.0.10.4001 > 10.0.0.1.4000: ., cksum
> 0x9fe2 (correct), ack 960001 win 5024 <nop,nop,timestamp 3070460
> 3001566>
> 16:01:25.715116 MPLS (label 1001, exp 0, [S], ttl 64)
> IP (tos 0x0, ttl 64, id 22296, offset 0, flags [DF], proto TCP (6),
> length 564) 10.0.0.1.4000 > 10.0.0.10.4001: P 960001:960513(512) ack 1
> win 46 <nop,nop,timestamp 3001567 3070460>
> 16:01:25.715339 IP (tos 0x0, ttl 62, id 4500, offset 0, flags [DF],
> proto TCP (6), length 52) 10.0.0.10.4001 > 10.0.0.1.4000: ., cksum
> 0x9de0 (correct), ack 960513 win 5024 <nop,nop,timestamp 3070461
> 3001567>
>
> However, when I change the source or destination port used by the
> traffic generator (or the protocol), then that traffic (which does not
> match the iptables rule), is still mapped onto the LSP (in this case the
> destination port is changed to 4002, but any other traffic to 10.0.0.10
> is in fact mapped onto the LSP).
>
> 16:11:31.590680 MPLS (label 1001, exp 0, [S], ttl 64)
> IP (tos 0x0, ttl 64, id 30521, offset 0, flags [DF], proto TCP (6),
> length 564) 10.0.0.1.4000 > 10.0.0.10.4002: P 1021953:1022465(512) ack 1
> win 46 <nop,nop,timestamp 3607443 3676339>
> 16:11:31.591011 IP (tos 0x0, ttl 62, id 46089, offset 0, flags [DF],
> proto TCP (6), length 52) 10.0.0.10.4002 > 10.0.0.1.4000: ., cksum
> 0xeeb5 (correct), ack 1022465 win 5876 <nop,nop,timestamp 3676340
> 3607443>
> 16:11:31.591681 MPLS (label 1001, exp 0, [S], ttl 64)
> IP (tos 0x0, ttl 64, id 30522, offset 0, flags [DF], proto TCP (6),
> length 564) 10.0.0.1.4000 > 10.0.0.10.4002: P 1022465:1022977(512) ack 1
> win 46 <nop,nop,timestamp 3607444 3676340>
> 16:11:31.592009 IP (tos 0x0, ttl 62, id 46090, offset 0, flags [DF],
> proto TCP (6), length 52) 10.0.0.10.4002 > 10.0.0.1.4000: ., cksum
> 0xeca3 (correct), ack 1022977 win 5892 <nop,nop,timestamp 3676341
> 3607444>
> 16:11:31.592681 MPLS (label 1001, exp 0, [S], ttl 64)
> IP (tos 0x0, ttl 64, id 30523, offset 0, flags [DF], proto TCP (6),
> length 564) 10.0.0.1.4000 > 10.0.0.10.4002: P 1022977:1023489(512) ack 1
> win 46 <nop,nop,timestamp 3607445 3676341>
> 16:11:31.593007 IP (tos 0x0, ttl 62, id 46091, offset 0, flags [DF],
> proto TCP (6), length 52) 10.0.0.10.4002 > 10.0.0.1.4000: ., cksum
> 0xea90 (correct), ack 1023489 win 5909 <nop,nop,timestamp 3676342
> 3607445>
> 16:11:31.593681 MPLS (label 1001, exp 0, [S], ttl 64)
> IP (tos 0x0, ttl 64, id 30524, offset 0, flags [DF], proto TCP (6),
> length 564) 10.0.0.1.4000 > 10.0.0.10.4002: P 1023489:1024001(512) ack 1
> win 46 <nop,nop,timestamp 3607446 3676342>
> 16:11:31.594005 IP (tos 0x0, ttl 62, id 46092, offset 0, flags [DF],
> proto TCP (6), length 52) 10.0.0.10.4002 > 10.0.0.1.4000: ., cksum
> 0xe87e (correct), ack 1024001 win 5926 <nop,nop,timestamp 3676342
> 3607446>
>
> When I query the number of packets that match an iptables rule (iptables
> -t mangle -vnL), the query does not show any increment after the
> second batch of traffic is generated (so according to the iptables output
> the packets that do not match the filter are not mapped onto the LSP,
> although in fact it DOES happen).
>
> More interestingly, the incorrect behavior disappears after some
> time...so when I do not send packets from the source to the destination,
> the filtering recovers itself (until the packets are being sent that do
> match the filter)...
>
> I hope my question makes sense and someone can help me figure out the
> problem.
>
> thanks
> tom
>
>

-- 
James R. Leu
jl...@mi...
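
One way to test the route cache explanation above is to run the commands
from this thread in sequence on the ingress node: inspect the cache, flush
it, then re-read the mangle counters before resending the non-matching
traffic. What follows is only a rough sketch, not a verified procedure. The
RUN variable and the check_route_cache function name are made up for
illustration, and it assumes a kernel that still has the IPv4 route cache.
By default it only prints the commands (a dry run).

```shell
#!/bin/sh
# Sketch only: checks whether a stale route cache entry explains the
# mis-mapped traffic. Assumes a kernel that still has the IPv4 route
# cache, plus iproute2 and iptables on the ingress node.
#
# RUN and check_route_cache are illustrative names, not part of mpls-linux.
# By default RUN=echo, so the commands are only printed (dry run).
# Invoke with RUN= (empty) as root to actually execute them.
RUN="${RUN-echo}"

check_route_cache() {
    # 1. Look at what is currently cached (command from this thread).
    $RUN ip route show cache
    # 2. Drop the cached entries so the next packet takes the slow path.
    $RUN ip route flush cache
    # 3. Re-read the mangle counters (the query used in the original
    #    report), then resend the non-matching traffic and watch
    #    tcpdump on the first downstream node.
    $RUN iptables -t mangle -vnL
}

check_route_cache
```

If the port-4002 traffic stops carrying label 1001 right after the flush
and the counters behave as expected, that would point at the cached
forwarding decision rather than at the iptables rule itself.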