Re: [mpls-linux-devel] Current state of dst stacking on davem implementation
From: James R. L. <jl...@mi...> - 2004-04-20 13:38:25
Sorry for being absent for the last couple of weeks; I'm in the midst of
the busiest time of the year for me. I'm preparing for the
Network + InterOp tradeshow. This year I'm working on an
'Advanced Internetworking Initiative', which basically means MPLS
services :-) We have equipment from 10 vendors and over 20 devices.
We'll be concentrating on MPLS BGP VPNs (v4, v6, Multicast), VPLS, and
Carrier's Carrier. Our main area of 'experimenting' is in the multicast
space. We're showing carriers and vendors how BGP VPNs have an advantage
(although only slight) over L2 VPNs (ala VPLS) with respect to multicast.
Hopefully the result will be L2 VPN vendors re-thinking their
implementation.

Anyway, enough of that; on to the questions I've missed.

On Tue, Apr 20, 2004 at 01:22:58PM +0200, Ramon Casellas wrote:
> On Mon, 19 Apr 2004, Jamal Hadi Salim wrote:
>
> > On Mon, 2004-04-19 at 02:22, Ramon Casellas wrote:

<snip>

> OTOH, with the new extensions for L2SC/PWE3 control in (G)MPLS we may need
> to work more closely with the bridging people (although most things belong
> clearly to userspace). I know that James has been working in this area,
> but I don't know the details or whether he has been working with other
> people.

My work, as usual, is a solo effort. If anything I've done can be of use,
I'd be more than happy to contribute.

> And finally, regarding the mpls stuff itself. I am not sure of
> understanding what do you mean with 'we need to show some progress'. There
> are other things discussed in previous mails that are there, but I don't
> recall having had an agreement. We were discussing about MPLS tunnels,
> NHLFE stacking, new opcodes, etc. The ideas are there, the only thing we
> need to define is how these ideas (if adequate) are going to be
> integrated.

I have created a separate branch, //depot/mpls-kernel-merger/... which is
the davem code with the dst stacking added. The plan is to migrate
features from our code into the davem code. Then, when Jamal feels that
we have made some good progress, he can integrate changes into the davem
branch. (I can help with all of the integrations.)

> Some questions: are we free to go and change Dave's core? Should we first
> submit mainly Dave's core and then and only then work with incremental and
> small changes? Or should we keep it quite a little more and extend Dave's
> code with some work before submitting? In this later case: what? if you
> ask myself or James (although I let him speak by himself) NHLFE stacking.
> You rised the concern of performance.

I don't think it really is an issue. One thing to note is that the great
thing about a dual instruction scheme (one set on the ILM and one set on
the NHLFE) is that you can optimize the instructions for speed or
flexibility. If you want to create a PUSHN instruction, you can easily
add it and use it instead of the NHLFE stacking (if that is what your
particular application needs). I still think that NHLFE stacking is what
will be used for the common hierarchical LSP case though.

> In other words: how do we port features that are in our version to Dave's
> core? and finally, there is one important question that I have not dared
> to ask until now: there is a notable user base (in research labs) that use
> James implementation *and* userspace applications that work with it. The
> most notable example (for me and some people I know) is RSVP-TE daemon. If
> the drop some exisiting features like procfs and/or ioctl support, this
> means a step back from the user point of view (although I tend to think
> that we should ask what's best for the kernel and let the userspace apps
> drag behind).
>
>
> Action Points (just my opinion):
>
> * I would say that l2c stuff could (you are the expert) be separated from
> the mpls core. Submit this before the MPLS core and let it become stable
> in the netdev tree.

Agreed.
I want to use the L2C netlink code, but I'm trying to minimize the
number of non-mainstream patches I'm tracking. For now I've built on top
of Ramon's netlink code. It doesn't work, but the guts of it are there.

> * I would start by submitting Dave's core + James Dst stacking.

Let's make one or two more changes before we send it back to davem for
review. (I sent an email at the end of March with my suggested
development path for the davem code; I'll dig it up and resend.)

> * Keep on working from that focusing on porting some features present in
> James impl. (if we manage to convince you ;)) )
>
> * At some point we should converge...

Agreed.

> (btw, I appreciate your offer regarding the inclusion of James and myself
> as co-authors)

Yes, thank you.

Like I said above, I'll find my previous email and resend.

BTW do either of you have a good upstream connection? If so we could
create a p4 proxy with a high level of caching. Then only the commits
need to consume my limited bandwidth. (I'm researching getting a better
upstream, but options are limited in my suburb, which was a farmer's
corn field last year at this time.)

> ==== //depot/mpls-kernel-davem/net/ipv4/fib_semantics.c#5 - /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/fib_semantics.c ====
> --- /tmp/tmp.26663.0 2004-02-29 22:19:34.063069400 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/fib_semantics.c 2004-02-26 22:35:44.000000000 -0600
> @@ -42,6 +42,9 @@
>  #include <net/tcp.h>
>  #include <net/sock.h>
>  #include <net/ip_fib.h>
> +#ifdef CONFIG_NET_MPLS
> +#include <net/mpls.h>
> +#endif
>
>  #define FSprintk(a...)
>
> @@ -169,16 +172,6 @@
>  			fi->fib_prev->fib_next = fi->fib_next;
>  		if (fi == fib_info_list)
>  			fib_info_list = fi->fib_next;
> -#ifdef CONFIG_NET_MPLS
> -		if (fi->fib_nh && fi->fib_nh->nh_mpls_fec) {
> -			struct mpls_nhlfe_route *mpls;
> -			struct fib_nh *nh = fi->fib_nh;
> -			mpls = mpls_nhlfe_lookup(nh->nh_mpls_fec, 0, 0);
> -			if (NULL != mpls) {
> -				mpls_nhlfe_put(mpls);
> -			}
> -		}
> -#endif
>  		fi->fib_dead = 1;
>  		fib_info_put(fi);
>  	}
> ==== //depot/mpls-kernel-davem/net/ipv4/ip_output.c#4 - /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/ip_output.c ====
> --- /tmp/tmp.26663.1 2004-02-29 22:19:34.393019240 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/ip_output.c 2004-02-26 22:35:44.000000000 -0600
> @@ -174,6 +174,11 @@
>  	struct net_device *dev = dst->dev;
>  	int hh_len = LL_RESERVED_SPACE(dev);
>
> +	if (dst->child) {
> +		skb->dst = dst_pop(skb->dst);
> +		return skb->dst->output(skb);
> +	}
> +
>  	/* Be paranoid, rather than too clever. */
>  	if (unlikely(skb_headroom(skb) < hh_len && dev->hard_header)) {
>  		struct sk_buff *skb2;
> @@ -219,7 +224,7 @@
>  	skb->protocol = htons(ETH_P_IP);
>
>  	return NF_HOOK(PF_INET, NF_IP_POST_ROUTING, skb, NULL, dev,
> -		net_output_maybe_mpls(skb->dst, ip_finish_output2));
> +		ip_finish_output2);
>  }
>
>  int ip_mc_output(struct sk_buff *skb)
> ==== //depot/mpls-kernel-davem/net/ipv4/route.c#4 - /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/route.c ====
> --- /tmp/tmp.26663.2 2004-02-29 22:19:34.600987624 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/route.c 2004-02-26 22:35:44.000000000 -0600
> @@ -103,6 +103,9 @@
>  #ifdef CONFIG_SYSCTL
>  #include <linux/sysctl.h>
>  #endif
> +#ifdef CONFIG_NET_MPLS
> +#include <net/mpls.h>
> +#endif
>
>  #define IP_MAX_MTU 0xFFF0
>
> @@ -1399,6 +1402,20 @@
>  			rt->rt_gateway = FIB_RES_GW(*res);
>  		memcpy(rt->u.dst.metrics, fi->fib_metrics,
>  			sizeof(rt->u.dst.metrics));
> +
> +#ifdef CONFIG_NET_MPLS
> +		if (FIB_RES_MPLS_FEC(*res)) {
> +			struct mpls_nhlfe_route *mpls;
> +
> +			mpls = mpls_nhlfe_lookup(FIB_RES_MPLS_FEC(*res),
> +				rt->u.dst.dev->ifindex,0);
> +			if (!IS_ERR(mpls)) {
> +				dst_hold(&mpls->u.dst);
> +				rt->u.dst.child = &mpls->u.dst;
> +			}
> +		}
> +#endif
> +
>  		if (fi->fib_mtu == 0) {
>  			rt->u.dst.metrics[RTAX_MTU-1] = rt->u.dst.dev->mtu;
>  			if (rt->u.dst.metrics[RTAX_LOCK-1] & (1 << RTAX_MTU) &&
> @@ -1705,23 +1722,6 @@
>
>  	rth->rt_flags = flags;
>
> -#ifdef CONFIG_NET_MPLS
> -	if (res.fi && FIB_RES_MPLS_FEC(res)) {
> -		struct mpls_nhlfe_route *mpls;
> -
> -		mpls = mpls_nhlfe_lookup(FIB_RES_MPLS_FEC(res), out_dev->dev->ifindex,0);
> -		if (IS_ERR(mpls)) {
> -			rt_drop(rth);
> -			err = PTR_ERR(mpls);
> -			goto done;
> -		}
> -
> -		rth->u.dst.mpls = mpls;
> -		mpls_bind_neighbour(&rth->u.dst);
> -	}
> -#endif
> -
> -
>  #ifdef CONFIG_NET_FASTROUTE
>  	if (netdev_fastroute && !(flags&(RTCF_NAT|RTCF_MASQ|RTCF_DOREDIRECT))) {
>  		struct net_device *odev = rth->u.dst.dev;
> @@ -2204,22 +2204,6 @@
>
>  	rt_set_nexthop(rth, &res, 0);
>
> -#ifdef CONFIG_NET_MPLS
> -	if (res.fi && FIB_RES_MPLS_FEC(res)) {
> -		struct mpls_nhlfe_route *mpls;
> -
> -		mpls = mpls_nhlfe_lookup(FIB_RES_MPLS_FEC(res), dev_out->ifindex,0);
> -		if (IS_ERR(mpls)) {
> -			rt_drop(rth);
> -			err = PTR_ERR(mpls);
> -			goto done;
> -		}
> -
> -		rth->u.dst.mpls = mpls;
> -		mpls_bind_neighbour(&rth->u.dst);
> -	}
> -#endif
> -
>  	rth->rt_flags = flags;
>
>  	hash = rt_hash_code(oldflp->fl4_dst, oldflp->fl4_src ^ (oldflp->oif << 5), tos);
> ==== //depot/mpls-kernel-davem/net/ipv6/ip6_output.c#4 - /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv6/ip6_output.c ====
> --- /tmp/tmp.26663.3 2004-02-29 22:19:34.689974096 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv6/ip6_output.c 2004-02-26 22:35:44.000000000 -0600
> @@ -75,6 +75,11 @@
>  	struct dst_entry *dst = skb->dst;
>  	struct hh_cache *hh = dst->hh;
>
> +	if (dst->child) {
> +		skb->dst = dst_pop(skb->dst);
> +		return dst_output(skb);
> +	}
> +
>  	if (hh) {
>  		int hh_alen;
>
> @@ -138,9 +143,7 @@
>
>  		IP6_INC_STATS(Ip6OutMcastPkts);
>  	}
> -
> -	return NF_HOOK(PF_INET6, NF_IP6_POST_ROUTING, skb,NULL, skb->dev,
> -		net_output_maybe_mpls(dst, ip6_output_finish));
> +	return NF_HOOK(PF_INET6, NF_IP6_POST_ROUTING, skb,NULL, skb->dev,ip6_output_finish);
>  }
>
>  int ip6_output(struct sk_buff *skb)
> ==== //depot/mpls-kernel-davem/net/ipv6/route.c#4 - /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv6/route.c ====
> --- /tmp/tmp.26663.4 2004-02-29 22:19:34.868946888 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv6/route.c 2004-02-26 22:35:44.000000000 -0600
> @@ -21,6 +21,7 @@
>   * - select from (probably) reachable routers (i.e.
>   * routers in REACHABLE, STALE, DELAY or PROBE states).
>   * - always select the same router if it is (probably)
> +5B
>   * reachable. otherwise, round-robin the list.
>   */
>
> @@ -59,6 +60,9 @@
>  #ifdef CONFIG_SYSCTL
>  #include <linux/sysctl.h>
>  #endif
> +#ifdef CONFIG_NET_MPLS
> +#include <net/mpls.h>
> +#endif
>
>  /* Set to 3 to get tracing. */
>  #define RT6_DEBUG 2
> @@ -838,13 +842,12 @@
>  		if (rt->rt6i_mpls_fec) {
>  			struct mpls_nhlfe_route *mpls;
>
> -			mpls = mpls_nhlfe_lookup(rt->rt6i_mpls_fec, dev->ifindex,0);
> -			if (IS_ERR(mpls)) {
> -				err = PTR_ERR(mpls);
> -goto out;
> +			mpls = mpls_nhlfe_lookup(rt->rt6i_mpls_fec,
> +				dev->ifindex,0);
> +			if (!IS_ERR(mpls)) {
> +				dst_hold(&mpls->u.dst);
> +				rt->u.dst.child = &mpls->u.dst;
>  			}
> -			rt->u.dst.mpls = mpls;
> -			mpls_bind_neighbour(&rt->u.dst);
>  		}
>  	}
>  #endif
> ==== //depot/mpls-kernel-davem/net/mpls/mpls_fib.c#4 - /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_fib.c ====
> --- /tmp/tmp.26663.5 2004-02-29 22:19:34.950934424 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_fib.c 2004-02-26 22:35:44.000000000 -0600
> @@ -20,6 +20,8 @@
>  #include <linux/proc_fs.h>
>  #include <linux/seq_file.h>
>  #include <linux/init.h>
> +#include <net/mpls.h>
> +#include <net/mpls_ilm.h>
>  #include <linux/rtnetlink.h>
>  #include <linux/l2cnetlink.h>
>  #include <linux/string.h>
> @@ -28,20 +30,19 @@
>  #include <net/neighbour.h>
>  #include <net/dst.h>
>  #include <net/flow.h>
> -#include <net/mpls.h>
> -#include <net/mpls_ilm.h>
>
> +static void mpls_nhlfe_event(int event, struct mpls_nhlfe_route *mpls);
>
>  /*
>   * Interface to generic destination cache.
>   */
>
> -static struct dst_entry *mpls_dst_check(struct dst_entry *dst, u32 cookie);
> -static void mpls_dst_destroy(struct dst_entry *dst);
> -static struct dst_entry *mpls_negative_advice(struct dst_entry *dst);
> -static void mpls_link_failure(struct sk_buff *skb);
> -static void mpls_update_pmtu(struct dst_entry *dst, u32 mtu);
> -static int mpls_garbage_collect(void);
> +static struct dst_entry *mpls_ilm_check(struct dst_entry *dst, u32 cookie);
> +static void mpls_ilm_destroy(struct dst_entry *dst);
> +static struct dst_entry *mpls_ilm_negative_advice(struct dst_entry *dst);
> +static void mpls_ilm_link_failure(struct sk_buff *skb);
> +static void mpls_ilm_update_pmtu(struct dst_entry *dst, u32 mtu);
> +static int mpls_ilm_garbage_collect(void);
>
>  #define LT_HASH_ENTS 256
>  #define LT_HASH_MASK (LT_HASH_ENTS - 1)
> @@ -53,54 +54,51 @@
>
>  static struct lt_hash_bucket *lt_hash_table;
>
> -static struct dst_ops mpls_unicast_dst_ops = {
> +static struct dst_ops mpls_ilm_ops = {
>  	.family = AF_MPLS,
>  	.protocol = __constant_htons(ETH_P_MPLS_UC),
> -	.gc = mpls_garbage_collect,
> -	.check = mpls_dst_check,
> -	.destroy = mpls_dst_destroy,
> -	.negative_advice = mpls_negative_advice,
> -	.link_failure = mpls_link_failure,
> -	.update_pmtu = mpls_update_pmtu,
> +	.gc = mpls_ilm_garbage_collect,
> +	.check = mpls_ilm_check,
> +	.destroy = mpls_ilm_destroy,
> +	.negative_advice = mpls_ilm_negative_advice,
> +	.link_failure = mpls_ilm_link_failure,
> +	.update_pmtu = mpls_ilm_update_pmtu,
>  	.entry_size = sizeof(struct ltable),
>  };
>
> -static struct dst_entry *mpls_dst_check(struct dst_entry *dst, u32 cookie)
> +static struct dst_entry *mpls_ilm_check(struct dst_entry *dst, u32 cookie)
>  {
>  	printk("mpls_dst_check\n");
>  	dst_release(dst);
>  	return NULL;
>  }
>
> -static void mpls_dst_destroy(struct dst_entry *dst)
> +static void mpls_ilm_destroy(struct dst_entry *dst)
>  {
> -	if (NULL != dst->mpls) {
> -		printk("mpls_dst_destroy nuking mpls\n");
> -		mpls_nhlfe_release(dst->mpls);
> -	}
> +	printk("mpls_dst_destroy\n");
>  }
>
> -static struct dst_entry *mpls_negative_advice(struct dst_entry *dst)
> +static struct dst_entry *mpls_ilm_negative_advice(struct dst_entry *dst)
>  {
>  	printk("mpls_negative_advice\n");
>  	dst_release(dst);
>  	return NULL;
>  }
>
> -static void mpls_link_failure(struct sk_buff *skb)
> +static void mpls_ilm_link_failure(struct sk_buff *skb)
>  {
>  	printk("mpls_link_failure\n");
>  }
>
> -static void mpls_update_pmtu(struct dst_entry *dst, u32 mtu)
> +static void mpls_ilm_update_pmtu(struct dst_entry *dst, u32 mtu)
>  {
>  	printk("mpls_update_pmtu\n");
>  }
>
>  /* we really dont need this since we have a static table*/
> -static int mpls_garbage_collect(void)
> +static int mpls_ilm_garbage_collect(void)
>  {
> -	printk("mpls_garbage_collect %d entries \n",atomic_read(&mpls_unicast_dst_ops.entries));
> +	printk("mpls_garbage_collect %d entries \n",atomic_read(&mpls_ilm_ops.entries));
>  	return 0;
>  }
>
> @@ -141,7 +139,7 @@
>  /* not used at the moment */
>  static struct ltable *mpls_build_blackhole(u32 label)
>  {
> -	struct ltable *lth = dst_alloc(&mpls_unicast_dst_ops);
> +	struct ltable *lth = dst_alloc(&mpls_ilm_ops);
>
>  	if (!lth)
>  		return NULL;
> @@ -258,7 +256,7 @@
>  	lth->lt_space = space;
>  	lth->lt_ifindex = ifindex;
>  	lth->u.dst.dev = dev;
> -	lth->u.dst.mpls = mpls;
> +	lth->u.dst.child = &mpls->u.dst;
>
>  	return 0;
>  }
> @@ -394,7 +392,7 @@
>  	if (NULL == __dev_get_by_index(ilm->in_ifindex))
>  		return -ENODEV;
>
> -	lt = dst_alloc(&mpls_unicast_dst_ops);
> +	lt = dst_alloc(&mpls_ilm_ops);
>  	if (!lt)
>  		return -ENOMEM;
>
> @@ -566,6 +564,69 @@
>  	return err;
>  }
>
> +static struct dst_entry *mpls_nhlfe_check(struct dst_entry *dst, u32 cookie);
> +static void mpls_nhlfe_destroy(struct dst_entry *dst);
> +static struct dst_entry *mpls_nhlfe_negative_advice(struct dst_entry *dst);
> +static void mpls_nhlfe_link_failure(struct sk_buff *skb);
> +static void mpls_nhlfe_update_pmtu(struct dst_entry *dst, u32 mtu);
> +static int mpls_nhlfe_garbage_collect(void);
> +
> +static struct dst_ops mpls_nhlfe_ops = {
> +	.family = AF_MPLS,
> +	.protocol = __constant_htons(ETH_P_MPLS_UC),
> +	.gc = mpls_nhlfe_garbage_collect,
> +	.check = mpls_nhlfe_check,
> +	.destroy = mpls_nhlfe_destroy,
> +	.negative_advice = mpls_nhlfe_negative_advice,
> +	.link_failure = mpls_nhlfe_link_failure,
> +	.update_pmtu = mpls_nhlfe_update_pmtu,
> +	.entry_size = sizeof(struct mpls_nhlfe_route),
> +};
> +
> +static struct dst_entry *mpls_nhlfe_check(struct dst_entry *dst, u32 cookie)
> +{
> +	printk("mpls_dst_check\n");
> +	dst_release(dst);
> +	return NULL;
> +}
> +
> +static void mpls_nhlfe_destroy(struct dst_entry *dst)
> +{
> +	struct mpls_nhlfe_route *mir = (struct mpls_nhlfe_route*)dst;
> +	printk("mpls_dst_destroy nuking mpls\n");
> +
> +	list_del(&mir->mr_hash);
> +	INIT_LIST_HEAD(&mir->mr_hash);
> +	mpls_nhlfe_event(L2CM_DELNHLFE,mir);
> +	kfree(mir->opcodes);
> +	mpls_put_prot(mir->mr_prot);
> +}
> +
> +static struct dst_entry *mpls_nhlfe_negative_advice(struct dst_entry *dst)
> +{
> +	printk("mpls_negative_advice\n");
> +	dst_release(dst);
> +	return NULL;
> +}
> +
> +static void mpls_nhlfe_link_failure(struct sk_buff *skb)
> +{
> +	printk("mpls_link_failure\n");
> +}
> +
> +static void mpls_nhlfe_update_pmtu(struct dst_entry *dst, u32 mtu)
> +{
> +	printk("mpls_update_pmtu\n");
> +}
> +
> +/* we really dont need this since we have a static table*/
> +static int mpls_nhlfe_garbage_collect(void)
> +{
> +	printk("mpls_garbage_collect %d entries \n",atomic_read(&mpls_nhlfe_ops.entries));
> +	return 0;
> +}
> +
> +
>  int
>  mpls_build_nhlfe_route(struct mpls_nhlfe_route *mir,
>  	struct sockaddr *sock_addr, struct nhlfemsg *nh)
> @@ -579,7 +640,6 @@
>  		goto out;
>  #endif
>
> -	memset(mir,0,sizeof(*mir));
>  	prot = mpls_get_prot(nh->nh_proto);
>  	if (!prot)
>  		goto out_free_mir;
> @@ -589,9 +649,13 @@
>  		goto out_put_prot;
>
>  	INIT_LIST_HEAD(&mir->mr_hash);
> +
> +	mir->u.dst.dev = &loopback_dev;
> +	mir->u.dst.output = mpls_nhlfe_ucast_output;
> +	mir->u.dst.neighbour = neigh;
> +
>  	mir->mr_nhlfeid = nh->nh_nhlfeid;
>  	mir->mr_ifindex = nh->nh_ifindex;
> -	mir->mr_neigh = neigh;
>  	// mir->mr_hh = hh;
>  	mir->mr_prot = prot;
>  	mir->mr_protocol = nh->nh_proto;
> @@ -609,36 +673,14 @@
>  	return err;
>  }
>
> -void mpls_nhlfe_destroy(struct mpls_nhlfe_route *mir)
> -{
> -	BUG_ON(!list_empty(&mir->mr_hash));
> -	kfree(mir->opcodes);
> -	neigh_release(mir->mr_neigh);
> -	mpls_put_prot(mir->mr_prot);
> -	kfree(mir);
> -}
> -
>  void mpls_nhlfe_hold(struct mpls_nhlfe_route *mpls)
>  {
> -
> -	atomic_inc(&mpls->mr_ref);
> +	dst_hold(&mpls->u.dst);
>  }
>
>  void mpls_nhlfe_put(struct mpls_nhlfe_route *mpls)
>  {
> -	atomic_dec(&mpls->mr_ref);
> -}
> -
> -void mpls_bind_neighbour(struct dst_entry *dst)
> -{
> -	struct mpls_nhlfe_route *mr = dst->mpls;
> -	struct neighbour *neigh = dst->neighbour;
> -
> -	BUG_ON(!mr || !mr->mr_neigh);
> -
> -	if (neigh)
> -		neigh_release(neigh);
> -	dst->neighbour = neigh_clone(mr->mr_neigh);
> +	dst_release(&mpls->u.dst);
>  }
>
>  void print_nhlfe(struct mpls_nhlfe_route *mpls)
> @@ -654,7 +696,7 @@
>  		ins++;
>  	}
>
> -	printk("mtu adjustment %d\n",mpls->mr_path_hlen);
> +	printk("mtu adjustment %d\n",mpls->u.dst.metrics[RTAX_MTU -1]);
>
>  	if (mpls->mr_protocol == AF_INET) {
>  		struct sockaddr_in *addr = (struct sockaddr_in *)&mpls->addr;
> @@ -779,11 +821,11 @@
>  		}
>  	}
>
> -	mpls = kmalloc(sizeof(*mpls), GFP_KERNEL);
> +	mpls = dst_alloc(&mpls_nhlfe_ops);
>  	if (!mpls)
>  		return -ENOMEM;
>
> -	err = mpls_build_nhlfe_route(mpls,&sock_addr, nh);
> +	err = mpls_build_nhlfe_route(mpls, &sock_addr, nh);
>
>  	if (0 > err) {
>  		return err;
> @@ -802,7 +844,7 @@
>
>  	memcpy(mpls->opcodes,ins,lb);
>  	mpls->n_opcodes = lc;
> -	mpls->mr_path_hlen = lb;
> +	mpls->u.dst.metrics[RTAX_MTU - 1] = lb;
>  	memcpy(&mpls->addr,&sock_addr,sizeof(struct sockaddr));
>  	mpls_nhlfe_hold(mpls);
>  	err = mpls_nhlfe_intern(mpls);
> @@ -820,7 +862,7 @@
>  	kfree(mpls->opcodes);
>  err_op:
>  	mpls_put_prot(mpls->mr_prot);
> -	neigh_release(mpls->mr_neigh);
> +	neigh_release(mpls->u.dst.neighbour);
>  	kfree(mpls);
>  	return err;
>  }
> @@ -833,17 +875,11 @@
>  	spin_lock_bh(&mpls_nhlfe_lock);
>  	mpls = __mpls_nhlfe_lookup(nh->nh_nhlfeid, nh->nh_ifindex, nh->nh_space);
>  	if (NULL != mpls) {
> -		int usrs = atomic_read(&mpls->mr_ref);
> -		if (usrs > 1) {
> -			printk(" %d users probably FTN or ILM still holding NHLFE route\n",usrs - 1);
> -			err = -EINVAL;
> -		} else {
> -			list_del(&mpls->mr_hash);
> -			INIT_LIST_HEAD(&mpls->mr_hash);
> -			mpls_nhlfe_event(L2CM_DELNHLFE,mpls);
> -			mpls_nhlfe_release(mpls);
> -			err = 0;
> -		}
> +		mpls_nhlfe_put(mpls);
> +		call_rcu (&mpls->u.dst.rcu_head, (void (*)(void *))dst_free,
> +			&mpls->u.dst);
> +		rt_cache_flush(0);
> +		err = 0;
>  	}
>  	spin_unlock_bh(&mpls_nhlfe_lock);
>
> @@ -945,11 +981,18 @@
>  		lt_hash_table[i].chain = NULL;
>  	}
>
> -	mpls_unicast_dst_ops.kmem_cachep = kmem_cache_create("mpls_u_dst_cache",
> +	mpls_ilm_ops.kmem_cachep = kmem_cache_create("mpls_ilm_cache",
>  		sizeof(struct ltable), 0, SLAB_HWCACHE_ALIGN, NULL, NULL);
>
> -	if (!mpls_unicast_dst_ops.kmem_cachep)
> -		panic("MPLS: failed to allocate mpls_u_dst_cache\n");
> +	if (!mpls_ilm_ops.kmem_cachep)
> +		panic("MPLS: failed to allocate mpls_ilm_cache\n");
> +
> +	mpls_nhlfe_ops.kmem_cachep = kmem_cache_create("mpls_nhlfe_cache",
> +		sizeof(struct mpls_nhlfe_route), 0, SLAB_HWCACHE_ALIGN,
> +		NULL, NULL);
> +
> +	if (!mpls_nhlfe_ops.kmem_cachep)
> +		panic("MPLS: failed to allocate mpls_nhlfe_cache\n");
>
>  	for (i = 0; i < MPLS_NHLFE_HASHSZ; i++) {
>  		INIT_LIST_HEAD(&mpls_nhlfe_hash[i]);
> ==== //depot/mpls-kernel-davem/net/mpls/mpls_forward.c#3 - /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_forward.c ====
> --- /tmp/tmp.26663.6 2004-02-29 22:19:34.981929712 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_forward.c 2004-02-26 22:35:44.000000000 -0600
> @@ -46,7 +46,7 @@
>  	mplsh = (u32 *)skb->nh.raw;
>  	lt = (struct ltable *)skb->dst;
>  	skb->dst = &lt->u.dst;
> -	mpls = skb->dst->mpls;
> +	mpls = container_of (skb->dst->child,struct mpls_nhlfe_route,u.dst);
>  	ttl = MPLS_LABEL_TTL(ntohl(*mplsh)) ;
>
>  	mpls->stats.packets++;
> @@ -58,7 +58,7 @@
>  	}
>
>  	/* We are about to mangle packet. Copy it! */
> -	if (skb_cow(skb, LL_RESERVED_SPACE(mpls->mr_neigh->dev)+mpls->mr_path_hlen))
> +	if (skb_cow(skb, LL_RESERVED_SPACE(mpls->u.dst.neighbour->dev)+mpls->u.dst.metrics[RTAX_MTU - 1]))
>  		goto drop;
>
>  	/* ttl = mpls_decrease_ttl(mplsh); */
> ==== //depot/mpls-kernel-davem/net/mpls/mpls_output.c#3 - /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_output.c ====
> --- /tmp/tmp.26663.7 2004-02-29 22:19:34.990928344 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_output.c 2004-02-26 22:35:44.000000000 -0600
> @@ -17,7 +17,7 @@
>  /* XXX Breaks with TSO... */
>  int mpls_unicast_output(struct sk_buff *skb, struct mpls_nhlfe_route *mpls)
>  {
> -	struct net_device *dev = mpls->mr_neigh->dev;
> +	struct net_device *dev = mpls->u.dst.neighbour->dev;
>  	struct hh_cache *hh = NULL;
>  	int hh_len;
>
> @@ -69,16 +69,6 @@
>  			mpls->stats.bytes+= len;
>  		}
>  		return ret;
> -	} else if (mpls->mr_neigh) {
> -		int len = skb->len;
> -		int ret = mpls->mr_neigh->output(skb);
> -		if (NET_XMIT_SUCCESS != ret) {
> -			mpls->stats.drops++;
> -		} else {
> -			mpls->stats.packets++;
> -			mpls->stats.bytes+= len;
> -		}
> -		return ret;
>  	}
>
>  	if (net_ratelimit())
> @@ -92,13 +82,18 @@
>
>  int mpls_nhlfe_ucast_output(struct sk_buff *skb)
>  {
> -	struct mpls_nhlfe_route *mir = skb->dst->mpls;
> +	struct mpls_nhlfe_route *mir;
>  	u32 bos = __MPLS_LABEL_S_BIT;
>  	u32 *mplsh = (u32 *) skb->nh.raw;
>  	u32 ttl;
>  	struct mpls_op_k *ins;
>  	int count = 0;
>
> +	skb = skb_share_check(skb, GFP_ATOMIC);
> +	if (unlikely(!skb))
> +		goto drop;
> +
> +	mir = container_of(skb->dst, struct mpls_nhlfe_route, u.dst);
>
>  	if (mir->mr_flags & MPLS_FLAG_TTL_PROPAGATE)
>  		ttl = mir->mr_prot->get_ttl(skb);
> @@ -130,4 +125,8 @@
>  	}
>
>  	return mpls_unicast_output(skb,mir);
> +
> +drop:
> +	kfree_skb(skb);
> +	return NET_XMIT_DROP;
>  }
> ==== //depot/mpls-kernel-davem/include/net/dst.h#4 - /home/jleu/personal/clients/mpls-kernel-davem2/include/net/dst.h ====
> --- /tmp/tmp.26672.0 2004-02-29 22:19:40.037161200 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/include/net/dst.h 2004-02-26 22:35:44.000000000 -0600
> @@ -13,7 +13,6 @@
>  #include <linux/rcupdate.h>
>  #include <linux/jiffies.h>
>  #include <net/neighbour.h>
> -#include <net/mpls.h>
>  #include <asm/processor.h>
>
>
> @@ -67,7 +66,6 @@
>  	struct neighbour *neighbour;
>  	struct hh_cache *hh;
>  	struct xfrm_state *xfrm;
> -	struct mpls_nhlfe_route *mpls;
>
>  	int (*input)(struct sk_buff*);
>  	int (*output)(struct sk_buff*);
> @@ -123,7 +121,8 @@
>  	u32 mtu = dst_metric(path, RTAX_MTU);
>
>  #ifdef CONFIG_NET_MPLS
> -	mtu -= (path->mpls ? path->mpls->mr_path_hlen : 0);
> +	if (path->child)
> +		mtu -= dst_metric(path->child, RTAX_MTU);
>  #endif
>
>  	/* Yes, _exactly_. This is paranoia. */
> @@ -228,14 +227,7 @@
>  	int err;
>
>  	for (;;) {
> -#ifdef CONFIG_NET_MPLS
> -		if (skb->dst->mpls)
> -			err = mpls_nhlfe_ucast_output(skb);
> -		else
> -			err = skb->dst->output(skb);
> -#else
>  		err = skb->dst->output(skb);
> -#endif
>
>  		if (likely(err == 0))
>  			return err;
> ==== //depot/mpls-kernel-davem/include/net/mpls.h#4 - /home/jleu/personal/clients/mpls-kernel-davem2/include/net/mpls.h ====
> --- /tmp/tmp.26672.1 2004-02-29 22:19:40.074155576 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/include/net/mpls.h 2004-02-26 22:35:44.000000000 -0600
> @@ -113,6 +113,10 @@
>
>  /* NHLFE entry */
>  struct mpls_nhlfe_route {
> +	union {
> +		struct dst_entry dst;
> +		struct mpls_nhlfe_route *next;
> +	} u;
>  	struct list_head mr_hash;
>  	struct gnet_stats stats;
>  	int mr_ifindex;
> @@ -120,7 +124,7 @@
>  	u32 mr_flags;
>  	u32 mr_nhlfeid;
>  	u32 lt_tclass;
> -	u32 mr_path_hlen;
> +//	u32 mr_path_hlen;
>  	u8 mr_protocol;
>  	u8 mr_ltype;
>  	u8 mr_ttl;
> @@ -144,7 +148,6 @@
>  #endif
>  	struct sockaddr addr;
>
> -	struct neighbour *mr_neigh;
>  	struct mpls_prot_driver *mr_prot;
>  };
>
> @@ -154,16 +157,9 @@
>
>  extern struct mpls_nhlfe_route *mpls_nhlfe_lookup(u32 nhlfeid, int ifindex,u32 space);
>
> -extern void mpls_nhlfe_destroy(struct mpls_nhlfe_route *);
>  extern void mpls_nhlfe_put(struct mpls_nhlfe_route *);
>  void mpls_nhlfe_hold(struct mpls_nhlfe_route *mir);
>
> -static inline void mpls_nhlfe_release(struct mpls_nhlfe_route *mir)
> -{
> -	if (atomic_dec_and_test(&mir->mr_ref))
> -		mpls_nhlfe_destroy(mir);
> -}
> -
>  extern void mpls_bind_neighbour(struct dst_entry *dst);
>  extern int mpls_nhlfe_ucast_output(struct sk_buff *skb);
>

--
James R. Leu
jl...@mi...
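
For anyone following the patch without a kernel tree at hand, the dst
stacking idea can be shown in a few lines of userspace C. This is only a
sketch under stated assumptions: the toy_dst and toy_pkt types and every
toy_* function are hypothetical stand-ins, not kernel APIs, and
refcounting is left out entirely. It mirrors two things from the diff:
the IP output path hands the packet to dst->child when one is attached,
and the usable MTU is reduced by the child's RTAX_MTU metric, which the
patch uses to hold the label overhead in bytes.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pkt;

/* Hypothetical, much-reduced stand-in for the kernel's struct dst_entry. */
struct toy_dst {
	const char *name;
	struct toy_dst *child;       /* stacked entry, or NULL if none */
	unsigned int mtu_metric;     /* path MTU; on a child: overhead in bytes */
	int (*output)(struct toy_pkt *pkt);
};

/* Hypothetical stand-in for an skb: just the dst pointer and a label count. */
struct toy_pkt {
	struct toy_dst *dst;
	unsigned int labels;
};

/* Stand-in for the NHLFE output routine: push one label, then "transmit". */
int toy_nhlfe_output(struct toy_pkt *pkt)
{
	assert(pkt != NULL && pkt->dst != NULL);
	pkt->labels++;
	return 0;
}

/*
 * Stand-in for the IP output path. If a child is stacked on the route,
 * step down to it and let its output routine take over, as the
 * dst->child check in the patch does. The real dst_pop() also handles
 * reference counts, which this sketch ignores.
 */
int toy_ip_output(struct toy_pkt *pkt)
{
	assert(pkt != NULL && pkt->dst != NULL);
	if (pkt->dst->child) {
		pkt->dst = pkt->dst->child;
		return pkt->dst->output(pkt);
	}
	return 0; /* plain IP: nothing stacked, transmit as-is */
}

/* Usable MTU: the route's MTU minus the overhead recorded on the child. */
unsigned int toy_path_mtu(const struct toy_dst *path)
{
	unsigned int mtu = path->mtu_metric;

	if (path->child)
		mtu -= path->child->mtu_metric;
	return mtu;
}
```

With a route whose metric is 1500 and a child carrying a 4 byte
overhead, toy_ip_output() ends up in toy_nhlfe_output() with one label
pushed, and toy_path_mtu() reports 1496. A route with no child is left
untouched and keeps its full 1500.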