Thread: Re: [mpls-linux-devel] Current state of dst stacking on davem implementation (Page 2)

Brought to you by: jleu

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Ramon C. <cas...@in...> - 2004-04-20 11:56:44

Attachments: mpls-davem-dst-stacking.diff

On Mon, 19 Apr 2004, Jamal Hadi Salim wrote:

> On Mon, 2004-04-19 at 02:22, Ramon Casellas wrote:
> 
> Ok, so we are on the same track then. I wasnt sure if that plan was
> still on. 
> BTW, can you extract this patch and send it to me?

Sure, attached.

> I think we need to show some progress and just the dst stacking may be
> insufficient. Probably the easiest one is for me to submit the L2C patch
> sans the MPLS bits to it since the bridging guys need to use that too. 
> Thoughts?

(Side question: are you using p4 after all? Sorry, I recall you saying
'I will give it a try', but I don't know whether you decided to. James is
quite a pro at managing p4 patches.)

Nevertheless, regarding your question: I would say that separating the L2C
patch is an interesting approach that decouples the two concepts. I'd go for it.

OTOH, with the new extensions for L2SC/PWE3 control in (G)MPLS we may need
to work more closely with the bridging people (although most things belong
clearly to userspace). I know that James has been working in this area,
but I don't know the details or whether he has been working with other
people.

And finally, regarding the MPLS stuff itself: I am not sure I understand
what you mean by 'we need to show some progress'. Other things were
discussed in previous mails, but I don't recall us reaching an agreement.
We were discussing MPLS tunnels, NHLFE stacking, new opcodes, etc. The
ideas are there; the only thing we need to define is how these ideas (if
adequate) are going to be integrated.

Some questions: are we free to go and change Dave's core? Should we first
submit mainly Dave's core and then, and only then, work with small
incremental changes? Or should we hold it a little longer and extend Dave's
code with some work before submitting? In this latter case: with what? If you
ask me or James (although I'll let him speak for himself): NHLFE stacking.
You raised the concern of performance. I don't think it really is an issue.

In other words: how do we port features that are in our version to Dave's
core? And finally, there is one important question that I have not dared
to ask until now: there is a notable user base (in research labs) that uses
James's implementation *and* userspace applications that work with it. The
most notable example (for me and some people I know) is the RSVP-TE daemon.
If some existing features like procfs and/or ioctl support are dropped, this
means a step back from the user's point of view (although I tend to think
that we should ask what's best for the kernel and let the userspace apps
lag behind).

Action Points (just my opinion):

* I would say that l2c stuff could (you are the expert) be separated from 
the mpls core. Submit this before the MPLS core and let it become stable 
in the netdev tree.

* I would start by submitting Dave's core + James's dst stacking.

* Keep on working from that focusing on porting some features present in 
James impl. (if we manage to convince you ;)) )

* At some point we should converge...

(btw, I appreciate your offer regarding the inclusion of James and myself
as co-authors)

regards,
R.

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Jamal H. S. <ha...@zn...> - 2004-07-22 20:25:13

Howdy Folks,
Been a while. 

On Tue, 2004-04-20 at 07:22, Ramon Casellas wrote:
> On Mon, 19 Apr 2004, Jamal Hadi Salim wrote:
> 
> > On Mon, 2004-04-19 at 02:22, Ramon Casellas wrote:
> > 
> > Ok, so we are on the same track then. I wasnt sure if that plan was
> > still on. 
> > BTW, can you extract this patch and send it to me?
> 
> Sure, attached.

I would really like to push something now to Davem.
How about this approach to start with:

1) Create a new patch based on 2.6.8-rc2
for original Davem patch + dst changes from James
2) Add both your names in the appropriate files.
3) push to Dave
4) Start pushing things from jleu code into kernel 

Let me know if this is ok with you.

cheers,
jamal

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: James R. L. <jl...@mi...> - 2004-07-23 16:52:09

Hey there,

On Thu, Jul 22, 2004 at 04:23:55PM -0400, Jamal Hadi Salim wrote:
> Howdy Folks,
> Been a while. 
>
> On Tue, 2004-04-20 at 07:22, Ramon Casellas wrote:
> > On Mon, 19 Apr 2004, Jamal Hadi Salim wrote:
> > 
> > > On Mon, 2004-04-19 at 02:22, Ramon Casellas wrote:
> > > 
> > > Ok, so we are on the same track then. I wasnt sure if that plan was
> > > still on. 
> > > BTW, can you extract this patch and send it to me?
> > 
> > Sure, attached.
> 
> I would really like to push something now to Davem.
> How about this approach to start with:
> 
> 1) Create a new patch based on 2.6.8-rc2
> for original Davem patch + dst changes from James
> 2) Add both your names in the appropriate files.
> 3) push to Dave
> 4) Start pushing things from jleu code into kernel 

How about this.  I will create a patch for the current davem code against
2.6.8-rc2 AND a patch for a stripped down version of the jleu code
against 2.6.8-rc2.  We can then see how far apart the implementations
are; I suspect we'll see 95% similarity.  The biggest difference will be the
lack of netlink in the jleu implementation (which I'm working on).

> Let me know if this is ok with you.

I'll produce the patches, and we'll decide what the next step is from there.

> cheers,
> jamal

Laters.

-- 
James R. Leu
jl...@mi...

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Jamal H. S. <ha...@zn...> - 2004-07-23 19:07:38

On Fri, 2004-07-23 at 12:52, James R. Leu wrote:

> > Let me know if this is ok with you.
> 
> I'll produce the patches, and we'll decide what the next step is from there.

Sounds good to me.

cheers,
jamal

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: James R. L. <jl...@mi...> - 2004-08-04 08:56:50

On Fri, Jul 23, 2004 at 03:06:51PM -0400, Jamal Hadi Salim wrote:
> On Fri, 2004-07-23 at 12:52, James R. Leu wrote:
> 
> > > Let me know if this is ok with you.
> > 
> > I'll produce the patches, and we'll decide what the next step is from there.
> 
> Sounds good to me.

I just wanted to let you know I haven't forgotten about this.

I've been busy.  I'm heading off on a trip for the next week.  I tried to get
this done before I leave but, as you can see, it's 4am and I need to get some
sleep before I travel.  I've switched to 2.6.8-rc3 (which I'm sure will
be rc4 or 2.6.8 final by the time I'm back).  I've done some work with netlink,
but I definitely like your L2C stuff better than my work.  Is that going to
be added to 2.6 any time soon?  What about gen_stats?

Anyhow, sorry for the long delay.  I'll get something to the group when
I get back from my trip.

-- 
James R. Leu
jl...@mi...

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Jamal H. S. <ha...@zn...> - 2004-08-04 13:17:43

On Wed, 2004-08-04 at 04:56, James R. Leu wrote:
> On Fri, Jul 23, 2004 at 03:06:51PM -0400, Jamal Hadi Salim wrote:
> > On Fri, 2004-07-23 at 12:52, James R. Leu wrote:
> > 
> > > > Let me know if this is ok with you.
> > > 
> > > I'll produce the patches, and we'll decide what the next step is from there.
> > 
> > Sounds good to me.
> 
> I just wanted to let you know I haven't forgotten about this.
> 
> I've been busy.  I'm heading on a trip for the next week, I tried to get this
> done before I leave, but, as you can see it's 4am, and I need to get some
> sleep before I travel.

I am also traveling thats why i havent bothered you ;->
If you are in the bay area we could get together for a meal or drink.
I am heading home on the weekend.

> I've switched to 2.6.8-rc3 (which I'm sure will
> be rc4 or 2.6.8 final by the time I'm back).  I've done some work with netlink,
> but I definitely like your L2C stuff better than my work.

I should be pushing that - too busy.
There are other clients of it (layer2 related)

>  Is that going to
> be added to 2.6 any time soon?  What about gen_stats?

Gen stats and the estimator as well are independent and i should be able
to push them separately.
Maybe I should be pushing this first.
Note, the old code didnt have any flushing or dumping of tables which
is a different code path (L2C was ready for it).

> Any who, sorry for the long delay, I'll get something to the group when
> I get back from my trip.

The sooner we do this the better. Dave is hot on getting something in so
lets take advantage of it.
Shall i push L2c and the stats stuff first?

cheers,
jamal

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: James R. L. <jl...@mi...> - 2004-08-04 13:30:49

Hello,

Yes, I think it is a good idea to start pushing the gen_stats and
l2c netlink code.  Both MPLS implementations can benefit from them,
as can other projects.

Laters

On Wed, Aug 04, 2004 at 09:17:24AM -0400, Jamal Hadi Salim wrote:
> On Wed, 2004-08-04 at 04:56, James R. Leu wrote:
> > On Fri, Jul 23, 2004 at 03:06:51PM -0400, Jamal Hadi Salim wrote:
> > > On Fri, 2004-07-23 at 12:52, James R. Leu wrote:
> > > 
> > > > > Let me know if this is ok with you.
> > > > 
> > > > I'll produce the patches, and we'll decide what the next step is from there.
> > > 
> > > Sounds good to me.
> > 
> > I just wanted to let you know I haven't forgotten about this.
> > 
> > I've been busy.  I'm heading on a trip for the next week, I tried to get this
> > done before I leave, but, as you can see it's 4am, and I need to get some
> > sleep before I travel.
> 
> I am also traveling thats why i havent bothered you ;->
> If you are in the bay area we could get together for a meal or drink.
> I am heading home on the weekend.
> 
> > I've switched to 2.6.8-rc3 (which I'm sure will
> > be rc4 or 2.6.8 final by the time I'm back).  I've done some work with netlink,
> > but I definitely like your L2C stuff better than my work.
> 
> I should be pushing that - too busy.
> There are other clients of it (layer2 related)
> 
> >  Is that going to
> > be added to 2.6 any time soon?  What about gen_stats?
> 
> Gen stats and the estimator as well are independent and i should be able
> to push them separately.
> Maybe I should be pushing this first.
> Note, the old code didnt have any flushing or dumping of tables which
> is a different code path (L2C was ready for it).
> 
> > Any who, sorry for the long delay, I'll get something to the group when
> > I get back from my trip.
> 
> The sooner we do this the better. Dave is hot on getting something in so
> lets take advantage of it.
> Shall i push L2c and the stats stuff first?
> 
> cheers,
> jamal

-- 
James R. Leu
jl...@mi...

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Jamal H. S. <ha...@zn...> - 2004-08-14 02:41:36

On Wed, 2004-08-04 at 09:30, James R. Leu wrote:
> Hello,
> 
> Yes, I think it is a good idea to start pushing the gen_stats and
> l2c netlink code.  Both MPLS implementations can benefit from them,
> as can other projects.

I am back home.
I will start pushing this weekend.

cheers,
jamal

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: James R. L. <jl...@mi...> - 2004-08-15 17:43:49

Good timing.  I just got back from my trip too :-)

On Fri, Aug 13, 2004 at 10:41:13PM -0400, Jamal Hadi Salim wrote:
> On Wed, 2004-08-04 at 09:30, James R. Leu wrote:
> > Hello,
> > 
> > Yes, I think it is a good idea to start pushing the gen_stats and
> > l2c netlink code.  Both MPLS implementations can benefit from them,
> > as can other projects.
> 
> I am back home.
> I will start pushing this weekend.
> 
> cheers,
> jamal

-- 
James R. Leu
jl...@mi...

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: James R. L. <jl...@mi...> - 2004-04-20 13:38:25

Sorry for being absent for the last couple of weeks, I'm in the midst of
the busiest time of the year for me.  I'm preparing for the Network + InterOp
tradeshow.  This year I'm working on an 'Advanced Internetworking Initiative'
which basically means MPLS services :-)

We have equipment from 10 vendors and over 20 devices.  We'll be concentrating
on MPLS BGP VPNs (v4, v6, Multicast), VPLS, and Carrier's Carrier.  Our main
area of 'experimenting' is in the multicast space.  We're showing carriers
and vendors how BGP VPNs have an advantage (although only a slight one) over
L2 VPNs (a la VPLS) with respect to multicast.  Hopefully the result will be
L2 VPN vendors re-thinking their implementations.

Anyway enough of that, onto the questions I've missed.

On Tue, Apr 20, 2004 at 01:22:58PM +0200, Ramon Casellas wrote:
> On Mon, 19 Apr 2004, Jamal Hadi Salim wrote:
> 
> > On Mon, 2004-04-19 at 02:22, Ramon Casellas wrote:
<snip>
> OTOH, with the new extensions for L2SC/PWE3 control in (G)MPLS we may need
> to work more closely with the bridging people (although most things belong
> clearly to userspace). I know that James has been working in this area,
> but I don't know the details or whether he has been working with other
> people.

My work, as usual, is a solo effort.  If anything I've done can be of
use, I'd be more than happy to contribute.

> And finally, regarding the MPLS stuff itself: I am not sure I understand
> what you mean by 'we need to show some progress'. Other things were
> discussed in previous mails, but I don't recall us reaching an agreement.
> We were discussing MPLS tunnels, NHLFE stacking, new opcodes, etc. The
> ideas are there; the only thing we need to define is how these ideas (if
> adequate) are going to be integrated.

I have created a separate branch, //depot/mpls-kernel-merger/..., which
is the davem code with the dst stacking added.  The plan is to migrate
features from our code into the davem code.  Then, when jamal feels that we
have made some good progress, he can integrate the changes into the davem
branch.  (I can help with all of the integrations.)
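
For anyone who has not looked at the attached diff yet, here is a rough
user-space sketch of what "dst stacking" means in it: each dst can have a
child, and an output routine that sees a child steps down to it and calls the
child's output.  Everything prefixed toy_ is made up for illustration and is
not kernel code; the real dst_pop() also deals with reference counts, and the
extra "dev" layer at the bottom is only an assumption to give the toy chain an
end.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's struct dst_entry and struct sk_buff.
 * All names here are hypothetical; this is an illustration only. */
struct toy_skb;

struct toy_dst {
	struct toy_dst *child;               /* next dst in the stack, or NULL */
	int (*output)(struct toy_skb *skb);  /* output handler for this layer */
	const char *name;
};

struct toy_skb {
	struct toy_dst *dst;   /* dst the packet is currently attached to */
	int labels_pushed;     /* how many labels were pushed on the way down */
	const char *sent_via;  /* name of the layer that finally transmitted */
};

/* Step down to the child dst (no reference counting in this toy). */
static struct toy_dst *toy_dst_pop(struct toy_dst *dst)
{
	return dst->child;
}

/* Bottom of the stack: pretend to hand the packet to the device. */
static int toy_dev_output(struct toy_skb *skb)
{
	skb->sent_via = skb->dst->name;
	return 0;
}

/* NHLFE layer: push one label, then pass the packet on to the child. */
static int toy_nhlfe_output(struct toy_skb *skb)
{
	skb->labels_pushed++;
	assert(skb->dst->child != NULL);
	skb->dst = toy_dst_pop(skb->dst);
	return skb->dst->output(skb);
}

/* IP layer: same shape as the hunk the patch adds to ip_finish_output2(). */
static int toy_ip_output(struct toy_skb *skb)
{
	if (skb->dst->child) {
		skb->dst = toy_dst_pop(skb->dst);
		return skb->dst->output(skb);
	}
	return toy_dev_output(skb);
}

/* Build ip -> nhlfe0 -> nhlfe1 -> dev and send one packet through it.
 * Returns the number of labels pushed on the way down. */
static int toy_send_through_stack(void)
{
	struct toy_dst dev   = { NULL,   toy_dev_output,   "dev"    };
	struct toy_dst inner = { &dev,   toy_nhlfe_output, "nhlfe1" };
	struct toy_dst outer = { &inner, toy_nhlfe_output, "nhlfe0" };
	struct toy_dst ip    = { &outer, toy_ip_output,    "ip"     };
	struct toy_skb skb   = { &ip, 0, NULL };

	ip.output(&skb);
	assert(skb.sent_via == dev.name);
	return skb.labels_pushed;
}
```

Calling toy_send_through_stack() walks the whole chain once; hanging a second
NHLFE off the first is all it takes to get a second label, which is why the
stacking approach needs no special opcode.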

> Some questions: are we free to go and change Dave's core? Should we first
> submit mainly Dave's core and then, and only then, work with small
> incremental changes? Or should we hold it a little longer and extend Dave's
> code with some work before submitting? In this latter case: with what? If you
> ask me or James (although I'll let him speak for himself): NHLFE stacking.
> You raised the concern of performance. I don't think it really is an issue.

One thing to note is that the great thing about a dual instruction scheme
(one set on the ILM and one set on the NHLFE) is that you can optimize the
instructions for speed or flexibility.  If you want to create a PUSHN
instruction, you can easily add it and use it instead of the NHLFE stacking
(if that is what your particular application needs).  I still think that NHLFE
stacking is what will be used for the common hierarchical LSP case, though.
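
To make the PUSHN-versus-stacking trade-off a bit more concrete, here is a
small, untested-in-kernel sketch.  The opcode names, structures and label
values (TOY_OP_PUSHN, struct toy_instr, 100/200) are invented for this mail
and do not come from either code base; the point is only that two stacked
NHLFEs with one PUSH each and a single NHLFE with one PUSHN can end up
building the same label stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_LABELS 8

/* Hypothetical opcode set; TOY_OP_PUSHN is the "PUSHN" idea from above. */
enum toy_opcode { TOY_OP_PUSH, TOY_OP_PUSHN };

struct toy_instr {
	enum toy_opcode op;
	uint32_t labels[TOY_MAX_LABELS]; /* PUSH uses labels[0] only */
	size_t n_labels;                 /* 1 for PUSH, up to TOY_MAX_LABELS for PUSHN */
};

struct toy_label_stack {
	uint32_t label[TOY_MAX_LABELS];
	size_t depth;                    /* label[depth - 1] is the top of stack */
};

/* Run one NHLFE instruction; returns 0 on success, -1 if it would overflow. */
static int toy_run(const struct toy_instr *in, struct toy_label_stack *st)
{
	size_t n = (in->op == TOY_OP_PUSH) ? 1 : in->n_labels;
	size_t i;

	assert(st->depth <= TOY_MAX_LABELS);
	if (n > TOY_MAX_LABELS - st->depth)
		return -1;
	for (i = 0; i < n; i++)
		st->label[st->depth++] = in->labels[i];
	return 0;
}

/* Flexible path: one PUSH per stacked NHLFE, innermost label first. */
static int toy_push_stacked(struct toy_label_stack *st)
{
	const struct toy_instr inner = { TOY_OP_PUSH, { 100 }, 1 };
	const struct toy_instr outer = { TOY_OP_PUSH, { 200 }, 1 };

	if (toy_run(&inner, st) < 0)
		return -1;
	return toy_run(&outer, st);
}

/* Fast path: a single NHLFE carrying one PUSHN with both labels. */
static int toy_push_pushn(struct toy_label_stack *st)
{
	const struct toy_instr both = { TOY_OP_PUSHN, { 100, 200 }, 2 };

	return toy_run(&both, st);
}
```

Both helpers leave label 100 at the bottom and 200 on top.  The PUSHN variant
does it with one lookup, while the stacked variant lets the outer NHLFE be
shared by many inner LSPs, which is the hierarchical case.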

> In other words: how do we port features that are in our version to Dave's
> core? And finally, there is one important question that I have not dared
> to ask until now: there is a notable user base (in research labs) that uses
> James's implementation *and* userspace applications that work with it. The
> most notable example (for me and some people I know) is the RSVP-TE daemon.
> If some existing features like procfs and/or ioctl support are dropped, this
> means a step back from the user's point of view (although I tend to think
> that we should ask what's best for the kernel and let the userspace apps
> lag behind).
> 
> 
> Action Points (just my opinion):
> 
> * I would say that l2c stuff could (you are the expert) be separated from 
> the mpls core. Submit this before the MPLS core and let it become stable 
> in the netdev tree.

Agreed.  I want to use the L2C netlink code, but I'm trying to minimize
the number of non-mainstream patches I'm tracking.  For now I've built
on top of Ramon's netlink code.  It doesn't work yet, but the guts of it are
there.

> * I would start by submitting Dave's core + James Dst stacking.

Let's make one or two more changes before we send it back to davem for review.
(I sent an email at the end of March with my suggested development path
for the davem code; I'll dig it up and resend.)

> * Keep on working from that focusing on porting some features present in 
> James impl. (if we manage to convince you ;)) )
> 
> * At some point we should converge...

Agreed.

> (btw, I appreciate your offer regarding the inclusion of James and myself
> as co-authors)

Yes, thank you.

Like I said above, I'll find my previous email and resend.

BTW, do either of you have a good upstream connection?  If so, we could
create a p4 proxy with a high level of caching.  Then only the commits
would need to consume my limited bandwidth.  (I'm researching getting a better
upstream, but options are limited in my suburb, which was a farmer's corn
field last year at this time.)


> ==== //depot/mpls-kernel-davem/net/ipv4/fib_semantics.c#5 - /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/fib_semantics.c ====
> --- /tmp/tmp.26663.0	2004-02-29 22:19:34.063069400 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/fib_semantics.c	2004-02-26 22:35:44.000000000 -0600
> @@ -42,6 +42,9 @@
>  #include <net/tcp.h>
>  #include <net/sock.h>
>  #include <net/ip_fib.h>
> +#ifdef CONFIG_NET_MPLS
> +#include <net/mpls.h>
> +#endif
>  
>  #define FSprintk(a...)
>  
> @@ -169,16 +172,6 @@
>  			fi->fib_prev->fib_next = fi->fib_next;
>  		if (fi == fib_info_list)
>  			fib_info_list = fi->fib_next;
> -#ifdef CONFIG_NET_MPLS
> -		if (fi->fib_nh && fi->fib_nh->nh_mpls_fec) {
> -			struct mpls_nhlfe_route *mpls;
> -			struct fib_nh *nh = fi->fib_nh;
> -			mpls = mpls_nhlfe_lookup(nh->nh_mpls_fec, 0, 0);
> -			if (NULL != mpls) {
> -				mpls_nhlfe_put(mpls);
> -			}
> -		}
> -#endif
>  		fi->fib_dead = 1;
>  		fib_info_put(fi);
>  	}
> ==== //depot/mpls-kernel-davem/net/ipv4/ip_output.c#4 - /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/ip_output.c ====
> --- /tmp/tmp.26663.1	2004-02-29 22:19:34.393019240 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/ip_output.c	2004-02-26 22:35:44.000000000 -0600
> @@ -174,6 +174,11 @@
>  	struct net_device *dev = dst->dev;
>  	int hh_len = LL_RESERVED_SPACE(dev);
>  
> +	if (dst->child) {
> +		skb->dst = dst_pop(skb->dst);
> +		return skb->dst->output(skb);
> +	}
> +
>  	/* Be paranoid, rather than too clever. */
>  	if (unlikely(skb_headroom(skb) < hh_len && dev->hard_header)) {
>  		struct sk_buff *skb2;
> @@ -219,7 +224,7 @@
>  	skb->protocol = htons(ETH_P_IP);
>  
>  	return NF_HOOK(PF_INET, NF_IP_POST_ROUTING, skb, NULL, dev,
> -		       net_output_maybe_mpls(skb->dst, ip_finish_output2));
> +		ip_finish_output2);
>  }
>  
>  int ip_mc_output(struct sk_buff *skb)
> ==== //depot/mpls-kernel-davem/net/ipv4/route.c#4 - /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/route.c ====
> --- /tmp/tmp.26663.2	2004-02-29 22:19:34.600987624 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv4/route.c	2004-02-26 22:35:44.000000000 -0600
> @@ -103,6 +103,9 @@
>  #ifdef CONFIG_SYSCTL
>  #include <linux/sysctl.h>
>  #endif
> +#ifdef CONFIG_NET_MPLS
> +#include <net/mpls.h>
> +#endif
>  
>  #define IP_MAX_MTU	0xFFF0
>  
> @@ -1399,6 +1402,20 @@
>  			rt->rt_gateway = FIB_RES_GW(*res);
>  		memcpy(rt->u.dst.metrics, fi->fib_metrics,
>  		       sizeof(rt->u.dst.metrics));
> +
> +#ifdef CONFIG_NET_MPLS
> +		if (FIB_RES_MPLS_FEC(*res)) {
> +			struct mpls_nhlfe_route *mpls;
> +
> +			mpls = mpls_nhlfe_lookup(FIB_RES_MPLS_FEC(*res),
> +				rt->u.dst.dev->ifindex,0);
> +			if (!IS_ERR(mpls)) {
> +				dst_hold(&mpls->u.dst);
> +				rt->u.dst.child = &mpls->u.dst;
> +			}
> +		}
> +#endif
> +
>  		if (fi->fib_mtu == 0) {
>  			rt->u.dst.metrics[RTAX_MTU-1] = rt->u.dst.dev->mtu;
>  			if (rt->u.dst.metrics[RTAX_LOCK-1] & (1 << RTAX_MTU) &&
> @@ -1705,23 +1722,6 @@
>  
>  	rth->rt_flags = flags;
>  
> -#ifdef CONFIG_NET_MPLS
> -	if (res.fi && FIB_RES_MPLS_FEC(res)) {
> -		struct mpls_nhlfe_route *mpls;
> -
> -		mpls = mpls_nhlfe_lookup(FIB_RES_MPLS_FEC(res), out_dev->dev->ifindex,0);
> -		if (IS_ERR(mpls)) {
> -			rt_drop(rth);
> -			err = PTR_ERR(mpls);
> -			goto done;
> -		}
> -
> -		rth->u.dst.mpls = mpls;
> -		mpls_bind_neighbour(&rth->u.dst);
> -	}
> -#endif
> -
> -
>  #ifdef CONFIG_NET_FASTROUTE
>  	if (netdev_fastroute && !(flags&(RTCF_NAT|RTCF_MASQ|RTCF_DOREDIRECT))) {
>  		struct net_device *odev = rth->u.dst.dev;
> @@ -2204,22 +2204,6 @@
>  
>  	rt_set_nexthop(rth, &res, 0);
>  	
> -#ifdef CONFIG_NET_MPLS
> -	if (res.fi && FIB_RES_MPLS_FEC(res)) {
> -		struct mpls_nhlfe_route *mpls;
> -
> -		mpls = mpls_nhlfe_lookup(FIB_RES_MPLS_FEC(res), dev_out->ifindex,0);
> -		if (IS_ERR(mpls)) {
> -			rt_drop(rth);
> -			err = PTR_ERR(mpls);
> -			goto done;
> -		}
> -
> -		rth->u.dst.mpls = mpls;
> -		mpls_bind_neighbour(&rth->u.dst);
> -	}
> -#endif
> -
>  	rth->rt_flags = flags;
>  
>  	hash = rt_hash_code(oldflp->fl4_dst, oldflp->fl4_src ^ (oldflp->oif << 5), tos);
> ==== //depot/mpls-kernel-davem/net/ipv6/ip6_output.c#4 - /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv6/ip6_output.c ====
> --- /tmp/tmp.26663.3	2004-02-29 22:19:34.689974096 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv6/ip6_output.c	2004-02-26 22:35:44.000000000 -0600
> @@ -75,6 +75,11 @@
>  	struct dst_entry *dst = skb->dst;
>  	struct hh_cache *hh = dst->hh;
>  
> +	if (dst->child) {
> +		skb->dst = dst_pop(skb->dst);
> +		return dst_output(skb);
> +	}
> +
>  	if (hh) {
>  		int hh_alen;
>  
> @@ -138,9 +143,7 @@
>  
>  		IP6_INC_STATS(Ip6OutMcastPkts);
>  	}
> -
> -	return NF_HOOK(PF_INET6, NF_IP6_POST_ROUTING, skb,NULL, skb->dev,
> -		       net_output_maybe_mpls(dst, ip6_output_finish));
> +	return NF_HOOK(PF_INET6, NF_IP6_POST_ROUTING, skb,NULL, skb->dev,ip6_output_finish);
>  }
>  
>  int ip6_output(struct sk_buff *skb)
> ==== //depot/mpls-kernel-davem/net/ipv6/route.c#4 - /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv6/route.c ====
> --- /tmp/tmp.26663.4	2004-02-29 22:19:34.868946888 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/ipv6/route.c	2004-02-26 22:35:44.000000000 -0600
> @@ -21,6 +21,7 @@
>   *		- select from (probably) reachable routers (i.e.
>   *		routers in REACHABLE, STALE, DELAY or PROBE states).
>   *		- always select the same router if it is (probably)
> +5B
>   *		reachable.  otherwise, round-robin the list.
>   */
>  
> @@ -59,6 +60,9 @@
>  #ifdef CONFIG_SYSCTL
>  #include <linux/sysctl.h>
>  #endif
> +#ifdef CONFIG_NET_MPLS
> +#include <net/mpls.h>
> +#endif
>  
>  /* Set to 3 to get tracing. */
>  #define RT6_DEBUG 2
> @@ -838,13 +842,12 @@
>  		if (rt->rt6i_mpls_fec) {
>  			struct mpls_nhlfe_route *mpls;
>  
> -			mpls = mpls_nhlfe_lookup(rt->rt6i_mpls_fec, dev->ifindex,0);
> -			if (IS_ERR(mpls)) {
> -				err = PTR_ERR(mpls);
> -				goto out;
> +			mpls = mpls_nhlfe_lookup(rt->rt6i_mpls_fec,
> +				dev->ifindex,0);
> +			if (!IS_ERR(mpls)) {
> +				dst_hold(&mpls->u.dst);
> +				rt->u.dst.child = &mpls->u.dst;
>  			}
> -			rt->u.dst.mpls = mpls;
> -			mpls_bind_neighbour(&rt->u.dst);
>  		}
>  	}
>  #endif
> ==== //depot/mpls-kernel-davem/net/mpls/mpls_fib.c#4 - /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_fib.c ====
> --- /tmp/tmp.26663.5	2004-02-29 22:19:34.950934424 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_fib.c	2004-02-26 22:35:44.000000000 -0600
> @@ -20,6 +20,8 @@
>  #include <linux/proc_fs.h>
>  #include <linux/seq_file.h>
>  #include <linux/init.h>
> +#include <net/mpls.h>
> +#include <net/mpls_ilm.h>
>  #include <linux/rtnetlink.h>
>  #include <linux/l2cnetlink.h>
>  #include <linux/string.h>
> @@ -28,20 +30,19 @@
>  #include <net/neighbour.h>
>  #include <net/dst.h>
>  #include <net/flow.h>
> -#include <net/mpls.h>
> -#include <net/mpls_ilm.h>
>  
> +static void mpls_nhlfe_event(int event, struct mpls_nhlfe_route *mpls);
>  
>  /*
>   *	Interface to generic destination cache.
>   */
>  
> -static struct dst_entry *mpls_dst_check(struct dst_entry *dst, u32 cookie);
> -static void		 mpls_dst_destroy(struct dst_entry *dst);
> -static struct dst_entry *mpls_negative_advice(struct dst_entry *dst);
> -static void		 mpls_link_failure(struct sk_buff *skb);
> -static void		 mpls_update_pmtu(struct dst_entry *dst, u32 mtu);
> -static int mpls_garbage_collect(void);
> +static struct dst_entry *mpls_ilm_check(struct dst_entry *dst, u32 cookie);
> +static void		 mpls_ilm_destroy(struct dst_entry *dst);
> +static struct dst_entry *mpls_ilm_negative_advice(struct dst_entry *dst);
> +static void		 mpls_ilm_link_failure(struct sk_buff *skb);
> +static void		 mpls_ilm_update_pmtu(struct dst_entry *dst, u32 mtu);
> +static int		 mpls_ilm_garbage_collect(void);
>  
>  #define LT_HASH_ENTS	256
>  #define LT_HASH_MASK	(LT_HASH_ENTS - 1)
> @@ -53,54 +54,51 @@
>  
>  static struct lt_hash_bucket *lt_hash_table;
>  
> -static struct dst_ops mpls_unicast_dst_ops = {
> +static struct dst_ops mpls_ilm_ops = {
>  	.family =		AF_MPLS,
>  	.protocol =		__constant_htons(ETH_P_MPLS_UC),
> -	.gc =			mpls_garbage_collect,
> -	.check =		mpls_dst_check,
> -	.destroy =		mpls_dst_destroy,
> -	.negative_advice =	mpls_negative_advice,
> -	.link_failure =		mpls_link_failure,
> -	.update_pmtu =		mpls_update_pmtu,
> +	.gc =			mpls_ilm_garbage_collect,
> +	.check =		mpls_ilm_check,
> +	.destroy =		mpls_ilm_destroy,
> +	.negative_advice =	mpls_ilm_negative_advice,
> +	.link_failure =		mpls_ilm_link_failure,
> +	.update_pmtu =		mpls_ilm_update_pmtu,
>  	.entry_size =		sizeof(struct ltable),
>  };
>  
> -static struct dst_entry *mpls_dst_check(struct dst_entry *dst, u32 cookie)
> +static struct dst_entry *mpls_ilm_check(struct dst_entry *dst, u32 cookie)
>  {
>  	printk("mpls_dst_check\n");
>  	dst_release(dst);
>  	return NULL;
>  }
>  
> -static void mpls_dst_destroy(struct dst_entry *dst)
> +static void mpls_ilm_destroy(struct dst_entry *dst)
>  {
> -	if (NULL != dst->mpls) {
> -		printk("mpls_dst_destroy nuking mpls\n");
> -		mpls_nhlfe_release(dst->mpls);
> -	}
> +	printk("mpls_dst_destroy\n");
>  }
>  
> -static struct dst_entry *mpls_negative_advice(struct dst_entry *dst)
> +static struct dst_entry *mpls_ilm_negative_advice(struct dst_entry *dst)
>  {
>  	printk("mpls_negative_advice\n");
>  	dst_release(dst);
>  	return NULL;
>  }
>  
> -static void mpls_link_failure(struct sk_buff *skb)
> +static void mpls_ilm_link_failure(struct sk_buff *skb)
>  {
>  	printk("mpls_link_failure\n");
>  }
>  
> -static void mpls_update_pmtu(struct dst_entry *dst, u32 mtu)
> +static void mpls_ilm_update_pmtu(struct dst_entry *dst, u32 mtu)
>  {
>  	printk("mpls_update_pmtu\n");
>  }
>  
>  /* we really dont need this since we have a static table*/
> -static int mpls_garbage_collect(void)
> +static int mpls_ilm_garbage_collect(void)
>  { 
> -	printk("mpls_garbage_collect %d entries \n",atomic_read(&mpls_unicast_dst_ops.entries));
> +	printk("mpls_garbage_collect %d entries \n",atomic_read(&mpls_ilm_ops.entries));
>  	return 0;
>  }
>  
> @@ -141,7 +139,7 @@
>  /* not used at the moment */
>  static struct ltable *mpls_build_blackhole(u32 label)
>  {
> -	struct ltable *lth = dst_alloc(&mpls_unicast_dst_ops);
> +	struct ltable *lth = dst_alloc(&mpls_ilm_ops);
>  
>  	if (!lth)
>  		return NULL;
> @@ -258,7 +256,7 @@
>  	lth->lt_space = space;
>  	lth->lt_ifindex = ifindex;
>  	lth->u.dst.dev	= dev;
> -	lth->u.dst.mpls = mpls;
> +	lth->u.dst.child = &mpls->u.dst;
>  
>  	return 0;
>  }
> @@ -394,7 +392,7 @@
>  	if (NULL == __dev_get_by_index(ilm->in_ifindex))
>  		return -ENODEV;
>  
> -	lt = dst_alloc(&mpls_unicast_dst_ops);
> +	lt = dst_alloc(&mpls_ilm_ops);
>  	if (!lt)
>  		return -ENOMEM;
>  
> @@ -566,6 +564,69 @@
>  	return err;
>  }
>  
> +static struct dst_entry *mpls_nhlfe_check(struct dst_entry *dst, u32 cookie);
> +static void		 mpls_nhlfe_destroy(struct dst_entry *dst);
> +static struct dst_entry *mpls_nhlfe_negative_advice(struct dst_entry *dst);
> +static void		 mpls_nhlfe_link_failure(struct sk_buff *skb);
> +static void		 mpls_nhlfe_update_pmtu(struct dst_entry *dst, u32 mtu);
> +static int		 mpls_nhlfe_garbage_collect(void);
> +
> +static struct dst_ops mpls_nhlfe_ops = {
> +	.family =		AF_MPLS,
> +	.protocol =		__constant_htons(ETH_P_MPLS_UC),
> +	.gc =			mpls_nhlfe_garbage_collect,
> +	.check =		mpls_nhlfe_check,
> +	.destroy =		mpls_nhlfe_destroy,
> +	.negative_advice =	mpls_nhlfe_negative_advice,
> +	.link_failure =		mpls_nhlfe_link_failure,
> +	.update_pmtu =		mpls_nhlfe_update_pmtu,
> +	.entry_size =		sizeof(struct mpls_nhlfe_route),
> +};
> +
> +static struct dst_entry *mpls_nhlfe_check(struct dst_entry *dst, u32 cookie)
> +{
> +	printk("mpls_dst_check\n");
> +	dst_release(dst);
> +	return NULL;
> +}
> +
> +static void mpls_nhlfe_destroy(struct dst_entry *dst)
> +{
> +	struct mpls_nhlfe_route *mir = (struct mpls_nhlfe_route*)dst;
> +	printk("mpls_dst_destroy nuking mpls\n");
> +
> +	list_del(&mir->mr_hash);
> +	INIT_LIST_HEAD(&mir->mr_hash);
> +	mpls_nhlfe_event(L2CM_DELNHLFE,mir);
> +	kfree(mir->opcodes);
> +	mpls_put_prot(mir->mr_prot);
> +}
> +
> +static struct dst_entry *mpls_nhlfe_negative_advice(struct dst_entry *dst)
> +{
> +	printk("mpls_negative_advice\n");
> +	dst_release(dst);
> +	return NULL;
> +}
> +
> +static void mpls_nhlfe_link_failure(struct sk_buff *skb)
> +{
> +	printk("mpls_link_failure\n");
> +}
> +
> +static void mpls_nhlfe_update_pmtu(struct dst_entry *dst, u32 mtu)
> +{
> +	printk("mpls_update_pmtu\n");
> +}
> +
> +/* we really dont need this since we have a static table*/
> +static int mpls_nhlfe_garbage_collect(void)
> +{ 
> +	printk("mpls_garbage_collect %d entries \n",atomic_read(&mpls_nhlfe_ops.entries));
> +	return 0;
> +}
> +
> +
>  int
>  mpls_build_nhlfe_route(struct mpls_nhlfe_route *mir,
>  		struct sockaddr *sock_addr, struct nhlfemsg *nh)
> @@ -579,7 +640,6 @@
>  		goto out;
>  #endif
>  
> -	memset(mir,0,sizeof(*mir));
>  	prot = mpls_get_prot(nh->nh_proto);
>  	if (!prot)
>  		goto out_free_mir;
> @@ -589,9 +649,13 @@
>  		goto out_put_prot;
>  
>  	INIT_LIST_HEAD(&mir->mr_hash);
> +
> +	mir->u.dst.dev		= &loopback_dev;
> +	mir->u.dst.output	= mpls_nhlfe_ucast_output;
> +	mir->u.dst.neighbour	= neigh;
> +
>  	mir->mr_nhlfeid		= nh->nh_nhlfeid;
>  	mir->mr_ifindex		= nh->nh_ifindex;
> -	mir->mr_neigh		= neigh;
>  //	mir->mr_hh		= hh;
>  	mir->mr_prot		= prot;
>  	mir->mr_protocol	= nh->nh_proto;
> @@ -609,36 +673,14 @@
>  	return err;
>  }
>  
> -void mpls_nhlfe_destroy(struct mpls_nhlfe_route *mir)
> -{
> -	BUG_ON(!list_empty(&mir->mr_hash));
> -	kfree(mir->opcodes);
> -	neigh_release(mir->mr_neigh);
> -	mpls_put_prot(mir->mr_prot);
> -	kfree(mir);
> -}
> -
>  void mpls_nhlfe_hold(struct mpls_nhlfe_route *mpls)
>  {
> -
> -	atomic_inc(&mpls->mr_ref);
> +	dst_hold(&mpls->u.dst);
>  }
>  
>  void mpls_nhlfe_put(struct mpls_nhlfe_route *mpls)
>  {
> -	atomic_dec(&mpls->mr_ref);
> -}
> -
> -void mpls_bind_neighbour(struct dst_entry *dst)
> -{
> -	struct mpls_nhlfe_route *mr = dst->mpls;
> -	struct neighbour *neigh = dst->neighbour;
> -
> -	BUG_ON(!mr || !mr->mr_neigh);
> -
> -	if (neigh)
> -		neigh_release(neigh);
> -	dst->neighbour = neigh_clone(mr->mr_neigh);
> +	dst_release(&mpls->u.dst);
>  }
>  
>  void print_nhlfe(struct mpls_nhlfe_route *mpls)
> @@ -654,7 +696,7 @@
>  		 ins++;
>  	}
>  
> -	printk("mtu adjustment %d\n",mpls->mr_path_hlen);
> +	printk("mtu adjustment %d\n",mpls->u.dst.metrics[RTAX_MTU -1]);
>  
>  	if (mpls->mr_protocol == AF_INET) {
>  		struct sockaddr_in *addr = (struct sockaddr_in *)&mpls->addr;
> @@ -779,11 +821,11 @@
>  		}
>  	}
>  
> -	mpls = kmalloc(sizeof(*mpls), GFP_KERNEL);
> +	mpls = dst_alloc(&mpls_nhlfe_ops);
>  	if (!mpls)
>  		return -ENOMEM;
>  
> -	err = mpls_build_nhlfe_route(mpls,&sock_addr, nh);
> +	err = mpls_build_nhlfe_route(mpls, &sock_addr, nh);
>  
>  	if (0 > err)  {
>  		return err;
> @@ -802,7 +844,7 @@
>  
>  	memcpy(mpls->opcodes,ins,lb);
>  	mpls->n_opcodes = lc;
> -	mpls->mr_path_hlen = lb;
> +	mpls->u.dst.metrics[RTAX_MTU - 1] = lb;
>  	memcpy(&mpls->addr,&sock_addr,sizeof(struct sockaddr));
>  	mpls_nhlfe_hold(mpls);
>  	err = mpls_nhlfe_intern(mpls);
> @@ -820,7 +862,7 @@
>  	kfree(mpls->opcodes);
>  err_op:
>  	mpls_put_prot(mpls->mr_prot);
> -	neigh_release(mpls->mr_neigh);
> +	neigh_release(mpls->u.dst.neighbour);
>  	kfree(mpls);
>  	return  err;
>  }
> @@ -833,17 +875,11 @@
>  	spin_lock_bh(&mpls_nhlfe_lock);
>  	mpls = __mpls_nhlfe_lookup(nh->nh_nhlfeid, nh->nh_ifindex, nh->nh_space);
>  	if (NULL != mpls) {
> -		int usrs = atomic_read(&mpls->mr_ref);
> -		if (usrs > 1) {
> -			printk(" %d users probably FTN or ILM still holding NHLFE route\n",usrs - 1);
> -			err = -EINVAL;
> -		} else {
> -			list_del(&mpls->mr_hash);
> -			INIT_LIST_HEAD(&mpls->mr_hash);
> -			mpls_nhlfe_event(L2CM_DELNHLFE,mpls);
> -			mpls_nhlfe_release(mpls);
> -			err = 0;
> -		}
> +		mpls_nhlfe_put(mpls);
> +		call_rcu (&mpls->u.dst.rcu_head, (void (*)(void *))dst_free,
> +			&mpls->u.dst);
> +		rt_cache_flush(0);
> +		err = 0;
>  	}
>  	spin_unlock_bh(&mpls_nhlfe_lock);
>  
> @@ -945,11 +981,18 @@
>  		lt_hash_table[i].chain = NULL;
>  	}
>  
> -	mpls_unicast_dst_ops.kmem_cachep = kmem_cache_create("mpls_u_dst_cache",
> +	mpls_ilm_ops.kmem_cachep = kmem_cache_create("mpls_ilm_cache",
>  		sizeof(struct ltable), 0, SLAB_HWCACHE_ALIGN, NULL, NULL);
>  
> -	if (!mpls_unicast_dst_ops.kmem_cachep)
> -		panic("MPLS: failed to allocate mpls_u_dst_cache\n");
> +	if (!mpls_ilm_ops.kmem_cachep)
> +		panic("MPLS: failed to allocate mpls_ilm_cache\n");
> +
> +	mpls_nhlfe_ops.kmem_cachep = kmem_cache_create("mpls_nhlfe_cache",
> +		sizeof(struct mpls_nhlfe_route), 0, SLAB_HWCACHE_ALIGN,
> +		NULL, NULL);
> +
> +	if (!mpls_nhlfe_ops.kmem_cachep)
> +		panic("MPLS: failed to allocate mpls_nhlfe_cache\n");
>  
>  	for (i = 0; i < MPLS_NHLFE_HASHSZ; i++) {
>  		INIT_LIST_HEAD(&mpls_nhlfe_hash[i]);
> ==== //depot/mpls-kernel-davem/net/mpls/mpls_forward.c#3 - /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_forward.c ====
> --- /tmp/tmp.26663.6	2004-02-29 22:19:34.981929712 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_forward.c	2004-02-26 22:35:44.000000000 -0600
> @@ -46,7 +46,7 @@
>  	mplsh = (u32 *)skb->nh.raw;
>  	lt = (struct ltable *)skb->dst;
>  	skb->dst = &lt->u.dst;
> -	mpls = skb->dst->mpls;
> +	mpls = container_of (skb->dst->child,struct mpls_nhlfe_route,u.dst);
>  	ttl = MPLS_LABEL_TTL(ntohl(*mplsh)) ;
>  
>  	mpls->stats.packets++;
> @@ -58,7 +58,7 @@
>  	}
>  
>  	/* We are about to mangle packet. Copy it! */
> -	if (skb_cow(skb, LL_RESERVED_SPACE(mpls->mr_neigh->dev)+mpls->mr_path_hlen))
> +	if (skb_cow(skb, LL_RESERVED_SPACE(mpls->u.dst.neighbour->dev)+mpls->u.dst.metrics[RTAX_MTU - 1]))
>  		goto drop;
>  
>  	/* ttl = mpls_decrease_ttl(mplsh); */
> ==== //depot/mpls-kernel-davem/net/mpls/mpls_output.c#3 - /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_output.c ====
> --- /tmp/tmp.26663.7	2004-02-29 22:19:34.990928344 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/net/mpls/mpls_output.c	2004-02-26 22:35:44.000000000 -0600
> @@ -17,7 +17,7 @@
>  /* XXX Breaks with TSO... */
>  int mpls_unicast_output(struct sk_buff *skb, struct mpls_nhlfe_route *mpls)
>  {
> -	struct net_device *dev = mpls->mr_neigh->dev;
> +	struct net_device *dev = mpls->u.dst.neighbour->dev;
>  	struct hh_cache *hh = NULL;
>  	int hh_len;
>  
> @@ -69,16 +69,6 @@
>  			mpls->stats.bytes+= len;
>  		}
>  		return ret;
> -	} else if (mpls->mr_neigh) {
> -		int len = skb->len;
> -		int ret = mpls->mr_neigh->output(skb);
> -		if (NET_XMIT_SUCCESS != ret) {
> -			mpls->stats.drops++;
> -		} else {
> -			mpls->stats.packets++;
> -			mpls->stats.bytes+= len;
> -		}
> -		return ret;
>  	}
>  
>  	if (net_ratelimit())
> @@ -92,13 +82,18 @@
>  
>  int mpls_nhlfe_ucast_output(struct sk_buff *skb)
>  {
> -	struct mpls_nhlfe_route *mir = skb->dst->mpls;
> +	struct mpls_nhlfe_route *mir;
>  	u32 bos = __MPLS_LABEL_S_BIT;
>  	u32 *mplsh = (u32 *) skb->nh.raw;
>  	u32 ttl;
>  	struct mpls_op_k *ins;
>  	int count = 0;
>  
> +	skb = skb_share_check(skb, GFP_ATOMIC);
> +	if (unlikely(!skb))
> +		goto drop;
> +
> +	mir = container_of(skb->dst, struct mpls_nhlfe_route, u.dst);
>  
>  	if (mir->mr_flags & MPLS_FLAG_TTL_PROPAGATE)
>  		ttl = mir->mr_prot->get_ttl(skb);
> @@ -130,4 +125,8 @@
>  	}
>  
>  	return mpls_unicast_output(skb,mir);
> +
> +drop:
> +	kfree_skb(skb);
> +	return NET_XMIT_DROP;
>  }
> ==== //depot/mpls-kernel-davem/include/net/dst.h#4 - /home/jleu/personal/clients/mpls-kernel-davem2/include/net/dst.h ====
> --- /tmp/tmp.26672.0	2004-02-29 22:19:40.037161200 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/include/net/dst.h	2004-02-26 22:35:44.000000000 -0600
> @@ -13,7 +13,6 @@
>  #include <linux/rcupdate.h>
>  #include <linux/jiffies.h>
>  #include <net/neighbour.h>
> -#include <net/mpls.h>
>  #include <asm/processor.h>
>  
>  
> @@ -67,7 +66,6 @@
>  	struct neighbour	*neighbour;
>  	struct hh_cache		*hh;
>  	struct xfrm_state	*xfrm;
> -	struct mpls_nhlfe_route *mpls;
>  
>  	int			(*input)(struct sk_buff*);
>  	int			(*output)(struct sk_buff*);
> @@ -123,7 +121,8 @@
>  	u32 mtu = dst_metric(path, RTAX_MTU);
>  
>  #ifdef	CONFIG_NET_MPLS
> -	mtu -= (path->mpls ? path->mpls->mr_path_hlen : 0);
> +	if (path->child)
> +		mtu -= dst_metric(path->child, RTAX_MTU);
>  #endif
>  
>  	/* Yes, _exactly_. This is paranoia. */
> @@ -228,14 +227,7 @@
>  	int err;
>  
>  	for (;;) {
> -#ifdef	CONFIG_NET_MPLS
> -		if (skb->dst->mpls)
> -			err = mpls_nhlfe_ucast_output(skb);
> -		else
> -			err = skb->dst->output(skb);
> -#else
>  		err = skb->dst->output(skb);
> -#endif
>  
>  		if (likely(err == 0))
>  			return err;
> ==== //depot/mpls-kernel-davem/include/net/mpls.h#4 - /home/jleu/personal/clients/mpls-kernel-davem2/include/net/mpls.h ====
> --- /tmp/tmp.26672.1	2004-02-29 22:19:40.074155576 -0600
> +++ /home/jleu/personal/clients/mpls-kernel-davem2/include/net/mpls.h	2004-02-26 22:35:44.000000000 -0600
> @@ -113,6 +113,10 @@
>  
>  /* NHLFE entry */
>  struct mpls_nhlfe_route {
> +	union {
> +		struct dst_entry	dst;
> +		struct mpls_nhlfe_route *next;
> +	} u;
>  	struct list_head	mr_hash;
>  	struct gnet_stats	stats;
>  	int			mr_ifindex;
> @@ -120,7 +124,7 @@
>  	u32			mr_flags;
>  	u32			mr_nhlfeid;
>  	u32			lt_tclass;
> -	u32			mr_path_hlen;
> +//	u32			mr_path_hlen;
>  	u8			mr_protocol;
>  	u8			mr_ltype;
>  	u8			mr_ttl;
> @@ -144,7 +148,6 @@
>  #endif
>  	struct sockaddr		addr;
>  
> -	struct neighbour	*mr_neigh;
>  	struct mpls_prot_driver	*mr_prot;
>  };
>  
> @@ -154,16 +157,9 @@
>  
>  extern struct mpls_nhlfe_route *mpls_nhlfe_lookup(u32 nhlfeid, int ifindex,u32 space);
>  
> -extern void mpls_nhlfe_destroy(struct mpls_nhlfe_route *);
>  extern void mpls_nhlfe_put(struct mpls_nhlfe_route *);
>  void mpls_nhlfe_hold(struct mpls_nhlfe_route *mir);
>  
> -static inline void mpls_nhlfe_release(struct mpls_nhlfe_route *mir)
> -{
> -	if (atomic_dec_and_test(&mir->mr_ref))
> -		mpls_nhlfe_destroy(mir);
> -}
> -
>  extern void mpls_bind_neighbour(struct dst_entry *dst);
>  extern int mpls_nhlfe_ucast_output(struct sk_buff *skb);
>  
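
For readers following the patch above: its core move is to embed a struct dst_entry inside the NHLFE entry and to recover the NHLFE from a stacked dst's child pointer with container_of, and to take the MTU adjustment from the child's metric. Below is a minimal user-space sketch of that pattern. The struct layouts are simplified stand-ins, and the names mtu_adjust, nhlfe_from_parent and effective_mtu are assumptions made for illustration; they are not taken from the kernel or from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct dst_entry (assumption:
 * only the fields this sketch needs). */
struct dst_entry {
	struct dst_entry *child;
	unsigned int mtu_adjust;	/* stands in for metrics[RTAX_MTU - 1] */
};

/* Stand-in for the NHLFE entry: the dst is embedded, not pointed to. */
struct mpls_nhlfe_route {
	union {
		struct dst_entry dst;
		struct mpls_nhlfe_route *next;
	} u;
	unsigned int mr_nhlfeid;
};

/* Same idea as the kernel macro: step back from a member to its container. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the NHLFE that a stacked dst's child belongs to, mirroring the
 * mpls_forward.c hunk. Returns NULL when nothing is stacked. */
static struct mpls_nhlfe_route *nhlfe_from_parent(struct dst_entry *parent)
{
	if (!parent->child)
		return NULL;
	return container_of(parent->child, struct mpls_nhlfe_route, u.dst);
}

/* Mirror of the dst.h hunk: subtract the child's adjustment, if any. */
static unsigned int effective_mtu(unsigned int path_mtu,
				  const struct dst_entry *path)
{
	if (path->child)
		path_mtu -= path->child->mtu_adjust;
	return path_mtu;
}
```

Because the dst is embedded in the NHLFE, no separate back pointer (the removed dst->mpls field) is needed to get from one to the other.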


-- 
James R. Leu
jl...@mi...

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Ramon C. <cas...@in...> - 2004-04-20 14:17:45

See comments inline

ps: Is the devel mailing list public ?

On Tue, 20 Apr 2004, James R. Leu wrote:

> We have equipment from 10 vendors and over 20 devices.  We'll be concentrating
> on MPLS BGP VPNs (v4,v6,Multicast), VPLS, and Carrier's Carrier.  Our main

Not bad :) looks promising. Where is it taking place ? Can I get some 
tickets :) ?

> area of 'experimenting' is in the multicast space.  We're showing carriers
> and vendors how BGP VPNs have an advantage (although only slight) over

What is the advantage? snooping? (just out of curiosity). While I can 
understand the idea that inspecting the client L3 may ease multicast 
integration, how would that push forward a change of VPLS architecture? 

> L2 VPNs (a la VPLS) with respect to multicast.  Hopefully the result will be
> L2 VPN vendors re-thinking their implementation.

Interesting... What would you suggest? RSVP-TE established PWE3 enabling 
PWE3 stitching or L2SC ? 

> 
> Anyway enough of that, onto the questions I've missed.

Pity! Other than the IETF we academics are too far from implementors :P

> (if that is what your particular application needs).  I still think that NHLFE
> stacking is what will be used for the common hierarchical LSP case though.

ditto. 

> Let's make one or two more changes before we send it back to davem for review.
ok

Regards,

R.

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: James R. L. <jl...@mi...> - 2004-04-20 15:32:50

On Tue, Apr 20, 2004 at 03:45:33PM +0200, Ramon Casellas wrote:
> 
> See comments inline
> 
> ps: Is the devel mailing list public ?

Yes.

> On Tue, 20 Apr 2004, James R. Leu wrote:
> 
> 
> > We have equipment from 10 vendors and over 20 devices.  We'll be concentrating
> > on MPLS BGP VPNs (v4,v6,Multicast), VPLS, and Carrier's Carrier.  Our main
> 
> Not bad :) looks promising. Where is it taking place ? Can I get some 
> tickets :) ?

Las Vegas, NV, USA.

> > area of 'experimenting' is in the multicast space.  We're showing carriers
> > and vendors how BGP VPNs have an advantage (although only slight) over
> 
> What is the advantage? snooping? (just out of curiosity). While I can 
> understand the idea that inspecting the client L3 may ease multicast 
> integration, how would that push forward a change of VPLS architecture? 

Multicast BGP VPNs (notice the lack of MPLS in the name) use a subset of
the carrier's multicast groups to tunnel the VPN's multicast.  By using more than
one of the carrier's groups, a somewhat optimised distribution tree is created.
It's not perfect: some of the nodes of the VPN will still get multicast data
for groups in which they are not interested, but it is far better than
the flooding that occurs with most L2 VPN implementations, which hits
especially hard on the PE devices that have to do packet replication.

> > L2 VPNs (a la VPLS) with respect to multicast.  Hopefully the result will be
> > L2 VPN vendors re-thinking their implementation.
> 
> Interesting... What would you suggest? RSVP-TE established PWE3 enabling 
> PWE3 stitching or L2SC ?

What we really need is for multicast MPLS to progress, but I think that is still
stuck in the 'requirement draft' stage.

> > Anyway enough of that, onto the questions I've missed.
> 
> Pity! Other than the IETF we academics are too far from implementors :P
>
> > (if that is what your particular application needs).  I still think that NHLFE
> > stacking is what will be used for the common hierarchical LSP case though.
> 
> ditto. 
> 
> 
> > Let's make one or two more changes before we send it back to davem for review.
> ok
> 
> 
> 
> Regards,
> 
> R.
> 
> 
> 

-- 
James R. Leu
jl...@mi...

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Jamal H. S. <ha...@zn...> - 2004-04-22 12:59:40

On Tue, 2004-04-20 at 09:33, James R. Leu wrote:
> Sorry for being absent for the last couple of weeks, I'm in the midst of
> the busiest time of the year for me.  I'm preparing for the Network + InterOp
> tradeshow.  This year I'm working on an 'Advanced Internetworking Initiative'
> which basically means MPLS services :-)

Fun in Las Vegas; gambling, cheap food and drinks and oh, a trade 
show ;->

> We have equipment from 10 vendors and over 20 devices.  We'll be concentrating
> on MPLS BGP VPNs (v4,v6,Multicast), VPLS, and Carrier's Carrier.  Our main
> area of 'experimenting' is in the multicast space.  We're showing carriers
> and vendors how BGP VPNs have an advantage (although only slight) over
> L2 VPNs (a la VPLS) with respect to multicast.  Hopefully the result will be
> L2 VPN vendors re-thinking their implementation.

Doesn't VPLS eventually have to converge to an MPLS cloud?
There are a lot of el cheapo ASICs with VPLS capability appearing lately. Ethernet
will always be king ;->
I didn't understand the multicast part.

> Anyway enough of that, onto the questions I've missed.
> 
> On Tue, Apr 20, 2004 at 01:22:58PM +0200, Ramon Casellas wrote:
> > On Mon, 19 Apr 2004, Jamal Hadi Salim wrote:
> > 
> > > On Mon, 2004-04-19 at 02:22, Ramon Casellas wrote:
> <snip>
> > OTOH, with the new extensions for L2SC/PWE3 control in (G)MPLS we may need
> > to work more closely with the bridging people (although most things belong
> > clearly to userspace). I know that James has been working in this area,
> > but I don't know the details or whether he has been working with other
> > people.
> 
> My work, as usual, is a solo effort.  If anything I've done can be of
> use, I'd be more than happy to contribute.
> 

I think we should get to that in the next phase. Let's get the MPLS code in
first. If there are any architectural concerns that will hinder that work,
then we should certainly address them now.


> One thing to note is that the great thing about a dual instructions scheme
> (one set on the ILM and one set on the NHLFE) is that you can optimize the
> instructions for speed or flexibility.  If you want to create a PUSHN
> instruction, you can easily add it and use it instead of the NHLFE stacking
> (if that is what your particular application needs).  I still think that NHLFE
> stacking is what will be used for the common hierarchical LSP case though.

I have no problems with this. The way i see it is like this: You guys
are the MPLS experts - i have opinions that i make known and if you have
strong reservations then we go with yours. 
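
To make the PUSHN point in the quoted paragraph concrete, here is a minimal user-space sketch of an instruction list in which one multi-label push produces the same label stack as two single pushes. The opcode names, struct layouts and limits (OP_PUSH, OP_PUSHN, MAX_PUSHN, run_instrs) are assumptions made for illustration only; they are not the opcodes of either implementation discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_STACK 8
#define MAX_PUSHN 4

/* Hypothetical opcodes: OP_PUSH adds one label, OP_PUSHN adds several. */
enum mpls_opcode { OP_PUSH, OP_PUSHN };

struct mpls_instr {
	enum mpls_opcode op;
	size_t n_labels;		/* must be 1 for OP_PUSH */
	uint32_t labels[MAX_PUSHN];	/* pushed in order, last one ends on top */
};

struct label_stack {
	size_t depth;
	uint32_t labels[MAX_STACK];	/* labels[depth - 1] is the top */
};

/* Run an instruction list against a label stack.
 * Returns 0 on success, -1 on a malformed instruction or overflow
 * (instructions that ran before the failure stay applied). */
static int run_instrs(struct label_stack *st,
		      const struct mpls_instr *ins, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		size_t n = ins[i].n_labels;

		if (ins[i].op == OP_PUSH && n != 1)
			return -1;
		if (n == 0 || n > MAX_PUSHN || st->depth + n > MAX_STACK)
			return -1;
		for (size_t j = 0; j < n; j++)
			st->labels[st->depth++] = ins[i].labels[j];
	}
	return 0;
}
```

The trade-off is the one described in the quote: a PUSHN-style opcode saves instruction dispatches on the fast path, while separate entries keep each level of the hierarchy independently manageable.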


> BTW do either of you have a good upstream connection?  If so we could
> create a p4 proxy with a high level of caching.  Then only the commits
> need to consume my limited bandwidth.  (I'm researching getting a better
> upstream, but options are limited in my suburb, which was a farm's corn field
> last year at this time).

I don't. My damn ISP doesn't even give me more than 5M of space (I can't move;
I have been there more than 10 years).

cheers,
jamal

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: James R. L. <jl...@mi...> - 2004-04-20 13:43:22

Here is the guts of my suggestion.  Let me know what you guys think.

> I have created a separate branch off of 'mpls-kernel-davem' called
> 'mpls-kernel-merger'.  It is where we can submit changes.  Every once in a
> while we can look at the diff between 'mpls-kernel-davem' and
> 'mpls-kernel-merger' and decide what gets propagated to 'mpls-kernel-davem'.
> 
> This next week is going to be very busy for me, so I will not be able to
> do much development.  Here is how I propose we start:
> 
> -I will submit the dst stacking code.
> -rename all structures to proper names
>     s/mpls_nhlfe_route/mpls_nhlfe/
>     s/ltable/mpls_ilm/
> -add instruction list to ILM and NHLFE
> -move NHLFE label and nexthop to instructions
> -move ILM -> NHLFE connection to instructions
> -modify netlink to better take advantage of the new structure
> 
> I was planning to take much of the code from our implementation and
> modify it to work in the DaveM code.
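
As a rough picture of where the list above could lead, here is a minimal user-space sketch in which both the ILM and the NHLFE carry instruction lists, the nexthop is an instruction, and the ILM -> NHLFE link is itself an instruction. Everything in it is hypothetical: the opcode names (OP_POP, OP_PUSH, OP_FWD, OP_SET), the field names and the fixed-size arrays are assumptions for illustration, not the layout of either existing implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSTR 4
#define MAX_STACK 8
#define MAX_FWD_DEPTH 4

/* Hypothetical opcodes; OP_FWD hands the packet to an NHLFE, OP_SET picks
 * the outgoing interface (the nexthop moved into instructions). */
enum mpls_opcode { OP_POP, OP_PUSH, OP_FWD, OP_SET };

struct mpls_nhlfe;

struct mpls_instr {
	enum mpls_opcode op;
	uint32_t label;			/* OP_PUSH */
	int ifindex;			/* OP_SET */
	const struct mpls_nhlfe *nhlfe;	/* OP_FWD */
};

struct mpls_nhlfe {			/* was mpls_nhlfe_route */
	size_t n_instr;
	struct mpls_instr instr[MAX_INSTR];
};

struct mpls_ilm {			/* was ltable */
	uint32_t in_label;
	size_t n_instr;
	struct mpls_instr instr[MAX_INSTR];
};

struct packet {
	size_t depth;
	uint32_t labels[MAX_STACK];	/* labels[depth - 1] is the top */
	int out_ifindex;		/* -1 until an OP_SET runs */
};

/* Execute one instruction list. OP_FWD continues in the target NHLFE,
 * which is what lets NHLFEs be stacked. Returns 0 or -1. */
static int run_list(struct packet *p, const struct mpls_instr *ins,
		    size_t n, int hops_left)
{
	for (size_t i = 0; i < n; i++) {
		switch (ins[i].op) {
		case OP_POP:
			if (p->depth == 0)
				return -1;
			p->depth--;
			break;
		case OP_PUSH:
			if (p->depth == MAX_STACK)
				return -1;
			p->labels[p->depth++] = ins[i].label;
			break;
		case OP_SET:
			p->out_ifindex = ins[i].ifindex;
			break;
		case OP_FWD:
			if (!ins[i].nhlfe || hops_left == 0)
				return -1;
			return run_list(p, ins[i].nhlfe->instr,
					ins[i].nhlfe->n_instr, hops_left - 1);
		}
	}
	return 0;
}

/* Input side: match the top label against the ILM, then run its list. */
static int ilm_input(const struct mpls_ilm *ilm, struct packet *p)
{
	if (p->depth == 0 || p->labels[p->depth - 1] != ilm->in_label)
		return -1;
	return run_list(p, ilm->instr, ilm->n_instr, MAX_FWD_DEPTH);
}
```

With this shape, a hierarchical LSP is just an NHLFE whose last instruction forwards to the tunnel's NHLFE, so the inner label is pushed first and the tunnel label ends up on top.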

-- 
James R. Leu
jl...@mi...

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Ramon C. <cas...@in...> - 2004-04-20 14:10:16

On Tue, 20 Apr 2004, James R. Leu wrote:

> Here is the guts of my suggestion.  Let me know what you guys think.

Well, we have discussed this before, so you know my position :)


> 
> > I have created a separate branch off of 'mpls-kernel-davem' called
> > 'mpls-kernel-merger'.  It is where we can submit changes.  Every once in a
> > while we can look at the diff between 'mpls-kernel-davem' and
> > 'mpls-kernel-merger' and decide what gets propagated to 'mpls-kernel-davem'.

Ok. I'm having disk space problems with my laptop given your p4 trees :) 4 
kernels makes it 1Gb+ ;)


> > -I will submit the dst stacking code.

> > -rename all structures to proper names
> >     s/mpls_nhlfe_route/mpls_nhlfe/
> >     s/ltable/mpls_ilm/

agreed. 

> > -add instruction list to ILM and NHLFE

agreed. 


> > -move NHLFE label and nexthop to instructions
> > -move ILM -> NHLFE connection to instructions

ok. Should I understand this to imply a FWD instruction that allows NHLFE
stacking?


> > -modify netlink to better take advantage of the new structure
yep. I am thinking of porting RSVP-TE then, but it is a daunting task. I
will talk to some people to see if they are interested.



> > 
> > I was planning to take much of the code from our implementation and
> > modify it to work in the DaveM code.

Well, that's the goal of two devel/stable branches. The point is 
submitting incremental patches, so the big guys at netdev can eventually 
review them... 


Regards,
R.

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Jamal H. S. <ha...@zn...> - 2004-04-22 12:37:16

On Tue, 2004-04-20 at 09:38, Ramon Casellas wrote:

> > > -modify netlink to better take advantage of the new structure
> yep. I am thinking of porting RSVP-TE then, but it is a daunting task. I
> will talk to some people to see if they are interested.
> 

I don't know what the RSVP-TE daemon is, but if you say it is important
then i think it is a worthwhile effort. I will act as a backup for you
and offer any netlink related advice/code needed.

> 
> > > 
> > > I was planning to take much of the code from our implementation and
> > > modify it to work in the DaveM code.
> 
> Well, that's the goal of two devel/stable branches. The point is 
> submitting incremental patches, so the big guys at netdev can eventually 
> review them... 

This is what i meant by "progress" in my other email. We submit
something as a base - after going through James' list and then we keep 
submitting smaller patches. BTW, the reason i brought it up before was
because Dave had pinged me on how things were going (and i had said all
was fine).

cheers,
jamal

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Jamal H. S. <ha...@zn...> - 2004-04-22 12:31:28

Hi Guys,
I will jump to this email first and catchup with others later. 

On Tue, 2004-04-20 at 09:37, James R. Leu wrote:
> Here is the guts of my suggestion.  Let me know what you guys think.
> 
> > I have created a separate branch off of 'mpls-kernel-davem' called
> > 'mpls-kernel-merger'.  It is where we can submit changes.  Every once in a
> > while we can look at the diff between 'mpls-kernel-davem' and
> > 'mpls-kernel-merger' and decide what gets propagated to 'mpls-kernel-davem'.
> > 
> > This next week is going to be very busy for me, so I will not be able to
> > do much development.  Here is how I propose we start:
> > 
> > -I will submit the dst stacking code.
> > -rename all structures to proper names
> >     s/mpls_nhlfe_route/mpls_nhlfe/
> >     s/ltable/mpls_ilm/
> > -add instruction list to ILM and NHLFE
> > -move NHLFE label and nexthop to instructions
> > -move ILM -> NHLFE connection to instructions
> > -modify netlink to better take advantage of the new structure

All look good. The last one i can help with like i offered to last time.
You need to tell me what the messages look like.

> > I was planning to take much of the code from our implementation and
> > modify it to work in the DaveM code.

ok

cheers,
jamal

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Ramon C. <cas...@in...> - 2004-04-22 12:38:05


Just a question,

What was the final decision regarding labelspaces ?

regards,
R.

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Ramon C. <cas...@in...> - 2004-04-22 12:36:29

On Thu, 22 Apr 2004, Jamal Hadi Salim wrote:

> Hi Guys,
> I will jump to this email first and catchup with others later. 

Fine. Nevertheless, you picked the right one. It is basically a TODO list.
:)

> 
> On Tue, 2004-04-20 at 09:37, James R. Leu wrote:
> > Here is the guts of my suggestion.  Let me know what you guys think.
> > 
> > > I have created a separate branch off of 'mpls-kernel-davem' called
> > > 'mpls-kernel-merger'.  It is where we can submit changes.  Every once in a

I will work on mpls-kernel-merger.
I am p4 syncing right now. I will later start with basic stuff.

> All look good. The last one i can help with like i offered to last time.
> You need to tell me what the messages look like.

Good. Although I think that in the process we should review the packet
format and formalize it as in an RFC, so the Control plane - user plane
interface should be well defined.

regards,
R.

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Jamal H. S. <ha...@zn...> - 2004-04-22 13:07:09

On Thu, 2004-04-22 at 08:04, Ramon Casellas wrote:

> I will work on mpls-kernel-merger.
> I am p4 syncing right now. I will later start with basic stuff.
> 

Can you be my proxy to the P4 system for a short while? ;->
I have a feeling it will be a while before i set it up. So i would
prefer to work in diff -du mode.

> Good. Although I think that in the process we should review the packet
> format and formalize it as in an RFC, so the Control plane - user plane
> interface should be well defined.

Are you doing a write up? 
I was going to create a "l2c_hello" to show how to create a netlink L2C.
If I focus on that you can use it to write the MPLS L2C. Let me know.
I think the write up will still be useful regardless of who is doing
what.

> What was the final decision regarding labelspaces ?

I think we are still gray in this area. My thoughts were device
labelspaces would be valuable if you have something in the packet header
that clearly distinguishes which labelspace to use.
James had other thoughts.
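
For anyone unfamiliar with the labelspace question, here is a minimal user-space sketch of one possible model: each interface is assigned a labelspace, and the incoming label is looked up in the space of the interface the packet arrived on, so the same label value can mean different things in different spaces. Using the receiving interface as the selector is an assumption made purely for illustration (the thread leaves open what should select the labelspace), and all names below (if_labelspace, ilm_entry, labelspace_of, ilm_lookup) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table rows: which labelspace each interface belongs to. */
struct if_labelspace {
	int ifindex;
	int labelspace;
};

/* Hypothetical ILM row keyed by (labelspace, label). */
struct ilm_entry {
	int labelspace;
	uint32_t label;
	uint32_t nhlfeid;
};

/* Map the receiving interface to its labelspace; -1 if none is set. */
static int labelspace_of(const struct if_labelspace *tbl, size_t n,
			 int ifindex)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].ifindex == ifindex)
			return tbl[i].labelspace;
	return -1;
}

/* Look up an incoming label in the labelspace of the receiving interface.
 * Returns NULL if the interface has no labelspace or no entry matches. */
static const struct ilm_entry *ilm_lookup(const struct ilm_entry *ilm,
					  size_t n_ilm,
					  const struct if_labelspace *tbl,
					  size_t n_tbl,
					  int ifindex, uint32_t label)
{
	int space = labelspace_of(tbl, n_tbl, ifindex);

	if (space < 0)
		return NULL;
	for (size_t i = 0; i < n_ilm; i++)
		if (ilm[i].labelspace == space && ilm[i].label == label)
			return &ilm[i];
	return NULL;
}
```

Putting every interface in labelspace 0 gives a single per-platform space, while giving each interface its own space gives per-interface labels.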

cheers,
jamal

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Ramon C. <cas...@in...> - 2004-04-22 13:24:39

On Thu, 22 Apr 2004, Jamal Hadi Salim wrote:

> On Thu, 2004-04-22 at 08:04, Ramon Casellas wrote:
> 
> Can you be my proxy to the P4 system for a short while? ;->
> I have a feeling it will be a while before i set it up. So i would
> prefer to work in diff -du mode.

Sure. As soon as I finish p4 syncing, I'll send you a diff w.r.t vanilla 
kernel and I'll be sending updates regularly, no problemo.

> 
> 
> > Good. Although I think that in the process we should review the packet
> > format and formalize it as in an RFC, so the Control plane - user plane
> > interface should be well defined.
> 
> Are you doing a write up? 

It is in the planning stages, based on the IOCTL of the existing impl. I could use your
l2c_hello.

> I think we are still gray in this area. My thoughts were device

Well, not a priority.

R.

Re: [mpls-linux-devel] Current state of dst stacking on davem implementation

From: Jamal H. S. <ha...@zn...> - 2004-03-03 13:15:24

Hi James,
I didn't apply the patch or test it but it looks clean. I like some of
the cleanups.
I wasn't sure about a few things (which may be harder to see because i am
reading a patch):
1) ipv4/fib_semantics.c, you got rid of the check:

---
if (fi->fib_nh && fi->fib_nh->nh_mpls_fec) {
---

Is this in the spirit of avoiding refcount inc/dec?
BTW, i think it is a good idea to probably keep the refcounts inc/dec;
What is the main reason you are trying to get rid of them?

2) You no longer do a mpls_bind_neighbour(); is this covered elsewhere?

3) The check:
     if (NULL != mpls) {
               int usrs = atomic_read(&mpls->mr_ref);

is there to ensure that user space does not delete an NHLFE entry while some
other place is still using it.

4) I suppose mpls_nhlfe_release() is no longer needed now because
of the refcounting changes?
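
To spell out what points 3) and 4) are contrasting, here is a minimal user-space sketch of the two deletion policies: the old check that refuses to delete while anyone else holds a reference, and a deferred model in which deletion unlinks the entry and destruction waits for the last user. The deferred variant is only an approximation of what a refcounted dst provides, and that reading of the patch's intent is an assumption. All names (struct nhlfe, nhlfe_del_strict, nhlfe_del_deferred) are hypothetical, and callers are assumed to hold the table lock, as the original does with mpls_nhlfe_lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for an NHLFE entry (assumption: only what the sketch needs). */
struct nhlfe {
	atomic_int ref;		/* 1 for the table itself, +1 per FTN/ILM user */
	bool in_table;
	bool destroyed;		/* set instead of freeing, to keep this testable */
};

static void nhlfe_init(struct nhlfe *n)
{
	atomic_init(&n->ref, 1);
	n->in_table = true;
	n->destroyed = false;
}

static void nhlfe_hold(struct nhlfe *n)
{
	atomic_fetch_add(&n->ref, 1);
}

/* Drop a reference; whoever drops the last one destroys the entry. */
static void nhlfe_release(struct nhlfe *n)
{
	if (atomic_fetch_sub(&n->ref, 1) == 1)
		n->destroyed = true;
}

/* Old behaviour: refuse to delete while anyone besides the table holds it. */
static int nhlfe_del_strict(struct nhlfe *n)
{
	if (atomic_load(&n->ref) > 1)
		return -EINVAL;
	n->in_table = false;
	nhlfe_release(n);
	return 0;
}

/* Deferred behaviour: unlink now, let the last user trigger destruction. */
static int nhlfe_del_deferred(struct nhlfe *n)
{
	n->in_table = false;
	nhlfe_release(n);
	return 0;
}
```

The strict variant gives user space an immediate error when an FTN or ILM still points at the entry; the deferred variant always succeeds but leaves the entry alive, unlinked, until its users let go.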

Anyways, overall i would say this is an improvement and we should
have no challenges getting past Davem.

On Mon, 2004-03-01 at 00:03, James R. Leu wrote:
> I've attached a patch against the head of line of the p4 depot for the davem
> implementation.  I tested the output path and it correctly generates MPLS
> packets.  I was unable to figure out the correct combination of l2c
> commands on the input side.  So I gave up testing the input path for
> tonight.  I thought I'd send out what I have so you can look it over and
> get an understanding of where I'm heading with this change.  Of course you
> could always look at the latest changes to my implementation and you'd
> probably get an even better idea ;-)
> 
> Items left to be resolved:
> -correct l2c command line for input definition?  (are we open for discussing
>  changes?)

This is my domain and i am open. Let's just make sure that (1) the ideas
have minimal impact on Dave's code and (2) they are guided by useful
changes as opposed to sentimental ones (example: "this is how we did it
before").

cheers,
jamal
