Re: [mpls-linux-devel] Re: 2.6 Spec: Random comments.
From: Jamal H. S. <ha...@zn...> - 2004-02-16 14:23:34
On Sun, 2004-02-15 at 02:25, James R. Leu wrote:

> It could be any protocol we map onto an LSP (ie ethernet/atm/fr over MPLS),
> you just have to add a protocol driver for it.

And the reason you want to do it at the protocol level is because you
can classify better?

> The ECMP feature only helps you at the ingress LER. You need something
> to handle load balancing in the core of the MPLS domain.

Agreed, so in my earlier email i said we had no control over ECMP, i.e. we are
at the mercy of linux V4/6 ECMP. At the ILM level on the other hand
(for LSRs) we do have more control.

> ECMP example:
>
>                        -------             -------
>                       |       |           |       |
>             .--1G-----| LSR 1 |---100M----| LSR 2 |----1G---.
>            /          |       |           |       |          \
>  ---------/            -------             -------            \--------
> | Ingress |                                                  | Ingress |
> |   LER   |                                                  |   LER   |
>  ---------\            -------             -------            /--------
>            \          |       |           |       |          /
>             `--1G-----| LSR 3 |---100M----| LSR 4 |----1G---'
>                       |       |           |       |
>                        -------             -------
>
> In the above case ECMP will allow a max traffic of 200M between
> ingress and egress.

Ok

> Load balancing example:
>
>  ---------             -------             -------             --------
> |         |           |       |---100M----|       |           |        |
> | Ingress |----1G-----| LSR 1 |---100M----| LSR 2 |----1G-----| Egress |
> |   LER   |           |       |---100M----|       |           |  LER   |
>  ---------             -------             -------             --------
>
> Without load balancing LDP would create 1 LSP for traffic going
> from ingress to egress. The max traffic you could send from ingress
> to egress is 100M. With load balancing LDP still sets up 1 LSP from
> ingress to egress, but when LSR2 advertises a label to LSR1, LSR1 realizes
> it has 3 adj to LSR2 and creates 3 NHLFEs, one on each of the links. It then
> uses some mechanism to load balance traffic arriving on its 1 ILM onto
> the 3 NHLFEs. In the single label case, by looking at the protocol ID
> associated with the ILM and doing a little layer violation ;-) we
> can do per flow hashing and map flows to the various NHLFEs. Now the
> max traffic between ingress and egress is 300M.

Gotcha. so that balancing is done at the ILM level, correct?
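
The per-flow hashing James describes for the single-label case can be
sketched roughly as below. This is only an illustration in Python, not the
mpls-linux code: the names flow_key_ipv4, select_nhlfe and make_ipv4_udp are
hypothetical, CRC32 is an arbitrary choice of hash, and the payload is assumed
to be an unfragmented IPv4 packet sitting directly under the one label.

```python
import socket
import struct
import zlib


def make_ipv4_udp(src: str, dst: str, sport: int, dport: int) -> bytes:
    """Build a minimal IPv4+UDP packet (checksums left at 0) for the demo."""
    udp = struct.pack("!HHHH", sport, dport, 8, 0)
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(udp),   # version/IHL, TOS, total length
        0, 0,                     # identification, flags/fragment offset
        64, 17, 0,                # TTL, protocol (17 = UDP), checksum
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    return ip + udp


def flow_key_ipv4(payload: bytes) -> bytes:
    """Return src addr, dst addr, protocol and (for TCP/UDP) the port pair."""
    ihl = (payload[0] & 0x0F) * 4          # IPv4 header length in bytes
    proto = payload[9]
    key = payload[12:20] + bytes([proto])  # source and destination address
    if proto in (6, 17) and len(payload) >= ihl + 4:
        key += payload[ihl:ihl + 4]        # source and destination port
    return key


def select_nhlfe(payload: bytes, nhlfes: list):
    """Map a flow onto one of the parallel NHLFEs (hypothetical helper)."""
    return nhlfes[zlib.crc32(flow_key_ipv4(payload)) % len(nhlfes)]


if __name__ == "__main__":
    links = ["link0", "link1", "link2"]    # the 3 NHLFEs from LSR1 to LSR2
    pkt = make_ipv4_udp("10.0.0.1", "10.0.0.2", 1000, 2000)
    print(select_nhlfe(pkt, links))
```

Because the link is chosen from fields that are constant for a flow, every
packet of that flow takes the same NHLFE and ordering within the flow is
preserved, while different flows spread over the three 100M links.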
So that little violation or peeking is, i take it, the reason you want
the protocol extension to be added?

> > > The task is trivial if the stack only has one label, for more than one label
> > > we would have to be creative. Hashing the label stack, or use the PW ID
> > > (suggestion in PWE3 WG which adds a word after the labelstack to indicate
> > > what protocol lies below.) The PW ID could be used to lookup the protocol
> > > driver to generate the hash.
> >
> > Point me to some doc if you dont mind. Is this for some of the VPN
> > encapsulations?
>
> http://www.ietf.org/internet-drafts/draft-allan-mpls-pid-00.txt

I'll read the draft; i know the author from my nortel days.
If i understood correctly, this is now introducing an extra piece of data
in the packet?
Note, as i described earlier, we should be able to just look at anything
on the packet with the u32 classifier, which can be activated before the MPLS
ILM is consulted. Also, based on the top label we can do a classification
again to peek into further packet data before making a decision on the next
hop.

> > > Or of course we could just add an option for which algo to use.
> >
> > Note what i suggested is only for ILM level; And there you could add any
> > algorithms you want. With the protocol driver are you suggesting to do
> > something at the IPV4/6 FTN level only?
>
> To be able to load balance and guarantee packet order, you need to know
> what is underneath the label stack. With just one label it is trivial to
> figure out what is under the label stack. With more than one, it isn't
> so easy (the LSR that needs to do the load balancing was not involved in the
> signaling of any of the labels past the first one). Currently vendors do
> some nasty hacking. Look at the first nibble after the label stack, if it
> is a 4, they assume IPv4. They build the appropriate hash and use that
> to select the outgoing NHLFE.

Why cant you look? Is this because ASICs are already built?
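
The "first nibble" hack James mentions can be sketched as follows. Again this
is a hedged Python illustration rather than anything from mpls-linux or a
vendor implementation: mpls_entry, parse_label_stack and guess_payload are
hypothetical names, and the only assumption about the wire format is the
usual 32-bit label stack entry (20-bit label, 3-bit EXP, 1-bit bottom-of-stack
flag, 8-bit TTL).

```python
import struct


def mpls_entry(label: int, bottom: bool, tc: int = 0, ttl: int = 64) -> bytes:
    """Encode one 32-bit label stack entry (helper for the demo)."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl)


def parse_label_stack(frame: bytes):
    """Walk the stack until the bottom-of-stack bit; return (labels, offset)."""
    labels = []
    offset = 0
    while True:
        if offset + 4 > len(frame):
            raise ValueError("truncated label stack")
        (entry,) = struct.unpack_from("!I", frame, offset)
        labels.append(entry >> 12)         # top 20 bits are the label
        offset += 4
        if entry & 0x100:                  # S bit set: last entry
            return labels, offset


def guess_payload(frame: bytes) -> str:
    """Guess what follows the stack from the first nibble (the 'nasty hack')."""
    _, offset = parse_label_stack(frame)
    if offset >= len(frame):
        return "unknown"
    nibble = frame[offset] >> 4
    if nibble == 4:
        return "ipv4"                      # only a guess, nothing signals this
    if nibble == 6:
        return "ipv6"
    return "unknown"


if __name__ == "__main__":
    frame = mpls_entry(100, bottom=False) + mpls_entry(200, bottom=True) + b"\x45\x00"
    print(parse_label_stack(frame), guess_payload(frame))
```

Finding where the stack ends is exact, since the bottom-of-stack bit marks it.
What follows is the guess: an Ethernet frame carried over the LSP whose
destination MAC happens to start with 0x4 would be taken for IPv4 here.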
You know precisely where the label stack is going to end, no? Can you not
then offset to that position and figure out what the next data level is?

> Since we use the child's output pointer, IPv4|6 don't care if it is MPLS.
> I suppose the same check for child could be made in MPLS output, then yes
> you could have more than one child stacked. I'm not sure if this would
> be very optimal for creating hierarchical LSPs (I think that is what
> you're alluding to).

Ok, that sounds reasonable. For starters dont even talk about
hierarchical LSPs ;->
Our challenge is to get rid of dst->mpls .. then go to David with this
one change - I think it's above 5% value add ;->.
Are you going to make the change?

cheers,
jamal