From: James R. L. <jl...@mi...> - 2001-12-06 15:55:27
Hello,

I have (re)implemented my version of TC/DS enhanced mpls-linux. I would like to describe what I have created so I can get feedback. I would also like to get examples from people of how to use TC and iptables to actually exercise this new code, and to show me what this implementation cannot do.

The first thing to keep in mind is that 'outgoing info' can no longer be interpreted as 'outgoing label'. A particular 'outgoing info' can fwd on to another 'outgoing info' which may do a 'push'. 'Incoming labels', aux_proto, and mpls tunnels all point to 'outgoing info' (in addition, 'outgoing info' can point to other 'outgoing info'). I will refer to 'outgoing info' as 'MOI' (in the code it stands for the mpls_outgoing_info structure).

Incoming labels and MOIs have an array of 'instructions' associated with them. Each instruction has a 'data block' associated with it.

The original set of instructions has changed very little:

MPLS_OP_POP  -> IN: pop off top label (no data)
MPLS_OP_PEEK -> IN: make the top label the active 'incoming label' (no data)
MPLS_OP_PUSH -> OUT: push a label on to the top of the label stack (label to push)
MPLS_OP_DLV  -> IN: deliver the packet to a specific protocol handler (protocol id to send the packet to, e.g. IPv4, IPv6)
MPLS_OP_FWD  -> IN: transfer control to mpls_output() (pointer to the MOI)
                OUT: start processing the instructions with the new MOI (pointer to the new MOI)
MPLS_OP_SET  -> IN: set the incoming interface
                OUT: set the dst_entry on the skb [last step before TXing an MPLS packet] (pointer to the dst_entry)

These are the new instructions:

* nfmark comes from the skb
* dsmark comes from the IP header
* tc_index comes from the skb
* EXP comes from the active incoming label

MPLS_OP_NF_FWD  -> IN/OUT: index into the data block by using (nfmark & mask), start processing the MOI that was found (array of MOIs)
MPLS_OP_DS_FWD  -> IN: index into the data block by using (dsmark & mask), start processing the MOI that was found (array of MOIs)
MPLS_OP_TC_FWD  -> OUT: index into the data block by using (tc_index & mask), start processing the MOI that was found (array of MOIs)
MPLS_OP_EXP_FWD -> IN: index into the data block by using (EXP), start processing the MOI that was found (array of MOIs)
MPLS_OP_SET_TC  -> IN/OUT: set the tc_index (tc_index to use)
MPLS_OP_SET_DS  -> IN/OUT: set the dsmark (DSCP to use)
MPLS_OP_SET_EXP -> IN/OUT: set EXP on the top label (EXP to use)
MPLS_OP_EXP2TC  -> IN: index into the data block by using (EXP) and set the tc_index to the value found
MPLS_OP_EXP2DS  -> IN: index into the data block by using (EXP) and set the dsmark to the value found

So here are some examples:

Egress LER:
On input of label 100, EXP 1 gets DSCP 0x4 and EXP 4 gets DSCP 0x7.

  MII(100) -> PEEK POP EXP2DS(1->0x4,4->0x7) DLV(IPv4)

Ingress LER (DSCP):
Packets going to 11.0.0.0/16 go out with label 100; DSCP 0x4 gets EXP 1, DSCP 0x7 gets EXP 4.

  IPROUTE(11.0.0.0/16) -> MOI(1000)
  MOI(1000) DS_FWD(0x4->MOI(500), 0x7->MOI(2000))
  MOI(500)  SET_EXP(1) PUSH(100) SET(next hop info)
  MOI(2000) SET_EXP(4) PUSH(100) SET(next hop info)

IP routing transfers control to mpls_output and starts processing MOI(1000). MOI(1000) looks at the DSCP and starts processing either MOI(500) or MOI(2000). MOI(500) and MOI(2000) set the EXP, push the label, set the dst_entry and then send the packet. (You could implement L-LSPs in a similar way: push a different label in MOI(500) and MOI(2000) and do not set the EXP value.)

Alternative:

  IPROUTE(11.0.0.0/16) -> MPLS_TUNNEL(mpls0)
  mpls0 -> MOI(1000)
  MOI(500)  SET_EXP(1) PUSH(100) SET(next hop info)
  MOI(2000) SET_EXP(4) PUSH(100) SET(next hop info)

IP routing sends the packet out interface mpls0. Interface mpls0 transfers control to mpls_output and starts processing MOI(1000). MOI(1000) looks at the DSCP and starts processing either MOI(500) or MOI(2000). MOI(500) and MOI(2000) set the EXP, push the label, set the dst_entry and then send the packet.

Ingress LER NFMARK and TCINDEX work similarly.

Transit:

  INCOMING_LABEL(100) PEEK POP EXP2TC(1->0xF,4->0xE) -> MOI(10000)
  MOI(10000) PUSH(100) SET(next hop info)

Incoming label 100 looks at the EXP bits and sets tc_index to 0xF when EXP is 1 and to 0xE when EXP is 4. MOI(10000) is responsible for transmitting the label. It pushes on label 100 (with the same EXP bits) and sends the packet on its way. As it leaves via the physical interface, a packet scheduler can look at the tc_index and schedule it appropriately. If you want to translate the EXP, you could use an EXP forward to different MOIs that push on the same label but set different EXP bits.

Additional instructions?

MPLS_OP_TC2EXP -> could be used in a MOI to translate the tc_index set on input to different EXP values. This would avoid having to do an EXP FWD just to set different EXP values.
MPLS_OP_DS2EXP -> same as above, but would look at the DSCP in the IP header and could only be executed on packets that came directly from the IP layer. It would avoid having to have separate MOIs to implement E-LSPs.

Comments, questions, political statements?

Jim
--
James R. Leu
jl...@mi...
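For anyone who wants to play with the instruction model before touching the kernel, here is a rough user-space sketch in Python of the Ingress LER (DSCP) and Egress LER examples. It is not the mpls-linux code. The Packet class, the run() helper, the tuple encoding of instructions and the 0x3F DSCP mask are all made up for illustration. It also assumes that the EXP value travels with the packet, so that SET_EXP before PUSH ends up on the pushed label, and that an unmapped DSCP or EXP is simply an error (KeyError); the real code may behave differently on both points.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    # Hypothetical stand-in for the skb; none of these names come from mpls-linux.
    dscp: int = 0                                # DSCP taken from the IP header
    exp: int = 0                                 # EXP carried along with the packet (assumption)
    labels: list = field(default_factory=list)   # label stack of (label, exp), top is last
    next_hop: str = ""                           # stands in for the dst_entry
    delivered_to: str = ""                       # protocol handler chosen by DLV


def run(instructions, pkt, mois):
    """Walk one instruction array; DS_FWD restarts processing on another MOI."""
    for op, data in instructions:
        if op == "PEEK":
            pkt.exp = pkt.labels[-1][1]          # top label becomes the active one
        elif op == "POP":
            pkt.labels.pop()
        elif op == "SET_EXP":
            pkt.exp = data
        elif op == "PUSH":
            pkt.labels.append((data, pkt.exp))   # pushed label carries the current EXP
        elif op == "SET":
            pkt.next_hop = data
        elif op == "DLV":
            pkt.delivered_to = data
        elif op == "EXP2DS":
            pkt.dscp = data[pkt.exp]             # KeyError if the EXP is unmapped
        elif op == "DS_FWD":
            mask, table = data
            return run(mois[table[pkt.dscp & mask]], pkt, mois)
        else:
            raise ValueError("unknown instruction: " + op)
    return pkt


# Ingress LER (DSCP) example; the 0x3F mask is an assumed 6-bit DSCP mask.
MOIS = {
    1000: [("DS_FWD", (0x3F, {0x4: 500, 0x7: 2000}))],
    500: [("SET_EXP", 1), ("PUSH", 100), ("SET", "next hop info")],
    2000: [("SET_EXP", 4), ("PUSH", 100), ("SET", "next hop info")],
}

# Egress LER example for incoming label 100.
EGRESS_MII_100 = [
    ("PEEK", None),
    ("POP", None),
    ("EXP2DS", {1: 0x4, 4: 0x7}),
    ("DLV", "IPv4"),
]

out = run(MOIS[1000], Packet(dscp=0x7), MOIS)
print(out.labels)                      # [(100, 4)]

back = run(EGRESS_MII_100, Packet(labels=[(100, 4)]), MOIS)
print(back.dscp, back.delivered_to)    # 7 IPv4
```

The NF_FWD, TC_FWD and EXP_FWD cases would presumably follow the same pattern as DS_FWD with a different key, and EXP2TC the same pattern as EXP2DS.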