Re: [mpls-linux-general] bug: mpls/tc and mpls/netfilter, the same problem ?
From: Olivier D. <Oli...@rd...> - 2001-07-05 19:17:17
Vincent Jardin wrote:
>
> Hi,
>
> > Why? Can you tell me to what value mpls_index is set? Is it a parameter
> > problem or a filter problem? Perhaps the label key must be preceded by
> > '0x', like with iptables, to tell sch_ingress that the key is a
> > hexadecimal value.
>
> If you read my patch, I just do like you, I have removed the TC_H_MIN mask.

The TC_H_MIN mask just drops the 16 MSBs of mpls_index. The potential
problem (I see it with iptables) is that tc filter .... reads the value as
decimal and not as hexadecimal. To avoid this in iptables, you must put
'0x' before the key. I don't know if it is the same with tc. Can you check
this?

>
> > No. If you do this, you bypass a lot of packet processing, like header
> > verification, header option handling, fragments ... If ip_route_input
> > doesn't find an entry in rt_hash_table, it calls ip_route_input_slow,
> > which computes a dst entry and inserts it in the rt_hash_table for the
> > subsequent calls.
>
> I agree. But I do not understand the idea of James. I am missing too many
> points ;-).

Jim uses the dst (aka rt table) field to retrieve the moi for label
processing. The mpls_bind2fec function initializes the moi in the fib
structure. So, in rt_set_nexthop I can retrieve this information and put it
into the dst field for further processing.

> How could he process properly the ingress filter with ip_route_input_slow ?

I don't think that Jim handles the ingress filter properly. I'm not sure,
but this functionality comes from another person (I don't remember who). In
fact, the ingress filter doesn't work with the original mpls-0.993 patch.
Jim, can you confirm that?

>
> > > Moreover, I have tested my mpls/tc patch with the MPLS/netfilter patch.
> > > It works too ;-) If anyone wants to try it, you should add the following
> > > line into mpls_output.c:
> > > int mpls_output(struct sk_buff *skb) {
> > >   static const char *fn_name = "mpls_output";
> > >   struct mpls_push_data *mpr = (struct mpls_push_data*)kmem_cache_alloc(
> > >       mpls_mpr_cachep,GFP_ATOMIC);
> > > [... ]
> > >   mpr->bos = 1;
> > >   mpr->exp = 0;
> > >
> > > #ifdef CONFIG_MPLS_INGRESS_POLICING
> > >   if (skb->mpls_index) {
> > >     unsigned int key = skb->mpls_index;
> > >
> > >     MPLS_DEBUG(("%s: selecting moi with mpls_index 0x%x\n",fn_name,key));
> > >     RADIX_GET(&moi_tree,mpls_out_info_node,next,MPLS_TREE_BITS,unsigned
> > >         int, key,MPLS_TREE_DEPTH,moi,moi,retval);
> > >     if(retval || !moi) {
> > >       MPLS_DEBUG(("%s: unknown mpls_index\n",fn_name));
> > >       kmem_cache_free(mpls_mpr_cachep,mpr);
> > >       return -ESRCH;
> > >     }
> > >
> > > mpls_again:
> > > +   skb->dst = moi->moi_dst; // XXX <- add this line
> >
> > This is already done in mpls_output2 (line 100), which is called from
> > mpls_output, for the GEN label type only.
>
> No, I get a panic if you do not add this line, because in your patch you
> try to access skb->dst before mpls_output2. See skb->dst->pmtu.
>       case MPLS_OP_PUSH:
>         skb->dst->pmtu -= 4;
>         break;
>     }
>   }

The skb->dst field is set up by ip_route_input or ip_route_input_slow if it
has not been set previously (the test is made in ip_rcv_finish). So, if you
bypass this processing (as you suggest in your patch), the skb->dst field is
empty when mpls_output is called. You can refine this by adding:

mpls_again:
+   if (skb->dst == NULL)
+     skb->dst = moi->moi_dst;

>
> Vincent
>
> PS: I am thinking about how to integrate all the Linux schedulers (and
> mainly CBQ, because it supports hierarchical classes) with the MPLS code
> in order to do TE. For example, a leaf node of an egress hierarchy could be
> mapped into an LSP. And another leaf node could be a Best Effort one with
> regular IP routing.
>
>            5 Mbps
>           +-----+
>           | eth |
>           +-----+
>              |
>          ----+----
>          |       |
>         ---     ---
>       2 Mbps   3 Mbps
>     Real time  Best Effort
>
>     UDP port   regular routing
>       5000
>     -> push
>     label 32
>
> tc qdisc ingress is not the right solution because it is "input" QoS.
>
> For example, it would be nice to get something like:
> tc qdisc parent 2:1 [...] mplstag 32 [...] cbq [...]

Yes, it would be very nice. Do you have some ideas to start coding?

>
> I would appreciate any suggestion.
>

What we intend to do is use iptables to mark packets, then use the tc 'fw'
classifier (based on the iptables mark) to enqueue the packets.

Olivier
-- 
FTR&D/DAC/CPN
Technopole Anticipa      | mailto:Oli...@fr...
2, Avenue Pierre Marzin  | Phone: +(33) 2 96 05 28 80
F-22307 LANNION          | Fax: +(33) 2 96 05 18 52