From: Alex Villacís L. <avi...@ce...> - 2008-11-10 16:10:55
Attachments:
0001-irda-Introduce_skbuff_tx_extra.patch

For background, see regression report at http://bugzilla.kernel.org/show_bug.cgi?id=11795

This patchset attempts to fix a regression that broke the irda stack as a result of the qdisc patches merged in 2.6.27.

The previous patchsets attempt to fix the clobbering of the irda information by storing it within the data payload itself, as an additional header. For this, space has been allocated via skb_pull (?!) and later, skb_reserve(). The problem with this approach is that we do not have a guarantee that all skbuffs that are processed in the irda stack are actually allocated with functions that reserve the required space for the irda metadata, especially in the tx route. In addition, this approach mixes the payload data with metadata that should not be transmitted at all, which is a bit disorganized.

This is the first of 3 patches that try a different approach. Instead of allocating an additional "header" within the data buffer itself, it introduces a new field within the skbuff, named tx_extra. This field should be used for passing data from the higher layers that is required for the drivers to transmit the packet correctly, and formalizes the previous usage of the cb field by the irda stack. The only issue I see is that every single skbuff carries an additional 32 bytes which are not put to any use in other stacks (for now). I was thinking about a pointer field to on-the-fly allocated data, but that means messing around with the skbuff allocation functions, the cloning functions (involving deciding how to behave on cloning), etc. This way is simpler to understand.

This patch and the other two that follow fix the issue for me under 2.6.28-rc3. Please comment on this, as I am messing around with the skbuff structure, which potentially affects all network stacks.

Signed-off-by: Alex Villacís Lasso <a_v...@pa...>

--
perl -e '$x=2.4;print sprintf("%.0f + %.0f = %.0f\n",$x,$x,$x+$x);'

From: Evgeniy P. <zb...@io...> - 2008-11-10 16:56:05

On Mon, Nov 10, 2008 at 07:35:46PM +0300, Evgeniy Polyakov (zb...@io...) wrote:
> What exactly should be carried in that area?

It is struct irda_skb_cb, which contains qos and other bits of information about how to transfer data, which are in turn obtained from various _cb IRDA structures, which I tracked up to, for example, lsap_cb, which exists in the hash table. Are the others also accessible via a similar mechanism? Can that data be stored on a per-device basis, like ethernet checksum/offloading parameters (for example, like LRO is done)?

--
Evgeniy Polyakov

From: Evgeniy P. <zb...@io...> - 2008-11-10 17:04:09

Hi Alex.

On Mon, Nov 10, 2008 at 11:09:33AM -0500, Alex Villacís Lasso (avi...@ce...) wrote:
> The previous patchsets attempt to fix the clobbering of the irda
> information by storing it within the data payload itself, as an
> additional header. For this, space has been allocated via skb_pull (?!)
> and later, skb_reserve(). The problem with this approach is that we do
> not have a guarantee that all skbuffs that are processed in the irda
> stack are actually allocated with functions that reserve the required
> space for the irda metadata, especially in the tx route. In addition,
> this approach mixes the payload data with metadata that should not be
> transmitted at all, which is a bit disorganized.

Do all irda transfers require that additional data? You can simply change the MAX_HEADER macro to be bigger than that size, if it has to be larger than what the biggest existing layer (ax25) needs, which I doubt it has to. This data is always allocated for all transmitted skbs. Reserve is also properly done in the sending functions.

> This is the first of 3 patches that try a different approach. Instead of
> allocating an additional "header" within the data buffer itself, it
> introduces a new field within the skbuff, named tx_extra. This field
> should be used for passing data from the higher layers that is required
> for the drivers to transmit the packet correctly, and formalizes the
> previous usage of the cb field by the irda stack. The only issue I see
> is that every single skbuff carries an additional 32 bytes which are not

I really wanted to write a joke about the existing practice of shrinking the skb as much as possible by sucking bits out of existing fields, but your proposal to add 32 bytes just kicked me out of the chair, my head landed on the keyboard, and the result is quite miserable: kn cgjm ncfkjn chcmujhcm.

These 32 bytes will be unused by at least half of the packets (the rx ones), and on machines where irda is not used they will be just a wasted block. Moreover, the head of skb->data is also unused for irda in this case...

What exactly should be carried in that area?

--
Evgeniy Polyakov

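
The headroom approach discussed in this message can be illustrated with a small userspace sketch. It is not kernel code: struct toy_skb and the toy_* helpers are hypothetical stand-ins that only mimic the pointer arithmetic of skb_reserve(), skb_put(), skb_push() and skb_pull(), and TOY_META_SIZE is an assumed metadata size. It shows why the push of per-packet metadata fails when the allocating code did not reserve enough headroom.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE  256
#define TOY_META_SIZE 32   /* assumed size of the per-packet metadata */

/* Hypothetical stand-in for an skb: one linear buffer. */
struct toy_skb {
    unsigned char head[TOY_BUF_SIZE];
    unsigned char *data;   /* start of valid bytes */
    unsigned char *tail;   /* end of valid bytes */
};

void toy_init(struct toy_skb *skb)
{
    assert(skb != NULL);
    skb->data = skb->head;
    skb->tail = skb->head;
}

size_t toy_headroom(const struct toy_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

/* Leave len bytes of headroom; only valid while the buffer is empty. */
int toy_reserve(struct toy_skb *skb, size_t len)
{
    if (skb->data != skb->tail || len > TOY_BUF_SIZE)
        return -1;
    skb->data += len;
    skb->tail += len;
    return 0;
}

/* Append len bytes at the tail; returns where to write them. */
unsigned char *toy_put(struct toy_skb *skb, size_t len)
{
    unsigned char *old_tail = skb->tail;

    if ((size_t)(skb->head + TOY_BUF_SIZE - skb->tail) < len)
        return NULL;
    skb->tail += len;
    return old_tail;
}

/* Prepend len bytes into the headroom; fails if too little was reserved. */
unsigned char *toy_push(struct toy_skb *skb, size_t len)
{
    if (toy_headroom(skb) < len)
        return NULL;
    skb->data -= len;
    return skb->data;
}

/* Strip len bytes from the front; returns the new start of data. */
unsigned char *toy_pull(struct toy_skb *skb, size_t len)
{
    if ((size_t)(skb->tail - skb->data) < len)
        return NULL;
    skb->data += len;
    return skb->data;
}

/* Returns 0 on success, -2 when the reserved headroom is too small. */
int toy_demo(size_t reserved)
{
    struct toy_skb skb;
    unsigned char meta[TOY_META_SIZE];
    unsigned char *p;

    memset(meta, 0xAB, sizeof(meta));
    toy_init(&skb);
    if (toy_reserve(&skb, reserved) != 0)
        return -1;
    p = toy_put(&skb, 5);
    if (p == NULL)
        return -1;
    memcpy(p, "hello", 5);

    /* Upper layer: prepend metadata in front of the payload. */
    p = toy_push(&skb, TOY_META_SIZE);
    if (p == NULL)
        return -2;
    memcpy(p, meta, TOY_META_SIZE);

    /* Driver side: read the metadata, then strip it before transmit. */
    if (memcmp(skb.data, meta, TOY_META_SIZE) != 0)
        return -3;
    p = toy_pull(&skb, TOY_META_SIZE);
    if (p == NULL || memcmp(p, "hello", 5) != 0)
        return -4;
    return 0;
}
```

In this sketch, toy_demo(TOY_META_SIZE) succeeds, while toy_demo(0) returns -2, which corresponds to the case of an skb allocated by a path that did not reserve room for the metadata.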
From: Samuel O. <sa...@so...> - 2008-11-10 17:15:49

Hi Alex,

On Mon, Nov 10, 2008 at 11:09:33AM -0500, Alex Villacís Lasso wrote:
> This patch and the other two that follow fix the issue for me under
> 2.6.28-rc3. Please comment on this, as I am messing around with the
> skbuff structure, which potentially affects all network stacks.

Adding 32 bytes to _all_ skbs is a no-no. This is an IrDA specific issue, and it has to be fixed in an IrDA specific way.

My patches work with my irda-usb and mcs7780 dongles. If they don't for you, then we should dump the kernel stack when we're hitting the problem of not having reserved enough headroom for the irda cb. Evgeniy's proposal could also be another solution.

Cheers,
Samuel.

From: David M. <da...@da...> - 2008-11-10 20:51:21

From: Alex Villacís Lasso <avi...@ce...>
Date: Mon, 10 Nov 2008 11:09:33 -0500

> @@ -279,6 +279,14 @@
>  	 */
>  	char cb[48];
>
> +	/*
> +	 * Additional space for layer-specific variables that need to
> +	 * survive past dev_queue_xmit(), which clobbers cb above.
> +	 * Intended for use by drivers that need additional layer-specific
> +	 * parameters in order to transmit a packet properly.
> +	 */
> +	char tx_extra[32];
> +

This kind of bloat is absolutely not acceptable.

With IRDA as the only user of this thing, 99.99999% of systems out there will just have this wasted space doing absolutely nothing.

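
The size objection can be made concrete with a rough userspace sketch. The two structs below are hypothetical and far smaller than the real sk_buff; they only keep the cb[48] array from the quoted hunk and add the proposed tx_extra[32], so the numbers illustrate the per-buffer overhead rather than real kernel sizes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for the struct before the patch. */
struct toy_skb_before {
    void *sk;
    char cb[48];
};

/* The same stand-in with the proposed field added. */
struct toy_skb_after {
    void *sk;
    char cb[48];
    char tx_extra[32];
};

/* Extra bytes carried by n buffers when no stack uses tx_extra. */
size_t toy_wasted_bytes(size_t n)
{
    assert(sizeof(struct toy_skb_after) >= sizeof(struct toy_skb_before));
    return n * (sizeof(struct toy_skb_after) - sizeof(struct toy_skb_before));
}
```

With these toy definitions, every buffer grows by the 32 bytes of the new array, whether or not the stack that allocated it ever touches the field.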
From: Samuel O. <sa...@so...> - 2008-11-11 00:57:23

On Mon, Nov 10, 2008 at 07:55:44PM +0300, Evgeniy Polyakov wrote:
> On Mon, Nov 10, 2008 at 07:35:46PM +0300, Evgeniy Polyakov (zb...@io...) wrote:
> > What exactly should be carried in that area?
>
> It is struct irda_skb_cb, which contains qos and other bits of
> information about how to transfer data, which are in turn obtained from
> various _cb IRDA structures, which I tracked up to, for example,
> lsap_cb, which exists in the hash table. Are the others also accessible via
> a similar mechanism? Can that data be stored on a per-device basis, like
> ethernet checksum/offloading parameters (for example, like LRO is done)?

I thought about that solution, but the irda_skb_cb line field has to be kept per skb. It is needed for ircomm LMP flow control, in the skb destructor. I see that BT rfcomm does something similar, but uses the skbuff->sk as a rfcomm_dev pointer. As far as I understand the skbuff structure, that doesn't look like a reasonable solution, as we can't assume the sk pointer won't be altered down the line.

Cheers,
Samuel.

From: Evgeniy P. <zb...@io...> - 2008-11-11 06:22:29

Hi Samuel.

On Tue, Nov 11, 2008 at 02:00:01AM +0100, Samuel Ortiz (sa...@so...) wrote:
> I thought about that solution, but the irda_skb_cb line field has to be kept
> per skb. It is needed for ircomm LMP flow control, in the skb destructor. I
> see that BT rfcomm does something similar, but uses the skbuff->sk as a
> rfcomm_dev pointer. As far as I understand the skbuff structure, that doesn't
> look like a reasonable solution, as we can't assume the sk pointer won't be
> altered down the line.

It depends... If you own the skb (when the reference counter is 1), you can overwrite the socket pointer with your own data, as long as the destructor is also updated. You can try to clone the skb and free the old one to achieve this, but it is not the fastest operation, although I think both bt and irda can afford that. Obviously, in the first case you also have to call the old destructor with the old socket pointer.

--
Evgeniy Polyakov

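
The ownership and destructor chaining described in this reply can be sketched in userspace C. Everything here is hypothetical (struct toy_skb, struct toy_priv and the toy_* functions are not kernel APIs); the sketch assumes a buffer with a reference count, an sk pointer and a destructor callback, and shows the private data taking over sk only when the buffer is not shared, then handing the old pointer back to the old destructor.

```c
#include <assert.h>
#include <stddef.h>

struct toy_skb;
typedef void (*toy_destructor_t)(struct toy_skb *skb);

/* Hypothetical stand-in for the three skb fields that matter here. */
struct toy_skb {
    int users;                    /* reference count */
    void *sk;                     /* normally the owning socket */
    toy_destructor_t destructor;  /* called when the last user is gone */
};

/* Private per-packet data that temporarily replaces skb->sk. */
struct toy_priv {
    int line;                         /* per-packet value needed at free time */
    int *reported_line;               /* where the destructor reports it */
    void *old_sk;                     /* saved original pointer */
    toy_destructor_t old_destructor;  /* saved original destructor */
};

/* New destructor: use the private data, then chain to the old one. */
void toy_priv_destructor(struct toy_skb *skb)
{
    struct toy_priv *priv = skb->sk;

    if (priv->reported_line != NULL)
        *priv->reported_line = priv->line;

    /* Restore the old pointer so the old destructor sees what it expects. */
    skb->sk = priv->old_sk;
    skb->destructor = priv->old_destructor;
    if (skb->destructor != NULL)
        skb->destructor(skb);
}

/* Take over sk and destructor; refused when the buffer is shared. */
int toy_attach_priv(struct toy_skb *skb, struct toy_priv *priv)
{
    assert(skb != NULL && priv != NULL);
    if (skb->users != 1)
        return -1;
    priv->old_sk = skb->sk;
    priv->old_destructor = skb->destructor;
    skb->sk = priv;
    skb->destructor = toy_priv_destructor;
    return 0;
}

/* Drop one reference; run the destructor when it was the last one. */
void toy_free(struct toy_skb *skb)
{
    if (--skb->users == 0 && skb->destructor != NULL)
        skb->destructor(skb);
}

/* Stand-in for the original socket destructor: counts its calls. */
void toy_sock_destructor(struct toy_skb *skb)
{
    int *calls = skb->sk;

    (*calls)++;
}

/* Returns 0 if both destructors ran, -1 if the takeover was refused. */
int toy_chain_demo(int users)
{
    int sock_calls = 0;
    int reported = -1;
    struct toy_skb skb = { users, &sock_calls, toy_sock_destructor };
    struct toy_priv priv = { 7, &reported, NULL, NULL };

    if (toy_attach_priv(&skb, &priv) != 0)
        return -1;
    toy_free(&skb);
    return (reported == 7 && sock_calls == 1) ? 0 : -2;
}
```

In this sketch, toy_chain_demo(1) runs both destructors, while toy_chain_demo(2) is refused because the buffer is shared; that second case is where the clone-and-free alternative mentioned in the reply would be needed.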