Thread: [Ndiswrapper-general] thoughts on issues with fedora kernels
From: David K. <dmk...@uc...> - 2005-01-05 18:52:00
|
Hi,

I saw somewhere a discussion of what to do about the fact that ndiswrapper no longer compiles correctly with Fedora (and probably other patched) kernels. The decision appeared to be that it wasn't possible to support kernels from every distribution. Although I agree that it is impossible to support all kernels, I would like to suggest a different approach.

Basically, I think that ignoring the fact that ndiswrapper doesn't compile on redhat kernels isn't a good approach. First, this could greatly limit the number of people who install ndiswrapper, which could slow development of ndiswrapper itself if there are fewer people testing the package. Second, one of the maintainers of redhat kernels is Alan Cox, and the maintainer of stock kernels either is or is very close to Alan Cox. As a result, patches to redhat kernels sometimes indicate where the stock kernel is going and might need to be dealt with at some time.

I suggest that someone contact one of the kernel maintainers at redhat (perhaps Alan Cox or Dave Jones), bring the problems to their attention, and ask them if they wouldn't mind answering a few questions about why they made those decisions and what can be done about them. I think they are aware that lots of people are using ndiswrapper and would be willing to answer a few questions.

In that direction, I would like to start a list of questions that could be asked of redhat, which others can comment on and add to:

1) 4k stack. Giri has explained to me that 4k stacks are more efficient for servers, but don't make a big difference for workstations. I am under the impression that stack size is normally a configurable kernel option (correct me if I am wrong), but redhat has eliminated that option and made it always 4k. I would like to know why that option was eliminated and if there is any drawback to letting the stack size be configurable.

2) CONFIG_DEBUG_SPINLOCK. I know nothing about spinlocks, but the name of this option suggests that it is a debugging option that would not be used in final releases. I would like to know from redhat or anyone else if that is the case and if they plan on removing it sometime down the line. If not, then perhaps there are suggestions for a way to work around it (linuxant came up with one).

3) "task_nice" undefined. Again, I have no idea what this means, but it would be nice to know why stock kernels don't bring up this warning.

I also get "warning: `MODULE_PARM_' is deprecated" when I compile. I am not sure if this is a fedora or generic problem.

Thoughts....

David |
From: Gerald H. <ghe...@ro...> - 2005-01-05 22:13:14
|
On Wed, 05 Jan 2005 10:51:49 -0800, you wrote:

>Cox. As a result, patches to redhat kernels sometimes indicate where
>the stock kernel is going and might need to be dealt with at some
>time.

More than sometimes. Unlike past Red Hat practices, with Fedora the goal is to closely track the real release of any given software with as few changes as possible. In the case of the kernel this means the Fedora kernel is typically the same as the stock kernel with features from the next stock kernel release added in for testing purposes, particularly with the Fedora-devel version (aka Rawhide).

>I suggest that someone contact one of the kernel maintainers at redhat
>(perhaps Alan Cox or Dave Jones) and bring the problems to their
>attention and ask them if they wouldn't mind answering a few questions
>about why they made those decisions and what can be done about them. I
>think they are aware that lots of people are using ndiswrapper and would
>be willing to answer a few questions.

They both participate on the Fedora-devel mailing list.

>1) 4k stack. Giri has explained to me that 4k stacks are more efficient
>for servers, but doesn't make a big difference for workstations. I am
>under the impression that stack size is normally a configurable kernel
>option (correct me if I am wrong), but redhat has eliminated that option
>and made it always 4k. I would like to know why that option was
>eliminated and if there is any drawback to letting the stack size be
>configurable.

Wrong, 4k helps the desktop. This was discussed a long time ago on the Fedora-devel list; some highlights include:

https://www.redhat.com/archives/fedora-devel-list/2004-June/msg00643.html
https://www.redhat.com/archives/fedora-devel-list/2004-June/msg00641.html

>3) "task_nice" undefined. Again, no idea what this means, but it would
>be nice to know why stock kernels don't bring up this warning.

They will. See

https://www.redhat.com/archives/fedora-devel-list/2005-January/msg00137.html

for an explanation that this is a feature of kernel 2.6.10. |
From: Adrian Irving-B. <wis...@wi...> - 2005-01-05 23:35:32
|
On Wed, Jan 05, 2005 at 05:13:01PM -0500, Gerald Henriksen wrote: > More than sometimes. Unlike past Red Hat practices with Fedora the > goal is to closely track the real release of any given software with > as few changes as possible. Thank goodness. That's great to hear. This was one of the reasons I left RedHat in a hurry once I tried Debian. |
From: Jim C. <jc...@di...> - 2005-01-06 06:20:06
|
(This is not a flame, and is not directed at anyone.)

1. I'd like to suggest that everyone here read these 2 sites regularly (this thread particularly, but everyone else too):

http://lwn.net/
http://www.kerneltraffic.org/kernel-traffic/latest.html

Many suppositions & speculations are answered there, with more authority than any of us can toss around.

2. The 'get the facts' idea is nice, but I daresay they are aware of ndiswrapper, and a parade of users asking questions on LKML (the same ones, over and over again, quite possibly) will just look silly.

3. Alan's post (thanks for that link) was pretty unambiguous: code that can't run in 4k stacks is 'already broken'. AIUI, not all drivers have problems here; Realtek's driver doesn't (or I haven't seen it yet, in my limited use of that card). We would do well to determine which drivers have this problem, and collect the info in one place (a wiki perhaps :-O).

4. The kernel development model is changing/changed. 2.7 is still not out, and it may be a while (see KT #291):

http://www.kerneltraffic.org/kernel-traffic/kt20050104_291.html#1

Andrew Morton (one of those authoritative sources) said (speculating):

    Or start alternating between stable and flakey releases, so 2.6.11 will be a feature release with a 2-month development period and 2.6.12 will be a bugfix-only release, with perhaps a 2-week development period, so people know that the even-numbered releases are better stabilised.

What I've read prior is that with more kernel and vendor maturity, the vendors can be relied upon to keep the unwashed masses stable, and the kernel is allowed to be more volatile. FC probably falls in between - it's a beta vehicle for Redhat to acid test stuff before it goes to the paying customers (the unwashed masses above). It may be *as* unstable as vanilla, but is unlikely to be *more* unstable, and is probably on average less unstable. They're not *trying* to screw with us.

Another quote from KT 291:

    Someone pointed out that Andrew Morton's -mm tree might be bleeding edge, but that Andrew made a conscious choice about when to do each release, and that this choice probably took stability into account. Alan Cox said:

    2.6.x-mm is more like some of the work the old 2.4-ac did in merging new stuff (its also worth noting that 2.4-ac ended up more stable than 2.4 at times so -mm might be stable) The -ac tree is trying to be fairly conservative. When I merge stuff that is a little less conservative because it has to be done then I've tried to put a note in the relnotes for that release warning people its more testing grade.

Jeez, they're all relevant:

    Alan Cox announced Linux 2.6.9-ac16, saying:

    Further small fixes for different minor things. A merge of some of the small cleanups from Fedora work and also the fixes for the igmp and vc holes. Arjan van de Ven is now building RPMS of the kernel and those can be found in the RPM subdirectory and should be yum-able. Expect the RPMS to lag the diff a little as the RPM builds and tests do take time.

(Evidently Gerald has been reading.)

5. I can't see 8k kernels going away - ever - if you're willing to build your own, that is. There's no runtime cost to a macro define. Of course, if this opinion breaks, you get to keep both pieces.

6. Maybe the best thing we can do is each pony up $5 to buy Giri (or his wife) a nice Christmas present. We certainly have 50 people here. I daresay she has more influence over his free time than the demands of the bleating flock. |
From: Giridhar P. <gi...@lm...> - 2005-01-06 09:08:37
|
On Wed, 05 Jan 2005 23:19:56 -0700, Jim Cromie <jc...@di...> said:

I have been following the discussion and want to clarify a couple of things. Some of them have been mentioned before.

Ndiswrapper itself works fine with 4k stacks. As some of you pointed out, some Windows drivers (at least Broadcom and Realtek drivers) work fine. Other drivers need 8k stacks. I don't know of any driver that needs the 16k stack that linuxant's patch uses. If any RedHat/Fedora user/developer is interested, they can easily modify linuxant's patch to adjust the stack size to 8k (from 16k). That should be much better, for the same reasons that 4k stacks are better than 8k stacks.

CONFIG_DEBUG_SPINLOCK, IMHO, shouldn't be in production kernels. I don't know if RedHat kernels (not Fedora kernels) have this option enabled. Since RedHat/Fedora kernels need to be patched for 8k/16k stacks and compiled anyway, disabling CONFIG_DEBUG_SPINLOCK before compilation is not much work. In any case, current CVS and 1.0-rc2 have a workaround for this. However, some Windows drivers _may_ crash when CONFIG_DEBUG_SPINLOCK is used, so you are encouraged to disable this option.

Since there seems to be a lot of interest in RedHat/Fedora, it may be a good idea for someone to create kernel rpms (along with the ndiswrapper module) with the above two patches and host them somewhere, so others can simply download them instead of applying the patches and compiling.

task_nice is not used in ndiswrapper any more (CVS/1.0-rc2), so it should compile with RedHat/Fedora kernels.

It is not possible for ndiswrapper to support all kernels from all distributions (in addition to all 2.4/2.6 vanilla kernels). I myself don't have the resources to even check out kernels from all distributions. However, patches for any particular distribution's kernel are welcome. Ideally, it would be better if someone from a distribution could send patches. At least some of you are paying for that distribution! Of course, not all distributions may care about ndiswrapper, just as not all wireless card vendors care about Linux. Hopefully ndiswrapper will become useful to more people and catch the attention of distribution vendors.

Jim> 6. Maybe the best thing we can do is each pony up $5 to buy
Jim> Giri (or his wife) a nice christmas present. We certainly
Jim> have 50 people here. I daresay she has more influence over
Jim> his free time than the demands of the bleating flock.

Well, actually, donations can be used for more important things like buying more cards and other hardware so I don't have to chase people before every release (that is one reason why it takes a long time for each release). I also noticed that at least one driver stopped working as of the 0.12 release (RT2500 USB). Since I don't have this card, I don't know why it stopped working.

I currently have about 5 different chipsets and I make sure that they all work on a UP machine with/without preempt in 2.6 vanilla kernels. For anything else, there is no guarantee - I mostly rely on other users' reports to know if something is broken. If something is broken, I can't always fix it or even offer suggestions, as I have no way to figure it out without the hardware.

You can also donate cards (preferably PCMCIA cards). I will create a thread in the ndiswrapper Forums listing the cards that I have. If you have any other card that you would like to be supported, consider donating it.

I should mention that I have written to all major wireless card vendors asking them to donate cards so they can be supported by ndiswrapper. Only US Robotics sent me a couple of cards. So please write to your vendor and ask them to support ndiswrapper by donating cards.

-- 
Giri |
From: Philip A. <ph...@ka...> - 2005-01-06 09:51:14
|
On Thu, Jan 06, 2005 at 04:08:35AM -0500, Giridhar Pemmasani wrote:
> Ndiswrapper itself works fine with 4k stacks. As some of you pointed
> out, some Windows drivers (at least Broadcom and Realtek drivers) work
> fine. Other drivers need 8k stacks. I don't know of any driver that
> needs 16k stack that linuxant's patch uses. If any RedHat/Fedora
> user/developer is interested, they can easily modify linuxant's patch
> to adjust the stack size to 8k (from 16k). That should be much better
> for the same reasons that 4k stacks are better than 8k stacks.

It seems quite likely that Linux may move to 4k stacks as a permanent thing in the future -- one of the guiding principles of Linux development in the past has been to ignore backwards compatibility, especially backwards compatibility with binary-only drivers. Yes, from the POV of ndiswrapper it sucks, but everyone else wins.

The question then becomes: is there some way that the ndiswrapped windows driver can be given its own 8k (or even 16k) stack? You could presumably change the stack pointer on calling into the windows driver and put it back again when the function returns. That would seem to solve all the problems in a single swoop, at the cost of an additional 8k per network interface.

You can now tell me why this scheme is unworkable :)

Phil
-- 
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt |
From: John H. <jc...@th...> - 2005-01-06 10:10:44
|
Philip Armstrong wrote:

>The question then becomes, is there someway that the ndiswrapped
>windows driver can be given it's own 8k (or even 16k) stack? You could
>presumably change the stack pointer on calling into the windows driver
>and put it back again when the function returns. That would seem to
>solve all the problems in a single swoop, at the cost of an additional
>8k per network interface.

Grief. Deja vu all over again :-) I asked and no one took me up on the challenge. I think my mail doesn't make it through to the list sometimes.

>You can now tell me why this scheme is unworkable :)

I think it is workable. NT (Windows) has a 12k stack though, so three pages per driver instance would make sense. The basic scheme is that where you call a wrapped NDIS routine, you set up a call frame in this 12k buffer (which includes the address that you'd like to return to), flip the stack pointer to point to the 12k stack and jump to the NDIS entry point. When you bounce back to the return address, restore the original stack pointer.

This should work even on SMP kernels since the kernel stack pointer is a per-thread thing. I believe it will also work with a pre-emptable kernel (at least it will unless the pre-emptable kernel uses a stack management scheme that I'm not expecting).

The downside, for me, to all this is that my assembler skills were honed on a VAX and I have trouble even reading i386 assembler.

jch |
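The switch-call-restore sequence described above can be illustrated in user space. What follows is only a minimal sketch, assuming Linux with glibc's `ucontext` functions; it is not ndiswrapper code and not kernel code, and a real implementation would swap the stack pointer in assembly inside the wrapper. The names `call_on_private_stack`, `struct stack_call`, `probe_entry` and `probe_set_range` are hypothetical and exist only for this illustration. `probe_entry` stands in for a wrapped driver entry point and simply reports whether it is running on the private buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <ucontext.h>

/* Three 4k pages, the size suggested in the thread. */
#define DRIVER_STACK_SIZE (3 * 4096)

/* Hypothetical bookkeeping for one call made on the private stack. */
struct stack_call {
    ucontext_t caller;   /* saved state of the original stack */
    ucontext_t callee;   /* state that runs on the private stack */
    long (*fn)(long);    /* stands in for a wrapped driver entry point */
    long arg;
    long result;
};

/* makecontext() only passes int arguments, so the pointer arrives in halves. */
static void trampoline(unsigned int hi, unsigned int lo)
{
    uint64_t bits = ((uint64_t)hi << 32) | lo;
    struct stack_call *c = (struct stack_call *)(uintptr_t)bits;

    c->result = c->fn(c->arg);
    /* Returning resumes uc_link, which is the caller's saved context. */
}

/* Run fn(arg) on the given buffer, then come back to the original stack. */
int call_on_private_stack(void *stack, size_t size,
                          long (*fn)(long), long arg, long *result)
{
    struct stack_call c;
    uint64_t bits = (uint64_t)(uintptr_t)&c;

    assert(stack != NULL && fn != NULL && result != NULL);
    c.fn = fn;
    c.arg = arg;
    c.result = 0;

    if (getcontext(&c.callee) != 0)
        return -1;
    c.callee.uc_stack.ss_sp = stack;
    c.callee.uc_stack.ss_size = size;
    c.callee.uc_link = &c.caller;
    makecontext(&c.callee, (void (*)(void))trampoline, 2,
                (unsigned int)(bits >> 32),
                (unsigned int)(bits & 0xffffffffu));

    /* Save the current stack state and jump onto the private stack. */
    if (swapcontext(&c.caller, &c.callee) != 0)
        return -1;
    *result = c.result;
    return 0;
}

/* Demo helpers: report whether a local variable lives inside a given range. */
static uintptr_t probe_lo, probe_hi;

void probe_set_range(void *base, size_t size)
{
    probe_lo = (uintptr_t)base;
    probe_hi = probe_lo + size;
}

long probe_entry(long unused)
{
    char marker;
    uintptr_t p = (uintptr_t)&marker;

    (void)unused;
    return p >= probe_lo && p < probe_hi;
}
```

Calling `probe_entry` through `call_on_private_stack` with a `DRIVER_STACK_SIZE` buffer reports that it ran inside the buffer, while calling it directly reports that it did not. The sketch says nothing about calling conventions, interrupts or preemption, which are the open questions in the thread.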
From: Philip A. <ph...@ka...> - 2005-01-06 10:57:24
|
On Thu, Jan 06, 2005 at 10:10:29AM +0000, John Haxby wrote:
> Philip Armstrong wrote:
> >The question then becomes, is there someway that the ndiswrapped
> >windows driver can be given it's own 8k (or even 16k) stack? You could
> >presumably change the stack pointer on calling into the windows driver
> >and put it back again when the function returns. That would seem to
> >solve all the problems in a single swoop, at the cost of an additional
> >8k per network interface.
>
> Grief. Deja vu all over again :-) I asked and no one took me up on
> the challenge. I think my mail doesn't make it though to the list
> sometimes.

Sorry, I should have referenced your post. It did get through!

> >You can now tell me why this scheme is unworkable :)
>
> I think it is workable. NT (Windows) has a 12k stack though so three
> pages per driver instance would make sense. The basic scheme is that
> where you called a wrapped NDIS routine you drop a calling stack from in
> this 12k buffer (which includes address that you'd like to return to),
> flip the stack pointer to point to the 12k stack and jump to the NDIS
> entry point. When you bounce back to the return address, restore the
> original stack pointer.

Absolutely.

> This should work even on SMP kernels since the kernel stack pointer is a
> per-thread thing. It believe it will also work with a pre-emptable
> kernel (at least it will unless the pre-emptable kernel uses a stack
> management scheme that I'm not expecting).

> The downside, for me, to all this is that my assembler skills were honed
> on a VAX and I have trouble even reading i386 assembler.

Swizzling the stack pointer is only a couple of instructions thankfully. Plus you've got to put the return address in the right place and set up the arguments of course. But ndiswrapper must be doing that anyway -- the calling convention for the NT kernel is unlikely to be the same as the Linux kernel one -- so most of the work is already done.

Phil
-- 
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt |
From: Jan K. <jan...@we...> - 2005-01-06 12:00:24
|
Hi all,

it was amusing to follow the flamewar about the pros and cons of supporting distribution kernels; I'm not going to comment on this. At the moment, I unfortunately don't have the time and resources (parts of my test environment are still broken) to hack on ndiswrapper. But I read the suggestions about patching the stack size during runtime, on which I would like to give my 2 cents:

Philip Armstrong wrote:
> On Thu, Jan 06, 2005 at 10:10:29AM +0000, John Haxby wrote:
>
>>Philip Armstrong wrote:
>>
>>>The question then becomes, is there someway that the ndiswrapped
>>>windows driver can be given it's own 8k (or even 16k) stack? You could
>>>presumably change the stack pointer on calling into the windows driver
>>>and put it back again when the function returns. That would seem to
>>>solve all the problems in a single swoop, at the cost of an additional
>>>8k per network interface.
>>>
>>
>>Grief. Deja vu all over again :-) I asked and no one took me up on
>>the challenge. I think my mail doesn't make it though to the list
>>sometimes.

Sounds appealing, but reality is a bit more complex. Unfortunately, this is not just /the/ single stack used for the Windows driver entry. Actually, any user or kernel space task issuing a "sendmsg" or some configuration control command may enter ndiswrapper and thus the Windows driver code. All these tasks have their own kernel stacks. Mapping these concurrent contexts onto the same single stack will give nice kernel crashes...

You would rather need a pool of shadow stacks, or you would have to allocate those stacks on demand. The latter may turn out to be a performance killer. Furthermore, you may also have to deal with detecting these shadow stacks, as the Windows drivers tend to call back to ndiswrapper, which could then cause a second driver entry, this time without the need to switch stacks.

Still, it might be possible to switch stacks, and the performance penalty might be tunable to an acceptable level. But it now requires someone with time and patience to fiddle around with the concepts - patches will be welcome. The costs of recompiling a kernel with 8k stacks are still much lower.

Jan |
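The two requirements raised in this message, a pool of shadow stacks for concurrent entries and a way to detect that a nested entry is already running on one, might look roughly like the sketch below. It is plain user-space C under stated assumptions: the pool size, the names `shadow_stack_get`, `shadow_stack_put` and `on_shadow_stack`, and the fixed static array are all hypothetical, and there is no locking, which a kernel version would need (for example a spinlock around the in-use table).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_STACK_SIZE  (3 * 4096)  /* 12k, as discussed in the thread */
#define SHADOW_STACK_COUNT 4           /* arbitrary pool size for the sketch */

/* Preallocated pool, so no allocation happens on the driver entry path. */
unsigned char shadow_stacks[SHADOW_STACK_COUNT][SHADOW_STACK_SIZE];
static int shadow_in_use[SHADOW_STACK_COUNT];

/* Claim a free shadow stack; returns its index, or -1 if the pool is empty. */
int shadow_stack_get(void)
{
    int i;

    for (i = 0; i < SHADOW_STACK_COUNT; i++) {
        if (!shadow_in_use[i]) {
            shadow_in_use[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Give a shadow stack back to the pool once the driver call has returned. */
void shadow_stack_put(int idx)
{
    assert(idx >= 0 && idx < SHADOW_STACK_COUNT);
    assert(shadow_in_use[idx]);
    shadow_in_use[idx] = 0;
}

/*
 * Returns 1 if addr lies inside the pool. A wrapper could pass the address
 * of one of its own locals: 1 means this is a nested entry (the driver
 * called back into the wrapper), so no further stack switch is needed.
 */
int on_shadow_stack(const void *addr)
{
    uintptr_t p = (uintptr_t)addr;
    uintptr_t lo = (uintptr_t)&shadow_stacks[0][0];
    uintptr_t hi = lo + sizeof shadow_stacks;

    return p >= lo && p < hi;
}
```

A fixed pool avoids the on-demand allocation cost mentioned above, at the price of having to decide what happens when `shadow_stack_get` returns -1 (block, fail the request, or fall back to the normal stack). That policy is deliberately left open here.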
From: John H. <jc...@th...> - 2005-01-06 12:10:17
|
Jan Kiszka wrote:

>
> Sounds appealing, but reality is a bit more complex. Unfortunately,
> this is not just /the/ single stack used for the Windows driver entry.
> Actually, any user or kernel space task issuing a "sendmsg" or some
> configuration control command, may enter ndiswrapper and thus the
> Windows driver code. All these tasks have their own kernel stacks.
> Mapping these concurrent contexts on the same single stack will give
> nice kernel crashes...
>
> You would rather need a pool of shadow stacks or you would have to
> allocate that stacks on-demand. The latter may turn out to be a
> performance killer. And furthermore, you may also have to deal with
> detecting these shadow stacks, as the Windows drivers tend to call
> back to ndiswrapper which then could cause a second driver entry, this
> time without the need to switch stacks.

It is more complex, but I don't agree that you need multiple stacks. An NDIS wrapper function can only call back to a "normal" function if it has been itself called. And if it's called then the per-driver-instance stack is in use. ndiswrapper ought to be able to keep track of whether the normal (per-thread) kernel stack is in place or whether the per-driver stack is in place.

You'd need multiple stacks if you allowed multiple threads to use the driver at the same time. If NDIS function calls are locked so that multiple threads can't use the driver at the same time, this issue won't arise.

Note that the per-driver stack may be put in place both to handle the low-level interrupts (which don't have process context and just piggy back whatever kernel stack is in use at the moment) and "normal" function calls in which case the kernel stack belonging to the current task is used.

jch |
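The bookkeeping proposed in this message, tracking whether the per-driver stack is already in place, could be as small as a nesting counter. The sketch below is a hypothetical illustration in plain C: `struct driver_stack_state`, `driver_enter` and `driver_leave` are invented names, and it assumes the serialization described above, that is, only one thread is inside the driver at a time, so the counter needs no locking of its own.

```c
#include <assert.h>

/* Hypothetical per-driver-instance state. */
struct driver_stack_state {
    int depth;     /* how many wrapped calls are currently nested */
    int switches;  /* how many real stack switches were performed */
};

/*
 * Called before entering the Windows driver.
 * Returns 1 if the caller must switch to the per-driver stack,
 * 0 if the per-driver stack is already in place (nested entry).
 */
int driver_enter(struct driver_stack_state *s)
{
    if (s->depth++ == 0) {
        s->switches++;
        return 1;
    }
    return 0;
}

/*
 * Called after the Windows driver returns.
 * Returns 1 if the caller must switch back to the normal kernel stack.
 */
int driver_leave(struct driver_stack_state *s)
{
    assert(s->depth > 0);
    return --s->depth == 0;
}
```

With this counter, an outer call followed by a callback that re-enters the driver performs exactly one switch and one restore. It does not by itself address a driver that sleeps while holding the serializing lock.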
From: Jan K. <jan...@we...> - 2005-01-06 13:57:14
|
John Haxby wrote:
> It is more complex, but I don't agree that you need multiple stacks.
> An NDIS wrapper function can only call back to a "normal" function if it
> has been itself called. And if it's called then the
> per-driver-instance stack is in use. ndiswrapper ought to be able to
> keep track of whether the normal (per-thread) kernel stack is in place
> or whether the per-driver stack is in place. You'd need multiple
> stacks if you allowed multiple threads to use the driver at the same
> time. If NDIS function calls are locked so that multiple threads can't
> use the driver at the same time, this issue won't arise. Note that the

Locking, and thus limiting the number of users, seems to solve this problem - as long as the driver doesn't decide to fall asleep, e.g. on some mutex or for a certain delay. Then you will have a thread switch, and you may even end up with a deadlock if a second thread must enter the driver to resume the first one.

There might be simple drivers not suffering from such a scenario, but note that such problems can be very hard to track down when they really occur. A clean design right from the start may be more helpful.

> per-driver stack may be put in place both to handle the low-level
> interrupts (which don't have process context and just piggy back
> whatever kernel stack is in use at the moment) and "normal" function
> calls in which case the kernel stack belonging to the current task is used.
>

Interrupt handlers can indeed simply be handled like normal driver entries: check if a stack switch was already performed; if not, do it now. But remember again, if your driver function was suspended and switched away, you cannot just reuse a single stack now.

Jan |
From: Philip A. <ph...@ka...> - 2005-01-06 13:00:50
|
On Thu, Jan 06, 2005 at 01:00:05PM +0100, Jan Kiszka wrote:
> >>Philip Armstrong wrote:
> >>
> >>>The question then becomes, is there someway that the ndiswrapped
> >>>windows driver can be given it's own 8k (or even 16k) stack? You could
> >>>presumably change the stack pointer on calling into the windows driver
> >>>and put it back again when the function returns. That would seem to
> >>>solve all the problems in a single swoop, at the cost of an additional
> >>>8k per network interface.
>
> Sounds appealing, but reality is a bit more complex. Unfortunately, this
> is not just /the/ single stack used for the Windows driver entry.
> Actually, any user or kernel space task issuing a "sendmsg" or some
> configuration control command, may enter ndiswrapper and thus the
> Windows driver code. All these tasks have their own kernel stacks.
> Mapping these concurrent contexts on the same single stack will give
> nice kernel crashes...

Why can't they just use the available stack as is? It's just a stack. An interrupt shouldn't care whether it's using an ndiswrapper stack or a normal stack[1]. A preemption will be switching to another thread & so will get that thread's stack, if I have the semantics right.

There's nothing magic about these ndiswrapper windows stacks which is going to stop kernel interrupt code from using them as is AFAICS[1].

Code talks however -- if I have time I'll have a look at it.

Phil

[1] Happy to see reasons to the contrary!
-- 
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt |