From: Blaisorblade <bla...@ya...> - 2004-11-30 18:41:57
|
Since you've started sending UML patches, I've remembered to ask you this. In particular, from a general feeling, you should merge: uml-fix-umldir-order uml-fix-export-symbol if not already applied. uml-raise-tty-limit (maybe something similar could be good for ubd, too? We are using half of the 256 minors available). -- Paolo Giarrusso, aka Blaisorblade Linux registered user n. 292729 http://www.user-mode-linux.org/~blaisorblade |
From: Blaisorblade <bla...@ya...> - 2004-11-30 19:00:03
|
On Tuesday 30 November 2004 19:45, Blaisorblade wrote: > Since you've started sending UML patches, I've remembered to ask you this. > > In particular, from a general feeling, you should merge: > > uml-fix-umldir-order > > uml-fix-export-symbol if not already applied. > > uml-raise-tty-limit (maybe something similar could be good for ubd, too? We > are using half of the 256 minors available). Forgot some other ones: uml-core-on-panic - I remember it being a bit controversial. At least it has been turned to using the notifier_chain... which was problem #1. #2 was that you need to skip all the rest of exit cleanup, which Jeff didn't like... but as long as it is not a change to the default behaviour, I think he might accept it. uml-general-protection-fault - from the comment, it seems that if that were a page fault, it would be fixable, while we always prefer to guess it's a GFP and so to die horribly. If this is true, it cannot be merged. But what about adding a PTRACE_EX_FAULTINFO to SKAS3 and probe for it at startup? Jeff Dike wanted to wait for SKAS4 to fix it, but as you see SKAS3 is still under very active development. Also, we get all the possible compatibility we could, this way. -- Paolo Giarrusso, aka Blaisorblade Linux registered user n. 292729 http://www.user-mode-linux.org/~blaisorblade |
From: Blaisorblade <bla...@ya...> - 2004-12-04 03:55:33
|
On Thursday 02 December 2004 12:48, Gerd Knorr wrote: > > > In particular, from a general feeling, you should merge: > > Yep, I'm waiting for -rc3 as sync point for my next patchset, then go > over the stuff, update if needed and submit what I think should go in. > > uml-general-protection-fault - from the comment, it seems that if that > > were a page fault, it would be fixable, while we always prefer to guess > > it's a GFP and so to die horribly. If this is true, it cannot be merged. > That triggeres if some application within uml tries to access I/O ports > (which it can't obviously). Only or also? Are you seeing specific behaviour improving here (i.e. hwclock or anything else, even X11)? > The processor raises a GPF then, and uml > without that patch tries to handle it as page fault ... If you mean the comment is outdated, it could be ok... but this *must* be in the comments of the patch. I'm going to actually study what is happening there... from FAULTINFO implementation: + { .is_write = child->thread.error_code, it seems that the error_code value is coded like the param of the i386 asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code) , where in fact I can read: tsk->thread.error_code = error_code | (address >= TASK_SIZE); , but I see it generates a fault with trap_no == 14, i.e. page fault (are there no macros for that? This is absolutely ugly!), so I assume that: asmlinkage void do_general_protection(struct pt_regs * regs, long error_code) which gives 13 as trap_no, has the same calling convention (enforced by the processor, probably). for which I see: * bit 2 == 0 means kernel, 1 means user-mode i.e., the fault on the host has happened on kernel code. If you go to the trouble of explaining that it can't be a page fault, or that it can on special conditions (I guess it's difficult - a page fault in kernel code never gives a SIGSEGV IMHO) and that accessing I/O ports gives a fault in kernel code (which does not make sense for me) and explain all this in the comment, then the patch might me merged - and if it's not a workaround then PTRACE_EX_FAULTINFO disappears. The rest of the message assumes PTRACE_EX_FAULTINFO might be still needed. > > But what about adding a PTRACE_EX_FAULTINFO to SKAS3 and probe for it at > > startup? > Don't like that idea that much, given the fact that there is a > incompatible change in the queue anyway where we can easily fix that. > On the other hand skas4 has been vaporware only up to now. This is the reason I don't want to wait for that to fix such things. Also, to do conservative coding, it's better to test this fixed API in SKAS3, so that we can identify any specific problem with it. > At least I > havn't seen a single line of code, not even an announcement so far. > Jeff, what is the current status? Is there more than just the name? > Design ideas? Code? If so, where? Design ideas: tossed in various places, probably you want to look at the original Jeff / Linus discussion: http://www.kerneltraffic.org/kernel-traffic/kt20030106_199.html#4 Also, Jeff explained the API multiple times over the ML - just search (it's just amazing to see how well Jeff managed to advertise SKAS4 as "The Next Big Thing(tm)!). Code: you can see an idea bundled inside the x86_64 patch against 2.6.4, the way SKAS3 is bundled in the normal 2.4 UML patch. It is basically just a raw idea - if it is ever seriously discussed, I have tons of concerns on that code. From missing security hooks, to the fact that moving current->mm might not be enough (you need some state which is elsewhere, especially the personality flags when you run 32bit processes inside a native UML for AMD64, but possibly more - the problem also shows up if you want to merge SKAS3 and GrSecurity *properly*) to the lack of any locking around current->mm. Now that I think to it, I may be wrong, but I've the doubt the sporadic panics I see about SKAS are simply race conditions on setting current->mm in PTRACE_SWITCH_MM: + child->mm = new; + child->active_mm = new; is probably racy when you immediately switch on another process to the new task... normally, when returning from sys_ptrace, you probably hit a barrier somewhere, but imagine that the code gets preempted before hitting the barrier and another processor schedules the child... it will BUM somehow. Also, the trick in switch_mm for the SMP case may not be enough... the per_cpu array of the active mm has a stale value, and if that array exists it probably is used, together with the stale value... **mumbling** Status: Go to the diary page on the site and search SKAS4 - there's a recent entry about Jeff giving it a try. > > Jeff Dike wanted to wait for SKAS4 to fix it, but as you see SKAS3 is > > still under very active development. Also, we get all the possible > > compatibility we could, this way. > Well, compatibility-wise there is no difference between skas4 and > skas3new: To actually use that both host and uml machines must support > that. Yes, but while skas4 *is* not yet ready, if you want to fix this issue, this week we can get a new SKAS version. And we are already supporting SYSEMU, so supporting such things is not new. Supporting SKAS4 is much different, instead. case 1: we need to copy over the SKAS folder and do changes all over. In this case, I don't even know if we would anytime offer kernels working with boths... that could come as a bonus, if anyone finds the time to do such packaging, but I'm against merging something such in the kernel. case 2: we can confine changes into a very few specific functions. It would seem the case from the idea itself, but if Jeff was getting problems with signals (!), it does not make sense > With the exception that we'll have three skas versions to support > once skas4 is done: skas3old, skas3new + skas4. > I'd prefeare to get skas4 out of the door quickly, and also see that > one merged mainline. That is more work _now_ of cource, but less > maintainance work in the long run. First, because we don't need > skas3new then, and second because it can be merged mainline. > Gerd -- Paolo Giarrusso, aka Blaisorblade Linux registered user n. 292729 http://www.user-mode-linux.org/~blaisorblade |
From: Gerd K. <kr...@by...> - 2004-12-07 12:20:26
|
> > > uml-general-protection-fault - from the comment, it seems that if that > > > were a page fault, it would be fixable, while we always prefer to guess > > > it's a GFP and so to die horribly. If this is true, it cannot be merged. > > > That triggeres if some application within uml tries to access I/O ports > > (which it can't obviously). > Only or also? Are you seeing specific behaviour improving here (i.e. hwclock > or anything else, even X11)? hwinfo (suses hardware detection tool) stops hanging, especially the vmware test (which tries to access one of the magic ports used for communication with vmware). It rarely shows up otherwise, usually the apps first try to get permissions via ioperm(), which fails. They usualy bail out then because of that and don't even try to access a I/O port. > > The processor raises a GPF then, and uml > > without that patch tries to handle it as page fault ... > > If you mean the comment is outdated, it could be ok... but this *must* be in > the comments of the patch. Maybe it's just badly explained. It's also some time ago I wrote that and I never looked at it again. > * bit 2 == 0 means kernel, 1 means user-mode > > i.e., the fault on the host has happened on kernel code. If you go to the > trouble of explaining that it can't be a page fault, or that it can on > special conditions (I guess it's difficult - a page fault in kernel code > never gives a SIGSEGV IMHO) and that accessing I/O ports gives a fault in > kernel code (which does not make sense for me) IIRC it *looks* like a kernel mode page fault (because the info passed on via FAULTINFO is incomplete), but it isn't one. As a kernel mode page fault shouldn't happen I used that to detect the GPF case. Seems to do fine normally, maybe it blows up if a real kernel mode page fault happens due to a uml kernel bug. > The rest of the message assumes PTRACE_EX_FAULTINFO might be still needed. Explicitly passing the fault info certainly is the cleaner solution. > > On the other hand skas4 has been vaporware only up to now. > > This is the reason I don't want to wait for that to fix such things. I also don't want to wait one more year. But if there is a chance to get skas4 work within a more reasonable time frame I'd prefeare that route. > http://www.kerneltraffic.org/kernel-traffic/kt20030106_199.html#4 I remember that discussion ;) > Also, Jeff explained the API multiple times over the ML - just search (it's > just amazing to see how well Jeff managed to advertise SKAS4 as "The Next Big > Thing(tm)!). Hmm, well, a google search wasn't that successful, althrough a few mailing list hist where in there (so it's actually indexed by google). > Code: you can see an idea bundled inside the x86_64 patch against 2.6.4, the > way SKAS3 is bundled in the normal 2.4 UML patch. It is basically just a raw > idea - if it is ever seriously discussed, This never being discussed is the first problem. > I have tons of concerns on that code. From missing security hooks, to > the fact that moving current->mm might not be enough (you need some > state which is elsewhere, especially the personality flags when you > run 32bit processes inside a native UML for AMD64, but possibly more - > the problem also shows up if you want to merge SKAS3 and GrSecurity > *properly*) to the lack of any locking around current->mm. skas3 sufferes from that as well, right? > Status: Go to the diary page on the site and search SKAS4 - there's a recent > entry about Jeff giving it a try. Yep, I've seen that. No url to the current code through so one could help somehow by reviewing / testing / debugging stuff. > Yes, but while skas4 *is* not yet ready, if you want to fix this issue, this > week we can get a new SKAS version. > > And we are already supporting SYSEMU, so supporting such things is not new. sysemu is independant from skas (and can also used in tt mode). Would be nice to have these as separated patches btw, so we could try to get sysemu merged mainline. > case 2: we can confine changes into a very few specific functions. It would > seem the case from the idea itself, but if Jeff was getting problems with > signals (!), it does not make sense Not clear yet why according to the diary, but without patches it's hard to help getting skas4 out of the door by looking at these issues ... Gerd -- #define printk(args...) fprintf(stderr, ## args) |
From: Blaisorblade <bla...@ya...> - 2004-12-10 20:40:27
|
On Tuesday 07 December 2004 13:03, Gerd Knorr wrote: > > > > uml-general-protection-fault - from the comment, it seems that if > > > > that were a page fault, it would be fixable, while we always prefer > > > > to guess it's a GFP and so to die horribly. If this is true, it > > > > cannot be merged. > > > > > > That triggeres if some application within uml tries to access I/O ports > > > (which it can't obviously). > > Only or also? Are you seeing specific behaviour improving here (i.e. > > hwclock or anything else, even X11)? > hwinfo (suses hardware detection tool) stops hanging, especially the > vmware test (which tries to access one of the magic ports used for > communication with vmware). It rarely shows up otherwise, usually the > apps first try to get permissions via ioperm(), which fails. They > usualy bail out then because of that and don't even try to access a I/O > port. Ok, thanks... > > > The processor raises a GPF then, and uml > > > without that patch tries to handle it as page fault ... > > If you mean the comment is outdated, it could be ok... but this *must* be > > in the comments of the patch. > Maybe it's just badly explained. Possibly... the problem is that the comment says it *may* work while I'm trying to make sure it *does* work always or to fix it... > It's also some time ago I wrote that > and I never looked at it again. > > * bit 2 == 0 means kernel, 1 means user-mode > > > > i.e., the fault on the host has happened on kernel code. If you go to the > > trouble of explaining that it can't be a page fault, or that it can on > > special conditions (I guess it's difficult - a page fault in kernel code > > never gives a SIGSEGV IMHO) and that accessing I/O ports gives a fault in > > kernel code (which does not make sense for me) > IIRC it *looks* like a kernel mode page fault (because the info passed > on via FAULTINFO is incomplete), but it isn't one. Well, I actually checked the value being passed by one of those functions; now, they seem to be registered as the processor trap handler, which means that checking the Intel manual would give the real API for the general case (bits may have different meanings for GPF traps) ... > As a kernel mode > page fault shouldn't happen I used that to detect the GPF case. > Seems > to do fine normally, maybe it blows up if a real kernel mode page fault > happens due to a uml kernel bug. Yes, if the kernel gets a page fault, the usermode process shouldn't get a SIGSEGV... However, we must make really sure that it's this way... > > The rest of the message assumes PTRACE_EX_FAULTINFO might be still > > needed. > Explicitly passing the fault info certainly is the cleaner solution. Yes, it may be done soon. > > > On the other hand skas4 has been vaporware only up to now. > > > > This is the reason I don't want to wait for that to fix such things. > I also don't want to wait one more year. But if there is a chance to > get skas4 work within a more reasonable time frame I'd prefeare that > route. When I get to work on Bodo's improvements to SYSEMU (on which I have some doubts I already posted) I'll also try adding that... UML must already handle some others similar "API-extensions", including probably PTRACE_SYSEMU_SINGLESTEP (or another kind of idea). > > Also, Jeff explained the API multiple times over the ML - just search > > (it's just amazing to see how well Jeff managed to advertise SKAS4 as > > "The Next Big Thing(tm)!). > Hmm, well, a google search wasn't that successful, althrough a few > mailing list hist where in there (so it's actually indexed by google). > > Code: you can see an idea bundled inside the x86_64 patch against 2.6.4, > > the way SKAS3 is bundled in the normal 2.4 UML patch. It is basically > > just a raw idea - if it is ever seriously discussed, > This never being discussed is the first problem. Yes, but I never bothered *really* asking Jeff to publish it - that's just the habits one can get used, as doing a solo-development... even waiting for the patch to be ready and polished up is not wise, when there are multiple developers which might like it. "Release often - release earlier" is important. Also, while in that discussion, Linus proposed that API, it does not mean that if we implement that exact API, it gets accepted by everyone without concerns. We should get into this mindview... > > I have tons of concerns on that code. From missing security hooks, to > > the fact that moving current->mm might not be enough (you need some > > state which is elsewhere, especially the personality flags when you > > run 32bit processes inside a native UML for AMD64, but possibly more - > > the problem also shows up if you want to merge SKAS3 and GrSecurity > > *properly*) to the lack of any locking around current->mm. > skas3 sufferes from that as well, right? Yes, both of not saving current->flags and the missing locking around current->mm. I think that a wmb(); call after setting current->mm may be enough - we get non-reproducible crashes and somebody posts the backtraces, but never figured out what's happening. In fact, if you think at it, it's difficult that no memory flushing is done when doing a context switch... also to get an Oops, you probably need to have child->mm to point at a freed and overwritten mm_struct... However, since I don't know which access rules are used in the kernel onto current->mm (everything is written by assuming that current->mm won't change). Probably, in SuSE there is someone who knows better (don't know if Andrea Arcangeli will ever care, but it would be nice)... since you include SKAS in your kernel, getting it reliable is interesting also for you all in SuSE. See my other mail, regarding the outdated SKAS version in SuSE 9.1 and its bugs. > > Status: Go to the diary page on the site and search SKAS4 - there's a > > recent entry about Jeff giving it a try. > Yep, I've seen that. No url to the current code through so one could > help somehow by reviewing / testing / debugging stuff. > > Yes, but while skas4 *is* not yet ready, if you want to fix this issue, > > this week we can get a new SKAS version. > > And we are already supporting SYSEMU, so supporting such things is not > > new. > sysemu is independant from skas (and can also used in tt mode). Yes, this support is merged somewhere (both -bb and -bk), but disabled for now in -bb just in case some other problem pops up with it... > Would > be nice to have these as separated patches btw, so we could try to get > sysemu merged mainline. Yes, they are splitout on my disk, never found time to package and describe it. However, the -bb patchset offers exactly that! I put those patches at the beginning of the UML tree, to notice early any conflict between SKAS+SYSEMU and UML. I've also posted once the SYSEMU fixes to LKML, but they didn't get much attention. Maybe they looked like a random idea for ptrace() without much uses (I marketed those patches as generally useful, rather than being about UML); also I didn't CC Roland McGrath, the ptrace maintainer. > > case 2: we can confine changes into a very few specific functions. It > > would seem the case from the idea itself, but if Jeff was getting > > problems with signals (!), it does not make sense > Not clear yet why according to the diary, but without patches it's hard > to help getting skas4 out of the door by looking at these issues ... We have more urgent stuff to get fixed right now... -- Paolo Giarrusso, aka Blaisorblade Linux registered user n. 292729 http://www.user-mode-linux.org/~blaisorblade |
From: Gerd K. <kr...@by...> - 2004-12-10 23:00:43
|
> > I also don't want to wait one more year. But if there is a chance to > > get skas4 work within a more reasonable time frame I'd prefeare that > > route. > > When I get to work on Bodo's improvements to SYSEMU (on which I have some > doubts I already posted) I'll also try adding that... UML must already handle > some others similar "API-extensions", including probably > PTRACE_SYSEMU_SINGLESTEP (or another kind of idea). Found a old x86_64 patch (2.6.4) with some skas4 stuff (uml kernel code only, but hey, more than nothing ...) on my hard disk. skas4 needs ptrace(FAULTINFO) as well, looks like this there: struct ptrace_faultinfo { int is_write; unsigned long addr; + unsigned long trap_no; + unsigned long error_code; }; So it probably doesn't make a big difference maintainance-wise whenever we'll put that in _now_ as FAULTINFO_NEW or wait for skas4 as skas4 will use the very same thing ... > > This never being discussed is the first problem. > > "Release often - release earlier" is important. Full agree ... > Also, while in that discussion, Linus proposed that API, it does not > mean that if we implement that exact API, it gets accepted by everyone > without concerns. We should get into this mindview... ... and thats one of the reasons. I think the only way to avoid that is to discuss the design ideas early. It doesn't need to be the perfect patch for that. The 2.6.4 x86_64 patch looks like Jeff tried two approachs: First being (IIRC the original thing suggested by Linus) a new syscall which returns a file descriptor refering to the mm, a syscall for executing mmap() operations on a mm using the fd refering to it. Switch processes via ptrace(SWITCH_MM) like skas3 does. Second being one host process per uml process, ptrace(SWITCH_MM) not being needed any more, current->mm not being modified any more, the syscall for changing mappings of another processes takes the pid not a fd as argument. The second approach looks much better to me. This way it likely also is easier to make tls+nptl work within uml, it will probably also simplify any personality issues on amd64. As the host kernel side is missing it's not clear how this makes sure that you can't change the mappings of random processes in the second approach. With the first approach this is easy, only the creator of a new mm can change mapping therein because you need the file handle for that. Jeff, any comment what the current status is? Maybe also a URL to current patches? > Probably, in SuSE there is someone who knows better (don't know if > Andrea Arcangeli will ever care, but it would be nice)... since you > include SKAS in your kernel, getting it reliable is interesting also > for you all in SuSE. Andrea looked into skas some time ago and didn't like the way mm switching is done very much. Havn't heared anything recently through, asking him for comments and then go over the skas patches is on my (way too long) todo list through. He probably could also give some useful comments on the skas4 design. > > [ split sysemu + skas ] > However, the -bb patchset offers exactly that! Great, I'll have a look. > UML); also I didn't CC Roland McGrath, the ptrace maintainer. Roland McGrath seems to be a busy guy as well, Cc:'ing him probably is a good idea to make sure he will see the mail. Gerd -- #define printk(args...) fprintf(stderr, ## args) |