From: Jeff D. <jd...@ka...> - 2002-01-12 01:35:25
[ This might be good to cc to linux-mm when you think it's settled down
a bit ... ]

bu...@gn... said:
> This is cut #1 of the 'export switch_mm to userspace' patch, as
> discussed on IRC.

You're not wasting any time :-)

> This patch creates a new /proc entry, /proc/mm, which on every open(2)
> will create a new mm_struct and 'associate' it with the current
> process. By write(2)ing into the returned file descriptor, the kernel
> switches to this new mm_struct.

I realize this is just a first prototype, so I'm not criticizing this
interface. I just thought I'd write down my thoughts on the interface to
this so that people can check if it makes sense.

/proc/mm [ these names are probably bogus and I welcome better ones... ]
    - returns a file descriptor referring to a new, empty mm_struct
/proc/self/mm
    - returns a file descriptor referring to current->mm
/proc/<pid>/mm
    - returns a file descriptor referring to the mm of process <pid>

populate an mm_struct with an extension of mmap - the current interface is
this:
    void *mmap(void *start, size_t length, int prot, int flags,
               int fd, off_t offset);
I would add another file descriptor:
    void *new_mmap(void *start, size_t length, int prot, int flags,
                   int src_fd, off_t offset, int dest_fd);

The range [ offset offset + length ] within the src_fd object will be mapped
into the range [ start start + length ] in the dest_fd object.

dest_fd -1 refers to current->mm, preserving current mmap semantics as a
special case.

Obviously, I'm envisioning dest_fd will be a /proc/mm descriptor, but having
it be a normal file or anything else that supports mmap seems to make sense
and seems to be useful.

Maybe munmap, mprotect, and mlock can be similarly extended.

lseek, read, write, and close on this type of object all seem to make sense -
we'll need to look at the other ops.

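A minimal userspace sketch of the calling convention proposed above, not an
implementation. new_mmap does not exist in any kernel; the wrapper below is
hypothetical and models only the one case the message fully specifies, namely
that a dest_fd of -1 behaves exactly like today's mmap. Any other dest_fd
fails with ENOSYS here, because mapping into another mm_struct needs the
kernel support under discussion.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Hypothetical wrapper with the signature proposed in this thread.
 * dest_fd == -1 means "current->mm", so the call degenerates to plain mmap.
 * Everything else is unimplemented in this sketch and reports ENOSYS.
 */
void *new_mmap(void *start, size_t length, int prot, int flags,
               int src_fd, off_t offset, int dest_fd)
{
    if (dest_fd == -1)
        return mmap(start, length, prot, flags, src_fd, offset);

    errno = ENOSYS;     /* no kernel support for a foreign mm_struct */
    return MAP_FAILED;
}
```

Existing callers would keep their behaviour by passing -1 as the last
argument; a /proc/mm descriptor in that position is where the new semantics
would start.
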
A caution - when I asked Linus about this at the kernel summit last year, he
didn't have a problem with it, but said that we shouldn't try to manipulate
active address spaces because it was easy to deadlock. I didn't know what
he was talking about exactly and still don't, but this looks like something
we need to look into.

I don't know what a good interface to switch_mm would be. None of the file
ops seem to make sense, except maybe mmap with a new MAP_REPLACE flag.

> There's currently one subtle problem left: switching mm's subtly
> corrupts the userspace stack. I haven't been able to figure this out.

My first thought is that switch_mm isn't doing the cache flushing that you
need in order to avoid remnants of the old address space contaminating the
new one. However, this seems unlikely since you're switching between
identical address spaces.

Have you tried this under UML? If you can reproduce the corruption, it ought
to be easier to track down there.

And this reminds me, we need to think about whether to include a register
switch as part of the mm switch. If so, we need a way of associating a
register set with the mm. I think UML doesn't need this, but other users of
this mechanism conceivably could.

> (after write. notice the messed up backtrace.)

It's not badly messed up, it seems fairly sane. Do hex dumps of the stack
before and after agree? How about the register set? That's basically all of
the relevant state, so the problem would have to show up there, unless it's
some magic processor state that hasn't been switched or flushed.

                Jeff
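
One way to do the before/after comparison suggested above is sketched here.
The helper names hexdump and first_diff are made up for illustration, and how
the two snapshots of the stack get captured (for example, copying a window of
it into a static buffer before and after the write(2)) is an assumption left
to the caller.

```c
#include <stddef.h>
#include <stdio.h>

/* Print len bytes from buf, 16 per row, each row prefixed by its offset. */
void hexdump(FILE *out, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    for (size_t i = 0; i < len; i++) {
        if (i % 16 == 0)
            fprintf(out, "%s%08zx:", i ? "\n" : "", i);
        fprintf(out, " %02x", p[i]);
    }
    if (len)
        fputc('\n', out);
}

/* Return the offset of the first differing byte, or -1 if the snapshots agree. */
long first_diff(const void *before, const void *after, size_t len)
{
    const unsigned char *a = before;
    const unsigned char *b = after;

    for (size_t i = 0; i < len; i++)
        if (a[i] != b[i])
            return (long)i;
    return -1;
}
```

If first_diff reports an offset, dumping both snapshots around it with
hexdump shows which stack slots changed across the switch.
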
From: Michael R. <mc...@sa...> - 2002-01-12 20:57:00
-----BEGIN PGP SIGNED MESSAGE-----

>>>>> "Jeff" == Jeff Dike <jd...@ka...> writes:
    Jeff> A caution - when I asked Linus about this at the kernel summit last year, he
    Jeff> didn't have a problem with it, but said that we shouldn't try to manipulate
    Jeff> active address spaces because it was easy to deadlock. I didn't know what
    Jeff> he was talking about exactly and still don't, but this looks like something
    Jeff> we need to look into.

  The problem may be that you can't change the address space of a process
that is currently running. (Even if you know that your code is in a different
portion of the address space)

  It may be that one needs to put the change of address request into the proc
structure, but it won't become active until the next host context switch. The
soft mappings of each process need to get loaded into the MMU at some point.

    Jeff> And this reminds me, we need to think about whether to include a register
    Jeff> switch as part of the mm switch. If so, we need a way of associating a
    Jeff> register set with the mm. I think UML doesn't need this, but other users of
    Jeff> this mechanism conceivably could.

  partially protected threads might be useful for some Java VM or some such.
  I can also imagine an incremental garbage collection system where "oldspace"
and "newspace" are swapped in some fashion.

]       ON HUMILITY: to err is human. To moo, bovine.           |  firewalls  [
]   Michael Richardson, Sandelman Software Works, Ottawa, ON    |net architect[
] mc...@sa... http://www.sandelman.ottawa.on.ca/ |device driver[
] panic("Just another NetBSD/notebook using, kernel hacking, security guy");  [

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3ia
Charset: latin1
Comment: Finger me for keys

iQCVAwUBPEA48oqHRg3pndX9AQHrFAQA3Ki/GmDgG7VexxKUDDLFdnT+eW0CnwHZ
lFPhkBfHScWvZIaGlA5Ot3aO7J7EWxYBs5yldau0tkyWaKl4lVAsw0oFOi1dQPRq
q3I9PP12lSSofD7ztkwpYJySvYkVNH1zEsJjOu4/ifjaELvcrZ6Nx8Hw/0Wj4Dfm
B+BtcDsLcrc=
=2mBv
-----END PGP SIGNATURE-----
From: Jeff D. <jd...@ka...> - 2002-01-13 01:11:48
mc...@sa... said:
> The problem may be that you can't change the address space of a
> process that is currently running. (Even if you know that your code is
> in a different portion of the address space)

I can see races caused by that sort of situation that would need some
locking, but I don't see deadlocks.

> It may be that one needs to put the change of address request into
> the proc structure, but it won't become active until the next host
> context switch. The soft mappings of each process need to get loaded
> into the MMU at some point.

This is a problem now for a multithreaded process running on an SMP box, and
it's fixed by having one processor send a TLB flush request to another with
an IPI. This same mechanism is in the works for UML.

> partially protected threads might be useful for some Java VM or some
> such.
> I can also imagine an incremental garbage collection system where
> "oldspace" and "newspace" are swapped in some fashion.

Yeah, and I can imagine other uses for this sort of thing as well. It seems
generically useful.

                Jeff
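
The deferred switch suggested in the quoted text (record the request now,
apply it at the next context switch) can be modeled in a few lines. This is a
toy userspace model under stated assumptions, not kernel code: toy_mm,
toy_task, and both functions are invented for illustration, and the point
where a real kernel would reload the MMU is only marked with a comment.

```c
#include <stddef.h>

/* Toy stand-ins; these are not the kernel's structures. */
struct toy_mm {
    int id;
};

struct toy_task {
    struct toy_mm *mm;          /* address space currently in use */
    struct toy_mm *pending_mm;  /* requested replacement, or NULL */
};

/* Record the request only; the running address space is left untouched. */
void toy_request_mm_switch(struct toy_task *t, struct toy_mm *next)
{
    t->pending_mm = next;
}

/*
 * Called when the task is next scheduled in. Applies a pending request,
 * if any, and returns 1 when a switch happened, 0 otherwise.
 */
int toy_context_switch_in(struct toy_task *t)
{
    if (t->pending_mm == NULL)
        return 0;

    t->mm = t->pending_mm;      /* a real kernel would load MMU state here */
    t->pending_mm = NULL;
    return 1;
}
```

The model has no locking and a single CPU; the SMP case described in the
reply, where another processor must be told to flush, is not covered.
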