From: Young K. <you...@gm...> - 2005-09-27 15:48:55
|
Hi, i have a question about copy_from/to_user() implementation in skas mode. as my understanding, when copy_from/to_user() is invoked, before the address translation happens, UML kernel calls sigsetjmp() to come back when there is a segmentation fault. and if there is, it seems that the system call an application triggered eventually returns EFAULT. then, it seems to me that sigsetjmp() is to catch the error when the application gave the invalid user space address. my question is, if so, shouldn't the error be caught when UML kernel translates the user space address to the kernel space address? i mean, UML kernel must know the valid memory regions and if the address is out of the valid regions, then it knows the address is invalid before UML tries to access the address. why should it use sigsetjmp() and let a segfault occur? Thank you in advance, -Young |
From: Jeff D. <jd...@ad...> - 2005-09-27 18:12:49
|
On Tue, Sep 27, 2005 at 10:06:53AM -0400, Young Koh wrote: > my question is, if so, shouldn't the error be caught when UML kernel > translates the user space address to the kernel space address? i mean, > UML kernel must know the valid memory regions and if the address is > out of the valid regions, then it knows the address is invalid before > UML tries to access the address. why should it use sigsetjmp() and let > a segfault occur? Because the address may be fine, and an access may still cause a segfault. UML memory is backed by a file on the host. You can map anything from the file you want, but if you access it when the host filesystem is full or you've exceeded your disk quota, the access will segfault. Jeff |
From: Blaisorblade <bla...@ya...> - 2005-09-28 12:00:05
|
On Tuesday 27 September 2005 16:06, Young Koh wrote:
> Hi,
> i have a question about copy_from/to_user() implementation in skas mode.

OK, and here I'll also explain TT mode, since the two are reasonably similar, and TT mode is closer to i386.

> as my understanding,
> when copy_from/to_user() is invoked, before the address translation
> happens, UML kernel calls sigsetjmp() to come back when there is a
> segmentation fault. and if there is, it seems that the system call an
> application triggered eventually returns EFAULT.

Yes, copy_*_user returns a failure code and the calling code is supposed to check it and return EFAULT. That assumes the fault is a *real* fault, i.e. an unfixable one; otherwise we may simply need to call handle_page_fault() and load the page from swap.

> then, it seems to me
> that sigsetjmp() is to catch the error when the application gave the
> invalid user space address.

Exactly.

> my question is, if so, shouldn't the error be caught when UML kernel
> translates the user space address to the kernel space address? i mean,
> UML kernel must know the valid memory regions

Well, saying "regions" is a bit confusing - in fact, you are dealing with installed mappings (mmap()s done on the host), which are like page tables (populated on fault).

> and if the address is
> out of the valid regions, then it knows the address is invalid before
> UML tries to access the address.
> why should it use sigsetjmp() and let
> a segfault occur?

In general, because it is faster: you must optimize for the fast path, where a segfault won't occur and checking in advance is wasted work.

1) Background: TT mode and i386 implementation.

They can access the user address directly, so they do, and catch the error afterwards. Why? Because in the cases where we care about performance, the application passes a correct address. So it's better to optimize the fast path (correct address) than the slow path. The hardware walk of the page tables is faster than the software one (also thanks to TLBs, which are processor caches of page tables).

Whenever a fault occurs, the i386 exception handler (see search_exception_tables() and grep ".section __ex_table" include/asm-i386/*) and/or the TT mode fault catcher make copy_*_user return an error.

2) SKAS instead.

SKAS is like 4G/4G on the host (it is actually a 3G/3G). In SKAS mode, we actually walk the page tables, because we cannot access the host mapping - we're using a different mapping set.

In fact, what you see doesn't catch wrong user space addresses. It catches faulting kernelspace addresses. That is legal, because the i386 implementation catches any fault and doesn't make a distinction, and it does happen when you try to do things like "cat /dev/kmem" - you're trying to do copy_to_user(to, offset /* which is 0 */, size).

In fact, that sigsetjmp() was added back in 2.4.24-?um (IIRC) and then around ~2.6.7-um after Jeff and I analyzed this.

--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade |
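[Editor's sketch] The "access first, catch the fault afterwards" pattern discussed in this message can be illustrated in plain userspace C. This is only an illustration under stated assumptions, not UML's code: guarded_copy(), install_fault_catcher() and fault_handler() are hypothetical names, and the sketch assumes a POSIX host where touching an unmapped address raises SIGSEGV or SIGBUS. Like copy_*_user, it reports how many bytes it could not copy.

```c
#define _POSIX_C_SOURCE 200809L   /* for sigsetjmp() and sigaction() */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Jump target for the handler; only meaningful while 'catching' is set. */
static sigjmp_buf fault_env;
static volatile sig_atomic_t catching;

/* On a fault inside a guarded copy, unwind to the sigsetjmp() point. */
static void fault_handler(int sig)
{
    if (catching)
        siglongjmp(fault_env, 1);
    /* Not ours: restore the default action; the faulting instruction
     * re-executes on return and the process dies as usual. */
    signal(sig, SIG_DFL);
}

/* Install the handler for SIGSEGV and SIGBUS; returns 0 on success. */
static int install_fault_catcher(void)
{
    struct sigaction sa;

    sa.sa_handler = fault_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) < 0)
        return -1;
    return sigaction(SIGBUS, &sa, NULL);
}

/*
 * Copy n bytes with no up-front address check. Returns 0 on success,
 * or the number of bytes NOT copied if a fault interrupted the copy.
 */
static size_t guarded_copy(void *to, const void *from, size_t n)
{
    volatile size_t done = 0;          /* volatile: read after siglongjmp() */
    volatile unsigned char *d = to;
    const volatile unsigned char *s = from;

    if (sigsetjmp(fault_env, 1) == 0) {   /* 1: save the signal mask too */
        catching = 1;
        while (done < n) {
            d[done] = s[done];            /* may fault on either side */
            done++;
        }
    }
    catching = 0;
    assert(done <= n);
    return n - done;
}
```

A caller in the style of a system call would treat a non-zero return as the failure code and turn it into EFAULT. The fast path pays only for the sigsetjmp() call, not for a software check of every address.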
From: Young K. <you...@gm...> - 2005-09-28 14:22:54
|
so, as my understanding, sigsetjmp() is used for returning an error when there is a userspace and/or kernelspace address faulting in both skas and tt modes. and i386 implementation works the same way, i guess.

my one quick question is (it could sound stupid, but) that why there may be a kernelspace faulting? kernel must correct and shouldn't access bad address, i guess, and if so, shouldn't it be a kernel panic?

> In fact, what you see doesn't catch user space wrong addresses.
>
> It catches kernelspace faulting addresses - which is legal to happen, because
> i386 implementation catches any fault, and doesn't make a distinction, and
> which happens, when you try to do things like "cat /dev/kmem" - you're trying
> to do copy_to_user(to, offset /* which is 0 */, size).
>
> In fact, that sigsetjmp() was added back in 2.4.24-?um (IIRC) and then around
> ~2.6.7-um after I and Jeff analyzed this.

i'm using 2.4.26 and 2.6.12 and i think both versions include sigsetjmp().

Thank you,
-Young |
From: Blaisorblade <bla...@ya...> - 2005-09-28 16:45:10
|
On Wednesday 28 September 2005 16:22, Young Koh wrote:
> so, as my understanding, sigsetjmp() is used for returning an error
> when there is a userspace and/or kernelspace address faulting in both
> skas and tt modes. and i386 implementation works the same way, i
> guess.
> my one quick question is (it could sound stupid,

Not at all.

> but) that why there
> may be a kernelspace faulting? kernel must correct and shouldn't
> access bad address, i guess, and if so, shouldn't it be a kernel
> panic?

cat /dev/kmem, as I already said (I won't repeat the whole story here).

Yes, yes, yes, the kmem driver could check the address manually, but (same story as the rest):
*) checking by hand is slower
*) it's not needed, because it works on i386 and other archs conform to i386. |
From: Blaisorblade <bla...@ya...> - 2005-09-28 12:01:15
|
On Tuesday 27 September 2005 19:28, Jeff Dike wrote: > On Tue, Sep 27, 2005 at 10:06:53AM -0400, Young Koh wrote: > > my question is, if so, shouldn't the error be caught when UML kernel > > translates the user space address to the kernel space address? i mean, > > UML kernel must know the valid memory regions and if the address is > > out of the valid regions, then it knows the address is invalid before > > UML tries to access the address. why should it use sigsetjmp() and let > > a segfault occur? > > Because the address may be fine, and an access may still cause a segfault. > > UML memory is backed by a file on the host. You can map anything from > the file you want, but if you access it when the host filesystem is full > or you've exceeded your disk quota, the access will segfault. That wasn't the original reason - this is fine too, but as I explained in the other mail, cat /dev/kmem will cause a copy_to_user() with invalid kernel ("from") address. I remember because I discussed this with you at length. |
From: Young K. <you...@gm...> - 2005-09-28 13:47:50
|
Thank you for your reply, but still have one more.
(i think i forgot to reply to the mailing list with the previous email, so, i'm attaching the text)

On 9/28/05, Jeff Dike <jd...@ad...> wrote:
> On Tue, Sep 27, 2005 at 08:56:51PM -0400, Young Koh wrote:
> > 1) if the address is fine, shouldn't the access that causes a segfault
> > be regarded as a page fault? that is, shouldn't it be handled by UML
> > kernel and UML proceeds normally instead of returning an error to the
> > application? (because the app gave the proper address)
>
> No.
>
> It's akin to a piece of memory all of a sudden going bad. You allocated the
> memory and thought you could use it, but when you try, it turns out not
> to be there.

yes, i think memory can go bad if 1) it cannot be allocated because of filesystem full or similar reasons as Jeff described, or 2) it was allocated once but could have been swapped.

in case of 1) it should be more like kernel panic rather than just a system call error, i think? because kernel cannot allocate any more memory, which kernel is supposed to use.

in case of 2) this is a real fault, so, the seg fault handler of UML kernel is supposed to load the swapped page and UML kernel proceeds normally? (as Blaisorblade described in the other mail, by calling handle_page_fault()?)

Thank you,
-Young

> > 2) i thought that the file used for UML memory is created when a UML
> > process is initialized.
> > then, the memory file is not created for the fixed size at first, but
> > it changes the size according to the UML memory usage?
>
> It's not fully allocated. It starts off sparse and gets allocated on
> the host as the guest's memory usage increases.
>
> > 3) is it the only case sigsetjmp() protected from?
>
> I believe so, but am not positive.
>
> Jeff
> |
From: Jeff D. <jd...@ad...> - 2005-09-28 17:13:15
|
On Wed, Sep 28, 2005 at 01:59:48PM +0200, Blaisorblade wrote: > That wasn't the original reason - this is fine too, but as I explained in the > other mail, cat /dev/kmem will cause a copy_to_user() with invalid kernel > ("from") address. I remember because I discussed this with you at length. Oh yeah. I was thinking there was a different (and better) reason, but I couldn't remember what it was. Also, my reason isn't that good anyway. I had a different fix a while ago, but it got lost somewhere. I added an arch hook to get_free_pages which touched each allocated page under the cover of a setjmp. If a page couldn't be allocated on the host, then it is put on a "bad pages" list, and another page is allocated instead. Jeff |
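[Editor's sketch] The fix described in this message (touch each freshly allocated page under the cover of a setjmp, park pages that fault on a "bad pages" list, and try another page) might look roughly like the following in userspace C. It is a sketch under assumptions, not the lost patch: page_is_usable(), first_good_page(), bad_pages and the candidate array are hypothetical, and the array merely stands in for whatever the allocator would hand back next.

```c
#define _POSIX_C_SOURCE 200809L   /* for sigsetjmp() and sigaction() */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

#define MAX_BAD 16                 /* fixed-size list, enough for a sketch */

static sigjmp_buf probe_env;
static volatile sig_atomic_t probing;
static void *bad_pages[MAX_BAD];   /* never freed, never handed out again */
static size_t nr_bad;

/* A fault while probing means the host could not back the page. */
static void probe_fault(int sig)
{
    if (probing)
        siglongjmp(probe_env, 1);
    signal(sig, SIG_DFL);          /* unrelated fault: die as usual */
}

/* Install the handler for SIGSEGV and SIGBUS; returns 0 on success. */
static int install_probe_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = probe_fault;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) < 0)
        return -1;
    return sigaction(SIGBUS, &sa, NULL);
}

/* Returns 1 if writing to the page works, 0 if the write faulted. */
static int page_is_usable(void *page)
{
    volatile unsigned char *p = page;
    volatile int ok = 0;           /* volatile: read after siglongjmp() */

    if (sigsetjmp(probe_env, 1) == 0) {
        probing = 1;
        *p = 0;                    /* dirtying forces the host to allocate */
        ok = 1;
    }
    probing = 0;
    return ok;
}

/* Return the first candidate that survives a probe; record the rest. */
static void *first_good_page(void **candidates, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (page_is_usable(candidates[i]))
            return candidates[i];
        assert(nr_bad < MAX_BAD);
        bad_pages[nr_bad++] = candidates[i];
    }
    return NULL;                   /* every candidate faulted */
}
```

The point of probing at allocation time is that a failure shows up in one known place, instead of at some arbitrary later access to the page.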
From: Blaisorblade <bla...@ya...> - 2005-09-28 17:28:41
|
On Wednesday 28 September 2005 18:09, Jeff Dike wrote: > On Wed, Sep 28, 2005 at 01:59:48PM +0200, Blaisorblade wrote: > > That wasn't the original reason - this is fine too, but as I explained in > > the other mail, cat /dev/kmem will cause a copy_to_user() with invalid > > kernel ("from") address. I remember because I discussed this with you at > > length. > Oh yeah. I was thinking there was a different (and better) reason, but I > couldn't remember what it was. > Also, my reason isn't that good anyway. I had a different fix a while ago, > but it got lost somewhere. I added an arch hook to get_free_pages which > touched each allocated page under the cover of a setjmp. If a page > couldn't be allocated on the host, then it is put on a "bad pages" list, > and another page is allocated instead. I'm not sure that would help anyway - if the host memory is full, it's full. It's just matter of waiting and retrying. I don't think the host would SIGBUS again on the same page specifically - or better, I'm almost sure this is not done. So, I don't see the reason for that. Catching SIGSEGV/SIGBUS is ok, taking another page is bad. |
From: Jeff D. <jd...@ad...> - 2005-09-28 19:12:52
|
On Wed, Sep 28, 2005 at 07:26:43PM +0200, Blaisorblade wrote: > I'm not sure that would help anyway - if the host memory is full, it's full. > It's just matter of waiting and retrying. This isn't a matter of waiting and retrying. We use pages that are already known to be good, and the rest go on a "bad pages" list and not freed, so no one will try to use them again. > I don't think the host would SIGBUS again on the same page specifically - or > better, I'm almost sure this is not done. > So, I don't see the reason for that. Catching SIGSEGV/SIGBUS is ok, taking > another page is bad. Catching SIGSEGV/SIGBUS at some random place in the kernel after it kmalloced a bad page is OK? I don't think so. The only way to keep the kernel running at that point is to get another page and hope that it's OK. And if it's not, you try again. Jeff |
From: Jeff D. <jd...@ad...> - 2005-09-28 16:12:59
|
On Wed, Sep 28, 2005 at 09:47:41AM -0400, Young Koh wrote: > yes, i think memory can go bad if 1) it cannot be allocated because of > filesystem full or similar reasons as Jeff described, or 2) it was > allocated once but could have been swapped. > > in case of 1) it should be more like kernel panic rather than just a > system call error, i think? because kernel cannot allocate any more > memory, which kernel is supposed to use. No, just that page is bad. Another page could have been dirtied and thus allocated on the host, and it would be usable. So, it's not a fatal problem. > in case of 2) this is a real fault, so, the seg fault handler of UML > kernel is supposed to load the swapped page and UML kernel proceeds > normally? (as Blaisorblade described in the other mail, by calling > handle_page_fault()?) Yes, this happens in the call from maybe_map. Jeff |
From: Young K. <you...@gm...> - 2005-09-28 19:50:29
|
hi,

On 9/28/05, Jeff Dike <jd...@ad...> wrote:
> On Wed, Sep 28, 2005 at 09:47:41AM -0400, Young Koh wrote:
> > yes, i think memory can go bad if 1) it cannot be allocated because of
> > filesystem full or similar reasons as Jeff described, or 2) it was
> > allocated once but could have been swapped.
> >
> > in case of 1) it should be more like kernel panic rather than just a
> > system call error, i think? because kernel cannot allocate any more
> > memory, which kernel is supposed to use.
>
> No, just that page is bad. Another page could have been dirtied and thus
> allocated on the host, and it would be usable. So, it's not a fatal problem.
>

then, if just that page is bad, shouldn't UML kernel wait until another page is usable(or force another page to be swapped out) and allocate the free page? and proceed normal? i may be still confused.

Ok, my thought/idea/suggestion is that what if UML uses a TLB-like table before it does the address translation? i mean, once there is a valid mapping, UML inserts the address mapping(or page mapping) into a software TLB. after that, for that userspace address, UML can search the software TLB table and use the mapping without calling sigsetjmp() and walking through page tables. it seems that sigsetjmp() has relatively large overhead, we could reduce some overhead by not calling it.

but surely the problem is the mapping can go corrupted. for that, UML may invalidate a TLB entry if the corresponding page is swapped out or any change is made. what do you think?

thank you, |
From: Blaisorblade <bla...@ya...> - 2005-09-29 19:22:12
|
On Wednesday 28 September 2005 21:25, Young Koh wrote: > hi, > > On 9/28/05, Jeff Dike <jd...@ad...> wrote: > > On Wed, Sep 28, 2005 at 09:47:41AM -0400, Young Koh wrote: > > No, just that page is bad. Again, that page is not bad. There is no page yet for this address, and the host won't allocate one for now. > > Another page could have been dirtied and thus > > allocated on the host, and it would be usable. So, it's not a fatal > > problem. Ok, this makes a bit of sense, even if IMHO it doesn't work, I now see your point (but I still insist with what said above). However, even a dirtied page could be "bad", if it has been swapped. If we're getting a SIGBUS, it meant that it didn't succeed in freeing any memory. And, frankly, unless the UML ram file is kept on ramfs (which is RAM-only), it can be swapped (both for disk-based filesystem and for tmpfs). So, I don't think what you suggest could work. > then, if just that page is bad, shouldn't UML kernel wait until > another page is usable (or force another page to be swapped out) We're talking about the host, and if the host is on OOM, you can't help it. The best you can do is to reclaim cache memory. But that must be done when the host is starting to swap, not at SIGBUS time. > and > allocate the free page? and proceed normal? i may be still confused. Jeff's point is that once we have dirtied a page, and the host hasn't yet swapped it, it could be in memory - and accessing it would work. Having dirtied it means we allocated it with (say) kmalloc and then dirtied it. > Ok, my thought/idea/suggestion is that what if UML uses a TLB-like > table before it does the address translation? > i mean, once there is a > valid mapping, UML inserts the address mapping(or page mapping) into a > software TLB. after that, for that userspace address, UML can search > the software TLB table and use the mapping without calling sigsetjmp() > and walking through page tables. 
> it seems that sigsetjmp() has > relatively large overhead, we could reduce some overhead by not > calling it. How do you measure it? I'm curious myself - I know there's the possibility to use gprof, but I've never used that myself. Surely the "sig" thing is heavy (some syscalls like sigprocmask() for blocking/unblocking signals). Or better, it's the only heavy thing - the rest consists only of saving a couple of registers (6, IIRC) in memory, there's no interest in optimizing it away (I assume). But that part will go away - there's a "softints" patch for this (i.e. moving the blocking/unblocking to userspace - the signal handler notices the signal is stopped and queues the handling via userspace mechanisms). It's at: http://user-mode-linux.sourceforge.net/patches.html > but surely the problem is the mapping can go corrupted. > for that, UML may invalidate a TLB entry if the corresponding page is > swapped out or any change is made. what do you think? In short, it's a really interesting idea. The kernel (arch-independent) infrastructure for this exists, for managing the real TLBs. You already need to invalidate TLB entries when you swap a page and such. Reading Documentation/cachetlb.txt is definitely worth the time spent. And actually, currently that is used to update the host mappings. Using TLBs to save the page table walk is interesting, especially since that would avoid taking a spinlock on SMP (the current implementation of maybe_map() doesn't, but it should), and more important because the TLBs would likely be hotter than all the page tables, so it would probably fit in the L2 cache (while, to walk page tables, we're likely going to have a L2 miss - they're too big). Hotter means "more likely to be accessed, and thus more worth to keep in cache". The cache usage discussion in Reiser4 whitepaper (www.namesys.com) is really enlightening on this point. 
I don't know if we can optimize the locking on the TLBs, though. We could maybe use atomic ops, or have per-processor TLBs, which is the way it's implemented in hardware (you get IPIs on flush, though). On i386, atomic_read and atomic_set have no additional cost over their non-atomic counterparts, so maybe shared TLBs are OK; they'd need to be tagged, however.

I've not yet thought about an efficient data structure. An array means that invalidation checks each entry, and I'd like to avoid that. The other way is to empty the whole TLB when flushing a single entry.

The only problem is when there is a fault on the kernelspace address. However, if setjmp() is still costly, we may implement some checking: only kernelspace addresses above TASK_SIZE are valid (or something of this sort). |
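[Editor's sketch] One possible shape for the software TLB discussed in this message is a direct-mapped table indexed by the low bits of the virtual page number, so that invalidating one page inspects a single slot instead of scanning an array. Everything below is hypothetical (struct soft_tlb, the tlb_* functions, TLB_SLOTS, the 4 KiB page size), and the sketch leaves out the locking and the address-space tag that a shared TLB would need.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT 12                               /* assumes 4 KiB pages */
#define OFFSET_MASK (((uintptr_t)1 << PAGE_SHIFT) - 1)
#define TLB_SLOTS 64u                               /* must be a power of two */

struct soft_tlb_entry {
    uintptr_t vpn;      /* virtual page number, used as the tag */
    uintptr_t kpage;    /* page-aligned kernel address it maps to */
    int valid;
};

struct soft_tlb {
    struct soft_tlb_entry slot[TLB_SLOTS];
};

/* Each virtual page can live in exactly one slot. */
static struct soft_tlb_entry *tlb_slot(struct soft_tlb *t, uintptr_t vpn)
{
    return &t->slot[vpn & (TLB_SLOTS - 1)];
}

/* Drop every cached translation. */
static void tlb_flush_all(struct soft_tlb *t)
{
    memset(t, 0, sizeof *t);
}

/* Cache a translation after a successful page table walk. */
static void tlb_insert(struct soft_tlb *t, uintptr_t uaddr, uintptr_t kpage)
{
    uintptr_t vpn = uaddr >> PAGE_SHIFT;
    struct soft_tlb_entry *e = tlb_slot(t, vpn);

    assert((kpage & OFFSET_MASK) == 0);
    e->vpn = vpn;
    e->kpage = kpage;
    e->valid = 1;       /* silently evicts whatever shared the slot */
}

/* Returns the translated address, or 0 on a miss (walk the page tables). */
static uintptr_t tlb_lookup(struct soft_tlb *t, uintptr_t uaddr)
{
    uintptr_t vpn = uaddr >> PAGE_SHIFT;
    struct soft_tlb_entry *e = tlb_slot(t, vpn);

    if (!e->valid || e->vpn != vpn)
        return 0;
    return e->kpage | (uaddr & OFFSET_MASK);
}

/* Invalidate one page, e.g. when it is swapped out: one slot, no scan. */
static void tlb_flush_page(struct soft_tlb *t, uintptr_t uaddr)
{
    uintptr_t vpn = uaddr >> PAGE_SHIFT;
    struct soft_tlb_entry *e = tlb_slot(t, vpn);

    if (e->valid && e->vpn == vpn)
        e->valid = 0;
}
```

The trade-off is conflict misses: two hot pages whose page numbers share the low bits keep evicting each other, which a set-associative or hashed layout would soften at the cost of a slightly longer lookup.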
From: Jeff D. <jd...@ad...> - 2005-10-02 22:33:35
|
On Thu, Sep 29, 2005 at 02:09:27PM +0200, Blaisorblade wrote: > Again, that page is not bad. There is no page yet for this address, and the > host won't allocate one for now. It is bad in the sense that, unless some space is freed on that mount, a reference to the page will always fault. > Ok, this makes a bit of sense, even if IMHO it doesn't work, I now see your > point (but I still insist with what said above). Explain why it doesn't work. > However, even a dirtied page could be "bad", if it has been swapped. If > we're getting a SIGBUS, it meant that it didn't succeed in freeing any > memory. No it can't. A swapped page still counts as occupying space in the filesystem. If a page was successfully allocated, then accesses to it will always succeed, even if it needs to be swapped in. > And, frankly, unless the UML ram file is kept on ramfs (which is RAM-only), > it can be swapped (both for disk-based filesystem and for tmpfs). > So, I don't think what you suggest could work. Swapping makes no difference. Jeff |
From: Young K. <you...@gm...> - 2005-09-30 15:36:31
|
> > it seems that sigsetjmp() has > > relatively large overhead, we could reduce some overhead by not > > calling it. > > How do you measure it? I'm curious myself - I know there's the possibilit= y to > use gprof, but I've never used that myself. i usually use pentium rdtsc(read time stamp counter) to measure the timing and latency. (sometimes by instrumenting kernel or sometimes by measuring test programs) i ran a test program and measured the overhead of sigsetjmp() and setjmp(). it showed sigsetjmp() uses around 1350 cycles (which is around 0.45 us in 3.0GHz machine), and setjmp() only 21 cycles (< 0.01us). maybe while sigsetjmp() is implemented as a system call to cope with signal blocking/unblocking, setjmp() is not a system call? (getpid() itself takes more than 1000 cycles) thanks, > > Surely the "sig" thing is heavy (some syscalls like sigprocmask() for > blocking/unblocking signals). Or better, it's the only heavy thing - the = rest > consists only of saving a couple of registers (6, IIRC) in memory, there'= s no > interest in optimizing it away (I assume). > |
From: Geert U. <ge...@li...> - 2005-09-30 15:45:04
|
On Fri, 30 Sep 2005, Young Koh wrote: > > > it seems that sigsetjmp() has > > > relatively large overhead, we could reduce some overhead by not > > > calling it. > > > > How do you measure it? I'm curious myself - I know there's the possibility to > > use gprof, but I've never used that myself. > > i usually use pentium rdtsc(read time stamp counter) to measure the > timing and latency. (sometimes by instrumenting kernel or sometimes by > measuring test programs) i ran a test program and measured the > overhead of sigsetjmp() and setjmp(). it showed sigsetjmp() uses > around 1350 cycles (which is around 0.45 us in 3.0GHz machine), and > setjmp() only 21 cycles (< 0.01us). maybe while sigsetjmp() is > implemented as a system call to cope with signal blocking/unblocking, > setjmp() is not a system call? (getpid() itself takes more than 1000 > cycles) Indeed, setjmp() is not a system call. It just saves the registers to the passed env structure. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@li... In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds |
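[Editor's sketch] The measurement discussed here can be approximated with a small timing loop. The sketch below uses clock_gettime() as a portable stand-in for rdtsc and compares sigsetjmp(env, 0), which only saves registers, with sigsetjmp(env, 1), which also saves the signal mask (on glibc, as far as the editor knows, that involves a sigprocmask-style system call). now_ns() and avg_sigsetjmp_ns() are hypothetical helpers, and the numbers will differ from the cycle counts quoted in the thread.

```c
#define _POSIX_C_SOURCE 200809L   /* for sigsetjmp() and clock_gettime() */
#include <assert.h>
#include <setjmp.h>
#include <time.h>

/* Monotonic clock reading in nanoseconds. */
static double now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e9 + ts.tv_nsec;
}

/*
 * Average cost in nanoseconds of one sigsetjmp(env, savemask) call.
 * savemask == 0 only saves registers; savemask != 0 also saves the
 * signal mask.
 */
static double avg_sigsetjmp_ns(int savemask, long iters)
{
    sigjmp_buf env;
    volatile double start;   /* volatile: locals live across sigsetjmp() */

    assert(iters > 0);
    start = now_ns();
    for (volatile long i = 0; i < iters; i++) {
        if (sigsetjmp(env, savemask) != 0)
            return -1.0;     /* not reached: nothing ever jumps back */
    }
    return (now_ns() - start) / (double)iters;
}
```

Calling avg_sigsetjmp_ns(0, 1000000) and avg_sigsetjmp_ns(1, 1000000) and comparing the two averages gives a rough idea of how much of the cost comes from saving the signal mask rather than from saving registers; the loop overhead is included in both figures.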
From: Blaisorblade <bla...@ya...> - 2005-10-02 23:46:47
|
On Sunday 02 October 2005 03:03, Jeff Dike wrote:
> On Thu, Sep 29, 2005 at 02:09:27PM +0200, Blaisorblade wrote:
> > Again, that page is not bad. There is no page yet for this address, and
> > the host won't allocate one for now.

> It is bad in the sense that, unless some space is freed on that mount, a
> reference to the page will always fault.

Sorry, any reference will fault, unless it is done on a allocated present page, which the UML kernel freed but the host didn't. And remember, btw, you've planned to make this impossible...

> > Ok, this makes a bit of sense, even if IMHO it doesn't work, I now see
> > your point (but I still insist with what said above).

> Explain why it doesn't work.

Below.

> > However, even a dirtied page could be "bad", if it has been swapped. If
> > we're getting a SIGBUS, it meant that it didn't succeed in freeing any
> > memory.

> No it can't. A swapped page still counts as occupying space in the
> filesystem. If a page was successfully allocated, then accesses to it will
> always succeed, even if it needs to be swapped in.

Sorry, Jeff, which page are you going to evict? It can be a dirty page. Unless you mean that since that page is still accounted in the FS, Linux will leave a RAM page free to allow it to be re-read, while still swapping the page.

You obviously didn't mean this absurdity (why swap it in the first place), but I don't catch what's missing to you.

> > And, frankly, unless the UML ram file is kept on ramfs (which is
> > RAM-only), it can be swapped (both for disk-based filesystem and for
> > tmpfs). So, I don't think what you suggest could work.

> Swapping makes no difference.

Reloading pages means freeing RAM to make room for them. |
From: Jeff D. <jd...@ad...> - 2005-10-02 18:54:19
|
On Sun, Oct 02, 2005 at 12:23:14PM +0200, Blaisorblade wrote:
> Sorry, any reference will fault, unless it is done on a allocated present
> page, which the UML kernel freed but the host didn't. And remember, btw,
> you've planned to make this impossible...

You are not making any sense to me here.

> Sorry, Jeff, which page are you going to evict? It can be a dirty page. Unless
> you mean that since that page is still accounted in the FS, Linux will leave
> a RAM page free to allow it to be re-read, while still swapping the page.

Swapped pages are accounted in the FS all the time and there's obviously no dedicated page left free for them when they are next pulled in. All disk-based filesystems account pages that are on disk and not in memory. tmpfs is no different, except that its disk is the swap partition.

Jeff |
From: Blaisorblade <bla...@ya...> - 2005-10-03 18:48:37
|
On Sunday 02 October 2005 20:31, Jeff Dike wrote: > On Sun, Oct 02, 2005 at 12:23:14PM +0200, Blaisorblade wrote: > > Sorry, any reference will fault, unless it is done on a allocated present > > page, which the UML kernel freed but the host didn't. And remember, btw, > > you've planned to make this impossible... > You are not making any sense t me here. /dev/anon - when the UML kernel frees a page, we ask the host to free it too. > > Sorry, Jeff, which page are you going to evict? It can be a dirty page. > > Unless you mean that since that page is still accounted in the FS, Linux > > will leave a RAM page free to allow it to be re-read, while still > > swapping the page. > Swapped pages are accounted in the FS all the time and there's obviously > no dedicated page left free for them when they are next pulled in. All > disk-based filesystems account pages that are on disk and not in memory. > tmpfs is no different, except that its disk is the swap partition. Exactly what I knew... However, I was in error... I just saw that filling the disk, or tmpfs, is rather different than going OOM. OOM causes SIGKILL, while SIGBUS (as you correctly said) comes from full disk/partition, or filled disk quota. I only checked tmpfs, but that gives at least a feeling. |
From: Jeff D. <jd...@ad...> - 2005-10-03 21:17:31
|
On Mon, Oct 03, 2005 at 08:35:54PM +0200, Blaisorblade wrote: > /dev/anon - when the UML kernel frees a page, we ask the host to free it too. Yeah, /dev/anon is a completely different story, but I thought we were talking about normal tmpfs. Jeff |