From: Jeff D. <jd...@ad...> - 2004-08-19 01:59:22
|
I've released a second 2.6.7 UML patch. This is to push out the changes I
have in order to give me a clean slate for the 2.6.8.1 UML.

These changes sync up my 2.4 and 2.6 trees, and include

	Build cleanups, including 'linux' is now the default target, an
	updated defconfig, and Kconfig updates

	Code cleanup, including removal of unused SMP code, removal of a
	userspace file, more EINTR handling, more error checking

	Time fixes, including switching from rdtsc to gettimeofday for the
	real-time clock option and better handling of the host clock jumping
	backwards

	Introduction of centralized file descriptor management, allowing
	descriptors to be reclaimed if UML has hit the host's limit and the
	files can be reopened to provide access to the same object on the host

	Introduction of externfs, which allows host resources to be mounted
	as UML filesystems. externfs handles the kernel interface and
	plug-ins to it handle the host resources. There are two users,
	hostfs and humfs, both of which provide access to host directory
	hierarchies. humfs stores metadata separately, ala umsdos, which
	allows operations within the mount which would require root
	privileges on the host with hostfs. hostfs and humfs are still
	somewhat dodgy on 2.6.

	UML can load at 0x8048000 like every other binary when MODE_SKAS is
	enabled and MODE_TT is off. This makes valgrind happier, and also
	allows UML to have more physical memory without resorting to highmem.
	With STATIC_LINK disabled, the limit is ~750M, and with it enabled,
	the limit is ~2.75G.

	A good number of bugs, including some crashes were also fixed.
For the full details, see
	http://user-mode-linux.sourceforge.net/changelog-uml-patch-2.6.7-2.bz2.html

For the other UML mirrors and other downloads, see
	http://user-mode-linux.sourceforge.net/dl-sf.html

Incremental patches between UML releases are available at
	http://user-mode-linux.sourceforge.net/patches.html

Other links of interest:

	The UML project home page : http://user-mode-linux.sourceforge.net
	The UML Community site : http://usermodelinux.org

				Jeff
|
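The announcement above mentions better handling of the host clock jumping backwards but does not say how it is done. As a rough sketch only, and not the code from the patch, one common approach is to remember the largest time value handed out so far and refuse to report anything smaller. The name clamp_forward and its signature are made up for this illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the UML patch: pass through the host
 * time (in microseconds), but never let the reported value run backwards.
 * `last` holds the largest value handed out so far.
 */
int64_t clamp_forward(int64_t host_usecs, int64_t *last)
{
	if (host_usecs < *last)
		return *last;	/* host clock stepped back: hold the old value */

	*last = host_usecs;	/* normal case: time advanced (or stood still) */
	return host_usecs;
}
```

A caller would feed it a value built from gettimeofday(), for example tv_sec * 1000000 + tv_usec, and keep `last` in a variable that lives as long as the clock does.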
From: Werner A. <wa...@al...> - 2004-08-27 07:11:31
|
Jeff Dike wrote:
> hostfs and humfs are still somewhat dodgy on 2.6.

Opening files for writing even if we only want to read them causes
a number of problems: first, if I fire up a UML system that shares
root with the host, and I'm root, it will fail to load things like
bash, because it can't open it for writing.

This trivial patch works around that:

--- linux-2.6.7/fs/hostfs/host_fs.c.orig	Fri Aug 27 03:55:48 2004
+++ linux-2.6.7/fs/hostfs/host_fs.c	Fri Aug 27 03:56:17 2004
@@ -175,7 +175,7 @@
 	if(err == -EISDIR)
 		goto out;
 
-	if(err == -EACCES)
+	if(err == -EACCES || err == -ETXTBSY)
 		err = host_open_file(path, 1, 0, &hf->fh);
 
 	if(err)

Unfortunately, the problem also happens in the opposite direction.
E.g. if the UML kernel has opened an executable for writing, that
executable is no longer executable on the host.

Open file caching makes this even worse. In fact, it may be a bad
idea to cache a file open for writing at all, if an execute bit
gets set, or if it is already set.

A possible, but IMHO not very nice, work-around would be to run as
a user who isn't allowed to write to anything the host might want
to execute.

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa...@al... /
/_http://www.almesberger.net/____________________________________________/
|
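The workaround in the patch above boils down to "try read-write first, and retry read-only if the host refuses with EACCES or ETXTBSY". A minimal userspace sketch of that pattern follows. It is not the hostfs code: open_rw_or_ro and its writable out-parameter are invented for this example, and it only assumes the standard open(2) call.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper (not from hostfs): open `path` read-write if the
 * host allows it, otherwise fall back to read-only. Returns a file
 * descriptor, or a negative errno value on failure. `*writable` is only
 * set when the open succeeds.
 */
int open_rw_or_ro(const char *path, int *writable)
{
	int fd = open(path, O_RDWR);

	if (fd >= 0) {
		*writable = 1;
		return fd;
	}

	/* Only these two errors are worth a read-only retry. */
	if (errno != EACCES && errno != ETXTBSY)
		return -errno;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	*writable = 0;
	return fd;
}
```

Note that this only covers the first half of the problem described above. It does nothing about the opposite direction, where a file held open for writing by UML cannot be executed on the host.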
From: BlaisorBlade <bla...@ya...> - 2004-09-05 19:42:40
Attachments:
uml-hostfs-fix-maj-min.patch
|
On Friday 27 August 2004 09:10, Werner Almesberger wrote:
> Jeff Dike wrote:
> > hostfs and humfs are still somewhat dodgy on 2.6.
>
> Opening files for writing even if we only want to read them causes
> a number of problems:

Well, that is simply not needed. The -ETXTBSY check is not a fix, but a
workaround. Good old hostfs didn't do this.

Btw, I'm experiencing two more problems with 2.6.8.1:

- ls /mnt/host/dev/mapper/control returns EPERM errors when trying to stat
files; on the host, as the same user, or with 2.6.7-1, this does not happen.
Why? It seems like hostfs opens files even just to stat them. Or maybe it
implements a wrong permission check.

In fact, I'm not able to see the open with strace (don't ask me why - I
attach to the kernel thread, but I don't see the file being opened; I got it
only once, maybe), but externfs_lookup calls init_inode, which calls
host_open_file! That's simply brain-damaged! My guest searches for binaries
on the host, and as a result I get file descriptors 0-1023 opened by UML!
I'm not joking!

- Also (maybe related to calling iget(..., 0)) I get this message on every
unmount:

VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...

which also seems to mean that files are not closed when unmounting hostfs!

- Finally, for some reason, the dev and rdev fields returned by stat are
screwed up; when listing a device node on hostfs, it always prints the major
and minor of the device containing the filesystem; i.e., file->rdev =
host_stat->dev instead of rdev. And I'm not able to see where the exchange
happens. My old fix for the same problem, on the other hand, always worked
flawlessly. I'm attaching it - it's for the old hostfs, but maybe it's better
anyway.

Bye
-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
|
From: Jeff D. <jd...@ad...> - 2004-09-08 22:12:33
|
bla...@ya... said:
> In fact, I'm not able to see the opening with strace (don't ask me why
> - I attach to the kernel thread, but I don't get the opening of the
> file; I got it only once, maybe ), but externfs_lookup calls
> init_inode which calls host_open_file! That's simply brain-damaged!

Maybe, but it's mighty useful in testing the file descriptor reclaiming code.

> - Also (maybe related with calling iget(..., 0) ) I get this message
> on every unmount:

The iget(..., 0) thing has been fixed.

> VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a
> nice day...

I'm not seeing that any more.

> -Finally, for some reasons, the dev and rdev field returned by stat
> are screwed; when listing a device node on hostfs, it prints always
> the maj and min of the device containing the filesystem; i.e., file->
> rdev = host_stat -> dev instead of rdev. And I'm not able to see
> where the exchange happens.

It's here -

copy_stat:
	.ust_major = MAJOR(src->st_dev), /* device */
	.ust_minor = MINOR(src->st_dev),

do_stat_file:
	.rdev = MKDEV(buf.ust_major, buf.ust_minor),

Now that I have externfs_inode_info, either it or uml_stat looks redundant.
One more thing to clean up...

				Jeff
|
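The mix-up pointed out above is between st_dev (the device holding the filesystem the file lives on) and st_rdev (the device a device node refers to). A small userspace sketch of the difference follows. It assumes Linux with glibc's major() and minor() macros from <sys/sysmacros.h>, and the struct and function names are invented for this example rather than taken from UML.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Hypothetical container for both pairs of numbers. */
struct dev_numbers {
	unsigned int fs_major, fs_minor;	/* from st_dev: containing filesystem */
	unsigned int node_major, node_minor;	/* from st_rdev: the device node itself */
};

/*
 * Fill `out` from stat(2). Returns 0 on success or a negative errno.
 * For a device node, ls -l shows the st_rdev pair; reporting the st_dev
 * pair instead is the symptom described in the thread.
 */
int read_dev_numbers(const char *path, struct dev_numbers *out)
{
	struct stat st;

	if (stat(path, &st) < 0)
		return -errno;

	out->fs_major = major(st.st_dev);
	out->fs_minor = minor(st.st_dev);
	out->node_major = major(st.st_rdev);
	out->node_minor = minor(st.st_rdev);
	return 0;
}
```

On a Linux host, calling this on /dev/null should give 1 and 3 for the node pair, while the filesystem pair depends on where /dev is mounted.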
From: BlaisorBlade <bla...@ya...> - 2004-09-05 19:42:29
|
On Thursday 19 August 2004 at 05:00, Jeff Dike wrote:
> I've released a second 2.6.7 UML patch. This is to push out the changes I
> have in order to give me a clean slate for the 2.6.8.1 UML.

About the patch (and even the 2.6.8.1-1 one), there are two problems:

* First, please do a "make clean" before releasing the patch. There are some
binaries included in it! And also semaphore.c, which is normally a symlink.

* Second, why do you disable module support when compiling it, or otherwise
how did you manage to build it? Starting from this patch (this bug is not
there in 2.6.7-1, and remains in 2.6.8.1-1) we have this line twice:

EXPORT_SYMBOL(os_ioctl_generic);

So it did not compile for me (I patched it, obviously). Patch attached -
uml-dup-sym.

Also, you must still export a ton of symbols, plus make hostfs depend on
externfs. Also, to avoid linking against libgcc_s.so and exporting some of
its symbols, which change, I made use of do_div for 64-bit division. For
this, see uml-export-Symbols.patch. It's only for 2.6 - for 2.4 it's a bit
more complex (a module exports all its symbols in 2.4, but if you link
the code statically you must export the symbol by hand inside an EXPORT_OBJ;
and if you export a missing symbol you get a link time failure).

Btw, about the ->statfs op: you are missing some unsigned-ness for some
params, since sector_t, used in kstatfs, is unsigned. Do you want them fixed?

* About filehandle_switch: you deleted a line (probably by mistake). Reread
more carefully the separate patches you get with quilt: when you see the
other attached patch (uml-restore-lost-code.patch), you'll agree with me.

Also, what you say about the patch is not correct: filehandle_switch has
almost just a cosmetic effect (there is a change from os_open_file to
open_file for new_mm mode, and nothing else). I've attached the 2.4.26-2 part
which is closer to being the actual filehandle_switch part (it's not a
perfect one, it contains some unrelated changes, but anyway you can fix it).

However, IMHO, since you cannot close and reopen a pipe, it's braindead that
the switch_pipe[] array is an array of filehandles. You must obviously use
the make_pipe() API to call reclaim_fds() if needed, but making it return
filehandles is useless. Also, they are never added onto the list, so they
never become reclaimable. But about the filehandle abstraction, I have a lot
of doubts, for which I'll write a separate mail. I like the idea, but not the
current implementation.

-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
|
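For readers unfamiliar with do_div, mentioned in the message above: in the kernel it divides a 64-bit value in place by a 32-bit divisor and returns the remainder, which avoids pulling in libgcc's 64-bit division helpers on 32-bit builds. The function below is only a userspace illustration of that calling convention. It is not the kernel macro, div_u64_inplace is a made-up name, and it uses ordinary C division, so it does not itself avoid libgcc.

```c
#include <stdint.h>

/*
 * Illustration of do_div()-style semantics, written as a plain function:
 * *n is replaced by the quotient and the remainder is returned.
 * The caller must make sure `base` is not zero.
 */
uint32_t div_u64_inplace(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}
```

The in-place update is the part that tends to surprise people: after the call the original dividend is gone, so code that needs both has to copy it first.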
From: Jeff D. <jd...@ad...> - 2004-09-08 23:32:28
|
bla...@ya... said:
> * First, please do a "make clean" before releasing the patch. There
> are some binaries included in it! And also semaphore.c, which is a
> symlink normally.

I do. It's just that make clean didn't catch everything.

> * About filehandle_switch: you deleted a line (probably by mistake).
> Reread more carefully the separate patches you get with quilt: when
> you see the other attached patch (uml-restore-lost-code.patch),
> you'll agree with me.

Yuck, I have no idea how that happened.

> However, IMHO, since you cannot close and reopen a pipe, it's
> braindead that the switch_pipe[] array is an array of filehandles.

Yeah, this is fixed in my 2.6 tree now.

				Jeff
|
From: BlaisorBlade <bla...@ya...> - 2004-09-11 15:13:45
|
On Thursday 09 September 2004 02:35, Jeff Dike wrote:
> bla...@ya... said:
> > * First, please do a "make clean" before releasing the patch. There
> > are some binaries included in it! And also semaphore.c, which is a
> > symlink normally.
>
> I do. It's just that make clean didn't catch everything.

Btw, patch-scripts provides a script which, rather than diffing two trees,
calls "combinediff" (from patchutils) to merge the patches statically,
without needing the patched files. I've been very comfortable with it -
doesn't quilt have something like that?

About patchutils (quoting from Andrew Morton):

See http://cyberelk.net/tim/patchutils/
(Don't download the "experimental" patchutils - it seems to only have
half of the commands in it. Go for "stable")

> > * About filehandle_switch: you deleted a line (probably by mistake).
> > Reread more carefully the separate patches you get with quilt: when
> > you see the other attached patch (uml-restore-lost-code.patch),
> > you'll agree with me.

> Yuck, I have no idea how that happened.

Btw, I'm assuming that you didn't want to drop the HPPFS compile line in
"externfs" (since that's not documented), right?

--- um.orig/fs/Makefile	2004-08-06 15:17:22.000000000 -0400
+++ um/fs/Makefile	2004-08-06 15:17:25.000000000 -0400
@@ -91,5 +91,4 @@
 obj-$(CONFIG_XFS_FS) += xfs/
 obj-$(CONFIG_AFS_FS) += afs/
 obj-$(CONFIG_BEFS_FS) += befs/
-obj-$(CONFIG_HOSTFS) += hostfs/
-obj-$(CONFIG_HPPFS) += hppfs/ # <---- WHY?
+obj-$(CONFIG_EXTERNFS) += hostfs/

> > However, IMHO, since you cannot close and reopen a pipe, it's
> > braindead that the switch_pipe[] array is an array of filehandles.

> Yeah, this is fixed in my 2.6 tree now.

Yes, I saw it, long after writing the message (I sent it long after writing
it).

However, another thing: I think that the handling of EMFILE/ENFILE (too many
fds for the app or for the system) should be moved inside the os_ layer. Or
will you create yet another filehandle wrapper for functions like
os_connect_socket() (which calls socket(), which requests an fd)?

Do you agree, or do you have any arguments to support the current design?
-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
|
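The suggestion above, handling EMFILE/ENFILE once inside the os_ layer instead of wrapping each caller, can be pictured roughly as follows. This is only a sketch of the idea under discussion, not UML code: open_with_reclaim and the reclaim_fn callback type are invented for this example, and the real reclaim_fds() may have a different signature.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical callback: try to close some reclaimable descriptors and
 * return how many were freed (0 or less means nothing could be freed).
 */
typedef int (*reclaim_fn)(void);

/*
 * Open `path`, and if the process or the system is out of descriptors,
 * ask `reclaim` to free some and try again. Returns a descriptor or a
 * negative errno. `flags` must not include O_CREAT, since no mode is
 * passed to open().
 */
int open_with_reclaim(const char *path, int flags, reclaim_fn reclaim)
{
	for (;;) {
		int fd = open(path, flags);
		int err;

		if (fd >= 0)
			return fd;

		err = errno;	/* save it: reclaim() may clobber errno */
		if (err != EMFILE && err != ENFILE)
			return -err;

		/* Out of descriptors: give up unless something was freed. */
		if (reclaim == NULL || reclaim() <= 0)
			return -err;
	}
}
```

The same retry loop could sit around socket() or pipe() in the os_ layer, which is the point being argued: callers would then not need to know about descriptor exhaustion at all.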
From: Jeff G. <jg...@po...> - 2004-09-05 20:28:53
|
Just a FWIW...

Using the latest UML code in 2.6.x BitKeeper (2.6.9-rc1-bk), things are
looking pretty darn good. I was able to boot a Fedora Core userland
with "init=/bin/sh", set up networking easily [uml_net helper worked],
and ssh into my virtual host.

Two problems, one major and two minor:

(major)
1) Using standard Fedora Core SysvInit, rc.sysinit will proceed to
completion, then the userland boot halts. ps shows the virtual host's
/sbin/init process in constant run state, chewing CPU time like mad.
Didn't investigate further, simply booted with init=/bin/sh and moved on.

(minor)
2) The Makefiles appear to add "-um1" to the kernel version for some
reason, and IMHO this violates the Principle of Least Surprise. This breaks
several of my scripts :(

3) The arch/um build is verbose, showing each gcc command line as it is
executed. This is incorrect when KBUILD_VERBOSE or similar settings are
not enabled. As a result, the arch/um build is verbose, but the rest is not,
by default.

Overall I am really impressed. Like other arches in the Linux kernel,
it is IMO very important to be able to work "out of the box", without
patches.

	Jeff
|
From: BlaisorBlade <bla...@ya...> - 2004-09-06 18:00:52
|
On Sunday 05 September 2004 22:28, Jeff Garzik wrote:
> Just a FWIW...

Thanks a lot anyway! We eagerly need mainline developers who help us in any
way (even with comments).

> Using the latest UML code in 2.6.x BitKeeper (2.6.9-rc1-bk), things are
> looking pretty darn good. I was able to boot a Fedora Core userland
> with "init=/bin/sh", set up networking easily [uml_net helper worked],
> and ssh into my virtual host.

> Two problems, one major and two minor:

> (major)
> 1) Using standard Fedora Core SysvInit, rc.sysinit will proceed to
> completion, then the userland boot halts. ps shows the virtual host's
> /sbin/init process in constant run state, chewing CPU time like mad.
> Didn't investigate further, simply booted with init=/bin/sh and moved on.

> (minor)
> 2) The Makefiles appear to add "-um1" to the kernel version for some
> reason, and IMHO this violates the Principle of Least Surprise. This breaks
> several of my scripts :(

Yes, this should not have gone into mainline (it made sense when releasing
separate UML patches). I'm queueing the fix for Andrew Morton. I guess that
future separate UML patches (just for updates) will have to insert their
extraversion into Makefile rather than arch/um/Makefile.

> 3) The arch/um build is verbose, showing each gcc command line as it is
> executed. This is incorrect when KBUILD_VERBOSE or similar settings are
> not enabled. As a result, the arch/um build is verbose, but the rest is not,
> by default.

There are a ton of problems with kbuild in UML - it actually must build a
lot of files against userspace headers, and there is no kbuild support for
that. This means, for instance, that files are not rebuilt when CONFIG_*
changes, when a header changes, and so on.

I've just implemented a patch which simply makes the output nice, but I'm not
sure whether it should be included. Some time ago I implemented a rough
version of a good patch (which fixed all the above problems, but in an
unclean way), but nobody liked it, and moreover Jeff Dike said that:

1) he did not want to modify kbuild for UML only
2) he was going to build an abstraction layer, which would build with
userspace includes and would be restricted to arch/um/os-*/, and then the
Makefile hacks would be done only for that folder.

> Overall I am really impressed. Like other arches in the Linux kernel,
> it is IMO very important to be able to work "out of the box", without
> patches.

-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
|
From: Jeff G. <jg...@po...> - 2004-09-07 04:41:03
Attachments:
config.txt
|
Well, after some fiddling, I am able to boot the current 2.6.9-rc1-bk
UML on Fedora Core 2 userland. Networking comes up correctly, and
everything works as expected except... syslog hangs on initialization.

  765 pts/0    S      0:00  |       \_ initlog -q -c syslogd -m 0
  766 pts/0    S      0:00  |           \_ syslogd -m 0
  767 ?        S      0:00  |               \_ syslogd -m 0

If I run "chkconfig syslog off" then userland boots correctly, and I can
ssh in or login using one of the many xterm consoles.

My kernel config is attached, if someone is interested.

Great work! I have no idea where my /sbin/init problem went, but aside
from syslog userland is working great.

Now to figure out how to get IPv6 automatically configuring itself...

	Jeff
|
From: Adam H. <ad...@do...> - 2004-09-07 05:06:02
|
On Tue, 7 Sep 2004, Jeff Garzik wrote:

>
> Well, after some fiddling, I am able to boot the current 2.6.9-rc1-bk
> UML on Fedora Core 2 userland. Networking comes up correctly, and
> everything works as expected except... syslog hangs on initialization.
>
>   765 pts/0    S      0:00  |       \_ initlog -q -c syslogd -m 0
>   766 pts/0    S      0:00  |           \_ syslogd -m 0
>   767 ?        S      0:00  |               \_ syslogd -m 0
>
> If I run "chkconfig syslog off" then userland boots correctly, and I can
> ssh in or login using one of the many xterm consoles.
>
> My kernel config is attached, if someone is interested.
>
> Great work! I have no idea where my /sbin/init problem went, but aside
> from syslog userland is working great.
>
> Now to figure out how to get IPv6 automatically configuring itself...

Sounds like a syslog bug that should have been fixed long ago.

Syslog had a race condition at startup; under normal Linux, the problem never
occurred. However, under SMP, or under UML, the scheduling is different enough
that the race condition occurs.

However, the symptoms of this that I have seen have never been a lockup; just
an annoying error message at startup. Then again, I've only got experience
with Debian.
From: BlaisorBlade <bla...@ya...> - 2004-09-07 18:18:39
|
For LKML: I'm not subscribed, so don't forget to CC me.

On Monday 06 September 2004 19:56, BlaisorBlade wrote:
> On Sunday 05 September 2004 22:28, Jeff Garzik wrote:
> > Overall I am really impressed. Like other arches in the Linux kernel,
> > it is IMO very important to be able to work "out of the box", without
> > patches.

Yes - especially when microAPI changes happen every day, as of 2.6. I've just
downloaded a snapshot including the merge, so I'll be able to merge some
little fixes which have happened since.

Do you think that keeping a UML tree for new, experimental features is a good
idea, or that this role should go to -mm?

I ask this also because I don't know how much general review would help for
new features.

For instance, the "hostfs" feature is in the middle of a rewrite and the new
code is still very broken (the current release says more or less "VFS: busy
inodes after unmount - self destroying in 5 seconds. Have a nice day", but
maybe this is fixed; plus it has a number of other bugs).

Also, SMP hasn't compiled for a while, so there are a number of locking
problems - at least one straight deadlock in the ubd driver when passing
ubd=sync. It's not a hard problem - it just hasn't been fixed yet.
-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
|
From: Jeff G. <jg...@po...> - 2004-09-09 05:31:38
|
BlaisorBlade wrote:
> For LKML: I'm not subscribed, so don't forget to CC me.
> On Monday 06 September 2004 19:56, BlaisorBlade wrote:
>
>>On Sunday 05 September 2004 22:28, Jeff Garzik wrote:
>>
>>>Overall I am really impressed. Like other arches in the Linux kernel,
>>>it is IMO very important to be able to work "out of the box", without
>>>patches.
>
> Yes - especially when microAPI changes happen every day, as of 2.6. I've just
> downloaded a snapshot including the merge, so I'll be able to merge some
> little fixes which have happened since.
>
> Do you think that keeping a UML tree for new, experimental features is a good
> idea, or that this role should go to -mm?
>
> I ask this also because I don't know how much general review would help for
> new features.

It's up to you. Andrew pulls several BitKeeper trees into his -mm tree,
so you could do both if you wished. If you do that, though, just make
sure that the code you push is in a state that's ready for review and
testing :)

> For instance, the "hostfs" feature is in the middle of a rewrite and the new
> code is still very broken (the current release says more or less "VFS: busy
> inodes after unmount - self destroying in 5 seconds. Have a nice day", but
> maybe this is fixed; plus it has a number of other bugs).

I usually create a new "patch queue" for experimental features, to make
sure that (a) it's separated from the main testing branch but (b) it's
easy to merge it back into the main testing branch when it's ready.

If you use BitKeeper, this is accomplished simply by creating another
cloned repository.

	Jeff
|
From: Jeff D. <jd...@ad...> - 2004-09-08 19:37:36
|
jg...@po... said:
> (minor) 2) The Makefiles appear to add "-um1" to the kernel version
> for some reason, and IMHO violates the Principle of Least Surprise.
> This breaks several of my scripts :(

As BlaisorBlade already said, this wasn't intended to reach any major trees -
it's just for my own patch numbering.

I'll clean this up, along with a couple of other things.

				Jeff
|
From: Jeff G. <jg...@po...> - 2004-09-07 05:13:34
|
Adam Heath wrote:
> Sounds like a syslog bug, that should have been fixed long ago.

Perhaps... strace running inside UML host seems to stop at nanosleep,
of all places:

read(3, "676\n", 4096) = 4
close(3) = 0
munmap(0x40017000, 4096) = 0
kill(676, SIG_0) = -1 ESRCH (No such process)
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x4013d2e8) = 716
rt_sigaction(SIGTERM, {0x15559810, [TERM], SA_RESTORER|SA_RESTART, 0x40048f38}, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({300, 0},
From: Adam H. <ad...@do...> - 2004-09-07 05:40:03
|
On Tue, 7 Sep 2004, Jeff Garzik wrote:

> Adam Heath wrote:
> > Sounds like a syslog bug, that should have been fixed long ago.
>
> Perhaps... strace running inside UML host seems to stop at nanosleep,
> of all places:

Oh, well that sounds like something else. Probably a uml problem.