From: roland <for...@gm...> - 2003-11-28 01:14:45

Hi!

i made some further investigation. i forgot to mention that i'm using reiserfs and also use the copy-on-write (COW) feature uml has. the system freeze i experience seems to have to do with copy-on-write.

as a test, i made a copy of my rootfs, made it available under /dev/ubd/2 inside the uml and mounted that under /mnt. i did an extra fsck before - no inconsistency was reported. then i did some stress tests again - worked fine! after making the rootfs-clone a cow-filesystem (by adding ubd2=rootfsclone.cow,rootfsclone.img), the freeze happened again when doing some heavy writing.

so - am i allowed to assume that there is a bug in the copy-on-write code?

btw: could anybody recommend a general filesystem stress-test tool which does the worst to filesystems you can do (besides formatting or overwriting)? :D

it seems that this bug happens under certain circumstances only. if i "dd" to an empty filesystem, it doesn't seem to freeze the uml at once. but it happened again after a while, when i did some additional copying of directory trees while dd'ing to disk in the background.

regards
roland

ps: accidentally, i opened a cow file with vi - and i saw that the path to the original read-only filesystem image seems to be stored with full path information. would that mean: if i want to relocate the fs-image with its associated cow-file, i'm out of luck? is there any reason why there are absolute, not relative paths inside?

pps:

> > The UML did the first two dd's without problems, then it started to
> > hang.. i checked outside and saw the process switching states from
> > R to D and then back to R... so after a few secs the dd's did all
> > succeed and the uml is still running. So it didn't crash or so
> > but it also didn't play very well with that.
>
> Sounds as if it started to get blocked by disk I/O on the host.. ubd
> device I/O is a latency killer for UML as a blocking disk I/O operation
> will block the whole UML kernel until the host finishes.

yes - i think what Henrik says is right! this is just normal behaviour. the first 2 dd's ran without "problems" because the host just cached that i/o. then it began writing to disk and this may have caused some delay which you perceived as a "hang".

in general - i/o scheduling of 2.4.x kernels is not the very best. the 2.6 kernel series has new i/o schedulers which are improved a LOT!

----- Original Message -----
From: "Henrik Nordstrom" <hn...@ma...>
To: "Sven 'Darkman' Michels" <sv...@da...>
Cc: <use...@li...>
Sent: Friday, November 28, 2003 12:50 AM
Subject: Re: [uml-devel] Re: [uml-user] uml 2.6.0-test9 crash

> On Thu, 27 Nov 2003, Sven 'Darkman' Michels wrote:
>
> > The UML did the first two dd's without problems, then it started to
> > hang.. i checked outside and saw the process switching states from
> > R to D and then back to R... so after a few secs the dd's did all
> > succeed and the uml is still running. So it didn't crash or so
> > but it also didn't play very well with that.
>
> Sounds as if it started to get blocked by disk I/O on the host.. ubd
> device I/O is a latency killer for UML as a blocking disk I/O operation
> will block the whole UML kernel until the host finishes.
>
> > BTW: did you had any problems with building an uml kernel on 9.0?
>
> I have no problem with building UML on RH9, but I have a memory of some
> signal trick which was needed at some point.. Make sure the UML is up to
> date.
>
> Regards
> Henrik

hi!

i think i'm able to crash my uml. :(

i did some stress-testing with my uml-2.6.0-test9 on a 2.6.0-test9-skas host and i unfortunately seem able to crash it very easily.

this was the way i recognized it: first i executed

    while true;do find /;done >/dev/zero &

several times and all worked well. watched that with "top" inside the uml and found dozens of "finds" sharing the cpu as expected (spending most of their time on system calls indeed). on the host, cpu usage of uml went to ~99% - all fine so far.

then i thought: mhhhh - ok - eat THIS:

    while true;do dd if=/dev/zero of=test.dat bs=1k count=10000;done &

after executing this, my uml got stuck. no output, no response to input - even a ping to the uml didn't give back a sign of life. the uml didn't crash or panic - it just gets stuck and unresponsive - the uml process remains at 99% on the host.

at a second, third, fourth try, i isolated the problem a little bit: one or two non-looped "dd if=/dev/zero of=test.dat bs=1k count=10000" are just enough to produce the same result. (under 2.6.0test9-skas-host and ALSO under 2.4.22-99-default suse9 host, rootfs inside uml is suse9)

is anybody able to reproduce this? if so - what to do further for analyzing? i could make the uml downloadable and could send some output from strace (if i attach strace to the remaining "99% cpu hog", it exits very quickly with a SIGALRM)

maybe an uml bug or even an issue for lkml?

regards
roland

From: Erik W. <om...@te...> - 2003-11-28 01:28:06
Attachments:
uml-rewrite-cow

On Fri, 28 Nov 2003, roland wrote:

> filesystem image seems to be stored with full path information. would
> that mean: if i want to relocate the fs-image with its associated
> cow-file, i'm out of luck? is there any reason why there are absolute,
> not relative paths inside?

I've also run into the fact that something (either UML or the shell) seems to cause any symlinks in the path to be factored out as well. This means that any images I try to locate in the same place at all times, regardless of physical location, still end up being stored as their physical location. The end result has been a symlink farm to retroactively handle all the older CoW files.

I've written a script to get around it, but it's an evil hack that depends on a few things being the right version. It's attached, but carefully check what file(1) returns on a CoW file before you run it on anything but a *COPY* of your CoW file. It should return something like:

    # file cow
    cow: User-mode Linux COW file, version 3, backing file ...

There really should be a tool to do this, with the obvious warnings that putting the wrong backing file path in will cause the CoW to self-destruct in a really bad way.

Erik Walthinsen <om...@te...> - System Administrator
GStreamer - The only way to stream!
***** http://gstreamer.net/ *****

From: Shao-Lin J. C. <sc...@cs...> - 2003-11-28 02:11:43

As the subject says, I'm trying to copy the COW files across the network. But the size of those copies got expanded to the same size as the backing file.

Is there a way to keep these COWs small & cute? ;-)

Thanks for the help.

Joseph

From: Peter <pe...@ri...> - 2003-11-28 02:41:02

    rsync --sparse --compress source.cow newhost:/dest.cow

works well.

Or you can use the 'sparse' option of tar to tar up the file and copy it across.

Or you can (on the destination host) run

    cp --sparse=always source.cow dest.cow

and it'll 'resparse' it for you.

A neat trick (suggested by Jeff a while back) is to dd a zero-byte-filled file from within your UML instance to 'fill up' any spare space. After you do this and use cp --sparse=always, your COW file will be 're-minimized'.

Cheers,
Peter

BTW: this post was probably more suitable for _just_ the user list.

----- Original Message -----
From: "Shao-Lin Joseph Chung" <sc...@cs...>
Subject: [uml-user] How to copy COW files to another machine across the network?

> As the subject says, I'm trying to copy the COW files across the network.
> But the size of those copies got expanded to the same size of the backing
> file.
>
> Is there a way to keep these COWs small & cute? ;-)
>
> Thanks for help.
>
> Joseph

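The sparse-copy idea described above can be sketched in a few lines of Python. This is only an illustration of the technique (skip all-zero blocks by seeking, so the destination filesystem can leave holes); it is not how `cp --sparse=always` or `rsync --sparse` are actually implemented. The names `sparse_copy` and `allocated_bytes` and the 4096-byte block size are made up for this sketch.

```python
import os
import tempfile

BLOCK = 4096  # assumed block size for zero detection (hypothetical choice)


def sparse_copy(src, dst, block=BLOCK):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zero = bytes(block)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(block)
            if not chunk:
                break
            if chunk == zero[:len(chunk)]:
                # Skip ahead without writing; the filesystem may leave a hole.
                fout.seek(len(chunk), os.SEEK_CUR)
            else:
                fout.write(chunk)
        # If the file ends in a hole, set the final length explicitly.
        fout.truncate(fout.tell())


def allocated_bytes(path):
    """Bytes actually allocated on disk (on Linux, st_blocks counts 512-byte units)."""
    return os.stat(path).st_blocks * 512


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "source.cow")
    dst = os.path.join(workdir, "dest.cow")
    with open(src, "wb") as f:
        # Mostly zeros with a little data in the middle, written out in full.
        f.write(bytes(1 << 20) + b"some data" + bytes(1 << 20))
    sparse_copy(src, dst)
    print("apparent size:", os.path.getsize(src), os.path.getsize(dst))
    print("allocated    :", allocated_bytes(src), allocated_bytes(dst))
```

Whether the destination really ends up smaller on disk depends on the filesystem supporting holes; the apparent size and the contents are the same either way.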
From: Henrik N. <hn...@ma...> - 2003-11-28 08:49:55

On Fri, 28 Nov 2003, Shao-Lin Joseph Chung wrote:

> As the subject says, I'm trying to copy the COW files across the network.
> But the size of those copies got expanded to the same size of the backing
> file.

You can't move COW files around unless you make a tool to adjust the COW file header after the move. If not UML will disregard the COW file as invalid and revert to the backing file copy.. UML checks that the following parameters match the backing file:

 * Full file name including path
 * File size
 * Modification timestamp

Regarding the size: You need to use a copying tool which accounts for holes in the file. See for example the --sparse option to cp.

Regards
Henrik

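As a rough illustration of the three checks listed above, the following Python sketch records a backing file's path, size, and modification time and later reports which of them no longer match. It is a model under stated assumptions: `BackingRecord`, `record_for`, and `mismatches` are hypothetical names, and neither the real COW header layout nor UML's actual comparison code is shown in this thread.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class BackingRecord:
    """Hypothetical stand-in for the backing-file fields kept in a COW header."""
    path: str
    size: int
    mtime: int


def record_for(path):
    """Capture the three properties of a backing file at COW creation time."""
    st = os.stat(path)
    return BackingRecord(os.path.abspath(path), st.st_size, int(st.st_mtime))


def mismatches(rec):
    """Return the reasons the recorded backing file no longer matches."""
    try:
        st = os.stat(rec.path)
    except OSError:
        return ["backing file not found at recorded path"]
    problems = []
    if st.st_size != rec.size:
        problems.append("size differs")
    if int(st.st_mtime) != rec.mtime:
        problems.append("mtime differs")
    return problems


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    image = os.path.join(workdir, "rootfsclone.img")
    with open(image, "wb") as f:
        f.write(b"\0" * 1024)
    rec = record_for(image)
    print("unchanged:", mismatches(rec))
    os.rename(image, image + ".moved")
    print("after move:", mismatches(rec))
```

The sketch shows why a moved or modified backing file is detectable from recorded metadata alone; what UML does when a check fails is described in the message, not modelled here.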
From: roland <for...@gm...> - 2003-11-30 21:18:19

hi!

> You can't move COW files around

but i'd like to move around my umls! hey - this is one of the uml benefits: you have a virtual machine which is independent of hardware and is _generally_ quite "relocatable". if it had swsusp - i could even relocate a running instance - like i can do with vmware.

why was that relocatability made difficult artificially by putting these "features" into COW? is this "problem" going to be addressed?

regards
roland

From: Henrik N. <hn...@ma...> - 2003-11-30 21:38:00

On Sun, 30 Nov 2003, roland wrote:

> but i'd like to move around my umls!

Then as I said you (or someone else) need to write the tool required to adjust the COW header after the move.

> hey - this is one of the uml benefits: you have a virtual machine which
> is independent from hardware and is _generally_ quite "relocateable".

It is, except for COW files..

> if it would have swsusp - i could even relocate a running instance - like
> i can do with vmware.

To my knowledge nobody has solved suspend with UML yet..

> why was that relocateability being made difficult artificially by putting
> that "features" into COW?

To make sure there is no mixup between COW and backing file, especially if you umlmoo the backing file with one COW file but there are multiple other UMLs also using the same backing file.

> is this "problem" going to be adressed?

It is simply one of many things which are not yet done. If a user who needs such functionality contributes a patch I am pretty sure it will get accepted, or at least commented on how it should be done.

Regards
Henrik

From: Nigel C. <ncu...@cl...> - 2003-11-30 23:48:53

Hi.

On Mon, 2003-12-01 at 10:37, Henrik Nordstrom wrote:

> > if it would have swsusp - i could even relocate a running instance - like
> > i can do with vmware.
>
> To my knowledge nobody has solved suspend with UML yet..

That's correct. It's on my todo list, but not high in the priority of things.

Regards,

Nigel
--
Nigel Cunningham
495 St Georges Road South, Hastings 4201, New Zealand

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

From: roland <for...@gm...> - 2003-12-01 09:08:06

hi!

> That's correct. It's on my todo list, but not high in the priority of things.

ohh, too bad - but many people would like it! ;)

-> http://usermodelinux.org/modules.php?name=Surveys&op=results&pollID=7&mode=&order=&thold=

regards
roland

From: Jeff D. <jd...@ad...> - 2003-12-05 23:52:29

for...@gm... said:
> why was that relocateability being made difficult artificially by
> putting that "features" into COW? is this "problem" going to be
> adressed?

To prevent people from mangling filesystems by providing the wrong backing file to UML.

> is this "problem" going to be adressed?

It's not a problem. If you really move a backing file, there is a perfectly good way of telling UML to update the COW file.

Jeff

From: Jeff D. <jd...@ad...> - 2003-12-05 23:53:06

hn...@ma... said:
> You can't move COW files around unless you make a tool to adjust the
> COW file header after the move.

Ummm, you can move COW files around arbitrarily. It's the backing files you need to be careful with.

Jeff

From: Geert U. <ge...@li...> - 2003-11-28 09:13:48

On Fri, 28 Nov 2003, Shao-Lin Joseph Chung wrote:

> As the subject says, I'm trying to copy the COW files across the network.
> But the size of those copies got expanded to the same size of the backing
> file.
>
> Is there a way to keep these COWs small & cute? ;-)

From rsync(1):

    -S, --sparse                handle sparse files efficiently

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@li...

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
    -- Linus Torvalds

From: Jeff D. <jd...@ad...> - 2003-12-05 23:52:34

om...@te... said:
> I've also run into the fact that something (either UML or the shell)
> seems to cause any symlinks in the path to be factored out as well.

It's UML. I didn't want backing file paths to be invalidated by a component of a symlink going away when the backing file is still perfectly accessible.

> This means that any images I try to locate in the same place at all
> times, regardless of physical location, still end up being stored as
> their physical location.

I don't parse that at all. Can you try again, maybe with an example or two?

Jeff

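To see what "factoring out" symlinks means in practice, here is a small Python sketch using `os.path.realpath`, which canonicalizes a path by resolving every symlink in it. It only illustrates the effect described above; it is not UML's code, and the file and directory names (`images`, `current`, `root.img`) are invented for the demo.

```python
import os
import tempfile


def canonical(path):
    """Return the absolute path with every symlink component resolved."""
    return os.path.realpath(path)


def demo():
    """Build a directory, a symlink to it, and a file reachable through both."""
    base = os.path.realpath(tempfile.mkdtemp())
    physical = os.path.join(base, "images")   # where the image really lives
    os.mkdir(physical)
    image = os.path.join(physical, "root.img")
    open(image, "wb").close()
    link = os.path.join(base, "current")      # stable name pointing at it
    os.symlink(physical, link)
    via_link = os.path.join(link, "root.img")
    return via_link, canonical(via_link), image


if __name__ == "__main__":
    via_link, resolved, image = demo()
    print("opened as:", via_link)
    print("stored as:", resolved)
```

A path recorded in its resolved form stays valid if the symlink is later removed, but it also stops following the symlink if the link is repointed to a new location.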
From: Henrik N. <hn...@ma...> - 2003-11-28 08:39:55

On Fri, 28 Nov 2003, roland wrote:

> after making the rootfs-clone a cow-filesystem (by adding ubd2=rootfsclone.cow,rootfsclone.img),
> the freeze happened again when doing some heavy writing. so - am i allowed to assume, that
> there is a bug in the copy-on-write code?

Maybe, maybe not. What does vmstat on the host report while you run this?

> btw: could anybody recommend a general filesystem stresstest-tool which does the worst to
> filesytems, you can do (besides formatting or overwriting)? :D

bonnie++ is a good one..

Regards
Henrik

From: roland <for...@gm...> - 2003-11-28 20:38:21

hi!

> bonnie++ is a good one..

thanks - will give it a try. i was already aware of bonnie, but i thought it was just a benchmark program.

> Maybe, maybe not. What does vmstat on the host report while you run this?

here is the output. the i/o stops, but uml still consumes cpu.

regards
roland

 2  1    0 273464 33628 156188  0  0   40 1732 1208  64200 42 58  0  0
 1  0    0 269412 33628 160104  0  0 3916   16 1988  56447 23 28 48  0
 1  0    0 265368 33628 164024  0  0 3920    0 1984  11543  6 12 82  0
 1  0    0 265372 33628 164024  0  0    0    0 1006  43904 38 62  0  0
 2  0    0 262528 33628 166768  0  0 2744    0 1699  71773 42 58  0  0
 1  0    0 255804 33628 173280  0  0 6512    0 2632  73736 40 60  0  0
 1  1    0 249364 33628 179512  0  0 6232    0 2570 102489 46 54  0  0
 1  2    0 246336 33700 182364  0  0 2856 5800 1992  95142 45 55  0  0
 1  2    0 246332 33700 182368  0  0    4 4340 1211  43884 39 60  1  0
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b swpd   free  buff  cache si so   bi   bo   in     cs us sy id wa
 0  3    0 246324 33700 182376  0  0    8 6364 1238   4349  4  9 87  0
 0  3    0 246308 33700 182392  0  0   16 7548 1289    294  0  1 99  0
 0  3    0 246304 33700 182396  0  0    4 2104 1293    279  0  1 99  0
 0  3    0 246296 33700 182404  0  0    8 1716 1271    267  0  2 98  0
 1  0    0 243456 33700 185148  0  0 2744   68 1764   3261  7 19 74  0   <---!
 1  0    0 243440 33700 185148  0  0    0    0 1003     56 34 66  0  0
 1  0    0 243440 33700 185148  0  0    0    0 1002     54 33 67  0  0
 1  0    0 243440 33700 185148  0  0    0    0 1002     54 31 69  0  0
 1  0    0 243440 33700 185148  0  0    0    0 1002     52 33 67  0  0
 1  0    0 243440 33700 185148  0  0    0    0 1002     47 29 71  0  0
 1  1    0 243396 33744 185148  0  0    0 3240 1116     75 31 69  0  0
 1  0    0 243396 33744 185148  0  0    0    0 1006     53 34 66  0  0
 1  0    0 243396 33744 185148  0  0    0    0 1003     57 27 73  0  0
 1  0    0 243396 33744 185148  0  0    0    0 1003     52 33 67  0  0
 1  0    0 243396 33744 185148  0  0    0    0 1003     48 32 68  0  0
 1  0    0 243396 33744 185148  0  0    0    0 1003     47 34 66  0  0

From: Henrik N. <hn...@ma...> - 2003-11-28 21:35:46

On Fri, 28 Nov 2003, roland wrote:

> thanks - will give it a try. i already were aware of bonnie, but i thought it
> was just a benchmark program.

It is, by trying to stress the I/O as hard as possible in different conditions.

> here is the output.
> the i/o stops, but uml still consumes cpu.

Then there most likely is a UML problem. Recommended action is to attach a debugger and look at what the UML kernel is doing.

Regards
Henrik

From: roland <for...@gm...> - 2003-11-30 02:25:27

hi,

i have 2.6.0-test11-um running - as expected the error is still there. i compiled a debug version and attached a debugger to the "hanging" uml-process:

linux:/uml/suse9 # gdb linux-2.6.0-test11-um-debug 2234
GNU gdb 5.3.92
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i586-suse-linux"...
Attaching to program: /uml/suse9/linux-2.6.0-test11-um-debug, process 2234
0xa0002091 in munmap ()
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0xa002a80b in cowify_req (req=0xa02236e4, dev=0xa022ab40)
    at arch/um/drivers/ubd_kern.c:820
820     arch/um/drivers/ubd_kern.c: No such file or directory.   <- ???
        in arch/um/drivers/ubd_kern.c
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0xa002a80b in cowify_req (req=0xa02236e4, dev=0xa022ab40)
    at arch/um/drivers/ubd_kern.c:820
820     in arch/um/drivers/ubd_kern.c
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0xa002a80b in cowify_req (req=0xa02236e4, dev=0xa022ab40)
    at arch/um/drivers/ubd_kern.c:820
820     in arch/um/drivers/ubd_kern.c
(gdb)

so my speculation seems affirmed, that the bug is "somewhere inside COW".

here is the appropriate code snippet from ubd_kern.c :

790 void cowify_req(struct io_thread_req *req, struct ubd *dev)
791 {
792         int i, update_bitmap, sector = req->offset >> 9;
793
794         if(req->length > (sizeof(req->sector_mask) * 8) << 9)
795                 panic("Operation too long");
796         if(req->op == UBD_READ) {
797                 for(i = 0; i < req->length >> 9; i++){
798                         if(ubd_test_bit(sector + i, (unsigned char *)
799                                         dev->cow.bitmap)){
800                                 ubd_set_bit(i, (unsigned char *)
801                                             &req->sector_mask);
802                         }
803                 }
804         }
805         else {
806                 update_bitmap = 0;
807                 for(i = 0; i < req->length >> 9; i++){
808                         ubd_set_bit(i, (unsigned char *)
809                                     &req->sector_mask);
810                         if(!ubd_test_bit(sector + i, (unsigned char *)
811                                          dev->cow.bitmap))
812                                 update_bitmap = 1;
813                         ubd_set_bit(sector + i, (unsigned char *)
814                                     dev->cow.bitmap);
815                 }
816                 if(update_bitmap){
817                         req->cow_offset = sector / (sizeof(unsigned long) * 8);
818                         req->bitmap_words[0] =
819                                 dev->cow.bitmap[req->cow_offset];
820                         req->bitmap_words[1] =
821                                 dev->cow.bitmap[req->cow_offset + 1];
822                         req->cow_offset *= sizeof(unsigned long);
823                         req->cow_offset += dev->cow.bitmap_offset;
824                 }
825         }
826 }

sorry, i have no real experience in debugging with gdb, nor am i a good c programmer. does anybody have a clue what's going wrong here?

regards
roland

From: Henrik N. <hn...@ma...> - 2003-11-30 07:47:45

On Sun, 30 Nov 2003, roland wrote:

> 820 arch/um/drivers/ubd_kern.c: No such file or directory. <- ???

Odd.. GDB should know where to find your sources.. Try starting the UML from the top of your kernel source tree.

> Program received signal SIGSEGV, Segmentation fault.
> 0xa002a80b in cowify_req (req=0xa02236e4, dev=0xa022ab40)
>     at arch/um/drivers/ubd_kern.c:820
> 820     in arch/um/drivers/ubd_kern.c
> (gdb) cont
> Continuing.
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xa002a80b in cowify_req (req=0xa02236e4, dev=0xa022ab40)
>     at arch/um/drivers/ubd_kern.c:820
> 820     in arch/um/drivers/ubd_kern.c
> (gdb)

Use the print command to try to figure out which part of the statement is failing and what the different values are..

Regards
Henrik

From: roland <for...@gm...> - 2003-11-30 11:32:21

hi!

thanks for helping me with this! the patch from lynn fixes this problem for me. (http://sourceforge.net/mailarchive/message.php?msg_id=4096993)

i really wonder a little bit, that such "one liners" don't find their way more quickly into uml. i spent at least some hours on solving this problem, needed to post to the ML, needed the time of others and recognized (with their help) that it must have been in COW. ok - good - i have learned something, but why bother with such things when others have bothered already and the problem is solved in general?

this looks quite similar to the "clock skew issue". at least a few people complained, a patch seems to have been there for some time now - but there also seems to be no reference in the bug or patch tracker.

i'm NOT angry here and i don't want to point my fingers at anybody - all are doing great work here - especially jeff (btw: where is he?) - but since i didn't find the cow bug in the SF bug tracker or patch tracker i think: "mhhh, couldn't something be done better here?" (hey, that's just MY view of things, maybe i'm wrong here, but IMHO patch tracking and bug fixing is one of the most important things in OSS development). unfortunately bug/patch tracker usage on SF seems to be declining. is this done more and more "manually" now or has the bug/patch tracker moved somewhere else?

i hope this doesn't sound presumptuous - but is there need for a "patch/bug monkey", e.g. someone for filtering patch submissions and bug reports from the ML and putting them into the SF "trackers" - just somebody supporting the main developers with some "QA" work? maybe i'm just too impatient - but i'm wishfully awaiting 2.6_uml_stable. maybe it isn't the uml developers' priority to "go 2.6"? i know - patches are primarily for developers, so should i keep my fingers off and not complain about something which isn't "ready" for somewhat "public usage" yet?

regards
roland

----- Original Message -----
From: "Lynn Kerby" <lf...@ke...>
To: <use...@li...>
Sent: Sunday, November 30, 2003 9:23 AM
Subject: Re: [uml-devel] Re: bug in COW? - Re: [uml-user] uml 2.6.0-test9 crash

> On 2003.11.29 18:29 roland wrote:
> >sorry, i have no real experience in debugging with gdb, nor am i a good c programmer.
> >does anybody have a clue whats going wrong here ?
> >
> >regards
> >roland
>
> Yes, this is a long standing bug in COW. A crash occurs when attempting to update
> the bitmap for writes out near the end of certain sized disk images. The attempt
> to change the bitmap at req->cow_offset+1 can be out of bounds.
>
> I'm running a RH kernel that is incompatible with any available skas patch so I'm not
> actively using UML at the moment and have no idea what the status of this bug is. I
> submitted a patch many months ago that I believe fixes the problem with no significant
> side effects. I thought it - or something close - was integrated into the base long ago.
> A little searching through the list archives from early March or a search through the
> bug lists should get you a little history on the problem and some suggested solutions.
> --
> Lynn Kerby <mailto:lf...@ke...>

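The out-of-bounds access described above can be modelled outside the kernel. The Python sketch below mimics only the index arithmetic from the quoted cowify_req(): it computes which two bitmap words would be copied for a given sector and shows that the second index can fall past the end of the bitmap. The 32-bit word size, the function names, and the "back up by one word" guard are assumptions made for illustration; the thread does not show the actual patch.

```python
BITS_PER_WORD = 32  # assumption: sizeof(unsigned long) * 8 on a 32-bit build


def bitmap_word_count(n_sectors):
    """Number of bitmap words needed to hold one bit per sector."""
    return (n_sectors + BITS_PER_WORD - 1) // BITS_PER_WORD


def words_unchecked(sector):
    """Indices read by the quoted code: the sector's word and the one after it."""
    first = sector // BITS_PER_WORD
    return first, first + 1


def words_guarded(sector, n_words):
    """Same pair, shifted back one word when the second index would overrun.

    This guard is a hypothetical fix for the model, not the submitted patch.
    """
    if n_words < 2:
        raise ValueError("model assumes a bitmap of at least two words")
    first, second = words_unchecked(sector)
    if second >= n_words:
        first, second = first - 1, first
    return first, second


def in_bounds(words, n_words):
    """True if every index is a valid position in a bitmap of n_words words."""
    return all(0 <= w < n_words for w in words)


if __name__ == "__main__":
    n_words = bitmap_word_count(100)   # small made-up device: 100 sectors
    last_sector = 99
    print("words in bitmap:", n_words)
    print("unchecked:", words_unchecked(last_sector),
          in_bounds(words_unchecked(last_sector), n_words))
    print("guarded  :", words_guarded(last_sector, n_words),
          in_bounds(words_guarded(last_sector, n_words), n_words))
```

In the model, only writes that land in the last bitmap word trigger the overrun, which is consistent with the report that the crash appears near the end of certain sized images rather than on every write.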
From: Jeff D. <jd...@ad...> - 2003-12-05 23:52:33

for...@gm... said:
> i really wonder a little bit, that such "one liners" don't find their
> way more quickly into uml.

The one-liners are getting in. I disposed of ~40 of them a few weeks ago.

> i'm NOT angry here and I don't want to point my fingers to anybody -
> all are doing great work here - especially jeff(btw: where is he?)

Tokyo (or at least I was, now I'm in NH recovering from jet lag).

> but IMHO patchtracking and bugfixing is one of the most important
> things in OSS development). unfortunately bug/patchtracker usage on
> SF seems declining. is this done more and more "manually" now or is
> bug/patchtracker moved to somewhere else?

The bug tracker is the UML mailing lists. I stick bug reports and patches in a todo folder and delete them from there when I fix (or merge) them. I pay almost no attention to the SF patch and bug report thingies. So if you want your bug to get attention, send it to one of the lists.

> i hope this doesn't sound presumptuous - but is there need for a
> "patch/bug monkey", e.g. someone for filtering patch submissions and
> bug reports from the ML and put that into SF "trackers" - just
> somebody supporting main developers with some "QA" work? maybe i'm
> just too impatient - but i'm wishfully awaiting 2.6_uml_stable.

It would be useful for someone to start a new UML tree, collect patches, merge them in, get testing, and forward them on to me. But so far, pretty much no one is in the code but me, so you're limited by the time that I can put into it.

Jeff

From: Henrik N. <hn...@ma...> - 2003-12-06 09:41:53

On Fri, 5 Dec 2003, Jeff Dike wrote:

> It would be useful for someone to start a new UML tree, collect patches, merge
> them in, get testing, and forward them on to me. But so far, pretty much no
> one is in the code but me, so you're limited by the time that I can put into
> it.

I could volunteer for this job if it wasn't for BitKeeper and its non-compete license restriction. I cannot accept a license which prevents me from working on other similar projects just to be able to work on an open-source project (section 3d in the BKL license agreement), and based on what we do with some tools around CVS I am probably already in violation of the BKL agreement.

If BitMover changed the license to exclude Open Source version management tools from section 3d then the license would probably be acceptable, even if I dislike certain other terms..

If there is a way where the above can be done without having to use bitkeeper then I am all set.

Regards
Henrik

From: Jeff D. <jd...@ad...> - 2003-12-06 18:05:03
|
On Sat, Dec 06, 2003 at 10:41:43AM +0100, Henrik Nordstrom wrote:
> If there is a way where the above can be done without having to use
> bitkeeper then I am all set.

There's no requirement at all to use BitKeeper. Feel free to send me patches
in any reasonable form you want, which in your case would probably just be
patches.

I also make plain patches available, so there's no need for BK in order to
get my tree. Also, my 2.4 tree, which is where new stuff is added first
right now, is just in CVS.

Jeff
|
From: Henrik N. <hn...@ma...> - 2003-12-06 23:33:03
|
On Sat, 6 Dec 2003, Jeff Dike wrote:
> There's no requirement at all to use BitKeeper. Feel free to send me patches
> in any reasonable form you want, which in your case would probably just be
> patches.

Ok. I will try my best to track and collect any patches seen from now on to
make sure none is lost or forgotten. Up to now I have only tracked the
patches which directly affect my use of UML.

Note to others: The SF bug/patch tracker looks nice, but is in my opinion
really awful to use for the purpose. I am not going to actively look in
these tools.

> I also make plain patches available, so there's no need for BK in order to
> get my tree. Also, my 2.4 tree, which is where new stuff is added first
> right now, is just in CVS.

The 2.4 CVS tree at SourceForge is what I am currently following so that
part looks promising.

Regards
Henrik
|
From: roland <for...@gm...> - 2003-12-08 01:16:05
|
hi!

> Note to others: The SF bug/patch tracker looks nice, but is in my opinion
> really awful to use for the purpose. I am not going to actively look in
> these tools.

mhh - that really is a matter of taste. from my personal view, i find it
difficult to track patches or bugs if one must dig for them in a ML archive.
for you developers it's ok to filter them out from the ML. but if other
people want to see which problems have probably been resolved, they don't
have the same "view" of things that you have, because your
bugfix/patch collection is somewhat "non public" (because of too much
"ML-noise" around). isn't it?

from another sf project i'm somewhat involved with (rockbox) i can say: bug
and patch tracking seems to work just fine. the project maintainers even
recommend that contributors send language-file updates as patches to the SF
patch tracker. everybody can easily take them from there - add comments -
and more important - see what the status is or when a patch has been merged.

as an example, the "configuration management" of the rockbox/haxx.se guys
(btw: curl is from them, too) is marvellous, imho. (see
http://rockbox.haxx.se -> recent cvs activity, daily builds, cvs compile
status, bleeding edge builds, bug reports, patches....) just cool!

ok - we cannot really compare these projects side by side, they are very
different.

>No, simply UML 2.6 works only if you don't enable module support. Obviously
>this is a bug, and patches have been posted around 5-6 times or so(search in
>the archives, but some modules could not work anyway because some symbols
>must still be exported); but no fix has gone in Jeff's patch.

see what i mean? 5-6 times posted - but people need a helping hand to find
them.

ok - i cannot presume to tell you code wizards (respect!) how to work - but
perhaps i can give some pros/cons.

okok - last words - i shut up now ;)

regards
roland
|
From: Jeff D. <jd...@ad...> - 2003-12-08 19:08:16
|
for...@gm... said:
> that really is a matter of taste. from my personal view, i find it
> difficult to track patches or bugs if one must dig for that into a ML
> archive.

I personally hate the SF patch/bug trackers. Webby crap like that is
unusable as far as I'm concerned.

> see what i mean? 5-6 times posted - but people need a helping hand to
> find them.

That makes this a candidate for the usermodelinux.org FAQ section, which I
created for exactly this reason. Feel free to make submissions.

Jeff
|