From: Lynn K. <lf...@ke...> - 2003-03-08 09:00:09
[A little introduction...]

I'm new here and only recently discovered the UML project after being frustrated by VMware's inability to install/run on my RedHat 7.3 system. Very cool stuff - I'm currently running (or trying to run) 3 full-blown GUI app development setups as UMLs (and an additional one on the host) for RedHat 7.3 & 8.0, and Mandrake 9.0.

[And now to butt right in on a 2 week old discussion...]

doug@ea wrote:
>Jeff Dike wrote:
>> doug@ea... said:
>>
>>>I worked around this by going into ubd_kern.c and taking the point
>>>where memory is allocated for the dev->cow.buffer and adding another
>>>page to it as in:
>>>
>>>    dev->cow.bitmap = (void *) vmalloc(dev->cow.bitmap_len+4096);
>>
>>
>> That's an OK workaround. This is a known bug which has been around for a
>> while. This is the first time I've heard of it causing crashes.
>>
>> Jeff
>
>I noticed a post from late last year that had exactly the same gdb dump
>as I got. This first happened when I was trying to run an ext3 inside
>the UM. I would get this hang when building a large, empty, non-sparse
>file. Usually, things would crash about 400 meg in. I suspect that
>ext3 placed the journal close to the end of the volume. After moving to
>ext2, the problem was harder to create but still there. In this case,
>it did not happen until the volume was getting pretty full.
>
>I suspect that users that don't have exactly even boundaries for the cow
>filesystems don't get this. If this fix is valid, perhaps you should
>add it to the dist. I just hate "fudges" without knowing the underlying
>logic.

This is indeed the case. My cowfs covers exactly 512MB, so the bitmap is 128k in size and completely fills its allocation in the kernel. Walking off the end of it results in a segfault. There are obviously other combinations of COW sizes that will result in similar situations.
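As a rough check of the 512MB arithmetic above, here is a minimal sketch in C. It assumes 512-byte sectors, one bitmap bit per sector, 4096-byte pages, and an allocator that rounds requests up to whole pages. The function names, the constants, and the `word_size` parameter are hypothetical and do not come from ubd_kern.c.

```c
#include <stddef.h>
#include <stdint.h>

#define COW_SECTOR_SIZE 512u   /* assumption: bytes per sector */
#define COW_PAGE_SIZE   4096u  /* assumption: allocation granularity */

/* Bitmap size in bytes for a device of dev_bytes, one bit per sector. */
static uint64_t bitmap_len_bytes(uint64_t dev_bytes)
{
    uint64_t sectors = dev_bytes / COW_SECTOR_SIZE;
    return (sectors + 7) / 8;
}

/* Size of a page-granular allocation that holds len bytes. */
static uint64_t alloc_bytes(uint64_t len)
{
    return (len + COW_PAGE_SIZE - 1) / COW_PAGE_SIZE * COW_PAGE_SIZE;
}

/*
 * Returns 1 if reading two consecutive words, starting at the last
 * word of the bitmap, would run past the end of the allocation.
 */
static int last_word_read_overruns(uint64_t dev_bytes, size_t word_size)
{
    uint64_t len = bitmap_len_bytes(dev_bytes);
    uint64_t nwords = len / word_size;

    if (nwords == 0)
        return 0;  /* bitmap shorter than one word: not modelled here */

    uint64_t read_end = (nwords - 1) * word_size + 2 * (uint64_t)word_size;
    return read_end > alloc_bytes(len);
}
```

Under these assumptions a 512MB device gives a 131072-byte bitmap, which is exactly 32 pages, so the second word of the read lands outside the allocation. A 500MB device gives a 128000-byte bitmap with slack left in its last page, so the same read stays inside it, which would fit the observation that only "exactly even" sizes crash.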
I just found this list (and this posting) after discovering the exact cause of the problem I had in one of my UMLs. I didn't like the workaround because it simply masks a buffer overrun and *may* even wind up corrupting the first word of the COW fs for some undetermined layout(s). I say *may* because I didn't dig far enough to determine what the COW alignment is in the disk file. I'd suspect that there is some padding at the end of the bitmap so that the COW is at least sector aligned (and it probably should be bumped to an 8k boundary in the file). If this isn't the case, maybe there will be a COW v3?

I think the workaround above is largely harmless, but I'm proposing the following change as what is IMO a better solution.

--- linux-2.4.19/arch/um/drivers/ubd_kern.c.orig Tue Mar 4 15:20:47 2003
+++ linux-2.4.19/arch/um/drivers/ubd_kern.c Sat Mar 8 00:11:42 2003
@@ -835,7 +835,20 @@
                                    dev->cow.bitmap);
                 }
                 if(update_bitmap){
+                        /*
+                         * There is apparently a long standing bug when the
+                         * cow_offset of the start of the request falls into the
+                         * last word of the bitmap. At this point, the
+                         * bitmap_word[1] walks off the end. The most obvious
+                         * solution is to back up by one word and write the last
+                         * two words of the map instead of garbage or the
+                         * segfault that occurs for some page sized bitmaps.
+                         */
                         req->cow_offset = sector / (sizeof(unsigned long) * 8);
+                        if (req->cow_offset ==
+                            (dev->cow.bitmap_len / sizeof(unsigned long)) - 1) {
+                                req->cow_offset -= 1;
+                        }
                         req->bitmap_words[0] =
                                 dev->cow.bitmap[req->cow_offset];
                         req->bitmap_words[1] =

--
Lynn Kerby <mailto:lf...@ke...>
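For anyone who wants to poke at the index logic outside the kernel, here is a minimal standalone sketch of the back-up-by-one-word idea from the patch. The function name `clamped_cow_offset` is hypothetical. The `nwords >= 2` guard is an extra assumption of this sketch and is not in the patch: it only avoids decrementing below zero when the bitmap is a single word long.

```c
#include <stddef.h>

/*
 * Hypothetical standalone version of the word-index computation.
 * sector     : first sector of the request
 * bitmap_len : bitmap size in bytes
 * Returns the index of the first of the two bitmap words to write.
 */
static size_t clamped_cow_offset(unsigned long sector, size_t bitmap_len)
{
    const size_t bits_per_word = sizeof(unsigned long) * 8;
    const size_t nwords = bitmap_len / sizeof(unsigned long);
    size_t offset = sector / bits_per_word;

    /*
     * Same comparison as the patch: if the request starts in the last
     * word, back up one word so that offset + 1 is still inside the
     * bitmap. The nwords >= 2 check is this sketch's addition.
     */
    if (nwords >= 2 && offset == nwords - 1)
        offset -= 1;

    return offset;
}
```

With a 131072-byte bitmap, the last sector (1048575) maps to the last word, so the function returns the second-to-last index and the two-word write covers the final two words of the map. Requests that start anywhere else are unaffected.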