From: Gordan B. <go...@bo...> - 2009-12-28 08:21:52
|
Last new thread for today, I promise. ;) I've been thinking about this for a while, and I think it's worth sharing.

OSR is conceptually quite similar to virtualization - there is an init-root "hypervisor" that bootstraps the shared storage and then starts up the "guest" root chrooted to the shared storage. OpenVZ (http://en.wikipedia.org/wiki/OpenVZ) virtualization is very similar to this (very much like Solaris "zones" or FreeBSD "jails"). It starts up the "guest" installation in its own chroot on the local file system, without actually having a disk image file container, and the virtualization abstraction layer is paper thin, because the only things being virtualized for the guest are the process IDs (since some things are allegedly sensitive to init not being PID 1) and the networking (so that each guest can have its own completely independent network configuration). All this means that the performance penalty is negligible. The guest VM doesn't run its own kernel - the host kernel does all the kernel tasks, and the guest's lowest level is its init.

What I'm thinking about is coming up with an OSR modification that takes advantage of this - making the init root slightly more fully featured (useful for debug purposes!) so that it boots into its own init, has its own console login, and sets up the disk volumes (of whatever description) for the shared root guest. It then simply starts up the shared root guest VM.

Now, I know this is a lot to take in (and yes, I know it sounds like a mad idea at first), and it is conceptually a pretty big change. But do you think it makes any sense to even look into going in this direction?

The benefits would be:

1) A more fully featured standalone init-root host would allow for easier debugging.

2) The "guest" wouldn't need any modification or tweaks. This would have avoided a number of problems - e.g. the issue on the killall5 thread, volume mounting above/below the guest's init line (/etc/mtab thread), the need for patches to the guest's halt/network init scripts, and possibly other things that show up the fragility of the initrd.

3) The guest wouldn't need any awareness of the file system it lives on, or of any daemons required to sustain that file system - the host would take care of all of that with complete transparency. This means no need to worry about killing a process on which things like the rootfs depend.

The reasons against this that I can think of:

1) A tiny performance hit due to the networking stack and PID virtualization. (I don't think this would be measurable considering the inevitable cluster fs overheads.)

2) The initrd would end up being a bit bigger. If it ends up having its own init and gettys it would be doing more, so it is bound to grow slightly, but it would almost certainly grow by a lot less than the savings yielded recently by the pruning of the unused kernel modules and the pyc/pyo files. ;)

3) Any other unforeseen things that show up only once the prototype is built. This is a big one. There has been a whole array of bugs I had tripped in glfs because nobody ever considered the use case of using it for a rootfs during testing (the biggest one off the top of my head was a massive memory leak stemming from mmap()-induced memory fragmentation that only arose when shared libraries were kept on glfs). I suspect this would likely expose similar problems - but I guess that is inevitable when straying off the straight and narrow.

Gordan |
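[Editor's note: the host-side flow Gordan describes maps onto OpenVZ's standard `vzctl` tooling. A minimal sketch follows - the container ID, template name, hostname, and IP address are placeholders, not values from the thread:]

```shell
# Minimal OpenVZ container lifecycle, run on the init-root "host".
# CTID 101 and all names/addresses below are hypothetical examples.

# Create a container from a pre-cached OS template; this unpacks a
# plain directory tree under /vz/private/101 - no disk image file.
vzctl create 101 --ostemplate centos-5-x86_64

# Networking is configured on the host side, not inside the guest;
# each guest still gets its own independent network configuration.
vzctl set 101 --ipadd 10.0.0.101 --hostname node1 --save

# Start the guest: its lowest level is its own init, and PID
# virtualization lets that init see itself as PID 1.
vzctl start 101

# The guest can later be restarted without touching the host kernel
# (and so without triggering fencing):
vzctl restart 101
```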
From: Gordan B. <go...@bo...> - 2009-12-29 08:28:06
|
For the sake of education and diversity, I have just finished the first attempt at this, purely to see if there is an inherent problem in OpenVZ that might shoot this down. Well - I haven't found any such problems! :D

The basic setup was this: Two identical host machines (virtual, because it was easier, but fully virtualized with KVM, CentOS 5.4). The host OS was stripped down to a bare minimum (a "mere" 850MB...) since I didn't feel applying the more sensibly sized OSR init root was vital for now, and modding it would have required extra work. Shared virtual disk (shared image, presented as an IDE device), with GFS on top. So far, so standard.

The shared GFS device was mounted under /vz/private (where OpenVZ keeps the VM fs trees). A CentOS 5.4 guest template was initialized in there. The guest config file was the same on both hosts, except for the IP address (the IP is configured on the host rather than the guest, but each guest can have independent iptables rules if required). Thus - the two guests were running on a GFS shared root.

The one thing remaining would be to set up the entries in fstab to make sure that /cdsl.local gets bind mounted correctly at boot-up time, but other than that, I'd say the preliminary prototype test has passed. :)

The basic thing I wanted to achieve is to have a cleaner separation between the host-provided shared rootfs and the guest, so that there are no issues during shutdown with unmounting file systems, etc. This prototype appears to have completely met those requirements. There is also an added bonus benefit that I hadn't thought of before - the guest can be cleanly rebooted without rebooting the host (which also means without triggering fencing and suchlike).

Gordan |
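[Editor's note: the host-side mount and the remaining /cdsl.local step could be sketched roughly as below. The device name is a placeholder, and the /cluster/cdsl/&lt;nodeid&gt; source path follows the conventional OSR cdsl layout rather than anything stated in the thread, so it would need adjusting to the actual setup:]

```shell
# On each host: mount the shared GFS device where OpenVZ keeps its
# container file system trees (device path is a hypothetical example).
mount -t gfs /dev/vda1 /vz/private

# Inside the guest's /etc/fstab: a bind-mount entry so the per-node
# cdsl tree appears at /cdsl.local at boot-up time, e.g. for node 1:
#
#   /cluster/cdsl/1   /cdsl.local   none   bind   0 0
```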