On Wed, Sep 15, 2004 at 01:05:11AM +0100, Tom Davis wrote:
> I sent this email to Moreno Baricevic who suggested that I forward it to
> this mailing list where more people may be able to help.
> Date: Tue, 14 Sep 2004 17:08:14 +0100
> From: Tom Davis <tom@...>
> To: baro@...
> Subject: OpenMosix Development
> I represent a group of fourth year Computer Science MEng students at the
> University of Warwick, UK. At the moment we're looking for ideas for a
> group project in the coming academic year. Although we have not come up
> with anything specific, we are interested in the area of Recovery
> Oriented Computing (ROC) with regards to clustering. In particular the
> effects on the cluster if a node fails and how any problems may be resolved.
> I wonder if I could ask you a few questions about OpenMosix so we might
> make a more informed decision on whether OpenMosix would be suitable for
> our project?
> Although I have not used OpenMosix, reading your documentation has left
> me with the impression that should a node suddenly fail (e.g. through a
> power cut), data will be lost?
Yes. In oM, the user-space memory of a migrated process lives on the
remote node, with system calls passed back to the home node. If
communication is lost (e.g. because of a crash or network failure), the home
node abandons the process, much as if it had received a SIGKILL.
> And currently the system will not
> reassign the process to an alternative functioning node. Is this
> impression correct?
Yes. The home node only has the kernel-side state of the process, not its
text or data. This is good, because you can run several large computational
processes and have them migrate off to nodes with enough RAM to handle them.
If they all came home at once, the home node would swap heavily, or even run
out of VM altogether.
> A couple of ideas I have had as solutions would be some kind of watchdog
> to check a processing node is still functioning.
oM already notices when a migrated process loses communication.
> Some way of keeping a
> 'backup' of a process before it is executed would be needed too.
> Although these are only first impressions, I have not looked into the
> specifics of OpenMosix enough to make a more informed speculation.
Someone wrote chpox to checkpoint a process's memory and file descriptor
state to disk, with the ability to restore. (It has limitations, though.)
If you want to recover from a remote machine dying, you only need to
checkpoint user-space state, because the deputy (the part of the process that
stays on the home node) has all the kernel-side state (e.g. file descriptors).
However, doing this robustly would be very hard: if you roll back the
process to the state it was in when you last saved a copy of its memory, it
thinks it hasn't yet written to files that in fact already have been written
to, and so on. You'd have to roll back system calls, too. (chpox doesn't
do that...) For a purely computational process, which reads its data to
start, crunches, and finally writes results, roll-back wouldn't cause
problems. At the other end of the spectrum, a process that interacts
through the filesystem (by creating and/or removing files) with other
running processes would get out of phase if rolled back to before it had
done some operations.
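
To make the roll-back problem concrete, here is a minimal sketch in plain C.
It does not use openMosix or chpox at all: the "checkpoint" is just a copy of
a struct, and struct job_state, append_record() and rollback_demo() are names
I made up for illustration. It only shows how memory that has been rolled
back gets out of step with a file that has not.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space state: which record the job will write next. */
struct job_state {
    int next_record;
};

/* Append one record to the file, then advance the in-memory state. */
static int append_record(const char *path, struct job_state *st)
{
    FILE *f = fopen(path, "a");
    if (!f)
        return -1;
    fprintf(f, "record %d\n", st->next_record);
    if (fclose(f) != 0)
        return -1;
    st->next_record++;
    return 0;
}

/* Count the lines currently in the file, or return -1 on error. */
static int count_lines(const char *path)
{
    FILE *f = fopen(path, "r");
    int c, n = 0;
    if (!f)
        return -1;
    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            n++;
    fclose(f);
    return n;
}

/* Returns the number of records in the file after a simulated roll-back. */
int rollback_demo(const char *path)
{
    struct job_state st = { 0 };
    struct job_state checkpoint;

    remove(path);                          /* start from an empty file */
    if (append_record(path, &st) != 0)     /* writes "record 0" */
        return -1;
    checkpoint = st;                       /* save memory only (next_record == 1) */
    if (append_record(path, &st) != 0)     /* writes "record 1" after the checkpoint */
        return -1;

    st = checkpoint;                       /* remote node dies: memory is rolled back */
    if (append_record(path, &st) != 0)     /* the job writes "record 1" a second time */
        return -1;

    assert(st.next_record == 2);           /* the job believes it has written 2 records */
    return count_lines(path);              /* but the file holds more than that */
}
```

Calling rollback_demo() on a scratch file leaves three lines in the file
while the job itself thinks it has written two. A job that only writes its
results once, at the very end, never hits this.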
Also, if your solution requires the home node to keep all remote processes
in its own virtual memory, not all openMosix users would be able to take
advantage of it. Some people use oM to farm out lots of large-memory jobs
to their clusters.
> How easy do you think a solution like this would be to implement?
If you have some better ideas than what I've suggested, or a fast
enough way to update the local backup of the process state after every
system call that affects externally visible state (e.g. unlink(2) or
write(2), and maybe read(2), but not gettimeofday(2)), you could do it.
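
As a rough sketch of that distinction, the table below sorts the four calls
I named into "visible", "maybe" and "not visible". The enum, the table and
classify_syscall() are hypothetical names, not anything in openMosix, and
treating every unlisted call as visible is my own conservative assumption.

```c
#include <stddef.h>
#include <string.h>

enum effect {
    EFFECT_NONE,     /* nothing outside the process changes; backup can stay as it is */
    EFFECT_MAYBE,    /* unclear case, e.g. read(2), which advances the file offset */
    EFFECT_VISIBLE   /* changes state that other processes can observe */
};

struct rule {
    const char *name;
    enum effect effect;
};

/* Only the calls mentioned in this mail; a real table would be much longer. */
static const struct rule rules[] = {
    { "unlink",       EFFECT_VISIBLE },
    { "write",        EFFECT_VISIBLE },
    { "read",         EFFECT_MAYBE   },
    { "gettimeofday", EFFECT_NONE    },
};

/* Look a system call up by name and say whether the backup needs refreshing. */
enum effect classify_syscall(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof rules / sizeof rules[0]; i++)
        if (strcmp(rules[i].name, name) == 0)
            return rules[i].effect;

    /* Assumption: anything not listed is treated as externally visible. */
    return EFFECT_VISIBLE;
}
```

The expensive part is not this lookup but what has to happen after every
call that comes back as EFFECT_VISIBLE: the saved copy of the process state
must be brought up to date before the process carries on.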
> a team of 6 and have about 6 months to implement it.
I'm no good at software engineering stuff... No idea.
> Also, how long do
> you think it would take to get up to speed on the project? Although we
> have no experience with kernel development, we are all capable
> programmers in one language or another (usually C++ or Java) and have
> some experience of C. Brushing up on the language hopefully wouldn't
> present too much of a problem.
> I realise these questions are a bit like asking how long a piece of
> string is, but any help you could give would be much appreciated. :-)
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@... , des.ca)
"The gods confound the man who first found out how to distinguish the hours!
Confound him, too, who in this place set up a sundial, to cut and hack
my day so wretchedly into small pieces!" -- Plautus, 200 BC