Thread: RE: [SSI-devel] Re: get shared memory and user space talking
From: Walker, B. J <bru...@hp...> - 2003-10-21 21:25:29
|
Frank, Stevie: This looks like it started as a private exchange and you didn't include Stevie's original message, so responding is a little difficult.

Anyway, I have had some communication with Asmita (of the geek girls - love the name). The migshm extension to openMosix provides two things. The first is a way to migrate when you have a shared memory segment attached. OpenSSI already has that, and in fact does so in a more coherent manner than is possible with migshm (see the comment below about the use of semaphores). The second feature migshm has is that it will re-join clones that have both migrated to the same node (so they share their process address space again). What I was hoping they had was a way to actually coherently have the clones on different nodes. I'm pretty sure that is not provided.

I believe we could provide the memory re-join capability pretty trivially in the current code base (I'll check with John Byrne on this). The concern we had about migrating clones was making sure they all migrated at the same time and to the same node. I don't know if openMosix does anything to guarantee that. If you don't guarantee it, the process data space can become incoherent and the processes are in trouble. John was telling me that the current base code now has a primitive to rendezvous all the threads, so migrating the thread group together is more feasible. We are certainly interested in migrating multi-threaded processes (clones), so stay tuned.

Bruce

*******************************************************************************************
Some notes from Asmita about migshm:

--- It's a bit different. At least one node (the owner node of the shared memory segment) has the latest synced entire copy. All other nodes either have the latest copy or don't have the copy at all, and a process accessing a shared memory page there would result in a page fault, which we route to the owner node so that it gets the latest copy.
So even then we have a single point of failure for the shared memory segment.

--- Well, openMosix creates a new memory map for a migrated process on the new node. We make sure that when two processes attached to the same shared memory section (or two clones) get migrated to the same node, they share the same physical pages of the shm segment (or the memory map in the case of clones) on the new node, just as they would had they not been migrated at all.

About the coherency, migShm assumes that processes use semaphores to synchronize accesses to the shared memory segment. At the time of release of the semaphore, we sync up the changes to the owner node's copy (we send only the dirty pages) and invalidate the PTEs on the rest of the nodes. So when a process running on a node which is neither the owner node nor that of the last writer accesses the page, it page faults. We route this page fault to the owner node and get the latest page from there.

You can get more details about this in http://www.mcaserta.com/maask/Migshm_Report.pdf.

Regards,
Asmita
************************************************************************************
> From: Frank Mayhar [mailto:fr...@ex...]
> Sent: Monday, October 20, 2003 7:51 PM
> To: stevie mckibbin
> Cc: ssi...@li...
> Subject: [SSI-devel] Re: get shared memory and user space talking
>
> Hi, I'm not ignoring you, I've just been a bit busy with other things
> (mostly trying to find a job).
>
> The place you want to start is the SysV Shared Memory IPC handling. The
> underlying implementation is exactly what you want. Basically, if my
> (limited) understanding is correct, there's a vnode (at least one) for
> the shared memory segment. This vnode is handled by CFS; there's sort
> of a file system that describes that kind of shared memory, or at least
> CFS is stacked on top of the SHM implementation.
> Look at the file openssi/kernel/cluster/ssi/cfs/cfs_ipcshm.c for some
> clues. I've CC'd this to the devel list, so Dave can explain more fully
> if he wants...
>
> The real question I have is, what is backing a process's virtual address
> space? The executable itself is backed by the disk image of the program;
> that's static and isn't really interesting. The anonymous pages are what
> have to be shared, stuff like the stack, data pages and malloc'ed memory.
> To do this the way SHM does it, we would need to back anonymous pages
> with another file system. Kind of like distributed swap was in UnixWare,
> I guess, although I never really dug into that.
>
> Alternatively, one could expose the token interfaces within CFS so that
> you could use them directly. I suspect you would get into some serious
> wheel-reinvention doing that, though.
>
> I would be interested to see any feedback you get from the "geek girls."
> --
> Frank Mayhar fr...@ex... http://www.exit.com/
> Exit Consulting http://www.gpsclock.com/
> http://www.exit.com/blog/frank/
>
> -------------------------------------------------------
> This SF.net email is sponsored by OSDN developer relations
> Here's your chance to show off your extensive product knowledge
> We want to know what you know. Tell us and you have a chance to win $100
> http://www.zoomerang.com/survey.zgi?HRPT1X3RYQNC5V4MLNSV3E54
> _______________________________________________
> ssic-linux-devel mailing list
> ssi...@li...
> https://lists.sourceforge.net/lists/listinfo/ssic-linux-devel
|
From: Walker, B. J <bru...@hp...> - 2003-10-27 07:42:26
|
Thanks for clearing that up. Do you have any application examples that are using the migshm capability to allow clones to run on different nodes?

Bruce

> -----Original Message-----
> From: MAASK Group [mailto:maa...@ya...]
> Sent: Sunday, October 26, 2003 9:30 PM
> To: Walker, Bruce J
> Cc: ste...@ho...; ssi...@li...
> Subject: RE: [SSI-devel] Re: get shared memory and user space talking
>
> Hi,
> Stevie, Sorry for this delayed reply ... was on leave for a few days.
>
> --- "Walker, Bruce J" <bru...@hp...> wrote:
>
> > The second feature migshm has is that it will re-join clones that have
> > both migrated to the same node (so they share their process address
> > space again). What I was hoping they had was a way to actually
> > coherently have the clones on different nodes. I'm pretty sure that is
> > not provided.
> --- That is wrong, Bruce. This is in fact provided in migShm, provided
> that multi-threaded programs also use semaphores for synchronization.
>
> This constraint (for shared memory processes as well as clones) comes
> into the picture because we need some event for handling consistency.
> Any calls in the kernel can be used for this. We decided upon semaphores
> as that came to our mind first when we talk of shared memory. As
> accesses to shared memory (or a cloned memory map) cannot be detected,
> we do not know when to sync up all the copies ... so we had to put this
> constraint.
>
> > I believe we could provide the memory re-join capability pretty
> > trivially in the current code base (I'll check with John Byrne on
> > this). The concern we had about migrating clones was making sure they
> > all migrated at the same time and to the same node. I don't know if
> > openMosix does anything to guarantee that.
> --- Once again, this is not needed, as migShm can have two clones
> running on different nodes with consistency handled at the time of
> release of the semaphore.
>
> Regards,
> Asmita
>
> __________________________________
> Do you Yahoo!?
> Exclusive Video Premiere - Britney Spears
> http://launch.yahoo.com/promos/britneyspears/
|
From: MAASK G. <maa...@ya...> - 2003-10-27 11:15:56
|
Unfortunately, no. We could not find any applications which use only clone() (and not pthreads). We had written a small multi-threaded program which would calculate prime numbers in a given huge range using clone(), dividing the range into smaller ones and each thread calculating primes in the smaller range given to it. This app migrated well and gave good performance too.

Regards,
Asmita

--- "Walker, Bruce J" <bru...@hp...> wrote:
> Thanks for clearing that up. Do you have any application examples that
> are using the migshm capability to allow clones to run on different
> nodes?
>
> Bruce
>
> > -----Original Message-----
> > From: MAASK Group [mailto:maa...@ya...]
> > Sent: Sunday, October 26, 2003 9:30 PM
> > To: Walker, Bruce J
> > Cc: ste...@ho...; ssi...@li...
> > Subject: RE: [SSI-devel] Re: get shared memory and user space talking
> >
> > Hi,
> > Stevie, Sorry for this delayed reply ... was on leave for a few days.
> >
> > --- "Walker, Bruce J" <bru...@hp...> wrote:
> >
> > > The second feature migshm has is that it will re-join clones that
> > > have both migrated to the same node (so they share their process
> > > address space again). What I was hoping they had was a way to
> > > actually coherently have the clones on different nodes. I'm pretty
> > > sure that is not provided.
> > --- That is wrong, Bruce. This is in fact provided in migShm,
> > provided that multi-threaded programs also use semaphores for
> > synchronization.
> >
> > This constraint (for shared memory processes as well as clones)
> > comes into the picture because we need some event for handling
> > consistency. Any calls in the kernel can be used for this. We
> > decided upon semaphores as that came to our mind first when we talk
> > of shared memory. As accesses to shared memory (or a cloned memory
> > map) cannot be detected, we do not know when to sync up all the
> > copies ... so we had to put this constraint.
> >
> > > I believe we could provide the memory re-join capability pretty
> > > trivially in the current code base (I'll check with John Byrne on
> > > this). The concern we had about migrating clones was making sure
> > > they all migrated at the same time and to the same node. I don't
> > > know if openMosix does anything to guarantee that.
> > --- Once again, this is not needed, as migShm can have two clones
> > running on different nodes with consistency handled at the time of
> > release of the semaphore.
> >
> > Regards,
> > Asmita
|
From: Chirag K. <chi...@hp...> - 2003-10-27 11:35:55
|
On Mon, Oct 27, 2003 at 03:07:19AM -0800, MAASK Group wrote:
| which uses only clone() (and not pthreads). We had
| written a small multi-threaded program which would
| calculate prime numbers in a given huge range using
| clone(), dividing the range into smaller ones and each
| thread calculating primes in the smaller range given
| to it.
| This app migrated well and gave good performance too.
<snip>

Could I get a look at the program?

Regards,
--
Chirag Kantharia, Hewlett-Packard India Software Operations
Bangalore, India.
|
From: MAASK G. <maa...@ya...> - 2003-10-27 11:44:06
Attachments:
parallel_prime.tgz
|
More test programs can be found in the migShm test suite on the migShm website.

Regards,
Asmita

--- Chirag Kantharia <chi...@hp...> wrote:
> On Mon, Oct 27, 2003 at 03:07:19AM -0800, MAASK Group wrote:
> | which uses only clone() (and not pthreads). We had
> | written a small multi-threaded program which would
> | calculate prime numbers in a given huge range using
> | clone(), dividing the range into smaller ones and each
> | thread calculating primes in the smaller range given
> | to it.
> | This app migrated well and gave good performance too.
> <snip>
>
> Could I get a look at the program?
>
> Regards,
> --
> Chirag Kantharia, Hewlett-Packard India Software Operations
> Bangalore, India.
|
From: Walker, B. J <bru...@hp...> - 2003-10-27 07:53:16
|
If we used the token mechanism that the cluster filesystem CFS uses, the idea would be that, on a per-page basis, any number of nodes can cache a page if they all need it r/o, or a single node can cache it r/w. In that way one node can make many changes to the page while it is cached. The trick is working the mm to cache based on tokens and to relate the two memory images on the two nodes.

Bruce

> -----Original Message-----
> From: MAASK Group [mailto:maa...@ya...]
> Sent: Sunday, October 26, 2003 11:34 PM
> To: fr...@ex...
> Cc: Walker, Bruce J; ste...@ho...; ssi...@li...
> Subject: Re: [SSI-devel] Re: get shared memory and user space talking
>
> > In a kernel implementation (which is the direction I strongly lean),
> > you could modify the memory manager to mark the shared pages so that
> > an access (read or write) would cause a fault; the fault handler
> > would use the coherence mechanism to make sure everything is
> > consistent.
> --- Won't that be too much of an overhead? Reads may be ignored ... but
> for writes, for each byte access, the whole page will be flushed
> (migShm has page-level granularity). This would result in network
> traffic per access.
>
> > This is a lot more dynamic and eliminates the need to modify the
> > application.
> --- Agreed :)
>
> Regards,
> Asmita
|
From: MAASK G. <maa...@ya...> - 2003-10-27 11:15:01
|
Ah, okay. Sorry for my ignorance, I don't know how things are done in CFS.

Regards,
Asmita

--- "Walker, Bruce J" <bru...@hp...> wrote:
> If we used the token mechanism that the cluster filesystem CFS uses,
> the idea would be that, on a per-page basis, any number of nodes can
> cache a page if they all need it r/o, or a single node can cache it
> r/w. In that way one node can make many changes to the page while it
> is cached. The trick is working the mm to cache based on tokens and to
> relate the two memory images on the two nodes.
>
> Bruce
>
> > -----Original Message-----
> > From: MAASK Group [mailto:maa...@ya...]
> > Sent: Sunday, October 26, 2003 11:34 PM
> > To: fr...@ex...
> > Cc: Walker, Bruce J; ste...@ho...; ssi...@li...
> > Subject: Re: [SSI-devel] Re: get shared memory and user space talking
> >
> > > In a kernel implementation (which is the direction I strongly
> > > lean), you could modify the memory manager to mark the shared
> > > pages so that an access (read or write) would cause a fault; the
> > > fault handler would use the coherence mechanism to make sure
> > > everything is consistent.
> > --- Won't that be too much of an overhead? Reads may be ignored ...
> > but for writes, for each byte access, the whole page will be flushed
> > (migShm has page-level granularity). This would result in network
> > traffic per access.
> >
> > > This is a lot more dynamic and eliminates the need to modify the
> > > application.
> > --- Agreed :)
> >
> > Regards,
> > Asmita
|
From: Aneesh K. KV <ane...@di...> - 2003-10-28 14:48:35
|
Walker, Bruce J (HP) wrote:
> If we used the token mechanism that the cluster filesystem CFS uses,
> the idea would be that, on a per-page basis, any number of nodes can
> cache a page if they all need it r/o, or a single node can cache it
> r/w. In that way one node can make many changes to the page while it
> is cached. The trick is working the mm to cache based on tokens and to
> relate the two memory images on the two nodes.
>
> Bruce

The last time I went through the code, I remember the current implementation of tokens being file-based (that is, the entire file will be unmapped in case another node accesses the same mapping). Am I missing something? Did David already make it page-wise?

-aneesh
|
From: Walker, B. J <bru...@hp...> - 2003-10-28 15:11:52
|
The code for page-level tokens is there but not yet leveraged, so you are correct that so far it is file-level. Hopefully David will get a chance to correct that.

Bruce

> -----Original Message-----
> From: Kumar, Aneesh (Digital GlobalSoft)
> Sent: Tuesday, October 28, 2003 6:41 AM
> To: Walker, Bruce J
> Cc: MAASK Group; fr...@ex...; ste...@ho...; ssi...@li...
> Subject: Re: [SSI-devel] Re: get shared memory and user space talking
>
> Walker, Bruce J (HP) wrote:
>
> > If we used the token mechanism that the cluster filesystem CFS uses,
> > the idea would be that, on a per-page basis, any number of nodes can
> > cache a page if they all need it r/o, or a single node can cache it
> > r/w. In that way one node can make many changes to the page while it
> > is cached. The trick is working the mm to cache based on tokens and
> > to relate the two memory images on the two nodes.
> >
> > Bruce
>
> The last time I went through the code, I remember the current
> implementation of tokens being file-based (that is, the entire file
> will be unmapped in case another node accesses the same mapping). Am I
> missing something? Did David already make it page-wise?
>
> -aneesh
|
From: <sc...@ya...> - 2003-10-24 02:07:21
|
Is there any documentation available for the implementation of any of the OpenSSI features? In particular, I am interested in the CFS implementation and the shared memory implementation.

regards
stevie mckibbin

________________________________________________________________________
Want to chat instantly with your online friends? Get the FREE Yahoo!
Messenger http://mail.messenger.yahoo.co.uk
|
From: Mario C. <ma...@te...> - 2003-10-24 03:46:42
|
Hi, Stevie,

Check this out: http://h30097.www3.hp.com/cluster/cfs_wp_1002.pdf

Hope it helps... I just found it the other day while researching TruCluster Software...

Regards,
Mario Carvalho

-----Original Message-----
From: ssi...@li... [mailto:ssi...@li...] On behalf of stevie mckibbin
Sent: Thursday, October 23, 2003 15:07
To: Walker, Bruce J
Cc: ssi...@li...
Subject: [SSI-devel] implementation documentation

Is there any documentation available for the implementation of any of the OpenSSI features? In particular, I am interested in the CFS implementation and the shared memory implementation.

regards
stevie mckibbin
|
From: MAASK G. <maa...@ya...> - 2003-10-27 05:33:53
|
Hi,
Stevie, Sorry for this delayed reply ... was on leave for a few days.

--- "Walker, Bruce J" <bru...@hp...> wrote:
> The second feature migshm has is that it will re-join clones that have
> both migrated to the same node (so they share their process address
> space again). What I was hoping they had was a way to actually
> coherently have the clones on different nodes. I'm pretty sure that is
> not provided.
--- That is wrong, Bruce. This is in fact provided in migShm, provided that multi-threaded programs also use semaphores for synchronization.

This constraint (for shared memory processes as well as clones) comes into the picture because we need some event for handling consistency. Any calls in the kernel can be used for this. We decided upon semaphores as that came to our mind first when we talk of shared memory. As accesses to shared memory (or a cloned memory map) cannot be detected, we do not know when to sync up all the copies ... so we had to put this constraint.

> I believe we could provide the memory re-join capability pretty
> trivially in the current code base (I'll check with John Byrne on
> this). The concern we had about migrating clones was making sure they
> all migrated at the same time and to the same node. I don't know if
> openMosix does anything to guarantee that.
--- Once again, this is not needed, as migShm can have two clones running on different nodes with consistency handled at the time of release of the semaphore.

Regards,
Asmita
|
From: Frank M. <fr...@ex...> - 2003-10-27 06:02:15
|
MAASK Group wrote:
> This constraint (for shared memory processes as well as clones) comes
> into the picture because we need some event for handling consistency.
> Any calls in the kernel can be used for this. We decided upon
> semaphores as that came to our mind first when we talk of shared
> memory. As accesses to shared memory (or a cloned memory map) cannot
> be detected, we do not know when to sync up all the copies ... so we
> had to put this constraint.

Please pardon my ignorance, but why is this a hard constraint? More specifically, why can't you detect shared memory accesses? Is this entirely a user-space implementation?

In a kernel implementation (which is the direction I strongly lean), you could modify the memory manager to mark the shared pages so that an access (read or write) would cause a fault; the fault handler would use the coherence mechanism to make sure everything is consistent. This is a lot more dynamic and eliminates the need to modify the application. I believe (although I haven't checked the code lately) that this is how the SysV Shared Memory implementation works in OpenSSI.

--
Frank Mayhar fr...@ex... http://www.exit.com/
Exit Consulting http://www.gpsclock.com/
http://www.exit.com/blog/frank/
|
From: MAASK G. <maa...@ya...> - 2003-10-27 07:34:08
|
> In a kernel implementation (which is the direction I strongly lean),
> you could modify the memory manager to mark the shared pages so that
> an access (read or write) would cause a fault; the fault handler would
> use the coherence mechanism to make sure everything is consistent.
--- Won't that be too much of an overhead? Reads may be ignored ... but for writes, for each byte access, the whole page will be flushed (migShm has page-level granularity). This would result in network traffic per access.

> This is a lot more dynamic and eliminates the need to modify the
> application.
--- Agreed :)

Regards,
Asmita
|