Re: [SSI-users] How much CPU power & memory per node?
From: John H. <john@Calva.COM> - 2009-10-08 10:54:06
pk wrote:
> John Hughes wrote:
>
>> Multi-threaded apps run all threads on the same node. (This is because
>> sharing writable memory between nodes is pretty inefficient). If you
>> write your app as "multi-process" rather than "multi-thread",
>> communicating using pipes, sockets or queues instead of shared memory
>> then you can distribute your app across multiple nodes.
>
> Sorry for "butting" in...
>
> I'm just curious about what happened to the thread migration plans or
> perhaps I remember that from another project? OpenMosix maybe? As I
> understand it thread migration requires shared memory. Of course this
> will introduce huge differences depending on the connection between
> nodes but can't a NUMA model, perhaps with rDMA, handle such things?
> Again, I'm just curious about the subject and don't know much about it...

In most OpenSSI clusters the connection between nodes is *many* times
slower than the CPU-memory connection. Historically it has been 10 Mbit
Ethernet (LOCUS), Compaq ServerNet or 100 Mbit Ethernet (UnixWare
clusters), and 1000 Mbit Ethernet (OpenSSI).

Shared memory between OpenSSI nodes "works", but you don't want to be
using it in a multiple-writer situation like most threaded programs.

How shared memory works:

If a process writes a shared memory page then either:

1. It is currently owned by (writable by) the node running the process.
   Nothing special happens; the process writes the page as usual.

2. It has been written or read by processes on some other node or
   nodes. The process gets a page fault and the cluster system makes an
   inter-node call to the other nodes, "stealing" the page: it is marked
   unreadable/unwritable on the other nodes and writable on this node.

If a process reads a shared memory page then either:

1. It is currently readable by the node running the process. Nothing
   special happens.

2. Some other node has a readable copy - an inter-node call is made to
   get a copy of the page.

3. It is currently writable by some other node - an inter-node call is
   made to get a copy of the page and mark it unwritable on that node.

So any page can have one writer, or multiple readers. Multiple read
access is fast, but once someone starts writing you get page faults and
inter-node IPC.

In other words, OpenSSI has an extremely NUMA memory model. In this
situation, allowing different threads of a process, which by definition
share memory, to run on different nodes doesn't seem reasonable.

What we do allow (or should allow) is for all threads of a process to
migrate to some other node in one go.
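
To make the cost of the one-writer-or-many-readers rule concrete, here is a small toy model in Python. It is only a sketch of the read and write cases listed above, not OpenSSI code: the `SharedPage` class, the node names and the two helper functions are all made up for illustration, and it simply counts one inter-node call per fault that needs another node.

```python
class SharedPage:
    """Toy model of one shared page (hypothetical, not OpenSSI code)."""

    def __init__(self, owner):
        self.writer = owner       # node holding the page writable, or None
        self.readers = set()      # nodes holding read-only copies
        self.internode_calls = 0  # faults that needed an inter-node call

    def write(self, node):
        if self.writer == node:
            return  # write case 1: already writable on this node
        # write case 2: "steal" the page; every other node loses access
        self.internode_calls += 1
        self.readers.clear()
        self.writer = node

    def read(self, node):
        if self.writer == node or node in self.readers:
            return  # read case 1: already readable on this node
        # read cases 2 and 3: fetch a copy from another node
        self.internode_calls += 1
        if self.writer is not None:
            # read case 3: the current writer is downgraded to a reader
            self.readers.add(self.writer)
            self.writer = None
        self.readers.add(node)


def count_calls_alternating_writers(n_writes):
    """Two nodes take turns writing the same page (like two threads)."""
    page = SharedPage(owner="node1")
    for i in range(n_writes):
        page.write("node1" if i % 2 == 0 else "node2")
    return page.internode_calls


def count_calls_shared_readers(n_rounds):
    """Three nodes repeatedly read the same page; nobody writes."""
    page = SharedPage(owner="node1")
    for _ in range(n_rounds):
        for node in ("node1", "node2", "node3"):
            page.read(node)
    return page.internode_calls


if __name__ == "__main__":
    print("alternating writers:", count_calls_alternating_writers(10))
    print("shared readers:", count_calls_shared_readers(10))
```

In this toy run, ten alternating writes cost nine inter-node calls (every write after the first one faults), while thirty reads spread over three nodes cost only two (one per node that needs its first copy). On a real cluster each of those calls is a network round trip, which is why threads writing the same pages from different nodes would be so painful.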