From: Hugo C. <hug...@gm...> - 2006-08-30 15:25:48
Many issues raised during the parallelization thread are related to the functional decomposition of Moose. In the attachment you will find a proposal for a functional decomposition of G3, a slide that I made for a recent presentation about G2 and G3.

Some notes:

- Heccer, mentioned at the bottom of the slide, is the hsolve replacement that I am working on. It addresses several technical problems with hsolve. More about it later this week or next week or so.

- One of the principles is that all the boxes are well-separated components, accessible via interfaces. I have no problem if someone incorporates Heccer in Topographica f.i. (I would actually encourage this). So all components can logically be compiled, run and tested in isolation.

- I make a strict separation between what is above the scheduler and what is below the scheduler.

- Parallelization is not mentioned. Parallelization can be used in several components at the same time, or not at all; that depends on the implementation of each component.

- A couple of things are missing:

  -- Some relationships are missing, f.i. I guess a link is needed between the model tools and the scripting facilities. A 2D diagram cannot represent everything...

  -- A component that deals with changes in the model during simulation time is missing. For my internal notes I call this a model event generator, but that might be the wrong name.

  -- A solver factory that, given a piece of the model, maps it to a solver type and an instance of that solver. Such a solver factory is currently implemented as part of Neurospaces. A general solver factory will be a complicated piece of code, so I would like to separate this out (see the sketch below this list).
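To make that last point a bit more concrete, the sketch below is roughly what I mean by a solver factory. It is only an illustration in C++; ModelSection, Solver and SolverCreator are placeholder names and not a proposed API. The point is simply that solvers register themselves for the kinds of model sections they can handle, and the factory maps a piece of the model to a solver type and an instance:

    #include <map>
    #include <string>

    // Placeholder for "a piece of the model" handed to the factory.
    struct ModelSection {
        std::string type;    // e.g. "compartments", "kinetics"
    };

    // Minimal solver interface; Heccer, hsolve and friends would sit behind it.
    class Solver {
    public:
        virtual ~Solver() {}
        virtual void advance(double dt) = 0;
    };

    // A creator knows how to build one kind of solver.
    class SolverCreator {
    public:
        virtual ~SolverCreator() {}
        virtual Solver* create(const ModelSection& section) = 0;
    };

    // The factory maps a piece of the model to a solver type and an instance.
    class SolverFactory {
    public:
        void registerCreator(const std::string& sectionType, SolverCreator* creator) {
            creators[sectionType] = creator;
        }
        Solver* createSolver(const ModelSection& section) {
            std::map<std::string, SolverCreator*>::iterator i = creators.find(section.type);
            return (i == creators.end()) ? 0 : i->second->create(section);
        }
    private:
        std::map<std::string, SolverCreator*> creators;
    };

A real factory will have to inspect much more than a type tag before choosing a solver, which is exactly why I would like to keep it as a separate component.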
One of the main targets of this design is the generation of an API over the different tiers in a neuronal simulator. A target of distributing the slide is to generate something that can be used as a communication medium for G3 developers.

So comments welcome.

Hugo

On 8/29/06, Greg Hood <gh...@ps...> wrote:
> Upinder S. Bhalla writes:
> > Dear Greg et al,
> > Thanks for raising some very important and interesting points. I have
> > not yet thought much about parallel model loading, because I don't have
> > much idea about how much of a bottleneck it might be.
>
> Efficient parallel model loading (or setup) is definitely important; this
> sort of thing can quickly become the bottleneck in running a large
> simulation. Setup time is, of course, model-dependent, but one data point
> I can cite is the large PGENESIS cerebellar model that was run on our T3E
> several years ago by Fred Howell et al.: for their largest model (on 128
> nodes) the setup time (done in parallel) was taking 65% as long as the
> time for doing the actual simulation.
>
> > Before I dive into the details, this is my earlier line of thought;
> > please comment on it.
> >
> > 1. Threads: I had considered restricting multithreading to solvers, on
> > a per-node basis, for some of the reasons Greg has outlined.
>
> While solvers should definitely be able to take advantage of
> multithreading, I'm uncomfortable restricting everything else to just one
> thread. For instance, the GUI will likely need a variable number of
> threads to make programming easier. I also like doing network I/O and
> large file I/O as separate threads so that they do not freeze up the
> system if delays occur.
>
> > 2. RelativeFind etc.: I had considered caching info on the postmaster
> > to speed up the process of finding remote objects, and grouping
> > requests for remote-node element info.
>
> Yes, those things help. If the cache is required to give 100% accurate
> information (as opposed to hints with no guarantee of correctness), then
> cache consistency issues have to be dealt with, since elements can come
> into and out of existence. If elements are allowed to be movable between
> nodes, this gets messier.
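To illustrate the hint-only option Greg mentions, a cache like that could look roughly like the sketch below. ElementId, NodeId and lookupRemote() are placeholder names, not existing MOOSE calls; the only point is that a missing or stale hint falls back to the authoritative lookup on the owning node, so no consistency protocol is needed.

    #include <map>
    #include <string>

    typedef unsigned int NodeId;      // placeholder node identifier
    typedef unsigned long ElementId;  // placeholder per-node element identifier

    // Authoritative (and expensive) lookup on the owning node; assumed to
    // exist elsewhere, declared here only as a placeholder.
    bool lookupRemote(const std::string& path, NodeId* node, ElementId* id);

    // Maps element paths to (node, id) pairs that are treated purely as
    // hints: an entry may be stale, so a caller must be prepared for the
    // remote node to answer "no such element" and then call refresh().
    class RemoteElementCache {
    public:
        bool resolve(const std::string& path, NodeId* node, ElementId* id) {
            std::map<std::string, Entry>::iterator i = hints.find(path);
            if (i != hints.end()) {           // hint available, may be wrong
                *node = i->second.node;
                *id = i->second.id;
                return true;
            }
            return refresh(path, node, id);   // miss: ask the owning node
        }

        // Called when a hint turned out to be stale (element moved or destroyed).
        bool refresh(const std::string& path, NodeId* node, ElementId* id) {
            if (!lookupRemote(path, node, id)) {
                hints.erase(path);            // element no longer exists
                return false;
            }
            Entry e = { *node, *id };
            hints[path] = e;
            return true;
        }

    private:
        struct Entry { NodeId node; ElementId id; };
        std::map<std::string, Entry> hints;
    };

Grouping requests for remote-node element info, as Upi suggests, would then sit on top of this by batching the lookupRemote() calls per node.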
> > 3. Parallel model building: I thought that almost all cases where this
> > would be critical would be through special calls like createmap,
> > region-connect, and perhaps copy. Most of these can be rather cleanly
> > done in parallel with minimal internode communication. However, a
> > global checkpoint would be needed to ensure synchrony between these
> > calls.
>
> "region-connect" will almost certainly require a lot of internode
> communication. And while we can anticipate the most common patterns of
> connectivity (as GENESIS 2 did with planarconnect and volumeconnect),
> there will be a significant number of people who want to specify
> connections some other way, and they have to resort to connecting up many
> elements individually. We need to let them do this in a way that happens
> in parallel.
>
> > I should also add that the divide between setup time and runtime is
> > probably not so clean and we will definitely need to figure out
> > efficient ways of handling this. For example, in signalling simulations
> > I have already had issues where new organelles are budding off and
> > being destroyed at runtime.
>
> I think this is a very important point, and I completely agree. It would
> be possible to get better simulation performance if we assumed a sharp
> division between the model construction and simulation phases, and
> compiled the model down to a super-efficient simulatable form, but then
> this makes it more difficult to dynamically view or alter the models at
> runtime. And, as you mentioned, real cellular processes occur that are
> best modeled as structural changes to the model rather than numerical
> changes to already existing parameters.
>
> > To consider Greg's points:
> > > The greatest concern I have is with the many places in the basecode
> > > that make an implicit assumption that elements are locally resident
> > > in the node's memory, and that only one thread will be actively
> > > modifying them. (...)
> > > Some form of locking will thus be needed (probably on a per-Element
> > > basis).
> > Couldn't we put a lock at an appropriate place in an element tree, but
> > permit other element trees to be accessed safely?
>
> As long as the element tree is located entirely on a single node, this
> should work. I don't think locking subtrees that are distributed over
> multiple nodes would be desirable because of performance and deadlock
> issues.
>
> > > The most troublesome situations will be when modifications are being
> > > made to the element tree, such as when new elements are being created
> > > or old ones destroyed.
> > Can we have a lock set whenever 'dangerous' commands are being
> > executed? Most commands at runtime are relatively safe.
>
> > > One solution may be to standardize at the .mh level.... This approach
> > > might make sense if nearly all the visualization and other add-on
> > > code would be at the .mh level or higher, but not if those things
> > > require major changes to the existing basecode.
> >
> > I'm not sure what you have in mind here. To me it looks like all the
> > locking stuff should be done at the basecode level, so the user does
> > not need to know about it even if they are developing new objects using
> > .mh files. Could you expand on it?
>
> I wasn't suggesting that locking should be done in the .mh files.
> What I was suggesting is that the syntax/semantics of the .mh level
> should be cleaned up and more or less frozen. This would allow
> development to proceed at the .mh level and higher (GUIs, new modeling
> primitives, solvers(?), etc.) using the current MOOSE kernel (with some
> modifications). We can then plug in a parallel-capable kernel at a
> later time and get everything to run on parallel systems. People
> writing at the .mh level and higher should be writing code that is
> independent of the hardware characteristics, such as the number of
> nodes available, or the relative performance of the nodes. An example
> of a change that would be necessary at the .mh level would be to use
> something like "ElementID" or "ElementHandle" instead of "Element *",
> because Element* makes an implicit assumption that the Element is
> located on the same node as where the code is executing. The current
> MOOSE could just define ElementID to Element*, but a parallel
> implementation could define it to something else (e.g., pair<NodeID, uint64_t>).
>
> --Greg

--
Hugo Cornelis Ph.D.
Research Imaging Center
University of Texas Health Science Center at San Antonio
7703 Floyd Curl Drive
San Antonio, TX 78284-6240
Phone: 210 567 8112
Fax: 210 567 8152
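PS. As I read the last paragraph of Greg's mail, the ElementID suggestion boils down to something like the following. This is a sketch only; PARALLEL_MOOSE and NodeID are placeholder names, and nothing here is agreed API.

    #include <stdint.h>
    #include <utility>

    class Element;                  // the existing MOOSE element class
    typedef unsigned int NodeID;    // placeholder node identifier

    #ifdef PARALLEL_MOOSE
    // Parallel build: an element is named by (owning node, id on that node),
    // so .mh-level code never assumes the element lives in local memory.
    typedef std::pair<NodeID, uint64_t> ElementID;
    #else
    // Current single-node MOOSE: an ElementID is simply the local pointer.
    typedef Element* ElementID;
    #endif

Code at the .mh level written against ElementID (plus a small lookup API to get at the actual data) then stays independent of whether the element is local or remote.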