From: Greg H. <gh...@ps...> - 2006-08-29 23:24:07
Upinder S. Bhalla writes:
> Dear Greg et al,
> Thanks for raising some very important and interesting points. I have
> not yet thought much about parallel model loading, because I don't
> have much idea about how much of a bottleneck it might be.

Efficient parallel model loading (or setup) is definitely important;
this sort of thing can quickly become the bottleneck in running a large
simulation. Setup time is, of course, model-dependent, but one data
point I can cite is the large PGENESIS cerebellar model that was run on
our T3E several years ago by Fred Howell et al.: for their largest
model (on 128 nodes), the setup time (done in parallel) took 65% as
long as the actual simulation.

> Before I dive into the details, this is my earlier line of thought;
> please comment on it.
>
> 1. Threads: I had considered restricting multithreading to solvers,
> on a per-node basis, for some of the reasons Greg has outlined.

While solvers should definitely be able to take advantage of
multithreading, I'm uncomfortable restricting everything else to a
single thread. For instance, the GUI will likely need a variable number
of threads to make programming easier. I also like doing network I/O
and large file I/O in separate threads so that they do not freeze up
the system when delays occur.

> 2. RelativeFind etc.: I had considered caching info on the postmaster
> to speed up the process of finding remote objects, and grouping
> requests for remote-node element info.

Yes, those things help. If the cache is required to give 100% accurate
information (as opposed to hints with no guarantee of correctness),
then cache-consistency issues have to be dealt with, since elements can
come into and out of existence. If elements are allowed to move between
nodes, this gets messier.

> 3. Parallel model building: I thought that almost all cases where
> this would be critical would be through special calls like createmap,
> region-connect, and perhaps copy.
> Most of these can be rather cleanly done in parallel with minimal
> internode communication. However, a global checkpoint would be needed
> to ensure synchrony between these calls.

"region-connect" will almost certainly require a lot of internode
communication. And while we can anticipate the most common patterns of
connectivity (as GENESIS 2 did with planarconnect and volumeconnect),
a significant number of people will want to specify connections some
other way, and they will have to resort to connecting up many elements
individually. We need to let them do this in a way that happens in
parallel.

> I should also add that the divide between setup time and runtime is
> probably not so clean, and we will definitely need to figure out
> efficient ways of handling this. For example, in signalling
> simulations I have already had issues where new organelles are
> budding off and being destroyed at runtime.

I think this is a very important point, and I completely agree. It
would be possible to get better simulation performance if we assumed a
sharp division between the model-construction and simulation phases and
compiled the model down to a super-efficient simulatable form, but that
makes it more difficult to view or alter the model dynamically at
runtime. And, as you mentioned, real cellular processes occur that are
best modeled as structural changes to the model rather than numerical
changes to already existing parameters.

> To consider Greg's points:
>
> > The greatest concern I have is with the many places in the basecode
> > that make an implicit assumption that elements are locally resident
> > in the node's memory, and that only one thread will be actively
> > modifying them. (...)
> > Some form of locking will thus be needed (probably on a per-Element
> > basis).
>
> Couldn't we put a lock at an appropriate place in an element tree,
> but permit other element trees to be accessed safely?
As long as the element tree is located entirely on a single node, this
should work. I don't think locking subtrees that are distributed over
multiple nodes would be desirable, because of performance and deadlock
issues.

> > The most troublesome situations will be when modifications are
> > being made to the element tree, such as when new elements are being
> > created or old ones destroyed.
>
> Can we have a lock set whenever 'dangerous' commands are being
> executed? Most commands at runtime are relatively safe.
>
> > One solution may be to standardize at the .mh level.... This
> > approach might make sense if nearly all the visualization and other
> > add-on code would be at the .mh level or higher, but not if those
> > things require major changes to the existing basecode.
>
> I'm not sure what you have in mind here. To me it looks like all the
> locking stuff should be done at the basecode level, so the user does
> not need to know about it even if they are developing new objects
> using .mh files. Could you expand on it?

I wasn't suggesting that locking should be done in the .mh files. What
I was suggesting is that the syntax/semantics of the .mh level should
be cleaned up and more or less frozen. This would allow development to
proceed at the .mh level and higher (GUIs, new modeling primitives,
solvers(?), etc.) using the current MOOSE kernel (with some
modifications). We could then plug in a parallel-capable kernel at a
later time and get everything to run on parallel systems. People
writing at the .mh level and higher should be writing code that is
independent of hardware characteristics, such as the number of nodes
available or the relative performance of the nodes. An example of a
change that would be necessary at the .mh level is using something like
"ElementID" or "ElementHandle" instead of "Element *", because
Element* makes an implicit assumption that the Element is located on
the same node where the code is executing.
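To make the idea concrete, such a handle might look something like the
sketch below. This is illustrative only, not existing MOOSE code; the
names NodeID and isLocal are hypothetical:

```cpp
// Sketch only: an opaque element handle that does not assume locality.
// NodeID, index, and isLocal are hypothetical names, not MOOSE API.
#include <cstdint>

typedef unsigned int NodeID;  // hypothetical: which node owns the element

// A parallel-capable handle: (owning node, slot in that node's element
// table) instead of a raw Element* into local memory.
struct ElementID {
    NodeID node;
    std::uint64_t index;

    // True when the element lives on the node running this code.
    bool isLocal(NodeID myNode) const { return node == myNode; }
};
```

Code written against such a handle never dereferences it directly; it
asks the kernel to resolve it, which is where a remote lookup or
internode message can be hidden.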
The current MOOSE could simply define ElementID to be Element*, but a
parallel implementation could define it to be something else (e.g.,
pair<NodeID, uint64_t>).

--Greg