From: Greg H. <gh...@ps...> - 2006-08-29 23:24:07
Upinder S. Bhalla writes:
> Dear Greg et al,
> Thanks for raising some very important and interesting points. I have
> not yet thought much about parallel model loading, because I don't
> have much idea about how much of a bottleneck it might be.

Efficient parallel model loading (or setup) is definitely important;
this sort of thing can quickly become the bottleneck in running a large
simulation. Setup time is, of course, model-dependent, but one data
point I can cite is the large PGENESIS cerebellar model that was run on
our T3E several years ago by Fred Howell et al.: for their largest
model (on 128 nodes), the setup time (done in parallel) took 65% as
long as the actual simulation.

> Before I dive into the details, this is my earlier line of thought;
> please comment on it.
>
> 1. Threads: I had considered restricting multithreading to solvers,
> on a per-node basis, for some of the reasons Greg has outlined.

While solvers should definitely be able to take advantage of
multithreading, I'm uncomfortable restricting everything else to a
single thread. For instance, the GUI will likely need a variable number
of threads to make programming easier. I also like doing network I/O
and large file I/O in separate threads so that they do not freeze up
the system when delays occur.

> 2. RelativeFind etc.: I had considered caching info on the postmaster
> to speed up the process of finding remote objects, and grouping
> requests for remote-node element info.

Yes, those things help. If the cache is required to give 100% accurate
information (as opposed to hints with no guarantee of correctness),
then cache-consistency issues have to be dealt with, since elements can
come into and out of existence. If elements are allowed to move between
nodes, this gets messier.

> 3. Parallel model building: I thought that almost all cases where
> this would be critical would be through special calls like createmap,
> region-connect, and perhaps copy.
> Most of these can be rather cleanly done in parallel with minimal
> internode communication. However, a global checkpoint would be needed
> to ensure synchrony between these calls.

"region-connect" will almost certainly require a lot of internode
communication. And while we can anticipate the most common patterns of
connectivity (as GENESIS 2 did with planarconnect and volumeconnect),
a significant number of people will want to specify connections some
other way, and they will have to resort to connecting up many elements
individually. We need to let them do this in a way that happens in
parallel.

> I should also add that the divide between setup time and runtime is
> probably not so clean, and we will definitely need to figure out
> efficient ways of handling this. For example, in signalling
> simulations I have already had issues where new organelles are
> budding off and being destroyed at runtime.

I think this is a very important point, and I completely agree. It
would be possible to get better simulation performance if we assumed a
sharp division between the model-construction and simulation phases and
compiled the model down to a super-efficient simulatable form, but that
makes it more difficult to view or alter the model dynamically at
runtime. And, as you mentioned, real cellular processes occur that are
best modeled as structural changes to the model rather than numerical
changes to already existing parameters.

> To consider Greg's points:
>
> > The greatest concern I have is with the many places in the basecode
> > that make an implicit assumption that elements are locally resident
> > in the node's memory, and that only one thread will be actively
> > modifying them. (...)
> > Some form of locking will thus be needed (probably on a per-Element
> > basis).
>
> Couldn't we put a lock at an appropriate place in an element tree,
> but permit other element trees to be accessed safely?
As long as the element tree is located entirely on a single node, this
should work. I don't think locking subtrees that are distributed over
multiple nodes would be desirable, because of performance and deadlock
issues.

> > The most troublesome situations will be when modifications are
> > being made to the element tree, such as when new elements are being
> > created or old ones destroyed.
>
> Can we have a lock set whenever 'dangerous' commands are being
> executed? Most commands at runtime are relatively safe.
>
> > One solution may be to standardize at the .mh level.... This
> > approach might make sense if nearly all the visualization and other
> > add-on code would be at the .mh level or higher, but not if those
> > things require major changes to the existing basecode.
>
> I'm not sure what you have in mind here. To me it looks like all the
> locking stuff should be done at the basecode level, so the user does
> not need to know about it even if they are developing new objects
> using .mh files. Could you expand on it?

I wasn't suggesting that locking should be done in the .mh files. What
I was suggesting is that the syntax/semantics of the .mh level should
be cleaned up and more or less frozen. This would allow development to
proceed at the .mh level and higher (GUIs, new modeling primitives,
solvers(?), etc.) using the current MOOSE kernel (with some
modifications). We could then plug in a parallel-capable kernel at a
later time and get everything to run on parallel systems. People
writing at the .mh level and higher should be writing code that is
independent of hardware characteristics, such as the number of nodes
available or the relative performance of the nodes. An example of a
change that would be necessary at the .mh level is using something like
"ElementID" or "ElementHandle" instead of "Element *", because
Element* makes an implicit assumption that the Element is located on
the same node where the code is executing.
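To make the idea concrete, such a handle might look something like the
sketch below. This is illustrative only, not existing MOOSE code; the
names NodeID and isLocal are hypothetical:

```cpp
// Sketch only: an opaque element handle that does not assume locality.
// NodeID, index, and isLocal are hypothetical names, not MOOSE API.
#include <cstdint>

typedef unsigned int NodeID;  // hypothetical: which node owns the element

// A parallel-capable handle: (owning node, slot in that node's element
// table) instead of a raw Element* into local memory.
struct ElementID {
    NodeID node;
    std::uint64_t index;

    // True when the element lives on the node running this code.
    bool isLocal(NodeID myNode) const { return node == myNode; }
};
```

Code written against such a handle never dereferences it directly; it
asks the kernel to resolve it, which is where a remote lookup or
internode message can be hidden.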
The current MOOSE could simply define ElementID to be Element*, but a
parallel implementation could define it to be something else (e.g.,
pair<NodeID, uint64_t>).

--Greg