From: Greg H. <gh...@ps...> - 2006-08-28 23:17:00
Upi, I have been looking at the MOOSE code and thinking about certain issues involved in parallelizing it, and I have some serious concerns. The greatest concern is with the many places in the basecode that make an implicit assumption that elements are resident in the local node's memory, and that only one thread will be actively modifying them. For example, if the elements are distributed over many nodes, then Element::relativeFind() will potentially require information from two or more nodes. This will cause the code to block for indefinite periods of time while the interprocess communication is performed and the remote nodes do what they need to do.

The simplest way of dealing with this would be to allow only one active thread over the entire set of nodes on which MOOSE is running. However, this would be disastrous in terms of performance -- network setup would be much slower than doing it on a single node.

If we instead allow multiple active threads on each node to avoid the performance hit, then every method that directly or indirectly calls one of these methods requiring off-node information will potentially block. While this occurs, incoming requests from other nodes must be handled, and some of those may involve the Element in question. Some form of locking will thus be needed, probably on a per-Element basis. The difficult part is that each place in the code where a potentially blocking call occurs will have to release the Element lock, and must leave the Element (as well as any kernel data structures) in a safe and consistent state. I can't see this being done without rewriting many sections of code. The most troublesome situations will be when modifications are being made to the element tree, such as when new elements are being created or old ones destroyed.

Once the network is set up, things may not be so bad, but the network needs to get set up in order to run it. One solution may be to standardize at the .mh level.
The existing MOOSE code could support running models (i.e., a script plus a set of .mh files) on serial machines, and we could have a separately developed parallel version that runs the same models. A few changes would probably still be needed in the existing .mh files, but probably not many. This approach might make sense if nearly all the visualization and other add-on code sits at the .mh level or higher, but not if those things require major changes to the existing basecode.

If you have thought of solutions to any of these problems, I would be interested in hearing about them.

--Greg