From: Niraj D. <ni...@nc...> - 2012-11-07 09:51:07
Dear subscribers of the 'moose-devel' mailing list,

This list was once active but has been hibernating for a while now. I'm sending this mail to see if we can restart discussions here, and also for the following reasons:

- The list has a rich roster of subscribers, who may be interested in staying informed of, and perhaps even contributing to, MOOSE development activities.
- The MOOSE core dev team does have regular discussions in person and over email. These email chats remain ad hoc, and this list would be a good way to keep them coherent and publicly archived.
- I'd like to see if this thing still works.

In other news, MOOSE 2.0.0 was released very recently. You can find the release notes here:

- https://sourceforge.net/projects/moose/files/moose/Moose%202.0.0%20Kalakand/RELEASE_NOTES.txt/download

Other links of interest are the main website, and the road map for development in the near future:

- http://moose.ncbs.res.in/
- http://moose.ncbs.res.in/component/option,com_wrapper/Itemid,95/

We'd love to hear from you: do take the new version out for a spin, and take a glimpse at the road map, source code, and the excellent new documentation (for developers and for users). All of this can be found on the MOOSE website.

Best wishes to all,
Niraj

PS: If you'd like to tweak your mailing list subscription settings, go here: https://lists.sourceforge.net/lists/listinfo/moose-devel
From: Upinder S. B. <bh...@nc...> - 2007-02-06 02:59:54
Dear all,

As Niraj's checkin message stated, we have started a complete reimplementation based on lessons from the MOOSE2006 API. There are many more details in the subversion log in the intro section, and there is also a documentation directory that has a couple of sections handwritten, and becomes richly populated when you run doxygen.

Hugo has brought up most of the key additional points:

> 1. I don't see any of the biophysics / parallel / solver stuff anymore. So, either this is work in progress, or this has been accidentally forgotten/removed. It would be nice to know what the status is, and to have at least two or three examples of how to work with the new basecode.

Yes, these are works in progress. I thought we should go for the 'release early, release often' policy. You will see a lot more activity on the subversion tree as a result. Immediate development sequence:

- Flesh out the parser interfaces. Currently compiles with Python/SWIG, but I'll use the same function hooks to talk to the old GENESIS parser.
- Bring over the scheduling structure from MOOSE06.
- Set up biophysics.
- Set up hsolver (Niraj is working on this).

Examples of how to work with the new base code will be in the form of the biophysics modules. In principle I could do those right away, but without the scheduling and parser stuff they would not be terribly functional. But you can look at the element/Neutral files to get a preview.

> 2. I would like to learn more about the relationships between Finfo and Ftype and related, e.g. why do we need Ftype and Ftype2. Is there any more documentation available about the general relationships?

Finfos hold information about fields. They are not typed, and rely on the templated Ftype classes to handle type-specific functions and information. This separation helps because it keeps templating to a minimum and encourages most operations to remain generic.

> 3. A question about the tests: wouldn't it be better to separate the test code completely from the core code? I have seen too many cases where bugs in test code were interfering with bugs in the tested code, such that everything seemed to work properly although it was not. For instance, the SimpleElement seems to be intended for tests only, but is shipped in the basecode directory.

The tests are all wrapped in #ifdef DO_UNIT_TESTS. I like to have them close to the classes they test because then it is easier to understand what they are doing. I have assertions everywhere, and those are intentional. Again, asserts can be flagged out of the system when we are sure it works well. SimpleElement is actually used for everything. It is currently the only instantiable version of Element.

> 4. If I am correct, Element-specific data is supposed to be hidden behind the ->data() method. I am not sure if I like this idea or not; it has both advantages and disadvantages. To make sure that I understand: my guess is that it avoids cases of multiple inheritance, and it decouples privately owned data, so it might require more implementation work and cause run-time overhead. I think of this as a major design decision, and it is different from the previous release of Moose. Can you explain the rationale for this, and can you explain why the previous release of Moose did this differently and what the problems were?

Well, you've pointed out both the pros and the cons. It avoids multiple inheritance and decouples private data at the cost of an extra indirection. Another big reason for doing this is that it makes it easier to set up variants of the Element class. As it turns out, it has helped a lot in keeping the implementation of the wrapper code separate from the user object code. For example, message and field handling becomes much more modular in this system than it was earlier.

> 5. Is it correct to say that this version of Moose adds array-based fields and that this was not present in the previous release? Is this a preparation for array-based messaging?

Array fields are now cleaner, but I did have a clunkier version previously. I'm not sure what you mean by array-based messaging. We can now send and receive messages from individual array entries. We were always able to send and receive entire arrays.

Cheers,
Upi
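A minimal sketch of the two designs discussed above, with invented names rather than the actual MOOSE declarations: a non-templated Finfo delegates type-specific work to a templated Ftype, and the Element hides the user object behind a single data() indirection instead of inheriting from it.

// Hedged sketch -- illustrative names only, not the real MOOSE API.
#include <cstddef>
#include <iostream>
#include <string>

class Ftype {                       // type-erased base: Finfo stays untyped
public:
    virtual ~Ftype() {}
    virtual size_t size() const = 0;
};

template <typename T>
class FtypeT : public Ftype {       // all templating is confined here
public:
    size_t size() const { return sizeof(T); }
};

class Finfo {                       // generic field info: name plus type handle
public:
    Finfo(const std::string& name, const Ftype* type)
        : name_(name), type_(type) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
    const Ftype* type_;
};

class Element {                     // user data hidden behind one indirection
public:
    explicit Element(void* data) : data_(data) {}
    void* data() const { return data_; }   // cost: one extra pointer hop
private:
    void* data_;                    // avoids multiple inheritance in user code
};

struct Compartment { double Vm; };  // user object, ignorant of the wrapper

int main() {
    Compartment c = { -0.065 };
    Element e(&c);
    FtypeT<double> dtype;
    Finfo vmInfo("Vm", &dtype);
    std::cout << vmInfo.name() << " = "
              << static_cast<Compartment*>(e.data())->Vm << "\n";
}

The cost is the extra indirection Upi mentions; the benefit is that Compartment needs no knowledge of the messaging machinery, and alternative Element variants can wrap the same user data.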
From: Hugo C. <hug...@gm...> - 2007-02-05 21:18:59
I took a look at this new release, and it looks almost like a reimplementation from scratch. I only had a superficial glance, but I like the overall code and it gives a more consistent and transparent impression than the previous release. I have several questions:

1. I don't see any of the biophysics / parallel / solver stuff anymore. So, either this is work in progress, or this has been accidentally forgotten/removed. It would be nice to know what the status is, and to have at least two or three examples of how to work with the new basecode.

2. I would like to learn more about the relationships between Finfo and Ftype and related, e.g. why do we need Ftype and Ftype2. Is there any more documentation available about the general relationships?

3. A question about the tests: wouldn't it be better to separate the test code completely from the core code? I have seen too many cases where bugs in test code were interfering with bugs in the tested code, such that everything seemed to work properly although it was not. For instance, the SimpleElement seems to be intended for tests only, but is shipped in the basecode directory.

4. If I am correct, Element-specific data is supposed to be hidden behind the ->data() method. I am not sure if I like this idea or not; it has both advantages and disadvantages. To make sure that I understand: my guess is that it avoids cases of multiple inheritance, and it decouples privately owned data, so it might require more implementation work and cause run-time overhead. I think of this as a major design decision, and it is different from the previous release of Moose. Can you explain the rationale for this, and can you explain why the previous release of Moose did this differently and what the problems were?

5. Is it correct to say that this version of Moose adds array-based fields and that this was not present in the previous release? Is this a preparation for array-based messaging?

Hugo

On 2/5/07, Niraj Dudani <ni...@nc...> wrote:
> Hi all,
>
> The new and improved MOOSE has been checked in. Upi has put MOOSE through a weight loss programme, and it shows--do refer to the commit log for further info.
>
> We invite your comments.
>
> Cheers,
> Niraj

--
Hugo Cornelis Ph.D.
Research Imaging Center
University of Texas Health Science Center at San Antonio
7703 Floyd Curl Drive
San Antonio, TX 78284-6240
Phone: 210 567 8112
Fax: 210 567 8152
From: Niraj D. <ni...@nc...> - 2007-02-05 18:33:02
Hi all,

The new and improved MOOSE has been checked in. Upi has put MOOSE through a weight loss programme, and it shows--do refer to the commit log for further info.

We invite your comments.

Cheers,
Niraj
From: Upinder S. B. <bh...@nc...> - 2007-02-02 03:09:15
Dear all,

I have added Niraj Dudani to the developers list for MOOSE. Niraj has been working primarily on the MOOSE hsolver, but has also done the Windows compilation and now has the additional entertaining job of putting the new version of MOOSE onto SourceForge so that we all can play with it. More generally, I expect that he will implement the site updates that have been pending for a long while, such as making the documentation more visible, providing precompiled versions and so on.

Cheers,
Upi

--
Upinder S. Bhalla
National Centre for Biological Sciences, bh...@nc...
Tata Institute of Fundamental Research, +91-80-23666130
Bellary Road, Fax: +91-80-23636662
Bangalore 560065, INDIA
Web: http://www.ncbs.res.in/~bhalla/index.html
From: Michael E. <mie...@gm...> - 2007-01-08 18:56:35
Per Upi's request I have removed the full diffs from the svn checkin messages. Sorry for the lag time; this request got misfiled and I just noticed it hadn't been done.

On 11/12/06, Upi Bhalla <bh...@us...> wrote:
> Message body follows:
>
> Hi, Mike,
> How can one modify the settings so that when updates are committed to the svn repository on sourceforge, we only get the log message? Currently the system seems keen to send out the entire patch when I commit any updates.
>
> Thanks,
> Upi
From: Greg H. <gh...@ps...> - 2006-11-07 01:34:21
Upinder S. Bhalla writes:
> Nevertheless, I think that some form of visible postmaster is necessary. My concern here is that I don't want to have hidden plumbing in MOOSE. There was a lot of that in GENESIS, and it ended up being accessed through a whole lot of special purpose calls. I would much rather have visible 'plumbing' objects that most people do not need to worry about, which do provide a consistent interface to those who do care about such things. For example, where would you go to ask the system how many nodes the model was running on, or if you want to explicitly shift objects around? For instance, MOOSE already has the rather complex plumbing of the scheduler completely visible.

Upi,

My feeling is that the functions associated with the postmasters *should* be "hidden plumbing". If someone checkpoints their model that happened to be running on 16 processors, then with model-level postmaster objects, the postmaster information would get included in the checkpoint file. However, say the next day that person wants to resume the run, but only 15 processors are available (due to hardware failure or other people running jobs); then it becomes messy to restore the model state. This does not have to be a messy operation if the connections are saved directly by the hidden plumbing, and then restored based on the new localities of the source and destination objects.

And while I think there ought to be a way for the model to find out how many nodes it is running on (e.g. for displaying to the user), I don't think the model writer should be encouraged to exploit that or to become involved in explicit model partitioning. The kind of information that the model writer might provide would be something like "keep A on the same node as B because those objects communicate frequently", rather than directives to place A on node 7 and B on node 7. That way, the model is not tied to specific hardware and can be portably run on many systems.

> Having said this, I really don't know what is to happen to the postmasters should we shift the actual MPI communication to a two-hop process as you described. An obvious possibility is to use a variant of the postmaster class, with the design requirement that the first-order user viewpoint remains that of a transparent, singly rooted object tree with ordinary messaging between objects. Anyway, it is still some way off.

Yes, I agree with this design requirement. I guess the difference between our positions is that I would prefer not to have higher-order user viewpoints of the communication infrastructure -- first, because it limits portability, and second because I see almost no one using it. Models get altered frequently enough that it usually doesn't make sense to excessively hand-tune details such as partitioning (provided that it is done reasonably well in some automated way). It's sort of like the situation with compilers -- even if we could get a factor of 2 improvement by coding directly in assembler, it's almost never worth the extra effort.

Regards,
--Greg
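A hedged sketch of the locality-hint idea in Greg's second paragraph, with entirely hypothetical names (no such API existed in MOOSE or PGENESIS): the model writer states an affinity between two elements, and an automated partitioner, not the model, chooses the actual node numbers.

// Illustrative only -- hypothetical types, not part of any simulator.
#include <stdint.h>
#include <map>
#include <vector>

typedef uint64_t ElementId;          // global, node-independent handle

struct AffinityHint {                // "keep a near b" -- no node numbers
    ElementId a, b;
    double weight;                   // relative communication frequency
};

// Toy partitioner: round-robin placement, then greedily pull each hinted
// pair onto the same node. A real partitioner would weigh hints globally.
std::map<ElementId, int> partition(const std::vector<ElementId>& elms,
                                   const std::vector<AffinityHint>& hints,
                                   int nNodes) {
    std::map<ElementId, int> node;
    for (size_t i = 0; i < elms.size(); ++i)
        node[elms[i]] = static_cast<int>(i % nNodes);
    for (size_t i = 0; i < hints.size(); ++i)
        node[hints[i].b] = node[hints[i].a];   // honor the hint
    return node;
}

The point of the design is the one Greg makes: because the model only ever expresses relative affinities, the same script runs unchanged on 15, 16, or 1024 nodes.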
From: Upinder S. B. <bh...@nc...> - 2006-11-04 05:54:53
Hi, Greg,

That is very interesting, and I can see there will be lots of subtleties for our ongoing implementation. For a first pass, then, I will go ahead and have a one-to-one mapping, but with the idea that later we may do something more sophisticated. I can see that this strengthens your point about the postmasters not being visible entities, because the semantics will possibly change. But see below.

Greg Hood said:
> I'm not sure exactly how you are thinking of the postmasters, but I feel very strongly that there should not be postmaster objects of the sort that are present in PGENESIS. In PGENESIS, the postmaster is an object that is no different than any other object in the model. That is, it is visible to users as part of the element tree, and "showmsg" shows connections as going to or coming from the postmaster. However, it fundamentally *is* different than other model constructs, and has no place at the model level. It is something that should be completely invisible to users, both in specifying models, and in examining them. In other words, users should see internode connections as going from A to B, not as going from A to a postmaster, and then from another postmaster to B.
>
> The postmaster concept may, however, be valid at the simulator level, and implementing it as a C++ object within MOOSE (but not as a MOOSE object itself) could be a legitimate realization of this concept.

I agree that the messaging should appear to go right to the target object. In fact I think we are in good shape to have a completely transparent way of spreading objects around, so that at first order the user never sees which node they are on. That includes having messages appear to go right to their target.

Nevertheless, I think that some form of visible postmaster is necessary. My concern here is that I don't want to have hidden plumbing in MOOSE. There was a lot of that in GENESIS, and it ended up being accessed through a whole lot of special purpose calls. I would much rather have visible 'plumbing' objects that most people do not need to worry about, which do provide a consistent interface to those who do care about such things. For example, where would you go to ask the system how many nodes the model was running on, or if you want to explicitly shift objects around? For instance, MOOSE already has the rather complex plumbing of the scheduler completely visible.

Having said this, I really don't know what is to happen to the postmasters should we shift the actual MPI communication to a two-hop process as you described. An obvious possibility is to use a variant of the postmaster class, with the design requirement that the first-order user viewpoint remains that of a transparent, singly rooted object tree with ordinary messaging between objects. Anyway, it is still some way off.

Cheers,
Upi
From: Greg H. <gh...@ps...> - 2006-11-04 01:21:29
Upinder S. Bhalla writes:
> Hi, Greg,
> A more practical question. Is there a significant penalty in issuing MPI_Send as opposed to MPI_Isend? As I see it MPI_Send just needs to hand off the data to the sending process, but I may be misunderstanding how the system works.
>
> -- Upi

Upi,

No, I don't think there is much (if any) penalty with MPI_Send. In fact, I usually suggest to people that they use the conceptually simpler MPI_Send, rather than the more esoteric flavors of Send that MPI provides. Many of the strange varieties of Send (e.g. ready sends) are only useful in very specialized circumstances, and even there usually don't offer enough of a performance boost with current MPI implementations to justify the additional programming and debugging complexity that they create in the program.

Also, some may have perfectly legal, but unexpected, consequences. For instance, if you use MPI_Isend, many implementations will just leave the sent data lying around in the sending process until there is an attempt to check whether the operation has completed with MPI_Test or MPI_Wait. So if you're debugging and step through the MPI_Isend, you won't see the message appear on the receiving node.

A common misconception is that MPI has another process off on the side that is doing the communication operations that have been queued up. On most systems that is not the case, and the only opportunity that MPI has to actually send and receive data is within MPI_* calls that your own code makes. Thus, if you do something that MPI is allowed to defer until a later time, and then go off and do a lot of computation, the operation may never actually happen.

--Greg
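A minimal sketch of the contrast Greg describes, using only standard MPI calls (ranks, tag values and the 100-byte payload are arbitrary): MPI_Send returns once the buffer is safe to reuse, while MPI_Isend only initiates the transfer, which may not progress until the matching MPI_Wait or MPI_Test gives the library a chance to run.

// Build with mpicxx, run with: mpirun -np 2 ./a.out
#include <mpi.h>
#include <cstring>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char buf[100];
    if (rank == 0) {
        std::strcpy(buf, "hello");
        // Blocking send: safe to reuse buf as soon as this returns.
        MPI_Send(buf, 100, MPI_CHAR, 1, 0, MPI_COMM_WORLD);

        // Non-blocking send: only *starts* the transfer. The data may sit
        // in this process until MPI_Wait/MPI_Test lets MPI make progress.
        MPI_Request req;
        MPI_Isend(buf, 100, MPI_CHAR, 1, 1, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);  // buf must stay intact until here
    } else if (rank == 1) {
        MPI_Recv(buf, 100, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(buf, 100, MPI_CHAR, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}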
From: Greg H. <gh...@ps...> - 2006-11-04 00:58:20
Upinder S. Bhalla writes:
> Hi, Greg,
> Here's my first question: a general design issue for scaling. I'm thinking of giving each node a postmaster for all the nodes it connects to. Each postmaster has a buffer for outgoing and incoming data. Seems easy enough for small systems, but I'm not sure how to deal with scaling here. Is this something I need to worry about, and what kinds of designs do people use for really big systems? I know that Neuron uses global message sends, but that is for spike data and I don't think it would be so good for stuff that has to go on each timestep.
>
> -- Upi

Upi,

I'm not sure exactly how you are thinking of the postmasters, but I feel very strongly that there should not be postmaster objects of the sort that are present in PGENESIS. In PGENESIS, the postmaster is an object that is no different than any other object in the model. That is, it is visible to users as part of the element tree, and "showmsg" shows connections as going to or coming from the postmaster. However, it fundamentally *is* different than other model constructs, and has no place at the model level. It is something that should be completely invisible to users, both in specifying models, and in examining them. In other words, users should see internode connections as going from A to B, not as going from A to a postmaster, and then from another postmaster to B.

The postmaster concept may, however, be valid at the simulator level, and implementing it as a C++ object within MOOSE (but not as a MOOSE object itself) could be a legitimate realization of this concept.

OK, on to the question about scaling. The issues involved in scaling to node counts of 16 or less are not that critical, and many different approaches will work almost equally well. Things start to get bad with larger numbers of nodes, but how many is really model- and program-dependent. For concreteness, let's consider a 1024-node system. Now, imagine if on every timestep every node has to send a message (in the MPI sense, not the GENESIS sense) to every other node. Each node will have to send and receive 1024 probably small messages (let's assume 100 bytes apiece here), and the overheads will kill performance.

One possible way of dealing with this is to organize the nodes into a 32x32 array, where information that has to get from one node to another does not get there in one hop, but in 2 hops (first vertical, then horizontal). After the first hop, MPI messages are broken apart, the data are sorted and reassembled into messages, and then sent onward to their final destination. So, for every simulation timestep, we have 2 transfers of messages among the nodes, and on each transfer, each node will communicate with 32 other nodes. Thus, we have decreased the total number of messages sent from 1024*1024 to 1024*32*2, or a factor of 16. Each message on average will be about 32 times as large as before, or 3200 bytes, but the cost of sending a 3200-byte message may be not that much greater than for a 100-byte message, so the net effect might be a speedup of nearly 16 versus the naive approach. This is just one possible solution; the choice of a particular solution in a real application depends on a lot of factors.

Incidentally, you may have noticed there is an MPI_Alltoall function that is supposed to efficiently transfer data among a set of processors, and it does. However, it assumes fixed-size messages, and that may not be the case for a neural simulation, and so one has to shoehorn what one wants to do into the semantics of MPI_Alltoall. Also, MPI_Alltoall is a collective operation, meaning that all nodes must participate simultaneously, which might be OK for a synchronous simulation (global timesteps), but clashes with asynchronous styles of simulation.

--Greg

P.S. I hope you don't mind -- I am CC'ing the moose-g3 list since we are getting into general design issues that other people might be interested in.
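A small sketch of the two-hop scheme under the stated assumptions (1024 nodes treated as a 32x32 grid; the rule of first vertical, then horizontal is taken directly from the text, while the node-numbering convention is an assumption):

#include <cstdio>

const int SIDE = 32;                     // 32 x 32 grid = 1024 nodes

int row(int node) { return node / SIDE; }
int col(int node) { return node % SIDE; }

// First hop is vertical: relay through the node in src's column and
// dst's row; the second hop is then horizontal along dst's row.
int relay(int src, int dst) { return row(dst) * SIDE + col(src); }

int main() {
    int src = 5, dst = 1000;             // arbitrary example nodes
    std::printf("route %d -> %d -> %d\n", src, relay(src, dst), dst);

    // The message-count arithmetic from the text:
    long naive  = 1024L * 1024L;         // all-to-all in one hop
    long twohop = 1024L * SIDE * 2L;     // 32 peers per node, two hops
    std::printf("naive=%ld two-hop=%ld ratio=%ld\n",
                naive, twohop, naive / twohop);   // ratio = 16
    return 0;
}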
From: Hugo C. <hug...@gm...> - 2006-10-06 15:51:38
Note: posted to the genesis-dev and neuroml mailing lists. Comments below.

On 10/2/06, Mike Schachter <mik...@gm...> wrote:
> [Josef]
>
> Moose (the new back end for genesis) is supposed to be heading in the direction of exposing its API through swig. IMO, Moose is still at a pre-alpha stage. It's been very difficult to convince the powers that be that it's difficult, if not impossible, to create a swig interface to what's there. There needs to be some core re-architecting to avoid exposing all the plumbing when all you really want is a glass of water from the tap ;-) Instead of focusing on just C++, I think it would make more sense to decide what you want to expose and expose it using swig. Then users have a lot more language choices for interacting with it.

As explained on the genesis-dev mail list some time ago, moose suffers from mixing up several data layers in the software architecture. The core of the problem for running simulations in computational neuroscience is, I think, that there is no one-to-one mapping between biological concepts and mathematical equations. As a consequence of this, moose forces biological quantities, physics quantities, numerical quantities and algorithmic quantities to live in the same big space. Interfacing to this big space is a nice software engineering problem, and is not just a matter of data bindings or script bindings -- exactly what you say. The moose framework partitions the space with Genesis2-style functional objects such that simulations can be run, but it does not separate out quantities living in different domains. So indeed, I am fully convinced that interfacing to moose is cumbersome. On the other hand, it is fair to add that moose addresses a number of scheduling problems in a nice way.

Josef,

It would help the whole community if you could write an outline or summary of what you think the problem is. Also, I do not understand exactly what you mean when you say that moose should be a library. If you mean that moose should be split up into different functional components, then I agree. If not, I would be interested to hear what you really mean.

I have just coded a new numerical solver for compartmental models. It is called heccer. It is as fast as hsolve (Genesis2), a clean implementation, and puts a lot of emphasis on interfacing. It is currently driven by a scheduler written in perl, but it is straightforward to link heccer to matlab or neurospaces, for instance. Neurospaces is addressing the domain problems outlined above.

The three tools -- heccer, neurospaces and the ssp scheduler -- have been developed separately and their code is nicely isolated. Heccer as well as neurospaces are different functional components and come as separate link libraries, so you can use them in different ways: isolated for testing, model validation and visualization (neurospaces), connecting to databases (neurospaces), instantiating large simulations from binary data (heccer, ssp). Or, of course, using them all together, for network simulations. Neurospaces and heccer are core components of Genesis 3 and have been designed based on the Genesis 3 functional decomposition I recently posted on the genesis-dev mailing list.

If you are interested in heccer, check the Neurospaces website (www.neurospaces.org); I will post releases there in a week or two or three.

Hugo

--
Hugo Cornelis Ph.D.
Research Imaging Center
University of Texas Health Science Center at San Antonio
7703 Floyd Curl Drive
San Antonio, TX 78284-6240
Phone: 210 567 8112
Fax: 210 567 8152
From: Upinder S. B. <bh...@nc...> - 2006-09-22 08:38:28
Hi, Joe,

The lack of backward compatibility is only in the implementation of function calls that SLI makes. The parser for SLI in MOOSE is actually the same as the one in GENESIS; that is, they use essentially the same yacc specification files. So, as and when functions are implemented, we will have a more nearly complete backward-compatible implementation.

-- Upi

--
Upinder S. Bhalla
National Centre for Biological Sciences, bh...@nc...
Tata Institute of Fundamental Research, +91-80-2363-6420X3230
Bellary Road, Fax: +91-80-23636662
Bangalore 560065, INDIA
Web: http://www.ncbs.res.in/~bhalla/index.html

On Fri, September 22, 2006 3:41 am, Josef Svitak said:
> I've been laboring under the impression that the moose sli was intended to be 100% backward-compatible with the current genesis sli, but it doesn't appear that way. Where are the incompatibilities with what's already implemented? Will the workarounds be internal or will they require script modification?
>
> Thanks,
> joe
From: Josef S. <js...@ya...> - 2006-09-21 22:11:59
I've been laboring under the impression that the moose sli was intended to be 100% backward-compatible with the current genesis sli, but it doesn't appear that way. Where are the incompatibilities with what's already implemented? Will the workarounds be internal or will they require script modification?

Thanks,
joe

Software Engineer
Linux/OSX C/C++/Java
From: Upinder S. B. <bh...@nc...> - 2006-09-15 02:50:23
Hi, Joe, others,

We use the Conn* as a handle for the functions that mediate messaging, because we may need to carry extra information in specialized message situations. In those cases we use a class derived from Conn that does something special. In one of the possible recodings of the messaging system, we would still use the Conn because that would give us offset information so we could find the index of the incoming message. There are various other options to think about, including functors and overloading the Element pointer rather than the Conn pointer. I've deferred this refactoring till we have the base code running well enough to get us into alpha. I've found it helps a lot to have a completely running system, so we are aware of all the use cases, before trying to rebuild stuff.

The getResponse call is not used for messaging, so it doesn't have to deal with the above issues.

-- Upi

--
Upinder S. Bhalla
National Centre for Biological Sciences, bh...@nc...
Tata Institute of Fundamental Research, +91-80-2363-6420X3230
Bellary Road, Fax: +91-80-23636662
Bangalore 560065, INDIA
Web: http://www.ncbs.res.in/~bhalla/index.html

On Thu, September 14, 2006 11:26 pm, Josef Svitak said:
> Why do some functions need a Conn while others need an Element, e.g.:
>
> void ShellWrapper::setResponse( Conn* c, string value ) {
>     static_cast< ShellWrapper* >( c->parent() )->response_ = value;
> }
> string ShellWrapper::getResponse( const Element* e ) {
>     return static_cast< const ShellWrapper* >( e )->response_;
> }
>
> (oooh. and I suppose Conn* could be const in setResponse since we're not changing c, we're changing the Element)
>
> joe
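A minimal sketch of the pattern Upi describes, with invented member names rather than the real Conn interface: a plain Conn carries only the parent handle, while a derived Conn adds extra per-message information, here the array-entry index of the incoming message.

#include <iostream>

class Element;                       // forward declaration is enough here

class Conn {                         // generic handle passed to message funcs
public:
    explicit Conn(Element* parent) : parent_(parent) {}
    virtual ~Conn() {}
    Element* parent() const { return parent_; }
private:
    Element* parent_;
};

// Specialized connection: carries the array-entry index of the incoming
// message, so the target can tell which entry sent it.
class IndexedConn : public Conn {
public:
    IndexedConn(Element* parent, unsigned int index)
        : Conn(parent), index_(index) {}
    unsigned int index() const { return index_; }
private:
    unsigned int index_;
};

// A message handler sees only Conn*; a specialized handler downcasts when
// it knows the message arrived over an indexed connection.
void handleMsg(Conn* c) {
    if (IndexedConn* ic = dynamic_cast<IndexedConn*>(c))
        std::cout << "message from array entry " << ic->index() << "\n";
    else
        std::cout << "plain message\n";
}

int main() {
    Conn plain(0);
    IndexedConn indexed(0, 7);
    handleMsg(&plain);       // prints: plain message
    handleMsg(&indexed);     // prints: message from array entry 7
}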
From: Josef S. <js...@ya...> - 2006-09-14 17:56:36
Why do some functions need a Conn while others need an Element, e.g.:

void ShellWrapper::setResponse( Conn* c, string value ) {
    static_cast< ShellWrapper* >( c->parent() )->response_ = value;
}
string ShellWrapper::getResponse( const Element* e ) {
    return static_cast< const ShellWrapper* >( e )->response_;
}

(oooh. and I suppose Conn* could be const in setResponse since we're not changing c, we're changing the Element)

joe

Software Engineer
Linux/OSX C/C++/Java
From: Josef S. <js...@ya...> - 2006-09-09 17:37:18
Original message from Greg Hood.

Note: forwarded message attached.
From: Michael E. <mie...@gm...> - 2006-09-07 21:44:49
Comments inline.

On 9/7/06, Dave Beeman <db...@do...> wrote:
> When is the move likely to happen? How much time do we have before we are forced to do something?

The timing of the move is not critical. The folks that have been administering the machine genesis-sim.org is on are currently feeling a bit short-handed and have been pushing for me to deal with Jim's machines personally. The easiest, and most sensible from my point of view, way for me to do that is for them to be moved to my existing web server at UTSA.

> Is it a certainty that we can't run the present combination of perl and sendmail for a while, until we properly set up the new system? Is it a problem of administrators not wanting us to run our own mail server through the UTSA firewall?

Correct. Otherwise, I suppose I could simply move the current machine to UTSA, get it an IP address here and leave things more or less as they are. I personally think this is an excellent opportunity to "upgrade" and move babel to sourceforge, but that's just my $0.02.

> Would these same considerations prevent us from running the MailMan software ourselves from UTSA?

Also correct. Barring some political roughhousing from Jim, UTSA has again made it clear that they really don't want us to run a mail server. For some semi-reasonable reasons...

> How big of a deal is it to set up MailMan ourselves? It seems that once it is set up, it isn't any easier to manage the mailing list on Sourceforge than it would be to manage it on our own machine. The big advantage of Sourceforge is that they have it all installed and set up. Is that correct?

I have never set up sendmail or MailMan myself. This is one of the semi-reasonable reasons UTSA doesn't want me to run a mail server, though there is a whole list of mainly security/administrative-overhead issues cited as well. I am told it is a bit of a pain to get things working smoothly, especially running a fairly complicated setup like we would need to accommodate both my in-house organizational email needs and those for Genesis. Once it was set up, it would be functionally equivalent to sourceforge's system (probably), but any lists or email addresses would have the genesis-sim.org name attached to them (which is nice).

> (2) Should the site for the MOOSE and GENESIS 3 development (including the moose-g3-devel mailing list and source code repository) be moved from the present Sourceforge site to our own server?

As Joe has already set things up on sourceforge (I haven't been able to connect to it today myself to look at it), if those arrangements are satisfactory, why rock the boat? Especially as running the site on sourceforge makes it slightly easier for arbitrary developers to deal with the site (since they already have an SF login to check out the code). If people are unhappy with the performance of the SF site, we can move it at a later date.
From: Josef S. <js...@ya...> - 2006-09-07 20:12:04
--- Dave Beeman <db...@do...> wrote:
> This makes it clear that the repository and development site for MOOSE must stay where it is. At some point, we might want to split the G3 development site off from that for MOOSE, but that seems awfully premature to me. I think that any split should come when we have a public release of something (with a GUI) that we can honestly call GENESIS 3. At present, GENESIS 3 is nothing more than the MOOSE core.
>
> As Upi and Greg have made clear, there is a lot more yet to be done in the way of MOOSE core code development. I think it would hurt collaboration, rather than foster it, to separate the GENESIS 3 site from that of MOOSE at this early stage.

My (twenty-)two cents:

What you call it IS important (see http://producingoss.com/html-chunk/index.html), but saying there's nothing to release is inaccurate. I suspect Upi has already been doing real science with Moose. Dieter may argue that putting some sort of GUI on what's already there is really just icing (fluff?). Certainly the SLI interface to Moose must qualify as something you'd consider calling Genesis?

There hasn't really been any feedback on my view of the Big Picture:

1. Moose - core library. Savvy users can put whatever front end on it they wish, using the swig interface. Does NOT provide a front end, not even a command line. It's a library.

2. Genesis3 - provides all the groovy front ends that Jim wants and the nasty back end that we're stuck with (SLI).

This provides some clear boundaries. Cohesion. Focus. The age of the monolithic software app is dead (Thank Crom!), so we should try to get with the program - at this early stage (actually 3 years ago). So, rip the SLI back out of Moose and make it communicate through the swig interface. Easier said than done, but done it must be. There are obviously a lot of other areas that need to be addressed and relegated to the right project.

> But, we do need a wiki installed on the moose-g3 Sourceforge site. Michael, can you work on this?

There's already one there - working. Just have to figure out what we want to put under it.

http://moose-g3.sourceforge.net/phpwiki/index.php

joe

js...@ya...
Software Engineer
Linux/OSX C/C++/Java
From: Dave B. <db...@do...> - 2006-09-07 18:38:29
Jim and Michael,

There are some decisions to be made, which have become more pressing due to the planned move of the genesis-sim.org server (biad181) from UTHSCSA to UTSA. There was a flurry of emails on August 17-18 and August 30-31 between Michael, Joe, Hugo, Upi, Mando, and me. I'll do my best to summarize the situation, and give some opinions. I've copied this email to the moose-g3-devel mailing list, so that it will go to everyone who was involved in this discussion, so that it will be archived, and so that they can point out omissions or errors in my account of what they said.

As I see it, two separate issues came up:

(1) Should the BABEL mailing list be moved to Sourceforge, as is now the case with the moose-g3-devel mailing list?

In any case, we should change from the present mailing list software, which is based on my own perl scripts and requires sendmail to be installed on the server. We need a standard, well-supported mailing list manager that deals automatically with bounced mail, subscribes/unsubscribes, etc. Michael suggested that there may be problems continuing to use the present system when the server is moved to the UTSA network. To me, this is the most urgent issue, because I don't want to see any interruption of service, or other problems, for our users.

Moving it to Sourceforge would be a very easy solution. We could simply set up a mailing list on the same Sourceforge site that we use for the GENESIS 2 source code repository and the user forums. With the exception of a short period of problems with sourceforge (see below), this worked out pretty well when we moved the genesis-dev mailing list to Sourceforge as "moose-g3-devel". Not only would this save Michael a lot of work, but it would help steer our users to the user forums, which haven't been getting as much use as they should. The GENESIS and BABEL web pages have links to this site for getting the latest GENESIS OS/X versions and updates from the repository. The disadvantage is that we would have to educate several hundred apathetic BABEL members to use a new email address at lists.sourceforge.net, and, if we grow unhappy with Sourceforge, then change it back again.

Alternatively, we could run the same mailing list software that Sourceforge uses (GNU MailMan) and host and manage it ourselves at UTSA. That would be more work for us, but would allow us to keep the same email address for the GENESIS Users Group. If I could believe that we are likely to stick with Sourceforge, I wouldn't object to moving it there, while keeping the BABEL website on the genesis-sim.org server.

So this raises some questions for Michael that may help us decide: When is the move likely to happen? How much time do we have before we are forced to do something? Is it a certainty that we can't run the present combination of perl and sendmail for a while, until we properly set up the new system? Is it a problem of administrators not wanting us to run our own mail server through the UTSA firewall? Would these same considerations prevent us from running the MailMan software ourselves from UTSA? How big of a deal is it to set up MailMan ourselves? It seems that once it is set up, it isn't any easier to manage the mailing list on Sourceforge than it would be to manage it on our own machine. The big advantage of Sourceforge is that they have it all installed and set up. Is that correct?

(2) Should the site for the MOOSE and GENESIS 3 development (including the moose-g3-devel mailing list and source code repository) be moved from the present Sourceforge site to our own server?

We went through a period of problems with Sourceforge around the beginning of August. There were frequent downtimes preventing access to the repository for the MOOSE source code, and a period when the moose-g3-devel mailing list wasn't getting archived. These problems have been resolved (at least for now). But this led to some discussion, aside from the issue of the reliability of Sourceforge, of whether we need more sophisticated tools for collaborative software development than Sourceforge can offer.

Joe pointed out that many of these needs can be met with a wiki, and it turns out that one can be used with Sourceforge. Joe has given Michael the information needed to set one up at the moose-g3 sourceforge site. Hugo argued that, in particular, we need a version control system more flexible than the ones supported by Sourceforge, one that is better for distributed development from many sites. Most of the others were happy to continue with the present Subversion system being hosted at Sourceforge, along with the mailing list.

Upi pointed out that there would be problems with his funding agencies if the MOOSE development site were moved from a neutral place like sourceforge to UTSA, or even to his own site. This makes it clear that the repository and development site for MOOSE must stay where it is. At some point, we might want to split the G3 development site off from that for MOOSE, but that seems awfully premature to me. I think that any split should come when we have a public release of something (with a GUI) that we can honestly call GENESIS 3. At present, GENESIS 3 is nothing more than the MOOSE core.

As Upi and Greg have made clear, there is a lot more yet to be done in the way of MOOSE core code development. I think it would hurt collaboration, rather than foster it, to separate the GENESIS 3 site from that of MOOSE at this early stage. But we do need a wiki installed on the moose-g3 Sourceforge site. Michael, can you work on this?

If any of you want to further discuss the issue of collaborative development tools for GENESIS/MOOSE, please post your comments to moo...@li..., so that they will be archived.

Dave
From: Upinder S. B. <bh...@nc...> - 2006-09-07 15:52:18
Hi, Joe,

I guess I got too used to seeing diagnostics scroll over the screen. The initial few error messages are easy to fix, and the last one can be ignored. So all seems well.

-- Upi

On Thu, September 7, 2006 8:43 pm, Josef Svitak said:
> The latest rev compiles, but with -DDO_UNIT_TESTS on the command line, I get:
>
> [moose-g3]$ ./moose
> Error: parseTriggerList: Could not find field 'reinitOut' on class 'KineticHub'
> Error: parseTriggerList: Could not find field 'sumTotMolIn' on class 'KineticHub'
> Error: parseTriggerList: Could not find field 'processTabOut' on class 'KineticHub'
> Error: parseTriggerList: Could not find field 'reinitTabOut' on class 'KineticHub'
> ValueFinfoBase::Error: Unable to initialize class Multi
> Checking shell->findElement................ done
> Checking shell::createFuncLocal(arg1, arg2)........ done
> Checking shell::setFuncLocal(arg1, arg2)...... done
> Checking shell::copyFuncLocal(arg1, arg2)...... done
> Shell Tests complete
>
> Checking wildcarding: Enumerated list. done
> Checking wildcarding: Trees, Types and Field equalities........ done
> Checking wildcarding: Numerical Field tests.......... done
> Wildcarding tests complete
>
> Testing Shell: parseArgs......... done
> Testing Scheduling
> doing reset
> starting run for 10 sec.
> ....................Scheduling tests complete
> Testing table: ................................... done
> Error: PlainMultiConn::disconnectAll
> moose >
>
> All is as expected?
From: Josef S. <js...@ya...> - 2006-09-07 15:13:48
The latest rev compiles, but with -DDO_UNIT_TESTS on the command line, I get:

[moose-g3]$ ./moose
Error: parseTriggerList: Could not find field 'reinitOut' on class 'KineticHub'
Error: parseTriggerList: Could not find field 'sumTotMolIn' on class 'KineticHub'
Error: parseTriggerList: Could not find field 'processTabOut' on class 'KineticHub'
Error: parseTriggerList: Could not find field 'reinitTabOut' on class 'KineticHub'
ValueFinfoBase::Error: Unable to initialize class Multi
Checking shell->findElement................ done
Checking shell::createFuncLocal(arg1, arg2)........ done
Checking shell::setFuncLocal(arg1, arg2)...... done
Checking shell::copyFuncLocal(arg1, arg2)...... done
Shell Tests complete

Checking wildcarding: Enumerated list. done
Checking wildcarding: Trees, Types and Field equalities........ done
Checking wildcarding: Numerical Field tests.......... done
Wildcarding tests complete

Testing Shell: parseArgs......... done
Testing Scheduling
doing reset
starting run for 10 sec.
....................Scheduling tests complete
Testing table: ................................... done
Error: PlainMultiConn::disconnectAll
moose >

All is as expected?

js...@ya...
Software Engineer
Linux/OSX C/C++/Java
From: Hugo C. <hug...@gm...> - 2006-08-30 15:25:48
Many issues raised during the parallelization thread are related to the functional decomposition of Moose. In the attachment you will find a proposal for a functional decomposition of G3, a slide that I made for a recent presentation about G2 and G3.

Some notes:

- Heccer, mentioned at the bottom of the slide, is the hsolve replacement that I am working on. It addresses several technical problems with hsolve. More about it later this week or next week or so.
- One of the principles is that all the boxes are well-separated components, accessible via interfaces. I have no problem if someone incorporates Heccer in Topographica f.i. (I would actually encourage this). So all components can be logically compiled, run and tested in isolation.
- I make a strict separation between what is above the scheduler and below the scheduler.
- Parallelization is not mentioned. Parallelization can be used at several components at the same time, or not at all. It is dependent on the implementation of components.
- A couple of things are missing:
  -- Some relationships are missing, f.i. I guess a link is needed between the model tools and the scripting facilities. A 2D diagram cannot represent everything...
  -- A component that deals with changes in the model during simulation time is missing. For my internal notes, I call this a model event generator, but that might be the wrong name.
  -- A solver factory that, given a piece of the model, maps it to a solver type and an instance of the solver. Such a solver factory is currently implemented as part of Neurospaces. A general solver factory will be a complicated piece of code, so I would like to separate this out.

One of the main targets of this design is the generation of an API over the different tiers in a neuronal simulator. A target of distributing the slide is to generate something that can be used as a communication medium for G3 developers. So comments welcome.

Hugo

On 8/29/06, Greg Hood <gh...@ps...> wrote:
> Upinder S. Bhalla writes:
> > Dear Greg et al,
> > Thanks for raising some very important and interesting points. I have not yet thought much about parallel model loading, because I don't have much idea about how much of a bottleneck it might be.
>
> Efficient parallel model loading (or setup) is definitely important; this sort of thing can quickly become the bottleneck in running a large simulation. Setup time is, of course, model-dependent, but one data point I can cite is the large PGENESIS cerebellar model that was run on our T3E several years ago by Fred Howell et al.: for their largest model (on 128 nodes) the setup time (done in parallel) was taking 65% as long as the time for doing the actual simulation.
>
> > Before I dive into the details, this is my earlier line of thought; please comment on it.
> >
> > 1. Threads: I had considered restricting multithreading to solvers, on a per-node basis, for some of the reasons Greg has outlined.
>
> While solvers should definitely be able to take advantage of multithreading, I'm uncomfortable restricting everything else to just one thread. For instance, the GUI will likely need a variable number of threads to make programming easier. I also like doing network I/O and large file I/O as separate threads so that they do not freeze up the system if delays occur.
>
> > 2. RelativeFind etc: I had considered caching info on the postmaster to speed up the process of finding remote objects, and grouping requests for remote-node element info.
>
> Yes, those things help. If the cache is required to give 100% accurate information (as opposed to hints with no guarantee of correctness), then cache consistency issues have to be dealt with, since elements can come into and out of existence. If elements are allowed to be movable between nodes, this gets messier.
>
> > 3. Parallel model building: I thought that almost all cases where this would be critical would be through special calls like createmap, region-connect, and perhaps copy. Most of these can be rather cleanly done in parallel with minimal internode communication. However, a global checkpoint would be needed to ensure synchrony between these calls.
>
> "region-connect" will almost certainly require a lot of internode communication. And while we can anticipate the most common patterns of connectivity (as GENESIS 2 did with planarconnect and volumeconnect), there will be a significant number of people who want to specify connections some other way, and they have to resort to connecting up many elements individually. We need to let them do this in a way that happens in parallel.
>
> > I should also add that the divide between setup time and runtime is probably not so clean and we will definitely need to figure out efficient ways of handling this. For example, in signalling simulations I have already had issues where new organelles are budding off and being destroyed at runtime.
>
> I think this is a very important point, and I completely agree. It would be possible to get better simulation performance if we assumed a sharp division between the model construction and simulation phases, and compiled the model down to a super-efficient simulatable form, but then this makes it more difficult to dynamically view or alter the models at runtime. And, as you mentioned, real cellular processes occur that are best modeled as structural changes to the model rather than numerical changes to already existing parameters.
>
> > To consider Greg's points:
> > > The greatest concern I have is with the many places in the basecode that make an implicit assumption that elements are locally resident in the node's memory, and that only one thread will be actively modifying them. (...) Some form of locking will thus be needed (probably on a per-Element basis).
> > Couldn't we put a lock at an appropriate place in an element tree, but permit other element trees to be accessed safely?
>
> As long as the element tree is located entirely on a single node, this should work. I don't think locking subtrees that are distributed over multiple nodes would be desirable because of performance and deadlock issues.
>
> > > The most troublesome situations will be when modifications are being made to the element tree, such as when new elements are being created or old ones destroyed.
> > Can we have a lock set whenever 'dangerous' commands are being executed? Most commands at runtime are relatively safe.
> >
> > > One solution may be to standardize at the .mh level.... This approach might make sense if nearly all the visualization and other add-on code would be at the .mh level or higher, but not if those things require major changes to the existing basecode.
> >
> > I'm not sure what you have in mind here. To me it looks like all the locking stuff should be done at the basecode level, so the user does not need to know about it even if they are developing new objects using .mh files. Could you expand on it?
>
> I wasn't suggesting that locking should be done in the .mh files. What I was suggesting is that the syntax/semantics of the .mh level should be cleaned up and more or less frozen. This would allow development to proceed at the .mh level and higher (GUIs, new modeling primitives, solvers(?), etc.) using the current MOOSE kernel (with some modifications). We can then plug in a parallel-capable kernel at a later time and get everything to run on parallel systems. People writing at the .mh level and higher should be writing code that is independent of the hardware characteristics, such as the number of nodes available, or the relative performance of the nodes. An example of a change that would be necessary at the .mh level would be to use something like "ElementID" or "ElementHandle" instead of "Element *", because Element* makes an implicit assumption that the Element is located on the same node as where the code is executing. The current MOOSE could just define ElementID to Element*, but a parallel implementation could define it to something else (e.g., pair<NodeID, uint64_t>).
>
> --Greg

--
Hugo Cornelis Ph.D.
Research Imaging Center
University of Texas Health Science Center at San Antonio
7703 Floyd Curl Drive
San Antonio, TX 78284-6240
Phone: 210 567 8112
Fax: 210 567 8152
From: Greg H. <gh...@ps...> - 2006-08-29 23:24:07
|
Upinder S. Bhalla writes: > Dear Greg et al, > Thanks for raising some very important and interesting points. I have > not yet thought much about parallel model loading, because I don't have > much idea about how much of a bottleneck it might be. Efficient parallel model loading (or setup) is definitely important; this sort of thing can quickly become the bottleneck in running a large simulation. Setup time is, of course, model-dependent, but one data point I can cite is the large PGENESIS cerebellar model that was run on our T3E several years ago by Fred Howell et al.: -- for their largest model (on 128 nodes) the setup time (done in parallel) was taking 65% as long as the time for doing the actual simulation. > Before I dive into > the details, this is my earlier line of thought; please comment on it. > > 1. Threads: I had considered restricting multithreading to solvers, on a > per-node basis, for some of the reasons Greg has outlined. While solvers should definitely be able to take advantage of multithreading, I'm uncomfortable restricting everything else to just one thread. For instance, the GUI will likely need a variable number of threads to make programming easier. I also like doing network I/O and large file I/O as separate threads so that they do not freeze up the system if delays occur. > 2. RelativeFind etc: I had considered caching info on the postmaster to > speed up the process of finding remote objects, and grouping requests for > remote-node element info. Yes, those things help. If the cache is required to give 100% accurate information (as opposed to hints with no guarantee of correctness), then cache consistency issues have to be dealt with, since elements can come into and out of existence. If elements are allowed to be movable between nodes, this gets messier. > 3. Parallel model building: I thought that almost all cases where this > would be critical would be through special calls like createmap, > region-connect, and perhaps copy. Most of these can be rather cleanly done > in parallel with minimal internode communication. However, a global > checkpoint would be needed to ensure synchrony between these calls. "region-connect" will almost certainly require a lot of internode communication. And while we can anticipate the most common patterns of connectivity (as GENESIS 2 did with planarconnect and volumeconnect), there will be a significant number of people who want to specify connections some other way, and they have to resort to connecting up many elements individually. We need to let them do this in a way that happens in parallel. > I should also add that the divide between setup time and runtime is > probably not so clean and we will definitely need to figure out efficient > ways of handling this. For example, in signalling simulations I have > already had issues where new organelles are budding off and being > destroyed at runtime. I think this is a very important point, and I completely agree. It would be possible to get better simulation performance if we assumed a sharp division between the model construction and simulation phases, and compiled the model down to a super-efficient simulatable form, but then this makes it more difficult to dynamically view or alter the models at runtime. And, as you mentioned, real cellular processes occur that are best modeled as structural changes to the model rather than numerical changes to already existing parameters. 
> To consider Greg's points:
>
> > The greatest concern I have is with the many places in the basecode that
> > make an implicit assumption that elements are locally resident in the
> > node's memory, and that only one thread will be actively
> > modifying them. (...)
> > Some form of locking will thus be needed (probably on a per-Element basis).
>
> Couldn't we put a lock at an appropriate place in an element tree, but
> permit other element trees to be accessed safely?

As long as the element tree is located entirely on a single node, this should work. I don't think locking subtrees that are distributed over multiple nodes would be desirable, because of performance and deadlock issues.

> > The most troublesome situations will be when
> > modifications are being made to the element tree, such as when new
> > elements are being created or old ones destroyed.
>
> Can we have a lock set whenever 'dangerous' commands are being executed?
> Most commands at runtime are relatively safe.
>
> > One solution may be to standardize at the .mh level.... This approach
> > might make sense if nearly all the visualization and other add-on code
> > would be at the .mh level or higher, but not if those things require
> > major changes to the existing basecode.
>
> I'm not sure what you have in mind here. To me it looks like all the
> locking stuff should be done at the basecode level, so the user does not
> need to know about it even if they are developing new objects using .mh
> files. Could you expand on it?

I wasn't suggesting that locking should be done in the .mh files. What I was suggesting is that the syntax/semantics of the .mh level should be cleaned up and more or less frozen. This would allow development to proceed at the .mh level and higher (GUIs, new modeling primitives, solvers(?), etc.) using the current MOOSE kernel (with some modifications). We can then plug in a parallel-capable kernel at a later time and get everything to run on parallel systems. People writing at the .mh level and higher should be writing code that is independent of the hardware characteristics, such as the number of nodes available or the relative performance of the nodes. An example of a change that would be necessary at the .mh level would be to use something like "ElementID" or "ElementHandle" instead of "Element *", because Element* makes an implicit assumption that the Element is located on the same node where the code is executing. The current MOOSE could just define ElementID to Element*, but a parallel implementation could define it to something else (e.g., pair<NodeID, uint64_t>).

--Greg
|
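The per-Element locking discipline described above can be made concrete with a small hypothetical sketch (not MOOSE code; C++11 locking primitives stand in for what would have been boost::mutex at the time). The key rule is the one Greg states: any method that may block on off-node communication must release the element's lock first and leave the element in a consistent state.

#include <mutex>
#include <string>

// Hypothetical sketch of per-Element locking -- not from the MOOSE tree.
class LockableElement {
public:
    // A purely local operation: hold this element's own lock for the
    // duration; no other element's lock is taken, so no deadlock cycle.
    void setName( const std::string& n ) {
        std::lock_guard< std::mutex > guard( mutex_ );
        name_ = n;
    }

    // An operation that may need off-node information. The lock must be
    // RELEASED around the blocking call, because incoming requests from
    // other nodes may touch this element while we wait.
    bool relativeFindRemote( const std::string& path ) {
        {
            std::lock_guard< std::mutex > guard( mutex_ );
            // ... read whatever local state the request needs, then let
            // the guard drop the lock at the end of this scope ...
        }
        bool found = askRemoteNode( path );   // blocking IPC; lock NOT held
        std::lock_guard< std::mutex > guard( mutex_ );
        // ... re-validate: the element tree may have changed while we
        // were blocked ...
        return found;
    }

private:
    bool askRemoteNode( const std::string& ) { return false; }   // stub
    std::mutex mutex_;
    std::string name_;
};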
From: Josef S. <js...@ya...> - 2006-08-29 09:50:32
|
--- Michael Edwards <mie...@gm...> wrote:
> Hardware is also moving in an increasingly thread-optimized direction,
> so making moose thread-friendly will go a long way toward making it run
> well on future machines.

Agreed. There is definitely some work to do in this regard. Keep in mind we didn't gain much by using the STL: the most you can even hope for in any STL implementation is that 1) multiple readers are safe, and 2) multiple writers to _different_ containers are safe. That ain't much, and even these aren't guaranteed.

I think the boost (boost.org) libraries would be particularly helpful here, especially the portable threads and reference-counted pointers. (Sorry, I'm a geek. This is a development mailing list, after all.)

joe

js...@ya...
Software Engineer
Linux/OSX C/C++/Java
|
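A concrete illustration of the STL point (a hypothetical sketch, not MOOSE code, using the C++11 standard-library descendants of the boost facilities mentioned above): concurrent writers to the same container need an external mutex, and a reference-counted pointer keeps a shared object alive across threads.

#include <iostream>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Two writers appending to the SAME vector: the STL gives no guarantee
// here, so an external mutex must serialize the writes.
std::vector< int > sharedVec;
std::mutex vecMutex;

void writer( int base ) {
    for ( int i = 0; i < 1000; ++i ) {
        std::lock_guard< std::mutex > guard( vecMutex );
        sharedVec.push_back( base + i );
    }
}

int main() {
    // Reference-counted ownership (boost::shared_ptr in the 2006 boost
    // libraries, std::shared_ptr today): the object stays alive until the
    // last holder is done, which avoids cross-thread lifetime bugs.
    std::shared_ptr< int > token = std::make_shared< int >( 42 );

    std::thread t1( writer, 0 );
    std::thread t2( writer, 1000 );
    t1.join();
    t2.join();

    // Prints 2000 deterministically, because the mutex serialized writers.
    std::cout << sharedVec.size() << " items, token=" << *token << "\n";
    return 0;
}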
From: Josef S. <js...@ya...> - 2006-08-29 09:34:25
|
Hi Greg,

--- Greg Hood <gh...@ps...> wrote:
> Upi,
> I have been looking at the MOOSE code, and thinking about certain issues
> involved in parallelizing it, and have some serious concerns.
>
> The greatest concern I have is with the many places in the basecode
> that make an implicit assumption that elements are locally resident in
> the node's memory, and that only one thread will be actively
> modifying them. For example, if the elements are distributed over
> many nodes, then Element::relativeFind() will potentially require
> information on 2 or more nodes. This will cause the code to block for
> indefinite periods of time while the interprocess communication is
> performed and the remote nodes do what they need to do. The simplest
> way of dealing with this would be to allow only one active thread over
> the entire set of nodes on which MOOSE is running. However, this
> would be disastrous in terms of performance -- network setup would be
> much slower than doing it on a single node. If we allow multiple
> active threads on each node to avoid the performance hit, then every
> method that directly or indirectly calls one of these methods that
> require off-node information will potentially block. While this
> occurs, incoming requests from other nodes must be handled, and some
> of those may involve the Element in question. Some form of locking will
> thus be needed (probably on a per-Element basis). The difficult thing
> is that each of the places in the code where a potentially blocking
> call will occur will have to release the Element lock, and must leave
> the Element (as well as any kernel data structures) in a safe and
> consistent state. I can't see this being done without rewriting many
> sections of code. The most troublesome situations will be when
> modifications are being made to the element tree, such as when new
> elements are being created or old ones destroyed. Once the network is
> set up, things may not be so bad, but the network needs to get set up
> in order to run it.

Moose has a solid foundation using decent design patterns. Since these patterns were introduced, several hurricanes struck. My guess is that among them were lack of a cohesive and comprehensive architectural design, personnel changes, premature optimization, and just wanting to get some pieces done.

I was going to warn you about these issues. I think it's completely unreasonable to try to parallelize the code in its current incarnation. In a nutshell, it's just entirely too tightly coupled, with very little apparent cohesion in any of the classes: Elements contain connections, connections contain Elements, Fields contain connections and Elements, etc., etc. This has led to the dreaded "header.h", a.k.a. KitchenSink.h.

I've been trying to get the code back in line with the (maybe just apparent) original design patterns. It's difficult. What's most intimidating is that all roads lead to rewriting the moose preprocessor and subsequently regenerating a bunch of code from the .mh files. The problem is that many of the generated files have since been edited by hand. Ugh.

The core moose needs to be a library -- something to be used by programmers. Creating libraries requires greater attention to implementation details in order to provide accepted and expected behaviors. We need to be able to present a clean API to this library, ideally through swig. The genesis parser, ReadCell, Plot, etc. should use the API to access the core library.
However, core architectural issues must be addressed before this will be attainable.

> [...]
> --Greg

joe

js...@ya...
Software Engineer
Linux/OSX C/C++/Java
|
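Two small steps toward the decoupling described above, as a hypothetical sketch (all names are invented for illustration; nothing here is from the MOOSE tree): a forward declaration breaks the Element/Connection header cycle that feeds a kitchen-sink header, and an opaque handle gives clients such as the genesis parser, ReadCell, Plot, or a swig wrapper a narrow API instead of the internal classes.

#include <string>

// (1) Break header cycles with forward declarations. Connection holds
// only pointers to Elements, so it never needs element.h.
class Element;                        // forward declaration, NOT an #include

class Connection {
public:
    Connection( Element* src, Element* dst ) : src_( src ), dst_( dst ) {}
    Element* source() const { return src_; }
    Element* target() const { return dst_; }
private:
    Element* src_;
    Element* dst_;
};

// (2) Expose a narrow, opaque API to external clients; Element,
// Connection and friends stay private to the core library, and a swig
// wrapper would be generated from declarations like these.
class MooseHandle;                    // opaque; defined inside the library

MooseHandle* mooseCreate  ( const std::string& type, const std::string& path );
bool         mooseSetField( MooseHandle* h, const std::string& field,
                            const std::string& value );
std::string  mooseGetField( MooseHandle* h, const std::string& field );
void         mooseDestroy ( MooseHandle* h );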