From: Roy S. <roy...@ic...> - 2012-07-20 19:39:34
On Fri, 20 Jul 2012, Cody Permann wrote:

> On Fri, Jul 20, 2012 at 12:34 PM, Roy Stogner <roy...@ic...> wrote:
>
>> What's the reason why we need a separate print_trace() at
>> parameters.C:441? Shouldn't the libmesh_error() be throwing an
>> exception which takes you to the libmesh_terminate_handler() which
>> does a libmesh_write_traceout() anyway?
>
> Well, I just had a discussion with John about this and he gave me a
> little background. I guess you don't like to print traces in
> general because of issues with jumbled terminals in parallel.

Right - if we print to stderr then the results are basically gibberish
on more than a handful of processors.

... with any MPI stack I've seen, anyway. It seems like it should be
obvious for an MPI implementation to interleave only when a buffer is
flushed, but I suppose at the MPI level there may be no way to tell
when an output buffer is explicitly being flushed or when it just
happens to start putting out real data.

We could possibly implement such behavior ourselves, by modifying
OStreamProxy to be MPI-aware... operator<< on any processor could
buffer things into discrete asynchronous sends to proc 0, and
operator<< on proc 0 could probe for such sends... but that would fail
poorly for debugging in cases where proc 0 goes into an infinite loop
or a blocking wait and doesn't pass along important information from
other processors. There's no way to have a "callback" function
activated by an incoming MPI message, is there?

> It appears that we have to configure to turn on trace files to see
> these stack traces.

Right.

> I was creating lots of new types a few weeks ago and found the stack
> trace useful in that location, but if that's not the libMesh way
> then I suppose it can be removed.

Well, I wasn't *just* trying to be passive-aggressive here. It would
be good not to have a potentially-redundant print_trace() before one
particular libmesh_error(), I think, but the fact that you needed it
in the first place suggests that we ought to be doing something to
improve the libmesh_error() behavior. Maybe enabling tracefiles by
default rather than requiring --enable-tracefiles and/or
--enable-everything? Opinions or other suggestions would be welcome.
---
Roy
From: Cody P. <cod...@gm...> - 2012-07-20 19:52:32
On Fri, Jul 20, 2012 at 1:39 PM, Roy Stogner <roy...@ic...> wrote:

> It would be good not to have a potentially-redundant print_trace()
> before one particular libmesh_error(),

Yeah - I agree with this.

> I think, but the fact that you needed it in the first place suggests
> that we ought to be doing something to improve the libmesh_error()
> behavior. Maybe enabling tracefiles by default rather than requiring
> --enable-tracefiles and/or --enable-everything?

I don't think this is a very good idea. I don't know if normal users
want to deal with files appearing on their systems during the normal
coding/debugging cycle.

> Opinions or other suggestions would be welcome.

Why not print a trace to the screen when encountering an error and
running serially, but falling back to the current trace file behavior
when running in parallel?

Cody
From: Roy S. <roy...@ic...> - 2012-07-20 20:17:11
On Fri, 20 Jul 2012, Cody Permann wrote:

> Why not print a trace to the screen when encountering an error and
> running serially, but falling back to the current trace file
> behavior when running in parallel?

I can't *believe* I didn't think of that. Sounds like a pretty
uniform improvement, so long as we do "both screen and
file-if-so-configured" in serial for consistency.

If you want to implement/test/commit this go right ahead; if not I'll
try to get to it this weekend.

Thanks,
---
Roy
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2012-07-20 20:23:39
On 7/20/12 3:17 PM, "Roy Stogner" <roy...@ic...> wrote:

>> Why not print a trace to the screen when encountering an error and
>> running serially, but falling back to the current trace file
>> behavior when running in parallel?
>
> I can't *believe* I didn't think of that.

Don't be so hard on yourself - based on my memory both you and John
should still be asymptoting back to your pre-offspring sensibilities.

Agree that's a great solution.

-Ben
From: Roy S. <roy...@ic...> - 2012-07-21 04:19:22
On Fri, 20 Jul 2012, Kirk, Benjamin (JSC-EG311) wrote:

> On 7/20/12 3:17 PM, "Roy Stogner" <roy...@ic...> wrote:
>
>>> Why not print a trace to the screen when encountering an error and
>>> running serially, but falling back to the current trace file
>>> behavior when running in parallel?
>>
>> I can't *believe* I didn't think of that.
>
> Don't be so hard on yourself - based on my memory both you and John
> should still be asymptoting back to your pre-offspring sensibilities.

What's weird is that I don't feel as tired this time as I did when my
eldest was a month and a half old and I was afraid of falling asleep
on the commute to work... but if I actually count up all the stupid
things I've done in just the past couple weeks, it's significantly
worse than before; I'm forced to conclude that my intelligence has
fallen so low that it's not even competent at self-estimation anymore.

> Agree that's a great solution.

And much easier and safer than trying to make OStreamProxy MPI-aware.
I don't think there'd be any *real* obstacles to making that solution
thread-safe and properly memory-managed, but there would certainly be
places we could slip up.

On the subject of sleep-deprivation-stupidity and tricky memory
management: if anyone wants to code review the Parallel:: tricks I
added a couple weeks ago, I would appreciate it.

Parallel::Request::add_post_wait_work() is for allowing arbitrary
callback functors to be attached to a Request object so that the
wait() can do things like cleaning up temporary buffers after an
asynchronous send. We had a nasty race condition there before.

For the other new code (which is also now on a critical path:
MeshInput of a serial format in a parallel run) search parallel.h and
mesh_communication.C for "packed_range", and see the specializations
in packed_node.C and packed_elem.C and the utility class in
mesh_inserter_iterator.h. Basically this let us shave mesh broadcasts
down to 50 lines with sexy code like:

  for (unsigned int l=0; l != n_levels; ++l)
    Parallel::broadcast_packed_range(&mesh,
                                     mesh.level_elements_begin(l),
                                     mesh.level_elements_end(l),
                                     &mesh,
                                     mesh_inserter_iterator<Elem>(mesh));

which handles communicating everything from boundary conditions to
neighbor topology to element DoF indices, using generic code that
ought to be easily extensible to other variable-size data types too.
I left PackedNode and PackedElem around for backwards compatibility,
but they could be eliminated if/when the Idaho folks want to use
packed_range based code instead.

In hindsight I should have posted such a significant patch to
libmesh-devel before committing, despite it passing my tests. It was
originally only going to affect distributed ParallelMesh code paths,
and after realizing that it could greatly simplify SerialMesh
communication too I forgot to reconsider the question of whether or
not I should get additional eyes on it.
---
Roy
From: Cody P. <cod...@gm...> - 2012-07-22 02:17:21
On Fri, Jul 20, 2012 at 10:19 PM, Roy Stogner <roy...@ic...> wrote:

>   for (unsigned int l=0; l != n_levels; ++l)
>     Parallel::broadcast_packed_range(&mesh,
>                                      mesh.level_elements_begin(l),
>                                      mesh.level_elements_end(l),
>                                      &mesh,
>                                      mesh_inserter_iterator<Elem>(mesh));

Oh wow, very nice. I could actually use something like this in a
couple of classes I recently built. Except my needs are even more
straightforward: I just have lists of nodes (std::vector<Node *>)
that I'd like to gather on all processors. Right now I'm just passing
ids and extracting them on each processor, but it sounds like with
the right function call I could avoid that with all the fancy
broadcast stuff you've built. Will this work outside of the mesh? I
haven't really looked into it yet.

> I left PackedNode and PackedElem around for backwards compatibility,
> but they could be eliminated if/when the Idaho folks want to use
> packed_range based code instead.

I bet we can take a look so that we can do some housekeeping here. We
don't have this code scattered too far and wide.

Cody

_______________________________________________
Libmesh-devel mailing list
Lib...@li...
https://lists.sourceforge.net/lists/listinfo/libmesh-devel
From: Cody P. <cod...@gm...> - 2012-07-22 04:15:36
On Sat, Jul 21, 2012 at 8:17 PM, Cody Permann <cod...@gm...> wrote:

> Oh wow, very nice. I could actually use something like this in a
> couple of classes I recently built. Except my needs are even more
> straightforward: I just have lists of nodes (std::vector<Node *>)
> that I'd like to gather on all processors. Right now I'm just
> passing ids and extracting them on each processor, but it sounds
> like with the right function call I could avoid that with all the
> fancy broadcast stuff you've built. Will this work outside of the
> mesh? I haven't really looked into it yet.

Well, on second thought, that doesn't really make sense. Asking the
(serial) mesh for a pointer is always going to work better than
copying a node object across MPI processes. Still, I really like
these new utilities. Way cool!

Cody
From: Roy S. <roy...@ic...> - 2012-07-22 04:26:55
On Sat, 21 Jul 2012, Cody Permann wrote:

> Oh wow, very nice. I could actually use something like this in a
> couple of classes I recently built. Except my needs are even more
> straightforward: I just have lists of nodes (std::vector<Node *>)
> that I'd like to gather on all processors. Right now I'm just
> passing ids and extracting them on each processor, but it sounds
> like with the right function call I could avoid that with all the
> fancy broadcast stuff you've built.

Well, if the nodes already exist and are up to date on each
processor, then the most efficient way to pass a list of them from
processor to processor is still to pass a list of ids and
MeshBase::get_node for each.

If you don't necessarily have the nodes on each processor (the
distributed ParallelMesh situation) then the packed_range code is the
way to go.

If you have the nodes on each processor but their information
(locations, associated boundary condition ids) is being updated by
their owning processor, then the packed_range code is probably the
best way to propagate that, but it might need some adjustment - IIRC
I've currently got the unpack code loaded with assertions that any
received data corresponding to an existing node or element is
consistent with the existing data.

> Will this work outside of the mesh? I haven't really looked into
> it yet.

Hmmm... with the current design, there are some side effects due to
backwards compatibility: because we don't add boundary conditions (or
in the case of elements, update topology) with
MeshBase::add_node/elem, I'm forced to do those things in the
Parallel::unpack, and so if you don't want to modify your Mesh on the
receiving end then you'd have to give the receive function a new
"dummy" Mesh to work with.

>> I left PackedNode and PackedElem around for backwards
>> compatibility, but they could be eliminated if/when the Idaho
>> folks want to use packed_range based code instead.
>
> I bet we can take a look so that we can do some housekeeping here.
> We don't have this code scattered too far and wide.

No rush; just let me know.
---
Roy