From: Cody P. <cod...@gm...> - 2012-07-22 02:17:21
On Fri, Jul 20, 2012 at 10:19 PM, Roy Stogner <roy...@ic...> wrote:

>
> On Fri, 20 Jul 2012, Kirk, Benjamin (JSC-EG311) wrote:
>
> > On 7/20/12 3:17 PM, "Roy Stogner" <roy...@ic...> wrote:
> >
> >>> Why not print a trace to the screen when encountering an error and
> >>> running serially, but failing back to the current trace files
> >>> behavior when running in parallel?
> >>
> >> I can't *believe* I didn't think of that.
> >
> > Don't be so hard on yourself - based on my memory both you and John should
> > still be asymptoting back to your pre-offspring sensibilities.
>
> What's weird is that I don't feel as tired this time as I did when my
> eldest was a month and a half old and I was afraid of falling asleep
> on the commute to work... but if I actually count up all the stupid
> things I've done in just the past couple weeks, it's significantly
> worse than before; I'm forced to conclude that my intelligence has
> fallen so low that it's not even competent at self-estimation anymore.
>
> > Agree that's a great solution.
>
> And much easier and safer than trying to make OStreamProxy MPI-aware.
> I don't think there'd be any *real* obstacles making that solution
> thread-safe and properly memory-managed, but there would certainly be
> places we could slip up.
>
> On the subject of sleep-deprivation-stupidity and tricky memory
> management: if anyone wants to code review the Parallel:: tricks I
> added a couple weeks ago, I would appreciate it.
>
> Parallel::Request::add_post_wait_work() is for allowing arbitrary
> callback functors to be attached to a Request object so that the
> wait() can do things like cleaning up temporary buffers after an
> asynchronous send. We had a nasty race condition there before.
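
For readers unfamiliar with the pattern, the post-wait callback idea
described above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not libMesh's actual
implementation: SketchRequest and sketch_send are hypothetical names,
no MPI call is made, and the code uses C++11 features (std::function,
lambdas) for brevity.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an asynchronous request handle. This is NOT
// libMesh's Parallel::Request; it only illustrates attaching work to wait().
class SketchRequest
{
public:
  // Attach a callback to be run once the request has completed.
  void add_post_wait_work(std::function<void()> work)
  {
    post_wait_work_.push_back(std::move(work));
  }

  // A real implementation would first block on the underlying
  // communication (e.g. with MPI_Wait); this sketch only runs the
  // attached callbacks, in the order they were added.
  void wait()
  {
    for (auto & work : post_wait_work_)
      work();
    post_wait_work_.clear();  // each callback runs at most once
  }

private:
  std::vector<std::function<void()>> post_wait_work_;
};

// Simulates an asynchronous send that needs a temporary buffer. The buffer
// must stay alive until wait() is called, so its cleanup is attached to the
// request instead of happening when this function returns.
inline SketchRequest sketch_send(const std::vector<int> & data,
                                 bool & buffer_freed)
{
  auto buffer = std::make_shared<std::vector<int>>(data);  // temporary copy
  buffer_freed = false;

  SketchRequest req;
  req.add_post_wait_work([buffer, &buffer_freed]() mutable {
    buffer.reset();        // drop the last reference to the temporary buffer
    buffer_freed = true;   // record the cleanup so the caller can observe it
  });
  return req;
}
```

The point of the design is that the temporary buffer's lifetime is tied
to the request rather than to the scope that started the send, so it
cannot be released while the send may still be in flight.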
>
> For the other new code (which is also now on a critical path:
> MeshInput of a serial format in a parallel run) search parallel.h and
> mesh_communication.C for "packed_range", and see the specializations
> in packed_node.C and packed_elem.C and the utility class in
> mesh_inserter_iterator.h. Basically this let us shave mesh broadcasts
> down to 50 lines with sexy code like:
>
> for (unsigned int l=0; l != n_levels; ++l)
>   Parallel::broadcast_packed_range(&mesh,
>                                    mesh.level_elements_begin(l),
>                                    mesh.level_elements_end(l),
>                                    &mesh,
>                                    mesh_inserter_iterator<Elem>(mesh));
>

Oh wow, very nice. I could actually use something like this in a couple
of classes I recently built. Except my needs are even more
straightforward. I just have lists of nodes (std::vector<Node *>) that
I'd like to gather on all processors. Right now I'm just passing Ids and
extracting them on each processor but it sounds like with the right
function call, I could avoid that with all the fancy broadcast stuff
you've built. Will this work outside of the mesh? I haven't really
looked into it yet.

> which handles communicating everything from boundary conditions to
> neighbor topology to element DoF indices, using generic code that
> ought to be easily extensible to other variable-size data types too.
> I left PackedNode and PackedElem around for backwards compatibility,
> but they could be eliminated if/when the Idaho folks want to use
> packed_range based code instead.
>

I bet we can take a look so that we can do some housekeeping here. We
don't have this code scattered too far and wide.

Cody

>
> In hindsight I should have posted such a significant patch to
> libmesh-devel before committing, despite it passing my tests. It was
> originally only going to be affecting distributed ParallelMesh code
> paths, and after realizing that it could greatly simplify SerialMesh
> communication too I forgot to reconsider the question of whether or
> not I should get additional eyes on it.
> ---
> Roy