From: Roy S. <roy...@ic...> - 2012-07-20 19:39:34
On Fri, 20 Jul 2012, Cody Permann wrote:

> On Fri, Jul 20, 2012 at 12:34 PM, Roy Stogner <roy...@ic...> wrote:
>
>> What's the reason why we need a separate print_trace() at
>> parameters.C:441? Shouldn't the libmesh_error() be throwing an
>> exception which takes you to the libmesh_terminate_handler() which
>> does a libmesh_write_traceout() anyway?
>
> Well, I just had a discussion with John about this and he gave me a
> little background. I guess you don't like to print traces in
> general because of issues with jumbled terminals in parallel.

Right - if we print to stderr then the results are basically gibberish
on more than a handful of processors.

... with any MPI stack I've seen, anyway. It seems like it should be
obvious for an MPI implementation to interleave only when a buffer is
flushed, but I suppose at the MPI level there may be no way to tell
when an output buffer is explicitly being flushed or when it just
happens to start putting out real data.

We could possibly implement such behavior ourselves, by modifying
OStreamProxy to be MPI-aware... operator<< on any processor could
buffer things into discrete asynchronous sends to proc 0, and
operator<< on proc 0 could probe for such sends... but that would fail
poorly for debugging in cases where proc 0 goes into an infinite loop
or a blocking wait and doesn't pass along important information from
other processors. There's no way to have a "callback" function
activated by an incoming MPI message, is there?

> It appears that we have to configure to turn on trace files to see
> these stack traces.

Right.

> I was creating lots of new types a few weeks ago and found the stack
> trace useful in that location, but if that's not the libMesh way
> then I suppose it can be removed.

Well, I wasn't *just* trying to be passive-aggressive here. It would
be good not to have a potentially-redundant print_trace() before one
particular libmesh_error(), I think, but the fact that you needed it
in the first place suggests that we ought to be doing something to
improve the libmesh_error() behavior. Maybe enabling tracefiles by
default rather than requiring --enable-tracefiles and/or
--enable-everything? Opinions or other suggestions would be welcome.
---
Roy
From: Cody P. <cod...@gm...> - 2012-07-20 19:52:32
On Fri, Jul 20, 2012 at 1:39 PM, Roy Stogner <roy...@ic...> wrote:

> It would be good not to have a potentially-redundant print_trace()
> before one particular libmesh_error(),

Yeah - I agree with this.

> I think, but the fact that you needed it in the first place suggests
> that we ought to be doing something to improve the libmesh_error()
> behavior. Maybe enabling tracefiles by default rather than requiring
> --enable-tracefiles and/or --enable-everything?

I don't think this is a very good idea. I don't know if normal users
want to deal with files appearing on their systems during the normal
coding/debugging cycle.

> Opinions or other suggestions would be welcome.

Why not print a trace to the screen when encountering an error and
running serially, but falling back to the current trace file behavior
when running in parallel?

Cody
From: Roy S. <roy...@ic...> - 2012-07-20 20:17:11
On Fri, 20 Jul 2012, Cody Permann wrote:

> Why not print a trace to the screen when encountering an error and
> running serially, but falling back to the current trace file
> behavior when running in parallel?

I can't *believe* I didn't think of that. Sounds like a pretty
uniform improvement, so long as we do "both screen and
file-if-so-configured" in serial for consistency.

If you want to implement/test/commit this go right ahead; if not I'll
try to get to it this weekend.

Thanks,
---
Roy
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2012-07-20 20:23:39
On 7/20/12 3:17 PM, "Roy Stogner" <roy...@ic...> wrote:

>> Why not print a trace to the screen when encountering an error and
>> running serially, but falling back to the current trace file
>> behavior when running in parallel?
>
> I can't *believe* I didn't think of that.

Don't be so hard on yourself - based on my memory both you and John
should still be asymptoting back to your pre-offspring sensibilities.

Agree that's a great solution.

-Ben
From: Roy S. <roy...@ic...> - 2012-07-21 04:19:22
On Fri, 20 Jul 2012, Kirk, Benjamin (JSC-EG311) wrote:

> On 7/20/12 3:17 PM, "Roy Stogner" <roy...@ic...> wrote:
>
>>> Why not print a trace to the screen when encountering an error and
>>> running serially, but falling back to the current trace file
>>> behavior when running in parallel?
>>
>> I can't *believe* I didn't think of that.
>
> Don't be so hard on yourself - based on my memory both you and John
> should still be asymptoting back to your pre-offspring sensibilities.

What's weird is that I don't feel as tired this time as I did when my
eldest was a month and a half old and I was afraid of falling asleep
on the commute to work... but if I actually count up all the stupid
things I've done in just the past couple weeks, it's significantly
worse than before; I'm forced to conclude that my intelligence has
fallen so low that it's not even competent at self-estimation anymore.

> Agree that's a great solution.

And much easier and safer than trying to make OStreamProxy MPI-aware.
I don't think there'd be any *real* obstacles to making that solution
thread-safe and properly memory-managed, but there would certainly be
places we could slip up.

On the subject of sleep-deprivation-stupidity and tricky memory
management: if anyone wants to code review the Parallel:: tricks I
added a couple weeks ago, I would appreciate it.

Parallel::Request::add_post_wait_work() is for allowing arbitrary
callback functors to be attached to a Request object so that the
wait() can do things like cleaning up temporary buffers after an
asynchronous send. We had a nasty race condition there before.

For the other new code (which is also now on a critical path:
MeshInput of a serial format in a parallel run) search parallel.h and
mesh_communication.C for "packed_range", and see the specializations
in packed_node.C and packed_elem.C and the utility class in
mesh_inserter_iterator.h. Basically this let us shave mesh broadcasts
down to 50 lines with sexy code like:

  for (unsigned int l=0; l != n_levels; ++l)
    Parallel::broadcast_packed_range(&mesh,
                                     mesh.level_elements_begin(l),
                                     mesh.level_elements_end(l),
                                     &mesh,
                                     mesh_inserter_iterator<Elem>(mesh));

which handles communicating everything from boundary conditions to
neighbor topology to element DoF indices, using generic code that
ought to be easily extensible to other variable-size data types too.
I left PackedNode and PackedElem around for backwards compatibility,
but they could be eliminated if/when the Idaho folks want to use
packed_range based code instead.

In hindsight I should have posted such a significant patch to
libmesh-devel before committing, despite it passing my tests. It was
originally only going to affect distributed ParallelMesh code paths,
and after realizing that it could greatly simplify SerialMesh
communication too I forgot to reconsider the question of whether or
not I should get additional eyes on it.
---
Roy
From: Cody P. <cod...@gm...> - 2012-07-22 02:17:21
On Fri, Jul 20, 2012 at 10:19 PM, Roy Stogner <roy...@ic...> wrote:

>   for (unsigned int l=0; l != n_levels; ++l)
>     Parallel::broadcast_packed_range(&mesh,
>                                      mesh.level_elements_begin(l),
>                                      mesh.level_elements_end(l),
>                                      &mesh,
>                                      mesh_inserter_iterator<Elem>(mesh));

Oh wow, very nice. I could actually use something like this in a
couple of classes I recently built. Except my needs are even more
straightforward: I just have lists of nodes (std::vector<Node *>)
that I'd like to gather on all processors. Right now I'm just passing
ids and extracting them on each processor, but it sounds like with
the right function call I could avoid that with all the fancy
broadcast stuff you've built. Will this work outside of the mesh? I
haven't really looked into it yet.

> I left PackedNode and PackedElem around for backwards compatibility,
> but they could be eliminated if/when the Idaho folks want to use
> packed_range based code instead.

I bet we can take a look so that we can do some housekeeping here. We
don't have this code scattered too far and wide.

Cody

_______________________________________________
Libmesh-devel mailing list
Lib...@li...
https://lists.sourceforge.net/lists/listinfo/libmesh-devel
From: Cody P. <cod...@gm...> - 2012-07-22 04:15:36
On Sat, Jul 21, 2012 at 8:17 PM, Cody Permann <cod...@gm...> wrote:

> Oh wow, very nice. I could actually use something like this in a
> couple of classes I recently built. Except my needs are even more
> straightforward: I just have lists of nodes (std::vector<Node *>)
> that I'd like to gather on all processors. Right now I'm just
> passing ids and extracting them on each processor, but it sounds
> like with the right function call I could avoid that with all the
> fancy broadcast stuff you've built. Will this work outside of the
> mesh? I haven't really looked into it yet.

Well, on second thought, that doesn't really make sense. Asking the
(serial) mesh for a pointer is always going to work better than
copying a node object across MPI processes. Still, I really like
these new utilities. Way cool!

Cody
From: Roy S. <roy...@ic...> - 2012-07-22 04:26:55
On Sat, 21 Jul 2012, Cody Permann wrote:

> Oh wow, very nice. I could actually use something like this in a
> couple of classes I recently built. Except my needs are even more
> straightforward: I just have lists of nodes (std::vector<Node *>)
> that I'd like to gather on all processors. Right now I'm just
> passing ids and extracting them on each processor, but it sounds
> like with the right function call I could avoid that with all the
> fancy broadcast stuff you've built.

Well, if the nodes already exist and are up to date on each
processor, then the most efficient way to pass a list of them from
processor to processor is still to pass a list of ids and
MeshBase::get_node for each.

If you don't necessarily have the nodes on each processor (the
distributed ParallelMesh situation) then the packed_range code is the
way to go.

If you have the nodes on each processor but their information
(locations, associated boundary condition ids) is being updated by
their owning processor, then the packed_range code is probably the
best way to propagate that, but it might need some adjustment - IIRC
I've currently got the unpack code loaded with assertions that any
received data corresponding to an existing node or element is
consistent with the existing data.

> Will this work outside of the mesh? I haven't really looked into
> it yet.

Hmmm... with the current design, there are some side effects due to
backwards compatibility: because we don't add boundary conditions (or
in the case of elements, update topology) with
MeshBase::add_node/elem, I'm forced to do those things in the
Parallel::unpack, and so if you don't want to modify your Mesh on the
receiving end then you'd have to give the receive function a new
"dummy" Mesh to work with.

>> I left PackedNode and PackedElem around for backwards
>> compatibility, but they could be eliminated if/when the Idaho
>> folks want to use packed_range based code instead.
>
> I bet we can take a look so that we can do some housekeeping here.
> We don't have this code scattered too far and wide.

No rush; just let me know.
---
Roy