Here's some more info on this subject:

1.  The old System::update() really is segfaulting.  It's reliably reproducible at ~60 million DoFs.
2.  Using the old System::update() with a solution->close() at the beginning is _not_ sufficient!  It still segfaults!
3.  Using the new System::update() works.

I'm still investigating and will let you know more when I can.  Kinda hard to debug on ~5,000 procs....

How about the issues you brought up below?  Any clarity on those yet?  In particular, the Trilinos problem shouldn't be happening: as far as I know, the Trilinos support in libMesh doesn't handle GHOSTED vectors at all... so you really shouldn't even be able to compile with both Trilinos and ghosted vectors enabled...

Derek

On Thu, May 12, 2011 at 9:05 PM, Roy Stogner <roystgnr@ices.utexas.edu> wrote:

On Thu, 12 May 2011, Derek Gaston wrote:

On May 12, 2011, at 5:22 PM, Roy Stogner <roystgnr@ices.utexas.edu> wrote:

operator= is implemented correctly for src.type == dst.type, and as of
mid-last year it's implemented correctly in PetscVector when one type
is PARALLEL and the other is GHOSTED.  That's it.  The new
System::update() probably doesn't even work correctly for Petsc+MPI
when --enable-ghosted-vectors is off.

Why would that be?  It would just run the old code with the
localize().... if it worked before it would still work now.

Ah, you're right - I wasn't thinking about the other half of the ifdef
in the new update.  That would explain why my "disable MPI" test
failed but my "disable everything" test succeeded.

But in that case I have no idea *what* is going on with PETSc w/o MPI,
just that it's now dying in ex0 when the libmesh_assert at
petsc_vector.C:619 fails.


As for operator=, David and I shored it up for ghosted = parallel a
while ago.

Sure, for PETSc.  Trilinos is now throwing exceptions, though.
(Barely... "terminate called after throwing an instance of 'int'",
guys, seriously?)  Don't we basically fall back to SERIAL there when
GHOSTED is requested?  And then System::update doesn't check whether
current_local_solution is ghosted, just whether it could have been
ghosted?


All of our regression tests pass with the new System::update()...

Can you pinpoint a case that doesn't work properly?  We might be
able to take a look at it.

I do agree that there might be better ways... and I'm committed to
looking into them... but we need to understand why this wouldn't work
because that would mean there is a serious implementation issue with
operator=.

I don't think it's going to hit anything you're using; most of our
stuff still passes.  Here's our two tested configurations that are now
failing:

module load intel/10.1 tbb mkl-pecos petsc-nompi glpk vtk && ./configure --enable-everything --disable-mpi

module load intel/11.1-ubuntu tbb mpich2/1.2.1 mkl-pecos petsc slepc trilinos glpk vtk && ./configure --enable-everything --enable-mpi --with-mpi=$MPI_DIR --disable-petsc --disable-laspack


I'm sure it's harder, but could you answer the same question?  Is it
possible to get a simple app (ex19 with --enable-parmesh and the
refinement level cranked way up??) to fail with the old
System::update()?  I just finished fixing some (app-level) bugs that
only triggered on several dozen nodes, which was quite an unpleasant
experience.  If there's a possible gotcha that only shows up on
several hundred nodes then I'd like to figure out how to assert
(and/or better, regression test) the hell out of it.
---
Roy