On Mon, 2 Mar 2009, Roy Stogner wrote:
> On Mon, 2 Mar 2009, Tim Kroeger wrote:
>> Update: It doesn't crash any more. With the patch that I sent you in the
>> previous mail, it seems to work.
> That's interesting. Did you ever track down the source of the
> problem? The patch you provided looks like it could have fixed some
> inaccuracy bugs with ghosted vectors, but I can't see what code path
> in the old code might have triggered an actual crash.
Well, actually the crash was due to a missing a.close() between a=b
and a.scale(). After adding that, but before my latest patch, it
didn't crash any more, but it produced totally wrong results. That
led me to the idea that PetscVector::scale() (and presumably other
methods) should operate on the local form of the ghosted vector.
>> Well, at least the first 8 time steps of my application do the same
>> as they did without the ghost dofs. And the memory requirements are
>> less than 50% of before (on 8 processors).
> Really? That's very good to hear.
Yes. Well, the point is that my application uses a large number of
systems, some of which have a lot of additional vectors. I expected
it to be like this; that's why I pushed the ghosted vectors. (-:
> Is that with SerialMesh?
However, I now have another problem: After the first ~20 time steps
of my application, it starts to refine the grid and -- crashes. I
have no idea why, but I will try to track that down.
>> However, (a) the results do not *exactly* coincide with the previous ones;
>> that is, I get e.g. a residual norm of 1.16562e-08 instead of 1.16561e-08
>> after 608 linear iterations (although using the same number of processors),
> Hmm... maybe we still have some inaccuracy bugs in there? There
> aren't any other confounding changes? If you configure using
> --disable-ghosted you get back the old results?
I haven't tested this yet. I will try to find the cause of the crash
first. (Re-configuring takes a lot of time, you know.)
>> and (b) I had to add a.close() between a=b and a.scale(), where a and b are
>> *System::solution of two different systems.
> The systems are identical? Same mesh, same variable types added in
> the same order?
The mesh is the same, and also the variables. One is an
ExplicitSystem, the other a LinearImplicitSystem, but that shouldn't
be important, should it?
>> Both facts surprise me. In particular, (b) surprises me for two reasons,
>> that is (b1) PetscVector::operator=(const PetscVector&) should not require
>> closing the vector afterwards,
> I wouldn't think so either, but I'm not sure how best to divine PETSc
> standards there. The VecCopy man page doesn't mention needing
> VecAssembly afterwards, but then again neither do the VecSet* pages;
> there's just the brief discussion in the user's manual.
> I don't know what to think yet... but I actually can't replicate
> problem (b) myself with the attached vectest.C -
I can't either. Very strange.
> I assume it bombs out
> for you with the usual PETSc "object in wrong state" error?
Yes, that's what it did.
I'll keep you informed.
Dr. Tim Kroeger
tim.kroeger@... Phone +49-421-218-7710
tim.kroeger@... Fax +49-421-218-4236
Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany