[Libmesh-devel] PetscVector addition nonsense

From: Kirk, B. (JSC-EG311) <ben...@na...> - 2009-04-17 11:09:34

Attachments: petsc_vector_nonsense

Tim,

Please find attached a patch which addresses the ridiculous implementation
of adding a scalar to a PetscVector pointed out last week...  Let me know if
it works for you, if so I'll submit it.

-Ben

Re: [Libmesh-devel] PetscVector addition nonsense

From: Tim K. <tim...@ce...> - 2009-04-17 12:20:19

Dear Ben,

On Fri, 17 Apr 2009, Kirk, Benjamin (JSC-EG311) wrote:

> Please find attached a patch which addresses the ridiculous implementation
> of adding a scalar to a PetscVector pointed out last week...  Let me know if
> it works for you, if so I'll submit it.

Actually, I'm not using that function anywhere in my code, so there is 
no easy way for me to test it.  (Remember that Jed was the one who 
pointed this out initially.)

On the other hand, there are some more possibly inefficient things in 
that file.  For instance, PetscVector::insert() calls 
PetscVector::set() for each index, and I suppose it would be better to 
use VecSetValues() instead.  It's difficult to decide how far one 
should go at the moment with optimizing the PetscVector class.
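
To make the cost difference concrete, here is a minimal C++ sketch.  It does not use PETSc itself: MockBackend, insert_one_by_one() and insert_batched() are hypothetical names, and set_values() only imitates the shape of a batched setter such as VecSetValues(), which takes a count plus arrays of indices and values.  All it shows is that the batched path makes a single call into the back end where the per-index path makes one call per entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical back end (NOT PETSc) that counts how often it is called.
struct MockBackend {
  std::vector<double> data;
  std::size_t calls = 0;

  explicit MockBackend(std::size_t n) : data(n, 0.0) {}

  // Batched setter: writes n values at the given indices in one call.
  void set_values(std::size_t n, const std::size_t* idx, const double* vals) {
    ++calls;
    for (std::size_t k = 0; k < n; ++k) {
      assert(idx[k] < data.size());
      data[idx[k]] = vals[k];
    }
  }
};

// Per-index path: one back-end call for every entry.
void insert_one_by_one(MockBackend& b,
                       const std::vector<std::size_t>& idx,
                       const std::vector<double>& vals) {
  assert(idx.size() == vals.size());
  for (std::size_t k = 0; k < idx.size(); ++k)
    b.set_values(1, &idx[k], &vals[k]);
}

// Batched path: a single back-end call for the whole index set.
void insert_batched(MockBackend& b,
                    const std::vector<std::size_t>& idx,
                    const std::vector<double>& vals) {
  assert(idx.size() == vals.size());
  if (idx.empty())
    return;
  b.set_values(idx.size(), idx.data(), vals.data());
}
```

Both functions leave the same values in the vector; whether the saved call overhead matters in practice is exactly the question that profiling would have to answer.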

Anyway, I looked over your patch and it seems correct to me, and in 
particular it coincides with Jed's suggestion.  Since Jed seems to be 
very familiar with PETSc, this could be considered reason enough to 
submit the patch right now.

Other optimizations should probably wait until we know whether this is 
really a bottleneck.  Before I run the application with the PETSc log 
output option that Jed suggested, I would like the remaining bug to be 
fixed, since one never knows what that implies.  I'm currently waiting 
for Roy to report whether he can reproduce the bug; I guess that he 
has been too busy so far.

Best Regards,

Tim

-- 
Dr. Tim Kroeger
tim...@me...            Phone +49-421-218-7710
tim...@ce...            Fax   +49-421-218-4236

Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany

Re: [Libmesh-devel] PetscVector addition nonsense

From: Roy S. <roy...@ic...> - 2009-04-17 14:41:11

On Fri, 17 Apr 2009, Tim Kroeger wrote:

> I would like the remaining bug to be fixed, since one never knows
> what that implies.  I'm currently waiting for Roy to report whether
> he can reproduce the bug; I guess that he has been too busy so far.

Sorry about the delay; you guess right.

The prodding was helpful.  I did have time to set up your test program
and get it started this morning.  Not sure if/when it's going to
finish, though.  I'd like to run all 24 processes on the same box for
ease of debugging, but it's currently halfway through the refinement
steps and using 6GB memory out of 4GB available RAM.  We've got a
couple 16GB nodes here, but they're even busier than I am through most
of the week lately.  But if the 4GB server proves insufficient I'll
see if I can monopolize one of the big guys tomorrow or Sunday.
---
Roy

Re: [Libmesh-devel] PetscVector addition nonsense

From: Tim K. <tim...@ce...> - 2009-04-22 06:18:44

Dear Roy,

On Fri, 17 Apr 2009, Roy Stogner wrote:

> I did have time to set up your test program and get it started this 
> morning.  Not sure if/when it's going to finish, though.

Did it reveal anything yet?

> I'd like to run all 24 processes on the same box for
> ease of debugging, but it's currently halfway through the refinement
> steps and using 6GB memory out of 4GB available RAM.

You mean, it was already swapping?  I see, that's going to slow it 
down substantially.

By the way: Nobody seems to have checked in my patch that I sent to 
the list last week (April 15) (nor has anybody stated that my patch 
could cause problems).

Best Regards,

Tim


Re: [Libmesh-devel] PetscVector addition nonsense

From: Roy S. <roy...@ic...> - 2009-04-24 22:21:32


On Wed, 22 Apr 2009, Tim Kroeger wrote:

> On Fri, 17 Apr 2009, Roy Stogner wrote:
>
>> I did have time to set up your test program and get it started this 
>> morning.  Not sure if/when it's going to finish, though.
>
> Did it reveal anything yet?

On the big machine I get the same error, and relatively quickly.  I'm
going to glance over the data structures in the debugger tonight, but
unless I can find an obvious underlying cause I won't have much time
to spend on it for a while.
---
Roy

Re: [Libmesh-devel] PetscVector addition nonsense

From: Roy S. <roy...@ic...> - 2009-05-01 23:50:51

On Fri, 24 Apr 2009, Roy Stogner wrote:

> On Wed, 22 Apr 2009, Tim Kroeger wrote:
>
>> On Fri, 17 Apr 2009, Roy Stogner wrote:
>> 
>>> I did have time to set up your test program and get it started this 
>>> morning.  Not sure if/when it's going to finish, though.
>> 
>> Did it reveal anything yet?
>
> On the big machine I get the same error, and relatively quickly.  I'm
> going to glance over the data structures in the debugger tonight, but
> unless I can find an obvious underlying cause I won't have much time
> to spend on it for a while.

The old version of gdb on that machine is killing me - even with
METHOD=dbg it can't seem to find many libMesh methods, can't seem to
call the ones it does find, and sometimes crashes or returns incorrect
results from calls it does make.  If anyone knows how to walk through
the libstdc++ std::map data structure to find a specific entry, let me
know.  This didn't work for me:
http://help.lockergnome.com/linux/GDB-capabilities-exploring-STL-classes--ftopict279673.html

I've hassled the sysadmins to try and get a newer gdb to use.  For
now, examining what data structures I could directly, I've at least
got a start on the problem.

It's the same symptom as last time (constraint application failing
because a constraining DoF isn't semilocal) but it's not the same
cause.  Last time we were trying to satisfy the correct constraint
equation but we missed adding the constraining DoF to the send_list.
This time the constraint equation itself apparently came out wrong -
it's trying to use one incorrect DoF index.  And that's probably not
the root of the problem.  I presume that the odd choice of "24.01" in
the build_cube parameters was necessary to reproduce the problem?
Then the real bug has to be in one of the hacks where we use nodal
coordinates to identify a node.  I think most of my own sins in that
regard are limited to ParallelMesh.  Ben does a bit of that in
MeshCommunication that we might look at.  But the most likely culprit
is probably find_neighbors().

I've got to run right now, and won't get a chance to really look at
this again until next Wednesday.  Sorry about the delays.

And thanks again for all the ghosted vectors work.  Regardless of how
long it takes us to make it efficient, it's already been invaluable as
a debugging tool.
---
Roy

Re: [Libmesh-devel] PetscVector addition nonsense

From: Tim K. <tim...@ce...> - 2009-05-04 06:53:38

Dear Roy,

On Fri, 1 May 2009, Roy Stogner wrote:

> I presume that the odd choice of "24.01" in
> the build_cube parameters was necessary to reproduce the problem?

Actually, for me there is no reason to believe this.  The "24.01" is 
used for a different reason, having to do with some of the internals of 
the application.  I have not tested whether the crash depends on this.

(In fact, for the previous bug, as you might remember, I mixed up the 
arguments of build_cube() once again, hence unwittingly using a 
completely different geometry than in the application, which did not 
influence the bug.  The mere fact that this time I got the argument 
order right, and hence use the same geometry as in the application, is 
no reason to believe that the bug *depends* on the geometry.)

> I've got to run right now, and won't get a chance to really look at
> this again until next Wednesday.  Sorry about the delays.

Doesn't matter; it's not as urgent any more as it was some time ago. 
(-:

> And thanks again for all the ghosted vectors work.  Regardless of how
> long it takes us to make it efficient, it's already been invaluable as
> a debugging tool.

I am pleased to hear that!

Best Regards,

Tim


Re: [Libmesh-devel] PetscVector addition nonsense

From: Tim K. <tim...@ce...> - 2009-05-14 14:00:41

Dear Roy,

On Mon, 4 May 2009, Tim Kroeger wrote:

>> I've got to run right now, and won't get a chance to really look at
>> this again until next Wednesday.  Sorry about the delays.
>
> Doesn't matter; it's not as urgent any more as it was some time ago.
> (-:

Well, to prevent any possible misunderstanding, I would like to add 
that by this sentence, I didn't mean that it became totally irrelevant 
for me.  Although there is currently no short-term deadline pressure 
for me on this item, I would appreciate having the ghosted vectors 
working before the next short-term deadline pressure arises.  If there 
is any sensible task that I could do to assist you in finding the bug, 
please let me know.

Best Regards,

Tim


Re: [Libmesh-devel] PetscVector addition nonsense

From: Roy S. <roy...@ic...> - 2009-05-14 23:14:54

On Thu, 14 May 2009, Tim Kroeger wrote:

> Although there is currently no short-term deadline pressure for me on this 
> item, I would appreciate having the ghosted vectors working before the next 
> short-term deadline pressure arises.  If there is any sensible task that I 
> could do to assist you in finding the bug, please let me know.

If I could think of anything, I'd have mentioned it by now, I'm afraid.
Right now the debug cycle is glacial, but I'm not sure how to avoid:

Recompiling.  I can't get gdb or idb to walk through our DofConstraints
structure properly (can't find function X, can't cast to unknown type
Y, etc...) and that's left me using std::cerr as a major debugging
tool.

Rerunning.  This bug depends on the precise mesh partitioning, which
we redo whenever we load a new file, so I can't just save the failing
mesh and restart from that, I have to restart from the beginning and
walk through all the dozens of AMR/C steps... in dbg or devel mode, if
I want to be able to use gdb at all on the result.

Work.  Your typical debug cycle doesn't include "Spend days giving and
listening to talks" or "Fix and rerun sensitivity analyses with a
different code", but the last few weeks have been swamped for me.

It looks like add_constraints_to_send_list didn't quite do what it was
supposed to, because this is another version of the same bug: a
hanging face node has four dependencies, one of which is a hanging
edge node with two dependencies, and the processor with the face node
somehow isn't getting the farther grand-dependency added to its
send_list.
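
The "grand-dependency" situation can be sketched in a few lines of C++.  This is only an illustration under stated assumptions, not libMesh's add_constraints_to_send_list(): DofId, ConstraintMap, all_dependencies() and missing_from_send_list() are hypothetical names, the constraint table is a plain std::map, and the send_list is assumed to be sorted.  The idea is simply that dependencies have to be followed transitively, so that a face node depending on a hanging edge node also pulls in the edge node's own dependencies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

using DofId = std::size_t;
// Hypothetical layout: constrained dof -> dofs it directly depends on.
using ConstraintMap = std::map<DofId, std::vector<DofId>>;

// Collects every dof reachable from `start` by following constraint
// dependencies, including dependencies of dependencies.
std::set<DofId> all_dependencies(const ConstraintMap& constraints, DofId start) {
  std::set<DofId> seen;
  std::vector<DofId> stack{start};
  while (!stack.empty()) {
    const DofId dof = stack.back();
    stack.pop_back();
    const auto it = constraints.find(dof);
    if (it == constraints.end())
      continue;  // unconstrained dof: nothing further to follow
    for (DofId dep : it->second)
      if (seen.insert(dep).second)  // true only the first time we see dep
        stack.push_back(dep);
  }
  return seen;
}

// Returns the transitive dependencies of `start` that are absent from a
// sorted send_list, i.e. the entries a ghosted vector could not read.
std::vector<DofId> missing_from_send_list(const ConstraintMap& constraints,
                                          DofId start,
                                          const std::vector<DofId>& send_list) {
  assert(std::is_sorted(send_list.begin(), send_list.end()));
  std::vector<DofId> missing;
  for (DofId dep : all_dependencies(constraints, start))
    if (!std::binary_search(send_list.begin(), send_list.end(), dep))
      missing.push_back(dep);
  return missing;
}
```

With a face dof depending on four dofs, one of which is an edge dof with two dependencies of its own, a send_list holding only the four direct dependencies would be reported as missing the two farther ones.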

It'll be another run or two until I'm certain of that, though, and
probably a few more before I've figured out why it's happening and
how to fix it.
---
Roy

Re: [Libmesh-devel] PetscVector addition nonsense

From: Tim K. <tim...@ce...> - 2009-05-15 06:20:48

Dear Roy,

On Thu, 14 May 2009, Roy Stogner wrote:

> Right now the debug cycle is glacial, but I'm not sure how to avoid: 
> [...]

Thank you very much for your interim report.  I just wanted to make 
sure that things have not been forgotten.  I understand that it takes 
time to find this bug.

Sorry if my mail made you feel defensive.

Best Regards,

Tim


Re: [Libmesh-devel] PetscVector addition nonsense

From: Roy S. <roy...@ic...> - 2009-05-15 11:57:25

On Fri, 15 May 2009, Tim Kroeger wrote:

> On Thu, 14 May 2009, Roy Stogner wrote:
>
>> Right now the debug cycle is glacial, but I'm not sure how to avoid: [...]
>
> Thank you very much for your interim report.  I just wanted to make sure that 
> things have not been forgotten.  I understand that it takes time to find this 
> bug.
>
> Sorry if my mail made you feel defensive.

Not defensive at you, just exasperated at gdb.  My attempts to get
a proper debugger's-eye-view of complex STL-tree-based classes have
been both numerous and fruitless.
---
Roy

[Libmesh-devel] Ghosted vector bug fixed

From: Roy S. <roy...@ic...> - 2009-05-16 13:26:17

On Thu, 14 May 2009, Roy Stogner wrote:

> It'll be another run or two until I'm certain of that, though, and
> probably a few more before I've figured out why it's happening and
> how to fix it.

Got it!

The reason this bug was so hard to find is that, in some sense, it's
more of a design mistake than a bug!  enforce_constraints_exactly()
works correctly with serial vectors; ghosted vectors are correctly
behaving as their API specifies... the two correct behaviors just
weren't 100% compatible yet, is all.

When I wrote enforce_constraints_exactly(), for some reason (possibly
because we weren't yet recursively expanding constraints and so a more
straightforward method wasn't yet possible, possibly because I was
still learning how the constraint system worked), I made it work by
building the constraint matrix C for each element, looping over the
rows that correspond to local constrained dofs, and setting 
vglobal_i = sum(C_ij*vlocal_j)

If you'll forgive my abused notation: The problem here is that we've
got an element with dof a on processor 1 and dof b on processor 2, the
latter of which depends on dof c on processor 3.  Because dof c isn't
a constraint dependency on processor 1, that processor doesn't have it
in the send_list.  This means that vlocal_c is inaccurately 0 on a
processor 1 serial vector, but we don't really care, because C_ac is 0
too and the inaccuracy in vlocal_c doesn't propagate to vglobal_a.
But accessing vlocal_c, even to multiply it by 0, throws a
libmesh_error() on a ghosted vector!

Anyway, an immediate fix is simple: just skip accumulating indices
where C_ij==0.0.  I've committed that to SVN, and on my machine it
works to take the test case you sent all the way to completion.  In
the long term we'll want to change enforce_constraints_exactly to just
loop over local dofs and directly use the constraint rows, but I'd
like to make sure this old bug is fixed before I risk mucking things
up and adding new bugs.  ;-)
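
A minimal sketch of the skip-zero idea, assuming a mock vector class in place of libMesh's ghosted vector.  MockGhostedVector and constrained_value() are hypothetical names, and the mock throws std::out_of_range where the real code would hit libmesh_error(); this is not the code that was committed, only an illustration of why the zero check matters.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a ghosted vector: only entries that were
// explicitly stored (local or ghosted) may be read.
class MockGhostedVector {
public:
  void set(std::size_t global_index, double value) {
    values_[global_index] = value;
  }

  double operator()(std::size_t global_index) const {
    const auto it = values_.find(global_index);
    if (it == values_.end())
      throw std::out_of_range("index is neither local nor ghosted");
    return it->second;
  }

private:
  std::map<std::size_t, double> values_;
};

// Computes sum_j C_ij * v(dof_j) for one row i of an element constraint
// matrix, given that row and the element's global dof indices.
double constrained_value(const std::vector<double>& c_row,
                         const std::vector<std::size_t>& dof_indices,
                         const MockGhostedVector& v) {
  assert(c_row.size() == dof_indices.size());
  double sum = 0.0;
  for (std::size_t j = 0; j < c_row.size(); ++j) {
    // The exact comparison with 0.0 is intentional (it would trigger
    // -Wfloat-equal): a zero coefficient means dof j is not a dependency
    // of this row, so it may be missing from the ghosted vector, and even
    // reading it in order to multiply by zero would be an error.
    if (c_row[j] == 0.0)
      continue;
    sum += c_row[j] * v(dof_indices[j]);
  }
  return sum;
}
```

With a serial vector the skipped terms would contribute nothing anyway, so the check changes no results there; it only avoids touching entries that a ghosted vector does not hold.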
---
Roy

Re: [Libmesh-devel] Ghosted vector bug fixed

From: John P. <jwp...@gm...> - 2009-05-16 14:56:35

On Sat, May 16, 2009 at 8:26 AM, Roy Stogner <roy...@ic...> wrote:
>
> On Thu, 14 May 2009, Roy Stogner wrote:
>
>> It'll be another run or two until I'm certain of that, though, and
>> probably a few more before I've figured out why it's happening and
>> how to fix it.
>
> Got it!
>
> The reason this bug was so hard to find is that, in some sense, it's
> more of a design mistake than a bug!  enforce_constraints_exactly()
> works correctly with serial vectors; ghosted vectors are correctly
> behaving as their API specifies... the two correct behaviors just
> weren't 100% compatible yet, is all.
>
> When I wrote enforce_constraints_exactly(), for some reason (possibly
> because we weren't yet recursively expanding constraints and so a more
> straightforward method wasn't yet possible, possibly because I was
> still learning how the constraint system worked), I made it work by
> building the constraint matrix C for each element, looping over the
> rows that correspond to local constrained dofs, and setting
> vglobal_i = sum(C_ij*vlocal_j)
>
> If you'll forgive my abused notation: The problem here is that we've
> got an element with dof a on processor 1 and dof b on processor 2, the
> latter of which depends on dof c on processor 3.  Because dof c isn't
> a constraint dependency on processor 1, that processor doesn't have it
> in the send_list.  This means that vlocal_c is inaccurately 0 on a
> processor 1 serial vector, but we don't really care, because C_ac is 0
> too and the inaccuracy in vlocal_c doesn't propagate to vglobal_a.
> But accessing vlocal_c, even to multiply it by 0, throws a
> libmesh_error() on a ghosted vector!
>
> Anyway, an immediate fix is simple: just skip accumulating indices
> where C_ij==0.0.  I've committed that to SVN, and on my machine it
> works to take the test case you sent all the way to completion.  In
> the long term we'll want to change enforce_constraints_exactly to just
> loop over local dofs and directly use the constraint rows, but I'd
> like to make sure this old bug is fixed before I risk mucking things
> up and adding new bugs.  ;-)

Way to go Roy!

I'd like to personally double your libmesh developer salary this month ;-)

-- 
John

Re: [Libmesh-devel] Ghosted vector bug fixed

From: Kirk, B. (JSC-EG311) <ben...@na...> - 2009-05-16 15:11:01

> Way to go Roy!
> 
> I'd like to personally double your libmesh developer salary this month ;-)

I second that.  

When we are all in Austin week after next let's decide on a libMesh-0.7.0
release date.  We also need to devise some bonus incentive program for
meeting the milestone? ;-)

-Ben

Re: [Libmesh-devel] Ghosted vector bug fixed

From: Tim K. <tim...@ce...> - 2009-05-18 06:48:00

Dear Roy,

On Sat, 16 May 2009, Roy Stogner wrote:

> Got it! [...]

Great work!  I'll let the application run today and see whether any 
new bugs emerge.  (-:

> Anyway, an immediate fix is simple: just skip accumulating indices
> where C_ij==0.0.  I've committed that to SVN, and on my machine it
> works to take the test case you sent all the way to completion.

I would strongly suggest adding a comment at that position in the 
code.  Otherwise, if you (or somebody else) should later decide that 
-Wfloat-equal should be enabled, you'll get a warning at this point, 
and since you might have forgotten the reason, you'll be tempted to 
remove that seemingly useless piece of code.

Best Regards,

Tim


Re: [Libmesh-devel] Ghosted vector bug fixed

From: Tim K. <tim...@ce...> - 2009-05-26 06:35:04

Attachments: patch

Dear Roy,

On Mon, 18 May 2009, Tim Kroeger wrote:

>> Got it! [...]
>
> Great work!  I'll let the application run today and see whether any
> new bugs emerge.  (-:

When I first tested it, it kept crashing at the same point as before. 
It took me quite long to find out what I did wrong: I compiled using

 	METHO=devel make

and linked against the devel version.  Seems right, doesn't it?  Well, 
but if you look closely, you see that I missed out a "D", so I 
actually compiled the optimized version.

Feature request for your build system: If the user ever has compiled a 
version other than optimized, "make" with unset "METHOD" variable 
should produce an error message and not work.

After I had fixed my compile statement, the next run crashed at a 
different (later) point.  I conjecture that that was out-of-memory, so 
I restarted it on a larger number of nodes (same number of CPUs, 
though).  It then finally ran through without crash.

By the way, the temporal scalability is also quite bad.  You might 
remember that this bug has emerged as a test of how my application 
behaves on a larger number of CPUs.  The runtimes for 8 CPUs on 3 
nodes (copied from my mail of April 8, 2009) were:

no-ghosted-1 : 11:34:10
no-ghosted-2 : 11:35:54
ghosted-1    : 17:25:28
ghosted-2    : 16:33:23

Now, the new result for 24 CPUs on 4 nodes (3 nodes ran out of memory) 
is:

ghosted      : 10:42:34

Not really nice.  In particular, not substantially faster than the 
unghosted version on a comparable number of nodes.

I'll perform the log output that Jed suggested and see what he says. 
Perhaps there is some easy possibility to make it faster.

>> Anyway, an immediate fix is simple: just skip accumulating indices
>> where C_ij==0.0.  I've committed that to SVN, and on my machine it
>> works to take the test case you sent all the way to completion.
>
> I would strongly suggest adding a comment at that position in the
> code.  Otherwise, if you (or somebody else) should later decide that
> -Wfloat-equal should be enabled, you'll get a warning at this point,
> and since you might have forgotten the reason, you'll be tempted to
> remove that seemingly useless piece of code.

I guess that you didn't have the time yet to write such a comment, so 
I wrote that for you (see attachment), you just have to check it in.

Best Regards,

Tim


Re: [Libmesh-devel] PetscVector addition nonsense

From: Roy S. <roy...@ic...> - 2009-04-22 23:09:25

On Wed, 22 Apr 2009, Tim Kroeger wrote:

> On Fri, 17 Apr 2009, Roy Stogner wrote:
>
>> I did have time to set up your test program and get it started this 
>> morning.  Not sure if/when it's going to finish, though.
>
> Did it reveal anything yet?

It left the system hosed for long enough that I had to kill it.  I'll
be able to try again on a bigger system this weekend.

> By the way: Nobody seems to have checked in my patch that I sent to the list 
> last week (April 15) (nor has anybody stated that my patch could cause 
> problems).

It's one of the many things backing up in my inbox right now, sorry.
---
Roy

Re: [Libmesh-devel] PetscVector addition nonsense

From: Roy S. <roy...@ic...> - 2009-04-22 23:29:18

On Wed, 22 Apr 2009, Roy Stogner wrote:

> On Wed, 22 Apr 2009, Tim Kroeger wrote:
>
>> By the way: Nobody seems to have checked in my patch that I sent to the 
>> list last week (April 15) (nor has anybody stated that my patch could cause 
>> problems).
>
> It's one of the many things backing up in my inbox right now, sorry.

Patch looks good.  I'm in a rush now, but I'll add it late tonight or
tomorrow.  One question first: can anyone think of a better name than
constrain_nothing()?  The analogy to our other constrain_* functions
makes sense, but it seems odd outside that context to have a method
essentially named "do_nothing" that does modify its inputs.

I can't think of any better name myself, so I'll commit the new
method as constrain_nothing().  But if someone *can* think of a more
descriptive name, let me know so we can change it while the API's
still only got one user.  ;-)
---
Roy

Re: [Libmesh-devel] PetscVector addition nonsense

From: Tim K. <tim...@ce...> - 2009-04-23 06:34:10

Dear Roy,

On Wed, 22 Apr 2009, Roy Stogner wrote:

> One question first: can anyone think of a better name than 
> constrain_nothing()?  The analogy to our other constrain_* functions 
> makes sense, but it seems odd outside that context to have a method 
> essentially named "do_nothing" that does modify its inputs.

Well, of course you are right.  But actually, I find this is only a 
symptom of the fact that a user might not expect the constrain_*() 
methods to modify their dof_indices argument.  At least, that was true 
for me for quite a long time, and it caused a number of programming 
errors in my applications.  I learned this when I implemented the 
constrain_dyad_matrix() function.  Since that time I understand that 
these methods *have* to do this.

What I want to say is this: First, having a method called 
"constrain_nothing()" might make the innocent user look into the 
details earlier and prevent him from making mistakes.  Second, the 
whole constraining mechanism could be reworked completely.  I'm 
thinking about something like this:

start_constraining(const old_row_dofs,
                    const old_col_dofs,
                    new_row_dofs,
                    new_col_dofs);

constrain_vector(const old_row_dofs,
                  const new_row_dofs,
                  vector);

constrain_matrix(const old_row_dofs,
                  const old_col_dofs,
                  const new_row_dofs,
                  const new_col_dofs,
                  matrix);

constrain_dyad_matrix(const old_row_dofs,
                       const old_col_dofs,
                       const new_row_dofs,
                       const new_col_dofs,
                       v,
                       w);

Or, perhaps better, have a "Constraining" class that is created 
locally by the user and holds all required information (including the 
constraint matrix).

Any opinions?

Best Regards,

Tim


Re: [Libmesh-devel] PetscVector addition nonsense

From: Roy S. <roy...@ic...> - 2009-04-23 15:47:35

On Thu, 23 Apr 2009, Tim Kroeger wrote:

> On Wed, 22 Apr 2009, Roy Stogner wrote:
>
>> One question first: can anyone think of a better name than 
>> constrain_nothing()?  The analogy to our other constrain_* functions makes 
>> sense, but it seems odd outside that context to have a method essentially 
>> named "do_nothing" that does modify its inputs.
>
> Well, of course you are right.  But actually, I find this is only a symptom 
> of the fact that a user might not expect the constrain_*() methods to modify 
> their dof_indices argument.

Good point.

> At least, that was true for me for quite a long time, and it caused
> a number of programming errors in my applications.  I learned this
> when I implemented the constrain_dyad_matrix() function.  Since that
> time I understand that these methods *have* to do this.

Well, they have to get an expanded dof_indices vector.  They don't
necessarily have to expand their argument rather than making a copy or
keeping new indices in a separate vector.  Presumably we just chose the
most efficient semantic over the more intuitive ones.

> What I want to say is this: First, having a method called
> "constrain_nothing()" might make the innocent user look into the
> details earlier and prevent him from making mistakes.

constrain_nothing() it is, then.

> Second, the whole constraining mechanism could be reworked
> completely.  I'm thinking about something like this:
>
> start_constraining(const old_row_dofs,
>                   const old_col_dofs,
>                   new_row_dofs,
>                   new_col_dofs);
>
> constrain_vector(const old_row_dofs,
>                 const new_row_dofs,
>                 vector);
>
> constrain_matrix(const old_row_dofs,
>                 const old_col_dofs,
>                 const new_row_dofs,
>                 const new_col_dofs,
>                 matrix);
>
> constrain_dyad_matrix(const old_row_dofs,
>                      const old_col_dofs,
>                      const new_row_dofs,
>                      const new_col_dofs,
>                      v,
>                      w);
>
> Or, perhaps better, have a "Constraining" class that is created locally by 
> the user and holds all required information (including the constraint 
> matrix).
>
> Any opinions?

Probably a Constraining class; otherwise we'd have to regenerate that
constraint matrix over and over again, right?  And while a more
intuitive constraint API would be nice, we've got something that works
now, so it's not a high priority for me.  Patches would be welcomed
and (eventually...) included, though.
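
As a rough illustration of the "Constraining class" idea, here is a simplified C++ sketch.  Everything in it is hypothetical: the class layout, the ConstraintRows type, and the fact that it expands only one level of dependencies and only handles vectors.  It is not libMesh's API or algorithm.  What it tries to show is the design point under discussion: the expanded dof list and the constraint matrix C are built once in the constructor, the caller's dof_indices are left untouched, and later calls reuse C instead of regenerating it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

using DofId = std::size_t;
// Hypothetical layout: constrained dof -> (dependency dof -> coefficient).
using ConstraintRows = std::map<DofId, std::map<DofId, double>>;

class Constraining {
public:
  Constraining(const std::vector<DofId>& old_dofs, const ConstraintRows& rows)
      : new_dofs_(old_dofs) {
    // Append dependency dofs that are not yet part of the element's dofs.
    for (DofId dof : old_dofs) {
      const auto row = rows.find(dof);
      if (row == rows.end())
        continue;
      for (const auto& dep : row->second)
        if (std::find(new_dofs_.begin(), new_dofs_.end(), dep.first) ==
            new_dofs_.end())
          new_dofs_.push_back(dep.first);
    }
    // Build C (old x new) once: an identity row for an unconstrained dof,
    // the constraint coefficients for a constrained one.
    c_.assign(old_dofs.size(), std::vector<double>(new_dofs_.size(), 0.0));
    for (std::size_t i = 0; i < old_dofs.size(); ++i) {
      const auto row = rows.find(old_dofs[i]);
      if (row == rows.end()) {
        c_[i][i] = 1.0;  // new_dofs_ starts as a copy of old_dofs
        continue;
      }
      for (std::size_t j = 0; j < new_dofs_.size(); ++j) {
        const auto dep = row->second.find(new_dofs_[j]);
        if (dep != row->second.end())
          c_[i][j] = dep->second;
      }
    }
  }

  const std::vector<DofId>& new_dofs() const { return new_dofs_; }

  // Returns C^T f, indexed by new_dofs(); the input vector is not modified.
  std::vector<double> constrain_vector(const std::vector<double>& f) const {
    assert(f.size() == c_.size());
    std::vector<double> out(new_dofs_.size(), 0.0);
    for (std::size_t i = 0; i < c_.size(); ++i)
      for (std::size_t j = 0; j < out.size(); ++j)
        out[j] += c_[i][j] * f[i];
    return out;
  }

private:
  std::vector<DofId> new_dofs_;
  std::vector<std::vector<double>> c_;
};
```

A matrix or dyad variant would take the same object and reuse the stored C, which is the saving over rebuilding the constraint matrix in each free function.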
---
Roy

Re: [Libmesh-devel] PetscVector addition nonsense

From: Tim K. <tim...@ce...> - 2009-04-24 06:15:57

Dear Roy,

On Thu, 23 Apr 2009, Roy Stogner wrote:

> constrain_nothing() it is, then.

Okay, thank you.

>> Second, the whole constraining mechanism could be reworked
>> completely.  [...]
>
> Probably a Constraining class; otherwise we'd have to regenerate that
> constraint matrix over and over again, right?

Yes, you are right.

> And while a more
> intuitive constraint API would be nice, we've got something that works
> now, so it's not a high priority for me.

The same applies to me.

> Patches would be welcomed and (eventually...) included, though.

I'll keep that in mind for the unlikely case that I should some day 
feel that I have nothing to do.  (-:

Best Regards,

Tim


Re: [Libmesh-devel] PetscVector addition nonsense

From: Roy S. <roy...@ic...> - 2009-05-14 18:27:32

On Sat, 9 May 2009, Lorenzo Botti wrote:

> This code produces the problem during the reinit after the second solve.
> Hope it can help.
>
> Without coarsening it seems that all works fine!

Here's what I get running the code on the libMesh svn head with METHOD=dbg:

*** Warning, This Code is Deprecated! src/base/libmesh.C, line 356, compiled May 14 2009 at 11:28:14 ***
   Beginning Solve 0
Number of elements: 219
  assembling elliptic dg system... done
System has: 768 degrees of freedom.
Linear solver converged at step: 21, final residual: 4.1012e-12
L2-Error is: 0.00666744
H1-Error is: 0.144821
   Beginning Solve 1
Number of elements: 827
  assembling elliptic dg system... done
System has: 2896 degrees of freedom.
Linear solver converged at step: 30, final residual: 1.0506e-11
L2-Error is: 0.00264921
H1-Error is: 0.102276
   Beginning Solve 2
Number of elements: 3003
  assembling elliptic dg system... done
System has: 10512 degrees of freedom.
Linear solver converged at step: 50, final residual: 2.03714e-11
L2-Error is: 0.0016323
H1-Error is: 0.070119
*** Warning, This Code is Deprecated! src/base/libmesh.C, line 366, compiled May 14 2009 at 11:28:14 ***

Since your code hadn't updated the libMesh::init() calls to use a
LibMeshInit object instead, is it safe for me to assume you're running
with an older libMesh version?  Which one?  Would you try checking out
the current SVN version and see if you can reproduce the problem
there?
---
Roy

Re: [Libmesh-devel] PetscVector addition nonsense

From: Jed B. <je...@59...> - 2009-05-15 12:07:12

Attachments: signature.asc

Roy Stogner wrote:

> My attempts to get a proper debugger's-eye-view of complex
> STL-tree-based classes have been both numerous and fruitless.

Do you have any experience with

  http://sourceware.org/gdb/wiki/ProjectArcher

I haven't used it because I work primarily in C, but it should help with
this.

Jed

Re: [Libmesh-devel] PetscVector addition nonsense

From: Roy S. <roy...@ic...> - 2009-05-15 12:49:02

On Fri, 15 May 2009, Jed Brown wrote:

> Roy Stogner wrote:
>
>> My attempts to get a proper debugger's-eye-view of complex
>> STL-tree-based classes have been both numerous and fruitless.
>
> Do you have any experience with
>
>  http://sourceware.org/gdb/wiki/ProjectArcher

No, I haven't.  Thank you!  I'd tried a couple different macro sets
that were supposed to work on top of vanilla gdb, but never anything
that changed the source itself.  I'll give this a shot.
---
Roy

Re: [Libmesh-devel] Ghosted vector bug fixed

From: Jed B. <je...@59...> - 2009-05-26 11:07:38

Attachments: signature.asc

Tim Kroeger wrote:

> Not really nice.  In particular, not substantially faster than the
> unghosted version on a comparable number of nodes.

It is important to know what preconditioners are being used
(preconditioners always change in parallel, though not as much when
there is a coarse level as in multigrid).  Also, memory performance
(especially bandwidth) is usually the overwhelming issue for implicit
solvers (e.g. you are very lucky to get 4% of peak FPU performance for
MatVec on Core 2 Quad).  Thus using more cores frequently does not help,
and you need more sockets to improve performance.  Network latency is
also a factor, but if your subdomains are big enough that you are having
memory issues, it should not be an issue (rather, if it is an issue then
we have faulty algorithms at play).

There may well be more going on here, but it's important to consider
these issues.

> I'll perform the log output that Jed suggested and see what he says.
> Perhaps there is some easy possibility to make it faster.

FWIW, I always run nontrivial jobs with -log_summary.  It does not add
measurable overhead and that output is really useful.  If the effort
required to ignore that output is less than the run (i.e. the run is
more than a few seconds), it is worthwhile to have that profiling info.

Jed
