From: Tim K. <tim...@ce...> - 2009-01-14 14:51:30
Dear libMesh team,

Is there any chance that the memory scaling of System::current_local_solution will be improved in the near future? In my application, I have a large number of systems and a large number of cells, and the fact that System::current_local_solution is always a serial vector seems to destroy the memory scalability completely. The cluster I'm computing on has 8 CPUs per node, but I cannot use more than two (or perhaps three) of them, since the nodes run out of memory otherwise.

I don't think that the (serial) grid itself is the decisive factor in my application. (I watched the memory consumption, and it remains small during the grid creation procedure and increases drastically when the systems are created and initialized.)

I think that adding the required functionality to NumericVector and PetscVector would not be too complicated (PETSc's VecCreateGhost() seems to do the trick). I can try to do this part myself, that is, to add a constructor that additionally takes a list of the ghost indices that we want to store (and implement everything on the PETSc side). The other side is to make the System class use that new constructor (in the PETSc case at least) and, in particular, to determine which indices are actually required, and to do the correct thing when the grid is refined/coarsened. I don't feel able to implement that part because I'm not familiar enough with the internals of libMesh.

Let me know what you guys think.

Best Regards,

Tim

--
Dr. Tim Kroeger
tim...@me...                          Phone +49-421-218-7710
tim...@ce...                          Fax   +49-421-218-4236
Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany
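For reference, the VecCreateGhost() usage Tim has in mind looks roughly like the following sketch (the helper function and its arguments are illustrative, not libMesh code). The owned entries are stored contiguously; the listed ghost entries are appended to the local storage and filled later by explicit scatters.

    #include <petscvec.h>
    #include <vector>

    // Illustrative helper: create a vector that owns n_local of n_global
    // entries and additionally stores local copies of ghost_indices.
    Vec create_ghosted_vector (MPI_Comm comm,
                               PetscInt n_local,
                               PetscInt n_global,
                               const std::vector<PetscInt> & ghost_indices)
    {
      Vec v;
      VecCreateGhost (comm, n_local, n_global,
                      (PetscInt)ghost_indices.size(),
                      ghost_indices.empty() ? PETSC_NULL : &ghost_indices[0],
                      &v);
      return v;
    }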
From: Roy S. <roy...@ic...> - 2009-01-14 15:16:20
On Wed, 14 Jan 2009, Tim Kroeger wrote:

> I don't think that the (serial) grid itself is the decisive factor in
> my application. (I watched the memory consumption, and it remains
> small during the grid creation procedure and increases drastically
> when the systems are created and initialized.)

Keep in mind, the serial mesh isn't the only thing that goes on the SerialMesh. We store our degree of freedom indexing on the Node and Elem objects themselves, and so on a SerialMesh those don't scale well either. And like current_local_solution, those get allocated and initialized during system initialization.

> I think that adding the required functionality to NumericVector and
> PetscVector would not be too complicated (PETSc's VecCreateGhost()
> seems to do the trick). I can try to do this part myself, that is,
> to add a constructor that additionally takes a list of the ghost
> indices that we want to store (and implement everything on the PETSc
> side). The other side is to make the System class use that new
> constructor (in the PETSc case at least) and, in particular, to
> determine which indices are actually required, and to do the correct
> thing when the grid is refined/coarsened. I don't feel able to
> implement that part because I'm not familiar enough with the
> internals of libMesh.

Yeah, I can do that easily enough. Ben's already done the hard work of properly creating the "send_list" of ghost DoFs; I don't think I'll have any problem figuring out when to plug that into a constructor or reinitialization function.

---
Roy
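The "send_list" Roy mentions is exactly the list of DoF indices a processor needs but does not own, so the "other side" could be as small as this sketch (the ghosted init() overload shown is the proposed interface, not an API that exists yet; get_dof_map(), get_send_list(), n_dofs() and n_local_dofs() are existing libMesh calls):

    // Inside System initialization (sketch):
    const std::vector<unsigned int> & send_list =
      this->get_dof_map().get_send_list();

    // Proposed overload: a parallel vector that also stores local
    // copies of the DoFs listed in send_list.
    current_local_solution->init (this->n_dofs(),       // global size
                                  this->n_local_dofs(), // owned size
                                  send_list,            // ghost indices
                                  false);               // zero the entries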
From: Tim K. <tim...@ce...> - 2009-01-14 15:21:23
Dear Roy,

On Wed, 14 Jan 2009, Roy Stogner wrote:

> On Wed, 14 Jan 2009, Tim Kroeger wrote:
>
>> I don't think that the (serial) grid itself is the decisive factor in
>> my application. (I watched the memory consumption, and it remains
>> small during the grid creation procedure and increases drastically
>> when the systems are created and initialized.)
>
> Keep in mind, the serial mesh isn't the only thing that goes on the
> SerialMesh. We store our degree of freedom indexing on the Node and
> Elem objects themselves, and so on a SerialMesh those don't scale well
> either. And like current_local_solution, those get allocated and
> initialized during system initialization.

Okay, I didn't know that.

>> I think that adding the required functionality to NumericVector and
>> PetscVector would not be too complicated (PETSc's VecCreateGhost()
>> seems to do the trick). [...]
>
> Yeah, I can do that easily enough. Ben's already done the hard work
> of properly creating the "send_list" of ghost DoFs; I don't think I'll
> have any problem figuring out when to plug that into a constructor or
> reinitialization function.

What do you mean by "Yeah"? Would you like me to do something (NumericVector, PetscVector), or do you mean that you will do everything yourselves? In any case, is there any idea of when this will be usable?

Best Regards,

Tim

--
Dr. Tim Kroeger
tim...@me...                          Phone +49-421-218-7710
tim...@ce...                          Fax   +49-421-218-4236
Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany
From: Kirk, B. (JSC-EG) <Ben...@na...> - 2009-01-14 15:30:00
>>> I don't think that the (serial) grid itself is the decisive factor
>>> in my application. (I watched the memory consumption, and it
>>> remains small during the grid creation procedure and increases
>>> drastically when the systems are created and initialized.)
>>
>> Keep in mind, the serial mesh isn't the only thing that goes on the
>> SerialMesh. We store our degree of freedom indexing on the Node and
>> Elem objects themselves, and so on a SerialMesh those don't scale
>> well either. And like current_local_solution, those get allocated
>> and initialized during system initialization.
>
> Okay, I didn't know that.

The memory usage spike is almost certainly happening when the degree of freedom indices are allocated and stored in the DofObject. We've whittled down the memory usage of that class a few times over the years, and I owe it one more major refactoring to try to reduce memory consumption further. John and I were talking about this a little while back... Right now we essentially build a 2D matrix with variable row lengths to store the dof index data; we should instead pack it into a contiguous array.

-Ben
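The packing Ben describes amounts to replacing one allocation per variable (a ragged 2D array) with a single buffer plus offsets, CSR-style. A sketch under that assumption (the struct and its names are hypothetical, not the actual DofObject layout):

    #include <vector>

    // Hypothetical packed storage: the dof indices for all variables
    // live in one contiguous array; offsets[v]..offsets[v+1] bracket
    // the entries belonging to variable v.
    struct PackedDofIndices
    {
      std::vector<unsigned int> offsets; // size = n_vars + 1
      std::vector<unsigned int> indices; // all dof indices, back to back

      unsigned int n_dofs (unsigned int var) const
      { return offsets[var+1] - offsets[var]; }

      unsigned int dof_index (unsigned int var, unsigned int i) const
      { return indices[offsets[var] + i]; }
    };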
From: Roy S. <roy...@ic...> - 2009-01-14 15:52:31
On Wed, 14 Jan 2009, Tim Kroeger wrote:

>>> I think that adding the required functionality to NumericVector and
>>> PetscVector would not be too complicated (PETSc's VecCreateGhost()
>>> seems to do the trick). [...]
>>
>> Yeah, I can do that easily enough. Ben's already done the hard work
>> of properly creating the "send_list" of ghost DoFs; I don't think
>> I'll have any problem figuring out when to plug that into a
>> constructor or reinitialization function.
>
> What do you mean by "Yeah"? Would you like me to do something
> (NumericVector, PetscVector), or do you mean that you will do
> everything yourselves?

By "that" I just meant the stuff you've referred to as "the other side"; I don't have time to figure out the right PETSc APIs for part-contiguous, part-sparse vectors. I'll write "dummy" implementations for the Laspack and Trilinos vectors too - not to get the same performance out of them yet, but just to make sure everything still compiles after the change.

> In any case, is there any idea of when this will be usable?

Depends how fast you get the PETSc implementation working. ;-) But if you put in that part of the work, I'll do my best not to hold you up waiting for the rest. This won't be a high priority on my day job yet, but for a real chance at finishing something that's been on the libMesh TODO list for years, I wouldn't mind working a couple of nights or a weekend.

---
Roy
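The "dummy" fallback Roy describes only needs to keep the new interface compiling for the serial backends; something like this sketch (the ghosted signature is hypothetical, shown here for Laspack):

    // Sketch: accept the ghost list so the interface is uniform across
    // backends, but ignore it -- Laspack vectors are serial, so every
    // entry is local anyway.
    template <typename T>
    void LaspackVector<T>::init (const unsigned int n,
                                 const unsigned int n_local,
                                 const std::vector<unsigned int> & /*ghost*/,
                                 const bool fast)
    {
      this->init (n, n_local, fast);
    }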
From: Tim K. <tim...@ce...> - 2009-01-14 17:18:04
Dear Roy,

On Wed, 14 Jan 2009, Roy Stogner wrote:

> By "that" I just meant the stuff you've referred to as "the other
> side"; I don't have time to figure out the right PETSc APIs for
> part-contiguous, part-sparse vectors.

Okay, so I will start on the NumericVector and PetscVector work now. I will keep you informed.

Now, the first problem I encounter -- although it's not really a problem -- is that PETSc provides the possibility of automatically communicating any changes to vector components to the other processors that have those indices as ghost indices. While, in the long term, using that functionality would help to get rid of the duplicated solution representation, I think that for now I should *not* use it, since the current construction of current_local_solution does not expect that. Let me know if you have a different opinion.

Best Regards,

Tim

--
Dr. Tim Kroeger
tim...@me...                          Phone +49-421-218-7710
tim...@ce...                          Fax   +49-421-218-4236
Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany
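The PETSc mechanism in question is the explicit ghost update scatter; nothing happens on individual set() calls. For reference, a minimal sketch (v is a ghosted Vec as created above):

    // Push owned values out to the ghost copies, in one batch:
    VecGhostUpdateBegin (v, INSERT_VALUES, SCATTER_FORWARD);
    VecGhostUpdateEnd   (v, INSERT_VALUES, SCATTER_FORWARD);

    // The reverse direction accumulates ghost contributions back
    // into the owning processors' entries:
    VecGhostUpdateBegin (v, ADD_VALUES, SCATTER_REVERSE);
    VecGhostUpdateEnd   (v, ADD_VALUES, SCATTER_REVERSE);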
From: Roy S. <roy...@ic...> - 2009-01-14 17:34:08
On Wed, 14 Jan 2009, Tim Kroeger wrote:

> Now, the first problem I encounter -- although it's not really a
> problem -- is that PETSc provides the possibility of automatically
> communicating any changes to vector components to the other
> processors that have those indices as ghost indices. While, in the
> long term, using that functionality would help to get rid of the
> duplicated solution representation, I think that for now I should
> *not* use it, since the current construction of
> current_local_solution does not expect that.

Interesting possibility, but even in the long run I don't think we'd want to use it unless it's quite efficient. I can't see how they'd accomplish that without requiring every set() to perform tests, every ghost dof set() to do a non-blocking message send, and every get() to test for a message receipt. Synchronizing data only in large batches, when we know it's necessary, isn't quite as intuitive but is probably faster.

Plus there's the Trilinos question - if we assume NumericVectors have such functionality, are we going to be able to get it easily from every implementation without adding it ourselves?

---
Roy
From: Derek G. <fri...@gm...> - 2009-01-14 17:53:08
I'm pretty sure that Trilinos doesn't have this capability. I'm with Roy in thinking that it would be better to code this once ourselves, so that it will work for all NumericVectors.

Derek
From: Tim K. <tim...@ce...> - 2009-01-15 07:49:51
Dear Roy,

On Wed, 14 Jan 2009, Roy Stogner wrote:

> On Wed, 14 Jan 2009, Tim Kroeger wrote:
>
>> Now, the first problem I encounter -- although it's not really a
>> problem -- is that PETSc provides the possibility of automatically
>> communicating any changes to vector components to the other
>> processors that have those indices as ghost indices. [...]
>
> Interesting possibility, but even in the long run I don't think we'd
> want to use it unless it's quite efficient. I can't see how they'd
> accomplish that without requiring every set() to perform tests, every
> ghost dof set() to do a non-blocking message send, and every get() to
> test for a message receipt. Synchronizing data only in large batches,
> when we know it's necessary, isn't quite as intuitive but is probably
> faster.

My statement was misleading: the communication is not done automatically after each set() operation; it is (like ours) done in blocks, on request. There are two advantages I see over the current implementation. First, the necessity of having two vectors would be removed. Second, communication is reduced by transferring only those values that have actually changed (it seems to me that PETSc manages a list of them). But still, I think we should not use that, at least for the moment.

But there is another problem (things turn out to be more difficult than I thought): In the ghosted case, PETSc does not provide the appropriate global-to-local mapping that would be required for e.g. NumericVector::operator()(unsigned int). I asked about this on the petsc-users list. Jed Brown commented that the natural thing would be for libMesh's DofMap to work on local dof numbers. (A local-to-global mapping does seem to be provided by PETSc.)

My idea would be that PetscVector creates the global-to-local mapping (for the ghost indices only) itself and stores it, e.g. as a std::map<unsigned int, unsigned int>. This should still save a lot of memory compared with the serial vector version. Do you agree?

Best Regards,

Tim

--
Dr. Tim Kroeger
tim...@me...                          Phone +49-421-218-7710
tim...@ce...                          Fax   +49-421-218-4236
Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany
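In code, Tim's idea might look like this sketch (the member names are hypothetical): the vector keeps its own map from global ghost indices to their positions in the appended ghost block, and operator() consults it for non-owned entries via PETSc's local form.

    // Hypothetical PetscVector members:
    //   Vec _vec;                   the ghosted PETSc vector
    //   unsigned int _first, _last; locally owned range [_first, _last)
    //   std::map<unsigned int, unsigned int> _global_to_local;
    //                               ghost dof -> offset in ghost block

    Real operator() (const unsigned int i) const
    {
      PetscScalar value;
      Vec local_form;

      // The local form is a sequential vector: owned entries first,
      // ghost entries appended after them.
      VecGhostGetLocalForm (_vec, &local_form);

      PetscInt idx;
      if (i >= _first && i < _last)
        idx = i - _first;                      // owned entry
      else
        {
          std::map<unsigned int, unsigned int>::const_iterator
            it = _global_to_local.find(i);
          libmesh_assert (it != _global_to_local.end());
          idx = (_last - _first) + it->second; // ghost entry
        }

      VecGetValues (local_form, 1, &idx, &value);
      VecGhostRestoreLocalForm (_vec, &local_form);

      return static_cast<Real>(value);
    }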
From: Roy S. <roy...@ic...> - 2009-01-15 14:00:06
On Thu, 15 Jan 2009, Tim Kroeger wrote:

> But there is another problem (things turn out to be more difficult
> than I thought): In the ghosted case, PETSc does not provide the
> appropriate global-to-local mapping that would be required for e.g.
> NumericVector::operator()(unsigned int). I asked about this on the
> petsc-users list. Jed Brown commented that the natural thing would
> be for libMesh's DofMap to work on local dof numbers.

Giving identical degrees of freedom different ids on different processors would not be natural for us, especially considering the changes we'd be forced to make: to the DofMap (which would now need to talk to our numeric interfaces for the first time!), to our non-PETSc numeric interfaces, to code that stores non-DoF data associated with particular degrees of freedom... I sympathize with their desire for contiguous local dof numbers, but I think this would be too big a change for us right now.

> (A local-to-global mapping does seem to be provided by PETSc.)

Interesting - like a sparsity pattern for vectors. Is it shared between multiple vectors?

> My idea would be that PetscVector creates the global-to-local mapping
> (for the ghost indices only) itself and stores it, e.g. as a
> std::map<unsigned int, unsigned int>. This should still save a lot
> of memory compared with the serial vector version.

Sounds like the best we can do. Maybe typedef the container to make it easier for us to play with map vs. hash_map performance.

---
Roy
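The typedef Roy suggests is a one-line indirection, roughly like this sketch (hash_map here names the pre-C++11 GNU extension, as the alternative to benchmark; the typedef name is made up):

    // One place to switch the ghost-lookup container when profiling:
    typedef std::map<unsigned int, unsigned int> GlobalToLocalMap;
    // Alternative to try later, where available:
    // typedef __gnu_cxx::hash_map<unsigned int, unsigned int>
    //   GlobalToLocalMap;

    GlobalToLocalMap _global_to_local; // ghost dofs only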
From: Tim K. <tim...@ce...> - 2009-01-15 16:13:02
On Thu, 15 Jan 2009, Roy Stogner wrote:

>> (A local-to-global mapping does seem to be provided by PETSc.)
>
> Interesting - like a sparsity pattern for vectors. Is it shared
> between multiple vectors?

I think so, but I'm not sure.

>> My idea would be that PetscVector creates the global-to-local mapping
>> (for the ghost indices only) itself and stores it, e.g. as a
>> std::map<unsigned int, unsigned int>. This should still save a lot
>> of memory compared with the serial vector version.
>
> Sounds like the best we can do. Maybe typedef the container to make
> it easier for us to play with map vs. hash_map performance.

Okay. (I haven't done much yet, though.)

Best Regards,

Tim

--
Dr. Tim Kroeger
tim...@me...                          Phone +49-421-218-7710
tim...@ce...                          Fax   +49-421-218-4236
Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany