From: David X. <dx...@my...> - 2006-07-22 23:30:50
|
Hi All,

I was trying to assemble the stiffness and mass matrices on a dense mesh with 531441 nodes (40x40x40, HEX27, 3rd, HERMITE), and I ran into an "Out of memory" problem at the line:

equation_systems.init();

Here's the error message:

[0]PETSC ERROR: PetscMallocAlign() line 62 in src/sys/memory/mal.c
[0]PETSC ERROR: Out of memory. This could be due to allocating
[0]PETSC ERROR: too large an object or bleeding by not properly
[0]PETSC ERROR: destroying unneeded objects.
[0]PETSC ERROR: Memory allocated 1380430780 Memory used by process 886095872
[0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
[0]PETSC ERROR: Memory requested 907039528!
[0]PETSC ERROR: PetscTrMallocDefault() line 191 in src/sys/memory/mtr.c
[0]PETSC ERROR: MatSeqAIJSetPreallocation_SeqAIJ() line 2735 in src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: MatCreateSeqAIJ() line 2621 in src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: User provided function() line 137 in unknowndirectory/src/numerics/petsc_matrix.C
[unset]: aborting job:
application called MPI_Abort(comm=0x84000000, 1) - process 0

My question is: is it possible to assemble the matrices without having to initialize the equation system? My goal is just to output the assembled system matrices to files; I don't have to solve them inside libMesh.

Thanks!
David |
From: Roy S. <roy...@ic...> - 2006-07-22 23:53:53
|
On Sat, 22 Jul 2006, David Xu wrote:

> I was trying to assemble the stiffness and mass matrices on a dense mesh
> with 531441 nodes (40x40x40, HEX27, 3rd, HERMITE)

Can I suggest trying HEX8? The HERMITE elements are unique among our
higher order elements in that all their degrees of freedom are
topologically associated with mesh vertices, so unless you need
quadratic mapping functions you don't need HEX27 elements. Those nodes
aren't responsible for most of your memory use (the system matrix is),
but every megabyte helps.

> and I ran into an "Out of memory" problem at the line:
>
> equation_systems.init();
>
> [PETSc error log snipped]
>
> My question is: is it possible to assemble the matrices without having
> to initialize the equation system?

I'm afraid not. You could initialize a finite element object and
evaluate the element matrices, but putting them into the system matrix
requires that matrix and the degree of freedom structures be
initialized, and those two things are probably what's sucking up all
your memory.

> My goal is just to output the assembled system matrices to files and
> I don't have to solve them inside libMesh.

The system matrix should have 551368 degrees of freedom, most of which
couple to 216 others. With 8-byte coefficients that's a hundred megs of
RAM, and with sparsity pattern overhead it's probably two hundred
megs... but nine hundred MB seems excessive. Are you solving for more
than one scalar, using a system like a generalized EigenSystem that
builds more than one matrix, using complex-valued variables, or
anything else that might bump up the RAM requirements?

I'd appreciate it if you've got debugging tools that can give you a
memory breakdown by object type and could send us such output. It
sounds like either we or PETSc might need to do a little more
optimization.

To work around your immediate problem, however: can you output the
element matrices instead of the system matrix, and assemble them
outside of libMesh? It sounds like you're using a structured mesh,
which can require much less overhead than the unstructured mesh class
in libMesh.

---
Roy |
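The 551368 figure above can be checked with a few lines of arithmetic (a sketch assuming 8 Hermite degrees of freedom per vertex, i.e. value, three first derivatives, three mixed second derivatives, and one mixed third derivative; the per-vertex count is an assumption, not stated in the thread):

```python
# DoF count for cubic HERMITE variables on a 40x40x40 hex mesh.
# Assumption: each of the 41^3 vertices carries 8 DoFs (value, 3 first
# derivatives, 3 mixed second derivatives, 1 mixed third derivative).
nx = ny = nz = 40
vertices = (nx + 1) * (ny + 1) * (nz + 1)   # 68921 vertices on a 41^3 grid
dofs = vertices * 8
print(dofs)                                 # 551368
```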
From: David X. <dx...@my...> - 2006-07-23 00:15:52
|
On 7/22/06, Roy Stogner <roy...@ic...> wrote:
>
> On Sat, 22 Jul 2006, David Xu wrote:
>
> > I was trying to assemble the stiffness and mass matrices on a dense mesh
> > with 531441 nodes (40x40x40, HEX27, 3rd, HERMITE)
>
> Can I suggest trying HEX8? The HERMITE elements are unique among our
> higher order elements in that all their degrees of freedom are
> topologically associated with mesh vertices, so unless you need
> quadratic mapping functions you don't need HEX27 elements. Those
> nodes aren't responsible for most of your memory use (the system matrix
> is), but every megabyte helps.

Are you saying HEX27 and HEX8 will give the same level of accuracy in the solutions if I don't need quadratic mapping functions?

> > My question is: is it possible to assemble the matrices without having
> > to initialize the equation system?
>
> I'm afraid not. You could initialize a finite element object and
> evaluate the element matrices, but putting them into the system matrix
> requires that matrix and the degree of freedom structures be
> initialized, and those two things are probably what's sucking up all
> your memory.

I see.

> > My goal is just to output the assembled system matrices to files and
> > I don't have to solve them inside libMesh.
>
> The system matrix should have 551368 degrees of freedom, most of which
> couple to 216 others. With 8-byte coefficients that's a hundred megs
> of RAM, and with sparsity pattern overhead it's probably two hundred
> megs... but nine hundred MB seems excessive. Are you solving for more
> than one scalar, using a system like a generalized EigenSystem that
> builds more than one matrix, using complex-valued variables, or
> anything else that might bump up the RAM requirements?

It's for solving 2 system matrices in a real-valued generalized eigenvalue problem.

I went back and tried (30x30x30, HEX27, 3rd order HERMITE). This time it didn't blow out the memory, but it did take significantly longer to assemble the matrices than (30x30x30, HEX27, 2nd order LAGRANGE). Maybe HERMITE is the problem?

> I'd appreciate it if you've got debugging tools that can give you a
> memory breakdown by object type and could send us such output.
> It sounds like either we or PETSc might need to do a little more
> optimization.

I don't have any debugging tools, and I have to admit that I'm a below-average C++ user.

> To work around your immediate problem, however: can you output the
> element matrices instead of the system matrix, and assemble them
> outside of libMesh? It sounds like you're using a structured mesh,
> which can require much less overhead than the unstructured mesh class
> in libMesh.

How do I assemble the element matrices outside of libMesh? Do you know any existing code/program that can do that? So that would be: output each element matrix to a file, and a program should be able to read in all the element matrices from the file and assemble them into system matrices. I'm definitely interested if this is doable.

Thanks,
David |
From: Roy S. <roy...@ic...> - 2006-07-23 01:22:43
|
On Sat, 22 Jul 2006, David Xu wrote:

> Are you saying HEX27 and HEX8 will give the same level of accuracy in the
> solutions if I don't need quadratic mapping functions?

Yes.

> It's for solving 2 system matrices in a real-valued generalized
> eigenvalue problem.

Okay, then the simplest workaround is clear: only build one matrix at a
time. As long as you're just writing them both out to files anyway,
there's no reason you need them both in RAM simultaneously.

> I went back and tried (30x30x30, HEX27, 3rd order HERMITE). This
> time it didn't blow out the memory, but it did take significantly
> longer to assemble the matrices than (30x30x30, HEX27, 2nd order
> LAGRANGE). Maybe HERMITE is the problem?

Almost certainly it is. None of our finite element classes are as
optimized as they should be, but I think the Hermite elements may be
worse than average.

Keep in mind, too, that even if they were equally optimized, the
Hermite assembly would be more expensive. If you're using the default
quadrature order (which is designed for nonlinear problems and may be
gross overkill for you) then I think quadratic hexes will be
calculating at 27 points and cubic hexes will be calculating at 64.

> How to assemble the element matrices outside of libmesh?

Basically you'd almost do what libMesh does: create a big empty matrix
of the appropriate sparsity pattern, then loop through all the element
matrices and add their entries after looking up the global index for
each local degree of freedom.

The only difference is that because you know you've got a uniform
grid, you could do that local->global lookup with a few equations
instead of the big data structures that general unstructured grids
require.

> Do you know any existing code/program that can do that? So that
> would be: output each element matrix to a file, and a program should
> be able to read in all the element matrices from the file and
> assemble them into system matrices. I'm definitely interested if
> this is doable.

It's definitely doable, but I don't know of any existing code to do
it. I could script it up in Matlab pretty easily, but the Matlab
sparse matrix format sucks and so I wouldn't want to work with the
result.

---
Roy |
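Roy's point about replacing the DoF-map data structures with a few equations can be sketched like this (a hypothetical lexicographic vertex numbering with consecutive per-vertex DoFs; libMesh's internal ordering may differ):

```python
# Local->global DoF lookup on a uniform nx x ny x nz hex grid.
# Assumption: vertices are numbered lexicographically and each vertex's
# ndof DoFs are stored consecutively -- a hypothetical convention.

def global_dof(i, j, k, component, ny, nz, ndof=8):
    """Global index of DoF `component` at vertex (i, j, k)."""
    vertex = (i * (ny + 1) + j) * (nz + 1) + k
    return vertex * ndof + component

def element_dofs(ei, ej, ek, ny, nz, ndof=8):
    """All global DoF indices of hex element (ei, ej, ek), whose eight
    vertices are the (ei+di, ej+dj, ek+dk) corners."""
    return [global_dof(ei + di, ej + dj, ek + dk, c, ny, nz, ndof)
            for di in (0, 1) for dj in (0, 1) for dk in (0, 1)
            for c in range(ndof)]
```

With 8 Hermite DoFs per vertex, each hex element then has 64 local DoFs, consistent with the quadrature discussion above.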
From: David X. <dx...@my...> - 2006-07-23 02:31:40
|
On 7/22/06, Roy Stogner <roy...@ic...> wrote:
>
> On Sat, 22 Jul 2006, David Xu wrote:
>
> > Are you saying HEX27 and HEX8 will give the same level of accuracy in
> > the solutions if I don't need quadratic mapping functions?
>
> Yes.

Just curious: does the same rule apply to other types of element? What about Tet, Tri, Quad, and Prism? So, is the level of solution accuracy independent of the number of nodes within the same type of element? What about the difference between different element types in terms of the effect on the quality of the solutions?

> > It's for solving 2 system matrices in a real-valued generalized
> > eigenvalue problem.
>
> Okay, then the simplest workaround is clear: only build one matrix at
> a time. As long as you're just writing them both out to files
> anyway, there's no reason you need them both in RAM simultaneously.

Yes, that's a great idea.

> > I went back and tried (30x30x30, HEX27, 3rd order HERMITE). This
> > time it didn't blow out the memory, but it did take significantly
> > longer to assemble the matrices than (30x30x30, HEX27, 2nd order
> > LAGRANGE). Maybe HERMITE is the problem?
>
> Almost certainly it is. None of our finite element classes are as
> optimized as they should be, but I think the Hermite elements may be
> worse than average.
>
> Keep in mind, too, that even if they were equally optimized, the
> Hermite assembly would be more expensive. If you're using the default
> quadrature order (which is designed for nonlinear problems and may be
> gross overkill for you) then I think quadratic hexes will be
> calculating at 27 points and cubic hexes will be calculating at 64.

That explains why the output file size from HERMITE is much larger than LAGRANGE. Does that mean that even though the matrix dimension is the same, HERMITE produces more entries in the matrix, so it's less sparse?

> > How to assemble the element matrices outside of libmesh?
>
> Basically you'd almost do what libMesh does: create a big empty matrix
> of the appropriate sparsity pattern, then loop through all the element
> matrices and add their entries after looking up the global index for
> each local degree of freedom.
>
> The only difference is that because you know you've got a uniform
> grid, you could do that local->global lookup with a few equations
> instead of the big data structures that general unstructured grids
> require.
>
> > Do you know any existing code/program that can do that?
>
> It's definitely doable, but I don't know of any existing code to do
> it. I could script it up in Matlab pretty easily, but the Matlab
> sparse matrix format sucks and so I wouldn't want to work with the
> result.

I might try it using python/numpy/scipy. Thanks for the great tips!

David |
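For the python/numpy/scipy route mentioned above, a minimal assembly sketch might look like the following. The per-element record format (global DoF indices plus a dense local matrix) is an assumption about what one would dump from libMesh, not an existing file format:

```python
# Assemble a global sparse matrix from element matrices, as sketched in
# the thread.  Assumes each element record supplies its global DoF
# indices and its dense local matrix (a hypothetical dump format).
import numpy as np
import scipy.sparse as sp

def assemble(element_records, n_dofs):
    """element_records: iterable of (dofs, Ke) pairs, where `dofs` is a
    length-m integer array and Ke is the m x m dense element matrix."""
    rows, cols, vals = [], [], []
    for dofs, Ke in element_records:
        m = len(dofs)
        # Scatter the local matrix into global (row, col, value) triplets.
        rows.extend(np.repeat(dofs, m))
        cols.extend(np.tile(dofs, m))
        vals.extend(Ke.ravel())
    # Duplicate (row, col) triplets are summed on conversion from COO
    # format, which is exactly the finite element "add to global" step.
    return sp.coo_matrix((vals, (rows, cols)), shape=(n_dofs, n_dofs)).tocsr()

# Toy usage: two 1D linear elements sharing node 1 of a 3-node mesh.
Ke = np.array([[1.0, -1.0], [-1.0, 1.0]])
K = assemble([(np.array([0, 1]), Ke), (np.array([1, 2]), Ke)], 3)
print(K.toarray())   # middle diagonal entry is 2.0: contributions summed
```

The COO-with-duplicates idiom avoids building an explicit sparsity pattern up front, at the cost of holding all triplets in memory during assembly.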
From: Roy S. <roy...@ic...> - 2006-07-23 03:02:49
|
On Sat, 22 Jul 2006, David Xu wrote:

> Just curious: does the same rule apply to other types of element? What
> about Tet, Tri, Quad, and Prism? So, is the level of solution accuracy
> independent of the number of nodes within the same type of element?

The level of solution accuracy is independent of the number of
geometric nodes... but libMesh reuses geometric nodes to store degrees
of freedom that have the same topological connectivity, so you usually
still need second-order nodes even if you aren't fitting a second-order
geometry. As far as I know, the HERMITE elements and the two
discontinuous elements are the only way to get better than linear
approximations on linear geometric elements. If you try to use finite
elements on geometric elements that don't support them, however, you
won't just get reduced accuracy; your code will exit with an error.

> What about the difference between different element types in terms
> of the effect on the quality of the solutions?

You can get better solutions (better conditioned matrices, at least)
from quadratic elements if you use a mesh smoother that takes advantage
of them. Mostly, though, you only need higher order geometric elements
to better fit curved domain boundaries.

>> Keep in mind, too, that even if they were equally optimized, the
>> Hermite assembly would be more expensive. If you're using the default
>> quadrature order (which is designed for nonlinear problems and may be
>> gross overkill for you) then I think quadratic hexes will be
>> calculating at 27 points and cubic hexes will be calculating at 64.
>
> That explains why the output file size from HERMITE is much larger
> than LAGRANGE.

No, it doesn't. I'm talking about quadrature points here, and the size
of your final matrix is (with few exceptions) independent of the
quadrature rule you use to calculate it. I can see why that's
confusing, though: by coincidence the number of quadrature points is
the same as the number of local DoFs for both elements here.

Of course, it's not just the quadrature rule that's important. Having
64 local DoFs instead of 27 also increases calculation time. Finally,
on uniform meshes Hermite cube DoFs usually couple to 216 DoFs rather
than 27, 45, or 125, which is probably what's increasing your output
file size.

> Does that mean that even though the matrix dimension is the same,
> HERMITE produces more entries in the matrix, so it's less sparse?

Yes. Increasing polynomial order requires more bandwidth, so does
increasing continuity, and going from quadratic Lagrange to cubic
Hermite does both at once.

---
Roy |
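The counts quoted in this exchange all fall out of tensor products in 3D. Here is the arithmetic (my reading of where the numbers come from; in particular, the derivation of 216 from a 3x3x3 vertex patch is an assumption):

```python
# Tensor-product bookkeeping behind the counts in this thread
# (editor's arithmetic; the 216 derivation is an assumption).

# Gauss quadrature in 3D: p points per direction -> p**3 points total.
quadratic_hex_qpoints = 3 ** 3   # 27 points on a quadratic hex
cubic_hex_qpoints = 4 ** 3       # 64 points on a cubic hex

# On a uniform grid, a vertex basis function overlaps the 2x2x2
# surrounding elements, i.e. a 3x3x3 patch of vertices.  With 8 Hermite
# DoFs per vertex, each DoF couples to 27 * 8 = 216 DoFs:
hermite_bandwidth = 3 ** 3 * 8

# Compare quadratic Lagrange (5**3 = 125 nodes in the same patch) and
# linear Lagrange (3**3 = 27 vertices).
print(quadratic_hex_qpoints, cubic_hex_qpoints, hermite_bandwidth)
```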