Thanks for the info.
I guess that the savings I was reporting were also due to the DofObject optimization introduced with VariableGroups.
Do you store one DofObject per VariableGroup?
I guess in my usage pattern one DofObject per element would be the most efficient choice.
Would it be hard to do such a thing?
I think that to use add_block_matrix the local matrices should have
contiguous VariableGroup dofs, right?
The problem is that my local matrices now have contiguous dofs for each variable.
Also, how do I get the blocked_dof_indices?
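For what it's worth, this is what I was planning to try in order to build them, just a sketch under the assumption that the block index of a node is its scalar dof index divided by the block size (dof_map, elem and bs come from my usual assembly loop; please correct me if this is wrong):

    // scalar dof indices for this element, as usual
    std::vector<dof_id_type> dof_indices;
    dof_map.dof_indices (elem, dof_indices);

    // one block index per node: divide by the block size and remove
    // duplicates (assumes the coupled dofs of a node share one block)
    const unsigned int bs = 3; // my block size
    std::vector<numeric_index_type> blocked_dof_indices;
    for (std::size_t i=0; i<dof_indices.size(); i++)
      blocked_dof_indices.push_back (dof_indices[i] / bs);
    std::sort (blocked_dof_indices.begin(), blocked_dof_indices.end());
    blocked_dof_indices.erase (std::unique (blocked_dof_indices.begin(),
                                            blocked_dof_indices.end()),
                               blocked_dof_indices.end());

I'm not sure the resulting ordering matches the row/column ordering of my local matrix, which is really part of my question.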
I'm going to try with -log_summary to understand if the PETSc matrix has the
right structure and why a block of size 12 instead of 3 doesn't lead to further memory savings.
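(Concretely, I plan to run something like

    mpirun -np 4 ./my_app-opt -log_summary

and compare the memory reported for Matrix objects in the summary with and without blocking; the executable name is of course just a placeholder for my application.)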
Thanks for the help.
On Fri, May 31, 2013 at 3:15 PM, Kirk, Benjamin (JSC-EG311) wrote:
> On May 31, 2013, at 4:04 AM, Lorenzo Alessio Botti <lorenzoalessiobotti@...> wrote:
> > I don't know if it is the "correct" way to do it but the memory
> > footprint has not changed.
> > I'm also interested in knowing if the memory savings take place on the
> > libMesh side or on the PETSc side, or on both sides.
> That's what I would have done, for what it's worth.
> The memory savings with this most recent patch is purely on the PETSc
> side. Specifically:
> - Back in December when we introduced "VariableGroups" there was an
> appreciable memory savings in the DofObject by reducing the number of
> variable indices that we store. The forward plan is to also optimize the
> DofMap for the case of VariableGroups, where sparsity pattern generation
> could be more efficient. However, this is not done yet. We still generate
> the sparsity pattern and constraints at the per-DOF level.
> - This most recent change boils down to just this, in petsc_matrix.C:
> #ifdef LIBMESH_ENABLE_BLOCKED_STORAGE
>   if (blocksize > 1)
>     {
>       // specified blocksize, bs>1.
>       // double check sizes.
>       libmesh_assert_equal_to (m_local  % blocksize, 0);
>       libmesh_assert_equal_to (n_local  % blocksize, 0);
>       libmesh_assert_equal_to (m_global % blocksize, 0);
>       libmesh_assert_equal_to (n_global % blocksize, 0);
>
>       ierr = MatSetType(_mat, MATBAIJ); // Automatically chooses seqbaij or mpibaij
>       ierr = MatSetBlockSize(_mat, blocksize);
>
>       // transform the per-entry n_nz and n_oz arrays into their block counterparts
>       std::vector<numeric_index_type> b_n_nz, b_n_oz;
>       transform_preallocation_arrays (blocksize,
>                                       n_nz, n_oz,
>                                       b_n_nz, b_n_oz);
>
>       ierr = MatSeqBAIJSetPreallocation(_mat, blocksize, 0,
>                                         b_n_nz.empty() ? NULL : (PetscInt*) &b_n_nz[0]);
>       ierr = MatMPIBAIJSetPreallocation(_mat, blocksize,
>                                         0, b_n_nz.empty() ? NULL : (PetscInt*) &b_n_nz[0],
>                                         0, b_n_oz.empty() ? NULL : (PetscInt*) &b_n_oz[0]);
>     }
> The function transform_preallocation_arrays() is defined in that file and
> simply takes the usual per-dof n_nz, n_oz libMesh-computed arrays and
> transforms them into their per-block counterparts.
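If I understand that description correctly, the transformation amounts to something like the sketch below (just my reading of it, not the actual libMesh source; I'm assuming the per-dof counts within a block row are all equal, so it's enough to look at the first scalar row of each block):

    // per-dof counts -> per-block counts: one entry per block row, and
    // each count divided by the block size since one block column now
    // covers blocksize scalar columns
    std::vector<numeric_index_type> b_n_nz, b_n_oz;
    for (std::size_t i=0; i<n_nz.size(); i += blocksize)
      {
        b_n_nz.push_back (n_nz[i] / blocksize);
        b_n_oz.push_back (n_oz[i] / blocksize);
      }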
> So you see the most recent memory savings is almost entirely on the PETSc side. Try
> running with the -log_summary command line option too…
> For a sparse matrix, PETSc must store the graph of it as well. Without
> blocked DOFs, the size of the graph is the same as the number of nonzeros
> in the matrix. With blocked DOFs, each graph entry represents an NB*NB
> dense matrix. So there is a memory savings there for sure, but at some
> point you'll still be dominated by storing the coefficients. Maybe that's
> already happening in your case?
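(Doing the arithmetic for my case, block size 3, and assuming double precision values with 32-bit indices: with AIJ each nonzero costs roughly 8 bytes for the value plus 4 bytes for its column index, about 12 bytes; with BAIJ one 4-byte block index is shared by the 9 values of a 3x3 block, so roughly 8 + 4/9 ≈ 8.5 bytes per nonzero. That's about a 30% reduction, ignoring row pointers, so yes, the coefficients probably dominate already.)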
> The other thing this allows for is more efficient element matrix
> insertion. Rather than traversing the graph at the scalar value level it
> can be done at the block index level. This is accessed via libMesh using
> jacobian->add_block_matrix (dRdUe, blocked_dof_indices);
> I haven't advertised this much yet…
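This is exactly what I'd like to use. To make sure I understand the intended usage, I picture the assembly looking roughly like the sketch below (just my guess, with dRdUe and blocked_dof_indices built per element as above, and jacobian a SparseMatrix<Number>*, e.g. the system matrix):

    // loop over local elements, build the blocked element Jacobian and
    // insert it with one call per element instead of one add per scalar entry
    MeshBase::const_element_iterator       el     = mesh.active_local_elements_begin();
    const MeshBase::const_element_iterator end_el = mesh.active_local_elements_end();
    for ( ; el != end_el; ++el)
      {
        const Elem * elem = *el;

        DenseMatrix<Number> dRdUe;
        std::vector<numeric_index_type> blocked_dof_indices;
        // ... fill dRdUe (n_blocks*bs x n_blocks*bs) and the n_blocks block indices ...

        jacobian->add_block_matrix (dRdUe, blocked_dof_indices);
      }

If that's the intended pattern I'll give it a try and report the timings.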