Thank you all for the details. All of these ideas are useful, but I think
for my application I will have to build all three matrices simultaneously.
Let me describe the application a little more. I am writing an image
renderer, and the FEM code simulates light transfer in a material using a
diffusion process. The three solutions are the three different colors of
light the renderer simulates. The matrix assembly requires a lighting
simulation performed by another library, and this simulation is
significantly more expensive than the integration and the other parts of
the assembly. The library produces all three color components together,
so the only practical choice for me is to cache the three color values
needed for each matrix element until all the solutions are generated.
That storage is close to the size of the matrix itself, so I just create
three slightly different copies of the matrix and solve them in parallel.
Your serial methods would certainly avoid the duplication, but I can't
really avoid it. However, because I know I must have significant
duplication, I want to decrease that footprint as much as possible. Based
on my initial testing with other libraries, I will likely need grids with
as many as 1 million elements. My goal for a libmesh implementation is to
have as little overhead as possible, which is why I was hopeful about
avoiding the DofMap. That said, can you give me a sense of how large the
DofMap would be for an AMR application, say relative to the matrix
storage? Also, if you wanted to get something working as quickly as
possible, but were willing to do whatever was necessary to avoid the
duplication, where would you start?
I should also clarify my background: I am very comfortable with the
programming side but am much less of an FEM person, and I really need
this working in the next couple of months.
Thanks for all the help.
> -----Original Message-----
> From: Roy Stogner [mailto:roystgnr@...]
> Sent: Wednesday, October 01, 2008 11:41 AM
> To: John Peterson
> Cc: Adam Arbree; libmesh-users@...
> Subject: Re: [Libmesh-users] (no subject)
> On Wed, 1 Oct 2008, John Peterson wrote:
> > Static may not even be required. I think a single system with 3
> > different matrix assembly "branches" is what you want.
> John is probably exactly right. In the worst case, if one of your
> systems has a constant matrix, having to reassemble it over and over
> again will waste a few CPU cycles... but even then it may be worth it
> to save the memory cost of the entire matrix.
> >   for (qp)
> >     for (i)
> >       for (j)
> >         switch (system_num) // A variable you define to order the 3 systems
> >           case 1:
> >             K(i,j) += something; break;
> >           case 2:
> >             K(i,j) += something else; break;
> >         end switch
> >       end j
> >     end i
> >   end qp
> I think putting the switch statement outside the for loops is slightly
> more efficient, if matrix assembly time ever becomes a noticeable
> issue (as it did for one app of mine).
> > I've done this in the past to assemble different right-hand side
> > vectors depending on whether I was solving the main system or an
> > auxiliary arclength continuation system of equations. It should be
> > straightforward to extend that to the matrix assembly itself as in the
> > pseudocode above.
> It is straightforward, and I have no excuse for not thinking of it
> and mentioning it myself. From my first shear thinning flow code:
> void assemble_stokes (EquationSystems& es,
>                       const std::string& system_name)
> {
>   if (find_vorticity)
>     assemble_vorticity(es, system_name);
>   // ...
> }
> assemble_stokes() built the residual and jacobian of the Navier-Stokes
> system to solve; assemble_vorticity built a linear system to project
> the vorticity into the same smooth space for plotting.
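Roy's suggestion to hoist the switch outside the loops could be sketched
as below. This is not real libMesh assembly code: the matrix is a plain
nested vector, the contributions are placeholder constants, and
`assemble_element` is a hypothetical name; the sketch only shows the
control flow, with the branch evaluated once per element instead of once
per (qp, i, j) iteration.

```cpp
#include <vector>

// Dense element-matrix stand-in; in libMesh this would be a
// DenseMatrix<Number> sized n_dofs x n_dofs.
using ElemMatrix = std::vector<std::vector<double>>;

// The switch on system_num sits outside the quadrature/shape-function
// loops, so each branch carries its own uninterrupted triple loop.
void assemble_element(ElemMatrix& K, int system_num, int n_qp, int n_dofs)
{
  switch (system_num)
    {
    case 1:
      for (int qp = 0; qp < n_qp; ++qp)
        for (int i = 0; i < n_dofs; ++i)
          for (int j = 0; j < n_dofs; ++j)
            K[i][j] += 1.0; // placeholder for "something"
      break;
    case 2:
      for (int qp = 0; qp < n_qp; ++qp)
        for (int i = 0; i < n_dofs; ++i)
          for (int j = 0; j < n_dofs; ++j)
            K[i][j] += 2.0; // placeholder for "something else"
      break;
    }
}
```

The duplicated loop nests are the price of taking the branch out of the
hot path; whether that matters depends on whether assembly time is ever
noticeable next to the solve.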