Lorenzo, Roy, Derek, and Jed,

Thanks a lot for your comments. To be honest, I hadn't thought about cache misses before. I know they are very important in scientific code; I just don't know how to reason about them while writing it. I will run some experiments comparing two approaches: recomputing every cheap quantity when needed versus storing everything that is needed.

The other question concerns the element iterator. When we write ++iter, what happens behind that operator? Does it have to climb the quad forest from the current element and descend to the lowest child? Could we instead construct a map of all active elementIndex -> Elem* entries, and loop over the elements directly through that map or a vector?

I know there is no anisotropic refinement in the library. How much effort would it take to add it? In other words, how much of the current code relies on the isotropic assumption? Thanks a lot. I really appreciate the opportunity to discuss these topics with you.

Sincerely Yours,

Lei Shi
----------


On Wed, Aug 29, 2012 at 12:52 PM, Jed Brown <jed@59a2.org> wrote:
I'll also comment that aggressive caching is pessimal from a modern-hardware and energy perspective. Memory and memory bandwidth take an increasing part of the power and acquisition budget. On modern hardware, for operations that can be vectorized, you should usually plan to recompute everything that takes less than 50 flops per scalar value. If you have a computation that takes more than 50 flops to recompute, you may want to store it, but be aware that reloading it may displace more useful cache lines and if you aren't careful about cache locality (e.g. if the element is visited by a thread running on a different socket later), performance results may be very bad and/or not reproducible.

Lei, I suggest being very conservative when selecting what may be worth caching. Also, depending on your application, there may be much larger gains to be had by looking elsewhere.


On Wed, Aug 29, 2012 at 12:04 PM, Derek Gaston <friedmud@gmail.com> wrote:
With MOOSE we have the full gamut of options.  The default is to
recompute everything.  We also have the option to cache one FE reinit
in the case where you have a regular grid.  We also have the option of
caching EVERY FE reinit for every element so that you can reuse them
until the mesh changes.

I'll say that that last one is not used very often... it eats up SO
much damn memory... especially in the cases where it would be useful
(like on higher order elements in 3D).  So it's kind of a
self-defeating optimization...

Derek

Sent from my iPhone

On Aug 29, 2012, at 5:56 PM, Roy Stogner <roystgnr@ices.utexas.edu> wrote:

>
> On Wed, 29 Aug 2012, Lorenzo Alessio Botti wrote:
>
>> You can always store quantities that you need several times, but this
>> might eat up a lot of memory. Accessing memory can be expensive,
>> and sometimes it may even be faster to recompute.
>
> This can absolutely be true - on modern processors some problems end
> up, even in the assembly, bottlenecked by memory bandwidth rather than
> CPU speed.  At that point anything that you can compute based on data
> that's already in the processor cache is "free".
>
>> The libMesh approach is to recompute everything.
>
> That's the libMesh examples approach, anyway, but that's intended to
> keep the examples simple more than anything else, I thought.  When
> profiling real applications I've found that finding ways to cache
> transcendental function evaluations (e.g. by doing nodal quadrature or
> reinterpolation of such terms) can give a decent savings in runtime.
>
> Of course the efficiency discussion is influenced by the fact that all the
> main libMesh developers are profiling based on our own app codes:
> implicit solves with intricate constitutive models.  If you're doing
> an explicit calculation then it probably makes sense to cache even
> mappings and shape functions; if you're solving a linear problem then
> you could do much better with Rob Kirby's element matrix
> transformation tricks than with plain quadrature; etc.
> ---
> Roy
>
> _______________________________________________
> Libmesh-users mailing list
> Libmesh-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/libmesh-users
