Lorenzo, Roy, Derek, and Jed,
Thanks a lot for your comments. To be honest, I hadn't thought about
cache misses before. I know they matter a lot in scientific code; I just
don't yet know how to reason about them while writing it. I will run some
experiments comparing the two strategies: recomputing every cheap
quantity when it is needed versus storing everything that is needed.
The other question is about the element iterator. When we write ++iter, what
happens behind that operator? Does the current element have to climb up
the quad forest and then descend to the lowest active child? Could we
instead construct a map of all active elementIndex -> element * entries
and loop over the elements directly through that map (or a vector)?
I know there is no anisotropic refinement in the library. How much effort
would it take to add it? That is, roughly what fraction of the current
code relies on the isotropic assumption? Thanks a lot. I really appreciate
the opportunity to discuss these topics with you.
On Wed, Aug 29, 2012 at 12:52 PM, Jed Brown <jed@...> wrote:
> I'll also comment that aggressive caching is pessimal from a
> modern-hardware and energy perspective. Memory and memory bandwidth take an
> increasing part of the power and acquisition budget. On modern hardware,
> for operations that can be vectorized, you should usually plan to recompute
> everything that takes less than 50 flops per scalar value. If you have a
> computation that takes more than 50 flops to recompute, you may want to
> store it, but be aware that reloading it may displace more useful cache
> lines and if you aren't careful about cache locality (e.g. if the element
> is visited by a thread running on a different socket later), performance
> results may be very bad and/or not reproducible.
> Lei, I suggest being very conservative when selecting what may be worth
> caching. Also, depending on your application, there may be much larger
> gains to be had by looking elsewhere.
> On Wed, Aug 29, 2012 at 12:04 PM, Derek Gaston <friedmud@...> wrote:
>> With MOOSE we have the full gamut of options. The default is to
>> recompute everything. We also have the option to cache one FE reinit
>> in the case where you have a regular grid. We also have the option of
>> caching EVERY FE reinit for every element so that you can reuse them
>> until the mesh changes.
>> I'll say that that last one is not used very often... it eats up SO
>> much damn memory... especially in the cases where it would be useful
>> (like on higher order elements in 3D). So it's kind of a
>> self-defeating optimization...
>> Sent from my iPhone
>> On Aug 29, 2012, at 5:56 PM, Roy Stogner <roystgnr@...> wrote:
>> > On Wed, 29 Aug 2012, Lorenzo Alessio Botti wrote:
>> >> You can always store quantities that you need several times but this
>> >> might eat up a lot of memory. Accessing memory might be expensive
>> >> and sometimes it might be even faster to recompute.
>> > This can absolutely be true - on modern processors some problems end
>> > up, even in the assembly, bottlenecked by memory bandwidth rather than
>> > CPU speed. At that point anything that you can compute based on data
>> > that's already in the processor cache is "free".
>> >> The libMesh approach is to recompute everything.
>> > That's the libMesh examples approach, anyway, but that's intended to
>> > keep the examples simple more than anything else, I thought. When
>> > profiling real applications I've found that finding ways to cache
>> > transcendental function evaluations (e.g. by doing nodal quadrature or
>> > reinterpolation of such terms) can give a decent savings in runtime.
>> > Of course efficiency discussion is influenced by the fact that all the
>> > main libMesh developers are profiling based on our own app codes:
>> > implicit solves with intricate constitutive models. If you're doing
>> > an explicit calculation then it probably makes sense to cache even
>> > mappings and shape functions; if you're solving a linear problem then
>> > you could do much better with Rob Kirby's element matrix
>> > transformation tricks than with plain quadrature; etc.
>> > ---
>> > Roy
>> > _______________________________________________
>> > Libmesh-users mailing list
>> > Libmesh-users@...
>> > https://lists.sourceforge.net/lists/listinfo/libmesh-users