From: Todd V. <tve...@oo...> - 1998-06-22 18:39:36
> sorry for the long delay, but I was quite busy the last few days.
>
> During the weekend I built the latest egcs snapshot on my linux box at home,
> and it compiled Blitz and its test programs without problems.
> The only thing that irritates me is the memory and CPU time consumption of
> the compiler: the typical compiler process size was 100MB; when using
> aggressive optimization I run out of virtual memory (ca. 180 MB).
> Maybe I should consider a memory upgrade ;)

I'm glad it works! I got to the point of duplicating the bug (it was crashing
even for:

int main() { TinyVector<int,3> x; }

which was fairly pathetic).

> > Finite element or finite difference?
>
> I'll use the finite volume PPM (piecewise parabolic method by Colella &
> Woodward (JCP, 54 (1984), 174)), most likely on a 3D cartesian grid. The
> implementation I start from is PROMETHEUS by Bruce Fryxell which is often
> used for parallel computer benchmarks (e.g. Beowulf computers). The original
> code is a real mess (if you want to see something scary, just tell me so,
> and I'll send you a copy).
> Because I have to make some extensions for it, I think I'll rewrite the
> whole thing from scratch.

Sure, I'd appreciate a copy of the code. I'm always on the lookout for blitz
benchmarks.

> I'll keep you informed about progress and performance, once I have a running
> Blitz++ here at work. But there is one thing I'd like to know:
>
> PROMETHEUS is full of constructs like the following:
>
> do i=1,n
>   a(i) = b+c*d(i)
>   e(i) = f*g(i)
>   h(i) = j(i)/k(i)
> end do
>
> The programmers always tried to create "meaty" loops.
> Using Blitz++ I would write:
>
> a=b+c*d;
> e=f*g;
> h=j/k;
>
> But, as far as I know, this code results in three separate loops. Does this
> result in significant performance loss? Or does the C++-compiler do the loop
> fusing by itself?

This is on my todo list for the summer (actually, I'm starting work on it this
week).
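To make the question concrete, here is a minimal sketch in plain C++ (not Blitz++ code) of the two loop structures being compared. The names `separate_loops` and `fused_loop` and the use of `std::vector<double>` are assumptions made for illustration only; b, c and f are taken to be scalars, as in the Fortran loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;  // illustrative stand-in for an array type

// Three passes over memory, one per statement. This is the structure
// a = b + c*d;  e = f*g;  h = j/k;  takes when each statement is
// evaluated as its own loop.
void separate_loops(Vec& a, double b, double c, const Vec& d,
                    Vec& e, double f, const Vec& g,
                    Vec& h, const Vec& j, const Vec& k)
{
    const std::size_t n = a.size();
    assert(d.size() == n && e.size() == n && g.size() == n);
    assert(h.size() == n && j.size() == n && k.size() == n);

    for (std::size_t i = 0; i < n; ++i) a[i] = b + c * d[i];
    for (std::size_t i = 0; i < n; ++i) e[i] = f * g[i];
    for (std::size_t i = 0; i < n; ++i) h[i] = j[i] / k[i];
}

// One pass over memory with all three statements in the loop body,
// which is the hand-fused ("meaty") form of the Fortran loop.
void fused_loop(Vec& a, double b, double c, const Vec& d,
                Vec& e, double f, const Vec& g,
                Vec& h, const Vec& j, const Vec& k)
{
    const std::size_t n = a.size();
    assert(d.size() == n && e.size() == n && g.size() == n);
    assert(h.size() == n && j.size() == n && k.size() == n);

    for (std::size_t i = 0; i < n; ++i) {
        a[i] = b + c * d[i];
        e[i] = f * g[i];
        h[i] = j[i] / k[i];
    }
}
```

Both functions compute the same values; they differ only in how many times the loop is traversed and in the order in which the arrays are touched.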
The performance issues are tricky; to summarize,

- if the arrays fit in cache, then fusing the loops causes a slowdown on some
  architectures (particularly T3E)
- if it is out-of-cache, and there are common array operands among the
  expressions (e.g. x=a+b; y=a-b;) then it is better to fuse.

Blitz will provide either "stencil objects" to solve this problem, or an
explicit fuse notation like:

fuse(a=b+c*d, e=f*g, h=j/k);

or maybe both. Stencil objects are the more elegant solution, and more OO, but
might result in wordier code.

Cheers,
Todd