Re: BZBUGS: Segfaults on AIX 4.2

> sorry for the long delay, but I was quite busy the last few days.
> 
> During the weekend I built the latest egcs snapshot on my Linux box at home,
> and it compiled Blitz and its test programs without problems.
> The only thing that irritates me is the memory and CPU time consumption of
> the compiler: the typical compiler process size was 100 MB; when using
> aggressive optimization I run out of virtual memory (ca. 180 MB). Maybe I
> should consider a memory upgrade ;)

I'm glad it works!  I got to the point of duplicating the bug (it was
crashing even for:

int main()
{
    TinyVector<int,3> x;
}

which was fairly pathetic).

> > Finite element or finite difference?
> 
> I'll use the finite volume PPM (piecewise parabolic method by Colella &
> Woodward (JCP, 54 (1984), 174)), most likely on a 3D Cartesian grid. The
> implementation I start from is PROMETHEUS by Bruce Fryxell, which is often
> used for parallel computer benchmarks (e.g. Beowulf computers). The original
> code is a real mess (if you want to see something scary, just tell me so,
> and I'll send you a copy).
> Because I have to make some extensions to it, I think I'll rewrite the
> whole thing from scratch.

Sure, I'd appreciate a copy of the code.  I'm always on the lookout for
blitz benchmarks.

> I'll keep you informed about progress and performance, once I have a running
> Blitz++ here at work. But there is one thing I'd like to know:
> 
> PROMETHEUS is full of constructs like the following:
> 
> do i=1,n
>   a(i) = b+c*d(i)
>   e(i) = f*g(i)
>   h(i) = j(i)/k(i)
> end do
> 
> The programmers always tried to create "meaty" loops.
> Using Blitz++ I would write:
> 
> a=b+c*d;
> e=f*g;
> h=j/k;
> 
> But, as far as I know, this code results in three separate loops. Does this
> result in a significant performance loss? Or does the C++ compiler fuse the
> loops by itself?

This is on my todo list for the summer (actually, I'm starting work on
it this week).  The performance issues are tricky; to summarize:

- if the arrays fit in cache, then fusing the loops causes a slowdown on
  some architectures (particularly the T3E)
- if the arrays do not fit in cache, and there are common array operands
  among the expressions (e.g. x=a+b; y=a-b;), then it is better to fuse.
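To make the out-of-cache case concrete, here is a minimal sketch of the x=a+b; y=a-b; example as hand-written loops. It uses plain std::vector rather than Blitz arrays, and the function names sum_diff_unfused and sum_diff_fused are made up for illustration. The unfused version streams a and b through memory twice; the fused version loads each element once and uses it for both results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Unfused: one loop per statement, so a and b are each traversed twice.
void sum_diff_unfused(const std::vector<double>& a, const std::vector<double>& b,
                      std::vector<double>& x, std::vector<double>& y) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    x.resize(n);
    y.resize(n);
    for (std::size_t i = 0; i < n; ++i) x[i] = a[i] + b[i];  // pass 1 over a, b
    for (std::size_t i = 0; i < n; ++i) y[i] = a[i] - b[i];  // pass 2 over a, b
}

// Fused: a single loop; each a[i] and b[i] is loaded once and reused.
void sum_diff_fused(const std::vector<double>& a, const std::vector<double>& b,
                    std::vector<double>& x, std::vector<double>& y) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    x.resize(n);
    y.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double ai = a[i];
        const double bi = b[i];
        x[i] = ai + bi;
        y[i] = ai - bi;
    }
}
```

Both functions compute the same results; which one is faster depends on the array size relative to the cache and on the architecture, as described above, so it is worth timing both on the target machine.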

Blitz will provide either "stencil objects" to solve this problem, or
an explicit fuse notation like:

  fuse(a=b+c*d,
       e=f*g,
       h=j/k);

or maybe both.  Stencil objects are the more elegant solution, and more OO,
but might result in wordier code.
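As a rough idea of what explicit fusion means, the sketch below runs several per-element statements inside one loop. It is not the Blitz interface (that is still undecided); fuse_loops is a hypothetical helper, it works on plain std::vector, and each statement is passed as a callable taking the index i.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT part of Blitz: runs every statement for index i
// before moving on to i+1, so all statements share a single loop.
template <typename... Stmts>
void fuse_loops(std::size_t n, Stmts... stmts) {
    for (std::size_t i = 0; i < n; ++i) {
        // Braced-init-list expansion calls the statements left to right.
        using expand = int[];
        (void)expand{0, ((void)stmts(i), 0)...};
    }
}

// The three statements from the PROMETHEUS-style loop, fused into one pass.
void update(double b, double c, double f,
            const std::vector<double>& d, const std::vector<double>& g,
            const std::vector<double>& j, const std::vector<double>& k,
            std::vector<double>& a, std::vector<double>& e,
            std::vector<double>& h) {
    const std::size_t n = d.size();
    assert(g.size() == n && j.size() == n && k.size() == n);
    a.resize(n);
    e.resize(n);
    h.resize(n);
    fuse_loops(n,
               [&](std::size_t i) { a[i] = b + c * d[i]; },
               [&](std::size_t i) { e[i] = f * g[i]; },
               [&](std::size_t i) { h[i] = j[i] / k[i]; });
}
```

Passing the index explicitly is what makes this wordier than the array notation; a real fuse() would work on the array expressions themselves.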

Cheers,
Todd