Re: [Algorithms] Octree\KD tree Generation for better cache coherence...

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

RE: [Algorithms] Octree\KD tree Generation for better cache =
coherence...Well, I would say your best bet for cache coherance is to =
actually optimise for size. But it doesn't really matter that much, =
because the trees don't have that many nodes, and traversal doesn't =
happen often (unless you have a raytracer).=20

Kd-tree tends to be the better I find (maybe not in terms of storage, =
but in terms of speed/flexibility).=20

If you don't wanna fragment RAM, the easiest way would be to just =
allocate a chunk, and use your own memory handler. Although, the easiest =
way to write a kd-tree or octree generation system tends to be with =
recursive constructors.=20

When it comes to in order traversal, both systems can be used to easily =
traverse in order (although, for frustum culling it isn't needed). A =
kd-tree is the same to do back to front as a BSP, except all your planes =
to test are axis aligned (means you only need a single compare to =
traverse).=20

Front to back for a kd-tree (sorta pseudo cody stuff warning):

class kdtreenode {
    float m_partition;
    UINT8 m_axii;
    kdtreenode* m_children[2];
public:
    void Traverse(float* camera,FifoStack* fs);
};

void kdtreenode::Traverse(float* camera,FifoStack* fs) {
    if (m_children[0]) { // Test leaf
        // Not leaf, traverse further
        UINT8 travtest=3D(m_partition>camera[m_axii]);
        m_children[travtest]->Traverse(camera);
        m_children[travtest^1]->Traverse(camera);
    } else {
        // Is leaf, put on stack (in order)=20
        fs->Push(this);       =20
    }
}

An octree can be traversed similarily, although you do 3 axii tests at =
once, and use binary logic for the sort.=20

Conor Stokes

  ----- Original Message -----=20
  From: Chris Brodie=20
  To: 'gda...@li...'=20
  Sent: Thursday, August 31, 2000 8:57 AM
  Subject: RE: [Algorithms] Octree\KD tree Generation for better cache =
coherence...

  (Apologies for the HTML. My mail administrators are still working on =
it. Something to do with the mail system ignoring the client settings. =
If this goes on much longer I'll switch to a native SMTP client and talk =
to a unix box.)

  I've been thinking recently about how to best structure the data in =
ram to make best use of my cache. Initially I was thinking about a =
tightly packed array that stores the data\splitting planes =
hierarchically top down. Using this method I could use pointer math to =
move around within the tree's data, which I'm not sure is a good thing =
or not from a performance point.

  The other idea I had recently is to store my tree in traversal order. =
Obviously this wouldn't be optimised for every situation but the more =
traversal that you do the greater you efficiency would be as all the =
data will likely be in order until you start skipping hunks of the tree. =

  I'm guessing that this is probably one or the more important things to =
optimise in an engine. The only source I've ever seen to do with these =
kinds of tree's was on flipcode(ask midnight) and that used dynamic =
memory allocation for storage.=20

  One last question. When you perform frustum culling to get your list =
of visible nodes do you use a special traversal technique to get a front =
to back order or do you just read them in any order then blast them =
through a radix sort. I have seen some references to techniques that do =
this but I'm not sure if the unusual traversal order would be worse on =
the cache than just reading them in as you find them and then doing a =
fast sort. In my case I'm only expecting to find perhaps 150-250 nodes =
at most.

  Many thanks=20

  Chris=20