Re: [Algorithms] Cache-coherent hashing for broad-phase collision detection

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

You definitely do not want to ever be using a set or a map.  The only reason
to use a set or a map is if you absolutely need to constantly maintain
something in sorted order; which is rarely ever the case.  Most of the time
you just want to do a key/value lookup and for that a hash table is
extremely fast.  Using the STL in performance sensitive code is also
questionable as a whole lot of memory allocations and copy constructors get
executed both of which you want to avoid.

If you are going to use the STL, and then have a profile of many tiny
allocations, you should consider using a custom small memory allocator;
there are several implementations available.  If you are dealing in a
heavily multi-threaded environment you need to take that into consideration
as well.  Something like the Hoard memory allocator is not only designed to
handle many tiny allocations efficiently but also minimizes thread
contention.

John

On Mon, Mar 22, 2010 at 6:05 PM, Sebastian Sylvan <
seb...@gm...> wrote:

>
>
> On Mon, Mar 22, 2010 at 9:49 PM, Darren Grant <
> dg...@ke...> wrote:
>
>> Hi,
>>
>> I am looking at ways of improving cache coherence for a typical
>> broad-phase collision detection scenario (cpu).  The basic
>> implementation utilizes these data structures:
>>
>>        struct ObjectTracking {
>>                int ObjectIndex;
>>                int RefCount;
>>        };
>>
>>        typedef set<ObjectTracking> Tile;
>>
>>        struct TilingLevel {
>>                vector<Tile> Tiles;
>>                int Pitch;
>>                Tile& Map(int x, int y) { return Tiles[y*Pitch+x]; }
>>        };
>>
>>        struct TilingHierarchy {
>>                vector<TilingLevel> TilingLevels;
>>                Tile& Map(int level, int x, int y)  { return
>> TilingLevels[level].Map(x,y); }
>>        };
>>
>>
>> And the hot process involves frequent iteration of:
>>
>>        for level from 0 to maxlevel
>>        for y from y0(level) to y1(level)
>>        for x from x0(level) to x1(level)
>>                tilingHierarchy.Map(level,x,y).InsertOrRemove();
>>
>>
>> Performance becomes an issue here when the std::set allocations get
>> spread out across process memory.  I will therefore replace
>> individual sets with a single shared map to explore implementations
>> for coherence:
>>
>>        (level,x,y,ObjectIndex)->(RefCount).
>>
>>
>> (level,x,y,ObjectIndex) has simple perfect hash functions since the
>> boundary conditions are known at start.  Local memory access roughly
>> favours tiling levels > tiling level > tile row > tile according to
>> the given code, but I am wary of less obvious yet still important
>> observations that may be missed.
>>
>>
>> Does this sound about right?  Are there any better options I have
>> flown right past?
>>
>> Thank you.
>
>
> Try a single hash table (multimap) mapping from the cell position to
> objects in that cell position? You could try open adressing to get rid of
> pointers altogether and have just a single flat array at the bottom of your
> data structure, although that has limitations (difficulty deleting, less
> than gracious performance degradation as the load factor grows).
>
> --
> Sebastian Sylvan
>
>
> ------------------------------------------------------------------------------
> Download Intel&#174; Parallel Studio Eval
> Try the new software tools for yourself. Speed compiling, find bugs
> proactively, and fine-tune applications for parallel performance.
> See why Intel Parallel Studio got high marks during beta.
> http://p.sf.net/sfu/intel-sw-dev
> _______________________________________________
> GDAlgorithms-list mailing list
> GDA...@li...
> https://lists.sourceforge.net/lists/listinfo/gdalgorithms-list
> Archives:
> http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithms-list
>