Thread: changes fix #2
Status: Inactive
Brought to you by:
madduck
From: Paul <ele...@ya...> - 2004-11-05 05:21:17
Attachments:
kdtree_fix2.patch
|
Don't think i attached it ... Hi, Thought I'd better run the changes past you before I checked them in. Attached is the diff. Let me know if you are happy with it, I'd love to check it in myself. Lots of changes... to make your old code work with the new library, the changes you will need to do include: - find_within_range() doesn't return a size_t anymore (thats a previous patch) - its optimize() not optimise() (read on for more about why) The changes: Looked at std::set and std::list, and the stl documentation for stuff on associative container concepts, sequence concepts etc. Not sure exactly what applies here, but I just added support for what looked ok. Theres now a full set of constructors, including default and range. You can also pass an allocator (STL does, I never use the feature... not yet anyway). Fill constructors are obviously not appropriate for this one. Note that this probably is closest to a priority_queue, which is actually an adapted vector<>. They don't seem to stick to a single concept specifically, so I guess we have some room to move. Range constructors are great. Just do Tree tree(begin,end); and its filled and ready to go... after an optimisation. Theres also now a full set of inserters. This allows inserter(cont,cont.begin()) to work. Note the 'position' iterator is always disregarded (at least at this stage). It might in the future be helpful for insertion hints (like in set), but that is Future with a capital F. Not now. I also removed the requirement for types to have default-construction. Previously for a type to work with kdtree++, it required a Type() style constructor. This is unnecessary as demonstrated by set's constructor. I also had to slightly change the way it was destroyed. I just followed the example in std::set. This also fixed problems demonstrated by: KDTree<3,Point> tree; assert( tree.begin() == tree.end() ); // this would fail !! tree.optimise(); // segmentation fault!! I readded the 'iterator' type. set uses iterator instead of const_iterator in all of its definitions except for find() etc. Both are defined as the same type. Scott Meyers (i think?) recommended not using const_iterator too much, as its a pain if you want to use it in some algorithms (which require iterators for whatever reason)... whatever, we should try and follow what others do - everything except find() uses iterator. I removed 'inline' declarations in front of method calls. They are not required... IIRC inline methods are more like hints to the compiler, 'inline' is much more useful for free functions, which will create multiple implementation instances without the inline (and cause all sorts of link collisions). The compiler will figure out if its best to inline or not, its hard for us to know best. an example is std::set. the only inline functions are things like operator==() and in stl_tree.h: _Rb_tree_rebalance(). These are both free functions and must be inlined if you want to keep them in a header file. I added a count_within_range() method for simply counting the number of items inside a range. This is more efficient than doing a find if you just want a count, since the values aren't written to an output iterator. I added this-> to some of the method calls. The STL does this, presumably to avoid confusing the calls with free-function calls. This is also probably why the STL 'reserves' function/method/variable names starting with __ I am torn between american-ising the spelling, and not. As an aussie, i get a bit pissed off when i have to continuously spell in american, or mix spellings. I bet non-english speakers feel the same way. However, for the sake of consistency, its best to americanise. americanize. So i renamed optimise(). I hate the 'z', so if you insist on the old spelling, I won't be disappointed, but I bet we would have to switch back eventually. Until Australian becomes the world's standard language. I renamed _M_meta to _M_header because they call it that in set<>. I noticed rend() returned reverse_iterator(end()). It should return begin(). In optimize(), I use a range constructor instead of a copy(). This will be faster. Things that could still possibly be done: - Improve operator=() and copy-constructor... theres a faster way of copying the tree across. - I looked at erase(), It would be nice to have a fast rebalancer like set, but right now thats asking too much. I was using erase() to average and group points in the tree. I would average the first point and erase the items that were averaged, eventually the tree was empty. It worked ok, but it was O(nlogn), and a bad one. Instead, I wrote some code that 'tagged' each item with an id, and checked a set<size_t> registry to ignore the points I had ignored. MUCH much faster, and while there is an O(nlogn) in the set<> find, its a much smaller hit. well, thats all for now. Paul |
From: Paul <ele...@ya...> - 2004-11-05 02:22:39
|
Hi, Thought I'd better run the changes past you before I checked them in. Attached is the diff. Let me know if you are happy with it, I'd love to check it in myself. Lots of changes... to make your old code work with the new library, the changes you will need to do include: - find_within_range() doesn't return a size_t anymore (thats a previous patch) - its optimize() not optimise() (read on for more about why) The changes: Looked at std::set and std::list, and the stl documentation for stuff on associative container concepts, sequence concepts etc. Not sure exactly what applies here, but I just added support for what looked ok. Theres now a full set of constructors, including default and range. You can also pass an allocator (STL does, I never use the feature... not yet anyway). Fill constructors are obviously not appropriate for this one. Note that this probably is closest to a priority_queue, which is actually an adapted vector<>. They don't seem to stick to a single concept specifically, so I guess we have some room to move. Range constructors are great. Just do Tree tree(begin,end); and its filled and ready to go... after an optimisation. Theres also now a full set of inserters. This allows inserter(cont,cont.begin()) to work. Note the 'position' iterator is always disregarded (at least at this stage). It might in the future be helpful for insertion hints (like in set), but that is Future with a capital F. Not now. I also removed the requirement for types to have default-construction. Previously for a type to work with kdtree++, it required a Type() style constructor. This is unnecessary as demonstrated by set's constructor. I also had to slightly change the way it was destroyed. I just followed the example in std::set. This also fixed problems demonstrated by: KDTree<3,Point> tree; assert( tree.begin() == tree.end() ); // this would fail !! tree.optimise(); // segmentation fault!! I readded the 'iterator' type. set uses iterator instead of const_iterator in all of its definitions except for find() etc. Both are defined as the same type. Scott Meyers (i think?) recommended not using const_iterator too much, as its a pain if you want to use it in some algorithms (which require iterators for whatever reason)... whatever, we should try and follow what others do - everything except find() uses iterator. I removed 'inline' declarations in front of method calls. They are not required... IIRC inline methods are more like hints to the compiler, 'inline' is much more useful for free functions, which will create multiple implementation instances without the inline (and cause all sorts of link collisions). The compiler will figure out if its best to inline or not, its hard for us to know best. an example is std::set. the only inline functions are things like operator==() and in stl_tree.h: _Rb_tree_rebalance(). These are both free functions and must be inlined if you want to keep them in a header file. I added a count_within_range() method for simply counting the number of items inside a range. This is more efficient than doing a find if you just want a count, since the values aren't written to an output iterator. I added this-> to some of the method calls. The STL does this, presumably to avoid confusing the calls with free-function calls. This is also probably why the STL 'reserves' function/method/variable names starting with __ I am torn between american-ising the spelling, and not. As an aussie, i get a bit pissed off when i have to continuously spell in american, or mix spellings. I bet non-english speakers feel the same way. However, for the sake of consistency, its best to americanise. americanize. So i renamed optimise(). I hate the 'z', so if you insist on the old spelling, I won't be disappointed, but I bet we would have to switch back eventually. Until Australian becomes the world's standard language. I renamed _M_meta to _M_header because they call it that in set<>. I noticed rend() returned reverse_iterator(end()). It should return begin(). In optimize(), I use a range constructor instead of a copy(). This will be faster. Things that could still possibly be done: - Improve operator=() and copy-constructor... theres a faster way of copying the tree across. - I looked at erase(), It would be nice to have a fast rebalancer like set, but right now thats asking too much. I was using erase() to average and group points in the tree. I would average the first point and erase the items that were averaged, eventually the tree was empty. It worked ok, but it was O(nlogn), and a bad one. Instead, I wrote some code that 'tagged' each item with an id, and checked a set<size_t> registry to ignore the points I had ignored. MUCH much faster, and while there is an O(nlogn) in the set<> find, its a much smaller hit. well, thats all for now. Paul |
From: martin f k. <ma...@ma...> - 2004-11-06 15:22:32
|
also sprach Paul <ele...@ya...> [2004.11.05.0320 +0100]: > Thought I'd better run the changes past you before I checked them in.=20 Ah, lovely fucking sourceforge crap. Sorry for my French there, but they are highly annoying. I shall move this library to alioth.debian.org soon. > Looked at std::set and std::list, and the stl documentation for > stuff on associative container concepts, sequence concepts etc. > Not sure exactly what applies here, but I just added support for > what looked ok. Cool. > always disregarded (at least at this stage). It might in the > future be helpful for insertion hints (like in set), but that is > Future with a capital F. Not now. I checked out insertion hints, but they are impossible to manage, as the tree itself is not sorted across the contained elements, only down a single root->leaf walk. > I also removed the requirement for types to have > default-construction. Previously for a type to work with kdtree++, > it required a Type() style constructor. This is unnecessary as > demonstrated by set's constructor. Could you elaborate on this, please? > I readded the 'iterator' type. set uses iterator instead of > const_iterator in all of its definitions except for find() etc. > Both are defined as the same type. Scott Meyers (i think?) > recommended not using const_iterator too much, as its a pain if > you want to use it in some algorithms (which require iterators for > whatever reason)... whatever, we should try and follow what others > do - everything except find() uses iterator. Mh, I would be interested in the reference. Scott Meyers says many things, and I surely don't approve of all. If algorithms use iterators without modifying the element pointed to, then it's a bug in the algorithm, not in kdtree. I also think that following the others is the way to go... unless I have to implement all the whacko bugs and hacks the STL contains. > I removed 'inline' declarations in front of method calls. They > are not required... True, but I always like to make things explicit. In this case, however, I am willing to go along without further discussion. > I added a count_within_range() method for simply counting the > number of items inside a range. This is more efficient than doing > a find if you just want a count, since the values aren't written > to an output iterator. Yeah, nice! > I added this-> to some of the method calls. The STL does this, > presumably to avoid confusing the calls with free-function calls. > This is also probably why the STL 'reserves' > function/method/variable names starting with __ Yes, precisely. Thanks for catching these. > I am torn between american-ising the spelling, and not.=20 As I said, let's do both. I don't want the API to change right now. Let's add optimize() to call optimise() and add some snide comment in there about how British English is superior :) > I renamed _M_meta to _M_header because they call it that in set<>. Fine. > I noticed rend() returned reverse_iterator(end()). It should > return begin(). Oops. > - Improve operator=3D() and copy-constructor... theres a faster way > of copying the tree across. Certainly. > - I looked at erase(), It would be nice to have a fast rebalancer > like set, but right now thats asking too much. Ha! Have a ball. I tore out my hair for two days on this. I may well simply have overlooked it though. If you figure something out, you know it's worth an ACM or IEEE paper! Again, Paul, thanks for your help. --=20 martin; (greetings from the heart of the sun.) \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck =20 invalid/expired pgp subkeys? use subkeys.pgp.net as keyserver! spamtraps: mad...@ma... =20 "if i am occasionally a little overdressed, i make up for it by being always immensely over-educated." -- oscar wilde |