[Opentrep-svn] SF.net SVN: opentrep:[126] trunk/opentrep
Status: Beta
Brought to you by:
denis_arnaud
From: <den...@us...> - 2009-07-14 10:45:31
|
Revision: 126 http://opentrep.svn.sourceforge.net/opentrep/?rev=126&view=rev Author: denis_arnaud Date: 2009-07-14 10:45:19 +0000 (Tue, 14 Jul 2009) Log Message: ----------- [Ternary Trees] Added the ternary trees structure. Added Paths: ----------- trunk/opentrep/ternary_tree/ trunk/opentrep/ternary_tree/README trunk/opentrep/ternary_tree/doxygen_input/ trunk/opentrep/ternary_tree/doxygen_input/blather.hpp trunk/opentrep/ternary_tree/doxygen_input/concepts.txt trunk/opentrep/ternary_tree/doxygen_input/doxygen-old.css trunk/opentrep/ternary_tree/doxygen_input/doxygen.css trunk/opentrep/ternary_tree/doxygen_input/external.png trunk/opentrep/ternary_tree/doxygen_input/featuretable.html trunk/opentrep/ternary_tree/doxygen_input/footer_inc.html trunk/opentrep/ternary_tree/doxygen_input/header_inc.html trunk/opentrep/ternary_tree/doxygen_input/performancetable.html trunk/opentrep/ternary_tree/doxygen_input/tree - trie concepts.txt trunk/opentrep/ternary_tree/doxygen_input/usage.hpp trunk/opentrep/ternary_tree/examples/ trunk/opentrep/ternary_tree/examples/examples.vcproj trunk/opentrep/ternary_tree/examples/locale_less.hpp trunk/opentrep/ternary_tree/examples.cpp trunk/opentrep/ternary_tree/fill_dictionary.cpp trunk/opentrep/ternary_tree/full-docs-index.html trunk/opentrep/ternary_tree/html/ trunk/opentrep/ternary_tree/html/annotated.html trunk/opentrep/ternary_tree/html/blather_8hpp.html trunk/opentrep/ternary_tree/html/class_data_t.html trunk/opentrep/ternary_tree/html/class_data_t_01_5.html trunk/opentrep/ternary_tree/html/classcontainers_1_1search__results__list-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1search__results__list.html trunk/opentrep/ternary_tree/html/classcontainers_1_1search__results__list_1_1iterator-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1search__results__list_1_1iterator.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__map-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__map.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__map_1_1value__compare-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__map_1_1value__compare.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multimap-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multimap.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multimap_1_1value__compare-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multimap_1_1value__compare.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multiset-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multiset.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__set-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__set.html trunk/opentrep/ternary_tree/html/classcontainers_1_1ternary__tree-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1ternary__tree.html trunk/opentrep/ternary_tree/html/classcontainers_1_1ternary__tree_1_1key__compare-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1ternary__tree_1_1key__compare.html trunk/opentrep/ternary_tree/html/classcontainers_1_1tst__detail_1_1_base_t.html trunk/opentrep/ternary_tree/html/classcontainers_1_1tst__detail_1_1tst__iterator__base-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1tst__detail_1_1tst__iterator__base.html trunk/opentrep/ternary_tree/html/classstd_1_1back__insert__iterator.html trunk/opentrep/ternary_tree/html/classstd_1_1binary__function.html trunk/opentrep/ternary_tree/html/dir_0df55976ff011c1ef61da79183e9e28f.html trunk/opentrep/ternary_tree/html/dir_59457a7c227558cb0e28f31428e14f54.html trunk/opentrep/ternary_tree/html/dirs.html trunk/opentrep/ternary_tree/html/doxygen.css trunk/opentrep/ternary_tree/html/doxygen.png trunk/opentrep/ternary_tree/html/files.html trunk/opentrep/ternary_tree/html/functions.html trunk/opentrep/ternary_tree/html/functions_0x62.html trunk/opentrep/ternary_tree/html/functions_0x63.html trunk/opentrep/ternary_tree/html/functions_0x64.html trunk/opentrep/ternary_tree/html/functions_0x65.html trunk/opentrep/ternary_tree/html/functions_0x66.html trunk/opentrep/ternary_tree/html/functions_0x67.html trunk/opentrep/ternary_tree/html/functions_0x68.html trunk/opentrep/ternary_tree/html/functions_0x69.html trunk/opentrep/ternary_tree/html/functions_0x6b.html trunk/opentrep/ternary_tree/html/functions_0x6c.html trunk/opentrep/ternary_tree/html/functions_0x6d.html trunk/opentrep/ternary_tree/html/functions_0x6e.html trunk/opentrep/ternary_tree/html/functions_0x6f.html trunk/opentrep/ternary_tree/html/functions_0x70.html trunk/opentrep/ternary_tree/html/functions_0x72.html trunk/opentrep/ternary_tree/html/functions_0x73.html trunk/opentrep/ternary_tree/html/functions_0x74.html trunk/opentrep/ternary_tree/html/functions_0x75.html trunk/opentrep/ternary_tree/html/functions_0x76.html trunk/opentrep/ternary_tree/html/functions_0x77.html trunk/opentrep/ternary_tree/html/functions_0x7e.html trunk/opentrep/ternary_tree/html/functions_enum.html trunk/opentrep/ternary_tree/html/functions_eval.html trunk/opentrep/ternary_tree/html/functions_func.html trunk/opentrep/ternary_tree/html/functions_func_0x62.html trunk/opentrep/ternary_tree/html/functions_func_0x63.html trunk/opentrep/ternary_tree/html/functions_func_0x64.html trunk/opentrep/ternary_tree/html/functions_func_0x65.html trunk/opentrep/ternary_tree/html/functions_func_0x66.html trunk/opentrep/ternary_tree/html/functions_func_0x67.html trunk/opentrep/ternary_tree/html/functions_func_0x68.html trunk/opentrep/ternary_tree/html/functions_func_0x69.html trunk/opentrep/ternary_tree/html/functions_func_0x6b.html trunk/opentrep/ternary_tree/html/functions_func_0x6c.html trunk/opentrep/ternary_tree/html/functions_func_0x6d.html trunk/opentrep/ternary_tree/html/functions_func_0x6e.html trunk/opentrep/ternary_tree/html/functions_func_0x6f.html trunk/opentrep/ternary_tree/html/functions_func_0x70.html trunk/opentrep/ternary_tree/html/functions_func_0x72.html trunk/opentrep/ternary_tree/html/functions_func_0x73.html trunk/opentrep/ternary_tree/html/functions_func_0x74.html trunk/opentrep/ternary_tree/html/functions_func_0x75.html trunk/opentrep/ternary_tree/html/functions_func_0x76.html trunk/opentrep/ternary_tree/html/functions_func_0x7e.html trunk/opentrep/ternary_tree/html/functions_rela.html trunk/opentrep/ternary_tree/html/functions_type.html trunk/opentrep/ternary_tree/html/functions_type_0x62.html trunk/opentrep/ternary_tree/html/functions_type_0x63.html trunk/opentrep/ternary_tree/html/functions_type_0x64.html trunk/opentrep/ternary_tree/html/functions_type_0x66.html trunk/opentrep/ternary_tree/html/functions_type_0x68.html trunk/opentrep/ternary_tree/html/functions_type_0x69.html trunk/opentrep/ternary_tree/html/functions_type_0x6b.html trunk/opentrep/ternary_tree/html/functions_type_0x6c.html trunk/opentrep/ternary_tree/html/functions_type_0x6d.html trunk/opentrep/ternary_tree/html/functions_type_0x6e.html trunk/opentrep/ternary_tree/html/functions_type_0x70.html trunk/opentrep/ternary_tree/html/functions_type_0x72.html trunk/opentrep/ternary_tree/html/functions_type_0x73.html trunk/opentrep/ternary_tree/html/functions_type_0x74.html trunk/opentrep/ternary_tree/html/functions_type_0x76.html trunk/opentrep/ternary_tree/html/functions_vars.html trunk/opentrep/ternary_tree/html/globals.html trunk/opentrep/ternary_tree/html/globals_defs.html trunk/opentrep/ternary_tree/html/globals_func.html trunk/opentrep/ternary_tree/html/graph_legend.dot trunk/opentrep/ternary_tree/html/graph_legend.html trunk/opentrep/ternary_tree/html/graph_legend.png trunk/opentrep/ternary_tree/html/hierarchy.html trunk/opentrep/ternary_tree/html/index.html trunk/opentrep/ternary_tree/html/iteration__impl_8hpp.html trunk/opentrep/ternary_tree/html/iterator__wrapper_8hpp.html trunk/opentrep/ternary_tree/html/namespacecontainers.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1smap__detail.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1sset__detail.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1tst__detail.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1tst__detail_1_1mpl__detail.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1tst__erase__impl__detail.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1util.html trunk/opentrep/ternary_tree/html/namespaceiterators.html trunk/opentrep/ternary_tree/html/namespacemembers.html trunk/opentrep/ternary_tree/html/namespacemembers_func.html trunk/opentrep/ternary_tree/html/namespaces.html trunk/opentrep/ternary_tree/html/namespacestd.html trunk/opentrep/ternary_tree/html/new__iterator__base_8ipp.html trunk/opentrep/ternary_tree/html/pages.html trunk/opentrep/ternary_tree/html/perf_notes.html trunk/opentrep/ternary_tree/html/structcontainers_1_1smap__detail_1_1multimap__iterator-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1smap__detail_1_1multimap__iterator.html trunk/opentrep/ternary_tree/html/structcontainers_1_1sset__detail_1_1multiset__iterator-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1sset__detail_1_1multiset__iterator.html trunk/opentrep/ternary_tree/html/structcontainers_1_1ternary__tree_1_1find__result-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1ternary__tree_1_1find__result.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1always__heap__node-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1always__heap__node.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1back__push__pop-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1back__push__pop.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1dummy__sequence-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1dummy__sequence.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1heap__node-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1heap__node.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1inorder__seek-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1inorder__seek.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1inplace__node-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1inplace__node.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1iter__method__forward-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1iter__method__forward.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1key__access-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1key__access.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1levenshtein__search__info-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1levenshtein__search__info.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1levenshtein__search__info_1_1search-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1levenshtein__search__info_1_1search.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1mpl__detail_1_1if__c-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1mpl__detail_1_1if__c.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1mpl__detail_1_1if__c_3_01false_00_01_t1_00_01_t2_01_4-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1mpl__detail_1_1if__c_3_01false_00_01_t1_00_01_t2_01_4.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1node__base-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1node__base.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1size__policy__node-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1size__policy__node.html trunk/opentrep/ternary_tree/html/structiterators_1_1const__traits-members.html trunk/opentrep/ternary_tree/html/structiterators_1_1const__traits.html trunk/opentrep/ternary_tree/html/structiterators_1_1iterator__wrapper-members.html trunk/opentrep/ternary_tree/html/structiterators_1_1iterator__wrapper.html trunk/opentrep/ternary_tree/html/structiterators_1_1nonconst__traits-members.html trunk/opentrep/ternary_tree/html/structiterators_1_1nonconst__traits.html trunk/opentrep/ternary_tree/html/structured__map_8hpp.html trunk/opentrep/ternary_tree/html/structured__set_8hpp.html trunk/opentrep/ternary_tree/html/structured_concept.html trunk/opentrep/ternary_tree/html/tab_b.gif trunk/opentrep/ternary_tree/html/tab_l.gif trunk/opentrep/ternary_tree/html/tab_r.gif trunk/opentrep/ternary_tree/html/tabs.css trunk/opentrep/ternary_tree/html/ternary__tree_8hpp.html trunk/opentrep/ternary_tree/html/todo.html trunk/opentrep/ternary_tree/html/tst__implementation_8ipp.html trunk/opentrep/ternary_tree/html/tst__iterator__base_8ipp.html trunk/opentrep/ternary_tree/html/tst__iterator__facade_8hpp.html trunk/opentrep/ternary_tree/html/tst__node_8hpp.html trunk/opentrep/ternary_tree/html/tst__search__results_8ipp.html trunk/opentrep/ternary_tree/html/tst_impl.html trunk/opentrep/ternary_tree/html/tst_links.html trunk/opentrep/ternary_tree/html/tst_reference.html trunk/opentrep/ternary_tree/html/tst_tests.html trunk/opentrep/ternary_tree/html/tst_usage.html trunk/opentrep/ternary_tree/html/usage_8hpp.html trunk/opentrep/ternary_tree/index.html trunk/opentrep/ternary_tree/iterator_compile_test.cpp trunk/opentrep/ternary_tree/iterator_wrapper.hpp trunk/opentrep/ternary_tree/readme.txt trunk/opentrep/ternary_tree/structured_map.hpp trunk/opentrep/ternary_tree/structured_set.hpp trunk/opentrep/ternary_tree/ternary_tree.hpp trunk/opentrep/ternary_tree/test/ trunk/opentrep/ternary_tree/test/basic_insertion_test.hpp trunk/opentrep/ternary_tree/test/check_iteration.hpp trunk/opentrep/ternary_tree/test/copy_test.hpp trunk/opentrep/ternary_tree/test/element_range_test.hpp trunk/opentrep/ternary_tree/test/erase_test.cpp trunk/opentrep/ternary_tree/test/hamming_search_test.cpp trunk/opentrep/ternary_tree/test/iterator_test.cpp trunk/opentrep/ternary_tree/test/localization_test.cpp trunk/opentrep/ternary_tree/test/longest_match_test.cpp trunk/opentrep/ternary_tree/test/mapped_value_test.cpp trunk/opentrep/ternary_tree/test/partial_match_test.cpp trunk/opentrep/ternary_tree/test/prefix_range_test.cpp trunk/opentrep/ternary_tree/test/scrabble_search_test.cpp trunk/opentrep/ternary_tree/test/test.vcproj trunk/opentrep/ternary_tree/test/test_tst.cpp trunk/opentrep/ternary_tree/test/tests_common.hpp trunk/opentrep/ternary_tree/tst.doxy trunk/opentrep/ternary_tree/tst_concept_checks.cpp trunk/opentrep/ternary_tree/tst_detail/ trunk/opentrep/ternary_tree/tst_detail/iteration_impl.hpp trunk/opentrep/ternary_tree/tst_detail/new_iterator_base.ipp trunk/opentrep/ternary_tree/tst_detail/tst_implementation.ipp trunk/opentrep/ternary_tree/tst_detail/tst_iterator_base.ipp trunk/opentrep/ternary_tree/tst_detail/tst_iterator_facade.hpp trunk/opentrep/ternary_tree/tst_detail/tst_node.hpp trunk/opentrep/ternary_tree/tst_detail/tst_search_results.ipp trunk/opentrep/ternary_tree/tst_public.doxy trunk/opentrep/test/ternary/ Added: trunk/opentrep/ternary_tree/README =================================================================== --- trunk/opentrep/ternary_tree/README (rev 0) +++ trunk/opentrep/ternary_tree/README 2009-07-14 10:45:19 UTC (rev 126) @@ -0,0 +1,3 @@ + +Source: http://abc.se/~re/code/tst and http://abc.se/~re/code/tst/ternary_tree.zip + Added: trunk/opentrep/ternary_tree/doxygen_input/blather.hpp =================================================================== --- trunk/opentrep/ternary_tree/doxygen_input/blather.hpp (rev 0) +++ trunk/opentrep/ternary_tree/doxygen_input/blather.hpp 2009-07-14 10:45:19 UTC (rev 126) @@ -0,0 +1,558 @@ +/** \mainpage Structured Associative Containers + +Ternary Search Tree containers to replace \c set<string> and \c map<string, Value> </h2> + +<center><table bgcolor="#fbf9e5" style="border: thin dotted #808000;" width="95%" border=0> +<tr> +<td> +<h3>Table of contents</h3> +<dl> + <dt>\ref introduction "Introduction"</dt> + <dt>\ref subkey_search_overview "Advanced searches overview"</dt> + <dt>\ref tst_usage "Tutorial"</dt> + <dt>\ref tst_reference "Reference"</dt> <dd> + <dd>\ref structured_concept "Structured Container concept" \n + Class \ref containers::structured_set "structured_set" \n + Class \ref containers::structured_map "structured_map" \n + Class \ref containers::structured_multiset "structured_multiset" \n + Class \ref containers::structured_multimap "structured_multimap" \n + Implementation class \ref containers::ternary_tree "ternary_tree" + </dd></dt> + <dt>\ref perf_notes "Performance notes"</dt> + <dt>\ref tst_impl "Implementation details"</dt> + <dt>\ref tst_links "Links"</dt> + <dt>\ref tst_tests "Test Suite"</dt> +</dl> +</td> +</tr></table></center> + +Download: Latest version (0.684) http://abc.se/~re/code/tst/ternary_tree.zip\n + +Copyleft: <a href="mailto:rasmus%20point%20ekman%20at%20abc%20point%20se?subject=Structured Containers suck/rule"> +rasmus ekman</a> 2007-2009 \n +Weblink: http://abc.se/~re/code/tst + +\anchor introduction <hr> +<h2>Introduction</h2> +<b>Structured containers</b> are \c map and \c set -like containers specialized for strings. +They are commonly used for dictionaries.\n +Structured containers have two major benefits: +- They offer near-match searches (wildcard search, partial match etc) that are hard to implement + with other containers. +- Lookup performance is on a par with hashed containers for many common applications, +and 2-5 times faster than standard maps and sets (with string-like keys). + +Of course there is a price to pay: structured containers use much more memory than +other containers: Around 6-8 bytes <b>per letter</b> inserted (whether \c char or \c wchar_t); +an English 150 k word dictionary uses eg 7.3 MB to store 1.2 MB words (2.4 MB of \c wchar_t words). + +The container classes in this library can be used as drop-in replacements for \c set and \c map +(or \c unordered_set, \c unordered_map): + - \ref containers::structured_set "structured_set": This stores unique keys and allows structured key searches. + - \ref containers::structured_multiset "structured_multiset": This stores non-unique keys. + - \ref containers::structured_map "structured_map": This is a + <a target="sgi" href="http://www.sgi.com/tech/stl/PairAssociativeContainer.html">Pair Associative Container</a>, + as it allows associating a value with each key. + - \ref containers::structured_multimap "structured_multimap": Technically, a + <a target="sgi" href="http://www.sgi.com/tech/stl/MultipleSortedAssociativeContainer.html">Multiple, Sorted, + Pair Associative Container</a> - it allows storing several values with each key. + +While the STL standard associative containers are normally backed by a binary tree structure, +Structured Containers are backed by a Ternary Search Tree, as presented by +\ref note_1 "Jon Bentley and Robert Sedgewick in [1]". + +Class \ref containers::ternary_tree "ternary_tree<Key, Value, Comp, Alloc>" provides the implementation backend. +Due to its internals, its interface cannot easily be made to conform with standard STL concepts, +so it is used internally by the structured* wrapper classes (much like STL's internal \c rb_tree class). + +Basically, if you have code using sets or maps, you have code to use structured containers. +And with 1-3 lines of code, you're ready to make advanced imprecise searches in your dictionaries.\n +See \ref tst_usage "the usage section" for examples of how to use these classes. + +<table bgcolor="#f0f0ff" style="border: thin dotted #808000;" border=0> +<tr><th>Library status</th></tr> +<tr><td valign="top" align="right">Compatibility:</td> +<td>Note that the file \b tst_concept_checks.cpp is currently broken. Will investigate.\n +<!-- This used to compile with Mingw GCC 3.4.2 and with MSVC7.1 (with STLport 5). Requires Boost 1.33. +Not sure what happened in Boost 1.36-37 or if I've mangled something. \n +Due to recent changes, ternary tree does not support stateful allocators (earlier versions did this by implication) --> +</td> +<tr><td valign="top" align="right">version 0.684: (Jan 2009)</td> +<td>Fix standard-breakage in multimap/multiset return from <code>insert(const value_type&)</code>.<br> +Added <code>operator-></code> to iterator wrapper for C++0x compatibility. +Thanks to Geoffrey Noel for reports.</td> +</tr> +<tr><td valign="top" align="right">version 0.683: (March 2007)</td> +<td>Fix portability issues for GCC and non-STLport libraries. Fix longest_match.<br> +Thanks to Arjen Wagenaar for several reports, fixes and encouragement. Thanks also to Michel Tourn for reports.</td> +</tr> +<tr><td valign="top" align="right">version 0.68: (Dec 2006)</td> +<td>Implement TST_NODE_COUNT_TYPE macro, which can be used to control node size on 64-bit systems. + See \ref containers::ternary_tree "class ternary_tree"</td> +</tr> +<tr><td valign="top" align="right">version 0.68 (alpha):</td> +<td>Reimplemented node type. Do proper management of value type (was inconsistent, partly unimplemented - duh!)</td> +</tr> +<!--tr> +<tr><td valign="top" align="right">version 0.676:</td> +<td>Modified containers to follow C++0x draft standard: \n +Added \c cbegin, \c cend methods returning \c const_iterator, and \c crbegin, \c crend +returning \c const_reverse_iterator, to make it easier to code with const-correctness. \n +\c erase(iterator pos); and \c erase(iterator first, iterator last); methods now return iterators.</td> +<tr><td valign="top" align="right">version 0.675:</td> +<td>All Structured Container classes implemented. Structured search interface TBD. +</td--> +</table> + + +\anchor subkey_search_overview <hr> +<h2>Sub-key, or Structure Searches</h2> +<span style="color:#905050;">(a new interface for these searches will be specified in the future)</span> + +Ternary trees allow searches that match parts of keys and ignores mismatches in other parts.\n +In the current interface we specify a small number of searches facilitated by the tree structure; +the Partial Match and Hamming searches are defined in several other implementations +(showcased in \ref note_1 "Bentley and Sedgewick" code). +The Levenshtein and combinatorial searches are not found in other ternary trees (that I know of). + +<table border="1" cellspacing="0"> + <tr><th bgcolor="#f0f0ff">Name (function name)</th><th bgcolor="#f0f0ff">Description</th></tr> + <tr><th> + Prefix match (\ref containers::ternary_tree::prefix_range "prefix_range")</th><td> + Finds keys sharing a common prefix, returns a pair of iterators.</td></tr> + <tr><th> + Longest match (\ref containers::ternary_tree::longest_match "longest_match")</th><td> + Finds the longest key that matches beginning of search string. + A typical application is to tokenize a string using the ternary tree as dictionary.</td></tr> + <tr><th> + Partial match, or wildcard search (\ref containers::ternary_tree::partial_match_search "partial_match_search")</th><td> + Accepts a search string with wildcard characters that will match any letter, + eg "b?nd" would match "band", "bend", "bind", "bond" in an English dictionary.</td></tr> + <tr><th> + Search allowing \c N mismatches, + (\ref containers::ternary_tree::hamming_search "hamming_search"<span style="font-weight:normal;"></span>)</th><td> + Accepts a search string and an integer \c dist indicating how many non-matching letters are allowed, + then finds keys matching search string that have at most \a dist mismatches. + This works like a partial match search with all combinations of \a dist + wildcards in the search string.\n + \c hamming_search("band", 1) matches the wildcard search plus "bald", "bane" and "wand", etc. \n + The version here, following DDJ code, extends the strict Hamming search by also allowing shorter and longer + strings; a search for "band", \a dist = 1, also finds "ban" and "bandy" etc.\n + See also http://wikipedia.org/wiki/Hamming_distance</td></tr> + <tr><th> + Levenshtein distance search</b> (\ref containers::ternary_tree::levenshtein_search "levenshtein_search" + <span style="font-weight:normal;">- consider descriptive name</span>)</th><td> + + Hamming search matches characters in fixed position, allowing substitution of \a dist chars. + Levenshtein search also allows shifting parts of the search string by insertion or skipping chars (in \a dist places). + So <code>levenshtein_search("band", 1) </code> extends the hamming_search set with "and" and "bland", etc. + A typical application is to match mispelt words.\n + See also http://wikipedia.org/wiki/Levenshtein_distance</td></tr> + <tr><th> + Combinatorial or "scrabble" search (\ref containers::ternary_tree::combinatorial_search "combinatorial_search")</th><td> + Finds all keys using the characters in search string. \c combinatorial_search("band") finds + "ad", "and", "bad", "dab", "nab", etc. A count of wildcards can be added, also allowing + nonmatching characters (use with care, values over 10% of average key length + may cause the algorithm to traverse a large part of the tree).</td></tr> +</table> + +See \ref usage_imprecise_searches "advanced search overview" in the tutorial. + +These searches are defined for all containers in this library. +But they are also marked as deprecated (to be replaced by generic algorithms with same interface). +For a relative performance comparison of imprecise searches, see the second table in \ref perf_notes. + +<h3>Future directions</h3> +The searches currently defined are clearly special cases in a sea of search possibilities. +We have only defined searches that are relatively efficient, compared to other combinations of containers and algorithms. +But there can be many variations on the available searches: increasing Hamming/Levenshtein distance +at the end of words, or matching limited ranges of characters (eg allowing mismatches only in vowels), etc. + +The next step for this project is to support a more flexible low-level interface for +traversing and filtering tree nodes. +The interface for these "structured searches" is open for consideration, but it +will basically define sub-key iterators, conversion of full-key from sub-key iterators, +and a small collection of algorithms operating on these sub-key iterators. + +At least the following operations are needed: + + - sub-key match: matching a part of a key (prefix, or starting from current char position) + - key element range increment: from a sub-key position, match a range of characters + in next position (returns a list of sub-key iterators? - or iterator-like operation?) + - conversion from sub-key iterator to full-key iterator range (nearest and post-furthest + keys in the subtree) + - \c is_key(subkey_iterator pos): true if end-of-key exists at iterator position. + - \c count_elements(subkey_iterator pos): returns number of available key elements at position. + - In all predefined algorithms above, either a specific, or any char is matched, + we would also support arbitrary char sets (possibly with special case for char ranges). + + */ + +/** \page tst_reference Reference +<center><table bgcolor="#fbf9e5" style="border: thin dotted #808000;" width="95%" border=0> +<tr> +<td> +<dl> + <dt>\ref structured_concept "Structured Container concept"</dt> + <dt>\ref ref_sethpp "Header < structured_set.hpp >"</dt> + <dt>\ref ref_maphpp "Header < structured_map.hpp >"</dt> + <dt>\ref ref_tsthpp "Header < ternary_tree.hpp >"</dt> + <dt>\ref ref_iterhpp "Header < iterator_wrapper.hpp >"</dt> +</dl> +</td> +</tr></table></center> + +<hr> + +\anchor ref_sethpp +<h2>Header < <a href="../structured_set.hpp">%structured_set.hpp</a> > synopsis</h2> +<pre> +\b namespace containers { + \b template <\b class Key, + \b class Comp = std::less<\b typename Key::value_type>, + \b class Alloc = std::allocator<Key> > + \b class \ref containers::structured_set "structured_set"; + + \b template <\b class Key, + \b class Comp = std::less<\b typename Key::value_type>, + \b class Alloc = std::allocator<Key> > + \b class \ref containers::structured_multiset "structured_multiset"; +} +</pre> + +\anchor ref_maphpp +<h2>Header < <a href="../structured_map.hpp">%structured_map.hpp</a> > synopsis</h2> +<pre> +\b namespace containers { + \b template <\b class Key, + \b class T, + \b class Comp = std::less<\b typename Key::value_type>, + \b class Alloc = std::allocator<std::pair<\b const Key, T> > > + \b class \ref containers::structured_map "structured_map"; + + \b template <\b class Key, + \b class T, + \b class Comp = std::less<\b typename Key::value_type>, + \b class Alloc = std::allocator<std::pair<\b const Key, T> > > + \b class \ref containers::structured_multimap "structured_multimap"; +} +</pre> + +<hr> +Supplementary header files needed to support structured_set and -map classes. + + +\anchor ref_tsthpp +<h2>Header < <a href="../ternary_tree.hpp">%ternary_tree.hpp</a> > synopsis</h2> +<pre> +\b namespace containers { + + \b template <\b class Key, + \b class T, + \b class Comp = std::less<\b typename Key::value_type>, + \b class Alloc = std::allocator<std::pair<\b const Key, T> > > + \b class \ref containers::ternary_tree "ternary_tree"; + + \b template <\b class TreeT, \b class IteratorT> + \b class \ref containers::search_results_list "search_results_list"; + +} +</pre> + + +\anchor ref_iterhpp +<h2>Header < <a href="../iterator_wrapper.hpp">%iterator_wrapper.hpp</a> > synopsis</h2> +<pre> +\b namespace iterators { + + \b template <\b class T> \b struct const_traits; + \b template <\b class T> \b struct nonconst_traits; + + \b template <\b class BaseIterT, + \b class TraitsT, // either const_traits<T> or nonconst_traits<T> + \b class IterCatT = std::bidirectional_iterator_tag > + \b class \ref iterators::iterator_wrapper "iterator_wrapper"; +} +</pre> + +*/ + +/** +\page structured_concept Structured Associative Container Concept + +<span style="color:#905050;">(a preliminary sketch of the formal technical concept description)</span> + +A Structured Associative Container is a specialization of the C++ 98 standard concept +<a target="sgi" href="http://www.sgi.com/tech/stl/SortedAssociativeContainer.html">Sorted Associative Container</a>, +with extended interface. + +The template parameters are similar to that of the Associated Containers: + +<code> structured_set<Key, Comp, Alloc>; </code>\n +<code> structured_map<Key, Value, Comp, Alloc>; </code>\n + +where: + - \c <b>Key</b> type is itself a container (eg a \c std::string or \c std::wstring) + - \c <b>Comp</b> is a comparison operator that imposes a sort order on \c Key::value_type elements \n + (so if \c Key is string, \c Comp compares \c char, if \c Key is \c wstring, \c Comp applies to \c wchar_t). + - \c <b>Value</b> can be any Assignable type + - \c <b>Alloc</b> is an allocator that manages all memory allocation for the container. + +The \c Comp and the \c Alloc types have default template arguments. + +In other words Structured containers are like Sorted Associative Containers, BUT + - add the requirement on Key template type to be a + <a target="sgi" href="http://www.sgi.com/tech/stl/ForwardContainer.html">Forward Container</a>.\n + For example, \c std::basic_string<CharT> is compatible with this requirement. + - change the requirement on the \c Comp (comparator) template argument to operate on + \c key_type::value_type elements (rather than on \c key_type itself). + Like Sorted Associative comparator, the \c Comp type shall define a less-like comparison, a + <a target="sgi" href="http://www.sgi.com/tech/stl/StrictWeakOrdering.html">Strict Weak Ordering</a> + of key-elements. + +<b>Associated types</b> + - \b char_compare: less-like comparison of key elements (establishing a Strict Weak Ordering). + The <a target="sgi" href="http://www.sgi.com/tech/stl/AssociativeContainer.html">Associative Container</a> + \c key_compare type is also provided, but is defined in terms of \c char_compare. \n + - \b subkey_iterator: Used in structure searches. Convertible to iterator (TBD). + +In consequence it allows searches involving subparts of keys, ie with shared prefix and/or +with shared middle parts. + +<hr> +<h3>Deprecated search interface</h3> + +In the first iteration, additional searches are provided as methods on the containers. +This will be changed to use free functions operating on \c subkey_iterator. +The deprecated search methods will still be provided as convenience functions; +to migrate your code from present version to the new interface, will mean moving +the object name to the first argument, but also to respecify the search_results_list type. +(This sloppy-hackish type is by itself reason not to keep the method interface) + +See \ref subkey_search_overview "Structured search overview" +and \ref tst_structsearch "ternary_tree Structure search section". +*/ + +/* + +\b Notation \n +<table border=0> +<tr><td>\c X <td>A type that is a model of Associative Container </td></tr> +<tr><td>\c a <td>Object of type \c X </tr> +<tr><td>\c k <td>Object of type \c X::key_type </tr> +<tr><td>\c p, \c q <td>Object of type \c X::char_iterator </tr> +<tr><td>\c c <td>Object of type \c X::char_type </tr> +<tr><td>\c o <td>Object modelling output iterator </tr> +<tr><td>\c i <td>Object of type \c X::size_type </tr> +</dl> + +<table border=1> +<tr><th>Name</th><th>Expression</th><th>Return value</th> +<tr><td>Prefix match</td><td><code>a.prefix_range(k)</code></td><td> + \c std::pair<iterator, iterator> if \c a is mutable, otherwise <br>\c std::pair<const_iterator, const_iterator></td></tr> +<tr><td>Longest match</td><td><code>a.longest_match(p, q)</code></td><td> + \c iterator if \c a is mutable, otherwise \c const_iterator</td></tr> +<tr><td>Partial match, or <br>wildcard search</td><td><code>a.partial_match_search(k, o, c)</code></td><td> + The output iterator \c o</td></tr> +<tr><td>Hamming search</td><td><code>a.hamming_search(k, o, i)</code></td><td> + The output iterator \c o</td></tr> +<tr><td>Levenshtein search</td><td><code>a.levenshtein_search(k, o, i)</code></td><td> + The output iterator \c o</td></tr> +<tr><td>Combinatorial or <br>"scrabble" search</td><td><code>a.combinatorial_search(k, o, i)</code></td><td> + The output iterator \c o</td></tr> +</table> + +*/ + +/** \page tst_impl Implementation Details + * (In the following, "original" and "DDJ" code refers to the article by Bentley/Sedgewick + * published in Dr Dobb's Journal, and the accompanying C source code - see \ref tst_links) + * + * In most implementations, a ternary tree node has the following members: \code + * struct node { + * char splitchar; // key letter, or 0 at end of key + * node *hikid; // subtree of keys with higher value than splitchar + * node *eqkid; // subtree matching splitchar (pointer to mapped value at end-of-key node) + * node *lokid; // subtree less than splitchar + * node *parent; // necessary for iteration (not needed for insert/find) + * }; \endcode + * + * This means that each node is 1 char plus three or four pointers size. + * On many systems, struct member alignment makes the char member consume size of one pointer + * as well, so we have 4 (or 5) x sizeof(pointer) per node in the tree. + * With several kinds of dictionaries, the node count ends up at around 0.3-0.5 times + * total key length (since keys share nodes). + * This is even more expensive on 64-bit machines. + * + * There are several variation points in the node class: + * -# the DDJ C code designates an invalid value of zero to indicate end-of-string. We want to + * allow any string as key, so the end-of-string representation should change. + * We note that on many platforms, C/C++ struct member alignment leaves a "hole" + * in the binary representation of the node, between the char and the first pointer ("hikid"). + * On such systems there is no space cost to use another char-sized value to indicate end node. + * This also works for \c wchar_t strings on 32- or 64-bit systems. + * -# The original code stores a value for each string in the terminal node's "equal" pointer. + * The value in DDJ code is always a pointer to the terminated string. This is used to make + * advanced searches work (they return an array of pointers to strings stored in end-nodes). + * In reality this means that strings may need to be copied on insertion (not reflected in DDJ timings). + * -# Original DDJ code does not support iterating over strings in the tree. + * Idiomatic STL-like container style strongly suggests that iteration should be supported. + * This is fairly simple to implement if a parent pointer is added to the node struct: + * Because when an end-node is reached, the iterator must backtrack to find the previous + * branch point. + * + * The parent pointer also makes it possible to recover the inserted string by walking nodes + * backward from a terminal node to the root. Complexity is key length, plus log(tree.size), + * but it means inserted keys do not \b have to be copied to the end node. + * We opt to cache keys in iterators, at no measurable extra cost in iteration. + + * Instead of the key, an arbitrary value can be associated with endnodes. + * However, it should not be allowed to increase node size, since most nodes in the tree are not endnodes. + * In this library we store the mapped value directly in end-node if it is <tt> <= \c sizeof(void*). </tt> + * Larger objects are allocated on the heap, and a pointer to the copy is stored in end-node + * (the copy is managed by the tree). + * + * <h4>Now for some optimization</h4> + * We use a \c vector<node> as pool allocator, and record eq-hi-lo links as vector index instead of pointers. + * The pool allocation essentially follows original C code insert2() principle. + * For us, it also simplifies reallocation, since pointers do not have to be rebound; + * the indices are always valid. + * This has the following consequences: + * - allow the option of 4-byte indices also on 64-bit systems (with obvious resulting tree size limit) + * - When a new key is inserted, the last part (unique to the key) is always allocated in a batch. + * This means that one node member, \c "eqkid", becomes redundant, as it is always the next index + * (except after terminal nodes of course). + * - in DDJ code the end-node value is stored in union with the eqkid. We note that the \c lokid node index + * is also unused by end-nodes (as no char should be lower than zero), so all endnode children + * are linked to the hi node. + * + * (In our binary-cognizant version where zero is a regular char value, this still holds, + * we just change the end-node test accordingly) + * + * In the final cut, our node struct data members appear roughly like this: \code + * struct node { + * CharType splitchar; // key letter, or 0 at end of key (to make sure lokid is never allowed) + * CharType endflag; // zero on normal nodes, 1 at end nodes, 2 at erased nodes. + * node_index hikid; // subtree of keys with higher value than splitchar + * node_index lokid; // subtree less than splitchar + * node_index parent; // necessary for iteration (not needed for insert/find) + * }; \endcode + * + * where \c CharType is defined by template \c Key::value_type, and treated as an unsigned type + * (so 0 is the lowest value); and \c node_index is a \c size_t -like type used by the node + * storage backend (currently \c std::vector). + * + * This optimization could also be applied to C version, trimming space requirement in DDJ code + * to 3-word nodes. + */ + +/** \page perf_notes Performance Notes + * + * <h3>Space considerations</h3> + * + * Ternary trees are notably larger than hash maps or most binary tree designs. + * Each node holds only one character (instead of a whole key), and use 3-5 pointers. + * Our nodes consist of 4 \c size_t values (16 bytes) regardless of platform pointer size, + * or char type (if at most 2-byte like \c wchar_t). + * + * The shared parts of strings save space: In a typical English dictionary, + * each key shares over half its nodes with other keys, so the allocated space is about half + * of total key-length times 16. In a scrabble dictionary like the one reported below, + * which contains all valid word endings, most nodes are shared, so its storage cost is "only" + * total key length times 0.35 times 16, or less than 6 bytes per char. + * With \c wchar_t type, the storage cost cannot be considered overly large. + * + * See also \ref tst_impl + * + * <h3>Lookup speed</h3> + * + * The complexity of ternary tree operations is basically the same as for binary trees, + * (logarithmic in tree size) but with quite different constant factors. See \ref note_1 "[1]". + * + * Overall lookup and iteration speed depends on application factors - ie + * whether strings are inserted in random order or not, etc. + * + * Rough speed estimates (compared to Stlport hash_map and map). + * - insertion is a bit slower (>30% to 0%) than hash_map, ~30% faster than map. + * - finding a key is ~0-50% slower than hash_map (equal on failure, with short keys). + * - finding a key is 1.5-3 times faster than map (again with short keys). + * + * Compared to C versions (DDJ and libtst), + * - find and insert are slower, by factors ranging from 1.5-4. + * - partial_match and neighbour searches are 5-20% faster than published DDJ code - + * the code is essentially the same, but our implementation rolls out some recursion. + * This is easily back-ported, so in effect they should be considered to run at same speed. + * This by itself is good news though, since eg single-key lookup is always slower. + * + * Since each character in a key is at a separate node in the internal tree, + * iterating over values is a little slower than for other tree-based containers. + * + * For detailed test, see performance table below. + * + * <hr style="height: 3px; border-top: 0px; background-color: #e09060;"> + * \htmlinclude performancetable.html + */ + + +/** \page tst_links Links + * ternary_tree by rasmus ekman, see http://abc.se/~re/code/tst <br> + * Download: http://abc.se/~re/code/tst/ternary_tree.zip + * + * Some other TST implementations. + * - <b>DDJ code:</b> Original C implementation by Jon Bentley and Robert Sedgewick. + * Article in Dr Dobb's Journal, 1998 #4: http://www.ddj.com/documents/s=921/ddj9804a/9804a.htm \anchor note_1 \n + * See http://www.cs.princeton.edu/~rs/strings/ for C code and article on TST complexity. + * - \b libtst: Worked-out version of DDJ code by Peter A. Friend 2002. Version 1.3. \n + * See http://www.octavian.org/cs/software.html \anchor note_2 \n + * - \b Boost.Spirit version: C++ reimplementation by Joel de Guzman. \anchor note_3 \n + * See http://spirit.sourceforge.net/ internal file ./boost/spirit/symbols/impl/tst.h + * - <b>Hartmut Kaiser version:</b> C++ reimplementation intended for generalization of tst. + * Currently abandoned, available in Spirit CVS. (interesting for interface design) \n + * See http://lists.boost.org/Archives/boost/2005/09/93316.php \n + * and http://article.gmane.org/gmane.comp.parsers.spirit.general/6959 + * - \b pytst: C++ version by Nicolas Lehuen, with SWIG wrappers for use from other languages. Version 0.97. \n + * See http://nicolas.lehuen.com/download/pytst/ + * (not yet tested) \anchor note_4 + * + * <h2>Feature chart</h2> + * All versions have insert and plain search, other features available as tabled below: + * \htmlinclude featuretable.html + */ + +/** \page tst_tests Test Suite + +All tests require the <a href="http://boost.org">Boost library</a> to compile. + +<h3>Concept checks</h3> + +The file <a href="../tst_concept_checks.cpp">tst_concept_checks.cpp</a> +performs a compile-time test of structured containers. \n +A class \c StructuredAssociativeContainer is defined, which contains +prototypes of all required methods for structured containers (also class ternary_tree). +Relevant concepts from \c boost/concept_check.hpp are used to check the structured set/map +containers. + +<h3>Correctness tests</h3> + +The subdirectory \b test in the distribution contains a bunch of files hacked up during development. +All these tests are performed by a single main test file <a href="../test/test_tst.cpp">test_tst.cpp</a>. +This file includes individual .cpp files, since we use a simplified (hacked) version +of the Boost.Test harness. + +Each test prints a single line to \c std::cerr saying whether the test was "OK" or "FAIL". +A line is added if an exception was thrown. + +These are runtime tests, several which require a file name to a dictionary-type file, +a plain-text file with one word per line. +The file \c fill_dictionary.cpp must be compiled with test projects, +it reads a dictionary file and fills a std::vector with strings. + +Dictionary files can be found by an internet search (try eg "dictionary file"). + +<h3>To do</h3> + +Proper organization and cleanup of this part of our library will be required before 1.0 release. + +*/ + + Added: trunk/opentrep/ternary_tree/doxygen_input/concepts.txt =================================================================== --- trunk/opentrep/ternary_tree/doxygen_input/concepts.txt (rev 0) +++ trunk/opentrep/ternary_tree/doxygen_input/concepts.txt 2009-07-14 10:45:19 UTC (rev 126) @@ -0,0 +1,122 @@ +Tutorial + +In programming, a Concept is a set of formal requirements on input/output of a subsystem, or pre/post-conditions of +an operation, of complexity constraints and exceptional behaviour. +Note ye well the "complexity" bit. Since the specification of C++ STL, the complexity of operations on +a type have been introduced as a proper feature of its concept, a full-citizenship part of type specification. +(This diverges from the mathematical roots of programming, which defines types/concepts in purely structural terms +-- ie as long as an operation does not transgress countability or infinity boundaries, it doesn't matter a damn bit whether +it requires zero overhead or would enrol half the atoms in the universe to encode intermediate information. +Maths is not about bean counting.). + +Here we will discuss tree concepts in the common sense. +The following are some definitions of terms as used in documentation of the Structured Containers library. +The definitions given are stipulative, in that they do not purely document an existing usage, but unless an expert +tells me otherwise, I believe they should be made into when discussing trees and tries. + + +Tree =df a directed acyclic graph of single-parented, multi-childed Nodes. Usually single-rooted, but this is not essential. + Stipulative: tree nodes have a fixed maximum number of children. + An implementation constraint that has become ingrained in most programmers' understanding of the concept. + All trees can be reduced to (easily and naturally implemented by) a binary tree. + +We assume common terminology for the parts of Trees: + Root - a node designated as start point, from which other nodes are reachable as children, or children of children etc. + Single-rooted Tree - a Tree where all nodes are reachable from a single Root node. + Multi-rooted tree, or Forest - a Tree where several start nodes are designated. + Level N - the set of nodes that are at the same distance from Root. Every node at level 3 is reachable + from the/a Root node by following exactly 3 child-node links. + Sibling - relation between any two nodes on the same Level. + Leaf - a node without children. + Fanout - the number of children that a node can have. This defines the maximum number of nodes at each level. + +Trie =df A Tree where the nodes have a "alphabet"-sized (max) number of children, for some alphabet. + Typical alphabets are the English letters, Unicode, or the ACGT genetic bases. + +Tree nodes represent a full "key" of any [less-comparable] type. +Tries store string-like keys; a Trie node does not store a full key, only a part of it. +A full key is represented by a leaf node and its path back to the/a root node. +The reason for using Tries is that access to string-like keys is very fast - in principle linear in key length. +Binary Trees over the same key is O(log n) where n is count of keys in the tree, with average key length +as a constant factor. + + +Ternary Search Tree (TST) is a space optimization for Tries. + +Because each path to a child of a Tree node takes up memory space, Trie nodes are very expensive +if the alphabet is large. From the 3rd or 4th level on, most child-node links are empty. +A TST constructs exactly the number of links existing at each level of a Trie. + +Graphics: +Tree + root==node==node==node + \\node==node \\node==node + +Trie (6-letter alphabet: 123456) + root + ______________||______________ + || || || || || || + node1 (empty) node3 node4 node5 (empty) + ______________||_____________ + || || || || || || + node1 node2 node3 node4 node5 node6 + +Here we see the root node with 4/6 child links populated. Each child has 6 empty links, except the 5th child. +The 5th child has 6/6 links filled, and each of its children has 0/6 children. +In all there are 4+6 = 10 nodes, and 10*6 = 60 links. +Since a Trie only stores part of a key, the substantial information in each trie node is small, and the +structure overhead - the links - is very large. + +This cries out for optimization. Several kinds of variable-sized nodes have been tried, but they usually +end up with complex code to use and maintain, and thus squander the search speed which was the rationale +for constructing Tries in the first place. + + +TST is one such optimization. Here each trie node is constructed from much smaller nodes, but the +code to use and maintain nodes is still fairly simple, so search speed is not badly compromised. +Let's see the structure of the above tree: +The exact runtime layout depends on insert order. If child 3 is inserted before child 1, child1 may become a +"lower-child" child of child 2. +Here we assume insertion order 4, 3, 5, 1 for the root + + //node1 + //node3 +root=node4 + \\node5==(level2) + +level 2: assume insertion order 3,4,5,1,2,6 + + //node1 + // \\node2 +level2.root=node3 + \\node4 + \\node5 + \\node6 + + +Here we see 10 nodes, each with 3 child links. This means 30 links, ie half the link count of Trie. +The space savings are of course even better for larger alphabets. +(And in the Structured Containers implementation, the middle child link is omitted, so only 2*10=20 links are needed.) +Given English alphabet Trie with the above sparse population, there would be 26*10 = 260 links in +the Trie (each Trie node has 26 child links), and still the same count of TST node-links (30, or 20). +A Unicode Trie would have 10*2^16 links= 10*65536= 655 thousand links. A TST for Unicode again uses 30 (20) links only. + +Important points wrt TSTs and TST nodes. +A. TST nodes have two different kinds of child links: + (1) Two same-level sibling links + (2) One next-level "proper" child link. + +B. TSTs generalizes Trie implementations. + Tries with any alphabet can be implemented by the same TST node type - no new type needs to be defined for new alphabets. + (However a specialization is still often needed, since there must be a comparison function for the alphabet letters) + +C. TSTs are a hybrid of (binary) Tree and Trie. + In consequence of (A), TSTs combine the features of binary trees with Tries. + TST nodes can be viewed as binary nodes with associated data, where the data is a link to a next-level binary tree + (that implements a trie node). + - Against this view one may note that the binary treelets have absolute size constraints (defined by + count of letters in the implemented alphabet) - this is an "unnatural" constraint on a tree type. + - In support of the view one may note that search complexity is more like binary trees than pure Trie implementation. + + + Added: trunk/opentrep/ternary_tree/doxygen_input/doxygen-old.css =================================================================== --- trunk/opentrep/ternary_tree/doxygen_input/doxygen-old.css (rev 0) +++ trunk/opentrep/ternary_tree/doxygen_input/doxygen-old.css 2009-07-14 10:45:19 UTC (rev 126) @@ -0,0 +1,311 @@ +BODY,H1,H2,H3,H4,H5,H6,P,CENTER,TD,TH,UL,DL,DIV { + font-family: Geneva, Arial, Helvetica, sans-serif; +} +/*BODY,TD { font-size: 90%; } +H1 { + font-size: 150%; + background-color: #eeeeff; + width: 100%; + border: 1px solid #b00000; + margin: 2px; + padding: 2px; +} +H2 { font-size: 140%; } +H3 { font-size: 100%; } */ + +CAPTION { font-weight: bold } +DIV.qindex { + width: 100%; + background-color: #eeeeff; + border: 1px solid #b0b0b0; + text-align: center; + margin: 2px; + padding: 2px; + line-height: 140%; +} +DIV.nav { + width: 100%; + background-color: #eeeeff; + border: 1px solid #b0b0b0; + text-align: center; + margin: 2px; + padding: 2px; + line-height: 140%; +} +DIV.navtab { + background-color: #eeeeff; + border: 1px solid #b0b0b0; + text-align: center; + margin: 2px; + margin-right: 15px; + padding: 2px; +} +TD.navtab { + font-size: 90%; +} +A.qindex { + text-decoration: none; + font-weight: bold; + color: #1A419D; +} +A.qindex:visited { + text-decoration: none; + font-weight: bold; + color: #1A419D +} +A.qindex:hover { + text-decoration: none; + background-color: #ddddff; +} +A.qindexHL { + text-decoration: none; + font-weight: bold; + background-color: #6666cc; + color: #ffffff; + border: 1px double #9295C2; +} +A.qindexHL:hover { + text-decoration: none; + background-color: #6666cc; + color: #ffffff; +} +A.qindexHL:visited { text-decoration: none; background-color: #6666cc; color: #ffffff } +A.el { text-decoration: none; font-weight: bold } +A.elRef { font-weight: bold } +A.code:link { text-decoration: none; font-weight: normal; color: #0000a0 } +A.code:visited { text-decoration: none; font-weight: normal; color: #0000a0 } +A.codeRef:link { font-weight: normal; color: #0000FF} +A.codeRef:visited { font-weight: normal; color: #0000FF} +A:hover { text-decoration: none; background-color: #f2f2ff } +DL.el { margin-left: -1cm } +.fragment { + font-family: Fixed, monospace + font-size: 100%; +} +PRE.fragment { + font-size: normal; + border: 1px solid #CCCCCC; + background-color: #f5f5f5; + margin-top: 4px; + margin-bottom: 4px; + margin-left: 2px; + margin-right: 8px; + padding-left: 6px; + padding-right: 6px; + padding-top: 4px; + padding-bottom: 4px; +} +DIV.ah { background-color: black; font-weight: bold; color: #ffffff; margin-bottom: 3px; margin-top: 3px } +TD.md { background-color: #F4F4FB; font-weight: bold; } +TD.mdPrefix { + background-color: #F4F4FB; + color: #606060; + font-size: 90%; +} +TD.mdname1 { background-color: #F4F4FB; font-weight: bold; color: #602020; } +TD.mdname { background-color: #F4F4FB; font-weight: bold; color: #602020; width: 600px; } +DIV.groupHeader { + margin-left: 16px; + margin-top: 12px; + margin-bottom: 6px; + padding: 3px; + font-weight: bold; + font-size: 110%; + background-color: #d0d0ff; +} +DIV.groupText { margin-left: 16px; font-style: italic; font-size: 90%; } +BODY { + background: white; + color: black; + margin-right: 20px; + margin-left: 20px; +} +TD.indexkey { + background-color: #eeeeff; + font-weight: bold; + padding-right : 10px; + padding-top : 2px; + padding-left : 10px; + padding-bottom : 2px; + margin-left : 0px; + margin-right : 0px; + margin-top : 2px; + margin-bottom : 2px; + border: 1px solid #CCCCCC; +} +TD.indexvalue { + background-color: #eeeeff; + font-style: italic; + padding-right : 10px; + padding-top : 2px; + padding-left : 10px; + padding-bottom : 2px; + margin-left : 0px; + margin-right : 0px; + margin-top : 2px; + margin-bottom : 2px; + border: 1px solid #CCCCCC; +} +TR.memlist { + background-color: #f0f0f0; +} +P.formulaDsp { text-align: center; } +IMG.formulaDsp { } +IMG.formulaInl { vertical-align: middle; } +SPAN.keyword { color: #0000ff } +SPAN.keywordtype { color: #0000ff } +SPAN.keywordflow { color: #0000ff } +SPAN.comment { color: #008000 } +SPAN.preprocessor { color: #806020 } +SPAN.stringliteral { color: #800000 } +SPAN.charliteral { color: #800000 } +.mdTable { + border: 1px solid #868686; + background-color: #F4F4FB; +} +.mdRow { + padding: 8px 10px; +} +.mdescLeft { + padding: 0px 8px 4px 8px; + font-size: 12px; + font-style: italic; + background-color: #FAFAFA; + border-top: 1px none #E0E0E0; + border-right: 1px none #E0E0E0; + border-bottom: 1px none #E0E0E0; + border-left: 1px none #E0E0E0; + margin: 0px; +} +.mdescRight { + padding: 0px 8px 4px 8px; + font-size: 12px; + font-style: italic; + background-color: #FAFAFA; + border-top: 1px none #E0E0E0; + border-right: 1px none #E0E0E0; + border-bottom: 1px none #E0E0E0; + border-left: 1px none #E0E0E0; + margin: 0px; +} +.memItemLeft { + padding: 1px 0px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + border-left-width: 1px; + border-top-color: #E0E0E0; + border-right-color: #E0E0E0; + border-bottom-color: #E0E0E0; + border-left-color: #E0E0E0; + border-top-style: solid; + border-right-style: none; + border-bottom-style: none; + border-left-style: none; + background-color: #FAFAFA; + font-size: 90%; +} +.memItemRight { + padding: 1px 8px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + border-left-width: 1px; + border-top-color: #E0E0E0; + border-right-color: #E0E0E0; + border-bottom-color: #E0E0E0; + border-left-color: #E0E0E0; + border-top-style: solid; + border-right-style: none; + border-bottom-style: none; + border-left-style: none; + background-color: #FAFAFA; + font-size: 100%; +} +.memTemplItemLeft { + padding: 1px 0px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + border-left-width: 1px; + border-top-color: #E0E0E0; + border-right-color: #E0E0E0; + border-bottom-color: #E0E0E0; + border-left-color: #E0E0E0; + border-top-style: none; + border-right-style: none; + border-bottom-style: none; + border-left-style: none; + background-color: #FAFAFA; + font-size: 90%; +} +.memTemplItemRight { + padding: 1px 8px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + borde... [truncated message content] |