From: Stephen <yo...@gr...> - 2014-02-09 19:40:35
The original test showed a difference of 250% between best and worst. The new test shows 9%. The difference is:

- the overhead of bulk conversion of kv lists in Tcl removed
- the de-duplication of columns in list result types

In this rather large test, de-duplicating the column keys saved 140,000 Tcl object allocations (and another 140,000 string allocations).

In this real-world code:

http://openacs.org/api-doc/proc-view?proc=db_multirow

...db_multirow converts each set-row to an array (like the rest of the dbi_* procs, it doesn't want sets), and then converts those arrays into kv-lists for caching. In the common case of a cache hit: the string from the cache is parsed as nested lists then fed to 'array set'.

Arrays (and dicts, and sets) don't perform de-duplication because in the general case the keys aren't guaranteed to be common.

dbi_rows' original output was the straight-forward de-duplicated flat rows and columns lists. It is easy to reason about.

Missing from 'dbi_convert' (or dbi_loop, whatever) is an optional code block. An optional code block would be:

- convenient
- the obvious place to put code which adds computed columns (see other thread)
- an opportunity for further optimisation

Apart from not having to traverse the result twice, an optional code block could create dicts (say) on demand before running the code. Having run the code, check the ref count: if it's 1 then the code block didn't append the dict to a list or something and it can be reused, including keys, for the next block. If the values are also unshared then just set the value directly. Keep the values in an array for fast lookup without having to hash the keys each time.
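
To make the two result shapes concrete, here is a rough sketch in plain Tcl. It does not call any dbi_* command. The sample data and the variable names (columns, rows, kvRows, cached) are made up for illustration, and 'format %s' only stands in for a string coming back from a cache.

```tcl
# Hypothetical sample data, not taken from the test discussed above.
set columns {id name}
set rows    {1 alice 2 bob 3 carol}   ;# flat list: values only, row after row

# Shape 1: one kv-list per row. The column names are repeated in every row.
set kvRows {}
foreach $columns $rows {
    set kv {}
    foreach col $columns {
        lappend kv $col [set $col]
    }
    lappend kvRows $kv
}

# Stand-in for a cache round trip: all that survives is a plain string.
set cached [format %s $kvRows]

# Cache-hit path for shape 1: parse the nested lists, then 'array set' per row.
set namesFromKv {}
foreach kv $cached {
    array unset row
    array set row $kv
    lappend namesFromKv $row(name)
}

# Shape 2: flat columns and rows lists. The column names appear once and
# foreach walks the values in strides of [llength $columns].
set namesFromFlat {}
foreach $columns $rows {
    lappend namesFromFlat $name
}

# Key words stored in each shape.
set kvKeyCount   [expr {[llength $kvRows] * [llength $columns]}]
set flatKeyCount [llength $columns]
```

In the kv-list shape the number of key words grows with the number of rows, while in the flat shape it stays at the number of columns. How many Tcl objects that actually costs depends on how the lists were built, which this script does not measure.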