From: Lars H. <Lar...@re...> - 2008-02-16 12:34:52
|
> TIP #313: INEXACT SEARCHING IN SORTED LIST > > This TIP adds a new switch to *lsearch* to do a binary search to find > the insertion point in a sorted list. [snip] > Sometimes, it is necessary to find the location in a sorted list > where > a particular new element should be inserted. In view of this, the switch name could alternatively be formed from "insert", e.g. -insertpos or just -insert. > One question for the specification is exactly what index should be > returned. Below an increasing list is assumed, things are equivalent > for a decreasing. > > 1. First where element is greater than key. > > 2. Last where element is less than or equal to key. > > 3. First where element is greater than or equal to key. > > 4. Last where element is less than key. > > Here, 1 is the use case for a stable insertion sort. I'd like to offer an alternative formalisation which I think is easier: In a sorted list, there is a range of positions into which the key can be inserted that the list remains in order. If nothing equal to the key is in the list, then this range contains exactly one point, otherwise it contains two or more. Denoting this range by [a,b] (endpoints included, as for [regexp], [lrange], etc.), then choices 1-4 above are respectively: 1. Return b. 2. Return b-1. 3. Return a. 4. Return a-1. If we don't mind adding a few extra utensils to the kitchen sink :-), one could have separate options for returning lower endpoint a and higher endpoint b, e.g. -insertlo (3) and -inserthi (1). Phrasing the spec in terms of the range of allowed insert positions should also make it clearer what one wants in the decreasing list case. > In the core we can find *::tcl::clock::BSearch* which returns the > index > of the greatest element in $list that is less than or equal to $key, > i.e. type 2. The same can be found in tklib's ::khim::BSearch. A guess: These are about searching for the interval to which something belongs, when intervals are listed with their starting points and continue until the next interval begins. This is a slightly different application of bisecting search, but it is certainly not unimportant. Given that the current proposal returns b-1, one could even say it is geared more towards interval searching than towards the insertion point searching mentioned in the abstract. Lars Hellström |