#2381 lsearch -dictionary documentation misleading

obsolete: 8.4.2
Ken Jones

Either the documentation for the lsearch -dictionary
option is misleading, or the option isn't implemented.

The documentation states:

"-dictionary: The list elements are to be compared using
dictionary-style comparisons. This option is only
meaningful when used with -exact or -sorted."

Drawing from the behavior of lsort -dictionary, one would
assume that lsearch -dictionary would be case-insensitive
and handle embedded numbers appropriately. In other words,
one might reasonably assume that both of the following
lsearch commands would return 1:

% set vals [lsort -dictionary {B4.GIF b13.gif b2.Gif B7.gif b8.gif}]
b2.Gif B4.GIF B7.gif b8.gif b13.gif
% lsearch -sorted -dictionary $vals b4.gif
% lsearch -exact -dictionary $vals b4.gif

In fact, they return -1. Thus, in reality, the only current
use for lsearch -dictionary is to signal that the list is
sorted in the same order that lsort -dictionary would
produce. Indeed, given such a list, a programmer *must* use
lsearch -dictionary; omitting it can cause the search to
miss an element that is present. For example:

% lsearch -sorted -dictionary $vals B4.GIF
% lsearch -sorted $vals B4.GIF

Reading the code for lsearch in tclCmdIL.c, I'm not sure
what the intended operation of lsearch -dictionary is in
cases like this. Even in the case of a linear search (for
example, if the user executes lsearch -exact -dictionary),
the DictionaryCompare() function is used. But this is
pointless, as the "secondary comparison" feature of
DictionaryCompare() still reports "B4.GIF" as less than
"b4.gif" (not equal), the same as a simple string
comparison.

So, I don't know what the proper resolution is. I see 2
options:

1) Change the documentation to indicate that the only
purpose of lsearch -dictionary is to signal a "dictionary-
sorted" list in conjunction with -sorted; or

2) Change the implementation of lsearch -dictionary to
produce the following results from the above examples:

% lsearch -sorted -dictionary $vals b4.gif
1
% lsearch -exact -dictionary $vals b4.gif
1
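
For illustration only, the matching described in option 2 can be
approximated at the script level with a linear search over
normalized keys. This is a minimal sketch, not how lsearch is
implemented: the procs dictKey and dictSearch are hypothetical
names, and the normalization (fold case, drop leading zeros from
digit runs) is an assumption about what "dictionary-style"
equality would mean.

```tcl
# Hypothetical helper: reduce a string to a key that ignores the
# differences lsort -dictionary uses only as tie-breakers
# (letter case and leading zeros in embedded numbers).
proc dictKey {s} {
    set s [string tolower $s]
    # Strip leading zeros from each run of digits, keeping one digit.
    regsub -all {0*([0-9]+)} $s {\1} s
    return $s
}

# Hypothetical helper: index of the first element whose key equals
# the key of $pattern, or -1 if there is none (linear scan).
proc dictSearch {list pattern} {
    set want [dictKey $pattern]
    set i 0
    foreach elem $list {
        if {[string equal [dictKey $elem] $want]} {
            return $i
        }
        incr i
    }
    return -1
}

set vals [lsort -dictionary {B4.GIF b13.gif b2.Gif B7.gif b8.gif}]
puts [dictSearch $vals b4.gif]
puts [dictSearch $vals b04.gif]
puts [dictSearch $vals b5.gif]
```

With the list from this report, the first two calls should print 1
and the third -1. A linear scan is used because a binary search
would need a comparator that agrees with this looser notion of
equality.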


  • Logged In: YES

    Where there's no exact match, the meaning of [lsearch
    -sorted -dictionary] is fairly straightforward (get the
    index of the first/every dictionary-matched value in the
    list). But what if there is an *exact* match and some other
    dictionary-matched values too?

    Once I understand the right semantics, a fix is pretty
    trivial. (My current thought is that we can't share the
    dictionary comparator between lsort and lsearch. Drat.)

  • Logged In: YES

    A different way to look at it is that -dictionary just
    alters the priority of case-difference relative to
    character-ordering. It doesn't make 'aBc' equivalent to
    'AbC' under either [lsort] or [lsearch]. I think tclguy
    prefers this explanation; anything else will probably
    require a TIP to allow for the specification of how 'exact'
    we want the dictionary matching to be...

  • Ken Jones

    Logged In: YES

    I don't object to a strict interpretation as suggested by your
    second follow-up. But that would imply that we need to
    change the lsearch man page to reflect that interpretation. If
    that's the approach to take, I don't see that -dictionary does
    anything in conjunction with the -exact option, and so we
    should remove the reference to -exact in the -dictionary
    description. This implies that the only purpose for -dictionary
    is to signal that the list is in "dictionary-sorted" order. The
    revised man page should also explicitly point out that lsearch
    -dictionary doesn't imply case-insensitive or other
    "dictionary-style" matches (for example, "b4.gif" won't match
    "b04.gif").

  • Logged In: YES

    Perhaps another way to write this is:
    ASCII/UNICODE ordering = ABCabc
    Dictionary ordering = AaBbCc
    However, there's also the matter of numeric handling...
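
    The two orderings, including the numeric handling, can be seen
    with lsort. This is a small sketch; the sample lists are made up
    for illustration.

    ```tcl
    # Letters only: -ascii puts all uppercase letters first, while
    # -dictionary groups each letter and uses case only to break ties.
    set letters {c a B b C A}
    puts [lsort -ascii $letters]
    puts [lsort -dictionary $letters]

    # Embedded numbers: -ascii compares character by character, while
    # -dictionary compares each run of digits as an integer.
    set names {x10 x9 x1}
    puts [lsort -ascii $names]
    puts [lsort -dictionary $names]
    ```

    The letter lists should come out as "A B C a b c" and
    "A a B b C c"; the name lists as "x1 x10 x9" and "x1 x9 x10".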

  • Logged In: YES

    Clarified the documentation.

    For the record, the difference between -ascii and
    -dictionary is not in when two values are equal, but rather
    in how they differ from each other. Since -exact searching
    just requires a yes/no answer to equality, the differences
    between -ascii and -dictionary vanish.
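
    A short sketch of that point, using the list from the original
    report. It assumes the behavior described in this thread; the
    results are what this report implies, not a statement about
    every Tcl release.

    ```tcl
    set vals [lsort -dictionary {B4.GIF b13.gif b2.Gif B7.gif b8.gif}]

    # An identical element is found under either comparison style.
    puts [lsearch -exact -ascii $vals B4.GIF]
    puts [lsearch -exact -dictionary $vals B4.GIF]

    # An element that differs only in case is found under neither.
    puts [lsearch -exact -ascii $vals b4.gif]
    puts [lsearch -exact -dictionary $vals b4.gif]
    ```

    The first pair should both print 1 and the second pair both -1,
    so the choice between -ascii and -dictionary makes no difference
    to an -exact search.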

    • status: open --> closed-fixed