#981 AutoComp: allow non-alphabetical sort

Completed
closed
Neil Hodgson
scintilla (69)
4
2013-04-12
2013-03-02
Alpha
No

Allow the auto-complete listing to work when given a list that is not alphabetically sorted (ex. sorted by priority).

1 Attachments

Discussion

1 2 > >> (Page 1 of 2)
  • Neil Hodgson
    Neil Hodgson
    2013-03-02

    • labels: --> scintilla
    • assigned_to: Neil Hodgson
    • priority: 5 --> 4
     
  • Neil Hodgson
    Neil Hodgson
    2013-03-02

    I could understand a variant of autocompletion lists that allowed arbitrary orderings and worked by eliminating non-matching choices but this retains non-matching items so the context provided by the list is not really useful.

     
  • Alpha
    Alpha
    2013-03-02

    If there are a lot of correct matches (ex. this situation), certain ones should be placed first (because they are more likely to be used), however, the other items cannot necessarily always be eliminated.

    (For pruning the choices, Code::Blocks rebuilds the list every x number of characters typed.)

     
  • Neil Hodgson
    Neil Hodgson
    2013-03-05

    ..\src\AutoComplete.cxx(127) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
    cmp = lenA - lenB;

     
  • Neil Hodgson
    Neil Hodgson
    2013-03-05

    There are some performance concerns here: a potentially expensive sort is being performed where it wasn't before and clients have no way to turn this off. Unnecessary work is being performed: Sorter::operator() is stripping off image IDs when they are not returned from AutoComplete::GetValue.

    AutoComplete::GetValue is retrieving values from the GUI layer's list box implementation which may be expensive, depending on platform. It creates extra std::string objects which is fine for returning a single item when selected by the user but is not so great when in the middle of a comparison function.

    It would allow client choice to add another setting like SCI_AUTOCSETSORT with an enumerated value indicating if it is to be assumed sorted or not. Could be extended in the future to automatically sort the values. May also be faster and minimize platform-dependencies to perform the sort on the raw data instead of retrieving it from the platform layer.

     
  • Alpha
    Alpha
    2013-03-05

    Okay; I will rework the patch to attempt to address these points.

    (Earlier, I was have some issues when I did not strip the image IDs, but now that I tested removing them again, the problem does not seem to have returned... ?!)

     
  • Alpha
    Alpha
    2013-03-06

    This patch utilizes the raw data for sorting. It also adds a member variable autoSort to determine if it should assume presorted, sort internally (list remains in original order), or actually sort the list. (I was not able to make sense of how to hook this up to Scintilla's settings structure, so that still needs to be implemented.)

     
    Attachments
  • Alpha
    Alpha
    2013-03-06

    Minor optimizations of last patch.

     
    Attachments
  • Neil Hodgson
    Neil Hodgson
    2013-03-06

    This isn't portable:
    + int len = strlen(list);
    + char sortedList[len + 1];
    ../src/AutoComplete.cxx:164:25: warning: ISO C++ forbids variable length array 'sortedList' [-Wvla]
    You can use a vector<char> or similar.

    There needs to be an explanation of what this mode does that can go in the documentation. "sort internally (do not change the actual order)" will just cause support questions.

     
  • Alpha
    Alpha
    2013-03-06

    I am using -Wall and -Wextra; what other warning flags do you use?

     
    • Neil Hodgson
      Neil Hodgson
      2013-03-06

      g++ -DNDEBUG -Os -Wall -Wno-missing-braces -Wno-char-subscripts -pedantic -I ../include -I ../src -I../lexlib -fno-rtti -c ../src/AutoComplete.cxx
      ../src/AutoComplete.cxx: In member function 'void AutoComplete::SetList(const char*)':
      ../src/AutoComplete.cxx:164:25: warning: ISO C++ forbids variable length array 'sortedList' [-Wvla]

      g++ --version
      g++ (GCC) 4.7.2

       
  • Alpha
    Alpha
    2013-03-06

    Switched to char array to std::string.

     
    Attachments
  • Neil Hodgson
    Neil Hodgson
    2013-03-06

    An explanation should cover all the effects of the choice in full sentences, not as a short snippet. This is for people that who are coming to this new.

     
  • Alpha
    Alpha
    2013-03-07

    Are these comments descriptive enough?

    I also switched the constructor to autoSort(0) so the default behaviour will remain the same as previously.

     
    Attachments
    • Neil Hodgson
      Neil Hodgson
      2013-03-08

      OK. There's a large feature landing soon so this won't be committed until after that. May be a week or two.

       
  • Alpha
    Alpha
    2013-03-08

    Understood.

     
  • Neil Hodgson
    Neil Hodgson
    2013-03-16

    The code assumes that there is a terminating separator when there may not be. For example, the call may be SCI_AUTOCSHOW(1, "aardvark\nant"), not SCI_AUTOCSHOW(1, "aardvark\nant\n"). This often results in an extra garbage value in the list. It also means that the code that builds the sorted string may merge two values so "ant\naardvark" will produce a list containing an item "aardvarkant".

    If possible the "Check for a logically earlier match" should be moved out of the binary search. Yes, I know there is a similar case for SC_CASEINSENSITIVEBEHAVIOUR_RESPECTCASE. At some point I'd like to replace the binary search code with a call to std::binary_search.

    A patch is attached with the changes to other files required for this. Different orders have symbolic names that should be used instead of numeric literals. SC_ORDER_PERFORMSORT and SC_ORDER_UNSORTED have swapped values since it makes more sense as a narrative. SC_ORDER_UNSORTED is a little purposeless: something like SC_ORDER_PRIORITY may be better.

     
    Attachments
  • Alpha
    Alpha
    2013-03-17

    Fixed the terminating separator issue, incorporated the changes from your patch, and renamed SC_ORDER_UNSORTED to SC_ORDER_CUSTOM. I did not move the earlier match check because it would, in my opinion, be less clear.

     
    Attachments
  • Neil Hodgson
    Neil Hodgson
    2013-03-17

    The separator issue still causes problems: its not just that it still displays a garbage value but it also reads into any garbage after the argument. The \0 at the end of the argument is reached by the third ++i inside Sorter::Sorter (line 119) and then the for loop's ++i moves i into post \0 garbage.
    alternate text

    The problem with the extra match code is that it is inside the binary search loop. This raises questions like: Can it be executed more than once? What are the effects of executing it more than once? Can it prevent the binary search from terminating? Moving it outside the binary search makes it much easier to analyze its behaviour.

     
    Last edit: Neil Hodgson 2013-03-17
  • Alpha
    Alpha
    2013-03-19

    I see. Although I have been unable to reproduce this, I believe this patch should resolve it.

     
    Attachments
  • Neil Hodgson
    Neil Hodgson
    2013-03-20

    That is working better but there are still some problems. There are two parallel data structures here: the visual list and the sorted index and they need to be synchronized. One problem occurs if there is an empty entry in the list - the data structures aren't synchronized and it asserts out inside std::vector. Try this call in mode SC_ORDER_CUSTOM

    wEditor.CallString(SCI_AUTOCSHOW, 1, "aardvark\n\nant");
    

    and (with Visual Studio 2012) there'll be an assertion:

    ---------------------------
    Microsoft Visual C++ Runtime Library
    ---------------------------
    Debug Assertion Failed!
    
    Program: C:\Windows\system32\MSVCP110D.dll
    File: C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\include\vector
    Line: 1140
    
    Expression: vector subscript out of range
    

    Looking for problems with desynchronization can be made simplaer by adding another assertion to check that they are, at least, the same length. Just after the call to SetList for SC_ORDER_CUSTOM:

    PLATFORM_ASSERT(lb->Length() == sortMatrix.size());
    
     
  • Alpha
    Alpha
    2013-03-20

    That is a problem because the wxScintilla/wxSTC implementation discards empty entries:

    void ListBoxImpl::SetList(const char* list, char separator, char typesep)
    {
        GETLB(wid)->Freeze();
        Clear();
        wxStringTokenizer tkzr(sci2wx(list), (wxChar)separator);
        while ( tkzr.HasMoreTokens() ) {
            wxString token = tkzr.GetNextToken();
            long type = -1;
            int pos = token.Find(typesep);
            if (pos != -1) {
                token.Mid(pos+1).ToLong(&type);
                token.Truncate(pos);
            }
            Append(token, (int)type);
        }
        GETLB(wid)->Thaw();
    }
    

    How should Scintilla handle empty entries?

     
    • Neil Hodgson
      Neil Hodgson
      2013-03-20

      Modifying other platform layers to remove empty entries to match wxScintilla would be more work than changing wxScintilla. Its likely SC_ORDER_CUSTOM will lead to more formatting including non-selectable headers and blank lines.

      It should be possible to change wxScintilla to be the same as other platforms using wxTOKEN_RET_EMPTY in the tokenizer constructor.

       
  • Alpha
    Alpha
    2013-03-20

    Although, keeping empty entries in the sorted index can be done by switching the line:
    while (list[i] == ac->GetSeparator())
    to:
    if (list[i] == ac->GetSeparator())

     
    Attachments
    • Neil Hodgson
      Neil Hodgson
      2013-03-20

      Makes no difference. The first \n was consumed earlier in the loop so changing from while to if does not preserve the second \n.

       
1 2 > >> (Page 1 of 2)