#21 Support case-insensitive languages

Milestone: Next_Release
Status: pending
Owner: Andre-Littoz
Priority: 5
Updated: 2012-09-17
Created: 2009-04-12
Creator: AdrianIssott
Private: No

This feature would make LXR better suited to at least the following languages:
* Pascal (fully case-insensitive)
* Visual Basic (fully case-insensitive)
* SQL (partially case-insensitive: reserved words are case-insensitive, but table and column names may not be)
* PHP (partially case-insensitive: function and class names are case-insensitive, but variable names aren't)

Two particular issues to keep in mind are:
* It's too simplistic to assume languages are either case-sensitive or not.
* Allowing searches to be case-insensitive (e.g. for identifiers) will increase the number of false positives returned (i.e. matches that the user doesn't care about) and make searching less useful.
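The false-positive concern can be illustrated with a small sketch. The occurrence list and the `search` helper below are hypothetical and are not part of LXR; they only show how ignoring case in a mixed-language tree pulls in identifiers that a case-sensitive language treats as distinct.

```python
# Hypothetical (language, identifier) occurrences from a mixed-language tree.
OCCURRENCES = [
    ("Pascal", "count"),
    ("Pascal", "Count"),
    ("C", "count"),
    ("C", "COUNT"),
]


def search(query, ignore_case=False):
    """Return the occurrences whose identifier matches the query."""
    if ignore_case:
        # Compare upper-cased forms so that case differences are ignored.
        return [o for o in OCCURRENCES if o[1].upper() == query.upper()]
    return [o for o in OCCURRENCES if o[1] == query]


exact = search("count")
loose = search("count", ignore_case=True)
```

Here `exact` holds two occurrences and `loose` holds all four. In Pascal the extra match is the same symbol, but in C `COUNT` is a different identifier, so a user looking for the C symbol `count` gets a match they do not care about.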

I propose this feature is implemented in the following stages:
1) Feature Request 2755445 (Language Specific Searches), to mitigate the issue of increased false positives by allowing case-insensitive searches to be restricted to just those languages in which it makes sense. See https://sourceforge.net/tracker/?func=detail&aid=2755445&group_id=27350&atid=390120 for more details.
2) The format of generic.conf should be updated to allow case-insensitivity to be specified per language as follows:

'caseinsensitivity' => {
    'reserved' => [01],
    'type_a' => [01],
    'type_b' => [01],
    ...
},

Here [01] means that each value is either 0 or 1, and type_a, type_b, etc. are the types listed in the typemap for the language.
3) Add case-insensitive matching for reserved words when locating them for markup in source files being browsed (this has no impact on searching).
4) Add case-insensitive matching per type when locating symbols for markup in source files being browsed.
5) Add case-insensitive matching per type when adding links to identifier searches in source files being browsed.
6) Allow identifier searches to be specified as case-insensitive by the user. [Don't try to do anything fancy, such as presetting this according to which languages the search is restricted to.]
7) Ensure general searches on file contents can be made case-insensitive by the user.
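
As a rough illustration of stages 2 to 5, the per-type flags could drive how a symbol is normalised before it is compared. This is a minimal Python sketch, not LXR code (LXR is written in Perl): the `CASE_INSENSITIVITY` table, the type names and the helper functions are all hypothetical, and the flag values simply follow the SQL and PHP descriptions given above.

```python
# Hypothetical per-language flags mirroring the proposed 'caseinsensitivity'
# block. The type names ('function', 'table', ...) are invented for this
# sketch; the real ones would come from each language's typemap.
CASE_INSENSITIVITY = {
    "SQL": {"reserved": 1, "table": 0, "column": 0},
    "PHP": {"function": 1, "class": 1, "variable": 0},
}


def lookup_key(language, kind, name):
    """Return the comparison key for name, folded only if kind is flagged."""
    flags = CASE_INSENSITIVITY.get(language, {})
    # Languages or types without an entry are treated as case-sensitive.
    return name.upper() if flags.get(kind, 0) else name


def same_symbol(language, kind, a, b):
    """True if a and b denote the same symbol under the language's rules."""
    return lookup_key(language, kind, a) == lookup_key(language, kind, b)
```

With these assumed flags, `same_symbol("PHP", "function", "Foo", "foo")` is true, while `same_symbol("PHP", "variable", "$Foo", "$foo")` is false. A language with no entry, such as C in this sketch, is left fully case-sensitive.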

Discussion

  • AdrianIssott
    2009-04-12

    • assigned_to: nobody --> adrianissott
     
  • Andre-Littoz
    2012-09-17

    Preliminary implementation in release 1.0:

    - A new flag, 'case_insensitive', has been added to the language description.
    According to the specification above, this may be too coarse, since it applies to a language as a whole.
    When declared symbols are added to the DB dictionary, they are converted to upper case if the flag is true. The same applies to references.
    In processcode(), if the flag is true, the key for the dictionary search is converted to upper case (the symbol is displayed as written and is highlighted whatever its case).

    - ident is not modified, meaning that if the case in the file and the case in the dictionary do not correspond, the target symbol will not be found. The difficulty is that 'case_insensitive' is a language flag, while the dictionary lookup is global (across all languages making up a tree).

    As suggested by adrianissott, a finer-grained strategy should be used: perhaps case-insensitivity should only be considered for keywords, since varying the case between uses of the same object is bad coding practice, and symbols should keep their native case in the dictionary.
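
The behaviour described in this comment can be sketched roughly as follows. This is an illustrative Python model, not the release 1.0 code: the `SymbolIndex` class and its method names are hypothetical. It assumes only what is stated above, namely that keys are upper-cased for flagged languages, that browsing knows the file's language, and that the ident-style lookup is global and does not fold case.

```python
class SymbolIndex:
    """Toy model of a global dictionary shared by every language in a tree."""

    def __init__(self, case_insensitive_languages):
        self.ci = set(case_insensitive_languages)
        self.entries = {}  # key -> list of (language, file, line)

    def key_for(self, language, symbol):
        # Upper-case the key only for languages flagged as case-insensitive.
        return symbol.upper() if language in self.ci else symbol

    def add(self, language, symbol, file, line):
        key = self.key_for(language, symbol)
        self.entries.setdefault(key, []).append((language, file, line))

    def lookup_while_browsing(self, language, symbol):
        # The language of the browsed file is known, so the key can be folded.
        return self.entries.get(self.key_for(language, symbol), [])

    def lookup_global(self, symbol):
        # No language is known here, so the symbol is used as typed.
        return self.entries.get(symbol, [])


index = SymbolIndex({"Pascal"})
index.add("Pascal", "WriteLog", "log.pas", 10)
index.add("C", "WriteLog", "log.c", 42)
```

In this model, `index.lookup_while_browsing("Pascal", "writelog")` finds the Pascal declaration, but `index.lookup_global("WriteLog")` returns only the C entry, because the Pascal entry is stored under `WRITELOG`. That is the mismatch between a per-language flag and a global lookup noted above.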

     
  • Andre-Littoz
    2012-09-17

    • assigned_to: adrianissott --> ajlittoz
    • milestone: --> Next_Release
    • status: open --> pending