#1003 Tcl single line braced expressions should be considered strings

Completed
open
Neil Hodgson
5
2013-11-15
2013-07-26
Cousteau
No

In Tcl, code blocks are (usually) multi-line blocks of text delimited by braces. However, braces are used for delimiting strings in general, and thus can appear delimiting lists or literal strings. For this reason, code like

proc foo {list args} {
    do something with $list and $args
}

will be formatted so that list is highlighted as a keyword, when it's supposed to be a variable name.

In my opinion, this way of highlighting might be confusing, so I'd do the following:

  • Interpret single-line braced text as "list" and multi-line braced text as "code"
  • Highlight "list" in a similar way to "string", with no keyword parsing
  • Highlight "list" braces (both the delimiting braces and any inner non-escaped braces) in a different color from the rest of the list; inner braces could mean that the list contains a sub-list
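
The single-line versus multi-line rule proposed above could be sketched roughly as follows. This is only an illustrative Python sketch, not code from Scintilla's Tcl lexer; the function name braced_groups and the kind labels "list" and "code" are made up for the example, and nothing about Tcl is parsed beyond braces and backslash escapes.

```python
def braced_groups(source):
    """Yield (text, kind) for each top-level {...} group in Tcl source.

    kind is "code" when the group spans several lines, otherwise "list".
    Backslash-escaped characters are skipped so escaped braces do not count.
    """
    depth = 0
    start = None
    i = 0
    while i < len(source):
        ch = source[i]
        if ch == "\\":
            i += 2              # skip the backslash and the escaped character
            continue
        if ch == "{":
            if depth == 0:
                start = i       # remember where the outermost group opens
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                text = source[start:i + 1]
                yield text, ("code" if "\n" in text else "list")
        i += 1


sample = "proc foo {list args} {\n    do something with $list and $args\n}\n"
for text, kind in braced_groups(sample):
    print(kind, repr(text))
```

With the proc example, {list args} is reported as "list" and the body as "code". Under the same rule, both groups in a one-liner such as if {something} {do something} come out as "list", which is the weak spot of the heuristic.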

Discussion

  • Neil Hodgson
    2013-07-29

    • labels: --> tcl, scintilla
    • assigned_to: Neil Hodgson
     
  • Neil Hodgson
    2013-07-29

    While I haven't worked with TCL for a couple of years, my recollection is that single line bracketed code is quite common. Unless there is a strong consensus that single line bracketed expressions should be treated differently, current behaviour should be preserved.

    The alternative lexing scheme could be added as an option.

     
    • Cousteau
      2013-07-31

      > While I haven't worked with TCL for a couple of years, my recollection is that single line bracketed code is quite common. Unless there is a strong consensus that single line bracketed expressions should be treated differently, current behaviour should be preserved.

      The Tcl Style Guide (http://wiki.tcl.tk/708#pagetoc081af811) seems to suggest putting one statement per line (it mentions breakpoints in a debugger), and thus it would discourage code like

      if {something} {do something}
      

      in favor of

      if {something} {
          do something
      }
      

      Technically all of them are strings, but {do something} represents code and {something} represents an expression. My idea is to decide whether it's code or just a string/list/expression based on whether it spans several lines or not.

      As Eric has pointed out, maybe a better (but more complicated) solution would involve detecting specific "keywords" such as proc, if, for, expr, etc., and using them to determine the meaning of the arguments.

       
  • Eric Promislow
    2013-07-29

    I've written lexers for a bunch of languages over the years, and
    have concluded that writing a lexer for Tcl that gets everything
    correct is hard, because in general you never know whether a
    bottom-level braced construct (one not containing further nested
    braces) will be used as a string, or will be reinterpreted and
    interpreted as something else. You can do a bit more parsing to
    recognize things like the parts of a proc definition or built-in
    structural expression, like a for-loop, but in general it comes
    down to the semantics of a particular program.

    In this particular case, Komodo's forked lexer also gets the
    list of args wrong, and colors "list" as a keyword. These
    words should only be colored as keywords either at the start of
    an expression or after a "[".
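
The rule in the paragraph above (keywords only at the start of a command or after a "[") might be sketched like this. It is an illustrative Python sketch, not code from Scintilla's or Komodo's lexer: KEYWORDS is a small made-up subset, keyword_spans is a hypothetical helper, and only the start of a line and "[" are treated as command position (";" and code bodies inside braces are not handled).

```python
import re

KEYWORDS = {"proc", "list", "set", "if", "puts", "expr"}  # illustrative subset

WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def keyword_spans(line):
    """Return (start, end) spans of keywords that sit in command position."""
    spans = []
    for match in WORD.finditer(line):
        if match.group() not in KEYWORDS:
            continue
        # Text before the word, ignoring the blanks that separate words.
        before = line[:match.start()].rstrip(" \t")
        if before == "" or before.endswith("["):
            spans.append(match.span())
    return spans


for line in ("proc foo {list args} {", "    set x [list a b]", "puts $list"):
    print([line[a:b] for a, b in keyword_spans(line)], "<-", line)
```

Here "list" is left alone inside {list args} and in $list, but is still colored after "[" in the second line.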

    I disagree on the single-line braced thing. If you want to work
    with a single double-quote, the easiest way is to encase it
    in matching braces.

     
    • Cousteau
      2013-07-31

      > you never know whether a
      > bottom-level braced construct (one not containing further nested
      > braces) will be used as a string, or will be reinterpreted and
      > interpreted as something else.

      My idea is to deal with this issue in a way that satisfies most situations: the lexer should somehow predict whether a braced expression is a string, list, or code. If it spans multiple lines then it's probably code; if it fits on a single line then it's probably a string or list.

      > I disagree on the single-line braced thing. If you want to work
      > with a single double-quote, the easiest way is to encase it
      > in matching braces.

      In my idea, a "list" would actually be just a special type of string, equivalent to single-quoted strings in Perl, so maybe "non-interpolated string" would be a better name. My point was that lists often use this syntax, but so do strings as you pointed out; that's why I said that the highlighting should be similar to that of "string". Probably using "character" highlighting (the same used for single-quoted strings in C and Perl) is the way to go.

      However, I consider that braces should be highlighted differently to the rest of the string because they have a special treatment within the string (they must be balanced, similarly to Perl's q{}) and may be used to indicate sub-lists, but this is actually not really important.
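
To make the last point concrete, here is a rough Python sketch of splitting a braced word into brace runs and text runs so the two could be styled differently. It is only an illustration under simple assumptions: split_braced is a hypothetical name, backslash-escaped braces are kept as ordinary text, and unbalanced input is reported with an exception rather than handled the way an editor lexer would need to.

```python
def split_braced(text):
    """Split a braced Tcl word into ("brace", s) and ("text", s) runs.

    Backslash-escaped braces stay inside the text runs. Raises ValueError
    when the unescaped braces do not balance.
    """
    runs = []
    buf = []
    depth = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            buf.append(text[i:i + 2])   # keep the escape pair as plain text
            i += 2
            continue
        if ch in "{}":
            if buf:
                runs.append(("text", "".join(buf)))
                buf = []
            depth += 1 if ch == "{" else -1
            if depth < 0:
                raise ValueError("unbalanced close brace")
            runs.append(("brace", ch))
        else:
            buf.append(ch)
        i += 1
    if buf:
        runs.append(("text", "".join(buf)))
    if depth != 0:
        raise ValueError("unbalanced open brace")
    return runs


for kind, run in split_braced("{a {b c} d}"):
    print(kind, repr(run))
```

The inner {b c} shows up as its own pair of brace runs, which is what would let a sub-list stand out from the surrounding text.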

       
  • Cousteau
    2013-11-15

    Maybe the best idea is to do as Eric suggested and do the parsing/highlighting based on the commands: known commands such as if, proc, set, string, etc. could get special highlighting of their arguments based on which data types they expect.
    Here's my proposal:

    Expression quoting:

    A. Unquoted expressions:   Hello
    B. Braced expressions:   {Hello}
    C. Quoted expressions:   "Hello"

    Argument types expected by standard Tcl commands:

    0. Plain strings:   puts {Hello}
    1. String lists:   lindex {foo bar baz} 2
    2. Expressions:   expr {$foo+1}
    3. Variable names:   set {foo} 42
    4. Command lists:   uplevel {foo; bar; baz}
    5. Subcommands:   string {range} $foo $bar $baz
    6. Options:   puts {-nonewline} foo
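
As a rough illustration of how a lexer could know which argument type to expect, the examples in the list above can be turned into a small lookup table. This is a hedged Python sketch, not an existing Scintilla feature: SIGNATURES, TAKES_OPTIONS and argument_kinds are hypothetical names, the table covers only the commands used in the examples, and the real commands accept more argument forms than shown here.

```python
# Expected kind of each positional argument, taken from the examples above.
SIGNATURES = {
    "puts":    ("string",),
    "lindex":  ("list", "string"),
    "expr":    ("expression",),
    "set":     ("variable", "string"),
    "uplevel": ("commands",),
    "string":  ("subcommand",),
}
TAKES_OPTIONS = {"puts"}  # commands whose leading "-word" is an option


def argument_kinds(words):
    """Pair each argument of one already-split command with its expected kind."""
    command, *args = words
    signature = SIGNATURES.get(command, ())
    kinds = []
    slot = 0
    for arg in args:
        # Look through one level of braces so {-nonewline} counts as an option.
        bare = arg[1:-1] if arg.startswith("{") and arg.endswith("}") else arg
        if command in TAKES_OPTIONS and slot == 0 and bare.startswith("-"):
            kinds.append((arg, "option"))
            continue
        # Arguments beyond the known signature fall back to plain strings.
        kind = signature[slot] if slot < len(signature) else "string"
        kinds.append((arg, kind))
        slot += 1
    return kinds


print(argument_kinds(["set", "{foo}", "42"]))
print(argument_kinds(["puts", "{-nonewline}", "foo"]))
```

Commands missing from the table simply get "string" for every argument, which keeps today's behaviour as the default.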

    Highlighting scheme:

    A:
        A3: highlight as variable (same as $foo)
        A4: highlight as code
        A5: highlight as secondary keyword (like C's int8_t or Python's repr())
        A6: highlight as operator
        others: regular highlighting (considering [commands], $variables, and \escapes)
    B:
        B1: treat each element as an argument of type 1 (A1/B1/C1)
        B2: highlight as expression (or use regular highlighting)
        B4: highlight as code (sequence list)
        others: highlight as verbatim quoted text (character)
    C:
        all: highlight as quoted string (considering [commands], $variables, and \escapes)
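
The scheme above amounts to a lookup on (quoting, argument type). A minimal Python sketch of that lookup follows. It rests on an assumption that should be stated loudly: the digits in the scheme are read as counting the argument types from zero in the order they are listed (so A3 is an unquoted variable name and B4 a braced command list). The style names, STYLES, DEFAULTS, quoting_of and style_for are all made up for the example.

```python
# (quoting, argument kind) -> style, following the scheme above.
# ASSUMPTION: scheme digits count the listed argument types from zero.
STYLES = {
    ("unquoted", "variable"):   "variable",            # A3
    ("unquoted", "commands"):   "code",                # A4
    ("unquoted", "subcommand"): "secondary keyword",   # A5
    ("unquoted", "option"):     "operator",            # A6
    ("braced", "list"):         "per-element",         # B1
    ("braced", "expression"):   "expression",          # B2
    ("braced", "commands"):     "code",                # B4
}
# The "others" / "all" rows of the scheme.
DEFAULTS = {"unquoted": "regular", "braced": "verbatim", "quoted": "string"}


def quoting_of(word):
    """Classify a word as braced (B), quoted (C) or unquoted (A)."""
    if len(word) >= 2 and word[0] == "{" and word[-1] == "}":
        return "braced"
    if len(word) >= 2 and word[0] == '"' and word[-1] == '"':
        return "quoted"
    return "unquoted"


def style_for(word, kind):
    """Pick a style for one argument given its expected kind."""
    quoting = quoting_of(word)
    return STYLES.get((quoting, kind), DEFAULTS[quoting])


print(style_for("foo", "variable"))
print(style_for("{$foo+1}", "expression"))
print(style_for('"Hello"', "string"))
```

Keeping the table separate from the scanning code means the mapping could be adjusted, or switched off as an option, without touching the rest of a lexer.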