#1056 Support DBCS in "characters"

Completed
open
Neil Hodgson
dbcs (1)
3
2014-06-25
2014-06-25
RobertL
No

Could DBCS be supported in the "characters" settings?
For example, whitespace.characters and word.characters, which the "Properties file" section of the SciTE Documentation describes as being treated as byte sequences.

Also, I find that SCI_SETPUNCTUATIONCHARS, as described in the Scintilla Documentation, does not support DBCS either.
They all take the same parameter, const char *characters.

Discussion

  • Neil Hodgson
    2014-06-25

    • labels: --> dbcs
    • assigned_to: Neil Hodgson
    • Priority: 5 --> 3
     
  • Neil Hodgson
    2014-06-25

    Contributions for DBCS are fairly rare as many applications have moved to Unicode/UTF-8.

    It would be fine to implement this, but it's not something I'll be working on.

     
    • RobertL
      2014-06-25

      I'm using Chinese in my code. Chinese needs no spaces between characters, and both its word characters and its punctuation are DBCS.

      It's hard to segment the text into words and punctuation, and I can't use Ctrl+Left/Right to move between words.

      So I want to treat punctuation as whitespace; then it would be easy to move between clauses or sentences instead of whole paragraphs.

       
      Last edit: RobertL 2014-06-25
      • Neil Hodgson
        2014-06-25

        Adding DBCS punctuation as white space is the same problem as adding DBCS characters as word characters. Scintilla just has a set of bytes that are used to determine whether each byte in the document is word/punctuation/whitespace.
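
The per-byte limitation described above can be illustrated with a minimal sketch. This is not Scintilla's actual code: `ByteClassifier`, `SetClass`, and `ClassifyBytes` are hypothetical names, the default table is simplified, and the byte values assume GB2312/GBK encoding (ideographic full stop "。" as 0xA1 0xA3, the character "啊" as 0xB0 0xA1).

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

enum class CharClass { space, punctuation, word };

// Hypothetical stand-in for a per-byte classification table.
// Simplification: every byte defaults to "word" except ASCII whitespace.
class ByteClassifier {
    std::array<CharClass, 256> table;
public:
    ByteClassifier() {
        table.fill(CharClass::word);
        for (unsigned char ch : std::string(" \t\r\n"))
            table[ch] = CharClass::space;
    }
    // Same shape as a "const char *characters" parameter: every byte of
    // the argument is marked on its own, with no notion of characters.
    void SetClass(const char *characters, CharClass cc) {
        for (; *characters; ++characters)
            table[static_cast<unsigned char>(*characters)] = cc;
    }
    CharClass Classify(unsigned char byte) const { return table[byte]; }
};

// Classify each byte of the text independently.
std::vector<CharClass> ClassifyBytes(const ByteClassifier &bc,
                                     const std::string &text) {
    std::vector<CharClass> classes;
    for (unsigned char byte : text)
        classes.push_back(bc.Classify(byte));
    return classes;
}
```

If "。" (assumed bytes 0xA1 0xA3) is registered as whitespace with `SetClass("\xA1\xA3", CharClass::space)`, the trail byte of "啊" (0xB0 0xA1) is also classified as whitespace while its lead byte stays a word byte, so a class boundary appears in the middle of a single character.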

        Doing this with DBCS would require the word movement code to treat the document as a sequence of characters and to have different data structures to represent the different character sets.
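
A rough sketch of what character-based word movement might look like, under stated assumptions. None of this is Scintilla's implementation: `IsLeadByte`, `CharLength`, `CharClassifier`, and `NextWordStart` are hypothetical names, the lead byte range 0x81..0xFE assumes code page 936 (GBK), and the character sets are stored as sets of one- or two-byte strings instead of a 256-entry byte table.

```cpp
#include <cstddef>
#include <set>
#include <string>

enum class CharClass { space, punctuation, word };

// Assumption: code page 936 (GBK), where lead bytes are 0x81..0xFE.
bool IsLeadByte(unsigned char ch) { return ch >= 0x81 && ch <= 0xFE; }

// Length in bytes of the character that starts at pos.
std::size_t CharLength(const std::string &doc, std::size_t pos) {
    if (IsLeadByte(static_cast<unsigned char>(doc[pos])) && pos + 1 < doc.size())
        return 2;
    return 1;
}

// Character sets hold whole characters (1 or 2 bytes), not single bytes.
struct CharClassifier {
    std::set<std::string> spaces{" ", "\t", "\r", "\n"};
    std::set<std::string> punctuation;
    CharClass Classify(const std::string &ch) const {
        if (spaces.count(ch)) return CharClass::space;
        if (punctuation.count(ch)) return CharClass::punctuation;
        return CharClass::word;
    }
};

// Simplified Ctrl+Right: skip the run of characters in the starting
// class, then skip any whitespace that follows.
std::size_t NextWordStart(const std::string &doc, std::size_t pos,
                          const CharClassifier &cc) {
    if (pos >= doc.size()) return doc.size();
    const CharClass start = cc.Classify(doc.substr(pos, CharLength(doc, pos)));
    while (pos < doc.size()) {
        const std::size_t len = CharLength(doc, pos);
        if (cc.Classify(doc.substr(pos, len)) != start) break;
        pos += len;
    }
    while (pos < doc.size()) {
        const std::size_t len = CharLength(doc, pos);
        if (cc.Classify(doc.substr(pos, len)) != CharClass::space) break;
        pos += len;
    }
    return pos;
}
```

With "。" (assumed bytes 0xA1 0xA3) inserted into `spaces`, moving forward from the start of "啊啊。啊" stops at the character after the full stop, and a two-byte character is never split. The sketch only walks forward from a known character boundary; moving backwards in a DBCS document is harder because a trail byte can look like a lead byte.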

         
        • RobertL
          2014-06-26

          Hm, I see.
          Word, punctuation, and whitespace are three different sets, and each is treated as a sequence of bytes.

           
          Last edit: RobertL 2014-06-26