Menu

#2390 Korean Ime move its caret blindly.

Bug
closed-rejected
5
2023-07-26
2023-06-20
johnsonj
No
// CS_NOMOVECARET: keep caret at beginning of composition string which already moved in InsertCharacter().
// GCS_CURSORPOS: current caret position is provided by IME.
Sci::Position imeEndToImeCaretU16 = -static_cast<Sci::Position>(wcs.size());
if (!(lParam & CS_NOMOVECARET) && (lParam & GCS_CURSORPOS)) {
        imeEndToImeCaretU16 += imc.GetImeCaretPos();
}

This code deprives korean Ime of the chance to get its ime caret position.
So korean ime move its ime caret blindly to the start of compStr.
This is a buggy behavior, not desirible.

It should honor ime caret position if possible rather than CS_NOMOVECARET.

In addition, All Other excution flows have to go through "if CS_NOMOVECARET check" to see MoveImeCarets().
There is no need to introduce CS_NOMOVECARET.
It is not efficent.

This is more reasonable patch. simple, intuitive and no need comments!

+ Sci::Position imeEndToImeCaretU16 = static_cast<Sci::Position>(-wcs.size() + imc.GetImeCaretPos());
1 Attachments

Discussion

  • Zufu Liu

    Zufu Liu - 2023-06-20

    Previous discussion [feature-requests:#1304].

    Current code does not break Korean IME:

    Sci::Position imeEndToImeCaretU16 = -static_cast<Sci::Position>(wcs.size());
    printf("CS_NOMOVECARET=%d, GCS_CURSORPOS=%d, pos=%d/%d\n", (int)(lParam & CS_NOMOVECARET), (int)(lParam & GCS_CURSORPOS),
        (int)(imeEndToImeCaretU16), (int)imc.GetImeCaretPos());
    

    result for above log:

    Korean IME:
    CS_NOMOVECARET=16384, GCS_CURSORPOS=0, pos=-1/0
    
    Other IME:
    CS_NOMOVECARET=0, GCS_CURSORPOS=128, pos=-2/2
    
     

    Related

    Feature Requests: #1304

  • Zufu Liu

    Zufu Liu - 2023-06-20
    • labels: honor ime caret position --> scintilla, win32, IME
     
  • Zufu Liu

    Zufu Liu - 2023-06-20

    I think current code is not wrong:
    https://learn.microsoft.com/en-us/windows/win32/intl/wm-ime-composition

    CS_NOMOVECARET
    Do not move the caret position as a result of processing the message. For example, if an IME specifies a combination of CS_INSERTCHAR and CS_NOMOVECARET, the application should insert the specified character at the current caret position but should not move the caret to the next position. A subsequent WM_IME_COMPOSITION message with GCS_RESULTSTR will replace this character.

    https://learn.microsoft.com/en-us/windows/win32/intl/ime-composition-string-values

    GCS_CURSORPOS
    Retrieve or update the cursor position in composition string.

     
  • johnsonj

    johnsonj - 2023-06-20

    It is too bad you assume the original code as wrong.
    what can I say about?

     
  • johnsonj

    johnsonj - 2023-06-20

    This is not about simplication, but about reverting.
    Why does one line need to expand five lines?
    Maybe the cause is what the comments give.
    But the comments do not give any excuse and is deceptive to code readers.
    The real behivor of the codes expanded I found is to filter korean ime out and to discriminate it.
    Currntly korean ime is blind not to know its ime caret position while other imes do.

    The codes expanded are very rude for korean.

     
  • Zufu Liu

    Zufu Liu - 2023-06-20

    Reverting seems has no impact for other IME.
    probably need to test the revert on wine where Korean IME seems not work: GCS_CURSORPOS is always composition string length, and CS_NOMOVECARET is never set or accompanied with GCS_CURSORPOS.

     
  • johnsonj

    johnsonj - 2023-06-21

    The more if checks, the more difficult to debug, the less we are sure that we are right.
    One liner can give sure that we are right.

     
  • Zufu Liu

    Zufu Liu - 2023-06-21

    Not find a way to test IME on Wine.

    I think revert (no bug here, just to simplify the code) is safe as only Korean IME set CS_INSERTCHAR and CS_NOMOVECARET (do in-place composition), other IME composition string grows on typing more.

    the one liner code was used before 2019 changes, and no bug caused by that line. it just leaves one question: it relies on imc.GetImeCaretPos() returns zero when (lParam & GCS_CURSORPOS) is zero?

     
  • johnsonj

    johnsonj - 2023-06-21

    One line code expects any value IMM32 can return.

    If CS_NOMOVECARET is intruduced,
    it should depends on that imc.GetImeCaretPos() return zero when (lParam & GCS_CURSORPOS) is zero.

     
  • Neil Hodgson

    Neil Hodgson - 2023-07-05

    So korean ime move its ime caret blindly to the start of compStr.
    This is a buggy behavior, not desirible.

    How can I reproduce this bug?

     
    • johnsonj

      johnsonj - 2023-07-05

      Your question confirms current code should be reverted.
      It compromises readability.

      On korean ime, imc.GetImeCaretPos() should return the start position of the last character for imeblokcaretoverride regardless of CS_NOMOVECARET.
      If CS_NOMOVECARET, then imc.GetImeCaretPos() returns 0 which is the start position of the last character.
      Even If not CS_NOMOVECARET, it also returns the start position of the last character.

      Current code exploits imc.GetImeCaretPos() absolutely is 0 when CS_NOMOVECARET set.

      It is OK to never mind CS_NOMOVECARET.

       
      • Neil Hodgson

        Neil Hodgson - 2023-07-05

        It compromises readability.

        That's all? There is no actual problem that can be seen by a user?

         
        • johnsonj

          johnsonj - 2023-07-05

          A user can not see any actual problem.
          But an implementer should catch hidden problems.
          Here are same effect for korean on imm32.
          ime caret is located at:

          1. the start of compStr.
          2. the start of the last charecter.
          3. the position pointed by imm32.

          I think 3 is right behavior.

          Previous code 3 did show any actual problem to user neither.
          but current code selects 1 with a discriminative cause.

          This is not simplication, but reverting with a right cause.

           
          • johnsonj

            johnsonj - 2023-07-06
            1. the start of compStr by CS_NOMOVECARET.
            2. the start of the last character by no CS_NOMOVECARET.
            3. the position pointed by IME regardless of CS_NOMOVECARET.

            IME knows at which ime caret should be located.
            so for one caracter input, it must be the start of one compStr by CS_NOMOVECARET.
            for clause input, it must be the start of the last character by no CS_NOMOVECARET.

            But for korean ime, current code implement as this.
            If (CS_NOMOVECARET) {
            // do nothing
            } else {
            imeEndToImeCaretU16 += imc.GetImeCaretPos();
            }

            Is this implemeted according to MS document?

             
        • johnsonj

          johnsonj - 2023-07-05
          if (!(lParam & CS_NOMOVECARET) && (lParam & GCS_CURSORPOS)) {
                  imeEndToImeCaretU16 += imc.GetImeCaretPos();
          }
          

          equals to :

          if (KoreanIME) {
                  //do nothing
           } esle {
                  imeEndToImeCaretU16 += imc.GetImeCaretPos();
          }
          

          Why does it adopt discrimination?
          The comments does not give clear explanations.

           

          Last edit: johnsonj 2023-07-05
          • Neil Hodgson

            Neil Hodgson - 2023-07-05

            Why does it adopt discrimination?

            It is implementing the behaviour specified by Microsoft documentation in WM_IME_COMPOSITION.

            There is some ambiguity over how to handle a message with both CS_NOMOVECARET and a non-zero GCS_CURSORPOS but the documentation for CS_NOMOVECARET is emphatic so that should receive priority.

            CS_NOMOVECARET Do not move the caret position as a result of processing the message.

            Code that implements a specification more fully and has no known problems should be preferred even when it covers cases that have not been observed. There are a wide variety of IMEs and some may produce these cases now or with future updates.

             
            • johnsonj

              johnsonj - 2023-07-06

              The reason why I want reverting is exactly for

              There are a wide variety of IMEs and some may produce these cases now or with future updates.

              It is desirable to implement ime codes as simple as possible.
              We do not use GCS_COMPCLAUSE for ime attributes.
              We do not CS_INSERTCHAR to insert characters.

               
  • johnsonj

    johnsonj - 2023-07-06

    from https://github.com/ONLYOFFICE/desktop-sdk/blob/master/ChromiumBasedEditors/lib/src/cef/windows/tests/cefclient/browser/osr_ime_handler_win.cc
    .....

      // **Retrieve the selection range information**. If CS_NOMOVECARET is specified
      // it means the cursor should not be moved and we therefore place the caret at
      // the beginning of the composition string. **Otherwise we should honour the
      // GCS_CURSORPOS value if it's available.**
      // TODO(suzhe): Due to a bug in WebKit we currently can't use selection range
      // with composition string.
      // See: https://bugs.webkit.org/show_bug.cgi?id=40805
      if (!(lparam & CS_NOMOVECARET) && (lparam & GCS_CURSORPOS)) {
        // IMM32 does not support non-zero-width selection in a composition. So
        // always use the caret position as selection range.
        int cursor = ::ImmGetCompositionString(imc, GCS_CURSORPOS, nullptr, 0);
        composition_start = cursor;
      } else {
        composition_start = 0;
      }
    

    .....

    I think this is the case "Otherwise"
    we should honour the GCS_CURSORPOS value.

    CS_NOMOVECARET breakes the rule already since the caret moved to the end of compStr.

    So does it break the rule for korean ime to honor ime caret potision which is pointed IME?

    if ((lParam & CS_NOMOVECARET) && (lParam & GCS_CURSORPOS)) {
            imeEndToImeCaretU16 += imc.GetImeCaretPos(); //instead of "do nothing" for korean ime
    } else {
            imeEndToImeCaretU16 += imc.GetImeCaretPos();
    }
    
     

    Last edit: johnsonj 2023-07-06
    • Neil Hodgson

      Neil Hodgson - 2023-07-06

      CS_NOMOVECARET breakes the rule already since the caret moved to the end of compStr.

      The caret movement is caused by reusing the implementation of InsertCharacter for displaying the composition string since it handles some complexities like multiple carets and protected regions. The caret position is corrected before it is visible to the user.

      So does it break the rule for korean ime to honor ime caret potision which is pointed IME?

      Yes, this is ignoring CS_NOMOVECARET for no visible improvement.

       
      • johnsonj

        johnsonj - 2023-07-06

        GCS_CURSORPOS is primary.
        but CS_NOMOVECARET is secondary.

        it really breaks the rule to ignore GCS_CURSORPOS.

         
  • Neil Hodgson

    Neil Hodgson - 2023-07-06
    • status: open --> open-rejected
     
  • Neil Hodgson

    Neil Hodgson - 2023-07-06

    Rejecting as the change has no effect.

     
  • Neil Hodgson

    Neil Hodgson - 2023-07-26
    • status: open-rejected --> closed-rejected
     

Log in to post a comment.

MongoDB Logo MongoDB