Scintilla / Bugs / #2390 Korean Ime move its caret blindly.

Previous discussion [feature-requests:#1304].

Current code does not break Korean IME:

Sci::Position imeEndToImeCaretU16 = -static_cast<Sci::Position>(wcs.size());
printf("CS_NOMOVECARET=%d, GCS_CURSORPOS=%d, pos=%d/%d\n", (int)(lParam & CS_NOMOVECARET), (int)(lParam & GCS_CURSORPOS),
    (int)(imeEndToImeCaretU16), (int)imc.GetImeCaretPos());

result for above log:

Korean IME:
CS_NOMOVECARET=16384, GCS_CURSORPOS=0, pos=-1/0

Other IME:
CS_NOMOVECARET=0, GCS_CURSORPOS=128, pos=-2/2

Feature Requests: ~~#1304~~

Zufu Liu - 2023-06-20

labels: honor ime caret position --> scintilla, win32, IME
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Zufu Liu - 2023-06-20

I think current code is not wrong:
https://learn.microsoft.com/en-us/windows/win32/intl/wm-ime-composition

CS_NOMOVECARET
Do not move the caret position as a result of processing the message. For example, if an IME specifies a combination of CS_INSERTCHAR and CS_NOMOVECARET, the application should insert the specified character at the current caret position but should not move the caret to the next position. A subsequent WM_IME_COMPOSITION message with GCS_RESULTSTR will replace this character.

https://learn.microsoft.com/en-us/windows/win32/intl/ime-composition-string-values

GCS_CURSORPOS
Retrieve or update the cursor position in composition string.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

johnsonj - 2023-06-20

It is too bad you assume the original code as wrong.
what can I say about?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

johnsonj - 2023-06-20

This is not about simplication, but about reverting.
Why does one line need to expand five lines?
Maybe the cause is what the comments give.
But the comments do not give any excuse and is deceptive to code readers.
The real behivor of the codes expanded I found is to filter korean ime out and to discriminate it.
Currntly korean ime is blind not to know its ime caret position while other imes do.

The codes expanded are very rude for korean.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Zufu Liu - 2023-06-20

Reverting seems has no impact for other IME.
probably need to test the revert on wine where Korean IME seems not work: GCS_CURSORPOS is always composition string length, and CS_NOMOVECARET is never set or accompanied with GCS_CURSORPOS.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

johnsonj - 2023-06-21

The more if checks, the more difficult to debug, the less we are sure that we are right.
One liner can give sure that we are right.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Zufu Liu - 2023-06-21

Not find a way to test IME on Wine.

I think revert (no bug here, just to simplify the code) is safe as only Korean IME set CS_INSERTCHAR and CS_NOMOVECARET (do in-place composition), other IME composition string grows on typing more.

the one liner code was used before 2019 changes, and no bug caused by that line. it just leaves one question: it relies on imc.GetImeCaretPos() returns zero when (lParam & GCS_CURSORPOS) is zero?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

johnsonj - 2023-06-21

One line code expects any value IMM32 can return.

If CS_NOMOVECARET is intruduced,
it should depends on that imc.GetImeCaretPos() return zero when (lParam & GCS_CURSORPOS) is zero.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Neil Hodgson - 2023-07-05

So korean ime move its ime caret blindly to the start of compStr.
This is a buggy behavior, not desirible.

How can I reproduce this bug?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- johnsonj - 2023-07-05
  
  Your question confirms current code should be reverted.
  It compromises readability.
  
  On korean ime, imc.GetImeCaretPos() should return the start position of the last character for imeblokcaretoverride regardless of CS_NOMOVECARET.
  If CS_NOMOVECARET, then imc.GetImeCaretPos() returns 0 which is the start position of the last character.
  Even If not CS_NOMOVECARET, it also returns the start position of the last character.
  
  Current code exploits imc.GetImeCaretPos() absolutely is 0 when CS_NOMOVECARET set.
  
  It is OK to never mind CS_NOMOVECARET.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
  - Neil Hodgson - 2023-07-05
    
    It compromises readability.
    
    That's all? There is no actual problem that can be seen by a user?
    
    If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
    - johnsonj - 2023-07-05
      
      A user can not see any actual problem.
      But an implementer should catch hidden problems.
      Here are same effect for korean on imm32.
      ime caret is located at:
      
      the start of compStr.
      
      the start of the last charecter.
      
      the position pointed by imm32.
      
      I think 3 is right behavior.
      
      Previous code 3 did show any actual problem to user neither.
      but current code selects 1 with a discriminative cause.
      
      This is not simplication, but reverting with a right cause.
      
      If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
      - johnsonj - 2023-07-06
        
        the start of compStr by CS_NOMOVECARET.
        
        the start of the last character by no CS_NOMOVECARET.
        
        the position pointed by IME regardless of CS_NOMOVECARET.
        
        IME knows at which ime caret should be located.
        so for one caracter input, it must be the start of one compStr by CS_NOMOVECARET.
        for clause input, it must be the start of the last character by no CS_NOMOVECARET.
        
        But for korean ime, current code implement as this.
        If (CS_NOMOVECARET) {
        // do nothing
        } else {
        imeEndToImeCaretU16 += imc.GetImeCaretPos();
        }
        
        Is this implemeted according to MS document?
        
        If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
    - johnsonj - 2023-07-05
      
      if (!(lParam & CS_NOMOVECARET) && (lParam & GCS_CURSORPOS)) { imeEndToImeCaretU16 += imc.GetImeCaretPos(); }
      
      equals to :
      
      if (KoreanIME) { //do nothing } esle { imeEndToImeCaretU16 += imc.GetImeCaretPos(); }
      
      Why does it adopt discrimination?
      The comments does not give clear explanations.
      
      Last edit: johnsonj 2023-07-05
      
      If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
      - Neil Hodgson - 2023-07-05
        
        Why does it adopt discrimination?
        
        It is implementing the behaviour specified by Microsoft documentation in WM_IME_COMPOSITION.
        
        There is some ambiguity over how to handle a message with both CS_NOMOVECARET and a non-zero GCS_CURSORPOS but the documentation for CS_NOMOVECARET is emphatic so that should receive priority.
        
        CS_NOMOVECARET Do not move the caret position as a result of processing the message.
        
        Code that implements a specification more fully and has no known problems should be preferred even when it covers cases that have not been observed. There are a wide variety of IMEs and some may produce these cases now or with future updates.
        
        If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
        
        johnsonj - 2023-07-06
        
        The reason why I want reverting is exactly for
        
        There are a wide variety of IMEs and some may produce these cases now or with future updates.
        
        It is desirable to implement ime codes as simple as possible.
        We do not use GCS_COMPCLAUSE for ime attributes.
        We do not CS_INSERTCHAR to insert characters.
        
        If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

johnsonj - 2023-07-06

from https://github.com/ONLYOFFICE/desktop-sdk/blob/master/ChromiumBasedEditors/lib/src/cef/windows/tests/cefclient/browser/osr_ime_handler_win.cc
.....

// **Retrieve the selection range information**. If CS_NOMOVECARET is specified // it means the cursor should not be moved and we therefore place the caret at // the beginning of the composition string. **Otherwise we should honour the // GCS_CURSORPOS value if it's available.** // TODO(suzhe): Due to a bug in WebKit we currently can't use selection range // with composition string. // See: https://bugs.webkit.org/show_bug.cgi?id=40805 if (!(lparam & CS_NOMOVECARET) && (lparam & GCS_CURSORPOS)) { // IMM32 does not support non-zero-width selection in a composition. So // always use the caret position as selection range. int cursor = ::ImmGetCompositionString(imc, GCS_CURSORPOS, nullptr, 0); composition_start = cursor; } else { composition_start = 0; }

.....

I think this is the case "Otherwise"
we should honour the GCS_CURSORPOS value.

CS_NOMOVECARET breakes the rule already since the caret moved to the end of compStr.

So does it break the rule for korean ime to honor ime caret potision which is pointed IME?

if ((lParam & CS_NOMOVECARET) && (lParam & GCS_CURSORPOS)) { imeEndToImeCaretU16 += imc.GetImeCaretPos(); //instead of "do nothing" for korean ime } else { imeEndToImeCaretU16 += imc.GetImeCaretPos(); }

Last edit: johnsonj 2023-07-06
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Neil Hodgson - 2023-07-06
  
  CS_NOMOVECARET breakes the rule already since the caret moved to the end of compStr.
  
  The caret movement is caused by reusing the implementation of InsertCharacter for displaying the composition string since it handles some complexities like multiple carets and protected regions. The caret position is corrected before it is visible to the user.
  
  So does it break the rule for korean ime to honor ime caret potision which is pointed IME?
  
  Yes, this is ignoring CS_NOMOVECARET for no visible improvement.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
  - johnsonj - 2023-07-06
    
    GCS_CURSORPOS is primary.
    but CS_NOMOVECARET is secondary.
    
    it really breaks the rule to ignore GCS_CURSORPOS.
    
    If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Neil Hodgson - 2023-07-06

status: open --> open-rejected
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Neil Hodgson - 2023-07-06

Rejecting as the change has no effect.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Neil Hodgson - 2023-07-26

status: open-rejected --> closed-rejected
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Korean Ime move its caret blindly.

Group

Searches

Help

#2390 Korean Ime move its caret blindly.

Discussion

Related