UTF-8 Support

  • Niels van Reijmersdal


    We recently updated our application to have full UTF-8 support, so I figured I would write some tests.

    I am trying to use the following commands:
    Typeline "Chinese: 我能吞下玻璃而不伤身体"
    Typeline "Russian: Я могу есть стекло, оно мне не вредит"

    They look good in the editor, so I thought it would work, but they are not sent to the VNC server.

    Nor are they saved correctly. If I reopen my .tpr file they look like:
    Typeline "Chinese: ???????????"
    Typeline "Russian: ? ???? ???? ??????, ??? ??? ?? ??????"

    I tried pasting to UltraVNC and RealVNC with their VNC client software, and that works like a charm, so I think the VNC server I am using supports UTF-8.

    I am running Java version 1.6.0_17 on Windows 7. I could try other platforms if you want, but I feel it's VNCRobot and not my environment.



  • Robert Pes

    Robert Pes - 2010-01-27

    There are clearly two separate problems here.

    First, the tool fails to save UTF-8 characters to a file and reopen them correctly. This works fine on systems with a UTF-8 locale, for example on Linux. On other platforms the underlying system may use a different default encoding, and the characters get garbled. I'll check in the code whether the tool can force UTF-8 regardless of the environment. For now let's treat this as a bug.
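
    The symptom can be reproduced outside the tool. The sketch below is only an illustration in Python, not T-Plan Robot code. It assumes the script was saved with a Windows default encoding (cp1252 is a guess; the thread does not say which one was in effect) and that unmappable characters are replaced with '?', as Java writers commonly do.

```python
# Assumption: the editor wrote the script with the platform default encoding.
# cp1252 is a common default on Western European Windows; the actual encoding
# on the reporter's machine is not stated in the thread.
PLATFORM_DEFAULT = "cp1252"

line = 'Typeline "Russian: Я могу есть стекло, оно мне не вредит"'

# errors="replace" substitutes '?' for every character the target encoding
# cannot represent, which is what the reopened .tpr file looked like.
saved_default = line.encode(PLATFORM_DEFAULT, errors="replace")
reopened_default = saved_default.decode(PLATFORM_DEFAULT)

# Forcing UTF-8 on both save and load round-trips the text unchanged.
saved_utf8 = line.encode("utf-8")
reopened_utf8 = saved_utf8.decode("utf-8")

print(reopened_default)
print(reopened_utf8 == line)
```

    The damage happens at save time: once the characters are written as '?', no choice of encoding on reopen can recover them.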

    Second, national characters cannot be sent to the server. This is a known limitation which is described in the T-Plan Robot documentation. The tool supports only ISO 8859-1 (Latin-1) characters, so from this point of view the behavior described above is expected.

    I'm also surprised to hear that pasting such text works on RealVNC and UltraVNC. Pasted text is transferred using the ClientCutText message, which is defined as follows:

    "The client has new ISO 8859-1 (Latin-1) text in its cut buffer. Ends of lines are represented by the linefeed / newline character (value 10) alone. No carriage-return (value 13) is needed. There is currently no way to transfer text outside the Latin-1 character set."  (RFB 3.8, page 26)

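    To make the limitation concrete, here is a minimal Python sketch of how a client could serialize ClientCutText, following the RFB layout (message-type 6, three padding bytes, a 32-bit length, then the text). The helper name `build_client_cut_text` is hypothetical and this is not the tool's actual code.

```python
import struct

CLIENT_CUT_TEXT = 6  # message-type assigned to ClientCutText by RFB


def build_client_cut_text(text: str) -> bytes:
    """Hypothetical helper: serialize text as an RFB ClientCutText message."""
    # RFB represents line ends with a lone linefeed (value 10).
    normalized = text.replace("\r\n", "\n")
    # Raises UnicodeEncodeError for any character outside Latin-1.
    payload = normalized.encode("latin-1")
    # U8 message-type, 3 padding bytes, U32 big-endian length, then the text.
    return struct.pack(">B3xI", CLIENT_CUT_TEXT, len(payload)) + payload


print(build_client_cut_text("ok\r\n").hex())

try:
    build_client_cut_text("Я могу есть стекло")
except UnicodeEncodeError as exc:
    print("cannot send:", exc.reason)
```

    A strict client has no valid way to put the Russian or Chinese strings into this message, which is why pasting them would have to rely on something beyond the base protocol.
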
    I'll review the code to see whether support for other character sets could be implemented. Because Java defines its own cross-platform virtual key set, which needs to be mapped onto the X Window System key set used by VNC, such a feature may or may not be feasible. The fact that the other VNC products support it is irrelevant, because they are built in a platform-specific way and such functionality may be a proprietary extension outside the scope of the RFB protocol.
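
    As a rough idea of what such a mapping involves, the Python sketch below turns a character into an X keysym and wraps it in an RFB KeyEvent message (message-type 4, down-flag, two padding bytes, 32-bit keysym). It relies on the X11 convention that Latin-1 keysyms equal their code points and that other Unicode characters map to 0x01000000 plus the code point. Both function names are hypothetical, this is not how T-Plan Robot is implemented, and whether a given VNC server honors Unicode keysyms is server-specific.

```python
import struct

KEY_EVENT = 4  # message-type assigned to KeyEvent by RFB


def keysym_for_char(ch: str) -> int:
    """Hypothetical mapping from one character to an X keysym."""
    cp = ord(ch)
    # Printable Latin-1 characters use their code point as the keysym.
    if 0x20 <= cp <= 0x7E or 0xA0 <= cp <= 0xFF:
        return cp
    # X11 convention for other Unicode characters: 0x01000000 + code point.
    if cp >= 0x100:
        return 0x01000000 + cp
    # Control characters need dedicated keysyms (Return, Tab, ...), not covered here.
    raise ValueError(f"no direct keysym for U+{cp:04X}")


def build_key_event(keysym: int, down: bool) -> bytes:
    """Hypothetical helper: serialize an RFB KeyEvent message."""
    # U8 message-type, U8 down-flag, 2 padding bytes, U32 big-endian keysym.
    return struct.pack(">BB2xI", KEY_EVENT, 1 if down else 0, keysym)


for ch in "AЯ":
    sym = keysym_for_char(ch)
    # A key press followed by a release types the character once.
    print(hex(sym), (build_key_event(sym, True) + build_key_event(sym, False)).hex())
```

    The sketch covers only the wire format. A Java client would still have to get from its own virtual key codes or typed characters to these keysyms, which is the part whose feasibility is in question.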

  • Robert Pes

    Robert Pes - 2010-01-28

    Two artifacts were created to keep track of the discussed items:

    1. Bug 2941550 tracks the garbled UTF-8 characters in the editor. This issue has been fixed and the fix will be released in v2.0.3.

    2. Support for UTF-8 character transfer over RFB was filed as feature request 2941597, and its feasibility will be considered for v2.1.

  • Niels van Reijmersdal

    Thanks for the follow up.

    For now I will just open a document containing the UTF-8 characters/strings I need on the SUT, then use the OS cut-and-paste features to get them into my application where needed, instead of sending the characters from the robot software.

  • Robert Pes

    Robert Pes - 2010-01-28

    Your approach will work as well. I used it in the past when I was doing localization and internationalization testing.

  • Robert Pes

    Robert Pes - 2010-03-29

    The fix for bug 2941550 was released in 2.0.3 today.

