
#196 UTF-8 support

Milestone: Not planned
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2024-02-07
Created: 2023-12-22
Creator: David
Private: No

Hello.
I would appreciate it if files could be saved in UTF-8, so that all characters of my language can be displayed.

The program currently saves and displays only the characters contained in the (8-bit) Windows-1252 character set. In "Options" > "Fonts" I can switch "Script" from Western to another character set, but this setting is not saved. Because of this, I cannot give a station in the plan its real name. The same happens when I create custom track parts.

Thank you,
David


Discussion

  • Martin Fischer

    Martin Fischer - 2023-12-23

    Hello David,
    which version of XTrackCAD are you running?
    In 2020 I added some code to XTrackCAD that converts strings to UTF-8 on save and back again on load. I ran some tests with version 5.2.2GA and found the following.
    In my German environment this seems to work for texts on the layout, stations in elevations and custom parts. I found a bug in notes which needs to be fixed.

    It would be helpful if you could send me a text file containing your special characters to explore further.

     
    • David

      David - 2024-02-04

      Hi Martin.
      Thanks for the advice, but now it's even worse.

      I had version 5.2.0GA (32 bit) and now I have 5.2.2aGA (64 bit). Before, at least the text was saved and displayed the same way every time, and it could be temporarily fixed by changing the encoding. But now the text is saved in UTF-8 and decoded in Windows-1252, so it's an unreadable mess.

      This is what I put in the textbox (characters from Windows-1250 encoding):

      ÄäÁáÂâĂ㥹ĆćÇçČčĎďĐđ
      ËëÉéĚěĘęÍíÎîĹ弾łŁŃńŇň
      ÖöÓóÔôŐőŔŕŘřŚśŞşßŠšŤťŢţ™
      ÜüÚúŮůŰűÝýŻżŹźŽž–˝—‘’‚“”„…
      

      This is what shows on the plan until I switch the encoding in the menu. The original characters are displayed in the editing window. Until version 5.2.0 this was also what got saved and loaded the next time:

      ÄäÁáÂâÃã¥¹ÆæÇçÈèÏïÐð
      ËëÉéÌìÊêÍíÎîÅå¼¾³£ÑñÒò
      ÖöÓóÔôÕõÀàØøŒœªºßŠšÞþ™
      ÜüÚúÙùÛûÝý¯¿ŸŽž–½—‘’‚“”„…
      

      This is what appears in the text window and on the plan in version 5.2.2aGA when I save and reopen the file with the above text:

      ÄäÁáÂâĂ㥹ĆćÇçČčĎďĐđ
      ËëÉéĚěĘęÍíÎîĹ弾łŁŃńŇň
      ÖöÓóÔôŐőŔŕŘřŚśŞşßŠšŤťŢţ™
      ĂśĂĽĂšĂşĹ®ĹŻĹ°Ĺ±ĂťĂ˝Ĺ»ĹĽĹąĹşĹ˝Ĺľâ€“Ëťâ€”â€˜â€™â€šâ€śâ€ťŠş
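
The garbled last line looks like UTF-8 bytes that were read back through an 8-bit table without conversion. The sketch below reproduces the first four characters under the assumption that the table is Windows-1250 (the sample matches that table; the actual code path inside XTrackCAD is not known here). It also shows that this kind of damage can be reversed as long as every byte survived.

```python
# Illustration only: UTF-8 bytes misread as Windows-1250, and the way back.
original = "ÜüÚú"

utf8_bytes = original.encode("utf-8")      # two bytes per character here
garbled = utf8_bytes.decode("cp1250")      # wrong table applied on load

# Undo the damage: back to the raw bytes, then decode them correctly.
repaired = garbled.encode("cp1250").decode("utf-8")

print(garbled)
print(repaired)
```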
      

      In addition, I have noticed that settings have recently started to get lost. I save the metric system setting, and the next time I start the program it is back to imperial.

      David

       
      • Martin Fischer

        Martin Fischer - 2024-02-05

        Hi David,

        sorry for making things worse.

        Let me briefly describe how XTrackCAD handles text. Internally, strings are processed using the system codepage. A conversion to or from UTF-8 takes place only when saving or loading.
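
The scheme described above can be sketched in Python as follows. This is not the XTrackCAD implementation: the function names and the `SYSTEM_CODEPAGE` constant are hypothetical, and Windows-1250 is only assumed as the system codepage for the example.

```python
# Hypothetical sketch: codepage bytes in memory, UTF-8 bytes in the file.
SYSTEM_CODEPAGE = "cp1250"   # assumption for this example


def to_file_bytes(internal: bytes, codepage: str = SYSTEM_CODEPAGE) -> bytes:
    """Convert an in-memory codepage string to UTF-8 for saving."""
    return internal.decode(codepage).encode("utf-8")


def from_file_bytes(stored: bytes, codepage: str = SYSTEM_CODEPAGE) -> bytes:
    """Convert UTF-8 from the file back to the in-memory codepage."""
    return stored.decode("utf-8").encode(codepage)


internal = "Žďár".encode("cp1250")
stored = to_file_bytes(internal)

# Same codepage on save and load: the round trip is lossless.
print(from_file_bytes(stored) == internal)

# Different codepage on load: "ď" does not exist in Windows-1252.
try:
    from_file_bytes(stored, "cp1252")
except UnicodeEncodeError as exc:
    print("cannot represent:", exc.object[exc.start:exc.end])
```

The round trip only works when the codepage used on load can represent every character that was saved, which is why a file written on one system may not load cleanly on a system with a different codepage.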

        Basically you found problems in two steps:

        Case a (the lower sample): this is what happens when no conversion from UTF-8 takes place. I can't reproduce that with 5.2.2a on my system. Which feature of XTrackCAD did you use: File > Notes or Draw > Text?

        Case b: the conversion obviously happens but isn't reversible in this case. I will have to examine this more closely. That might be difficult, as my German codepage contains only a few umlauts.

        It would be helpful if you could give me some more information:

        • the XTC file used to create these examples
        • screenshots of the font dialog for the positive and negative cases.

        Greetings
        Martin

         
        • David

          David - 2024-02-07

          It's hard to reproduce the problem reliably because I haven't figured out yet how it happens. The bad save occurs somewhere between opening a file with those characters, changing the encoding, inserting another object, and saving.

          Part of the file that got corrupted

          DRAW 1 0 0 0 0 0.000000 0.000000 0 0.000000
              Z 0 7.874016 7.874016 0.000000 0 48.000000 "”„"
              END$SEGS
          

          Below I list the characters at the end of the second line byte by byte (hex):

          What I typed in the program and expected

          ”   E2 80 9D
          „   E2 80 9E
          …   E2 80 A6
          "   22
          CR  0D
          LF  0A
          

          Good file

          ”   E2 80 9D
          „   E2 80 9E
          "   22
          CR  0D
          LF  0A
          

          (… is missing)

          Corrupted file

          ”   E2 80 9D
          (?) 8A BA
          "   22
          CR  0D
          LF  0A
          

          (… is missing, „ is replaced by bytes that are not valid UTF-8)
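
The byte listings above can be checked mechanically. The following Python sketch uses a hypothetical helper, `first_invalid_utf8_offset`, to report where a byte sequence stops being valid UTF-8. The two byte strings are copied from the "Good file" and "Corrupted file" listings.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that breaks UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


good = bytes.fromhex("E2 80 9D E2 80 9E 22 0D 0A")
bad = bytes.fromhex("E2 80 9D 8A BA 22 0D 0A")

print(first_invalid_utf8_offset(good))   # valid throughout
print(first_invalid_utf8_offset(bad))    # offset of the stray 0x8A byte

# The two stray bytes, read as Windows-1250, are the "Šş" that ends the
# garbled sample line earlier in this thread.
print(bad[3:5].decode("cp1250"))
```

0x8A can only be a continuation byte in UTF-8, so it is rejected when it appears where a new character should start.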

           
  • Martin Fischer

    Martin Fischer - 2024-02-06

    David,

    your email address on SF does not work. Could you please contact me directly at martinfischer2008 at t-online dot de?

    I would like to exchange test builds directly

    Greetings
    Martin

     
    • David

      David - 2024-02-07

      Hi Martin.
      I'm afraid nothing can be sent to the address you provided, neither directly nor through the SourceForge form.

      David


      E-mail

      Your message could not be delivered. Reason: Unknown recipient.

      sourceforge.net

      Unable to send email. The address does not exist.

       
