
The question of text-handling

2018-09-18
  • Pete Maclean

    Pete Maclean - 2018-09-18

    I have been giving thought to how best to handle text in Hermes given the desire to produce versions for Linux and Mac. Broadly there are three options, as follows:

    (1) Handle all text as UTF-8 and, in Windows, convert to UTF-16 at a low level when dealing with the Windows API.

    (2) Handle all text as UTF-16 and, in Linux, convert to UTF-8 at a low level when calling Linux APIs.

    (3) Use MSVC's TCHAR mechanism which allows text to be either UTF-8 or UTF-16 depending on a compilation option. For Windows we would use UTF-16 and for Linux UTF-8. I am unclear which would be better for Mac.
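The low-level conversion that options (1) and (2) rely on can be sketched in portable C++. This is only an illustration, not Hermes code: the helper name `utf8_to_utf16` is hypothetical, and the decoder is deliberately simplified (it rejects truncated or malformed sequences but does not reject overlong encodings or encoded surrogates). It uses `std::u16string` so that the result is UTF-16 on every platform, whatever the width of `wchar_t`.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Hermes): decode UTF-8 bytes into
// UTF-16 code units. Simplified: overlong forms and encoded surrogates
// are not rejected.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");

        if (cp < 0x10000) {
            // Basic Multilingual Plane: one UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary plane: encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows the same job is normally done by the `MultiByteToWideChar` API with the `CP_UTF8` code page; a hand-written routine like this would matter mainly for option (2) on Linux.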

    I dislike (1) mostly because it would require the most extensive changes for Windows and I think we should give priority to getting a Windows version completed with as few changes as practical. I dislike (2) because, while there can be no problem storing text as UTF-16 on all platforms, facilities for manipulating text as UTF-16 may be limited on Linux (and possibly Mac). The bottom line is that I find myself favouring (3). It makes for great consistency. Everywhere text is stored in char's it is assumed to be UTF-8 (except of course in contexts where it is being converted from other single- and multi-byte character sets to UTF-8) and everywhere text is stored in wchar_t's it is assumed to be UTF-16.
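A minimal sketch of what option (3) might look like outside MSVC, where `<tchar.h>` is not available. The names `hchar`, `hstring` and `HTEXT` are hypothetical, chosen only for illustration; the real `TCHAR` and `_T()` come from the Windows headers and are switched by the `UNICODE` and `_UNICODE` macros. The sketch assumes that narrow text is treated as UTF-8 by convention, since the compiler does not enforce any encoding for `char`.

```cpp
#include <string>

// Hypothetical TCHAR-like aliases (not the real <tchar.h> names).
#if defined(_WIN32)
    // Windows build: wide characters, treated as UTF-16.
    using hchar = wchar_t;
    #define HTEXT(s) L##s
#else
    // Linux/Mac build: narrow characters, treated as UTF-8 by convention.
    using hchar = char;
    #define HTEXT(s) s
#endif

// String type built on whichever character type the platform selected.
using hstring = std::basic_string<hchar>;

// Example of code written once against the alias.
inline hstring default_mailbox_name() {
    return HTEXT("Inbox");
}
```

Code written against `hstring` compiles unchanged on either platform, but anything that counts characters or indexes into the string still has to remember that the code-unit size differs between the two builds.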

    Do we have any consensus on this? Before making a final decision, it would be useful to know what wxWidgets works with -- I hope either. Soren, could you answer this for us, please?

     
    • Soren Bro

      Soren Bro - 2018-09-18

      (AFK)

      wxWidgets works internally with wxString, but we can use std::string and
      std::wstring, only having to convert when calling wxWidgets functions.
      Also, wxString was designed to work almost exactly like the std:: versions. I
      think std::wstring is UTF-8, but let me get home to be sure. I'm a little
      pressed here...

      In the meantime:

      https://docs.wxwidgets.org/3.1/classwx_string.html

      https://en.cppreference.com/w/cpp/string

      https://stackoverflow.com/questions/4588302/why-isnt-wchar-t-widely-used-in-code-for-linux-related-platforms

      Regards

      The question of text-handling
      https://sourceforge.net/p/hermesmail/discussion/general/thread/366d30d6/?limit=25#e917


      Sent from sourceforge.net because you indicated interest in
      https://sourceforge.net/p/hermesmail/discussion/general/

      To unsubscribe from further messages, please visit
      https://sourceforge.net/auth/subscriptions/

      --
      Søren Bro Thygesen

       
    • Soren Bro

      Soren Bro - 2018-09-18

      (AFK)

      But if you ask me if I agree with your deductions I do.

      Regards


       
      • Soren Bro

        Soren Bro - 2018-09-18

        (AFK)

        I'm talking Linux only here right now. I assumed, perhaps wrongly,
        that's what you asked.

        Regards


         
        • Soren Bro

          Soren Bro - 2018-09-18

          (AFK)

          ...and I meant about the size/type of std::wstring (wchar_t)

          Regards


           
          • Soren Bro

            Soren Bro - 2018-09-18

            (AFK)

            No wait. I'm on 64-bit Linux. I'm not thinking straight. Let me get home
            and I'll check to be absolutely sure....

            Regards


             
            • Soren Bro

              Soren Bro - 2018-09-18

              (AFK)

              But that doesn't change the fact that I agree with your suggestion no. 3.
              That'll do fine.

              Regards


               
              • Soren Bro

                Soren Bro - 2018-09-18

                (AFK)

                Again, with the reservation that Mac is the joker here. I have zero
                experience with that. There are, however, examples of Mac builds in the
                samples, which BTW don't compile "out of the box". The configure script
                and makefile are regular nightmares.

                I may just start with CodeLite after all....

                Regards


                 
                • Soren Bro

                  Soren Bro - 2018-09-18

                  (destination home)

                  I'm also a little stressed out by the fact that I have installed and
                  uninstalled wxWidgets so many times now on Debian that I'm considering
                  reinstalling Linux. That wouldn't be a big deal if it weren't for all the
                  stuff I'd have to back up.....

                  Regards

                  On Tuesday, September 18, 2018, Soren Bro sbrothy@users.sourceforge.net
                  wrote:

                  (AFK)

                  Again, with the reservation that MAC is the joker here. I have zero
                  experience with that. There are however examples of MAC builds in their
                  samples. Which BTW doesn't compile "out of the box". The configure and
                  makefile are regular nightmares.

                  I may just start with codelite after all....

                  Regards

                  On Tuesday, September 18, 2018, sbrothy@gmail.com wrote:

                  (AFK)

                  But that doesn't change the fact that I agree with your suggestion no 3.
                  That'll do fine.

                  Regards

                  On Tuesday, September 18, 2018, sbrothy@gmail.com wrote:

                  (AFK)

                  No wait. I'm on 64-bit Linux. I'm not thinking straight. Let me get home
                  and I'll check to be absolutely sure....

                  Regards

                  On Tuesday, September 18, 2018, Soren Bro sbrothy@users.sourceforge.net
                  wrote:

                  (AFK)

                  ...and I meant about the size/type of std::wstring (wchar_t)

                  Regards

                  On Tuesday, September 18, 2018, Soren Bro sbrothy@users.sourceforge.net
                  wrote:

                  (AFK)

                  I'm talking Linux only here right now. I assumed, perhaps wrongfully,
                  that's what you asked.

                  Regards

                  On Tuesday, September 18, 2018, sbrothy@gmail.com wrote:

                  (AFK)

                  But if you ask me if I agree with your deductions I do.

                  Regards

                  On Tuesday, September 18, 2018, Pete Maclean petemaclean@users.
                  sourceforge.net petemaclean@users.sourceforge.net wrote:

                  I have been giving thought to how best to handle text in Hermes given the
                  desire to produce versions for Linux and Mac. Broadly there are three
                  options, as follows:

                  (1) Handle all text as UTF-8 and, in Windows, convert to UTF-16 at a low
                  level when dealing with the Windows API.

                  (2) Handle all text as UTF-16 and, in Linux, convert to UTF-8 at a low
                  level when calling Linux APIs.

                  (3) Use MSVC's TCHAR mechanism which allows text to be either UTF-8 or
                  UTF-16 depending on a compilation option. For Windows we would use UTF-16
                  and for Linux UTF-8. I am unclear which would be better for Mac.

                  I dislike (1) mostly because it would require the most extensive changes
                  for Windows and I think we should give priority to getting a Windows
                  version completed with as few changes as practical. I dislike (2)
                  because,
                  while there can be no problem storing text as UTF-16 on all platforms,
                  facilities for manipulating text as UTF-16 may be limited on Linux (and
                  possibly Mac). The bottom line is that I find myself favouring (3). It
                  makes for great consistency. Everywhere text is stored in char's it is
                  assumed to be UTF-8 (except of course in contexts where it is being
                  converted from other single- and multi-byte character sets to UTF-8) and
                  everywhere text is stored in wchar_t's it is assumed to be UTF-16.

                  Do we have any consensus on this? Before making a final decision, it
                  would be useful to know what WxWidgets works with -- I hope either.
                  Soren,
                  could you answer this for us, please.


                  The question of text-handling
                  https://sourceforge.net/p/hermesmail/discussion/general/
                  thread/366d30d6/?limit=25#e917


                  Sent from sourceforge.net because you indicated interest in
                  https://sourceforge.net/p/hermesmail/discussion/general/

                  To unsubscribe from further messages, please visit
                  https://sourceforge.net/auth/subscriptions/

                  --
                  Søren Bro Thygesen


                   
                  • Soren Bro

                    Soren Bro - 2018-09-18

                    sizeof(wchar_t) on Debian Linux: 4
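
That value matters for option (3): with a 4-byte wchar_t, wide strings on Linux conventionally hold UTF-32 code units rather than UTF-16, so "wchar_t means UTF-16" only holds on Windows. A minimal sketch of a check that makes the difference explicit follows; the helper names wide_encoding and wide_units_for are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the encoding a wchar_t string conventionally holds,
// judged only by the width of wchar_t on the current platform.
constexpr const char* wide_encoding() {
    return sizeof(wchar_t) == 2 ? "UTF-16"
         : sizeof(wchar_t) == 4 ? "UTF-32"
                                : "unknown";
}

// Hypothetical helper: how many wchar_t units one code point needs.
// Only a 2-byte wchar_t needs a surrogate pair above U+FFFF.
constexpr int wide_units_for(char32_t cp) {
    return (sizeof(wchar_t) == 2 && cp > 0xFFFF) ? 2 : 1;
}
```

Code that indexes into a std::wstring or takes its size() will therefore behave differently on the two platforms for characters outside the Basic Multilingual Plane.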

                    Regards.

                    On Tue, Sep 18, 2018 at 6:05 PM Soren Bro sbrothy@users.sourceforge.net
                    wrote:

                    (destination home)

                    I'm also a little stressed out by the fact that I have installed and
                    uninstalled wxWidgets so many times on Debian by now that I'm considering
                    reinstalling Linux. That wouldn't be a big deal if it weren't for all the
                    stuff I'd have to back up.

                    Regards

                    On Tuesday, September 18, 2018, Soren Bro sbrothy@users.sourceforge.net
                    wrote:

                    (AFK)

                    Again, with the reservation that Mac is the joker here. I have zero
                    experience with that. There are, however, examples of Mac builds in the
                    wxWidgets samples, which BTW don't compile "out of the box". The configure
                    script and makefile are regular nightmares.

                    I may just start with codelite after all....

                    Regards

                    On Tuesday, September 18, 2018, sbrothy@gmail.com wrote:

                    (AFK)

                    But that doesn't change the fact that I agree with your suggestion no. 3.
                    That'll do fine.

                    Regards

                    On Tuesday, September 18, 2018, sbrothy@gmail.com wrote:

                    (AFK)

                    No wait. I'm on 64-bit Linux. I'm not thinking straight. Let me get home
                    and I'll check to be absolutely sure....

                    Regards

                    On Tuesday, September 18, 2018, Soren Bro sbrothy@users.sourceforge.net
                    wrote:

                    (AFK)

                    ...and I meant about the size/type of std::wstring (wchar_t)

                    Regards

                    On Tuesday, September 18, 2018, Soren Bro sbrothy@users.sourceforge.net
                    wrote:

                    (AFK)

                    I'm talking Linux only here right now. I assumed, perhaps wrongly,
                    that's what you asked.

                    Regards

                    On Tuesday, September 18, 2018, sbrothy@gmail.com wrote:

                    (AFK)

                    But if you ask me if I agree with your deductions I do.

                    Regards




                     
