
Screen section - UTF-8 locale - ACCEPT does not show é § è ç à ö

Anonymous
2020-05-21
2025-04-27
1 2 3 > >> (Page 1 of 3)
  • Anonymous

    Anonymous - 2020-05-21

    I wrote a program using the SCREEN SECTION.

    The program, compiled with GnuCOBOL 2.2 (and later also with GnuCOBOL 3.0-rc1.0), works fine except for one runtime issue:
    I can't get the input screen to accept the characters é § è ç à ö.

    The input screen always shows 2 blank spots at the position where I typed one of these characters.
    I use Ubuntu 18.04 LTS on a laptop with a fr_BE keyboard, but the same issue happens on a Google Cloud Engine virtual machine with Ubuntu 20.04 LTS and GnuCOBOL 2.2.

    I have been looking for a solution on different websites but found none:
    * locale is en_US.UTF-8; changing to fr_BE.UTF-8 did not change anything
    * using the LOCALE statements from the manual does not change anything
    * using the drawbox.cob example, I have to define the X fields as X(2) to make the characters visible after ACCEPT/DISPLAY (not screen def), while using the SCREEN SECTION the characters are again replaced by 2 blank characters

    It seems to be a double-byte character issue, but I can't find a solution.

    Can anyone help me with this ?

    Kind regards,

    J.M.

     
  • Simon Sobisch

    Simon Sobisch - 2020-05-21

    When your terminal adds two characters instead of one, this means it really uses UTF-8, as set up, but doesn't display it accordingly. I think you use ncursesw (cobcrun --info will tell you something about this), correct?
    As you likely use PIC X, and if you specify 50 bytes you also want to be able to use 50 character positions, you may simply fall back to ISO-8859-1, which should likely be the case if you export LANG=fr_BE before running the program.

     
  • Anonymous

    Anonymous - 2020-05-21

    FYI: programs added as attachments (cobc info and other information in source code comments).

    Program test only uses DISPLAY/ACCEPT without SCREEN SECTION.

    Program test2 only uses DISPLAY/ACCEPT with screens defined in the SCREEN SECTION.
    In test2, characters like é § è ç à ö are not accepted as input and result in a blank screen entry.

    How can I remedy this?

     
  • Anonymous

    Anonymous - 2020-05-21

    @Simon:

    Thanks for your reply - I just added 2 test programs and yes, ncursesw is used.

    ncursesw was installed when I installed GnuCOBOL 3.0-rc1 from the website, following the procedure described.

    On the other hand, on my Google Cloud Engine virtual machine GNUCOBOL 2.2.0 was installed via sudo apt install gnucobol and also uses ncursesw.

    How can I install with ncurses instead of ncursesw, if that is the issue ?

     
  • Anonymous

    Anonymous - 2020-05-21

    Changing the locale to fr_BE does not change the situation.

    $ export LANG=fr_BE

    $ locale
    locale: Cannot set LC_CTYPE to default locale: No such file or directory
    locale: Cannot set LC_MESSAGES to default locale: No such file or directory
    locale: Cannot set LC_ALL to default locale: No such file or directory
    LANG=fr_BE
    LANGUAGE=en_US
    LC_CTYPE="fr_BE"
    LC_NUMERIC=en_US.UTF-8
    LC_TIME=en_US.UTF-8
    LC_COLLATE="fr_BE"
    LC_MONETARY=en_US.UTF-8
    LC_MESSAGES="fr_BE"
    LC_PAPER=en_US.UTF-8
    LC_NAME=en_US.UTF-8
    LC_ADDRESS=en_US.UTF-8
    LC_TELEPHONE=en_US.UTF-8
    LC_MEASUREMENT=en_US.UTF-8
    LC_IDENTIFICATION=en_US.UTF-8
    LC_ALL=

     
  • Anonymous

    Anonymous - 2020-05-21

    With program test (without SCREEN SECTION) and LANG=en_US.UTF-8 I can still input 2 characters; even é§ is OK.
    Changing the locale doesn't seem to have any effect.

    Is this related to ncursesw ?

     
  • Anonymous

    Anonymous - 2020-05-22

    Did not find this to be related to the terminal application or the font used in the terminal either.
    I use LXTerminal on my Ubuntu 18.04 LTS / GNUCOBOL 3.0-rc1 physical machine
    and I use SSH via https://ssh.cloud.google.com to access my GCE virtual machine with Ubuntu 20.04 LTS / GNUCOBOL 2.2.0.
    Both don't accept the characters é § è ç à ö via a screen defined in SCREEN SECTION.
    Also: same test programs on both machines (cfr. message above with programs attached).

     
  • Anonymous

    Anonymous - 2020-05-23

    No success yet.
    Uninstalled 3.0-rc1 and installed GnuCOBOL 2.2-disco: same issue.
    Changing locales via "export LANG=fr_BE.iso885915@euro" and others did not work.

     
  • Anonymous

    Anonymous - 2020-05-23

    Now also uninstalled GnuCOBOL 2.2-disco.

    Installed open-cobol1 via "sudo apt install open-cobol": issue still present.
    Only difference: the terminal window becomes white with black characters, and é§èçàö are each represented by a single "?" character instead of two blanks.

    Giving up at this time.

    I have been working with GnuCOBOL 2.x for a couple of years without issues, on Ubuntu and on MS-Windows.
    I suppose this issue must be linked to (locales at) Ubuntu (both 18.04 LTS on physical machine and 20.04 LTS on Google Cloud Engine).
    Weird !

    Any help welcome - will check for answers on a regular basis.

     
  • Anonymous

    Anonymous - 2020-05-24

    UPDATE - May 24, 2020

    I installed an UBUNTU 20.04 LTS machine from scratch and installed GnuCOBOL 2.2.

    The issue described above is still present.

    The issue is only present when:

    * using DISPLAY or ACCEPT together with "LINE ll COL cc" or "AT lllccc" (line/col in numeric format)
    * using the SCREEN SECTION definition (which of course uses LINE/COL too)

    GnuCOBOL install and cobc display no error/warning messages.

     
  • Anonymous

    Anonymous - 2020-05-27

    UPDATE - May 27, 2020

    Just installed GnuCOBOL 3.1-dev.0 on a MS-Windows 10 Pro (English) machine with a Belgian keyboard (cobc --info attached). Installation executed via Arnold Trembley's latest build environment (version 16MAY2020).

    Compiling with cobc does not indicate any error or warning during compilation, but the issue with (French) accented characters like é § è ç à ö is also reproduced on this installation.
    In terminal (Command Prompt) the accented characters display well. As soon as the GnuCOBOL application is run, the accented characters display 2 blanks (and a 'beep') if the program uses DISPLAY/ACCEPT with LINE/COL or if the program uses SCREEN SECTION to accept user input.

    So, to summarize: I face the same issue on 5 different machines, after a lot of trial and error with Linux locales:
    * laptop with Ubuntu 18.04 LTS (tried with GnuCOBOL 2.2 and 3.0-rc1)
    * laptop with Ubuntu 20.04 LTS (tried with GnuCOBOL 2.2 and 3.0-rc1)
    * laptop with MS-Windows 10 Pro (tried with GnuCOBOL 3.1-dev.0)

    Any suggestions how to solve this issue ?

     
    • Arnold Trembley

      Arnold Trembley - 2020-05-28

      I don't know if there is a workable solution. I downloaded the latest PDCurses 4.1.99 from
      https://github.com/Bill-Gray/PDCurses
      which was updated about 5 hours ago.

      I built PDCurses 4.1.99 with
      make -f Makefile.mng INFOEX=N DLL=Y WIDE=Y UTF8=Y
      and with GnuCOBOL 3.1-dev r3580, and then ran the modified zztest2.cob program.
      That one displays the requested characters with row column, but the box drawing characters are corrupted.

      I then built PDCurses 4.1.99 with
      make -f Makefile.mng INFOEX=N DLL=Y (note: no UTF8 support!)
      and GnuCOBOL 3.1-dev r3580, and ran zztest2.cob

      This time the box drawing characters were correct but the requested special characters were wrong.

      Both those builds report PDCurses in cobcrun --info as follows:
      extended screen I/O : pdcurses, version 4.1.99 (CHTYPE=64, WIDE=0)
      mouse support : yes
      in other words, they don't say whether or not UTF8 support is present, even though it obviously changes the test results.

      I did not attempt to build PDCurses 4.1.99 with CHTYPE32=Y, because David Wall's testing suggests that doesn't work, nor did I try building PDCurses 4.1.99 with WinGUI instead of WinCon, also because David Wall's testing suggests that won't work either. It's possible WinGui would behave differently with the most recent download of PDCurses 4.1.99.

      I also didn't test for mouse support, maybe I can get to that tomorrow.

      The separate colors.cbl program seems to produce the desired results for either build of PDCurses 4.1.99 (or 4.1.1 from an earlier test). But the build of GC31 that you downloaded from my website was built with PDCurses 4.1.0 from last March, and has less desirable results.

      PDCurses 4.2.0 is expected soon, but there are still unresolved issues with CHTYPE32, UTF8, and WinGui. See related thread:
      https://sourceforge.net/p/open-cobol/discussion/help/thread/bc0f3d2ea5/

      Kind regards,

       

      Last edit: Arnold Trembley 2020-05-28
      • Simon Sobisch

        Simon Sobisch - 2020-05-28

        The UTF8 builds (I try to get this shown in cobcrun --info) won't work with codepage 437 or similar encoded box drawing characters which you need if you use the standard cmd.exe with the standard locale settings.

        If you use the UTF8 versions of the box drawing characters everything should be fine in these builds; check the Wikipedia article Box-drawing_character for a list and their hex values.

        Note: this will (obviously) break the option to use same characters for "simple" ACCEPT/DISPLAY, but you seem to be able to fix this by enabling Windows Beta UTF8-Support.

        When you use a setup like this (globally enabled UTF-8) + Windows UTF-8 support you can easily change your source to use UTF-8 encoding, too (just keep in mind that some characters will take more than one byte) and then can replace the hex characters by the actual characters if you like to.
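        To illustrate the encoding difference described above, here is a minimal shell sketch (not part of the original post). It assumes the top-left box corner as the example character: in codepage 437 it is the single byte DA, while in UTF-8 the same glyph (U+250C) is the three-byte sequence E2 94 8C. The variable names are made up for the example.

```shell
# Top-left box corner, written with octal escapes so the bytes are explicit.
cp437_corner=$(printf '\332')          # CP437: one byte, hex DA
utf8_corner=$(printf '\342\224\214')   # UTF-8: three bytes, hex E2 94 8C (U+250C)

# Dump each encoding as hex, stripping the padding od adds.
cp437_hex=$(printf '%s' "$cp437_corner" | od -An -tx1 | tr -d ' \n')
utf8_hex=$(printf '%s' "$utf8_corner" | od -An -tx1 | tr -d ' \n')

echo "CP437 bytes: $cp437_hex"
echo "UTF-8 bytes: $utf8_hex"
```

        A source file that hard-codes the one-byte hex value will therefore only draw correctly on a codepage 437 style console, and a UTF-8 build needs the three-byte form instead.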

        I think that the CHTYPE_32 color issue is possibly the result of not building both PDCurses and GnuCOBOL with CHTYPE_32 - the recent version of PDCurses 4 (master snapshot 6 hours ago) will not allow this any more (you'll see linking errors). Can somebody please check if a "clean" CHTYPE_32 build still has the color issue?

         
  • Anonymous

    Anonymous - 2020-05-28

    Arnold, Simon,
    Thanks for checking this out and for your answers.
    This weekend I will try out some options and publish the results here.
    I would like to have a clean GnuCOBOL 2.2 (or 3.x) system on Ubuntu 20.04 LTS and on MS-Windows 10 Pro, both working with a 'standard' Belgian keyboard. My OS's are installed as English, but the text I/O needs to be able to handle accented characters on screen (since I'm writing in Dutch, French and German, all characters supported by the BE keyboard).
    I also wish to use the SCREEN SECTION (or at least LINE/COL DISPLAY/ACCEPT) in my COBOL programs.
    Keep up the good job with GnuCOBOL !
    Kind regards,
    J.M.

     
    • Simon Sobisch

      Simon Sobisch - 2020-05-28

      In this case I'd try to go full UTF-8.

       
  • Anonymous

    Anonymous - 2020-05-28

    What do I have to do/install to go full UTF-8 ?
    I will also have to check how to compile from source with PDCurses (my current build is ncursesw).

     
  • Vincent (Bryan) Coen

    Go into settings, language and/or locale; it should be in one of them.
    For Windows you will need to use Settings and System, depending on which version you use; I am typing this under Linux and my Win laptop is shut down.

     
  • Anonymous

    Anonymous - 2020-05-29

    Vincent,
    Thanks for your reply.
    My standard installation (even a clean install of Ubuntu 20.04 LTS on Google Cloud Platform) has en_US.UTF-8 as the default locale setting.
    I tried different combinations of locale settings (en, fr, ge, da / US, FR, BE, DE, DK / UTF-8 and non-UTF-8 like fr_BE.iso88591) this last week but the issue persists.

    I have the impression that the issue is due to something else:
    The issue only appears when using DISPLAY/ACCEPT with LINE/COL or when using a screen definition via SCREEN SECTION (which uses LINE/COL too of course).
    In case of DISPLAY/ACCEPT without LINE/COL the accented characters are displayed well. The only issue here is that each accented character uses two bytes of the defined variable, which shifts the characters to the left and so doesn't permit creating a useful fixed-position screen display.

    This behaviour even cuts off the last entered character(s), e.g.
    02 TXT-IN PIC X(6).
    -> abcdef => abcdef
    -> abëdef => abëde (the variable contains 6 bytes)
    -> aböçef => aböç (the variable contains 6 bytes)

    I will write a program to show this behaviour together with some screenshots and publish here shortly.

    I'm thinking about the issue being provoked by other parameters than locale, maybe differences in "curses" behaviour (at this time GnuCOBOL always installed with ncursesw, so maybe I could try compiling from source with the latest version of PDCurses - have to check how to do that).

    Anyway, if I find a solution, I will publish here.

    Any help still welcome :)

     
  • Anonymous

    Anonymous - 2020-05-29

    I hope this small test program brings some light into the darkness - cfr. attachment.
    I joined 2 screenshots from program execution - cfr. attachments.
    ACCEPT/DISPLAY without LINE/COL shows accented characters but uses 2 bytes per accented character.
    ACCEPT/DISPLAY with LINE/COL does not show the accented characters but shows 2 blanks.

     
    • Anonymous

      Anonymous - 2024-01-10

      Hi all,

      I promised to publish when I found a solution.
      Tried something today... and it works !
      After almost 4 years.
      Compiled with GnuCOBOL 3.2.0

      Solution is Yannick Vanhaeren's en_BE locale file described in
      https://gist.github.com/yvh/630368018d7c683aca8da9e2baf7bfb9

      J.M. Lietaer


      Relevant part of the source code:

      000025 SOURCE-COMPUTER.
      000026 UBUNTU_22_04_LTS.
      000027 OBJECT-COMPUTER.
      000028 ANY-PLATFORM
      000029 CLASSIFICATION belgian.
      000030 SPECIAL-NAMES.
      000031 LOCALE belgian "en_BE.UTF-8".
      000032 *
      000033 * Set locale to en_BE.UTF-8
      000034 * Cfr. https://gist.github.com/yvh/630368018d7c683aca8da9e2baf7bfb9
      000035 * sudo cp en_BE /usr/share/i18n/locales/en_BE
      000036 * sudo localedef -i en_BE -c -f UTF-8 en_BE
      000037 * echo "en_BE.UTF-8 UTF-8" | sudo tee -a /etc/locale.gen
      000038 * sudo locale-gen
      000039 *
      000040 * Maybe also change files in /var/lib/locales/supported.d/
      000041 *
      000042 * See also : https://www.server-world.info/en/note?os=Ubuntu_
      000043 *
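      Before running the program, it may help to confirm that the locale named in the comments above is actually available. This is a small sketch, not part of the original program: the helper names normalize_locale and has_locale are made up here, and the comparison assumes that locale -a may print the name in a form such as en_BE.utf8 rather than en_BE.UTF-8.

```shell
# Lower-case a locale name and drop "-" so "en_BE.UTF-8" matches "en_BE.utf8".
normalize_locale() {
    printf '%s\n' "$1" | tr 'A-Z' 'a-z' | tr -d '-'
}

# has_locale NAME: reads a locale list on stdin, succeeds if NAME is in it.
has_locale() {
    wanted=$(normalize_locale "$1")
    while IFS= read -r line; do
        [ "$(normalize_locale "$line")" = "$wanted" ] && return 0
    done
    return 1
}

# Check the locales known to this system.
if locale -a 2>/dev/null | has_locale "en_BE.UTF-8"; then
    echo "en_BE.UTF-8 is installed"
else
    echo "en_BE.UTF-8 is missing; generate it first"
fi
```

      If the check reports the locale as missing, the localedef and locale-gen steps listed in the comments are the ones to run.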


      Locale settings

      $ localectl
      System Locale: LANG=en_BE.UTF-8
      LANGUAGE=fr_BE:fr_FR
      LC_NUMERIC=en_US.UTF-8
      LC_TIME=en_US.UTF-8
      LC_MONETARY=en_US.UTF-8
      LC_PAPER=en_US.UTF-8
      LC_NAME=en_US.UTF-8
      LC_ADDRESS=en_US.UTF-8
      LC_TELEPHONE=en_US.UTF-8
      LC_MEASUREMENT=en_US.UTF-8
      LC_IDENTIFICATION=en_US.UTF-8
      VC Keymap: n/a
      X11 Layout: be
      X11 Model: pc105


       
      • Simon Sobisch

        Simon Sobisch - 2024-01-10

        And the result is?

         
        • Anonymous

          Anonymous - 2024-01-10

          The result...
          The locale file en_BE is not installed by default in Ubuntu (and other Linux distributions?).
          By installing this locale, the characters passing thru the SCREEN SECTION as input and output are now displayed correctly.
          Keyboard is AZERTY (fr_BE).
          OS language is English (en_US).
          Attachments - both inputs are the same &é"'(§è!çà)-ôöùµ[]{}
          en_BE_1.jpg : with en_BE
          en_BE_2.jpg : without en_BE

           
          • Simon Sobisch

            Simon Sobisch - 2024-01-10

            Note that this likely still has the "issue" that because it is UTF-8, each of those will be counted as 2 bytes and also stored that way.

             
            • Anonymous

              Anonymous - 2024-01-10

              I'm quite sure the characters are still stored as 2 bytes, but the difference is that both the input and the output field from the SCREEN SECTION are now working correctly.
              The buffer zone after the input field is now empty, where before part of/entire characters/bytes were found. The en_BE locale seems to do the trick.
              I'm ready to use GnuCOBOL as a programming language again :)
              Best wishes to the whole team and keep up the excellent work on COBOL !

               
              • Juan Carlos Escartí

                Hello everyone. After some time spent updating the OS that hosts my applications, I have migrated from kernel 3.4.63 to 5.14.21, that is, from SuSE 12.2 to SuSE Leap 15.5.
                I'm going to see if we can put GnuCOBOL into operation.
                How can this solution be generalized to all languages?
                My experience:
                With the old OS, a 32-bit Cobol 3.1 version worked for me with CP850.
                In all new versions, the 2 null characters appear in both CP850 and UTF-8.
                Thank you

                 