
#4 UTF-8 characters

Group: v1.0 (example)
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2017-08-22
Created: 2017-06-28
Creator: Marcos Cruz
Private: No

I'm trying bwBASIC 3.20 and I'm really impressed by the improvements since version 2.60, the last one I had tried.

I run bwBASIC in a GNU screen session, in Xfce4 Terminal, on the Ratpoison window manager, on Raspbian (a variant of Debian compiled for the Raspberry Pi). Everything works with UTF-8 encoding.

I can type any non-ASCII character I need in the bwBASIC command line, for example to set a string variable, but when the variable is printed the non-ASCII characters are shown as spaces:

a$="N tilde=<Ñ>"
print a$
N tilde=<  >
rem Two spaces instead of the 2-byte UTF-8 character!

Is there any way to make bwBASIC print UTF-8 multibyte characters as expected, since the system is so configured, even if, of course, the interpreter treats them as single 1-byte characters? Or is there any by-design limitation? I have found no clue about this in the documentation.
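To make the "2-byte character" point concrete, here is a minimal C sketch (not bwBASIC code; `utf8_codepoints` is a hypothetical helper written for this illustration). It assumes valid UTF-8 input and counts characters by skipping continuation bytes, which have the bit pattern 10xxxxxx:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count UTF-8 code points in a NUL-terminated string.
   Every byte that is NOT a continuation byte (10xxxxxx) starts a new
   character. Assumes the input is valid UTF-8. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            n++;
        }
    }
    return n;
}
```

For the string above, `Ñ` is encoded as the two bytes 0xC3 0x91, so a byte-oriented interpreter sees 12 bytes where the terminal displays 11 characters. Both bytes are above 127, which is why any per-byte filtering of "non-printable" values would hit them.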

Thank you.

Discussion

  • Ted A. Campbell

    Ted A. Campbell - 2017-06-28

    Although I was the original author of bwBASIC, I have not been able to keep up with it in recent years. Is someone responding to these requests? If so, I’m grateful.

    /ted


    Ted Campbell at home
    tc at tedcampbell dot com


     
  • Marcos Cruz

    Marcos Cruz - 2017-07-05

    I've tried two older versions: 2.60 works fine and prints UTF-8 characters; 3.10 doesn't. Therefore I think this is a bug.

     
  • AF5NE

    AF5NE - 2017-07-05

    Marcos,

    If you are comfortable editing the source code, then please try updating the function named "bwb_chartype" in the file named "bwb_int.c" to add the following. Does this resolve the issue for you?

    static int
    bwb_chartype (int C)
    {
        /* ... not shown ... */

        if (C > 127)
        {
            return CHAR_IS_PRINT;
        }

        return CHAR_IS_CNTRL;
    }
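Why a check like this would matter can be sketched in isolation. The code below is not taken from bwBASIC: `classify_byte`, `filter_for_print`, the `allow_high` flag and the enum values are hypothetical stand-ins, and the assumption that non-printable bytes are replaced by spaces on output is only inferred from the symptom reported above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the interpreter's character classes. */
enum char_class { CLASS_CNTRL, CLASS_PRINT };

/* Classify one byte value (0..255). Printable ASCII is always CLASS_PRINT.
   Bytes above 127 are CLASS_PRINT only when allow_high is nonzero, which
   models the suggested patch; otherwise they fall through to CLASS_CNTRL. */
static enum char_class classify_byte(int c, int allow_high)
{
    if (c >= 32 && c <= 126) {
        return CLASS_PRINT;
    }
    if (c > 127 && allow_high) {
        return CLASS_PRINT;
    }
    return CLASS_CNTRL;
}

/* Copy src to dst, replacing every byte classified as control with a
   space (assumed output behaviour). dst must have room for strlen(src) + 1
   bytes. */
void filter_for_print(char *dst, const char *src, int allow_high)
{
    assert(dst != NULL && src != NULL);
    for (; *src != '\0'; src++, dst++) {
        int c = (unsigned char)*src;
        *dst = (classify_byte(c, allow_high) == CLASS_PRINT) ? *src : ' ';
    }
    *dst = '\0';
}
```

With `allow_high` set to 0, the two bytes of `Ñ` (0xC3 0x91) each become a space, matching the two spaces in the report. With it set to 1, both bytes pass through unchanged and a UTF-8 terminal reassembles them into one character. Note that this approach passes bytes through without validating them as UTF-8.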

     

    Last edit: AF5NE 2017-07-05
  • Marcos Cruz

    Marcos Cruz - 2017-07-06

    Perfect! I edited the code, recompiled it and now it works as expected.

    I use X11-Basic and PC-BASIC for certain projects of mine that need UTF-8, but after trying bwBASIC 3.20 I'm considering it as an alternative.

    I hope the UTF-8 support will be fixed in the next version of bwBASIC.

    Thank you.

     
  • AF5NE

    AF5NE - 2017-08-22

    UTF-8 will be supported in the next release of Bywater BASIC.

     
