Copyright symbol in a string

2024-08-29
2024-08-30
  • Andrew Jameson

    Andrew Jameson - 2024-08-29

    Since my last few problems, I'm keeping a much closer eye on the compiled code - a good learning exercise !

    I noted that the string associated with HSerPrint "©2024" results in a compiled 6 character string :

    StringTable5
    retlw 6
    retlw 194 ;0xC2
    retlw 169 ;0xA9 (©)
    retlw 50 ;2
    retlw 48 ;0
    retlw 50 ;2
    retlw 52 ;4

    © is character 169 / 0xA9 (extended ASCII, as in Windows-1252) ... you knew that it was coming ... what's the 194 / 0xC2 doing there ?

    Thanks,

    Andrew

     
    • Andrew Jameson

      Andrew Jameson - 2024-08-30

      Thinking one dimensionally ! Never gave UTF character coding a thought !
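The extra byte can be reproduced outside the compiler. The following is a minimal sketch in Python (not GCBASIC), assuming the source file was saved as UTF-8, which is what the 194, 169 pair suggests. The variable names are illustrative only.

```python
# Compare how the copyright sign is stored under the two encodings in play.
text = "©2024"

utf8_bytes = text.encode("utf-8")      # what a UTF-8 editor writes to disk
cp1252_bytes = text.encode("cp1252")   # single-byte Windows-1252 form

print(list(utf8_bytes))    # [194, 169, 50, 48, 50, 52]
print(list(cp1252_bytes))  # [169, 50, 48, 50, 52]

# Reading the UTF-8 bytes as Windows-1252 yields two characters for the one sign.
print(utf8_bytes.decode("cp1252"))     # Â©2024
```

The six values in the first list match the six retlw entries in StringTable5 above.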

      Andrew

       
  • Anobium

    Anobium - 2024-08-29

    Andrew... no idea.

    I can replicate.

    The issue is that the code page of the source file is interfering with the compiler's handling of that character. I have resolved it: the compiler now looks for that sequence of characters and resolves it.

    Take the ZIP from the URL below and apply it to your build (read the readme.txt first). Does it resolve the problem?

    https://1drv.ms/u/s!Ase-PX_n_4cvhP45TxJysh7wxNKMlQ?e=L5rxzG

    Let me know the result.

    Evan


    HSerPrint "©2024©"
    
    
    data tables
    194,169
    end data
    

    Now creates ASM ...

    Testing DATA tables is important, as we want to ensure the correction only happens within a string.

    StringTable1:
    .DB 6,169,50,48,50,52,169,0
    
    
    ;********************************************************************************
    
    ; DATA blocks. DATA blocks are contiguous and may, or may not, overlap page boundary(ies).
    TABLES:
            dw 0x00C2, 0x00A9
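One way to picture a "strings only" correction is sketched below in Python. This is not the GCBASIC compiler's implementation; the function name fix_string_literals and the regex-based approach are assumptions made for illustration. It re-encodes only the bytes inside double-quoted literals and leaves everything else on the line, including DATA values, untouched.

```python
import re

def fix_string_literals(line: bytes) -> bytes:
    """Re-encode UTF-8 text inside double-quoted literals as Windows-1252.

    Hypothetical helper: bytes outside quotes are never modified.
    """
    def reencode(match):
        literal = match.group(0)
        try:
            return literal.decode("utf-8").encode("cp1252")
        except (UnicodeDecodeError, UnicodeEncodeError):
            # Not valid UTF-8, or no Windows-1252 equivalent: leave as-is.
            return literal

    return re.sub(rb'"[^"]*"', reencode, line)

string_line = 'HSerPrint "©2024©"'.encode("utf-8")
data_line = b"194,169"

print(list(fix_string_literals(string_line)[-7:-1]))  # [169, 50, 48, 50, 52, 169]
print(fix_string_literals(data_line))                 # b'194,169'
```

The six bytes printed for the string match the payload of the .DB line above, while the DATA values pass through unchanged.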
    
     
  • Andrew Jameson

    Andrew Jameson - 2024-08-29

    Yes, that's fixed it ! Oddly it wasn't showing on the terminal software I'm using but just occasionally it displayed two unknown characters prior to 2024.

    Project was running tight on memory - so just got a batch of 16LF18326s from Mouser, so lots more room for play - jumped from (Memory 1810/2048 words RAM 184/256 bytes) to (Memory 1824/16384 words RAM 244/2048 bytes) and power consumption remained the same.

    Thanks,
    Andrew

     
  • Anobium

    Anobium - 2024-08-29

    Great. Another improvement in the GCBASIC compiler.

    That character was being converted by the compiler into two chars, as it thinks that you are using a non-standard code page for that character.


    More power needed... a 16f1455 or even a little 18f.

     
  • Anobium

    Anobium - 2024-08-29

    I have determined the root cause. The string, however it was created, used UTF-8 encoding (the 194, 169 pair is the UTF-8 encoding of ©). Whilst the IDEs can cope with UTF, the compiler expects ASCII or Windows-1252.

    So, I have added a warning to the compiler to change the encoding to ASCII or Windows-1252. This is very easy in GCCODE: press F2, select 'Change file encoding', then 'Reopen with encoding', select Windows-1252, then press Enter. And the magic will happen: the file will show two characters where there was one.
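For files that need converting outside the IDE, the same re-encoding can be scripted. This is a minimal Python sketch, assuming the source is UTF-8 (with or without a BOM); the function name reencode_to_cp1252 and the file names in the usage note are hypothetical. It rewrites the file content rather than reopening it, so each © ends up as the single byte 169.

```python
from pathlib import Path

def reencode_to_cp1252(src: Path, dst: Path) -> None:
    """Read a UTF-8 source file and write it back out as Windows-1252.

    Works on raw bytes so that line endings are preserved exactly.
    Raises UnicodeEncodeError if a character has no Windows-1252 equivalent.
    """
    text = src.read_bytes().decode("utf-8-sig")  # "utf-8-sig" also strips a BOM
    dst.write_bytes(text.encode("cp1252"))
```

A call such as reencode_to_cp1252(Path("main.gcb"), Path("main_cp1252.gcb")) writes the converted copy and leaves the original file untouched.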

    Sorted. This will be included in the next release.

     
  • Angel Mier

    Angel Mier - 2024-08-29

    The IDE can encode in Windows-1252, but I strongly suggest a change in the compiler to append text in UTF-8 encoding. This will also help users with other Windows localisations and different alphabets. UTF-8 is the de facto standard for almost all compilers on Windows and on Unix-like operating systems.

    Angel

     
    • Anobium

      Anobium - 2024-08-29

      Re UTF-8. I will add it to the development backlog. We need to complete the current backlog first. :-)

       
