#292 !pragma codepage to switch script encoding on the fly

Milestone: 3.x Stable
Status: closed-wont-fix
Owner: Anders
Labels: None
Priority: 5
Updated: 2024-05-31
Created: 2019-03-04
Creator: Anders
Private: No

MakeNSIS defaults to parsing .NSI files as ANSI for 2.x compatibility, but it would be nice to be able to switch to a specific encoding when the file is really Unicode but has no BOM.

Example script:

# Assuming .NSI file is stored as UTF-8 without BOM
MessageBox MB_OK "홍진영" ; Mojibake
!pragma codepage UTF8
MessageBox MB_OK "홍진영" ; Correct
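
To make the mojibake in the example concrete, here is a minimal C++ sketch, not MakeNSIS code, that decodes the same bytes two ways. DecodeSingleByte and DecodeUtf8 are hypothetical helper names, and the single-byte path uses the ISO-8859-1 mapping as a stand-in for whatever ANSI codepage is actually active.

```cpp
#include <cstddef>
#include <string>

// Stand-in for "ANSI" parsing (assumption: ISO-8859-1 mapping).
// Every byte becomes one character. Real ANSI codepages map 0x80-0x9F
// differently, but the one-byte-per-character effect is the same.
std::u32string DecodeSingleByte(const std::string &bytes)
{
  std::u32string out;
  for (unsigned char b : bytes) out.push_back(static_cast<char32_t>(b));
  return out;
}

// Minimal UTF-8 decoder. Malformed input yields U+FFFD.
// It does not reject overlong forms or surrogates.
std::u32string DecodeUtf8(const std::string &bytes)
{
  std::u32string out;
  for (std::size_t i = 0; i < bytes.size();)
  {
    unsigned char lead = static_cast<unsigned char>(bytes[i++]);
    char32_t cp;
    int extra; // number of continuation bytes expected
    if (lead < 0x80) { cp = lead; extra = 0; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
    else { out.push_back(0xFFFD); continue; }
    bool ok = true;
    for (; extra > 0; --extra)
    {
      if (i >= bytes.size() || (static_cast<unsigned char>(bytes[i]) & 0xC0) != 0x80) { ok = false; break; }
      cp = (cp << 6) | (static_cast<unsigned char>(bytes[i++]) & 0x3F);
    }
    out.push_back(ok ? cp : char32_t(0xFFFD));
  }
  return out;
}
```

Feeding the UTF-8 bytes of 홍 (ED 99 8D) to DecodeSingleByte gives three unrelated characters, while DecodeUtf8 gives the single code point U+D64D. That is the difference between the two MessageBox lines in the example.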

The Microsoft toolchain supports something similar:

RC.exe supports #pragma code_page(1234)

and

CL.exe supports the somewhat related #pragma setlocale(".1234") and #pragma execution_character_set("utf-8"). VS2015 added a /source-charset:.1234 command-line parameter, but I'm not sure whether it also exists as a pragma.

The code itself is simple enough; what remains is to decide whether this is useful and what the syntax should be.

build.cpp code:

// Handle "!pragma codepage <charset>": switch the encoding of the script being read
if (line.gettoken_enum(1, _T("codepage\0")) == 0)
{
  NStreamEncoding enc = GetEncodingFromString(line.gettoken_str(2));
  if (enc.GetCodepage() == NStreamEncoding::UNKNOWN) // Unrecognized charset name
    return (warning_fl(DW_PP_PRAGMA_INVALID, _T("Ignoring invalid charset")), rvWarn);
  if (!curlinereader) // No active line reader whose encoding could be changed
    return (warning_fl(DW_PP_PRAGMA_INVALID, _T("Unexpected")), rvErr);
  curlinereader->StreamEncoding().SetCodepage(enc.GetCodepage());
  return rvSucc;
}

Discussion

  • Anders

    Anders - 2024-05-30

    What about !pragma source-charset UTF8?

    Last edit: Anders 2024-05-30
  • Anders

    Anders - 2024-05-31

    Decided to use Python-style PEP 263 magic comment instead.
  • Anders

    Anders - 2024-05-31
    • status: open --> closed-wont-fix
    • assigned_to: Anders
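
For reference, PEP 263 looks for a comment containing coding: or coding= followed by an encoding name, on the first or second line of the file. A rough C++ sketch of that detection follows. FindMagicEncoding is a hypothetical name, accepting both # and ; as comment markers is an assumption, and the syntax MakeNSIS actually adopted may differ.

```cpp
#include <regex>
#include <sstream>
#include <string>

// Returns the encoding name declared by a PEP 263 style comment in the
// first two lines of `script`, or an empty string when there is none.
// Assumption: both NSIS comment markers, '#' and ';', are accepted.
std::string FindMagicEncoding(const std::string &script)
{
  // Adapted from the PEP 263 pattern: ^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)
  static const std::regex magic("^[ \\t\\f]*[#;].*?coding[:=][ \\t]*([a-zA-Z0-9_.-]+)");
  std::istringstream in(script);
  std::string line;
  // Only the first two lines are inspected, as in PEP 263
  for (int n = 0; n < 2 && std::getline(in, line); ++n)
  {
    std::smatch m;
    if (std::regex_search(line, m, magic)) return m[1].str();
  }
  return std::string();
}
```

With this sketch, a script starting with "# -*- coding: utf-8 -*-" yields "utf-8", and a declaration that first appears on the third line or later is ignored.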