Re: [sdcc-devel] Using libunistring in SDCC?

Regarding N2932: it is good that it was rejected.

Here's my perspective as a native user of a Cyrillic script:

Quoting it:

"Thus it prevents Cyrillic mixed with Latin or any other script"

If I understand that correctly, that is exactly the discrimination
I talked about. Proposed in the known political climate, and seen
from the perspective of somebody actually using Cyrillic, it is
clearly malicious, and it should definitely stay out of the
compilers themselves.

Then, if it goes through, it will also be a political solution,
not a reasonable technical one, as it is effectively racist:
forbidding the mixing of scripts that actually need such mixing
would be worse than, e.g., forbidding the use of German and
English words for identifiers in the same source file.

Unicode was intentionally designed to have different code points
for identical-looking glyphs. There could be external security
processes that evaluate source code, but implementing actual bans
("Verbote") on whole scripts in the compilers themselves is
clearly awful.
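
A minimal C sketch of that point, assuming UTF-8 encoded text (the
names latin_a, cyrillic_a and same_identifier are made up for
illustration): Latin "a" (U+0061) and Cyrillic "а" (U+0430) look
alike but are different byte sequences, so a byte-by-byte comparison
of identifiers already treats them as distinct.

```c
#include <string.h>

/* Latin small letter a, U+0061: a single byte in UTF-8. */
static const char latin_a[] = "a";

/* Cyrillic small letter a, U+0430: the two bytes D0 B0 in UTF-8. */
static const char cyrillic_a[] = "\xD0\xB0";

/* Hypothetical helper: nonzero when two identifiers are
   byte-for-byte identical. */
static int same_identifier(const char *x, const char *y)
{
    return strcmp(x, y) == 0;
}
```

Here same_identifier(latin_a, cyrillic_a) is 0: the two glyphs may
render the same, but they are never the same identifier.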

Of course, as a Cyrillic-using person, I can easily imagine
realistic scenarios (which would make up most use cases!) where
only some selected identifiers are Cyrillic. Forcing everybody to
write the whole source in Cyrillic only would effectively ban
Cyrillic from sources altogether.

The whole point of Unicode was to allow all scripts to coexist in
the same document.

And I am aware that, due to the political circumstances, it could
even get through the standards process currently. It's bad on
many levels.

I hope SDCC will allow Cyrillic identifiers mixed with Latin ones,
even when most are Latin, especially if N2932 is not accepted.

"Secure" shouldn't mean preventing the use of a script.

--- Original Message ---
From: Philipp Klaus Krause <pk...@sp...>
Date: 23.10.2025 13:06:43
To: sdc...@li...
Subject: Re: [sdcc-devel] Using libunistring in SDCC?

On 23.10.25 at 10:37, Philipp Klaus Krause wrote:
> * Regarding the security implication of unicode (e.g. homoglyph
> attacks); Having the normalization and the checks for valid identifiers
> does help here. C23 is safer than C11 was (which AFAIK allowed more
> unicode in identifiers).
> But for a full solution we'd have to do more (N2932, rejected for C2y,
> but WG14 wanted it as TS, which so far didn't happen), but AFAIK,
> currently libraries that implement everything we'd want for security are
> not that widespread (they exist, in particular libu8ident, but I don't
> think many distros package them).

Though, since we'll do the configure time check anyway, we could go for
libu8ident instead of libunistring. That would give more security in
builds that have the library. But fewer builds would have the library,
since it is far less common.
I guess we could link against a static libu8ident library, so there'd be
no run-time dependency, at least?
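
A rough sketch of how such a configure-time choice might look in C.
Everything named here is an assumption made for illustration:
HAVE_LIBUNISTRING is a hypothetical macro that configure would
define, identifier_bytes_ok is an invented helper, and the ASCII-only
fallback is just one guess at what a build without the library could
do. The only real library call is libunistring's u8_check() from
<unistr.h>, which returns NULL for well-formed UTF-8. A libu8ident
build would call that library instead; its API is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(HAVE_LIBUNISTRING)  /* hypothetical macro set by configure */
#include <unistr.h>             /* libunistring: u8_check() */
#endif

/* Hypothetical helper: returns 1 if the n bytes at s are acceptable
   as identifier text for this build, 0 otherwise. */
static int identifier_bytes_ok(const uint8_t *s, size_t n)
{
#if defined(HAVE_LIBUNISTRING)
    /* Library available: accept any well-formed UTF-8 here.
       Normalization and identifier checks would follow. */
    return u8_check(s, n) == NULL;
#else
    /* Assumed fallback: no library found at configure time,
       so accept ASCII bytes only. */
    for (size_t i = 0; i < n; i++) {
        if (s[i] >= 0x80)
            return 0;
    }
    return 1;
#endif
}
```

Either way the library is selected when the compiler is built, so a
statically linked build carries no run-time dependency.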

Philipp

_______________________________________________
sdcc-devel mailing list
sdc...@li...
https://lists.sourceforge.net/lists/listinfo/sdcc-devel
