Re: [sdcc-devel] Using libunistring in SDCC?


As I understand it, this is only about being able to tick a checkbox:
"yes, we have that."

Depending on the library in a way that makes its version matter, or
shipping it in our installation packages, will cause serious
compatibility problems on every binary target where it is not
available, as long as our binaries otherwise have no special
dependencies.

Let me suggest an alternative approach: the way some compilers
managed to avoid depending on "trigraph" processing at a time when
it was considered "too costly" (which is nothing for today's CPUs,
of course):

Some compilers claimed to support trigraphs because they provided
tools (or just mentioned that such tools exist?) which could process
them. But AFAIK their "core" compiler executables didn't use these
tools. Users were supposed to invoke them separately if they needed
that functionality.

I suggest that SDCC does the same: that way the costs of using
these libraries would be paid only by those who decide they need
them (effectively nobody), and the core tools remain decoupled.

I'd also expect it to be easier to develop and maintain if the
tools that need such libraries are compiled and packaged completely
separately from the SDCC core binaries.

--- Original Message ---
From: Philipp Klaus Krause <pk...@sp...>
Date: 21.10.2025 22:03:05
To: Development chatter about sdcc <sdc...@li...>
Subject: [sdcc-devel] Using libunistring in SDCC?

IMO, dealing with Unicode is quite complicated, and not something I want
to go too deeply into. After all, we are building a compiler, not some
Unicode tool. But building a compiler these days requires some Unicode
functionality. In particular well-formedness checks (we have them),
normalization (we don't have that), checking of properties (we don't
have that, except for the trivial stuff).

In particular, an identifier in C23 is something that starts with a
character with the XID_Start property or '_' (or maybe '$'), followed by
any number of characters with the XID_Continue property (or maybe '$').
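
A minimal sketch of the identifier rule described above. It is written
in Python only because its standard library already exposes the XID
properties through str.isidentifier(); it is not SDCC code and does not
use libunistring. The function name is_c23_identifier and the
allow_dollar flag are made up for illustration, and the property data
is whatever Unicode version the Python interpreter ships with.

```python
def is_c23_identifier(name: str, allow_dollar: bool = False) -> bool:
    """(XID_Start | '_' | maybe '$') followed by (XID_Continue | maybe '$')*."""
    if not name:
        return False

    def is_start(ch: str) -> bool:
        if ch == "$":
            return allow_dollar  # '$' is an optional extension
        # For a one-character string, str.isidentifier() is true exactly
        # when the character is '_' or has the XID_Start property.
        return ch.isidentifier()

    def is_continue(ch: str) -> bool:
        if ch == "$":
            return allow_dollar
        # Prefixing a known-good start character leaves only the
        # XID_Continue test for ch.
        return ("a" + ch).isidentifier()

    return is_start(name[0]) and all(is_continue(ch) for ch in name[1:])
```

A digit, for example, is accepted after the first position but not as
the first character, and '$' is accepted only when allow_dollar is set.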

Two identifiers are equivalent (ignoring the details about significant
characters) if they are equal in Unicode normalization form C (which is
defined as canonical decomposition followed by canonical composition).
The details for all this keep changing with Unicode standard updates.
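
The equivalence rule above can be sketched in a few lines. Again this
is only an illustration in Python using its standard unicodedata
module, not a proposal for how SDCC would do it; the helper name
same_identifier is hypothetical, and the result depends on the Unicode
version bundled with the interpreter.

```python
import unicodedata


def same_identifier(a: str, b: str) -> bool:
    """Compare two identifier spellings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Two spellings of the same name: U+00E9 versus 'e' + U+0301 (combining acute).
precomposed = "caf\u00e9"
decomposed = "cafe\u0301"
```

The two strings differ code point by code point, yet same_identifier
treats them as one identifier, which is the behaviour a compiler needs
when it looks names up in its symbol table.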

I don't want to implement or maintain those utilities. So I suggest we
use an existing library. Due to its wide availability (it is not just
part of typical GNU/Linux distributions, but also available as an MSYS2
package for MinGW, packaged for OpenBSD, FreeBSD, etc.), I suggest using
GNU libunistring.

Philipp

_______________________________________________
sdcc-devel mailing list
sdc...@li...
https://lists.sourceforge.net/lists/listinfo/sdcc-devel
