Some ideas for improvements, noted while working
on Unicode 6.1 support.
- The macro GetUniCharInfo() gets a byte from the
pageMap and immediately shifts the value
OFFSET_BITS to the left. We could just as well
store the shifted value in groupMap, which would
save a bit shift for each character handled.
- The macro GetDelta() is complicated because it
compensates for compiler differences in handling
sign extension. If we change the group array
to use bits 8-31 for the case delta, then we have
enough bits for all possible Unicode characters,
so sign extension can be compensated for by
applying the mask 0x1fffff or (when handling only
the basic plane) by casting to Tcl_UniChar.
- The tool uniParse.tcl makes assumptions
about the content of UnicodeData.txt, but
doesn't check those assumptions. Future
Unicode versions might need more categories
or case types, in which case the tool
should warn us.
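
A rough C sketch of the first two ideas, for
illustration only. It is not the existing code:
the table contents, the value of OFFSET_BITS, the
flag/delta layout, and the names pageMapShifted,
group_index_shift, group_index_preshifted,
pack_group, apply_delta and self_check are all
made up here. It also assumes that the pre-shifted
value is kept in the table that is consulted
first, and that the mask is applied to the sum of
character and delta.

```c
#include <assert.h>
#include <stdint.h>

/* Toy tables, made up for this sketch; the real tables are generated. */
#define OFFSET_BITS 2   /* assumed value, chosen to keep the tables tiny */
#define OFFSET_MASK ((1 << OFFSET_BITS) - 1)

static const uint8_t pageMap[4] = {0, 1, 0, 2};
/* Hypothetical variant: same entries, shifted once at generation time. */
static const uint16_t pageMapShifted[4] = {0, 4, 0, 8};
static const uint8_t groupMap[12] = {0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 3, 0};

/* Lookup with a shift on every call, as described for GetUniCharInfo(). */
static int group_index_shift(int ch)
{
    return groupMap[(pageMap[ch >> OFFSET_BITS] << OFFSET_BITS)
            | (ch & OFFSET_MASK)];
}

/* Lookup with the shift already folded into the stored value. */
static int group_index_preshifted(int ch)
{
    return groupMap[pageMapShifted[ch >> OFFSET_BITS] | (ch & OFFSET_MASK)];
}

/* Hypothetical group layout: bits 0-7 flags, bits 8-31 case delta. */
static uint32_t pack_group(int32_t delta, unsigned flags)
{
    return ((uint32_t)delta << 8) | (flags & 0xffu);
}

/* Unsigned shift, add, then mask to 21 bits: no signed right shift,
 * so the result does not depend on the compiler's sign extension. */
static uint32_t apply_delta(uint32_t ch, uint32_t group)
{
    return (ch + (group >> 8)) & 0x1fffffu;
}

/* Checks that both lookups agree and that deltas of either sign work. */
void self_check(void)
{
    int ch;
    for (ch = 0; ch < 16; ch++) {
        assert(group_index_shift(ch) == group_index_preshifted(ch));
    }
    assert(apply_delta(0x61, pack_group(-32, 0)) == 0x41);
    assert(apply_delta(0x41, pack_group(32, 0)) == 0x61);
    /* A supplementary-plane value, to show that 21 bits are enough. */
    assert(apply_delta(0x10428, pack_group(-40, 0)) == 0x10400);
}
```

Calling self_check() from a small main() exercises
both lookups and the masked delta. The mask works
because a 24-bit delta field wraps modulo 2^24,
which is a multiple of 2^21, so the low 21 bits of
the sum are correct for negative deltas as well.
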
In order to make Unicode-related table
merges between the core branches
possible, this is meant for all open core